DPDK patches and discussions
* Re: [dpdk-dev] Use WFE for spinlock and ring
  2021-07-07 14:47  0%   ` Stephen Hemminger
@ 2021-07-08  9:41  0%     ` Ruifeng Wang
  0 siblings, 0 replies; 200+ results
From: Ruifeng Wang @ 2021-07-08  9:41 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: dev, david.marchand, thomas, jerinj, nd, Honnappa Nagarahalli, nd

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Wednesday, July 7, 2021 10:48 PM
> To: Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; david.marchand@redhat.com; thomas@monjalon.net;
> jerinj@marvell.com; nd <nd@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>
> Subject: Re: [dpdk-dev] Use WFE for spinlock and ring
> 
> On Sun, 25 Apr 2021 05:56:51 +0000
> Ruifeng Wang <ruifeng.wang@arm.com> wrote:
> 
> > The rte_wait_until_equal_xxx APIs abstract the functionality of
> > 'polling for a memory location to become equal to a given value'[1].
> >
> > Use the API for the rte spinlock and ring implementations.
> > With the wait until equal APIs being stable, changes will not impact ABI.
> >
> > [1] http://patches.dpdk.org/cover/62703/
> >
> > v3:
> > Series rebased. (David)
> >
> > Gavin Hu (1):
> >   spinlock: use wfe to reduce contention on aarch64
> >
> > Ruifeng Wang (1):
> >   ring: use wfe to wait for ring tail update on aarch64
> >
> >  lib/eal/include/generic/rte_spinlock.h | 4 ++--
> >  lib/ring/rte_ring_c11_pvt.h            | 4 ++--
> >  lib/ring/rte_ring_generic_pvt.h        | 3 +--
> >  3 files changed, 5 insertions(+), 6 deletions(-)
> >
> 
> Other places that should use WFE:
Thank you Stephen for looking into this.

> 
> rte_mcslock.h:rte_mcslock_lock()
The existing API can be used for this one.

> rte_mcslock.h:rte_mcslock_unlock()
This one needs a rte_wait_while_xxx variant.

> 
> rte_pflock.h:rte_pflock_lock()
> rte_rwlock.h:rte_rwlock_read_lock()
> rte_rwlock.h:rte_rwlock_write_lock()
These occurrences have extra logic (AND, conditional branch, CAS) in the loop.
I'm not sure a generic API can be abstracted from these use cases.
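
To illustrate the kind of loop meant here, below is a minimal, self-contained sketch of a reader-lock acquire written with C11 atomics. It is not the DPDK source: the names sketch_rwlock and sketch_read_lock and the counter convention (negative means a writer holds the lock) are assumptions made for illustration only. The point is that the loop body mixes a conditional branch with a CAS, so it is not a plain "wait until the value equals X" poll.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical lock word: < 0 means a writer holds it, > 0 is the reader count. */
struct sketch_rwlock {
	_Atomic int32_t cnt;
};

/* Hypothetical reader acquire, for illustration only. */
static inline void sketch_read_lock(struct sketch_rwlock *rwl)
{
	int32_t x;

	assert(rwl != NULL);
	for (;;) {
		x = atomic_load_explicit(&rwl->cnt, memory_order_relaxed);
		/* Conditional branch: a writer holds the lock, so poll again. */
		if (x < 0)
			continue;
		/* CAS: try to register one more reader; retry if it fails. */
		if (atomic_compare_exchange_weak_explicit(&rwl->cnt, &x, x + 1,
				memory_order_acquire, memory_order_relaxed))
			return;
	}
}
```

The wait condition here is "value is not negative" rather than "value equals X", and it is followed by an update that can fail, which is why a simple equality-wait helper does not drop in directly.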

> 
> 
> You should also introduce rte_wait_while_XXX variants to handle some of
> these cases.
> 
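
For reference, a minimal sketch of the two polling shapes discussed in this thread, written with portable C11 atomics. The names sketch_wait_until_equal_32, sketch_wait_while_equal_32 and cpu_relax_hint are hypothetical and are not DPDK APIs. The relax hint is left empty here; on aarch64 an implementation could presumably use WFE at that point instead of spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a CPU relax hint (a pause, or a WFE-based wait). */
static inline void cpu_relax_hint(void)
{
	/* Intentionally empty in this portable sketch. */
}

/* Poll until *addr becomes equal to expected. */
static inline void sketch_wait_until_equal_32(_Atomic uint32_t *addr,
					      uint32_t expected)
{
	assert(addr != NULL);
	while (atomic_load_explicit(addr, memory_order_acquire) != expected)
		cpu_relax_hint();
}

/* Poll while *addr is equal to unwanted; return the first different value seen. */
static inline uint32_t sketch_wait_while_equal_32(_Atomic uint32_t *addr,
						  uint32_t unwanted)
{
	uint32_t seen;

	assert(addr != NULL);
	while ((seen = atomic_load_explicit(addr, memory_order_acquire)) == unwanted)
		cpu_relax_hint();
	return seen;
}
```

The "until equal" form fits a waiter that knows the exact value it is waiting for, such as a ring tail. The "while equal" form fits a waiter that only knows the value it wants to go away, such as a pointer that is still NULL.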



^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v6] ethdev: add new ext hdr for gtp psc
  2021-07-08  9:27  4%         ` Raslan Darawsheh
@ 2021-07-08  9:39  0%           ` Andrew Rybchenko
  0 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2021-07-08  9:39 UTC (permalink / raw)
  To: Raslan Darawsheh, Thomas Monjalon
  Cc: Singh, Aman Deep, dev, david.marchand, Olivier Matz

On 7/8/21 12:27 PM, Raslan Darawsheh wrote:
> Thank you for the review,
> 
> Basically it's not used yet since it will break the abi
> The main usage was in rte_flow item of gtp_psc
> To replace the current structure with the header definition. And since
> this will break the abi I'm adding the header definition now but will be
> used later in rte_flow.

@Thomas If so, should we accept it in the current release cycle
or should it simply wait for the code which uses it?

> Kindest regards,
> Raslan Darawsheh
> 
> ------------------------------------------------------------------------
> *From:* Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> *Sent:* Thursday, July 8, 2021, 12:23 PM
> *To:* Raslan Darawsheh; Singh, Aman Deep; dev@dpdk.org
> *Subject:* Re: [dpdk-dev] [PATCH v6] ethdev: add new ext hdr for gtp psc
> 
> Hi Raslan,
> 
> On 7/6/21 5:24 PM, Raslan Darawsheh wrote:
>> Hi Guys,
>>
>> Sorry for missing this mail, for some reason it was missed in my inbox, 
>> This is the link to this rfc:
>> https://www.3gpp.org/ftp/Specs/archive/38_series/38.415/38415-g30.zip
> 
> Thanks for the link. The patch LGTM, but I have only one question left.
> Where is it used? Are you going to upstream corresponding code in
> the release cycle?
> 
> Andrew.
> 
>> Kindest regards,
>> Raslan Darawsheh
>>
>>> -----Original Message-----
>>> From: dev <dev-bounces@dpdk.org> On Behalf Of Andrew Rybchenko
>>> Sent: Thursday, July 1, 2021 5:06 PM
>>> To: Singh, Aman Deep <aman.deep.singh@intel.com>; dev@dpdk.org
>>> Subject: Re: [dpdk-dev] [PATCH v6] ethdev: add new ext hdr for gtp psc
>>>
>>> Hi Raslan,
>>>
>>> could you reply, please.
>>>
>>> Andrew.
>>>
>>> On 6/22/21 10:27 AM, Singh, Aman Deep wrote:
>>>> Hi Raslan,
>>>>
>>>> Can you please provide link to this RFC 38415-g30 I just had some
>>>> doubt on byte-order conversion as per RFC 1700
>>>> <https://tools.ietf.org/html/rfc1700>
>>>>
>>>> Regards
>>>> Aman
> 
> 


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v6] ethdev: add new ext hdr for gtp psc
  @ 2021-07-08  9:27  4%         ` Raslan Darawsheh
  2021-07-08  9:39  0%           ` Andrew Rybchenko
  0 siblings, 1 reply; 200+ results
From: Raslan Darawsheh @ 2021-07-08  9:27 UTC (permalink / raw)
  To: Andrew Rybchenko, Singh, Aman Deep, dev

Thank you for the review,

Basically it's not used yet, since using it would break the ABI.
The main intended usage is in the rte_flow gtp_psc item,
to replace the current structure with the header definition. Since that would break the ABI, I'm adding the header definition now, and it will be used later in rte_flow.

Kindest regards,
Raslan Darawsheh

________________________________
From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
Sent: Thursday, July 8, 2021, 12:23 PM
To: Raslan Darawsheh; Singh, Aman Deep; dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v6] ethdev: add new ext hdr for gtp psc

Hi Raslan,

On 7/6/21 5:24 PM, Raslan Darawsheh wrote:
> Hi Guys,
>
> Sorry for missing this mail, for some reason it was missed in my inbox,
> This is the link to this rfc:
> https://www.3gpp.org/ftp/Specs/archive/38_series/38.415/38415-g30.zip

Thanks for the link. The patch LGTM, but I have only one question left.
Where is it used? Are you going to upstream corresponding code in
the release cycle?

Andrew.

> Kindest regards,
> Raslan Darawsheh
>
>> -----Original Message-----
>> From: dev <dev-bounces@dpdk.org> On Behalf Of Andrew Rybchenko
>> Sent: Thursday, July 1, 2021 5:06 PM
>> To: Singh, Aman Deep <aman.deep.singh@intel.com>; dev@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v6] ethdev: add new ext hdr for gtp psc
>>
>> Hi Raslan,
>>
>> could you reply, please.
>>
>> Andrew.
>>
>> On 6/22/21 10:27 AM, Singh, Aman Deep wrote:
>>> Hi Raslan,
>>>
>>> Can you please provide link to this RFC 38415-g30 I just had some
>>> doubt on byte-order conversion as per RFC 1700
>>> <https://tools.ietf.org/html/rfc1700>
>>>
>>> Regards
>>> Aman



^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [pull-request] next-crypto 21.08 rc1
  2021-07-08  7:41  0%   ` [dpdk-dev] " Thomas Monjalon
@ 2021-07-08  7:47  3%     ` David Marchand
  2021-07-08  7:48  0%     ` [dpdk-dev] [EXT] " Akhil Goyal
  1 sibling, 0 replies; 200+ results
From: David Marchand @ 2021-07-08  7:47 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Akhil Goyal, Shijith Thotton, dev, Jerin Jacob Kollanukkaran,
	Andrew Rybchenko, Yigit, Ferruh

On Thu, Jul 8, 2021 at 9:41 AM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 07/07/2021 23:57, Thomas Monjalon:
> > 07/07/2021 21:30, Akhil Goyal:
> > > Shijith Thotton (2):
> > >       drivers: add octeontx crypto adapter framework
> > >       drivers: add octeontx crypto adapter data path
> >
> > It seems there is an ABI breakage:
> >
> > devtools/check-abi.sh: line 38: 958581 Segmentation fault
> > (core dumped) abidiff $ABIDIFF_OPTIONS $dump $dump2
> > Error: ABI issue reported for 'abidiff --suppr devtools/libabigail.abignore --no-added-syms --headers-dir1 v21.05/build-gcc-shared/usr/local/include --headers-dir2 build-gcc-shared/install/usr/local/include v21.05/build-gcc-shared/dump/librte_crypto_octeontx.dump build-gcc-shared/install/dump/librte_crypto_octeontx.dump'
> >
> > Without this series, the ABI check is passing.
>
> After updating libabigail, it passes OK.

And for the record...

- libabigail-1.8.1-1.fc32.x86_64 is fine,
- libabigail freshly compiled from current master is fine too


>
> Note there was another bug, in PPC toolchain this time.
> After upgrading to recent PPC toolchain it is OK.

- bootlin toolchain powerpc64le-power8--glibc--stable-2018.11-1 stalls
when compiling drivers/crypto/cnxk/cn9k_cryptodev_ops.c
- bootlin toolchain powerpc64le-power8--glibc--stable-2020.08-1 is fine

Plus, if someone wants to upgrade their ppc toolchain, don't forget to
regenerate your ABI reference with this toolchain.


-- 
David Marchand


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [EXT] Re:  [pull-request] next-crypto 21.08 rc1
  2021-07-08  7:41  0%   ` [dpdk-dev] " Thomas Monjalon
  2021-07-08  7:47  3%     ` David Marchand
@ 2021-07-08  7:48  0%     ` Akhil Goyal
  1 sibling, 0 replies; 200+ results
From: Akhil Goyal @ 2021-07-08  7:48 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: Shijith Thotton, dev, Jerin Jacob Kollanukkaran, david.marchand

> 07/07/2021 23:57, Thomas Monjalon:
> > 07/07/2021 21:30, Akhil Goyal:
> > > Shijith Thotton (2):
> > >       drivers: add octeontx crypto adapter framework
> > >       drivers: add octeontx crypto adapter data path
> >
> > It seems there is an ABI breakage:
> >
> > devtools/check-abi.sh: line 38: 958581 Segmentation fault
> > (core dumped) abidiff $ABIDIFF_OPTIONS $dump $dump2
> > Error: ABI issue reported for 'abidiff --suppr devtools/libabigail.abignore --
> no-added-syms --headers-dir1 v21.05/build-gcc-shared/usr/local/include --
> headers-dir2 build-gcc-shared/install/usr/local/include v21.05/build-gcc-
> shared/dump/librte_crypto_octeontx.dump build-gcc-
> shared/install/dump/librte_crypto_octeontx.dump'
> >
> > Without this series, the ABI check is passing.
> 
> After updating libabigail, it passes OK.
> 
> Note there was another bug, in PPC toolchain this time.
> After upgrading to recent PPC toolchain it is OK.
> 
> What a difficult pull request for the tools!
> 
Ok thanks for the update. Is there anything else in the pull request which I need to look into?

Regards,
Akhil

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [pull-request] next-crypto 21.08 rc1
  2021-07-07 21:57  5% ` Thomas Monjalon
  2021-07-08  7:39  0%   ` [dpdk-dev] [EXT] " Akhil Goyal
@ 2021-07-08  7:41  0%   ` Thomas Monjalon
  2021-07-08  7:47  3%     ` David Marchand
  2021-07-08  7:48  0%     ` [dpdk-dev] [EXT] " Akhil Goyal
  1 sibling, 2 replies; 200+ results
From: Thomas Monjalon @ 2021-07-08  7:41 UTC (permalink / raw)
  To: Akhil Goyal; +Cc: Shijith Thotton, dev, jerinj, david.marchand

07/07/2021 23:57, Thomas Monjalon:
> 07/07/2021 21:30, Akhil Goyal:
> > Shijith Thotton (2):
> >       drivers: add octeontx crypto adapter framework
> >       drivers: add octeontx crypto adapter data path
> 
> It seems there is an ABI breakage:
> 
> devtools/check-abi.sh: line 38: 958581 Segmentation fault
> (core dumped) abidiff $ABIDIFF_OPTIONS $dump $dump2
> Error: ABI issue reported for 'abidiff --suppr devtools/libabigail.abignore --no-added-syms --headers-dir1 v21.05/build-gcc-shared/usr/local/include --headers-dir2 build-gcc-shared/install/usr/local/include v21.05/build-gcc-shared/dump/librte_crypto_octeontx.dump build-gcc-shared/install/dump/librte_crypto_octeontx.dump'
> 
> Without this series, the ABI check is passing.

After updating libabigail, it passes OK.

Note there was another bug, in PPC toolchain this time.
After upgrading to recent PPC toolchain it is OK.

What a difficult pull request for the tools!



^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [EXT] Re:  [pull-request] next-crypto 21.08 rc1
  2021-07-07 21:57  5% ` Thomas Monjalon
@ 2021-07-08  7:39  0%   ` Akhil Goyal
  2021-07-08  7:41  0%   ` [dpdk-dev] " Thomas Monjalon
  1 sibling, 0 replies; 200+ results
From: Akhil Goyal @ 2021-07-08  7:39 UTC (permalink / raw)
  To: Thomas Monjalon, Shijith Thotton
  Cc: dev, Jerin Jacob Kollanukkaran, david.marchand

> 07/07/2021 21:30, Akhil Goyal:
> > Shijith Thotton (2):
> >       drivers: add octeontx crypto adapter framework
> >       drivers: add octeontx crypto adapter data path
> 
> It seems there is an ABI breakage:
> 
> devtools/check-abi.sh: line 38: 958581 Segmentation fault
> (core dumped) abidiff $ABIDIFF_OPTIONS $dump $dump2
> Error: ABI issue reported for 'abidiff --suppr devtools/libabigail.abignore --
> no-added-syms --headers-dir1 v21.05/build-gcc-shared/usr/local/include --
> headers-dir2 build-gcc-shared/install/usr/local/include v21.05/build-gcc-
> shared/dump/librte_crypto_octeontx.dump build-gcc-
> shared/install/dump/librte_crypto_octeontx.dump'
> 
> Without this series, the ABI check is passing.
> 

I do not see this error at my end + there is no such issue reported on CI.
On CI it failed only on FreeBSD, and that too was a false report.

Can you paste the output of
'abidiff --suppr devtools/libabigail.abignore --no-added-syms --headers-dir1 v21.05/build-gcc-shared/usr/local/include --headers-dir2 build-gcc-shared/install/usr/local/include v21.05/build-gcc-shared/dump/librte_crypto_octeontx.dump build-gcc-shared/install/dump/librte_crypto_octeontx.dump'

Regards,
Akhil

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [pull-request] next-crypto 21.08 rc1
  @ 2021-07-07 21:57  5% ` Thomas Monjalon
  2021-07-08  7:39  0%   ` [dpdk-dev] [EXT] " Akhil Goyal
  2021-07-08  7:41  0%   ` [dpdk-dev] " Thomas Monjalon
  0 siblings, 2 replies; 200+ results
From: Thomas Monjalon @ 2021-07-07 21:57 UTC (permalink / raw)
  To: Akhil Goyal, Shijith Thotton; +Cc: dev, jerinj, david.marchand

07/07/2021 21:30, Akhil Goyal:
> Shijith Thotton (2):
>       drivers: add octeontx crypto adapter framework
>       drivers: add octeontx crypto adapter data path

It seems there is an ABI breakage:

devtools/check-abi.sh: line 38: 958581 Segmentation fault
(core dumped) abidiff $ABIDIFF_OPTIONS $dump $dump2
Error: ABI issue reported for 'abidiff --suppr devtools/libabigail.abignore --no-added-syms --headers-dir1 v21.05/build-gcc-shared/usr/local/include --headers-dir2 build-gcc-shared/install/usr/local/include v21.05/build-gcc-shared/dump/librte_crypto_octeontx.dump build-gcc-shared/install/dump/librte_crypto_octeontx.dump'

Without this series, the ABI check is passing.



^ permalink raw reply	[relevance 5%]

* Re: [dpdk-dev] ABI/API stability towards drivers
  2021-07-02  8:00  8% [dpdk-dev] ABI/API stability towards drivers Morten Brørup
  2021-07-02  9:45  7% ` [dpdk-dev] [dpdk-techboard] " Ferruh Yigit
  2021-07-02 12:26  4% ` Thomas Monjalon
@ 2021-07-07 18:46  8% ` Tyler Retzlaff
  2 siblings, 0 replies; 200+ results
From: Tyler Retzlaff @ 2021-07-07 18:46 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dpdk-techboard, dpdk-dev

On Fri, Jul 02, 2021 at 10:00:11AM +0200, Morten Brørup wrote:
> Regarding the ongoing ABI stability project, it is suggested to export driver interfaces as internal.
> 
> What are we targeting regarding ABI and API stability towards drivers?

when this was last discussed, the outcome was that there was no promise of
api/abi stability at all for drivers, only for applications. tech-board may
have discussed it further, i don't know.

we (Microsoft) would like to see them evolve to stable abi/api but we
understand the challenges and effort involved. so driver stability is
pretty much the interface consumer's problem right now, for drivers built
both in-tree and out of tree.

^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH v3] doc: policy on the promotion of experimental APIs
  2021-07-01 10:38 23% ` [dpdk-dev] [PATCH v3] doc: policy on the " Ray Kinsella
@ 2021-07-07 18:32  0%   ` Tyler Retzlaff
  0 siblings, 0 replies; 200+ results
From: Tyler Retzlaff @ 2021-07-07 18:32 UTC (permalink / raw)
  To: Ray Kinsella
  Cc: dev, bruce.richardson, john.mcnamara, ferruh.yigit, thomas,
	david.marchand, stephen

On Thu, Jul 01, 2021 at 11:38:42AM +0100, Ray Kinsella wrote:
> Clarifying the ABI policy on the promotion of experimental APIs to stable.
> We have a fair number of APIs that have been experimental for more than
> 2 years. This policy amendment indicates that these APIs should be
> promoted or removed, or should at least form a conversation between the
> maintainer and original contributor.
> 
> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> ---
Acked-By: Tyler Retzlaff <roretzla@microsoft.com>


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Use WFE for spinlock and ring
  @ 2021-07-07 14:47  0%   ` Stephen Hemminger
  2021-07-08  9:41  0%     ` Ruifeng Wang
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2021-07-07 14:47 UTC (permalink / raw)
  To: Ruifeng Wang
  Cc: dev, david.marchand, thomas, jerinj, nd, honnappa.nagarahalli

On Sun, 25 Apr 2021 05:56:51 +0000
Ruifeng Wang <ruifeng.wang@arm.com> wrote:

> The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
> for a memory location to become equal to a given value'[1].
> 
> Use the API for the rte spinlock and ring implementations.
> With the wait until equal APIs being stable, changes will not impact ABI.
> 
> [1] http://patches.dpdk.org/cover/62703/
> 
> v3:
> Series rebased. (David)
> 
> Gavin Hu (1):
>   spinlock: use wfe to reduce contention on aarch64
> 
> Ruifeng Wang (1):
>   ring: use wfe to wait for ring tail update on aarch64
> 
>  lib/eal/include/generic/rte_spinlock.h | 4 ++--
>  lib/ring/rte_ring_c11_pvt.h            | 4 ++--
>  lib/ring/rte_ring_generic_pvt.h        | 3 +--
>  3 files changed, 5 insertions(+), 6 deletions(-)
> 

Other places that should use WFE:

rte_mcslock.h:rte_mcslock_lock()
rte_mcslock.h:rte_mcslock_unlock()

rte_pflock.h:rte_pflock_lock()
rte_rwlock.h:rte_rwlock_read_lock()
rte_rwlock.h:rte_rwlock_write_lock()


You should also introduce rte_wait_while_XXX variants to handle some
of these cases.




^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [dpdk-announce] DPDK 20.11.2 released
@ 2021-07-07 12:37  1% Xueming(Steven) Li
  0 siblings, 0 replies; 200+ results
From: Xueming(Steven) Li @ 2021-07-07 12:37 UTC (permalink / raw)
  To: announce

Hi all,

Here is a new stable release:
	https://fast.dpdk.org/rel/dpdk-20.11.2.tar.xz

The git tree is at:
	https://git.dpdk.org/dpdk-stable/log/?h=20.11

Special thanks to Luca for his great help on this version!

Xueming Li <xuemingl@nvidia.com>

---
 .ci/linux-build.sh                                 |  59 ++-
 .github/workflows/build.yml                        | 130 +++++
 .travis.yml                                        |  51 +-
 MAINTAINERS                                        |   1 +
 VERSION                                            |   2 +-
 app/meson.build                                    |   4 -
 app/test-bbdev/test_bbdev_perf.c                   |   7 +-
 app/test-compress-perf/comp_perf_options_parse.c   |   2 +-
 app/test-crypto-perf/cperf_options_parsing.c       |   8 +-
 app/test-eventdev/evt_options.c                    |   4 +-
 app/test-eventdev/parser.c                         |   4 +-
 app/test-eventdev/parser.h                         |   2 +-
 app/test-eventdev/test_perf_common.c               |  22 +-
 app/test-flow-perf/main.c                          |  47 +-
 app/test-pmd/bpf_cmd.c                             |   2 +-
 app/test-pmd/cmdline.c                             |  29 +-
 app/test-pmd/cmdline_flow.c                        |   2 +
 app/test-pmd/config.c                              | 108 +++-
 app/test-pmd/parameters.c                          |  39 +-
 app/test-pmd/testpmd.c                             |  35 +-
 app/test-pmd/testpmd.h                             |   3 +-
 app/test-regex/main.c                              |   7 +-
 app/test/autotest_test_funcs.py                    |   5 +-
 app/test/meson.build                               |   3 -
 app/test/packet_burst_generator.c                  |   1 +
 app/test/process.h                                 |  10 +-
 app/test/test.c                                    |  11 +-
 app/test/test_bpf.c                                |   2 +-
 app/test/test_cmdline_ipaddr.c                     |   2 +-
 app/test/test_cmdline_num.c                        |   4 +-
 app/test/test_cryptodev.c                          |  44 +-
 app/test/test_cryptodev_blockcipher.c              |   2 +-
 app/test/test_debug.c                              |  11 +-
 app/test/test_distributor_perf.c                   |   6 +-
 app/test/test_event_timer_adapter.c                |   4 +-
 app/test/test_external_mem.c                       |   3 +-
 app/test/test_flow_classify.c                      |   6 +
 app/test/test_kni.c                                |   8 +-
 app/test/test_mbuf.c                               |   9 +-
 app/test/test_mempool.c                            |   2 +-
 app/test/test_power_cpufreq.c                      |  97 +++-
 app/test/test_prefetch.c                           |   2 +-
 app/test/test_reciprocal_division_perf.c           |  41 +-
 app/test/test_stack.c                              |   4 +
 app/test/test_stack_perf.c                         |   4 +
 app/test/test_table_tables.c                       |   3 +-
 app/test/test_timer_secondary.c                    |   8 +-
 app/test/test_trace_perf.c                         |   5 +-
 buildtools/binutils-avx512-check.sh                |   2 +-
 buildtools/check-symbols.sh                        |   2 +-
 buildtools/list-dir-globs.py                       |   2 +-
 buildtools/map-list-symbol.sh                      |   2 +-
 buildtools/meson.build                             |   2 +-
 config/meson.build                                 |   9 +-
 config/ppc/meson.build                             |  17 +-
 devtools/check-symbol-maps.sh                      |   3 +-
 devtools/checkpatches.sh                           |   3 +-
 doc/api/doxy-api.conf.in                           |   3 +-
 doc/guides/conf.py                                 |  49 +-
 doc/guides/contributing/documentation.rst          |  74 +--
 doc/guides/cryptodevs/caam_jr.rst                  |   2 +-
 doc/guides/cryptodevs/qat.rst                      |   2 +-
 doc/guides/cryptodevs/virtio.rst                   |   2 +-
 doc/guides/eventdevs/dlb2.rst                      |  41 +-
 doc/guides/linux_gsg/linux_drivers.rst             |  10 +
 doc/guides/nics/enic.rst                           |  32 +-
 doc/guides/nics/hns3.rst                           |   6 +-
 doc/guides/nics/i40e.rst                           |   2 +-
 doc/guides/nics/ice.rst                            |   2 +-
 doc/guides/nics/netvsc.rst                         |   2 +-
 doc/guides/nics/nfp.rst                            |  10 +-
 doc/guides/nics/virtio.rst                         |   5 +-
 doc/guides/nics/vmxnet3.rst                        |   3 +-
 doc/guides/prog_guide/vhost_lib.rst                |  12 +
 doc/guides/rel_notes/known_issues.rst              |  10 +-
 doc/guides/rel_notes/release_20_05.rst             |   7 +
 doc/guides/rel_notes/release_20_11.rst             | 556 +++++++++++++++++++++
 doc/guides/sample_app_ug/vhost.rst                 |   2 +-
 doc/guides/testpmd_app_ug/run_app.rst              |  10 +-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst        |   3 +-
 drivers/bus/dpaa/base/fman/fman_hw.c               |  33 +-
 drivers/bus/dpaa/base/fman/netcfg_layer.c          |   4 +-
 drivers/bus/dpaa/base/qbman/bman_driver.c          |  13 +-
 drivers/bus/dpaa/base/qbman/qman_driver.c          |  17 +-
 drivers/bus/dpaa/include/fsl_qman.h                |   2 +-
 drivers/bus/dpaa/include/netcfg.h                  |   1 -
 drivers/bus/fslmc/fslmc_logs.h                     |   2 -
 drivers/bus/fslmc/qbman/include/compat.h           |   3 -
 drivers/bus/fslmc/qbman/qbman_portal.c             |  14 +-
 drivers/bus/pci/linux/pci_uio.c                    |  12 +
 drivers/bus/pci/rte_bus_pci.h                      |  13 +-
 drivers/bus/pci/windows/pci.c                      |  28 +-
 drivers/common/dpaax/caamflib/compat.h             |  12 +-
 drivers/common/dpaax/compat.h                      |   5 -
 drivers/common/dpaax/dpaax_iova_table.c            |   4 +-
 drivers/common/dpaax/meson.build                   |   1 -
 drivers/common/iavf/virtchnl.h                     |   6 +-
 drivers/common/mlx5/linux/mlx5_glue.c              |  18 +
 drivers/common/mlx5/linux/mlx5_glue.h              |   2 +
 drivers/common/mlx5/mlx5_common.c                  |   9 +-
 drivers/common/mlx5/mlx5_devx_cmds.c               | 140 +++++-
 drivers/common/mlx5/mlx5_devx_cmds.h               |  16 +
 drivers/common/mlx5/mlx5_prm.h                     | 155 +++++-
 drivers/common/mlx5/version.map                    |   5 +-
 drivers/common/octeontx2/otx2_mbox.h               |   7 +
 drivers/common/qat/qat_device.h                    |   2 +-
 drivers/common/sfc_efx/base/ef10_filter.c          |  11 +-
 drivers/common/sfc_efx/base/ef10_nic.c             |  10 +-
 drivers/common/sfc_efx/base/efx_mae.c              |  61 ++-
 drivers/common/sfc_efx/base/efx_mcdi.c             |  10 +
 drivers/common/sfc_efx/base/efx_pci.c              |   3 +-
 drivers/common/sfc_efx/base/rhead_nic.c            |   1 -
 drivers/compress/qat/qat_comp.c                    |   7 +-
 drivers/compress/qat/qat_comp_pmd.c                | 111 ++--
 drivers/crypto/bcmfs/bcmfs_logs.c                  |  17 +-
 drivers/crypto/dpaa2_sec/dpaa2_sec_dpseci.c        |  50 +-
 drivers/crypto/dpaa_sec/dpaa_sec.c                 |  14 +
 drivers/crypto/octeontx/otx_cryptodev_ops.c        |   4 +-
 drivers/crypto/octeontx2/otx2_cryptodev_qp.h       |   4 +-
 drivers/crypto/qat/qat_sym.c                       |  10 +-
 drivers/crypto/zuc/rte_zuc_pmd.c                   |   8 +-
 drivers/event/dlb/dlb.c                            |   2 +-
 drivers/event/dlb/pf/dlb_pf.c                      |   3 +-
 drivers/event/dlb2/dlb2.c                          |   2 +-
 drivers/event/dlb2/dlb2_priv.h                     |   3 -
 drivers/event/dlb2/pf/dlb2_pf.c                    |   3 +-
 drivers/event/dpaa2/dpaa2_eventdev_logs.h          |   2 -
 drivers/event/octeontx2/otx2_evdev.c               |  65 ++-
 drivers/event/octeontx2/otx2_evdev_adptr.c         |   2 +-
 drivers/event/octeontx2/otx2_evdev_crypto_adptr.c  | 110 ++--
 drivers/meson.build                                |   2 +-
 drivers/net/af_xdp/rte_eth_af_xdp.c                |   3 +-
 drivers/net/ark/ark_ethdev.c                       |   3 +
 drivers/net/ark/ark_ethdev_rx.c                    |  49 +-
 drivers/net/ark/ark_pktdir.c                       |   2 +-
 drivers/net/ark/ark_pktdir.h                       |   2 +-
 drivers/net/atlantic/atl_ethdev.c                  |   7 +-
 drivers/net/bnx2x/bnx2x.h                          |  13 +-
 drivers/net/bnx2x/bnx2x_rxtx.c                     |  13 +-
 drivers/net/bnxt/bnxt.h                            |  23 +-
 drivers/net/bnxt/bnxt_cpr.h                        |   4 +
 drivers/net/bnxt/bnxt_ethdev.c                     | 514 +++++++++++++++----
 drivers/net/bnxt/bnxt_flow.c                       |  56 ++-
 drivers/net/bnxt/bnxt_hwrm.c                       | 185 ++++---
 drivers/net/bnxt/bnxt_hwrm.h                       |   9 +-
 drivers/net/bnxt/bnxt_reps.c                       |   4 +-
 drivers/net/bnxt/bnxt_rxq.c                        |  33 +-
 drivers/net/bnxt/bnxt_rxr.c                        |  25 +-
 drivers/net/bnxt/bnxt_rxr.h                        |   4 +-
 drivers/net/bnxt/bnxt_stats.c                      |  23 +-
 drivers/net/bnxt/bnxt_stats.h                      |   7 +-
 drivers/net/bnxt/bnxt_txr.c                        |   2 +-
 drivers/net/bnxt/bnxt_util.h                       |   2 +
 drivers/net/bnxt/bnxt_vnic.c                       |   4 +-
 drivers/net/bnxt/bnxt_vnic.h                       |   4 +-
 drivers/net/bonding/eth_bond_private.h             |   2 +-
 drivers/net/bonding/rte_eth_bond_8023ad.c          |  17 +-
 drivers/net/bonding/rte_eth_bond_api.c             |  26 +-
 drivers/net/bonding/rte_eth_bond_args.c            |   8 +-
 drivers/net/bonding/rte_eth_bond_pmd.c             |   7 +-
 drivers/net/cxgbe/base/common.h                    |  18 +-
 drivers/net/dpaa/dpaa_ethdev.c                     |  26 +-
 drivers/net/dpaa2/dpaa2_ethdev.c                   |  25 +-
 drivers/net/e1000/base/e1000_i210.c                |   2 +
 drivers/net/e1000/e1000_logs.c                     |  49 +-
 drivers/net/e1000/em_ethdev.c                      |  21 +-
 drivers/net/e1000/igb_ethdev.c                     |  33 +-
 drivers/net/e1000/igb_flow.c                       |   2 +-
 drivers/net/e1000/igb_rxtx.c                       |   9 +-
 drivers/net/ena/base/ena_com.c                     |  60 ++-
 drivers/net/ena/base/ena_defs/ena_admin_defs.h     |  85 ++--
 drivers/net/ena/base/ena_eth_com.c                 |  16 +-
 drivers/net/ena/base/ena_plat_dpdk.h               |   9 +-
 drivers/net/ena/ena_ethdev.c                       |  38 +-
 drivers/net/ena/ena_platform.h                     |  12 -
 drivers/net/enic/base/vnic_dev.c                   |   2 +-
 drivers/net/enic/base/vnic_enet.h                  |   1 +
 drivers/net/enic/enic.h                            |   4 +-
 drivers/net/enic/enic_ethdev.c                     |  85 ++--
 drivers/net/enic/enic_fm_flow.c                    |   6 +-
 drivers/net/enic/enic_main.c                       | 161 +++---
 drivers/net/enic/enic_res.c                        |   7 +-
 drivers/net/failsafe/failsafe_ops.c                |  10 +-
 drivers/net/hinic/base/hinic_compat.h              |  25 +-
 drivers/net/hinic/hinic_pmd_ethdev.c               |   5 +
 drivers/net/hns3/hns3_cmd.c                        |  24 +-
 drivers/net/hns3/hns3_cmd.h                        |  21 +-
 drivers/net/hns3/hns3_dcb.c                        | 109 ++--
 drivers/net/hns3/hns3_dcb.h                        |   4 +-
 drivers/net/hns3/hns3_ethdev.c                     | 443 +++++++++-------
 drivers/net/hns3/hns3_ethdev.h                     |  46 +-
 drivers/net/hns3/hns3_ethdev_vf.c                  | 121 ++---
 drivers/net/hns3/hns3_fdir.c                       |  52 +-
 drivers/net/hns3/hns3_fdir.h                       |   5 +-
 drivers/net/hns3/hns3_flow.c                       | 112 ++++-
 drivers/net/hns3/hns3_intr.c                       |  73 ++-
 drivers/net/hns3/hns3_intr.h                       |   4 +-
 drivers/net/hns3/hns3_logs.h                       |   2 +-
 drivers/net/hns3/hns3_mbx.c                        | 256 +++++++---
 drivers/net/hns3/hns3_mbx.h                        |  32 +-
 drivers/net/hns3/hns3_mp.c                         |   6 +-
 drivers/net/hns3/hns3_mp.h                         |   2 +-
 drivers/net/hns3/hns3_regs.c                       |   9 +-
 drivers/net/hns3/hns3_regs.h                       |   2 +-
 drivers/net/hns3/hns3_rss.c                        |   2 +-
 drivers/net/hns3/hns3_rss.h                        |   2 +-
 drivers/net/hns3/hns3_rxtx.c                       | 307 +++++++++---
 drivers/net/hns3/hns3_rxtx.h                       |  37 +-
 drivers/net/hns3/hns3_rxtx_vec.c                   |  38 +-
 drivers/net/hns3/hns3_rxtx_vec.h                   |   5 +-
 drivers/net/hns3/hns3_rxtx_vec_neon.h              |   2 +-
 drivers/net/hns3/hns3_rxtx_vec_sve.c               |  34 +-
 drivers/net/hns3/hns3_stats.c                      |  10 +-
 drivers/net/hns3/hns3_stats.h                      |   6 +-
 drivers/net/hns3/meson.build                       |   2 +-
 drivers/net/i40e/base/virtchnl.h                   |  29 +-
 drivers/net/i40e/i40e_ethdev.c                     | 167 +++++--
 drivers/net/i40e/i40e_ethdev.h                     |   5 +-
 drivers/net/i40e/i40e_ethdev_vf.c                  |  95 ++--
 drivers/net/i40e/i40e_fdir.c                       |  89 ++++
 drivers/net/i40e/i40e_flow.c                       | 181 ++++---
 drivers/net/i40e/i40e_pf.c                         |  65 +++
 drivers/net/i40e/i40e_rxtx.c                       |   2 -
 drivers/net/i40e/i40e_rxtx_vec_neon.c              |  20 +-
 drivers/net/iavf/iavf.h                            |   6 +-
 drivers/net/iavf/iavf_ethdev.c                     |  16 +-
 drivers/net/iavf/iavf_rxtx.c                       |   5 +
 drivers/net/iavf/iavf_rxtx.h                       |   2 +-
 drivers/net/iavf/iavf_rxtx_vec_avx2.c              | 120 +----
 drivers/net/iavf/iavf_rxtx_vec_avx512.c            |  13 +-
 drivers/net/iavf/iavf_rxtx_vec_common.h            | 203 ++++++++
 drivers/net/iavf/iavf_vchnl.c                      |  25 +-
 drivers/net/ice/base/ice_flow.c                    |  11 +-
 drivers/net/ice/base/ice_lan_tx_rx.h               |   2 +-
 drivers/net/ice/base/ice_osdep.h                   |   2 +-
 drivers/net/ice/base/ice_switch.c                  |   3 +-
 drivers/net/ice/base/meson.build                   |   5 +
 drivers/net/ice/ice_dcf_parent.c                   |   2 +
 drivers/net/ice/ice_ethdev.c                       |  56 ++-
 drivers/net/ice/ice_hash.c                         |  14 +
 drivers/net/ice/ice_rxtx_vec_avx2.c                | 120 +----
 drivers/net/ice/ice_rxtx_vec_avx512.c              |   5 +-
 drivers/net/ice/ice_rxtx_vec_common.h              | 203 ++++++++
 drivers/net/ice/meson.build                        |   2 +
 drivers/net/igc/igc_ethdev.c                       |  46 +-
 drivers/net/igc/igc_ethdev.h                       |   3 +-
 drivers/net/igc/igc_flow.c                         |   2 +-
 drivers/net/igc/igc_txrx.c                         |  30 +-
 drivers/net/ionic/ionic_ethdev.c                   |  15 +-
 drivers/net/ionic/ionic_lif.c                      |   5 +-
 drivers/net/ixgbe/ixgbe_ethdev.c                   |  22 +-
 drivers/net/kni/rte_eth_kni.c                      |  12 +-
 drivers/net/memif/rte_eth_memif.c                  |   1 +
 drivers/net/memif/rte_eth_memif.h                  |   4 -
 drivers/net/mlx4/mlx4.c                            |   1 +
 drivers/net/mlx4/mlx4_flow.c                       |   3 +-
 drivers/net/mlx4/mlx4_mp.c                         |   2 +-
 drivers/net/mlx4/mlx4_rxtx.c                       |   4 -
 drivers/net/mlx4/mlx4_txq.c                        |  19 +-
 drivers/net/mlx5/linux/mlx5_ethdev_os.c            |   4 +-
 drivers/net/mlx5/linux/mlx5_mp_os.c                |   2 +-
 drivers/net/mlx5/linux/mlx5_os.c                   |  37 +-
 drivers/net/mlx5/linux/mlx5_socket.c               |   4 -
 drivers/net/mlx5/linux/mlx5_verbs.c                | 121 +++++
 drivers/net/mlx5/linux/mlx5_verbs.h                |   2 +
 drivers/net/mlx5/mlx5.c                            |  19 +-
 drivers/net/mlx5/mlx5.h                            |  20 +-
 drivers/net/mlx5/mlx5_devx.c                       |   4 +
 drivers/net/mlx5/mlx5_flow.c                       | 103 ++--
 drivers/net/mlx5/mlx5_flow.h                       |  68 ++-
 drivers/net/mlx5/mlx5_flow_age.c                   |   5 +-
 drivers/net/mlx5/mlx5_flow_dv.c                    | 311 ++++++++----
 drivers/net/mlx5/mlx5_mr.c                         |  11 +
 drivers/net/mlx5/mlx5_rxtx.c                       |  16 +-
 drivers/net/mlx5/mlx5_rxtx.h                       |   1 +
 drivers/net/mlx5/mlx5_rxtx_vec_altivec.h           |  11 +-
 drivers/net/mlx5/mlx5_rxtx_vec_neon.h              |  13 +-
 drivers/net/mlx5/mlx5_rxtx_vec_sse.h               |   9 +-
 drivers/net/mlx5/mlx5_trigger.c                    |  10 +
 drivers/net/mlx5/mlx5_txpp.c                       |   2 +
 drivers/net/nfp/nfp_net.c                          |  26 +-
 drivers/net/octeontx2/otx2_ethdev_ops.c            |   5 +-
 drivers/net/octeontx2/otx2_vlan.c                  |   8 +-
 drivers/net/pcap/rte_eth_pcap.c                    |  12 +-
 drivers/net/qede/base/ecore_int.c                  |   2 +-
 drivers/net/qede/qede_ethdev.c                     |   9 +-
 drivers/net/sfc/sfc_ef100_rx.c                     |  21 +-
 drivers/net/sfc/sfc_ethdev.c                       |   8 -
 drivers/net/sfc/sfc_mae.c                          |  25 +-
 drivers/net/sfc/sfc_mae.h                          |   3 +-
 drivers/net/tap/rte_eth_tap.c                      |   5 +-
 drivers/net/tap/tap_flow.c                         |   8 +-
 drivers/net/tap/tap_intr.c                         |   2 +-
 drivers/net/txgbe/base/txgbe_eeprom.c              |  76 +--
 drivers/net/txgbe/base/txgbe_eeprom.h              |   2 -
 drivers/net/txgbe/base/txgbe_type.h                |   1 +
 drivers/net/txgbe/txgbe_ethdev.c                   |  47 +-
 drivers/net/txgbe/txgbe_ptypes.c                   |   4 +-
 drivers/net/virtio/virtio_rxtx_simple_altivec.c    |  12 +-
 drivers/net/virtio/virtio_rxtx_simple_neon.c       |  12 +-
 drivers/net/virtio/virtio_rxtx_simple_sse.c        |  12 +-
 drivers/net/virtio/virtio_user_ethdev.c            |  75 ++-
 drivers/raw/ifpga/ifpga_rawdev.c                   |   4 +-
 drivers/raw/ifpga/ifpga_rawdev.h                   |   2 +
 drivers/raw/ioat/dpdk_idxd_cfg.py                  |  10 +-
 drivers/raw/ntb/ntb.c                              |  13 +
 drivers/raw/ntb/ntb_hw_intel.c                     |   5 +
 drivers/raw/octeontx2_dma/otx2_dpi_rawdev.c        |   1 +
 drivers/raw/skeleton/skeleton_rawdev_test.c        |   1 +
 drivers/regex/mlx5/mlx5_regex.c                    |   1 +
 drivers/regex/mlx5/mlx5_regex.h                    |   1 +
 drivers/regex/mlx5/mlx5_regex_control.c            |   1 +
 drivers/regex/octeontx2/meson.build                |   1 -
 drivers/vdpa/ifc/base/ifcvf.c                      |   7 +-
 drivers/vdpa/mlx5/mlx5_vdpa.c                      |   3 +
 drivers/vdpa/mlx5/mlx5_vdpa.h                      |   1 +
 drivers/vdpa/mlx5/mlx5_vdpa_event.c                |   2 +
 drivers/vdpa/mlx5/mlx5_vdpa_virtq.c                |   8 +-
 examples/bbdev_app/Makefile                        |   6 +-
 examples/bbdev_app/main.c                          |   5 +-
 examples/bond/Makefile                             |   6 +-
 examples/bond/main.c                               |   4 +
 examples/cmdline/Makefile                          |   6 +-
 examples/cmdline/main.c                            |   3 +
 examples/distributor/Makefile                      |   6 +-
 examples/distributor/main.c                        |   3 +
 examples/ethtool/ethtool-app/Makefile              |   6 +-
 examples/ethtool/ethtool-app/ethapp.c              |   1 -
 examples/ethtool/ethtool-app/main.c                |   3 +
 examples/ethtool/lib/Makefile                      |   6 +-
 examples/eventdev_pipeline/Makefile                |   6 +-
 examples/fips_validation/Makefile                  |   6 +-
 examples/fips_validation/main.c                    |   3 +
 examples/flow_classify/Makefile                    |   6 +-
 examples/flow_classify/flow_classify.c             |   5 +-
 examples/flow_filtering/Makefile                   |   6 +-
 examples/flow_filtering/main.c                     |   7 +-
 examples/helloworld/Makefile                       |   6 +-
 examples/helloworld/main.c                         |   4 +
 examples/ioat/Makefile                             |   6 +-
 examples/ioat/ioatfwd.c                            |   3 +
 examples/ip_fragmentation/Makefile                 |   6 +-
 examples/ip_fragmentation/main.c                   |   3 +
 examples/ip_pipeline/Makefile                      |   6 +-
 examples/ip_reassembly/Makefile                    |   6 +-
 examples/ip_reassembly/main.c                      |   3 +
 examples/ipsec-secgw/Makefile                      |   6 +-
 examples/ipsec-secgw/ipsec-secgw.c                 |   3 +
 examples/ipv4_multicast/Makefile                   |   6 +-
 examples/ipv4_multicast/main.c                     |   3 +
 examples/kni/Makefile                              |   6 +-
 examples/kni/main.c                                |   3 +
 examples/l2fwd-cat/Makefile                        |   6 +-
 examples/l2fwd-cat/l2fwd-cat.c                     |   5 +-
 examples/l2fwd-crypto/Makefile                     |   6 +-
 examples/l2fwd-crypto/main.c                       |  23 +
 examples/l2fwd-event/Makefile                      |   6 +-
 examples/l2fwd-event/main.c                        |   3 +
 examples/l2fwd-jobstats/Makefile                   |   6 +-
 examples/l2fwd-jobstats/main.c                     |   3 +
 examples/l2fwd-keepalive/Makefile                  |   6 +-
 examples/l2fwd-keepalive/ka-agent/Makefile         |   6 +-
 examples/l2fwd-keepalive/main.c                    |   4 +
 examples/l2fwd/Makefile                            |   6 +-
 examples/l2fwd/main.c                              |   3 +
 examples/l3fwd-acl/Makefile                        |   6 +-
 examples/l3fwd-acl/main.c                          |   3 +
 examples/l3fwd-graph/Makefile                      |   6 +-
 examples/l3fwd-graph/main.c                        |   3 +
 examples/l3fwd-power/Makefile                      |   6 +-
 examples/l3fwd-power/main.c                        |   4 +-
 examples/l3fwd/Makefile                            |   6 +-
 examples/l3fwd/l3fwd_lpm.c                         |  26 +-
 examples/l3fwd/main.c                              |   4 +
 examples/link_status_interrupt/Makefile            |   6 +-
 examples/link_status_interrupt/main.c              |   3 +
 examples/meson.build                               |  10 +-
 .../client_server_mp/mp_client/Makefile            |   6 +-
 .../client_server_mp/mp_client/client.c            |   3 +
 .../client_server_mp/mp_server/Makefile            |   6 +-
 .../client_server_mp/mp_server/main.c              |   4 +
 examples/multi_process/hotplug_mp/Makefile         |   6 +-
 examples/multi_process/simple_mp/Makefile          |   6 +-
 examples/multi_process/simple_mp/main.c            |   4 +
 examples/multi_process/symmetric_mp/Makefile       |   6 +-
 examples/multi_process/symmetric_mp/main.c         |   3 +
 examples/ntb/Makefile                              |   6 +-
 examples/ntb/ntb_fwd.c                             |   3 +
 examples/packet_ordering/Makefile                  |   6 +-
 examples/packet_ordering/main.c                    |   6 +-
 examples/performance-thread/l3fwd-thread/Makefile  |   5 +-
 examples/performance-thread/l3fwd-thread/main.c    |   3 +
 examples/performance-thread/pthread_shim/Makefile  |   6 +-
 examples/performance-thread/pthread_shim/main.c    |   4 +
 examples/pipeline/Makefile                         |   6 +-
 examples/pipeline/main.c                           |   3 +
 examples/ptpclient/Makefile                        |   6 +-
 examples/ptpclient/ptpclient.c                     |   7 +-
 examples/qos_meter/Makefile                        |   6 +-
 examples/qos_meter/main.c                          |   3 +
 examples/qos_sched/Makefile                        |   6 +-
 examples/qos_sched/main.c                          |   3 +
 examples/rxtx_callbacks/Makefile                   |   6 +-
 examples/rxtx_callbacks/main.c                     |   6 +-
 examples/server_node_efd/node/Makefile             |   6 +-
 examples/server_node_efd/node/node.c               |   3 +
 examples/server_node_efd/server/Makefile           |   6 +-
 examples/server_node_efd/server/main.c             |   4 +
 examples/service_cores/Makefile                    |   6 +-
 examples/service_cores/main.c                      |   3 +
 examples/skeleton/Makefile                         |   6 +-
 examples/skeleton/basicfwd.c                       |   5 +-
 examples/timer/Makefile                            |   6 +-
 examples/timer/main.c                              |  23 +-
 examples/vdpa/Makefile                             |   6 +-
 examples/vdpa/main.c                               |   3 +
 examples/vhost/Makefile                            |   6 +-
 examples/vhost/main.c                              |  48 +-
 examples/vhost/virtio_net.c                        |   8 +-
 examples/vhost_blk/Makefile                        |   6 +-
 examples/vhost_blk/vhost_blk.c                     |   3 +
 examples/vhost_crypto/Makefile                     |   6 +-
 examples/vhost_crypto/main.c                       |   5 +-
 examples/vm_power_manager/Makefile                 |   6 +-
 examples/vm_power_manager/guest_cli/Makefile       |   6 +-
 examples/vm_power_manager/guest_cli/main.c         |   3 +
 examples/vm_power_manager/main.c                   |   3 +
 examples/vmdq/Makefile                             |   6 +-
 examples/vmdq/main.c                               |   3 +
 examples/vmdq_dcb/Makefile                         |   6 +-
 examples/vmdq_dcb/main.c                           |   3 +
 kernel/linux/kni/kni_net.c                         |  48 +-
 lib/librte_acl/acl_run_avx512_common.h             |  24 +
 lib/librte_bpf/bpf_validate.c                      |   2 +-
 lib/librte_eal/arm/rte_cpuflags.c                  |   2 +-
 lib/librte_eal/common/eal_common_fbarray.c         |   7 +-
 lib/librte_eal/common/eal_common_options.c         |  12 +-
 lib/librte_eal/common/eal_common_proc.c            |  27 +-
 lib/librte_eal/common/eal_common_thread.c          |  66 +--
 lib/librte_eal/common/malloc_mp.c                  |   4 +-
 lib/librte_eal/freebsd/eal.c                       |   4 +
 lib/librte_eal/freebsd/include/rte_os.h            |   6 +-
 lib/librte_eal/include/rte_eal_paging.h            |   2 +-
 lib/librte_eal/include/rte_lcore.h                 |   8 +
 lib/librte_eal/include/rte_reciprocal.h            |   8 +
 lib/librte_eal/include/rte_service.h               |   5 +-
 lib/librte_eal/include/rte_vfio.h                  |   7 +-
 lib/librte_eal/linux/eal.c                         |   4 +
 lib/librte_eal/linux/eal_log.c                     |   6 +-
 lib/librte_eal/linux/eal_memalloc.c                |  14 +-
 lib/librte_eal/linux/eal_vfio.c                    |  98 ++--
 lib/librte_eal/linux/eal_vfio.h                    |   1 +
 lib/librte_eal/linux/include/rte_os.h              |   8 +-
 lib/librte_eal/unix/eal_file.c                     |   1 +
 lib/librte_eal/unix/eal_unix_memory.c              |  11 +-
 lib/librte_eal/version.map                         |   1 -
 lib/librte_eal/windows/eal.c                       |   4 +
 lib/librte_eal/windows/eal_hugepages.c             |   4 +
 lib/librte_eal/windows/eal_memory.c                |   2 +-
 lib/librte_eal/windows/eal_thread.c                |   4 +-
 lib/librte_eal/windows/include/pthread.h           |  16 +-
 lib/librte_eal/windows/include/rte_os.h            |   5 +-
 lib/librte_eal/windows/include/sched.h             |   1 +
 lib/librte_ethdev/rte_ethdev.c                     |  14 +-
 lib/librte_ethdev/rte_ethdev.h                     |   5 +
 lib/librte_ethdev/rte_flow.h                       |   4 +-
 lib/librte_eventdev/rte_event_crypto_adapter.c     |   1 +
 lib/librte_eventdev/rte_event_eth_rx_adapter.c     |   5 +-
 lib/librte_ip_frag/rte_ipv4_fragmentation.c        |  34 +-
 lib/librte_kni/rte_kni.c                           |   7 +-
 lib/librte_kni/rte_kni_common.h                    |   1 +
 lib/librte_mbuf/rte_mbuf_dyn.c                     |  10 +-
 lib/librte_net/rte_ip.h                            |   2 +-
 lib/librte_pipeline/rte_swx_pipeline.c             | 494 ++++++++++++++----
 lib/librte_power/guest_channel.c                   |  22 +-
 lib/librte_power/power_acpi_cpufreq.c              |   5 +-
 lib/librte_power/power_pstate_cpufreq.c            |   5 +-
 lib/librte_power/rte_power_guest_channel.h         |   8 -
 lib/librte_power/version.map                       |   2 -
 lib/librte_sched/rte_sched.c                       |   2 +-
 lib/librte_stack/rte_stack.c                       |   4 +-
 lib/librte_stack/rte_stack.h                       |   3 +-
 lib/librte_stack/rte_stack_lf.h                    |   5 +
 lib/librte_table/rte_swx_table_em.c                |   6 +-
 lib/librte_telemetry/rte_telemetry.h               |   4 +
 lib/librte_telemetry/telemetry.c                   |   2 +
 lib/librte_vhost/rte_vhost.h                       |   1 +
 lib/librte_vhost/socket.c                          |   5 +-
 lib/librte_vhost/vhost.c                           |   8 +-
 lib/librte_vhost/vhost.h                           |  14 +-
 lib/librte_vhost/vhost_user.c                      |   3 -
 lib/librte_vhost/virtio_net.c                      | 213 ++++++--
 license/README                                     |   4 +-
 meson.build                                        |   2 +-
 494 files changed, 7573 insertions(+), 3561 deletions(-)

Adam Dybkowski (3):
      common/qat: increase IM buffer size for GEN3
      compress/qat: enable compression on GEN3
      crypto/qat: fix null authentication request

Ajit Khaparde (7):
      net/bnxt: fix RSS context cleanup
      net/bnxt: check kvargs parsing
      net/bnxt: fix resource cleanup
      doc: fix formatting in testpmd guide
      net/bnxt: fix mismatched type comparison in MAC restore
      net/bnxt: check PCI config read
      net/bnxt: fix mismatched type comparison in Rx

Alvin Zhang (11):
      net/ice: fix VLAN filter with PF
      net/i40e: fix input set field mask
      net/igc: fix Rx RSS hash offload capability
      net/igc: fix Rx error counter for bad length
      net/e1000: fix Rx error counter for bad length
      net/e1000: fix max Rx packet size
      net/igc: fix Rx packet size
      net/ice: fix fast mbuf freeing
      net/iavf: fix VF to PF command failure handling
      net/i40e: fix VF RSS configuration
      net/igc: fix speed configuration

Anatoly Burakov (3):
      fbarray: fix log message on truncation error
      power: do not skip saving original P-state governor
      power: save original ACPI governor always

Andrew Boyer (1):
      net/ionic: fix completion type in lif init

Andrew Rybchenko (4):
      net/failsafe: fix RSS hash offload reporting
      net/failsafe: report minimum and maximum MTU
      common/sfc_efx: remove GENEVE from supported tunnels
      net/sfc: fix mark support in EF100 native Rx datapath

Andy Moreton (2):
      common/sfc_efx/base: limit reported MCDI response length
      common/sfc_efx/base: add missing MCDI response length checks

Ankur Dwivedi (1):
      crypto/octeontx: fix session-less mode

Apeksha Gupta (1):
      examples/l2fwd-crypto: skip masked devices

Arek Kusztal (1):
      crypto/qat: fix offset for out-of-place scatter-gather

Beilei Xing (1):
      net/i40evf: fix packet loss for X722

Bing Zhao (1):
      net/mlx5: fix loopback for Direct Verbs queue

Bruce Richardson (2):
      build: exclude meson files from examples installation
      raw/ioat: fix script for configuring small number of queues

Chaoyong He (1):
      doc: fix multiport syntax in nfp guide

Chenbo Xia (1):
      examples/vhost: check memory table query

Chengchang Tang (20):
      net/hns3: fix HW buffer size on MTU update
      net/hns3: fix processing Tx offload flags
      net/hns3: fix Tx checksum for UDP packets with special port
      net/hns3: fix long task queue pairs reset time
      ethdev: validate input in module EEPROM dump
      ethdev: validate input in register info
      ethdev: validate input in EEPROM info
      net/hns3: fix rollback after setting PVID failure
      net/hns3: fix timing in resetting queues
      net/hns3: fix queue state when concurrent with reset
      net/hns3: fix configure FEC when concurrent with reset
      net/hns3: fix use of command status enumeration
      examples: add eal cleanup to examples
      net/bonding: fix adding itself as its slave
      net/hns3: fix timing in mailbox
      app/testpmd: fix max queue number for Tx offloads
      net/tap: fix interrupt vector array size
      net/bonding: fix socket ID check
      net/tap: check ioctl on restore
      examples/timer: fix time interval

Chengwen Feng (50):
      net/hns3: fix flow counter value
      net/hns3: fix VF mailbox head field
      net/hns3: support get device version when dump register
      net/hns3: fix some packet types
      net/hns3: fix missing outer L4 UDP flag for VXLAN
      net/hns3: remove VLAN/QinQ ptypes from support list
      test: check thread creation
      common/dpaax: fix possible null pointer access
      examples/ethtool: remove unused parsing
      net/hns3: fix flow director lock
      net/e1000/base: fix timeout for shadow RAM write
      net/hns3: fix setting default MAC address in bonding of VF
      net/hns3: fix possible mismatched response of mailbox
      net/hns3: fix VF handling LSC event in secondary process
      net/hns3: fix verification of NEON support
      mbuf: check shared memory before dumping dynamic space
      eventdev: remove redundant thread name setting
      eventdev: fix memory leakage on thread creation failure
      net/kni: check init result
      net/hns3: fix mailbox error message
      net/hns3: fix processing link status message on PF
      net/hns3: remove unused mailbox macro and struct
      net/bonding: fix leak on remove
      net/hns3: fix handling link update
      net/i40e: fix negative VEB index
      net/i40e: remove redundant VSI check in Tx queue setup
      net/virtio: fix getline memory leakage
      net/hns3: log time delta in decimal format
      net/hns3: fix time delta calculation
      net/hns3: remove unused macros
      net/hns3: fix vector Rx burst limitation
      net/hns3: remove read when enabling TM QCN error event
      net/hns3: remove unused VMDq code
      net/hns3: increase readability in logs
      raw/ntb: check SPAD user index
      raw/ntb: check memory allocations
      ipc: check malloc sync reply result
      eal: fix service core list parsing
      ipc: use monotonic clock
      net/hns3: return error on PCI config write failure
      net/hns3: fix log on flow director clear
      net/hns3: clear hash map on flow director clear
      net/hns3: fix querying flow director counter for out param
      net/hns3: fix TM QCN error event report by MSI-X
      net/hns3: fix mailbox message ID in log
      net/hns3: fix secondary process request start/stop Rx/Tx
      net/hns3: fix ordering in secondary process initialization
      net/hns3: fail setting FEC if one bit mode is not supported
      net/mlx4: fix secondary process initialization ordering
      net/mlx5: fix secondary process initialization ordering

Ciara Loftus (1):
      net/af_xdp: fix error handling during Rx queue setup

Ciara Power (2):
      telemetry: fix race on callbacks list
      test/crypto: fix return value of a skipped test

Conor Walsh (1):
      examples/l3fwd: fix LPM IPv6 subnets

Cristian Dumitrescu (3):
      table: fix actions with different data size
      pipeline: fix instruction translation
      pipeline: fix endianness conversions

Dapeng Yu (3):
      net/igc: remove MTU setting limitation
      net/e1000: remove MTU setting limitation
      examples/packet_ordering: fix port configuration

David Christensen (1):
      config/ppc: reduce number of cores and NUMA nodes

David Harton (1):
      net/ena: fix releasing Tx ring mbufs

David Hunt (4):
      test/power: fix CPU frequency check
      test/power: add turbo mode to frequency check
      test/power: fix low frequency test when turbo enabled
      test/power: fix turbo test

David Marchand (18):
      doc: fix sphinx rtd theme import in GHA
      service: clean references to removed symbol
      eal: fix evaluation of log level option
      ci: hook to GitHub Actions
      ci: enable v21 ABI checks
      ci: fix package installation in GitHub Actions
      ci: ignore APT update failure in GitHub Actions
      ci: catch coredumps
      vhost: fix offload flags in Rx path
      bus/fslmc: remove unused debug macro
      eal: fix leak in shared lib mode detection
      event/dpaa2: remove unused macros
      net/ice/base: fix memory allocation wrapper
      net/ice: fix leak on thread termination
      devtools: fix orphan symbols check with busybox
      net/vhost: restore pseudo TSO support
      net/ark: fix leak on thread termination
      build: fix drivers selection without Python

Dekel Peled (1):
      common/mlx5: fix DevX read output buffer size

Dmitry Kozlyuk (4):
      net/pcap: fix format string
      eal/windows: add missing SPDX license tag
      buildtools: fix all drivers disabled on Windows
      examples/rxtx_callbacks: fix port ID format specifier

Ed Czeck (2):
      net/ark: update packet director initial state
      net/ark: refactor Rx buffer recovery

Elad Nachman (2):
      kni: support async user request
      kni: fix kernel deadlock with bifurcated device

Feifei Wang (2):
      net/i40e: fix parsing packet type for NEON
      test/trace: fix race on collected perf data

Ferruh Yigit (9):
      power: remove duplicated symbols from map file
      log/linux: make default output stderr
      license: fix typos
      drivers/net: fix FW version query
      net/bnx2x: fix build with GCC 11
      net/bnx2x: fix build with GCC 11
      net/ice/base: fix build with GCC 11
      net/tap: fix build with GCC 11
      test/table: fix build with GCC 11

Gregory Etelson (2):
      app/testpmd: fix tunnel offload flows cleanup
      net/mlx5: fix tunnel offload private items location

Guoyang Zhou (1):
      net/hinic: fix crash in secondary process

Haiyue Wang (1):
      net/ixgbe: fix Rx errors statistics for UDP checksum

Harman Kalra (1):
      event/octeontx2: fix device reconfigure for single slot

Heinrich Kuhn (1):
      net/nfp: fix reporting of RSS capabilities

Hemant Agrawal (3):
      ethdev: add missing buses in device iterator
      crypto/dpaa_sec: affine the thread portal affinity
      crypto/dpaa2_sec: fix close and uninit functions

Hongbo Zheng (9):
      app/testpmd: fix Tx/Rx descriptor query error log
      net/hns3: fix FLR miss detection
      net/hns3: delete redundant blank line
      bpf: fix JSLT validation
      common/sfc_efx/base: fix dereferencing null pointer
      power: fix sanity checks for guest channel read
      net/hns3: fix VF alive notification after config restore
      examples/l3fwd-power: fix empty poll thresholds
      net/hns3: fix concurrent interrupt handling

Huisong Li (23):
      net/hns3: fix device capabilities for copper media type
      net/hns3: remove unused parameter markers
      net/hns3: fix reporting undefined speed
      net/hns3: fix link update when failed to get link info
      net/hns3: fix flow control exception
      app/testpmd: fix bitmap of link speeds when force speed
      net/hns3: fix flow control mode
      net/hns3: remove redundant mailbox response
      net/hns3: fix DCB mode check
      net/hns3: fix VMDq mode check
      net/hns3: fix mbuf leakage
      net/hns3: fix link status when port is stopped
      net/hns3: fix link speed when port is down
      app/testpmd: fix forward lcores number for DCB
      app/testpmd: fix DCB forwarding configuration
      app/testpmd: fix DCB re-configuration
      app/testpmd: verify DCB config during forward config
      net/hns3: fix Rx/Tx queue numbers check
      net/hns3: fix requested FC mode rollback
      net/hns3: remove meaningless packet buffer rollback
      net/hns3: fix DCB configuration
      net/hns3: fix DCB reconfiguration
      net/hns3: fix link speed when VF device is down

Ibtisam Tariq (1):
      examples/vhost_crypto: remove unused short option

Igor Chauskin (2):
      net/ena: switch memcpy to optimized version
      net/ena: fix parsing of large LLQ header device argument

Igor Russkikh (2):
      net/qede: reduce log verbosity
      net/qede: accept bigger RSS table

Ilya Maximets (1):
      net/virtio: fix interrupt unregistering for listening socket

Ivan Malov (5):
      net/sfc: fix buffer size for flow parse
      net: fix comment in IPv6 header
      net/sfc: fix error path inconsistency
      common/sfc_efx/base: fix indication of MAE encap support
      net/sfc: fix outer rule rollback on error

Jerin Jacob (1):
      examples: fix pkg-config override

Jiawei Wang (4):
      app/testpmd: fix NVGRE encap configuration
      net/mlx5: fix resource release for mirror flow
      net/mlx5: fix RSS flow item expansion for GRE key
      net/mlx5: fix RSS flow item expansion for NVGRE

Jiawei Zhu (1):
      net/mlx5: fix Rx segmented packets on mbuf starvation

Jiawen Wu (4):
      net/txgbe: remove unused functions
      net/txgbe: fix Rx missed packet counter
      net/txgbe: update packet type
      net/txgbe: fix QinQ strip

Jiayu Hu (2):
      vhost: fix queue initialization
      vhost: fix redundant vring status change notification

Jie Wang (1):
      net/ice: fix VSI array out of bounds access

John Daley (2):
      net/enic: fix flow initialization error handling
      net/enic: enable GENEVE offload via VNIC configuration

Juraj Linkeš (1):
      eal/arm64: fix platform register bit

Kai Ji (2):
      test/crypto: fix auth-cipher compare length in OOP
      test/crypto: copy offset data to OOP destination buffer

Kalesh AP (23):
      net/bnxt: remove unused macro
      net/bnxt: fix VNIC configuration
      net/bnxt: fix firmware fatal error handling
      net/bnxt: fix FW readiness check during recovery
      net/bnxt: fix device readiness check
      net/bnxt: fix VF info allocation
      net/bnxt: fix HWRM and FW incompatibility handling
      net/bnxt: mute some failure logs
      app/testpmd: check MAC address query
      net/bnxt: fix PCI write check
      net/bnxt: fix link state operations
      net/bnxt: fix timesync when PTP is not supported
      net/bnxt: fix memory allocation for command response
      net/bnxt: fix double free in port start failure
      net/bnxt: fix configuring LRO
      net/bnxt: fix health check alarm cancellation
      net/bnxt: fix PTP support for Thor
      net/bnxt: fix ring count calculation for Thor
      net/bnxt: remove unnecessary forward declarations
      net/bnxt: remove unused function parameters
      net/bnxt: drop unused attribute
      net/bnxt: fix single PF per port check
      net/bnxt: prevent device access in error state

Kamil Vojanec (1):
      net/mlx5/linux: fix firmware version

Kevin Traynor (5):
      test/cmdline: fix inputs array
      test/crypto: fix build with GCC 11
      crypto/zuc: fix build with GCC 11
      test: fix build with GCC 11
      test/cmdline: silence clang 12 warning

Konstantin Ananyev (1):
      acl: fix build with GCC 11

Lance Richardson (8):
      net/bnxt: fix Rx buffer posting
      net/bnxt: fix Tx length hint threshold
      net/bnxt: fix handling of null flow mask
      test: fix TCP header initialization
      net/bnxt: fix Rx descriptor status
      net/bnxt: fix Rx queue count
      net/bnxt: fix dynamic VNIC count
      eal: fix memory mapping on 32-bit target

Leyi Rong (1):
      net/iavf: fix packet length parsing in AVX512

Li Zhang (1):
      net/mlx5: fix flow actions index in cache

Luc Pelletier (2):
      eal: fix race in control thread creation
      eal: fix hang in control thread creation

Marvin Liu (5):
      vhost: fix split ring potential buffer overflow
      vhost: fix packed ring potential buffer overflow
      vhost: fix batch dequeue potential buffer overflow
      vhost: fix initialization of temporary header
      vhost: fix initialization of async temporary header

Matan Azrad (5):
      common/mlx5/linux: add glue function to query WQ
      common/mlx5: add DevX command to query WQ
      common/mlx5: add DevX commands for queue counters
      vdpa/mlx5: fix virtq cleaning
      vdpa/mlx5: fix device unplug

Michael Baum (1):
      net/mlx5: fix flow age event triggering

Michal Krawczyk (5):
      net/ena/base: improve style and comments
      net/ena/base: fix type conversions by explicit casting
      net/ena/base: destroy multiple wait events
      net/ena: fix crash with unsupported device argument
      net/ena: indicate Rx RSS hash presence

Min Hu (Connor) (25):
      net/hns3: fix MTU config complexity
      net/hns3: update HiSilicon copyright syntax
      net/hns3: fix copyright date
      examples/ptpclient: remove wrong comment
      test/bpf: fix error message
      doc: fix HiSilicon copyright syntax
      net/hns3: remove unused macros
      net/hns3: remove unused macro
      app/eventdev: fix overflow in lcore list parsing
      test/kni: fix a comment
      test/kni: check init result
      net/hns3: fix typos on comments
      net/e1000: fix flow error message object
      app/testpmd: fix division by zero on socket memory dump
      net/kni: warn on stop failure
      app/bbdev: check memory allocation
      app/bbdev: fix HARQ error messages
      raw/skeleton: add missing check after setting attribute
      test/timer: check memzone allocation
      app/crypto-perf: check memory allocation
      examples/flow_classify: fix NUMA check of port and core
      examples/l2fwd-cat: fix NUMA check of port and core
      examples/skeleton: fix NUMA check of port and core
      test: check flow classifier creation
      test: fix division by zero

Murphy Yang (3):
      net/ixgbe: fix RSS RETA being reset after port start
      net/i40e: fix flow director config after flow validate
      net/i40e: fix flow director for common pctypes

Natanael Copa (5):
      common/dpaax/caamflib: fix build with musl
      bus/dpaa: fix 64-bit arch detection
      bus/dpaa: fix build with musl
      net/cxgbe: remove use of uint type
      app/testpmd: fix build with musl

Nipun Gupta (1):
      bus/dpaa: fix statistics reading

Nithin Dabilpuram (3):
      vfio: do not merge contiguous areas
      vfio: fix DMA mapping granularity for IOVA as VA
      test/mem: fix page size for external memory

Olivier Matz (1):
      test/mempool: fix object initializer

Pallavi Kadam (1):
      bus/pci: skip probing some Windows NDIS devices

Pavan Nikhilesh (4):
      test/event: fix timeout accuracy
      app/eventdev: fix timeout accuracy
      app/eventdev: fix lcore parsing skipping last core
      event/octeontx2: fix XAQ pool reconfigure

Pu Xu (1):
      ip_frag: fix fragmenting IPv4 packet with header option

Qi Zhang (8):
      net/ice/base: fix payload indicator on ptype
      net/ice/base: fix uninitialized struct
      net/ice/base: cleanup filter list on error
      net/ice/base: fix memory allocation for MAC addresses
      net/iavf: fix TSO max segment size
      doc: fix matching versions in ice guide
      net/iavf: fix wrong Tx context descriptor
      common/iavf: fix duplicated offload bit

Radha Mohan Chintakuntla (1):
      raw/octeontx2_dma: assign PCI device in DPI VF

Raslan Darawsheh (1):
      ethdev: update flow item GTP QFI definition

Richael Zhuang (2):
      test/power: add delay before checking CPU frequency
      test/power: round CPU frequency to check

Robin Zhang (6):
      net/i40e: announce request queue capability in PF
      doc: update recommended versions for i40e
      net/i40e: fix lack of MAC type when set MAC address
      net/iavf: fix lack of MAC type when set MAC address
      net/iavf: fix primary MAC type when starting port
      net/i40e: fix primary MAC type when starting port

Rohit Raj (3):
      net/dpaa2: fix getting link status
      net/dpaa: fix getting link status
      examples/l2fwd-crypto: fix packet length while decryption

Roy Shterman (1):
      mem: fix freeing segments in --huge-unlink mode

Satheesh Paul (1):
      net/octeontx2: fix VLAN filter

Savinay Dharmappa (1):
      sched: fix traffic class oversubscription parameter

Shijith Thotton (3):
      eventdev: fix case to initiate crypto adapter service
      event/octeontx2: fix crypto adapter queue pair operations
      event/octeontx2: configure crypto adapter xaq pool

Siwar Zitouni (1):
      net/ice: fix disabling promiscuous mode

Somnath Kotur (5):
      net/bnxt: fix xstats get
      net/bnxt: fix Rx and Tx timestamps
      net/bnxt: fix Tx timestamp init
      net/bnxt: refactor multi-queue Rx configuration
      net/bnxt: fix Rx timestamp when FIFO pending bit is set

Stanislaw Kardach (6):
      test: proceed if timer subsystem already initialized
      stack: allow lock-free only on relevant architectures
      test/distributor: fix worker notification in burst mode
      test/distributor: fix burst flush on worker quit
      net/ena: remove endian swap functions
      net/ena: report default ring size

Stephen Hemminger (2):
      kni: refactor user request processing
      net/bnxt: use prefix on global function

Suanming Mou (1):
      net/mlx5: fix counter offset detection

Tal Shnaiderman (2):
      eal/windows: fix default thread priority
      eal/windows: fix return codes of pthread shim layer

Tengfei Zhang (1):
      net/pcap: fix file descriptor leak on close

Thinh Tran (1):
      test: fix autotest handling of skipped tests

Thomas Monjalon (18):
      bus/pci: fix Windows kernel driver categories
      eal: fix comment of OS-specific header files
      buildtools: fix build with busybox
      build: detect execinfo library on Linux
      build: remove redundant _GNU_SOURCE definitions
      eal: fix build with musl
      net/igc: remove use of uint type
      event/dlb: fix header includes for musl
      examples/bbdev: fix header include for musl
      drivers: fix log level after loading
      app/regex: fix usage text
      app/testpmd: fix usage text
      doc: fix names of UIO drivers
      doc: fix build with Sphinx 4
      bus/pci: support I/O port operations with musl
      app: fix exit messages
      regex/octeontx2: remove unused include directory
      doc: remove PDF requirements

Tianyu Li (1):
      net/memif: fix Tx bps statistics for zero-copy

Timothy McDaniel (2):
      event/dlb2: remove references to deferred scheduling
      doc: fix runtime options in DLB2 guide

Tyler Retzlaff (1):
      eal: add C++ include guard for reciprocal header

Vadim Podovinnikov (1):
      net/bonding: fix LACP system address check

Venkat Duvvuru (1):
      net/bnxt: fix queues per VNIC

Viacheslav Ovsiienko (16):
      net/mlx5: fix external buffer pool registration for Rx queue
      net/mlx5: fix metadata item validation for ingress flows
      net/mlx5: fix hashed list size for tunnel flow groups
      net/mlx5: fix UAR allocation diagnostics messages
      common/mlx5: add timestamp format support to DevX
      vdpa/mlx5: support timestamp format
      net/mlx5: fix Rx metadata leftovers
      net/mlx5: fix drop action for Direct Rules/Verbs
      net/mlx4: fix RSS action with null hash key
      net/mlx5: support timestamp format
      regex/mlx5: support timestamp format
      app/testpmd: fix segment number check
      net/mlx5: remove drop queue function prototypes
      net/mlx4: fix buffer leakage on device close
      net/mlx5: fix probing device in legacy bonding mode
      net/mlx5: fix receiving queue timestamp format

Wei Huang (1):
      raw/ifpga: fix device name format

Wenjun Wu (3):
      net/ice: check some functions return
      net/ice: fix RSS hash update
      net/ice: fix RSS for L2 packet

Wenwu Ma (1):
      net/ice: fix illegal access when removing MAC filter

Wenzhuo Lu (2):
      net/iavf: fix crash in AVX512
      net/ice: fix crash in AVX512

Wisam Jaddo (1):
      app/flow-perf: fix encap/decap actions

Xiao Wang (1):
      vdpa/ifc: check PCI config read

Xiaoyu Min (4):
      net/mlx5: support RSS expansion for IPv6 GRE
      net/mlx5: fix shared inner RSS
      net/mlx5: fix missing shared RSS hash types
      net/mlx5: fix redundant flow after RSS expansion

Xiaoyun Li (2):
      app/testpmd: remove unnecessary UDP tunnel check
      net/i40e: fix IPv4 fragment offload

Xueming Li (4):
      version: 20.11.2-rc1
      net/virtio: fix vectorized Rx queue rearm
      version: 20.11.2-rc2
      version: 20.11.2

Youri Querry (1):
      bus/fslmc: fix random portal hangs with qbman 5.0

Yunjian Wang (5):
      vfio: fix API description
      net/mlx5: fix using flow tunnel before null check
      vfio: fix duplicated user mem map
      net/mlx4: fix leak when configured repeatedly
      net/mlx5: fix leak when configured repeatedly

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [PATCH v7 4/7] power: remove thread safety from PMD power API's
    2021-07-07 10:48  3%             ` [dpdk-dev] [PATCH v7 1/7] power_intrinsics: use callbacks for comparison Anatoly Burakov
@ 2021-07-07 10:48  3%             ` Anatoly Burakov
  1 sibling, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-07-07 10:48 UTC (permalink / raw)
  To: dev, David Hunt; +Cc: konstantin.ananyev, ciara.loftus

Currently, we expect that only one callback can be active at any given
moment, for a particular queue configuration, which is relatively easy
to implement in a thread-safe way. However, we're about to add support
for multiple queues per lcore, which will greatly increase the
possibility of various race conditions.

We could have used something like RCU for this use case, but absent a
pressing need for thread safety we'll go the easy way and just mandate
that the APIs be called only when all affected ports are stopped, and
document this limitation. This greatly simplifies the
`rte_power_monitor`-related code.
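
The "must be stopped" contract described above can be modelled outside of
DPDK. The sketch below is a standalone illustration, not the library code:
`queue_states`, `pmgmt_enabled` and `pmgmt_queue_enable()` are hypothetical
names, and the state table stands in for what the real patch obtains from
`rte_eth_rx_queue_info_get()`. It only shows how a tri-state "is the queue
stopped?" helper maps onto the -EINVAL / -EBUSY return codes.

```c
#include <errno.h>
#include <stdint.h>

#define NB_QUEUES 4 /* hypothetical number of Rx queues in this model */

enum queue_state { QUEUE_STOPPED = 0, QUEUE_STARTED = 1 };

/* Hypothetical per-queue state; the real code queries the ethdev layer. */
static enum queue_state queue_states[NB_QUEUES];
static int pmgmt_enabled[NB_QUEUES];

/* Returns 1 if stopped, 0 if still running, -1 if the queue id is invalid. */
static int
queue_stopped(uint16_t queue_id)
{
	if (queue_id >= NB_QUEUES)
		return -1;
	return queue_states[queue_id] == QUEUE_STOPPED;
}

/* Refuse to change power management state unless the queue is stopped. */
static int
pmgmt_queue_enable(uint16_t queue_id)
{
	int ret = queue_stopped(queue_id);

	if (ret != 1) {
		/* error means invalid queue, 0 means queue wasn't stopped */
		return ret < 0 ? -EINVAL : -EBUSY;
	}
	/* no concurrent data path can run here, so no fences are needed */
	pmgmt_enabled[queue_id] = 1;
	return 0;
}
```

Because the caller guarantees that no data path thread is polling the queue,
the state update needs neither a memory fence nor a wakeup loop, which is the
simplification this patch relies on.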

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Add check for stopped queue
    - Clarified doc message
    - Added release notes

 doc/guides/rel_notes/release_21_08.rst |   4 +
 lib/power/meson.build                  |   3 +
 lib/power/rte_power_pmd_mgmt.c         | 133 ++++++++++---------------
 lib/power/rte_power_pmd_mgmt.h         |   6 ++
 4 files changed, 66 insertions(+), 80 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index c1d063bb11..4b84c89c0b 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -119,6 +119,10 @@ API Changes
 
 * eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
+* rte_power: The experimental PMD power management API is no longer considered
+  to be thread safe; all Rx queues affected by the API will now need to be
+  stopped before making any changes to the power management scheme.
+
 
 ABI Changes
 -----------
diff --git a/lib/power/meson.build b/lib/power/meson.build
index c1097d32f1..4f6a242364 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -21,4 +21,7 @@ headers = files(
         'rte_power_pmd_mgmt.h',
         'rte_power_guest_channel.h',
 )
+if cc.has_argument('-Wno-cast-qual')
+    cflags += '-Wno-cast-qual'
+endif
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c
index db03cbf420..9b95cf1794 100644
--- a/lib/power/rte_power_pmd_mgmt.c
+++ b/lib/power/rte_power_pmd_mgmt.c
@@ -40,8 +40,6 @@ struct pmd_queue_cfg {
 	/**< Callback mode for this queue */
 	const struct rte_eth_rxtx_callback *cur_cb;
 	/**< Callback instance */
-	volatile bool umwait_in_progress;
-	/**< are we currently sleeping? */
 	uint64_t empty_poll_stats;
 	/**< Number of empty polls */
 } __rte_cache_aligned;
@@ -92,30 +90,11 @@ clb_umwait(uint16_t port_id, uint16_t qidx, struct rte_mbuf **pkts __rte_unused,
 			struct rte_power_monitor_cond pmc;
 			uint16_t ret;
 
-			/*
-			 * we might get a cancellation request while being
-			 * inside the callback, in which case the wakeup
-			 * wouldn't work because it would've arrived too early.
-			 *
-			 * to get around this, we notify the other thread that
-			 * we're sleeping, so that it can spin until we're done.
-			 * unsolicited wakeups are perfectly safe.
-			 */
-			q_conf->umwait_in_progress = true;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-			/* check if we need to cancel sleep */
-			if (q_conf->pwr_mgmt_state == PMD_MGMT_ENABLED) {
-				/* use monitoring condition to sleep */
-				ret = rte_eth_get_monitor_addr(port_id, qidx,
-						&pmc);
-				if (ret == 0)
-					rte_power_monitor(&pmc, UINT64_MAX);
-			}
-			q_conf->umwait_in_progress = false;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
+			/* use monitoring condition to sleep */
+			ret = rte_eth_get_monitor_addr(port_id, qidx,
+					&pmc);
+			if (ret == 0)
+				rte_power_monitor(&pmc, UINT64_MAX);
 		}
 	} else
 		q_conf->empty_poll_stats = 0;
@@ -177,12 +156,24 @@ clb_scale_freq(uint16_t port_id, uint16_t qidx,
 	return nb_rx;
 }
 
+static int
+queue_stopped(const uint16_t port_id, const uint16_t queue_id)
+{
+	struct rte_eth_rxq_info qinfo;
+
+	if (rte_eth_rx_queue_info_get(port_id, queue_id, &qinfo) < 0)
+		return -1;
+
+	return qinfo.queue_state == RTE_ETH_QUEUE_STATE_STOPPED;
+}
+
 int
 rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		uint16_t queue_id, enum rte_power_pmd_mgmt_type mode)
 {
 	struct pmd_queue_cfg *queue_cfg;
 	struct rte_eth_dev_info info;
+	rte_rx_callback_fn clb;
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
@@ -203,6 +194,14 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		goto end;
 	}
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		ret = ret < 0 ? -EINVAL : -EBUSY;
+		goto end;
+	}
+
 	queue_cfg = &port_cfg[port_id][queue_id];
 
 	if (queue_cfg->pwr_mgmt_state != PMD_MGMT_DISABLED) {
@@ -232,17 +231,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->umwait_in_progress = false;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* ensure we update our state before callback starts */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_umwait, NULL);
+		clb = clb_umwait;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_SCALE:
@@ -269,16 +258,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id,
-				queue_id, clb_scale_freq, NULL);
+		clb = clb_scale_freq;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_PAUSE:
@@ -286,18 +266,21 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		if (global_data.tsc_per_us == 0)
 			calc_tsc();
 
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_pause, NULL);
+		clb = clb_pause;
 		break;
+	default:
+		RTE_LOG(DEBUG, POWER, "Invalid power management type\n");
+		ret = -EINVAL;
+		goto end;
 	}
+
+	/* initialize data before enabling the callback */
+	queue_cfg->empty_poll_stats = 0;
+	queue_cfg->cb_mode = mode;
+	queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
+	queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
+			clb, NULL);
+
 	ret = 0;
 end:
 	return ret;
@@ -308,12 +291,20 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		uint16_t port_id, uint16_t queue_id)
 {
 	struct pmd_queue_cfg *queue_cfg;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
 	if (lcore_id >= RTE_MAX_LCORE || queue_id >= RTE_MAX_QUEUES_PER_PORT)
 		return -EINVAL;
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		return ret < 0 ? -EINVAL : -EBUSY;
+	}
+
 	/* no need to check queue id as wrong queue id would not be enabled */
 	queue_cfg = &port_cfg[port_id][queue_id];
 
@@ -323,27 +314,8 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 	/* stop any callbacks from progressing */
 	queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
 
-	/* ensure we update our state before continuing */
-	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
 	switch (queue_cfg->cb_mode) {
-	case RTE_POWER_MGMT_TYPE_MONITOR:
-	{
-		bool exit = false;
-		do {
-			/*
-			 * we may request cancellation while the other thread
-			 * has just entered the callback but hasn't started
-			 * sleeping yet, so keep waking it up until we know it's
-			 * done sleeping.
-			 */
-			if (queue_cfg->umwait_in_progress)
-				rte_power_monitor_wakeup(lcore_id);
-			else
-				exit = true;
-		} while (!exit);
-	}
-	/* fall-through */
+	case RTE_POWER_MGMT_TYPE_MONITOR: /* fall-through */
 	case RTE_POWER_MGMT_TYPE_PAUSE:
 		rte_eth_remove_rx_callback(port_id, queue_id,
 				queue_cfg->cur_cb);
@@ -356,10 +328,11 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		break;
 	}
 	/*
-	 * we don't free the RX callback here because it is unsafe to do so
-	 * unless we know for a fact that all data plane threads have stopped.
+	 * the API doc mandates that the user stops all processing on affected
+	 * ports before calling any of these API's, so we can assume that the
+	 * callbacks can be freed. we're intentionally casting away const-ness.
 	 */
-	queue_cfg->cur_cb = NULL;
+	rte_free((void *)queue_cfg->cur_cb);
 
 	return 0;
 }
diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
index 7a0ac24625..444e7b8a66 100644
--- a/lib/power/rte_power_pmd_mgmt.h
+++ b/lib/power/rte_power_pmd_mgmt.h
@@ -43,6 +43,9 @@ enum rte_power_pmd_mgmt_type {
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue will be polled from.
  * @param port_id
@@ -69,6 +72,9 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id,
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue is polled from.
  * @param port_id
-- 
2.25.1


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v7 1/7] power_intrinsics: use callbacks for comparison
  @ 2021-07-07 10:48  3%             ` Anatoly Burakov
  2021-07-07 10:48  3%             ` [dpdk-dev] [PATCH v7 4/7] power: remove thread safety from PMD power API's Anatoly Burakov
  1 sibling, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-07-07 10:48 UTC (permalink / raw)
  To: dev, Timothy McDaniel, Beilei Xing, Jingjing Wu, Qiming Yang,
	Qi Zhang, Haiyue Wang, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Bruce Richardson, Konstantin Ananyev
  Cc: ciara.loftus, david.hunt

Previously, the semantics of power monitor were such that we checked
the current value against the expected value, and if they matched, the
sleep was aborted. This is somewhat inflexible, because it only allows
checking for a specific value in a specific way.

This commit replaces the comparison with a user callback mechanism, so
that any PMD (or other code) using `rte_power_monitor()` can define its
own comparison semantics and decide for itself when entering the power
optimized state should be aborted.

Existing implementations are adjusted to follow the new semantics.
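
As an illustration of the new semantics, here is a minimal standalone model
that does not depend on DPDK. The names `monitor_cond`, `masked_equal_clb()`
and `should_sleep()` are hypothetical stand-ins: the struct mirrors the
fields this patch gives `struct rte_power_monitor_cond`, and
`should_sleep()` only imitates the pre-sleep check, without issuing any
UMONITOR/UMWAIT instructions.

```c
#include <stddef.h>
#include <stdint.h>

#define OPAQUE_SZ 4 /* mirrors RTE_POWER_MONITOR_OPAQUE_SZ from the patch */

/* Callback contract: return 0 to proceed with sleeping, -1 to abort. */
typedef int (*monitor_clb_t)(const uint64_t val,
		const uint64_t opaque[OPAQUE_SZ]);

/* Simplified stand-in for struct rte_power_monitor_cond. */
struct monitor_cond {
	volatile void *addr;        /* address to monitor */
	uint8_t size;               /* 1, 2, 4 or 8 bytes */
	monitor_clb_t fn;           /* abort-check callback */
	uint64_t opaque[OPAQUE_SZ]; /* callback-specific data */
};

enum { MASK_IDX = 0, VAL_IDX = 1 };

/* Reproduces the old val/mask behaviour: abort if the masked value matches. */
static int
masked_equal_clb(const uint64_t val, const uint64_t opaque[OPAQUE_SZ])
{
	return (val & opaque[MASK_IDX]) == opaque[VAL_IDX] ? -1 : 0;
}

/* Read `sz` bytes from the monitored location, widened to 64 bits. */
static uint64_t
read_val(volatile void *p, uint8_t sz)
{
	switch (sz) {
	case 1: return *(volatile uint8_t *)p;
	case 2: return *(volatile uint16_t *)p;
	case 4: return *(volatile uint32_t *)p;
	default: return *(volatile uint64_t *)p;
	}
}

/* Returns 1 if the core would sleep, 0 if aborted, -1 on invalid input. */
static int
should_sleep(const struct monitor_cond *c)
{
	if (c == NULL || c->fn == NULL)
		return -1;
	if (c->size != 1 && c->size != 2 && c->size != 4 && c->size != 8)
		return -1;
	return c->fn(read_val(c->addr, c->size), c->opaque) == 0 ? 1 : 0;
}
```

A driver that needs a different test (for example "not equal", or a check
spanning several bits) only has to supply another callback; the generic code
path stays the same.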

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---

Notes:
    v4:
    - Return error if callback is set to NULL
    - Replace raw number with a macro in monitor condition opaque data
    
    v2:
    - Use callback mechanism for more flexibility
    - Address feedback from Konstantin

 doc/guides/rel_notes/release_21_08.rst        |  2 ++
 drivers/event/dlb2/dlb2.c                     | 17 ++++++++--
 drivers/net/i40e/i40e_rxtx.c                  | 20 +++++++----
 drivers/net/iavf/iavf_rxtx.c                  | 20 +++++++----
 drivers/net/ice/ice_rxtx.c                    | 20 +++++++----
 drivers/net/ixgbe/ixgbe_rxtx.c                | 20 +++++++----
 drivers/net/mlx5/mlx5_rx.c                    | 17 ++++++++--
 .../include/generic/rte_power_intrinsics.h    | 33 +++++++++++++++----
 lib/eal/x86/rte_power_intrinsics.c            | 17 +++++-----
 9 files changed, 122 insertions(+), 44 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index cd02820e68..c1d063bb11 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -117,6 +117,8 @@ API Changes
 * eal: ``rte_strscpy`` sets ``rte_errno`` to ``E2BIG`` in case of string
   truncation.
 
+* eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
+
 
 ABI Changes
 -----------
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index eca183753f..252bbd8d5e 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -3154,6 +3154,16 @@ dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 	}
 }
 
+#define CLB_MASK_IDX 0
+#define CLB_VAL_IDX 1
+static int
+dlb2_monitor_callback(const uint64_t val,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
+{
+	/* abort if the value matches */
+	return (val & opaque[CLB_MASK_IDX]) == opaque[CLB_VAL_IDX] ? -1 : 0;
+}
+
 static inline int
 dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 		  struct dlb2_eventdev_port *ev_port,
@@ -3194,8 +3204,11 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 			expected_value = 0;
 
 		pmc.addr = monitor_addr;
-		pmc.val = expected_value;
-		pmc.mask = qe_mask.raw_qe[1];
+		/* store expected value and comparison mask in opaque data */
+		pmc.opaque[CLB_VAL_IDX] = expected_value;
+		pmc.opaque[CLB_MASK_IDX] = qe_mask.raw_qe[1];
+		/* set up callback */
+		pmc.fn = dlb2_monitor_callback;
 		pmc.size = sizeof(uint64_t);
 
 		rte_power_monitor(&pmc, timeout + start_ticks);
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 8d65f287f4..65f325ede1 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -81,6 +81,18 @@
 #define I40E_TX_OFFLOAD_SIMPLE_NOTSUP_MASK \
 		(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_SIMPLE_SUP_MASK)
 
+static int
+i40e_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -93,12 +105,8 @@ i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = i40e_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index f817fbc49b..d61b32fcee 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -57,6 +57,18 @@ iavf_proto_xtr_type_to_rxdid(uint8_t flex_type)
 				rxdid_map[flex_type] : IAVF_RXDID_COMMS_OVS_1;
 }
 
+static int
+iavf_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -69,12 +81,8 @@ iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = iavf_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
index 3f6e735984..5d7ab4f047 100644
--- a/drivers/net/ice/ice_rxtx.c
+++ b/drivers/net/ice/ice_rxtx.c
@@ -27,6 +27,18 @@ uint64_t rte_net_ice_dynflag_proto_xtr_ipv6_flow_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_tcp_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_ip_offset_mask;
 
+static int
+ice_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -39,12 +51,8 @@ ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.status_error0;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
-	pmc->mask = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/* comparison callback */
+	pmc->fn = ice_monitor_callback;
 
 	/* register is 16-bit */
 	pmc->size = sizeof(uint16_t);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index d69f36e977..c814a28cb4 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1369,6 +1369,18 @@ const uint32_t
 		RTE_PTYPE_INNER_L3_IPV4_EXT | RTE_PTYPE_INNER_L4_UDP,
 };
 
+static int
+ixgbe_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -1381,12 +1393,8 @@ ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.upper.status_error;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
-	pmc->mask = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/* comparison callback */
+	pmc->fn = ixgbe_monitor_callback;
 
 	/* the registers are 32-bit */
 	pmc->size = sizeof(uint32_t);
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 777a1d6e45..17370b77dc 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -269,6 +269,18 @@ mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return rx_queue_count(rxq);
 }
 
+#define CLB_VAL_IDX 0
+#define CLB_MSK_IDX 1
+static int
+mlx_monitor_callback(const uint64_t value,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
+{
+	const uint64_t m = opaque[CLB_MSK_IDX];
+	const uint64_t v = opaque[CLB_VAL_IDX];
+
+	return (value & m) == v ? -1 : 0;
+}
+
 int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
 	struct mlx5_rxq_data *rxq = rx_queue;
@@ -282,8 +294,9 @@ int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 		return -rte_errno;
 	}
 	pmc->addr = &cqe->op_own;
-	pmc->val =  !!idx;
-	pmc->mask = MLX5_CQE_OWNER_MASK;
+	pmc->opaque[CLB_VAL_IDX] = !!idx;
+	pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_OWNER_MASK;
+	pmc->fn = mlx_monitor_callback;
 	pmc->size = sizeof(uint8_t);
 	return 0;
 }
diff --git a/lib/eal/include/generic/rte_power_intrinsics.h b/lib/eal/include/generic/rte_power_intrinsics.h
index dddca3d41c..c9aa52a86d 100644
--- a/lib/eal/include/generic/rte_power_intrinsics.h
+++ b/lib/eal/include/generic/rte_power_intrinsics.h
@@ -18,19 +18,38 @@
  * which are architecture-dependent.
  */
 
+/** Size of the opaque data in monitor condition */
+#define RTE_POWER_MONITOR_OPAQUE_SZ 4
+
+/**
+ * Callback definition for monitoring conditions. Callbacks with this signature
+ * will be used by `rte_power_monitor()` to check if the entering of power
+ * optimized state should be aborted.
+ *
+ * @param val
+ *   The value read from memory.
+ * @param opaque
+ *   Callback-specific data.
+ *
+ * @return
+ *   0 if entering of power optimized state should proceed
+ *   -1 if entering of power optimized state should be aborted
+ */
+typedef int (*rte_power_monitor_clb_t)(const uint64_t val,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ]);
 struct rte_power_monitor_cond {
 	volatile void *addr;  /**< Address to monitor for changes */
-	uint64_t val;         /**< If the `mask` is non-zero, location pointed
-	                       *   to by `addr` will be read and compared
-	                       *   against this value.
-	                       */
-	uint64_t mask;   /**< 64-bit mask to extract value read from `addr` */
-	uint8_t size;    /**< Data size (in bytes) that will be used to compare
-	                  *   expected value (`val`) with data read from the
+	uint8_t size;    /**< Data size (in bytes) that will be read from the
 	                  *   monitored memory location (`addr`). Can be 1, 2,
 	                  *   4, or 8. Supplying any other value will result in
 	                  *   an error.
 	                  */
+	rte_power_monitor_clb_t fn; /**< Callback to be used to check if
+	                             *   entering power optimized state should
+	                             *   be aborted.
+	                             */
+	uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ];
+	/**< Callback-specific data */
 };
 
 /**
diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 39ea9fdecd..66fea28897 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -76,6 +76,7 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
 	const unsigned int lcore_id = rte_lcore_id();
 	struct power_wait_status *s;
+	uint64_t cur_value;
 
 	/* prevent user from running this instruction if it's not supported */
 	if (!wait_supported)
@@ -91,6 +92,9 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (__check_val_size(pmc->size) < 0)
 		return -EINVAL;
 
+	if (pmc->fn == NULL)
+		return -EINVAL;
+
 	s = &wait_status[lcore_id];
 
 	/* update sleep address */
@@ -110,16 +114,11 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
 
-	/* if we have a comparison mask, we might not need to sleep at all */
-	if (pmc->mask) {
-		const uint64_t cur_value = __get_umwait_val(
-				pmc->addr, pmc->size);
-		const uint64_t masked = cur_value & pmc->mask;
+	cur_value = __get_umwait_val(pmc->addr, pmc->size);
 
-		/* if the masked value is already matching, abort */
-		if (masked == pmc->val)
-			goto end;
-	}
+	/* check if callback indicates we should abort */
+	if (pmc->fn(cur_value, pmc->opaque) != 0)
+		goto end;
 
 	/* execute UMWAIT */
 	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
-- 
2.25.1
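
A minimal sketch of a callback that reproduces the mask-and-compare check
removed in the hunk above. The callback signature is inferred from the call
site `pmc->fn(cur_value, pmc->opaque)`; the `demo_` names, the array size of 4
and the assignment of opaque[] slots are assumptions made for illustration and
are not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed size; stands in for RTE_POWER_MONITOR_OPAQUE_SZ. */
#define DEMO_OPAQUE_SZ 4

/* Hypothetical layout of the callback-specific data. */
enum {
	DEMO_CLB_VAL_IDX = 0,	/* expected value */
	DEMO_CLB_MSK_IDX = 1,	/* comparison mask */
};

/*
 * Return non-zero to abort entering the power optimized state,
 * zero to let the core go to sleep.
 */
static int
demo_monitor_callback(const uint64_t value,
		const uint64_t opaque[DEMO_OPAQUE_SZ])
{
	const uint64_t expected = opaque[DEMO_CLB_VAL_IDX];
	const uint64_t mask = opaque[DEMO_CLB_MSK_IDX];

	/* a zero mask would make the comparison meaningless */
	assert(mask != 0);

	/* the masked value already matches: no need to sleep */
	return (value & mask) == expected ? -1 : 0;
}
```

With such a callback the comparison policy lives with the caller, so the
generic code only needs to read the monitored location and hand the value over.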


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH V2] ethdev: add dev configured flag
  2021-07-07  9:36  0%         ` David Marchand
@ 2021-07-07  9:59  0%           ` Thomas Monjalon
  0 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2021-07-07  9:59 UTC (permalink / raw)
  To: Andrew Rybchenko, David Marchand
  Cc: Dodji Seketeli, Huisong Li, dev, Yigit, Ferruh, Ananyev,
	Konstantin, Ray Kinsella

07/07/2021 11:36, David Marchand:
> On Wed, Jul 7, 2021 at 10:23 AM Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru> wrote:
> >
> > On 7/7/21 10:39 AM, David Marchand wrote:
> > > On Tue, Jul 6, 2021 at 10:36 AM Andrew Rybchenko
> > > <andrew.rybchenko@oktetlabs.ru> wrote:
> > >>
> > >> @David, could you take a look at the ABI breakage warnings for
> > >> the patch. May we ignore it since ABI looks backward
> > >> compatible? Or should be marked as a minor change ABI
> > >> which is backward compatible with DPDK_21?
> > >
> > > The whole eth_dev_shared_data area has always been reset to 0 at the
> > > first port allocation in a dpdk application life.
> > > Subsequent calls to rte_eth_dev_release_port() reset every port
> > > eth_dev->data to 0.
> > >
> > > This bit flag is added in a hole of the structure, and it is
> > > set/manipulated internally of ethdev.
> > >
> > > So unless the application was doing something nasty like highjacking
> > > this empty hole in the structure, I see no problem with the change wrt
> > > ABI.
> > >
> > >
> > > I wonder if libabigail is too strict on this report.
> > > Or maybe there is some extreme consideration on what a compiler could
> > > do about this hole...
> >
> > I was wondering if it could be any specifics related to big-
> > little endian vs bit fields placement, but throw the idea
> > away...
> 
> After some discussion offlist with (fairly busy ;-)) Dodji, the report
> here is a good warning.
> 
> But it looks we have an issue with libabigail not properly computing
> bitfields offsets.
> I just opened a bz for tracking
> https://sourceware.org/bugzilla/show_bug.cgi?id=28060
> 
> This is problematic, as the following rule does not work:
> 
> +; Ignore bitfields added in rte_eth_dev_data hole
> +[suppress_type]
> +        name = rte_eth_dev_data
> +        has_data_member_inserted_between = {offset_after(lro),
> offset_of(rx_queue_state)}
> 
> On the other hand, a (wrong) rule with "has_data_member_inserted_at =
> 2" (2 being the wrong offset you can read in abidiff output) works.
> 
> This might force us to waive all changes to rte_eth_dev_data... not
> that I am happy about it.

We are not going to do other changes until 21.11, so it could be fine.



^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH V2] ethdev: add dev configured flag
  2021-07-07  8:23  0%       ` Andrew Rybchenko
@ 2021-07-07  9:36  0%         ` David Marchand
  2021-07-07  9:59  0%           ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2021-07-07  9:36 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Dodji Seketeli, Huisong Li, dev, Thomas Monjalon, Yigit, Ferruh,
	Ananyev, Konstantin, Ray Kinsella

On Wed, Jul 7, 2021 at 10:23 AM Andrew Rybchenko
<andrew.rybchenko@oktetlabs.ru> wrote:
>
> On 7/7/21 10:39 AM, David Marchand wrote:
> > On Tue, Jul 6, 2021 at 10:36 AM Andrew Rybchenko
> > <andrew.rybchenko@oktetlabs.ru> wrote:
> >>
> >> @David, could you take a look at the ABI breakage warnings for
> >> the patch. May we ignore it since ABI looks backward
> >> compatible? Or should be marked as a minor change ABI
> >> which is backward compatible with DPDK_21?
> >
> > The whole eth_dev_shared_data area has always been reset to 0 at the
> > first port allocation in a dpdk application life.
> > Subsequent calls to rte_eth_dev_release_port() reset every port
> > eth_dev->data to 0.
> >
> > This bit flag is added in a hole of the structure, and it is
> > set/manipulated internally of ethdev.
> >
> > So unless the application was doing something nasty like highjacking
> > this empty hole in the structure, I see no problem with the change wrt
> > ABI.
> >
> >
> > I wonder if libabigail is too strict on this report.
> > Or maybe there is some extreme consideration on what a compiler could
> > do about this hole...
>
> I was wondering if it could be any specifics related to big-
> little endian vs bit fields placement, but throw the idea
> away...

After some discussion offlist with (fairly busy ;-)) Dodji, the report
here is a good warning.

But it looks like we have an issue with libabigail not properly computing
bitfield offsets.
I just opened a bz for tracking
https://sourceware.org/bugzilla/show_bug.cgi?id=28060

This is problematic, as the following rule does not work:

+; Ignore bitfields added in rte_eth_dev_data hole
+[suppress_type]
+        name = rte_eth_dev_data
+        has_data_member_inserted_between = {offset_after(lro),
offset_of(rx_queue_state)}

On the other hand, a (wrong) rule with "has_data_member_inserted_at =
2" (2 being the wrong offset you can read in abidiff output) works.

This might force us to waive all changes to rte_eth_dev_data... not
that I am happy about it.
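
A minimal sketch of the "flag added in a hole" argument, using hypothetical
stand-in structures rather than the real rte_eth_dev_data layout. Only the
member names lro, dev_configured and rx_queue_state come from this thread; the
other members and all types are assumptions chosen so that the example is
portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the layout before the patch: a few one-bit flags
 * followed by a member that starts in a new storage unit, which
 * leaves unused bits (the "hole") after the last flag. */
struct demo_data_old {
	unsigned int promiscuous : 1;
	unsigned int scattered_rx : 1;
	unsigned int lro : 1;
	uint64_t rx_queue_state;
};

/* Same layout with one more flag placed in the unused bits. */
struct demo_data_new {
	unsigned int promiscuous : 1;
	unsigned int scattered_rx : 1;
	unsigned int lro : 1;
	unsigned int dev_configured : 1;
	uint64_t rx_queue_state;
};

/* Check that the new flag neither moved nor resized anything. */
static void
demo_check_layout(void)
{
	assert(sizeof(struct demo_data_old) == sizeof(struct demo_data_new));
	assert(offsetof(struct demo_data_old, rx_queue_state) ==
	       offsetof(struct demo_data_new, rx_queue_state));
}
```

A check of this kind only covers size and member offsets; it says nothing
about how a tool such as libabigail computes the offset of the bit itself.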


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH V2] ethdev: add dev configured flag
  2021-07-07  8:25  3%       ` Andrew Rybchenko
@ 2021-07-07  9:26  0%         ` Huisong Li
  0 siblings, 0 replies; 200+ results
From: Huisong Li @ 2021-07-07  9:26 UTC (permalink / raw)
  To: Andrew Rybchenko, dev
  Cc: thomas, ferruh.yigit, konstantin.ananyev, david.marchand, Ray Kinsella


On 2021/7/7 16:25, Andrew Rybchenko wrote:
> On 7/7/21 5:55 AM, Huisong Li wrote:
>> On 2021/7/6 16:36, Andrew Rybchenko wrote:
>>> @David, could you take a look at the ABI breakage warnings for
>>> the patch. May we ignore it since ABI looks backward
>>> compatible? Or should be marked as a minor change ABI
>>> which is backward compatible with DPDK_21?
>>>
>>> On 7/6/21 7:10 AM, Huisong Li wrote:
>>>> Currently, if dev_configure is not called or fails to be called, users
>>>> can still call dev_start successfully. So it is necessary to have a flag
>>>> which indicates whether the device is configured, to control whether
>>>> dev_start can be called and eliminate dependency on user invocation
>>>> order.
>>>>
>>>> The flag stored in "struct rte_eth_dev_data" is more reasonable than
>>>>    "enum rte_eth_dev_state". "enum rte_eth_dev_state" is private to the
>>>> primary and secondary processes, and can be independently controlled.
>>>> However, the secondary process does not make resource allocations and
>>>> does not call dev_configure(). These are done by the primary process
>>>> and can be obtained or used by the secondary process. So this patch
>>>> adds a "dev_configured" flag in "rte_eth_dev_data", like "dev_started".
>>>>
>>>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>
>>>> ---
>>>> v1 -> v2:
>>>>     - adjusting the description of patch.
>>>>
>>>> ---
>>>>    lib/ethdev/rte_ethdev.c      | 16 ++++++++++++++++
>>>>    lib/ethdev/rte_ethdev_core.h |  6 +++++-
>>>>    2 files changed, 21 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>>>> index c607eab..6540432 100644
>>>> --- a/lib/ethdev/rte_ethdev.c
>>>> +++ b/lib/ethdev/rte_ethdev.c
>>>> @@ -1356,6 +1356,13 @@ rte_eth_dev_configure(uint16_t port_id,
>>>> uint16_t nb_rx_q, uint16_t nb_tx_q,
>>>>            return -EBUSY;
>>>>        }
>>>>    +    /*
>>>> +     * Ensure that "dev_configured" is always 0 each time prepare to do
>>>> +     * dev_configure() to avoid any non-anticipated behaviour.
>>>> +     * And set to 1 when dev_configure() is executed successfully.
>>>> +     */
>>>> +    dev->data->dev_configured = 0;
>>>> +
>>>>         /* Store original config, as rollback required on failure */
>>>>        memcpy(&orig_conf, &dev->data->dev_conf,
>>>> sizeof(dev->data->dev_conf));
>>>>    @@ -1606,6 +1613,8 @@ rte_eth_dev_configure(uint16_t port_id,
>>>> uint16_t nb_rx_q, uint16_t nb_tx_q,
>>>>        }
>>>>          rte_ethdev_trace_configure(port_id, nb_rx_q, nb_tx_q,
>>>> dev_conf, 0);
>>>> +    dev->data->dev_configured = 1;
>>>> +
>>> I think it should be inserted before the trace, since tracing
>>> is intentionally put close to return without any empty lines
>>> in between.
>> All right. Do I need to send a patch V3?
> Since the patch is waiting for resolution for ABI warning,
> please, send v3 with my Reviewed-by and ack from Konstantin.
> It will be a bit easier to apply when it is OK to do it.
> .
ok. I will send patch V3.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH V2] ethdev: add dev configured flag
  2021-07-07  2:55  0%     ` Huisong Li
@ 2021-07-07  8:25  3%       ` Andrew Rybchenko
  2021-07-07  9:26  0%         ` Huisong Li
  0 siblings, 1 reply; 200+ results
From: Andrew Rybchenko @ 2021-07-07  8:25 UTC (permalink / raw)
  To: Huisong Li, dev
  Cc: thomas, ferruh.yigit, konstantin.ananyev, david.marchand, Ray Kinsella

On 7/7/21 5:55 AM, Huisong Li wrote:
> 
> On 2021/7/6 16:36, Andrew Rybchenko wrote:
>> @David, could you take a look at the ABI breakage warnings for
>> the patch. May we ignore it since ABI looks backward
>> compatible? Or should be marked as a minor change ABI
>> which is backward compatible with DPDK_21?
>>
>> On 7/6/21 7:10 AM, Huisong Li wrote:
>>> Currently, if dev_configure is not called or fails to be called, users
>>> can still call dev_start successfully. So it is necessary to have a flag
>>> which indicates whether the device is configured, to control whether
>>> dev_start can be called and eliminate dependency on user invocation
>>> order.
>>>
>>> The flag stored in "struct rte_eth_dev_data" is more reasonable than
>>>   "enum rte_eth_dev_state". "enum rte_eth_dev_state" is private to the
>>> primary and secondary processes, and can be independently controlled.
>>> However, the secondary process does not make resource allocations and
>>> does not call dev_configure(). These are done by the primary process
>>> and can be obtained or used by the secondary process. So this patch
>>> adds a "dev_configured" flag in "rte_eth_dev_data", like "dev_started".
>>>
>>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
>> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>
>>> ---
>>> v1 -> v2:
>>>    - adjusting the description of patch.
>>>
>>> ---
>>>   lib/ethdev/rte_ethdev.c      | 16 ++++++++++++++++
>>>   lib/ethdev/rte_ethdev_core.h |  6 +++++-
>>>   2 files changed, 21 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>>> index c607eab..6540432 100644
>>> --- a/lib/ethdev/rte_ethdev.c
>>> +++ b/lib/ethdev/rte_ethdev.c
>>> @@ -1356,6 +1356,13 @@ rte_eth_dev_configure(uint16_t port_id,
>>> uint16_t nb_rx_q, uint16_t nb_tx_q,
>>>           return -EBUSY;
>>>       }
>>>   +    /*
>>> +     * Ensure that "dev_configured" is always 0 each time prepare to do
>>> +     * dev_configure() to avoid any non-anticipated behaviour.
>>> +     * And set to 1 when dev_configure() is executed successfully.
>>> +     */
>>> +    dev->data->dev_configured = 0;
>>> +
>>>        /* Store original config, as rollback required on failure */
>>>       memcpy(&orig_conf, &dev->data->dev_conf,
>>> sizeof(dev->data->dev_conf));
>>>   @@ -1606,6 +1613,8 @@ rte_eth_dev_configure(uint16_t port_id,
>>> uint16_t nb_rx_q, uint16_t nb_tx_q,
>>>       }
>>>         rte_ethdev_trace_configure(port_id, nb_rx_q, nb_tx_q,
>>> dev_conf, 0);
>>> +    dev->data->dev_configured = 1;
>>> +
>> I think it should be inserted before the trace, since tracing
>> is intentionally put close to return without any empty lines
>> in between.
> All right. Do I need to send a patch V3?

Since the patch is waiting for resolution for ABI warning,
please, send v3 with my Reviewed-by and ack from Konstantin.
It will be a bit easier to apply when it is OK to do it.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH V2] ethdev: add dev configured flag
  2021-07-07  7:39  3%     ` David Marchand
@ 2021-07-07  8:23  0%       ` Andrew Rybchenko
  2021-07-07  9:36  0%         ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Andrew Rybchenko @ 2021-07-07  8:23 UTC (permalink / raw)
  To: David Marchand, Dodji Seketeli
  Cc: Huisong Li, dev, Thomas Monjalon, Yigit, Ferruh, Ananyev,
	Konstantin, Ray Kinsella

On 7/7/21 10:39 AM, David Marchand wrote:
> On Tue, Jul 6, 2021 at 10:36 AM Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru> wrote:
>>
>> @David, could you take a look at the ABI breakage warnings for
>> the patch. May we ignore it since ABI looks backward
>> compatible? Or should be marked as a minor change ABI
>> which is backward compatible with DPDK_21?
> 
> The whole eth_dev_shared_data area has always been reset to 0 at the
> first port allocation in a dpdk application life.
> Subsequent calls to rte_eth_dev_release_port() reset every port
> eth_dev->data to 0.
> 
> This bit flag is added in a hole of the structure, and it is
> set/manipulated internally of ethdev.
> 
> So unless the application was doing something nasty like highjacking
> this empty hole in the structure, I see no problem with the change wrt
> ABI.
> 
> 
> I wonder if libabigail is too strict on this report.
> Or maybe there is some extreme consideration on what a compiler could
> do about this hole...

I was wondering if it could be something specific to big- vs
little-endian bit field placement, but threw the idea
away...

> Dodji?
> 
> 
> For now, we can waive the warning.
> I'll look into the exception rule to add.

Thanks a lot. I'll hold on the patch for now.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] dmadev: introduce DMA device library
  2021-07-05 17:16  0%       ` Bruce Richardson
@ 2021-07-07  8:08  0%         ` Jerin Jacob
  0 siblings, 0 replies; 200+ results
From: Jerin Jacob @ 2021-07-07  8:08 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Chengwen Feng, Thomas Monjalon, Ferruh Yigit, Jerin Jacob,
	dpdk-dev, Morten Brørup, Nipun Gupta, Hemant Agrawal,
	Maxime Coquelin, Honnappa Nagarahalli, David Marchand,
	Satananda Burla, Prasun Kapoor, Ananyev, Konstantin, liangma,
	Radha Mohan Chintakuntla

On Mon, Jul 5, 2021 at 10:46 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Mon, Jul 05, 2021 at 09:25:34PM +0530, Jerin Jacob wrote:
> >
> > On Mon, Jul 5, 2021 at 4:22 PM Bruce Richardson
> > <bruce.richardson@intel.com> wrote:
> > >
> > > On Sun, Jul 04, 2021 at 03:00:30PM +0530, Jerin Jacob wrote:
> > > > On Fri, Jul 2, 2021 at 6:51 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
> > > > >
> > > > > This patch introduces 'dmadevice' which is a generic type of DMA
> > > > > device.
> <snip>
> > >
> > > +1 and the terminology with regards to queues and channels. With our ioat
> > > hardware, each HW queue was called a channel for instance.
> >
> > Looks like <dmadev> <> <channel> can cover all the use cases, if the
> > HW has more than
> > 1 queues it can be exposed as separate dmadev dev.
> >
>
> Fine for me.
>
> However, just to confirm that Morten's suggestion of using a
> (device-specific void *) channel pointer rather than dev_id + channel_id
> pair of parameters won't work for you? You can't store a pointer or dev
> index in the channel struct in the driver?

Yes, that will work. To confirm, the suggestion is to use a void *
object instead of channel_id.
That will avoid one more indirection (index -> pointer).


>
> >
> <snip>
> > > > > + *
> > > > > + * If dma_cookie_t is >=0 it's a DMA operation request cookie, <0 it's a error
> > > > > + * code.
> > > > > + * When using cookies, comply with the following rules:
> > > > > + * a) Cookies for each virtual queue are independent.
> > > > > + * b) For a virt queue, the cookie are monotonically incremented, when it reach
> > > > > + *    the INT_MAX, it wraps back to zero.
> > >
> > > I disagree with the INT_MAX (or INT32_MAX) value here. If we use that
> > > value, it means that we cannot use implicit wrap-around inside the CPU and
> > > have to check for the INT_MAX value. Better to:
> > > 1. Specify that it wraps at UINT16_MAX which allows us to just use a
> > > uint16_t internally and wrap-around automatically, or:
> > > 2. Specify that it wraps at a power-of-2 value >= UINT16_MAX, giving
> > > drivers the flexibility at what value to wrap around.
> >
> > I think (2) is better than (1). I think it is even better to wrap around at the number of
> > descriptors configured in dev_configure() (we can make this a power of 2).
> >
>
> Interesting, I hadn't really considered that before. My only concern
> would be if an app wants to keep values in the app ring for a while after
> they have been returned from dmadev. I thought it easier to have the full
> 16-bit counter value returned to the user to give the most flexibility,
> given that going from that to any power-of-2 ring size smaller is a trivial
> operation.
>
> Overall, while my ideal situation is to always have a 0..UINT16_MAX return
> value from the function, I can live with your suggestion of wrapping at
> ring_size, since drivers will likely do that internally anyway.
> I think wrapping at INT32_MAX is too awkward and will be error prone since
> we can't rely on hardware automatically wrapping to zero, nor on the driver
> having pre-masked the value.

OK. +1 for UINT16_MAX
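
A minimal sketch of the wrap-around argument quoted above: with a uint16_t
counter the cookie wraps to zero without any explicit comparison, and reducing
it to a slot in a power-of-2 ring is a single mask. The structure and function
names are hypothetical and not part of the proposed dmadev API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-queue state kept by a driver. */
struct demo_vq {
	uint16_t next_cookie;	/* wraps from UINT16_MAX to 0 on its own */
	uint16_t ring_mask;	/* ring_size - 1, ring_size a power of 2 */
};

static void
demo_vq_init(struct demo_vq *vq, uint16_t ring_size)
{
	/* a power of 2 has exactly one bit set */
	assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
	vq->next_cookie = 0;
	vq->ring_mask = (uint16_t)(ring_size - 1);
}

/* Hand out the next cookie; no compare against a maximum is needed. */
static uint16_t
demo_vq_next_cookie(struct demo_vq *vq)
{
	return vq->next_cookie++;
}

/* Going from the full 16-bit cookie to a ring slot is one AND. */
static uint16_t
demo_vq_slot(const struct demo_vq *vq, uint16_t cookie)
{
	return cookie & vq->ring_mask;
}
```

Wrapping at INT32_MAX would instead need an explicit test on every enqueue,
since neither the CPU nor a power-of-2 mask produces that boundary for free.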

>
> > >
> > > > > + * c) The initial cookie of a virt queue is zero, after the device is stopped or
> > > > > + *    reset, the virt queue's cookie needs to be reset to zero.
> <snip>
> > > >
> > > > Please add some good amount of reserved bits and have API to init this
> > > > structure for future ABI stability, say rte_dmadev_queue_config_init()
> > > > or so.
> > > >
> > >
> > > I don't think that is necessary. Since the config struct is used only as
> > > parameter to the config function, any changes to it can be managed by
> > > versioning that single function. Padding would only be necessary if we had
> > > an array of these config structs somewhere.
> >
> > OK.
> >
> > For some reason, the versioning API looks ugly to me in code instead of keeping
> > some rsvd fields look cool to me with init function.
> >
> > But I agree. function versioning works in this case. No need to find other API
> > if it is not general DPDK API practice.
> >
>
> The one thing I would suggest instead of the padding is for the internal
> APIS, to pass the struct size through, since we can't version those - and
> for padding we can't know whether any replaced padding should be used or
> not. Specifically:
>
>         typedef int (*rte_dmadev_configure_t)(struct rte_dmadev *dev, struct
>                         rte_dmadev_conf *cfg, size_t cfg_size);
>
> but for the public function:
>
>         int
>         rte_dmadev_configure(struct rte_dmadev *dev, struct
>                         rte_dmadev_conf *cfg)
>         {
>                 ...
>                 ret = dev->ops.configure(dev, cfg, sizeof(*cfg));
>                 ...
>         }

Makes sense.
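
A minimal sketch of how a driver could use the size argument to tell two
layouts of the config structure apart. Both structure layouts, the added
field and every `demo_` name are hypothetical; the two wrappers stand in for
two versions of one public symbol.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical original layout. */
struct demo_conf_v1 {
	uint16_t nb_desc;
};

/* Hypothetical later layout: one field appended. */
struct demo_conf_v2 {
	uint16_t nb_desc;
	uint16_t max_vqs;
};

/*
 * Driver op: the struct size is the only version information it gets.
 * Returns the layout version it recognised, or -1 if unknown.
 */
static int
demo_drv_configure(const void *cfg, size_t cfg_size)
{
	assert(cfg != NULL);
	if (cfg_size == sizeof(struct demo_conf_v2))
		return 2;
	if (cfg_size == sizeof(struct demo_conf_v1))
		return 1;
	return -1;
}

/* Old public entry point: passes the size of the old layout. */
static int
demo_configure_v1(const struct demo_conf_v1 *cfg)
{
	return demo_drv_configure(cfg, sizeof(*cfg));
}

/* New public entry point: passes the size of the new layout. */
static int
demo_configure_v2(const struct demo_conf_v2 *cfg)
{
	return demo_drv_configure(cfg, sizeof(*cfg));
}
```

The scheme only works while every layout change also changes sizeof(), which
holds for appended fields but not for a field that replaces padding.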

>
> Then if we change the structure and version the config API, the driver can
> tell from the size what struct version it is and act accordingly. Without
> that, each time the struct changed, we'd have to add a new function pointer
> to the device ops.
>
> > In other libraries, I have seen such an _init function that can be used
> > for this as well as for filling default values
> > (in some cases the implementation's default values are not zero).
> > So that application can avoid memset for param structure.
> > Added rte_event_queue_default_conf_get() in eventdev spec for this.
> >
>
> I think that would largely have the same issues, unless it returned a
> pointer to data inside the driver - and which therefore could not be
> modified. Alternatively it would mean that the memory would have been
> allocated in the driver and we would need to ensure proper cleanup
> functions were called to free memory afterwards. Supporting having the
> config parameter as a local variable I think makes things a lot easier.
>
> > No strong opinion on this.
> >
> >
> >
> > >
> > > >
> > > > > +
> > > > > +/**
> > > > > + * A structure used to retrieve information of a DMA virt queue.
> > > > > + */
> > > > > +struct rte_dmadev_queue_info {
> > > > > +       enum dma_transfer_direction direction;
> > > >
> > > > A queue may support all directions so I think it should be a bitfield.
> > > >
> > > > > +       /**< Associated transfer direction */
> > > > > +       uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> > > > > +       uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> > > > > +       uint64_t dev_flags; /**< Device specific flags */
> > > > > +};
> > > > > +
> > > >
> > > > > +__rte_experimental
> > > > > +static inline dma_cookie_t
> > > > > +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id,
> > > > > +                  const struct dma_scatterlist *sg,
> > > > > +                  uint32_t sg_len, uint64_t flags)
> > > >
> > > > I would like to change this as:
> > > > rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id, const struct
> > > > rte_dma_sg *src, uint32_t nb_src,
> > > > const struct rte_dma_sg *dst, uint32_t nb_dst) or so allow the use case like

@Chengchang Tang, in the above syntax
rte_dma_sg needs to contain only ptr and size.

> > > > src 30 MB copy can be splitted as written as 1 MB x 30 dst.
> > > >
>
> Out of interest, do you see much benefit (and in what way) from having the
> scatter-gather support? Unlike sending 5 buffers in one packet rather than
> 5 buffers in 5 packets to a NIC, copying an array of memory in one op vs
> multiple is functionally identical.

Knowing upfront, in one shot, how the segments are laid out allows better
optimization in drivers, for example:
1) In one DMA job request HW can fill multiple segments, vs multiple
DMA job requests with one segment each.
2) Single completion, i.e. less overhead on the system.
3) Less latency for the job requests.


>
> > > >
> > > >
> <snip>
> > Got it. In order to save space if first CL size for fastpath(Saving 8B
> > for the pointer) and to avoid
> > function overhead, Can we use one bit of flags of op function to
> > enable the fence?
> >
>
> The original ioat implementation did exactly that. However, I then
> discovered that because a fence logically belongs between two operations,
> does the fence flag on an operation mean "don't do any jobs after this
> until this job has completed" or does it mean "don't start this job until
> all previous jobs have completed". [Or theoretically does it mean both :-)]
> Naturally, some hardware does it the former way (i.e. fence flag goes on
> last op before fence), while other hardware the latter way (i.e. fence flag
> goes on first op after the fence). Therefore, since fencing is about
> ordering *between* two (sets of) jobs, I decided that it should do exactly
> that and go between two jobs, so there is no ambiguity!
>
> However, I'm happy enough to switch to having a fence flag, but I think if
> we do that, it should be put in the "first job after fence" case, because
> it is always easier to modify a previously written job if we need to, than
> to save the flag for a future one.
>
> Alternatively, if we keep the fence as a separate function, I'm happy
> enough for it not to be on the same cacheline as the "hot" operations,
> since fencing will always introduce a small penalty anyway.

Ack.
You may consider two flags, FENCE_THEN_JOB and JOB_THEN_FENCE (if
there is any use case for this or it makes sense for your HW).


For us, fence is a NOP as we have an implicit fence between each
HW job descriptor.
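
A minimal sketch of how the two flag placements describe the same ordering
point between two jobs. The flag names follow the suggestion above; their
values and the helper are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-job flag bits. */
#define DEMO_FENCE_THEN_JOB (1u << 0)	/* fence sits before this job */
#define DEMO_JOB_THEN_FENCE (1u << 1)	/* fence sits after this job */

/*
 * Returns 1 if job idx must wait for all earlier jobs to complete,
 * i.e. if a fence lies between job idx - 1 and job idx.
 */
static int
demo_fenced_before(const uint32_t *flags, size_t idx)
{
	assert(flags != NULL);
	if (flags[idx] & DEMO_FENCE_THEN_JOB)
		return 1;
	return idx > 0 && (flags[idx - 1] & DEMO_JOB_THEN_FENCE) != 0;
}
```

Either flag can express a given fence; the difference is only which of the
two neighbouring descriptors has to carry the bit.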


>
> > >
> > > >
> <snip>
> > > > Since we have additional function call overhead in all the
> > > > applications for this scheme, I would like to understand
> > > > the use of doing this way vs enq does the doorbell implicitly from
> > > > driver/application PoV?
> > > >
> > >
> > > In our benchmarks it's just faster. When we tested it, the overhead of the
> > > function calls was noticeably less than the cost of building up the
> > > parameter array(s) for passing the jobs in as a burst. [We don't see this
> > > cost with things like NIC I/O since DPDK tends to already have the mbuf
> > > fully populated before the TX call anyway.]
> >
> > OK. I agree with stack population.
> >
> > My question was more on doing implicit doorbell update enq. Is doorbell write
> > costly in other HW compare to a function call? In our HW, it is just write of
> > the number of instructions written in a register.
> >
> > Also, we need to again access the internal PMD memory structure to find
> > where to write etc if it is a separate function.
> >
>
> The cost varies depending on a number of factors - even writing to a single
> HW register can be very slow if that register is mapped as device
> (uncacheable) memory, since (AFAIK) it will act as a full fence and wait

I don't know. At least in our case, writes are write-back, so the core does not need
to wait (if there is no read operation).

> for the write to go all the way to hardware. For more modern HW, the cost
> can be lighter. However, any cost of HW writes is going to be the same
> whether its a separate function call or not.
>
> However, the main thing about the doorbell update is that it's a
> once-per-burst thing, rather than a once-per-job. Therefore, even if you
> have to re-read the struct memory (which is likely still somewhere in your
> cores' cache), any extra small cost of doing so is to be amortized over the
> cost of a whole burst of copies.

The Linux kernel has an xmit_more flag in skb to address a similar thing,
i.e. the enq job flags can have one more bit to say whether to ring the doorbell or not,
rather than having yet another function call overhead. IMO, it is the best of both worlds.
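
A minimal sketch of a per-job doorbell flag in the spirit of xmit_more. The
flag name, the queue structure and the counter that stands in for the device
register write are all hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical enqueue flag: ring the doorbell after this job. */
#define DEMO_OP_FLAG_DOORBELL (1ULL << 0)

struct demo_queue {
	uint16_t pending;		/* jobs written but not yet announced */
	uint32_t submitted;		/* jobs announced to the device */
	uint32_t doorbell_writes;	/* stands in for the register write */
};

/* Enqueue one job; only the flagged job pays for the doorbell. */
static void
demo_enqueue(struct demo_queue *q, uint64_t flags)
{
	assert(q != NULL);
	q->pending++;
	if (flags & DEMO_OP_FLAG_DOORBELL) {
		q->submitted += q->pending;
		q->pending = 0;
		q->doorbell_writes++;
	}
}
```

An application would set the flag on the last job of a burst, so the doorbell
cost stays once per burst without a separate function call.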


>
> >
> > >
> > > >
> <snip>
> > > > > +
> > > > > +/**
> > > > > + * @warning
> > > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > > + *
> > > > > + * Returns the number of operations that failed to complete.
> > > > > + * NOTE: This API was used when rte_dmadev_completed has_error was set.
> > > > > + *
> > > > > + * @param dev_id
> > > > > + *   The identifier of the device.
> > > > > + * @param vq_id
> > > > > + *   The identifier of virt queue.
> > > > > + * @param nb_status
> > > > > + *   Indicates the size  of status array.
> > > > > + * @param[out] status
> > > > > + *   The error code of operations that failed to complete.
> > > > > + * @param[out] cookie
> > > > > + *   The last failed completed operation's cookie.
> > > > > + *
> > > > > + * @return
> > > > > + *   The number of operations that failed to complete.
> > > > > + *
> > > > > + * NOTE: The caller must ensure that the input parameter is valid and the
> > > > > + *       corresponding device supports the operation.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +static inline uint16_t
> > > > > +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id,
> > > > > +                          const uint16_t nb_status, uint32_t *status,
> > > > > +                          dma_cookie_t *cookie)
> > > >
> > > > IMO, it is better to move cookie/ring_idx at 3.
> > > > Why it would return any array of errors? since it called after
> > > > rte_dmadev_completed() has
> > > > has_error. Is it better to change
> > > >
> > > > rte_dmadev_error_status((uint16_t dev_id, uint16_t vq_id, dma_cookie_t
> > > > *cookie,  uint32_t *status)
> > > >
> > > > I also think, we may need to set status as bitmask and enumerate all
> > > > the combination of error codes
> > > > of all the driver and return string from driver existing rte_flow_error
> > > >
> > > > See
> > > > struct rte_flow_error {
> > > >         enum rte_flow_error_type type; /**< Cause field and error types. */
> > > >         const void *cause; /**< Object responsible for the error. */
> > > >         const char *message; /**< Human-readable error message. */
> > > > };
> > > >
> > >
> > > I think we need a multi-return value API here, as we may add operations in
> > > future which have non-error status values to return. The obvious case is
> > > DMA engines which support "compare" operations. In that case a successful
> > > compare (as in there were no DMA or HW errors) can return "equal" or
> > > "not-equal" as statuses. For general "copy" operations, the faster
> > > completion op can be used to just return successful values (and only call
> > > this status version on error), while apps using those compare ops or a
> > > mixture of copy and compare ops, would always use the slower one that
> > > returns status values for each and every op..
> > >
> > > The ioat APIs used 32-bit integer values for this status array so as to
> > > allow e.g. 16-bits for error code and 16-bits for future status values. For
> > > most operations there should be a fairly small set of things that can go
> > > wrong, i.e. bad source address, bad destination address or invalid length.
> > > Within that we may have a couple of specifics for why an address is bad,
> > > but even so I don't think we need to start having multiple bit
> > > combinations.
> >
> > OK. What is the purpose of errors status? Is it for application printing it or
> > Does the application need to take any action based on specific error requests?
>
> It's largely for information purposes, but in the case of SVA/SVM errors
> could occur due to the memory not being pinned, i.e. a page fault, in some
> cases. If that happens, then it's up the app to either touch the memory and
> retry the copy, or to do a SW memcpy as a fallback.
>
> In other error cases, I think it's good to tell the application if it's
> passing around bad data, or data that is beyond the scope of hardware, e.g.
> a copy that is beyond what can be done in a single transaction for a HW
> instance. Given that there are always things that can go wrong, I think we
> need some error reporting mechanism.
>
> > If the former is scope, then we need to define the standard enum value
> > for the error right?
> > ie. uint32_t *status needs to change to enum rte_dma_error or so.
> >
> Sure. Perhaps an error/status structure either is an option, where we
> explicitly call out error info from status info.

Agree. Better to have a structure with fields like:

1)  enum rte_dma_error_type
2)  memory to store an informative message on the finer aspects of the error,
like the address that caused the issue etc. (which will be driver-specific
information).
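
A minimal sketch of such a status structure. The enumerators are drawn from
the failure cases mentioned in this thread (bad source address, bad
destination address, invalid length, page fault); every identifier is
hypothetical and none of this is an agreed API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical error classes. */
enum demo_dma_error_type {
	DEMO_DMA_OK = 0,
	DEMO_DMA_ERR_BAD_SRC,
	DEMO_DMA_ERR_BAD_DST,
	DEMO_DMA_ERR_BAD_LEN,
	DEMO_DMA_ERR_PAGE_FAULT,
};

/* Standard error class plus driver-specific detail. */
struct demo_dma_status {
	enum demo_dma_error_type type;
	const char *message;	/* informative only; may be NULL */
};

/*
 * Returns 1 when the application can recover by touching the memory
 * and retrying, or by falling back to a SW memcpy.
 */
static int
demo_dma_should_retry(const struct demo_dma_status *st)
{
	assert(st != NULL);
	return st->type == DEMO_DMA_ERR_PAGE_FAULT;
}
```

Keeping the enum standard lets applications act on the error class, while the
message stays free-form for logging.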


>
> >
> >
> <snip to end>
>
> /Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH V2] ethdev: add dev configured flag
  2021-07-06  8:36  4%   ` Andrew Rybchenko
  2021-07-07  2:55  0%     ` Huisong Li
@ 2021-07-07  7:39  3%     ` David Marchand
  2021-07-07  8:23  0%       ` Andrew Rybchenko
  1 sibling, 1 reply; 200+ results
From: David Marchand @ 2021-07-07  7:39 UTC (permalink / raw)
  To: Andrew Rybchenko, Dodji Seketeli
  Cc: Huisong Li, dev, Thomas Monjalon, Yigit, Ferruh, Ananyev,
	Konstantin, Ray Kinsella

On Tue, Jul 6, 2021 at 10:36 AM Andrew Rybchenko
<andrew.rybchenko@oktetlabs.ru> wrote:
>
> @David, could you take a look at the ABI breakage warnings for
> the patch. May we ignore it since ABI looks backward
> compatible? Or should be marked as a minor change ABI
> which is backward compatible with DPDK_21?

The whole eth_dev_shared_data area has always been reset to 0 at the
first port allocation in a dpdk application life.
Subsequent calls to rte_eth_dev_release_port() reset every port
eth_dev->data to 0.

This bit flag is added in a hole of the structure, and it is
set/manipulated only internally by ethdev.

So unless the application was doing something nasty like hijacking
this empty hole in the structure, I see no problem with the change wrt
ABI.
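
A minimal standalone sketch of the argument, assuming a GCC/Clang-style
bit-field layout (bit-field layout is implementation-defined in C). The
struct names and SKETCH_MAX_QUEUES are hypothetical; only the bit-field
names are taken from the patch context.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_QUEUES 4 /* hypothetical, stands in for RTE_MAX_QUEUES_PER_PORT */

/* Layout before the patch: four 1-bit flags, four unused bits in the byte. */
struct data_before {
	uint8_t scattered_rx : 1,
		all_multicast : 1,
		dev_started : 1,
		lro : 1;
	uint8_t rx_queue_state[SKETCH_MAX_QUEUES];
};

/* Layout after the patch: the new flag takes one of the unused bits. */
struct data_after {
	uint8_t scattered_rx : 1,
		all_multicast : 1,
		dev_started : 1,
		lro : 1,
		dev_configured : 1;
	uint8_t rx_queue_state[SKETCH_MAX_QUEUES];
};

/* Neither the size nor the offset of the following member changes. */
_Static_assert(sizeof(struct data_before) == sizeof(struct data_after),
	       "struct size must not change");
_Static_assert(offsetof(struct data_before, rx_queue_state) ==
	       offsetof(struct data_after, rx_queue_state),
	       "offset of the following field must not change");

/* The area is reset to 0, so the new flag starts out as "not configured". */
static inline int
new_flag_after_reset(void)
{
	struct data_after d;

	memset(&d, 0, sizeof(d));
	return d.dev_configured;
}
```

If a compiler laid out the bit-fields differently, the static assertions
would fail at build time rather than silently changing the layout.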


I wonder if libabigail is too strict on this report.
Or maybe there is some extreme consideration on what a compiler could
do about this hole...
Dodji?


For now, we can waive the warning.
I'll look into the exception rule to add.


-- 
David Marchand


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring
      2021-07-07  5:43  3% ` [dpdk-dev] [PATCH v4 0/3] " Ruifeng Wang
@ 2021-07-07  5:48  3% ` Ruifeng Wang
  2 siblings, 0 replies; 200+ results
From: Ruifeng Wang @ 2021-07-07  5:48 UTC (permalink / raw)
  Cc: dev, david.marchand, thomas, bruce.richardson, jerinj, nd,
	honnappa.nagarahalli, ruifeng.wang

The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'[1].

Use the API for the rte spinlock and ring implementations.
With the wait until equal APIs being stable, changes will not impact ABI.
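
For readers unfamiliar with the pattern, here is a minimal portable sketch
of 'wait until equal' and of a spinlock built on it, using C11 atomics. It
is not the DPDK implementation: wait_until_equal_32 and the sketch_spinlock
names are hypothetical, and the loop busy-polls where an aarch64
implementation could wait in WFE instead.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the wait-until-equal idea:
 * return once *addr holds the expected value.
 */
static inline void
wait_until_equal_32(_Atomic uint32_t *addr, uint32_t expected,
		    memory_order order)
{
	while (atomic_load_explicit(addr, order) != expected)
		; /* an aarch64 implementation could sleep in WFE here */
}

struct sketch_spinlock {
	_Atomic uint32_t locked; /* 0 = free, 1 = taken */
};

static inline void
sketch_spinlock_lock(struct sketch_spinlock *sl)
{
	uint32_t exp = 0;

	while (!atomic_compare_exchange_weak_explicit(&sl->locked, &exp, 1,
			memory_order_acquire, memory_order_relaxed)) {
		/* Lock is taken: wait for it to read as free, then retry. */
		wait_until_equal_32(&sl->locked, 0, memory_order_relaxed);
		exp = 0;
	}
}

static inline int
sketch_spinlock_trylock(struct sketch_spinlock *sl)
{
	uint32_t exp = 0;

	/* Returns 1 if the lock was acquired, 0 otherwise. */
	return atomic_compare_exchange_strong_explicit(&sl->locked, &exp, 1,
			memory_order_acquire, memory_order_relaxed);
}

static inline void
sketch_spinlock_unlock(struct sketch_spinlock *sl)
{
	atomic_store_explicit(&sl->locked, 0, memory_order_release);
}
```

The point of funnelling the wait through one helper is that the waiting
strategy can change per architecture without touching the lock logic.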

[1] http://patches.dpdk.org/cover/62703/

v4:
Added meson option to expose WFE. (David, Bruce)

v3:
Series rebased. (David)

Gavin Hu (1):
  spinlock: use wfe to reduce contention on aarch64

Ruifeng Wang (2):
  ring: use wfe to wait for ring tail update on aarch64
  build: add option to enable wait until equal

 config/arm/meson.build                 | 2 +-
 lib/eal/include/generic/rte_spinlock.h | 4 ++--
 lib/ring/rte_ring_c11_pvt.h            | 4 ++--
 lib/ring/rte_ring_generic_pvt.h        | 3 +--
 meson_options.txt                      | 2 ++
 5 files changed, 8 insertions(+), 7 deletions(-)

-- 
2.25.1


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v4 0/3] Use WFE for spinlock and ring
    @ 2021-07-07  5:43  3% ` Ruifeng Wang
  2021-07-07  5:48  3% ` Ruifeng Wang
  2 siblings, 0 replies; 200+ results
From: Ruifeng Wang @ 2021-07-07  5:43 UTC (permalink / raw)
  Cc: dev, david.marchand, thomas, bruce.richardson, jerinj, nd,
	honnappa.nagarahalli, ruifeng.wang

The rte_wait_until_equal_xxx APIs abstract the functionality of 'polling
for a memory location to become equal to a given value'[1].

Use the API for the rte spinlock and ring implementations.
With the wait until equal APIs being stable, changes will not impact ABI.

[1] http://patches.dpdk.org/cover/62703/

v4:
Added meson option to expose WFE. (David, Bruce)

v3:
Series rebased. (David)

Gavin Hu (1):
  spinlock: use wfe to reduce contention on aarch64

Ruifeng Wang (2):
  ring: use wfe to wait for ring tail update on aarch64
  build: add option to enable wait until equal

 config/arm/meson.build                 | 2 +-
 lib/eal/include/generic/rte_spinlock.h | 4 ++--
 lib/ring/rte_ring_c11_pvt.h            | 4 ++--
 lib/ring/rte_ring_generic_pvt.h        | 3 +--
 meson_options.txt                      | 2 ++
 5 files changed, 8 insertions(+), 7 deletions(-)

-- 
2.25.1


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH V2] ethdev: add dev configured flag
  2021-07-06  8:36  4%   ` Andrew Rybchenko
@ 2021-07-07  2:55  0%     ` Huisong Li
  2021-07-07  8:25  3%       ` Andrew Rybchenko
  2021-07-07  7:39  3%     ` David Marchand
  1 sibling, 1 reply; 200+ results
From: Huisong Li @ 2021-07-07  2:55 UTC (permalink / raw)
  To: Andrew Rybchenko, dev
  Cc: thomas, ferruh.yigit, konstantin.ananyev, david.marchand, Ray Kinsella


On 2021/7/6 16:36, Andrew Rybchenko wrote:
> @David, could you take a look at the ABI breakage warnings for
> the patch? May we ignore them, since the ABI looks backward
> compatible? Or should it be marked as a minor ABI change
> that is backward compatible with DPDK_21?
>
> On 7/6/21 7:10 AM, Huisong Li wrote:
>> Currently, if dev_configure is not called or fails to be called, users
>> can still call dev_start successfully. So it is necessary to have a flag
>> which indicates whether the device is configured, to control whether
>> dev_start can be called and eliminate dependency on user invocation order.
>>
>> The flag stored in "struct rte_eth_dev_data" is more reasonable than
>>   "enum rte_eth_dev_state". "enum rte_eth_dev_state" is private to the
>> primary and secondary processes, and can be independently controlled.
>> However, the secondary process does not make resource allocations and
>> does not call dev_configure(). These are done by the primary process
>> and can be obtained or used by the secondary process. So this patch
>> adds a "dev_configured" flag in "rte_eth_dev_data", like "dev_started".
>>
>> Signed-off-by: Huisong Li <lihuisong@huawei.com>
> Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>
>> ---
>> v1 -> v2:
>>    - adjusting the description of patch.
>>
>> ---
>>   lib/ethdev/rte_ethdev.c      | 16 ++++++++++++++++
>>   lib/ethdev/rte_ethdev_core.h |  6 +++++-
>>   2 files changed, 21 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
>> index c607eab..6540432 100644
>> --- a/lib/ethdev/rte_ethdev.c
>> +++ b/lib/ethdev/rte_ethdev.c
>> @@ -1356,6 +1356,13 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
>>   		return -EBUSY;
>>   	}
>>   
>> +	/*
>> +	 * Ensure that "dev_configured" is always 0 each time prepare to do
>> +	 * dev_configure() to avoid any non-anticipated behaviour.
>> +	 * And set to 1 when dev_configure() is executed successfully.
>> +	 */
>> +	dev->data->dev_configured = 0;
>> +
>>   	 /* Store original config, as rollback required on failure */
>>   	memcpy(&orig_conf, &dev->data->dev_conf, sizeof(dev->data->dev_conf));
>>   
>> @@ -1606,6 +1613,8 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
>>   	}
>>   
>>   	rte_ethdev_trace_configure(port_id, nb_rx_q, nb_tx_q, dev_conf, 0);
>> +	dev->data->dev_configured = 1;
>> +
> I think it should be inserted before the trace, since tracing
> is intentionally put close to return without any empty lines
> in between.
All right. Do I need to send a patch V3?
>>   	return 0;
>>   reset_queues:
>>   	eth_dev_rx_queue_config(dev, 0);
>> @@ -1751,6 +1760,13 @@ rte_eth_dev_start(uint16_t port_id)
>>   
>>   	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_start, -ENOTSUP);
>>   
>> +	if (dev->data->dev_configured == 0) {
>> +		RTE_ETHDEV_LOG(INFO,
>> +			"Device with port_id=%"PRIu16" is not configured.\n",
>> +			port_id);
>> +		return -EINVAL;
>> +	}
>> +
>>   	if (dev->data->dev_started != 0) {
>>   		RTE_ETHDEV_LOG(INFO,
>>   			"Device with port_id=%"PRIu16" already started\n",
>> diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
>> index 4679d94..edf96de 100644
>> --- a/lib/ethdev/rte_ethdev_core.h
>> +++ b/lib/ethdev/rte_ethdev_core.h
>> @@ -167,7 +167,11 @@ struct rte_eth_dev_data {
>>   		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
>>   		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
>>   		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
>> -		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
>> +		lro         : 1,   /**< RX LRO is ON(1) / OFF(0) */
>> +		dev_configured : 1;
>> +		/**< Indicates whether the device is configured.
>> +		 *   CONFIGURED(1) / NOT CONFIGURED(0).
>> +		 */
>>   	uint8_t rx_queue_state[RTE_MAX_QUEUES_PER_PORT];
>>   		/**< Queues state: HAIRPIN(2) / STARTED(1) / STOPPED(0). */
>>   	uint8_t tx_queue_state[RTE_MAX_QUEUES_PER_PORT];
>>
> .

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] test: fix crypto_op length for sessionless case
  @ 2021-07-06 16:09  3%       ` Brandon Lo
  0 siblings, 0 replies; 200+ results
From: Brandon Lo @ 2021-07-06 16:09 UTC (permalink / raw)
  To: Gujjar, Abhinandan S
  Cc: Yigit, Ferruh, dev, jerinj, dpdklab, aconole, gakhil, Power,
	Ciara, Ali Alnubani

Hi all,

I have rerun the failing unit test. It also recreated the report, so
that category should be passing now.
Currently, I am looking more into the ABI test that is failing on
Arch, as well as the failures with DTS tests.
I will keep this thread updated.

Thanks,
Brandon

On Mon, Jul 5, 2021 at 2:30 AM Gujjar, Abhinandan S
<abhinandan.gujjar@intel.com> wrote:
>
> Hi Jerin/Akhil,
>
> Could you please review the patch?
>
> Regards
> Abhinandan
>
> > -----Original Message-----
> > From: Yigit, Ferruh <ferruh.yigit@intel.com>
> > Sent: Saturday, July 3, 2021 4:56 AM
> > To: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; dev@dpdk.org;
> > jerinj@marvell.com; dpdklab@iol.unh.edu; aconole@redhat.com
> > Cc: gakhil@marvell.com; Power, Ciara <ciara.power@intel.com>; Ali Alnubani
> > <alialnu@nvidia.com>
> > Subject: Re: [PATCH] test: fix crypto_op length for sessionless case
> >
> > On 7/2/2021 7:08 PM, Gujjar, Abhinandan S wrote:
> > > Hi Aaron/dpdklab,
> > >
> > > This patch's CI seems to have a lot of false positives!
> > > Ferruh triggered the re-test sometime back. Now, it is reporting less.
> > > Could you please check from your end? Thanks!
> > >
> >
> > Only a malloc-related unit test is still failing, which seems unrelated to the
> > patch. I am triggering it one more time, third time lucky.
> > >
> > Also, after the re-run, some tests that now pass are still shown as failing in the
> > patchwork checks table. Doesn't the re-run send the patchwork test status again?
> >
> > > Regards
> > > Abhinandan
> > >
> > >
> > >> -----Original Message-----
> > >> From: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>
> > >> Sent: Wednesday, June 30, 2021 6:17 PM
> > >> To: dev@dpdk.org; jerinj@marvell.com
> > >> Cc: gakhil@marvell.com; Gujjar, Abhinandan S
> > >> <abhinandan.gujjar@intel.com>; Power, Ciara <ciara.power@intel.com>
> > >> Subject: [PATCH] test: fix crypto_op length for sessionless case
> > >>
> > >> Currently, private_data_offset for the sessionless case is computed
> > >> wrongly: it includes extra bytes because
> > >> (sizeof(struct rte_crypto_sym_xform) * 2) is used instead of
> > >> (sizeof(union rte_event_crypto_metadata)). The resulting buffer
> > >> overflow corrupted memory, which crashed the test application while
> > >> freeing the ops mempool.
> > >>
> > >> Fixes: 3c2c535ecfc0 ("test: add event crypto adapter auto-test")
> > >> Reported-by: ciara.power@intel.com
> > >>
> > >> Signed-off-by: Abhinandan Gujjar <abhinandan.gujjar@intel.com>
> > >> ---
> > >>  app/test/test_event_crypto_adapter.c | 4 ++--
> > >>  1 file changed, 2 insertions(+), 2 deletions(-)
> > >>
> > >> diff --git a/app/test/test_event_crypto_adapter.c b/app/test/test_event_crypto_adapter.c
> > >> index f689bc1f2..688ac0b2f 100644
> > >> --- a/app/test/test_event_crypto_adapter.c
> > >> +++ b/app/test/test_event_crypto_adapter.c
> > >> @@ -229,7 +229,7 @@ test_op_forward_mode(uint8_t session_less)
> > >>               first_xform = &cipher_xform;
> > >>               sym_op->xform = first_xform;
> > >>               uint32_t len = IV_OFFSET + MAXIMUM_IV_LENGTH +
> > >> -                             (sizeof(struct rte_crypto_sym_xform) * 2);
> > >> +                             (sizeof(union rte_event_crypto_metadata));
> > >>               op->private_data_offset = len;
> > >>               /* Fill in private data information */
> > >>               rte_memcpy(&m_data.response_info, &response_info,
> > >> @@ -424,7 +424,7 @@ test_op_new_mode(uint8_t session_less)
> > >>               first_xform = &cipher_xform;
> > >>               sym_op->xform = first_xform;
> > >>               uint32_t len = IV_OFFSET + MAXIMUM_IV_LENGTH +
> > >> -                             (sizeof(struct rte_crypto_sym_xform) * 2);
> > >> +                             (sizeof(union rte_event_crypto_metadata));
> > >>               op->private_data_offset = len;
> > >>               /* Fill in private data information */
> > >>               rte_memcpy(&m_data.response_info, &response_info,
> > >> --
> > >> 2.25.1
> > >
>


-- 

Brandon Lo

UNH InterOperability Laboratory

21 Madbury Rd, Suite 100, Durham, NH 03824

blo@iol.unh.edu

www.iol.unh.edu

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH V2] ethdev: add dev configured flag
  @ 2021-07-06  8:36  4%   ` Andrew Rybchenko
  2021-07-07  2:55  0%     ` Huisong Li
  2021-07-07  7:39  3%     ` David Marchand
  0 siblings, 2 replies; 200+ results
From: Andrew Rybchenko @ 2021-07-06  8:36 UTC (permalink / raw)
  To: Huisong Li, dev
  Cc: thomas, ferruh.yigit, konstantin.ananyev, david.marchand, Ray Kinsella

@David, could you take a look at the ABI breakage warnings for
the patch? May we ignore them, since the ABI looks backward
compatible? Or should it be marked as a minor ABI change
that is backward compatible with DPDK_21?

On 7/6/21 7:10 AM, Huisong Li wrote:
> Currently, if dev_configure is not called or fails to be called, users
> can still call dev_start successfully. So it is necessary to have a flag
> which indicates whether the device is configured, to control whether
> dev_start can be called and eliminate dependency on user invocation order.
> 
> The flag stored in "struct rte_eth_dev_data" is more reasonable than
>  "enum rte_eth_dev_state". "enum rte_eth_dev_state" is private to the
> primary and secondary processes, and can be independently controlled.
> However, the secondary process does not make resource allocations and
> does not call dev_configure(). These are done by the primary process
> and can be obtained or used by the secondary process. So this patch
> adds a "dev_configured" flag in "rte_eth_dev_data", like "dev_started".
> 
> Signed-off-by: Huisong Li <lihuisong@huawei.com>

Reviewed-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

> ---
> v1 -> v2:
>   - adjusting the description of patch.
> 
> ---
>  lib/ethdev/rte_ethdev.c      | 16 ++++++++++++++++
>  lib/ethdev/rte_ethdev_core.h |  6 +++++-
>  2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c
> index c607eab..6540432 100644
> --- a/lib/ethdev/rte_ethdev.c
> +++ b/lib/ethdev/rte_ethdev.c
> @@ -1356,6 +1356,13 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
>  		return -EBUSY;
>  	}
>  
> +	/*
> +	 * Ensure that "dev_configured" is always 0 each time prepare to do
> +	 * dev_configure() to avoid any non-anticipated behaviour.
> +	 * And set to 1 when dev_configure() is executed successfully.
> +	 */
> +	dev->data->dev_configured = 0;
> +
>  	 /* Store original config, as rollback required on failure */
>  	memcpy(&orig_conf, &dev->data->dev_conf, sizeof(dev->data->dev_conf));
>  
> @@ -1606,6 +1613,8 @@ rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_q, uint16_t nb_tx_q,
>  	}
>  
>  	rte_ethdev_trace_configure(port_id, nb_rx_q, nb_tx_q, dev_conf, 0);
> +	dev->data->dev_configured = 1;
> +

I think it should be inserted before the trace, since tracing
is intentionally put close to return without any empty lines
in between.

>  	return 0;
>  reset_queues:
>  	eth_dev_rx_queue_config(dev, 0);
> @@ -1751,6 +1760,13 @@ rte_eth_dev_start(uint16_t port_id)
>  
>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_start, -ENOTSUP);
>  
> +	if (dev->data->dev_configured == 0) {
> +		RTE_ETHDEV_LOG(INFO,
> +			"Device with port_id=%"PRIu16" is not configured.\n",
> +			port_id);
> +		return -EINVAL;
> +	}
> +
>  	if (dev->data->dev_started != 0) {
>  		RTE_ETHDEV_LOG(INFO,
>  			"Device with port_id=%"PRIu16" already started\n",
> diff --git a/lib/ethdev/rte_ethdev_core.h b/lib/ethdev/rte_ethdev_core.h
> index 4679d94..edf96de 100644
> --- a/lib/ethdev/rte_ethdev_core.h
> +++ b/lib/ethdev/rte_ethdev_core.h
> @@ -167,7 +167,11 @@ struct rte_eth_dev_data {
>  		scattered_rx : 1,  /**< RX of scattered packets is ON(1) / OFF(0) */
>  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
>  		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> -		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> +		lro         : 1,   /**< RX LRO is ON(1) / OFF(0) */
> +		dev_configured : 1;
> +		/**< Indicates whether the device is configured.
> +		 *   CONFIGURED(1) / NOT CONFIGURED(0).
> +		 */
>  	uint8_t rx_queue_state[RTE_MAX_QUEUES_PER_PORT];
>  		/**< Queues state: HAIRPIN(2) / STARTED(1) / STOPPED(0). */
>  	uint8_t tx_queue_state[RTE_MAX_QUEUES_PER_PORT];
> 


^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [dpdk-stable] 20.11.2 patches review and test
  2021-07-06  3:26  0% ` [dpdk-dev] [dpdk-stable] " Kalesh Anakkur Purayil
@ 2021-07-06  6:47  0%   ` Xueming(Steven) Li
  0 siblings, 0 replies; 200+ results
From: Xueming(Steven) Li @ 2021-07-06  6:47 UTC (permalink / raw)
  To: Kalesh Anakkur Purayil
  Cc: dpdk stable, dpdk-dev, Abhishek Marathe, Akhil Goyal,
	Ali Alnubani, benjamin.walker, David Christensen,
	Hariprasad Govindharajan, Hemant Agrawal, Ian Stokes,
	Jerin Jacob, John McNamara, Ju-Hyoung Lee, Kevin Traynor,
	Luca Boccassi, Pei Zhang, pingx.yu, qian.q.xu, Raslan Darawsheh,
	NBU-Contact-Thomas Monjalon, yuan.peng, zhaoyan.chen

> 
> From: Kalesh Anakkur Purayil <kalesh-anakkur.purayil@broadcom.com> 
> Sent: Tuesday, July 6, 2021 11:27 AM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: dpdk stable <stable@dpdk.org>; dpdk-dev <dev@dpdk.org>; Abhishek Marathe <Abhishek.Marathe@microsoft.com>; Akhil Goyal <akhil.goyal@nxp.com>; Ali Alnubani <alialnu@nvidi
> Subject: Re: [dpdk-stable] 20.11.2 patches review and test
> 
> Hi Xueming,
> 
> Testing with dpdk v20.11.2 from Broadcom looks good.
> 
> - Basic functionality:
>   Send and receive multiple types of traffic.
> - Changing/checking link status through testpmd.
> - RSS tests.
> - TSO tests
> - VLAN filtering tests.
> - MAC filtering test
> - statistics tests
> - Checksum offload tests
> - MTU tests
> - Promiscuous tests
> - Allmulti test
> 
> NIC: BCM57414 NetXtreme-E 10Gb/25Gb Ethernet Controller, Firmware: 219.0.88.0
> NIC: BCM57508 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb Ethernet, Firmware : 220.0.0.100

Thanks very much!

> 
> Regards,
> Kalesh
> 
> On Sun, Jun 27, 2021 at 4:59 AM Xueming Li <xuemingl@nvidia.com> wrote:
> Hi all,
> 
> Here is a list of patches targeted for stable release 20.11.2.
> 
> The planned date for the final release is 6th July.
> 
> Please help with testing and validation of your use cases and report
> any issues/results with reply-all to this mail. For the final release
> the fixes and reported validations will be added to the release notes.
> 
> A release candidate tarball can be found at:
> 
>     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc2
> 
> These patches are located at branch 20.11 of dpdk-stable repo:
>     https://dpdk.org/browse/dpdk-stable/
> 
> Thanks.
> 
> Xueming Li <xuemingl@nvidia.com>
> 
> ---
> Adam Dybkowski (3):
>       common/qat: increase IM buffer size for GEN3
>       compress/qat: enable compression on GEN3
>       crypto/qat: fix null authentication request
> 
> Ajit Khaparde (7):
>       net/bnxt: fix RSS context cleanup
>       net/bnxt: check kvargs parsing
>       net/bnxt: fix resource cleanup
>       doc: fix formatting in testpmd guide
>       net/bnxt: fix mismatched type comparison in MAC restore
>       net/bnxt: check PCI config read
>       net/bnxt: fix mismatched type comparison in Rx
> 
> Alvin Zhang (11):
>       net/ice: fix VLAN filter with PF
>       net/i40e: fix input set field mask
>       net/igc: fix Rx RSS hash offload capability
>       net/igc: fix Rx error counter for bad length
>       net/e1000: fix Rx error counter for bad length
>       net/e1000: fix max Rx packet size
>       net/igc: fix Rx packet size
>       net/ice: fix fast mbuf freeing
>       net/iavf: fix VF to PF command failure handling
>       net/i40e: fix VF RSS configuration
>       net/igc: fix speed configuration
> 
> Anatoly Burakov (3):
>       fbarray: fix log message on truncation error
>       power: do not skip saving original P-state governor
>       power: save original ACPI governor always
> 
> Andrew Boyer (1):
>       net/ionic: fix completion type in lif init
> 
> Andrew Rybchenko (4):
>       net/failsafe: fix RSS hash offload reporting
>       net/failsafe: report minimum and maximum MTU
>       common/sfc_efx: remove GENEVE from supported tunnels
>       net/sfc: fix mark support in EF100 native Rx datapath
> 
> Andy Moreton (2):
>       common/sfc_efx/base: limit reported MCDI response length
>       common/sfc_efx/base: add missing MCDI response length checks
> 
> Ankur Dwivedi (1):
>       crypto/octeontx: fix session-less mode
> 
> Apeksha Gupta (1):
>       examples/l2fwd-crypto: skip masked devices
> 
> Arek Kusztal (1):
>       crypto/qat: fix offset for out-of-place scatter-gather
> 
> Beilei Xing (1):
>       net/i40evf: fix packet loss for X722
> 
> Bing Zhao (1):
>       net/mlx5: fix loopback for Direct Verbs queue
> 
> Bruce Richardson (2):
>       build: exclude meson files from examples installation
>       raw/ioat: fix script for configuring small number of queues
> 
> Chaoyong He (1):
>       doc: fix multiport syntax in nfp guide
> 
> Chenbo Xia (1):
>       examples/vhost: check memory table query
> 
> Chengchang Tang (20):
>       net/hns3: fix HW buffer size on MTU update
>       net/hns3: fix processing Tx offload flags
>       net/hns3: fix Tx checksum for UDP packets with special port
>       net/hns3: fix long task queue pairs reset time
>       ethdev: validate input in module EEPROM dump
>       ethdev: validate input in register info
>       ethdev: validate input in EEPROM info
>       net/hns3: fix rollback after setting PVID failure
>       net/hns3: fix timing in resetting queues
>       net/hns3: fix queue state when concurrent with reset
>       net/hns3: fix configure FEC when concurrent with reset
>       net/hns3: fix use of command status enumeration
>       examples: add eal cleanup to examples
>       net/bonding: fix adding itself as its slave
>       net/hns3: fix timing in mailbox
>       app/testpmd: fix max queue number for Tx offloads
>       net/tap: fix interrupt vector array size
>       net/bonding: fix socket ID check
>       net/tap: check ioctl on restore
>       examples/timer: fix time interval
> 
> Chengwen Feng (50):
>       net/hns3: fix flow counter value
>       net/hns3: fix VF mailbox head field
>       net/hns3: support get device version when dump register
>       net/hns3: fix some packet types
>       net/hns3: fix missing outer L4 UDP flag for VXLAN
>       net/hns3: remove VLAN/QinQ ptypes from support list
>       test: check thread creation
>       common/dpaax: fix possible null pointer access
>       examples/ethtool: remove unused parsing
>       net/hns3: fix flow director lock
>       net/e1000/base: fix timeout for shadow RAM write
>       net/hns3: fix setting default MAC address in bonding of VF
>       net/hns3: fix possible mismatched response of mailbox
>       net/hns3: fix VF handling LSC event in secondary process
>       net/hns3: fix verification of NEON support
>       mbuf: check shared memory before dumping dynamic space
>       eventdev: remove redundant thread name setting
>       eventdev: fix memory leakage on thread creation failure
>       net/kni: check init result
>       net/hns3: fix mailbox error message
>       net/hns3: fix processing link status message on PF
>       net/hns3: remove unused mailbox macro and struct
>       net/bonding: fix leak on remove
>       net/hns3: fix handling link update
>       net/i40e: fix negative VEB index
>       net/i40e: remove redundant VSI check in Tx queue setup
>       net/virtio: fix getline memory leakage
>       net/hns3: log time delta in decimal format
>       net/hns3: fix time delta calculation
>       net/hns3: remove unused macros
>       net/hns3: fix vector Rx burst limitation
>       net/hns3: remove read when enabling TM QCN error event
>       net/hns3: remove unused VMDq code
>       net/hns3: increase readability in logs
>       raw/ntb: check SPAD user index
>       raw/ntb: check memory allocations
>       ipc: check malloc sync reply result
>       eal: fix service core list parsing
>       ipc: use monotonic clock
>       net/hns3: return error on PCI config write failure
>       net/hns3: fix log on flow director clear
>       net/hns3: clear hash map on flow director clear
>       net/hns3: fix querying flow director counter for out param
>       net/hns3: fix TM QCN error event report by MSI-X
>       net/hns3: fix mailbox message ID in log
>       net/hns3: fix secondary process request start/stop Rx/Tx
>       net/hns3: fix ordering in secondary process initialization
>       net/hns3: fail setting FEC if one bit mode is not supported
>       net/mlx4: fix secondary process initialization ordering
>       net/mlx5: fix secondary process initialization ordering
> 
> Ciara Loftus (1):
>       net/af_xdp: fix error handling during Rx queue setup
> 
> Ciara Power (2):
>       telemetry: fix race on callbacks list
>       test/crypto: fix return value of a skipped test
> 
> Conor Walsh (1):
>       examples/l3fwd: fix LPM IPv6 subnets
> 
> Cristian Dumitrescu (3):
>       table: fix actions with different data size
>       pipeline: fix instruction translation
>       pipeline: fix endianness conversions
> 
> Dapeng Yu (3):
>       net/igc: remove MTU setting limitation
>       net/e1000: remove MTU setting limitation
>       examples/packet_ordering: fix port configuration
> 
> David Christensen (1):
>       config/ppc: reduce number of cores and NUMA nodes
> 
> David Harton (1):
>       net/ena: fix releasing Tx ring mbufs
> 
> David Hunt (4):
>       test/power: fix CPU frequency check
>       test/power: add turbo mode to frequency check
>       test/power: fix low frequency test when turbo enabled
>       test/power: fix turbo test
> 
> David Marchand (18):
>       doc: fix sphinx rtd theme import in GHA
>       service: clean references to removed symbol
>       eal: fix evaluation of log level option
>       ci: hook to GitHub Actions
>       ci: enable v21 ABI checks
>       ci: fix package installation in GitHub Actions
>       ci: ignore APT update failure in GitHub Actions
>       ci: catch coredumps
>       vhost: fix offload flags in Rx path
>       bus/fslmc: remove unused debug macro
>       eal: fix leak in shared lib mode detection
>       event/dpaa2: remove unused macros
>       net/ice/base: fix memory allocation wrapper
>       net/ice: fix leak on thread termination
>       devtools: fix orphan symbols check with busybox
>       net/vhost: restore pseudo TSO support
>       net/ark: fix leak on thread termination
>       build: fix drivers selection without Python
> 
> Dekel Peled (1):
>       common/mlx5: fix DevX read output buffer size
> 
> Dmitry Kozlyuk (4):
>       net/pcap: fix format string
>       eal/windows: add missing SPDX license tag
>       buildtools: fix all drivers disabled on Windows
>       examples/rxtx_callbacks: fix port ID format specifier
> 
> Ed Czeck (2):
>       net/ark: update packet director initial state
>       net/ark: refactor Rx buffer recovery
> 
> Elad Nachman (2):
>       kni: support async user request
>       kni: fix kernel deadlock with bifurcated device
> 
> Feifei Wang (2):
>       net/i40e: fix parsing packet type for NEON
>       test/trace: fix race on collected perf data
> 
> Ferruh Yigit (9):
>       power: remove duplicated symbols from map file
>       log/linux: make default output stderr
>       license: fix typos
>       drivers/net: fix FW version query
>       net/bnx2x: fix build with GCC 11
>       net/bnx2x: fix build with GCC 11
>       net/ice/base: fix build with GCC 11
>       net/tap: fix build with GCC 11
>       test/table: fix build with GCC 11
> 
> Gregory Etelson (2):
>       app/testpmd: fix tunnel offload flows cleanup
>       net/mlx5: fix tunnel offload private items location
> 
> Guoyang Zhou (1):
>       net/hinic: fix crash in secondary process
> 
> Haiyue Wang (1):
>       net/ixgbe: fix Rx errors statistics for UDP checksum
> 
> Harman Kalra (1):
>       event/octeontx2: fix device reconfigure for single slot
> 
> Heinrich Kuhn (1):
>       net/nfp: fix reporting of RSS capabilities
> 
> Hemant Agrawal (3):
>       ethdev: add missing buses in device iterator
>       crypto/dpaa_sec: affine the thread portal affinity
>       crypto/dpaa2_sec: fix close and uninit functions
> 
> Hongbo Zheng (9):
>       app/testpmd: fix Tx/Rx descriptor query error log
>       net/hns3: fix FLR miss detection
>       net/hns3: delete redundant blank line
>       bpf: fix JSLT validation
>       common/sfc_efx/base: fix dereferencing null pointer
>       power: fix sanity checks for guest channel read
>       net/hns3: fix VF alive notification after config restore
>       examples/l3fwd-power: fix empty poll thresholds
>       net/hns3: fix concurrent interrupt handling
> 
> Huisong Li (23):
>       net/hns3: fix device capabilities for copper media type
>       net/hns3: remove unused parameter markers
>       net/hns3: fix reporting undefined speed
>       net/hns3: fix link update when failed to get link info
>       net/hns3: fix flow control exception
>       app/testpmd: fix bitmap of link speeds when force speed
>       net/hns3: fix flow control mode
>       net/hns3: remove redundant mailbox response
>       net/hns3: fix DCB mode check
>       net/hns3: fix VMDq mode check
>       net/hns3: fix mbuf leakage
>       net/hns3: fix link status when port is stopped
>       net/hns3: fix link speed when port is down
>       app/testpmd: fix forward lcores number for DCB
>       app/testpmd: fix DCB forwarding configuration
>       app/testpmd: fix DCB re-configuration
>       app/testpmd: verify DCB config during forward config
>       net/hns3: fix Rx/Tx queue numbers check
>       net/hns3: fix requested FC mode rollback
>       net/hns3: remove meaningless packet buffer rollback
>       net/hns3: fix DCB configuration
>       net/hns3: fix DCB reconfiguration
>       net/hns3: fix link speed when VF device is down
> 
> Ibtisam Tariq (1):
>       examples/vhost_crypto: remove unused short option
> 
> Igor Chauskin (2):
>       net/ena: switch memcpy to optimized version
>       net/ena: fix parsing of large LLQ header device argument
> 
> Igor Russkikh (2):
>       net/qede: reduce log verbosity
>       net/qede: accept bigger RSS table
> 
> Ilya Maximets (1):
>       net/virtio: fix interrupt unregistering for listening socket
> 
> Ivan Malov (5):
>       net/sfc: fix buffer size for flow parse
>       net: fix comment in IPv6 header
>       net/sfc: fix error path inconsistency
>       common/sfc_efx/base: fix indication of MAE encap support
>       net/sfc: fix outer rule rollback on error
> 
> Jerin Jacob (1):
>       examples: fix pkg-config override
> 
> Jiawei Wang (4):
>       app/testpmd: fix NVGRE encap configuration
>       net/mlx5: fix resource release for mirror flow
>       net/mlx5: fix RSS flow item expansion for GRE key
>       net/mlx5: fix RSS flow item expansion for NVGRE
> 
> Jiawei Zhu (1):
>       net/mlx5: fix Rx segmented packets on mbuf starvation
> 
> Jiawen Wu (4):
>       net/txgbe: remove unused functions
>       net/txgbe: fix Rx missed packet counter
>       net/txgbe: update packet type
>       net/txgbe: fix QinQ strip
> 
> Jiayu Hu (2):
>       vhost: fix queue initialization
>       vhost: fix redundant vring status change notification
> 
> Jie Wang (1):
>       net/ice: fix VSI array out of bounds access
> 
> John Daley (2):
>       net/enic: fix flow initialization error handling
>       net/enic: enable GENEVE offload via VNIC configuration
> 
> Juraj Linkeš (1):
>       eal/arm64: fix platform register bit
> 
> Kai Ji (2):
>       test/crypto: fix auth-cipher compare length in OOP
>       test/crypto: copy offset data to OOP destination buffer
> 
> Kalesh AP (23):
>       net/bnxt: remove unused macro
>       net/bnxt: fix VNIC configuration
>       net/bnxt: fix firmware fatal error handling
>       net/bnxt: fix FW readiness check during recovery
>       net/bnxt: fix device readiness check
>       net/bnxt: fix VF info allocation
>       net/bnxt: fix HWRM and FW incompatibility handling
>       net/bnxt: mute some failure logs
>       app/testpmd: check MAC address query
>       net/bnxt: fix PCI write check
>       net/bnxt: fix link state operations
>       net/bnxt: fix timesync when PTP is not supported
>       net/bnxt: fix memory allocation for command response
>       net/bnxt: fix double free in port start failure
>       net/bnxt: fix configuring LRO
>       net/bnxt: fix health check alarm cancellation
>       net/bnxt: fix PTP support for Thor
>       net/bnxt: fix ring count calculation for Thor
>       net/bnxt: remove unnecessary forward declarations
>       net/bnxt: remove unused function parameters
>       net/bnxt: drop unused attribute
>       net/bnxt: fix single PF per port check
>       net/bnxt: prevent device access in error state
> 
> Kamil Vojanec (1):
>       net/mlx5/linux: fix firmware version
> 
> Kevin Traynor (5):
>       test/cmdline: fix inputs array
>       test/crypto: fix build with GCC 11
>       crypto/zuc: fix build with GCC 11
>       test: fix build with GCC 11
>       test/cmdline: silence clang 12 warning
> 
> Konstantin Ananyev (1):
>       acl: fix build with GCC 11
> 
> Lance Richardson (8):
>       net/bnxt: fix Rx buffer posting
>       net/bnxt: fix Tx length hint threshold
>       net/bnxt: fix handling of null flow mask
>       test: fix TCP header initialization
>       net/bnxt: fix Rx descriptor status
>       net/bnxt: fix Rx queue count
>       net/bnxt: fix dynamic VNIC count
>       eal: fix memory mapping on 32-bit target
> 
> Leyi Rong (1):
>       net/iavf: fix packet length parsing in AVX512
> 
> Li Zhang (1):
>       net/mlx5: fix flow actions index in cache
> 
> Luc Pelletier (2):
>       eal: fix race in control thread creation
>       eal: fix hang in control thread creation
> 
> Marvin Liu (5):
>       vhost: fix split ring potential buffer overflow
>       vhost: fix packed ring potential buffer overflow
>       vhost: fix batch dequeue potential buffer overflow
>       vhost: fix initialization of temporary header
>       vhost: fix initialization of async temporary header
> 
> Matan Azrad (5):
>       common/mlx5/linux: add glue function to query WQ
>       common/mlx5: add DevX command to query WQ
>       common/mlx5: add DevX commands for queue counters
>       vdpa/mlx5: fix virtq cleaning
>       vdpa/mlx5: fix device unplug
> 
> Michael Baum (1):
>       net/mlx5: fix flow age event triggering
> 
> Michal Krawczyk (5):
>       net/ena/base: improve style and comments
>       net/ena/base: fix type conversions by explicit casting
>       net/ena/base: destroy multiple wait events
>       net/ena: fix crash with unsupported device argument
>       net/ena: indicate Rx RSS hash presence
> 
> Min Hu (Connor) (25):
>       net/hns3: fix MTU config complexity
>       net/hns3: update HiSilicon copyright syntax
>       net/hns3: fix copyright date
>       examples/ptpclient: remove wrong comment
>       test/bpf: fix error message
>       doc: fix HiSilicon copyright syntax
>       net/hns3: remove unused macros
>       net/hns3: remove unused macro
>       app/eventdev: fix overflow in lcore list parsing
>       test/kni: fix a comment
>       test/kni: check init result
>       net/hns3: fix typos on comments
>       net/e1000: fix flow error message object
>       app/testpmd: fix division by zero on socket memory dump
>       net/kni: warn on stop failure
>       app/bbdev: check memory allocation
>       app/bbdev: fix HARQ error messages
>       raw/skeleton: add missing check after setting attribute
>       test/timer: check memzone allocation
>       app/crypto-perf: check memory allocation
>       examples/flow_classify: fix NUMA check of port and core
>       examples/l2fwd-cat: fix NUMA check of port and core
>       examples/skeleton: fix NUMA check of port and core
>       test: check flow classifier creation
>       test: fix division by zero
> 
> Murphy Yang (3):
>       net/ixgbe: fix RSS RETA being reset after port start
>       net/i40e: fix flow director config after flow validate
>       net/i40e: fix flow director for common pctypes
> 
> Natanael Copa (5):
>       common/dpaax/caamflib: fix build with musl
>       bus/dpaa: fix 64-bit arch detection
>       bus/dpaa: fix build with musl
>       net/cxgbe: remove use of uint type
>       app/testpmd: fix build with musl
> 
> Nipun Gupta (1):
>       bus/dpaa: fix statistics reading
> 
> Nithin Dabilpuram (3):
>       vfio: do not merge contiguous areas
>       vfio: fix DMA mapping granularity for IOVA as VA
>       test/mem: fix page size for external memory
> 
> Olivier Matz (1):
>       test/mempool: fix object initializer
> 
> Pallavi Kadam (1):
>       bus/pci: skip probing some Windows NDIS devices
> 
> Pavan Nikhilesh (4):
>       test/event: fix timeout accuracy
>       app/eventdev: fix timeout accuracy
>       app/eventdev: fix lcore parsing skipping last core
>       event/octeontx2: fix XAQ pool reconfigure
> 
> Pu Xu (1):
>       ip_frag: fix fragmenting IPv4 packet with header option
> 
> Qi Zhang (8):
>       net/ice/base: fix payload indicator on ptype
>       net/ice/base: fix uninitialized struct
>       net/ice/base: cleanup filter list on error
>       net/ice/base: fix memory allocation for MAC addresses
>       net/iavf: fix TSO max segment size
>       doc: fix matching versions in ice guide
>       net/iavf: fix wrong Tx context descriptor
>       common/iavf: fix duplicated offload bit
> 
> Radha Mohan Chintakuntla (1):
>       raw/octeontx2_dma: assign PCI device in DPI VF
> 
> Raslan Darawsheh (1):
>       ethdev: update flow item GTP QFI definition
> 
> Richael Zhuang (2):
>       test/power: add delay before checking CPU frequency
>       test/power: round CPU frequency to check
> 
> Robin Zhang (6):
>       net/i40e: announce request queue capability in PF
>       doc: update recommended versions for i40e
>       net/i40e: fix lack of MAC type when set MAC address
>       net/iavf: fix lack of MAC type when set MAC address
>       net/iavf: fix primary MAC type when starting port
>       net/i40e: fix primary MAC type when starting port
> 
> Rohit Raj (3):
>       net/dpaa2: fix getting link status
>       net/dpaa: fix getting link status
>       examples/l2fwd-crypto: fix packet length while decryption
> 
> Roy Shterman (1):
>       mem: fix freeing segments in --huge-unlink mode
> 
> Satheesh Paul (1):
>       net/octeontx2: fix VLAN filter
> 
> Savinay Dharmappa (1):
>       sched: fix traffic class oversubscription parameter
> 
> Shijith Thotton (3):
>       eventdev: fix case to initiate crypto adapter service
>       event/octeontx2: fix crypto adapter queue pair operations
>       event/octeontx2: configure crypto adapter xaq pool
> 
> Siwar Zitouni (1):
>       net/ice: fix disabling promiscuous mode
> 
> Somnath Kotur (5):
>       net/bnxt: fix xstats get
>       net/bnxt: fix Rx and Tx timestamps
>       net/bnxt: fix Tx timestamp init
>       net/bnxt: refactor multi-queue Rx configuration
>       net/bnxt: fix Rx timestamp when FIFO pending bit is set
> 
> Stanislaw Kardach (6):
>       test: proceed if timer subsystem already initialized
>       stack: allow lock-free only on relevant architectures
>       test/distributor: fix worker notification in burst mode
>       test/distributor: fix burst flush on worker quit
>       net/ena: remove endian swap functions
>       net/ena: report default ring size
> 
> Stephen Hemminger (2):
>       kni: refactor user request processing
>       net/bnxt: use prefix on global function
> 
> Suanming Mou (1):
>       net/mlx5: fix counter offset detection
> 
> Tal Shnaiderman (2):
>       eal/windows: fix default thread priority
>       eal/windows: fix return codes of pthread shim layer
> 
> Tengfei Zhang (1):
>       net/pcap: fix file descriptor leak on close
> 
> Thinh Tran (1):
>       test: fix autotest handling of skipped tests
> 
> Thomas Monjalon (18):
>       bus/pci: fix Windows kernel driver categories
>       eal: fix comment of OS-specific header files
>       buildtools: fix build with busybox
>       build: detect execinfo library on Linux
>       build: remove redundant _GNU_SOURCE definitions
>       eal: fix build with musl
>       net/igc: remove use of uint type
>       event/dlb: fix header includes for musl
>       examples/bbdev: fix header include for musl
>       drivers: fix log level after loading
>       app/regex: fix usage text
>       app/testpmd: fix usage text
>       doc: fix names of UIO drivers
>       doc: fix build with Sphinx 4
>       bus/pci: support I/O port operations with musl
>       app: fix exit messages
>       regex/octeontx2: remove unused include directory
>       doc: remove PDF requirements
> 
> Tianyu Li (1):
>       net/memif: fix Tx bps statistics for zero-copy
> 
> Timothy McDaniel (2):
>       event/dlb2: remove references to deferred scheduling
>       doc: fix runtime options in DLB2 guide
> 
> Tyler Retzlaff (1):
>       eal: add C++ include guard for reciprocal header
> 
> Vadim Podovinnikov (1):
>       net/bonding: fix LACP system address check
> 
> Venkat Duvvuru (1):
>       net/bnxt: fix queues per VNIC
> 
> Viacheslav Ovsiienko (16):
>       net/mlx5: fix external buffer pool registration for Rx queue
>       net/mlx5: fix metadata item validation for ingress flows
>       net/mlx5: fix hashed list size for tunnel flow groups
>       net/mlx5: fix UAR allocation diagnostics messages
>       common/mlx5: add timestamp format support to DevX
>       vdpa/mlx5: support timestamp format
>       net/mlx5: fix Rx metadata leftovers
>       net/mlx5: fix drop action for Direct Rules/Verbs
>       net/mlx4: fix RSS action with null hash key
>       net/mlx5: support timestamp format
>       regex/mlx5: support timestamp format
>       app/testpmd: fix segment number check
>       net/mlx5: remove drop queue function prototypes
>       net/mlx4: fix buffer leakage on device close
>       net/mlx5: fix probing device in legacy bonding mode
>       net/mlx5: fix receiving queue timestamp format
> 
> Wei Huang (1):
>       raw/ifpga: fix device name format
> 
> Wenjun Wu (3):
>       net/ice: check some functions return
>       net/ice: fix RSS hash update
>       net/ice: fix RSS for L2 packet
> 
> Wenwu Ma (1):
>       net/ice: fix illegal access when removing MAC filter
> 
> Wenzhuo Lu (2):
>       net/iavf: fix crash in AVX512
>       net/ice: fix crash in AVX512
> 
> Wisam Jaddo (1):
>       app/flow-perf: fix encap/decap actions
> 
> Xiao Wang (1):
>       vdpa/ifc: check PCI config read
> 
> Xiaoyu Min (4):
>       net/mlx5: support RSS expansion for IPv6 GRE
>       net/mlx5: fix shared inner RSS
>       net/mlx5: fix missing shared RSS hash types
>       net/mlx5: fix redundant flow after RSS expansion
> 
> Xiaoyun Li (2):
>       app/testpmd: remove unnecessary UDP tunnel check
>       net/i40e: fix IPv4 fragment offload
> 
> Xueming Li (2):
>       version: 20.11.2-rc1
>       net/virtio: fix vectorized Rx queue rearm
> 
> Youri Querry (1):
>       bus/fslmc: fix random portal hangs with qbman 5.0
> 
> Yunjian Wang (5):
>       vfio: fix API description
>       net/mlx5: fix using flow tunnel before null check
>       vfio: fix duplicated user mem map
>       net/mlx4: fix leak when configured repeatedly
>       net/mlx5: fix leak when configured repeatedly
> 
> 
> 
> -- 
> Regards,
> Kalesh A P
>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [dpdk-stable] 20.11.2 patches review and test
  2021-06-26 23:28  1% Xueming Li
  2021-06-30 10:33  0% ` Jiang, YuX
@ 2021-07-06  3:26  0% ` Kalesh Anakkur Purayil
  2021-07-06  6:47  0%   ` Xueming(Steven) Li
  1 sibling, 1 reply; 200+ results
From: Kalesh Anakkur Purayil @ 2021-07-06  3:26 UTC (permalink / raw)
  To: Xueming Li
  Cc: dpdk stable, dpdk-dev, Abhishek Marathe, Akhil Goyal,
	Ali Alnubani, benjamin.walker, David Christensen,
	Hariprasad Govindharajan, Hemant Agrawal, Ian Stokes,
	Jerin Jacob, John McNamara, Ju-Hyoung Lee, Kevin Traynor,
	Luca Boccassi, Pei Zhang, pingx.yu, qian.q.xu, Raslan Darawsheh,
	Thomas Monjalon, yuan.peng, zhaoyan.chen

Hi Xueming,

Testing with dpdk v20.11.2 from Broadcom looks good.

- Basic functionality:
  Send and receive multiple types of traffic.
- Changing/checking link status through testpmd.
- RSS tests.
- TSO tests
- VLAN filtering tests.
- MAC filtering test
- statistics tests
- Checksum offload tests
- MTU tests
- Promiscuous tests
- Allmulti test

NIC: BCM57414 NetXtreme-E 10Gb/25Gb Ethernet Controller, Firmware: 219.0.88.0
NIC: BCM57508 NetXtreme-E 10Gb/25Gb/40Gb/50Gb/100Gb/200Gb Ethernet, Firmware: 220.0.0.100

Regards,
Kalesh

On Sun, Jun 27, 2021 at 4:59 AM Xueming Li <xuemingl@nvidia.com> wrote:

> Hi all,
>
> Here is a list of patches targeted for stable release 20.11.2.
>
> The planned date for the final release is 6th July.
>
> Please help with testing and validation of your use cases and report
> any issues/results with reply-all to this mail. For the final release
> the fixes and reported validations will be added to the release notes.
>
> A release candidate tarball can be found at:
>
>     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc2
>
> These patches are located at branch 20.11 of dpdk-stable repo:
>     https://dpdk.org/browse/dpdk-stable/
>
> Thanks.
>
> Xueming Li <xuemingl@nvidia.com>
>
> ---
> Adam Dybkowski (3):
>       common/qat: increase IM buffer size for GEN3
>       compress/qat: enable compression on GEN3
>       crypto/qat: fix null authentication request
>
> Ajit Khaparde (7):
>       net/bnxt: fix RSS context cleanup
>       net/bnxt: check kvargs parsing
>       net/bnxt: fix resource cleanup
>       doc: fix formatting in testpmd guide
>       net/bnxt: fix mismatched type comparison in MAC restore
>       net/bnxt: check PCI config read
>       net/bnxt: fix mismatched type comparison in Rx
>
> Alvin Zhang (11):
>       net/ice: fix VLAN filter with PF
>       net/i40e: fix input set field mask
>       net/igc: fix Rx RSS hash offload capability
>       net/igc: fix Rx error counter for bad length
>       net/e1000: fix Rx error counter for bad length
>       net/e1000: fix max Rx packet size
>       net/igc: fix Rx packet size
>       net/ice: fix fast mbuf freeing
>       net/iavf: fix VF to PF command failure handling
>       net/i40e: fix VF RSS configuration
>       net/igc: fix speed configuration
>
> Anatoly Burakov (3):
>       fbarray: fix log message on truncation error
>       power: do not skip saving original P-state governor
>       power: save original ACPI governor always
>
> Andrew Boyer (1):
>       net/ionic: fix completion type in lif init
>
> Andrew Rybchenko (4):
>       net/failsafe: fix RSS hash offload reporting
>       net/failsafe: report minimum and maximum MTU
>       common/sfc_efx: remove GENEVE from supported tunnels
>       net/sfc: fix mark support in EF100 native Rx datapath
>
> Andy Moreton (2):
>       common/sfc_efx/base: limit reported MCDI response length
>       common/sfc_efx/base: add missing MCDI response length checks
>
> Ankur Dwivedi (1):
>       crypto/octeontx: fix session-less mode
>
> Apeksha Gupta (1):
>       examples/l2fwd-crypto: skip masked devices
>
> Arek Kusztal (1):
>       crypto/qat: fix offset for out-of-place scatter-gather
>
> Beilei Xing (1):
>       net/i40evf: fix packet loss for X722
>
> Bing Zhao (1):
>       net/mlx5: fix loopback for Direct Verbs queue
>
> Bruce Richardson (2):
>       build: exclude meson files from examples installation
>       raw/ioat: fix script for configuring small number of queues
>
> Chaoyong He (1):
>       doc: fix multiport syntax in nfp guide
>
> Chenbo Xia (1):
>       examples/vhost: check memory table query
>
> Chengchang Tang (20):
>       net/hns3: fix HW buffer size on MTU update
>       net/hns3: fix processing Tx offload flags
>       net/hns3: fix Tx checksum for UDP packets with special port
>       net/hns3: fix long task queue pairs reset time
>       ethdev: validate input in module EEPROM dump
>       ethdev: validate input in register info
>       ethdev: validate input in EEPROM info
>       net/hns3: fix rollback after setting PVID failure
>       net/hns3: fix timing in resetting queues
>       net/hns3: fix queue state when concurrent with reset
>       net/hns3: fix configure FEC when concurrent with reset
>       net/hns3: fix use of command status enumeration
>       examples: add eal cleanup to examples
>       net/bonding: fix adding itself as its slave
>       net/hns3: fix timing in mailbox
>       app/testpmd: fix max queue number for Tx offloads
>       net/tap: fix interrupt vector array size
>       net/bonding: fix socket ID check
>       net/tap: check ioctl on restore
>       examples/timer: fix time interval
>
> Chengwen Feng (50):
>       net/hns3: fix flow counter value
>       net/hns3: fix VF mailbox head field
>       net/hns3: support get device version when dump register
>       net/hns3: fix some packet types
>       net/hns3: fix missing outer L4 UDP flag for VXLAN
>       net/hns3: remove VLAN/QinQ ptypes from support list
>       test: check thread creation
>       common/dpaax: fix possible null pointer access
>       examples/ethtool: remove unused parsing
>       net/hns3: fix flow director lock
>       net/e1000/base: fix timeout for shadow RAM write
>       net/hns3: fix setting default MAC address in bonding of VF
>       net/hns3: fix possible mismatched response of mailbox
>       net/hns3: fix VF handling LSC event in secondary process
>       net/hns3: fix verification of NEON support
>       mbuf: check shared memory before dumping dynamic space
>       eventdev: remove redundant thread name setting
>       eventdev: fix memory leakage on thread creation failure
>       net/kni: check init result
>       net/hns3: fix mailbox error message
>       net/hns3: fix processing link status message on PF
>       net/hns3: remove unused mailbox macro and struct
>       net/bonding: fix leak on remove
>       net/hns3: fix handling link update
>       net/i40e: fix negative VEB index
>       net/i40e: remove redundant VSI check in Tx queue setup
>       net/virtio: fix getline memory leakage
>       net/hns3: log time delta in decimal format
>       net/hns3: fix time delta calculation
>       net/hns3: remove unused macros
>       net/hns3: fix vector Rx burst limitation
>       net/hns3: remove read when enabling TM QCN error event
>       net/hns3: remove unused VMDq code
>       net/hns3: increase readability in logs
>       raw/ntb: check SPAD user index
>       raw/ntb: check memory allocations
>       ipc: check malloc sync reply result
>       eal: fix service core list parsing
>       ipc: use monotonic clock
>       net/hns3: return error on PCI config write failure
>       net/hns3: fix log on flow director clear
>       net/hns3: clear hash map on flow director clear
>       net/hns3: fix querying flow director counter for out param
>       net/hns3: fix TM QCN error event report by MSI-X
>       net/hns3: fix mailbox message ID in log
>       net/hns3: fix secondary process request start/stop Rx/Tx
>       net/hns3: fix ordering in secondary process initialization
>       net/hns3: fail setting FEC if one bit mode is not supported
>       net/mlx4: fix secondary process initialization ordering
>       net/mlx5: fix secondary process initialization ordering
>
> Ciara Loftus (1):
>       net/af_xdp: fix error handling during Rx queue setup
>
> Ciara Power (2):
>       telemetry: fix race on callbacks list
>       test/crypto: fix return value of a skipped test
>
> Conor Walsh (1):
>       examples/l3fwd: fix LPM IPv6 subnets
>
> Cristian Dumitrescu (3):
>       table: fix actions with different data size
>       pipeline: fix instruction translation
>       pipeline: fix endianness conversions
>
> Dapeng Yu (3):
>       net/igc: remove MTU setting limitation
>       net/e1000: remove MTU setting limitation
>       examples/packet_ordering: fix port configuration
>
> David Christensen (1):
>       config/ppc: reduce number of cores and NUMA nodes
>
> David Harton (1):
>       net/ena: fix releasing Tx ring mbufs
>
> David Hunt (4):
>       test/power: fix CPU frequency check
>       test/power: add turbo mode to frequency check
>       test/power: fix low frequency test when turbo enabled
>       test/power: fix turbo test
>
> David Marchand (18):
>       doc: fix sphinx rtd theme import in GHA
>       service: clean references to removed symbol
>       eal: fix evaluation of log level option
>       ci: hook to GitHub Actions
>       ci: enable v21 ABI checks
>       ci: fix package installation in GitHub Actions
>       ci: ignore APT update failure in GitHub Actions
>       ci: catch coredumps
>       vhost: fix offload flags in Rx path
>       bus/fslmc: remove unused debug macro
>       eal: fix leak in shared lib mode detection
>       event/dpaa2: remove unused macros
>       net/ice/base: fix memory allocation wrapper
>       net/ice: fix leak on thread termination
>       devtools: fix orphan symbols check with busybox
>       net/vhost: restore pseudo TSO support
>       net/ark: fix leak on thread termination
>       build: fix drivers selection without Python
>
> Dekel Peled (1):
>       common/mlx5: fix DevX read output buffer size
>
> Dmitry Kozlyuk (4):
>       net/pcap: fix format string
>       eal/windows: add missing SPDX license tag
>       buildtools: fix all drivers disabled on Windows
>       examples/rxtx_callbacks: fix port ID format specifier
>
> Ed Czeck (2):
>       net/ark: update packet director initial state
>       net/ark: refactor Rx buffer recovery
>
> Elad Nachman (2):
>       kni: support async user request
>       kni: fix kernel deadlock with bifurcated device
>
> Feifei Wang (2):
>       net/i40e: fix parsing packet type for NEON
>       test/trace: fix race on collected perf data
>
> Ferruh Yigit (9):
>       power: remove duplicated symbols from map file
>       log/linux: make default output stderr
>       license: fix typos
>       drivers/net: fix FW version query
>       net/bnx2x: fix build with GCC 11
>       net/bnx2x: fix build with GCC 11
>       net/ice/base: fix build with GCC 11
>       net/tap: fix build with GCC 11
>       test/table: fix build with GCC 11
>
> Gregory Etelson (2):
>       app/testpmd: fix tunnel offload flows cleanup
>       net/mlx5: fix tunnel offload private items location
>
> Guoyang Zhou (1):
>       net/hinic: fix crash in secondary process
>
> Haiyue Wang (1):
>       net/ixgbe: fix Rx errors statistics for UDP checksum
>
> Harman Kalra (1):
>       event/octeontx2: fix device reconfigure for single slot
>
> Heinrich Kuhn (1):
>       net/nfp: fix reporting of RSS capabilities
>
> Hemant Agrawal (3):
>       ethdev: add missing buses in device iterator
>       crypto/dpaa_sec: affine the thread portal affinity
>       crypto/dpaa2_sec: fix close and uninit functions
>
> Hongbo Zheng (9):
>       app/testpmd: fix Tx/Rx descriptor query error log
>       net/hns3: fix FLR miss detection
>       net/hns3: delete redundant blank line
>       bpf: fix JSLT validation
>       common/sfc_efx/base: fix dereferencing null pointer
>       power: fix sanity checks for guest channel read
>       net/hns3: fix VF alive notification after config restore
>       examples/l3fwd-power: fix empty poll thresholds
>       net/hns3: fix concurrent interrupt handling
>
> Huisong Li (23):
>       net/hns3: fix device capabilities for copper media type
>       net/hns3: remove unused parameter markers
>       net/hns3: fix reporting undefined speed
>       net/hns3: fix link update when failed to get link info
>       net/hns3: fix flow control exception
>       app/testpmd: fix bitmap of link speeds when force speed
>       net/hns3: fix flow control mode
>       net/hns3: remove redundant mailbox response
>       net/hns3: fix DCB mode check
>       net/hns3: fix VMDq mode check
>       net/hns3: fix mbuf leakage
>       net/hns3: fix link status when port is stopped
>       net/hns3: fix link speed when port is down
>       app/testpmd: fix forward lcores number for DCB
>       app/testpmd: fix DCB forwarding configuration
>       app/testpmd: fix DCB re-configuration
>       app/testpmd: verify DCB config during forward config
>       net/hns3: fix Rx/Tx queue numbers check
>       net/hns3: fix requested FC mode rollback
>       net/hns3: remove meaningless packet buffer rollback
>       net/hns3: fix DCB configuration
>       net/hns3: fix DCB reconfiguration
>       net/hns3: fix link speed when VF device is down
>
> Ibtisam Tariq (1):
>       examples/vhost_crypto: remove unused short option
>
> Igor Chauskin (2):
>       net/ena: switch memcpy to optimized version
>       net/ena: fix parsing of large LLQ header device argument
>
> Igor Russkikh (2):
>       net/qede: reduce log verbosity
>       net/qede: accept bigger RSS table
>
> Ilya Maximets (1):
>       net/virtio: fix interrupt unregistering for listening socket
>
> Ivan Malov (5):
>       net/sfc: fix buffer size for flow parse
>       net: fix comment in IPv6 header
>       net/sfc: fix error path inconsistency
>       common/sfc_efx/base: fix indication of MAE encap support
>       net/sfc: fix outer rule rollback on error
>
> Jerin Jacob (1):
>       examples: fix pkg-config override
>
> Jiawei Wang (4):
>       app/testpmd: fix NVGRE encap configuration
>       net/mlx5: fix resource release for mirror flow
>       net/mlx5: fix RSS flow item expansion for GRE key
>       net/mlx5: fix RSS flow item expansion for NVGRE
>
> Jiawei Zhu (1):
>       net/mlx5: fix Rx segmented packets on mbuf starvation
>
> Jiawen Wu (4):
>       net/txgbe: remove unused functions
>       net/txgbe: fix Rx missed packet counter
>       net/txgbe: update packet type
>       net/txgbe: fix QinQ strip
>
> Jiayu Hu (2):
>       vhost: fix queue initialization
>       vhost: fix redundant vring status change notification
>
> Jie Wang (1):
>       net/ice: fix VSI array out of bounds access
>
> John Daley (2):
>       net/enic: fix flow initialization error handling
>       net/enic: enable GENEVE offload via VNIC configuration
>
> Juraj Linkeš (1):
>       eal/arm64: fix platform register bit
>
> Kai Ji (2):
>       test/crypto: fix auth-cipher compare length in OOP
>       test/crypto: copy offset data to OOP destination buffer
>
> Kalesh AP (23):
>       net/bnxt: remove unused macro
>       net/bnxt: fix VNIC configuration
>       net/bnxt: fix firmware fatal error handling
>       net/bnxt: fix FW readiness check during recovery
>       net/bnxt: fix device readiness check
>       net/bnxt: fix VF info allocation
>       net/bnxt: fix HWRM and FW incompatibility handling
>       net/bnxt: mute some failure logs
>       app/testpmd: check MAC address query
>       net/bnxt: fix PCI write check
>       net/bnxt: fix link state operations
>       net/bnxt: fix timesync when PTP is not supported
>       net/bnxt: fix memory allocation for command response
>       net/bnxt: fix double free in port start failure
>       net/bnxt: fix configuring LRO
>       net/bnxt: fix health check alarm cancellation
>       net/bnxt: fix PTP support for Thor
>       net/bnxt: fix ring count calculation for Thor
>       net/bnxt: remove unnecessary forward declarations
>       net/bnxt: remove unused function parameters
>       net/bnxt: drop unused attribute
>       net/bnxt: fix single PF per port check
>       net/bnxt: prevent device access in error state
>
> Kamil Vojanec (1):
>       net/mlx5/linux: fix firmware version
>
> Kevin Traynor (5):
>       test/cmdline: fix inputs array
>       test/crypto: fix build with GCC 11
>       crypto/zuc: fix build with GCC 11
>       test: fix build with GCC 11
>       test/cmdline: silence clang 12 warning
>
> Konstantin Ananyev (1):
>       acl: fix build with GCC 11
>
> Lance Richardson (8):
>       net/bnxt: fix Rx buffer posting
>       net/bnxt: fix Tx length hint threshold
>       net/bnxt: fix handling of null flow mask
>       test: fix TCP header initialization
>       net/bnxt: fix Rx descriptor status
>       net/bnxt: fix Rx queue count
>       net/bnxt: fix dynamic VNIC count
>       eal: fix memory mapping on 32-bit target
>
> Leyi Rong (1):
>       net/iavf: fix packet length parsing in AVX512
>
> Li Zhang (1):
>       net/mlx5: fix flow actions index in cache
>
> Luc Pelletier (2):
>       eal: fix race in control thread creation
>       eal: fix hang in control thread creation
>
> Marvin Liu (5):
>       vhost: fix split ring potential buffer overflow
>       vhost: fix packed ring potential buffer overflow
>       vhost: fix batch dequeue potential buffer overflow
>       vhost: fix initialization of temporary header
>       vhost: fix initialization of async temporary header
>
> Matan Azrad (5):
>       common/mlx5/linux: add glue function to query WQ
>       common/mlx5: add DevX command to query WQ
>       common/mlx5: add DevX commands for queue counters
>       vdpa/mlx5: fix virtq cleaning
>       vdpa/mlx5: fix device unplug
>
> Michael Baum (1):
>       net/mlx5: fix flow age event triggering
>
> Michal Krawczyk (5):
>       net/ena/base: improve style and comments
>       net/ena/base: fix type conversions by explicit casting
>       net/ena/base: destroy multiple wait events
>       net/ena: fix crash with unsupported device argument
>       net/ena: indicate Rx RSS hash presence
>
> Min Hu (Connor) (25):
>       net/hns3: fix MTU config complexity
>       net/hns3: update HiSilicon copyright syntax
>       net/hns3: fix copyright date
>       examples/ptpclient: remove wrong comment
>       test/bpf: fix error message
>       doc: fix HiSilicon copyright syntax
>       net/hns3: remove unused macros
>       net/hns3: remove unused macro
>       app/eventdev: fix overflow in lcore list parsing
>       test/kni: fix a comment
>       test/kni: check init result
>       net/hns3: fix typos on comments
>       net/e1000: fix flow error message object
>       app/testpmd: fix division by zero on socket memory dump
>       net/kni: warn on stop failure
>       app/bbdev: check memory allocation
>       app/bbdev: fix HARQ error messages
>       raw/skeleton: add missing check after setting attribute
>       test/timer: check memzone allocation
>       app/crypto-perf: check memory allocation
>       examples/flow_classify: fix NUMA check of port and core
>       examples/l2fwd-cat: fix NUMA check of port and core
>       examples/skeleton: fix NUMA check of port and core
>       test: check flow classifier creation
>       test: fix division by zero
>
> Murphy Yang (3):
>       net/ixgbe: fix RSS RETA being reset after port start
>       net/i40e: fix flow director config after flow validate
>       net/i40e: fix flow director for common pctypes
>
> Natanael Copa (5):
>       common/dpaax/caamflib: fix build with musl
>       bus/dpaa: fix 64-bit arch detection
>       bus/dpaa: fix build with musl
>       net/cxgbe: remove use of uint type
>       app/testpmd: fix build with musl
>
> Nipun Gupta (1):
>       bus/dpaa: fix statistics reading
>
> Nithin Dabilpuram (3):
>       vfio: do not merge contiguous areas
>       vfio: fix DMA mapping granularity for IOVA as VA
>       test/mem: fix page size for external memory
>
> Olivier Matz (1):
>       test/mempool: fix object initializer
>
> Pallavi Kadam (1):
>       bus/pci: skip probing some Windows NDIS devices
>
> Pavan Nikhilesh (4):
>       test/event: fix timeout accuracy
>       app/eventdev: fix timeout accuracy
>       app/eventdev: fix lcore parsing skipping last core
>       event/octeontx2: fix XAQ pool reconfigure
>
> Pu Xu (1):
>       ip_frag: fix fragmenting IPv4 packet with header option
>
> Qi Zhang (8):
>       net/ice/base: fix payload indicator on ptype
>       net/ice/base: fix uninitialized struct
>       net/ice/base: cleanup filter list on error
>       net/ice/base: fix memory allocation for MAC addresses
>       net/iavf: fix TSO max segment size
>       doc: fix matching versions in ice guide
>       net/iavf: fix wrong Tx context descriptor
>       common/iavf: fix duplicated offload bit
>
> Radha Mohan Chintakuntla (1):
>       raw/octeontx2_dma: assign PCI device in DPI VF
>
> Raslan Darawsheh (1):
>       ethdev: update flow item GTP QFI definition
>
> Richael Zhuang (2):
>       test/power: add delay before checking CPU frequency
>       test/power: round CPU frequency to check
>
> Robin Zhang (6):
>       net/i40e: announce request queue capability in PF
>       doc: update recommended versions for i40e
>       net/i40e: fix lack of MAC type when set MAC address
>       net/iavf: fix lack of MAC type when set MAC address
>       net/iavf: fix primary MAC type when starting port
>       net/i40e: fix primary MAC type when starting port
>
> Rohit Raj (3):
>       net/dpaa2: fix getting link status
>       net/dpaa: fix getting link status
>       examples/l2fwd-crypto: fix packet length while decryption
>
> Roy Shterman (1):
>       mem: fix freeing segments in --huge-unlink mode
>
> Satheesh Paul (1):
>       net/octeontx2: fix VLAN filter
>
> Savinay Dharmappa (1):
>       sched: fix traffic class oversubscription parameter
>
> Shijith Thotton (3):
>       eventdev: fix case to initiate crypto adapter service
>       event/octeontx2: fix crypto adapter queue pair operations
>       event/octeontx2: configure crypto adapter xaq pool
>
> Siwar Zitouni (1):
>       net/ice: fix disabling promiscuous mode
>
> Somnath Kotur (5):
>       net/bnxt: fix xstats get
>       net/bnxt: fix Rx and Tx timestamps
>       net/bnxt: fix Tx timestamp init
>       net/bnxt: refactor multi-queue Rx configuration
>       net/bnxt: fix Rx timestamp when FIFO pending bit is set
>
> Stanislaw Kardach (6):
>       test: proceed if timer subsystem already initialized
>       stack: allow lock-free only on relevant architectures
>       test/distributor: fix worker notification in burst mode
>       test/distributor: fix burst flush on worker quit
>       net/ena: remove endian swap functions
>       net/ena: report default ring size
>
> Stephen Hemminger (2):
>       kni: refactor user request processing
>       net/bnxt: use prefix on global function
>
> Suanming Mou (1):
>       net/mlx5: fix counter offset detection
>
> Tal Shnaiderman (2):
>       eal/windows: fix default thread priority
>       eal/windows: fix return codes of pthread shim layer
>
> Tengfei Zhang (1):
>       net/pcap: fix file descriptor leak on close
>
> Thinh Tran (1):
>       test: fix autotest handling of skipped tests
>
> Thomas Monjalon (18):
>       bus/pci: fix Windows kernel driver categories
>       eal: fix comment of OS-specific header files
>       buildtools: fix build with busybox
>       build: detect execinfo library on Linux
>       build: remove redundant _GNU_SOURCE definitions
>       eal: fix build with musl
>       net/igc: remove use of uint type
>       event/dlb: fix header includes for musl
>       examples/bbdev: fix header include for musl
>       drivers: fix log level after loading
>       app/regex: fix usage text
>       app/testpmd: fix usage text
>       doc: fix names of UIO drivers
>       doc: fix build with Sphinx 4
>       bus/pci: support I/O port operations with musl
>       app: fix exit messages
>       regex/octeontx2: remove unused include directory
>       doc: remove PDF requirements
>
> Tianyu Li (1):
>       net/memif: fix Tx bps statistics for zero-copy
>
> Timothy McDaniel (2):
>       event/dlb2: remove references to deferred scheduling
>       doc: fix runtime options in DLB2 guide
>
> Tyler Retzlaff (1):
>       eal: add C++ include guard for reciprocal header
>
> Vadim Podovinnikov (1):
>       net/bonding: fix LACP system address check
>
> Venkat Duvvuru (1):
>       net/bnxt: fix queues per VNIC
>
> Viacheslav Ovsiienko (16):
>       net/mlx5: fix external buffer pool registration for Rx queue
>       net/mlx5: fix metadata item validation for ingress flows
>       net/mlx5: fix hashed list size for tunnel flow groups
>       net/mlx5: fix UAR allocation diagnostics messages
>       common/mlx5: add timestamp format support to DevX
>       vdpa/mlx5: support timestamp format
>       net/mlx5: fix Rx metadata leftovers
>       net/mlx5: fix drop action for Direct Rules/Verbs
>       net/mlx4: fix RSS action with null hash key
>       net/mlx5: support timestamp format
>       regex/mlx5: support timestamp format
>       app/testpmd: fix segment number check
>       net/mlx5: remove drop queue function prototypes
>       net/mlx4: fix buffer leakage on device close
>       net/mlx5: fix probing device in legacy bonding mode
>       net/mlx5: fix receiving queue timestamp format
>
> Wei Huang (1):
>       raw/ifpga: fix device name format
>
> Wenjun Wu (3):
>       net/ice: check some functions return
>       net/ice: fix RSS hash update
>       net/ice: fix RSS for L2 packet
>
> Wenwu Ma (1):
>       net/ice: fix illegal access when removing MAC filter
>
> Wenzhuo Lu (2):
>       net/iavf: fix crash in AVX512
>       net/ice: fix crash in AVX512
>
> Wisam Jaddo (1):
>       app/flow-perf: fix encap/decap actions
>
> Xiao Wang (1):
>       vdpa/ifc: check PCI config read
>
> Xiaoyu Min (4):
>       net/mlx5: support RSS expansion for IPv6 GRE
>       net/mlx5: fix shared inner RSS
>       net/mlx5: fix missing shared RSS hash types
>       net/mlx5: fix redundant flow after RSS expansion
>
> Xiaoyun Li (2):
>       app/testpmd: remove unnecessary UDP tunnel check
>       net/i40e: fix IPv4 fragment offload
>
> Xueming Li (2):
>       version: 20.11.2-rc1
>       net/virtio: fix vectorized Rx queue rearm
>
> Youri Querry (1):
>       bus/fslmc: fix random portal hangs with qbman 5.0
>
> Yunjian Wang (5):
>       vfio: fix API description
>       net/mlx5: fix using flow tunnel before null check
>       vfio: fix duplicated user mem map
>       net/mlx4: fix leak when configured repeatedly
>       net/mlx5: fix leak when configured repeatedly
>


-- 
Regards,
Kalesh A P

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-30 10:33  0% ` Jiang, YuX
@ 2021-07-06  2:37  0%   ` Xueming(Steven) Li
  0 siblings, 0 replies; 200+ results
From: Xueming(Steven) Li @ 2021-07-06  2:37 UTC (permalink / raw)
  To: Jiang, YuX, stable
  Cc: dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani, Walker,
	Benjamin, David Christensen, Govindharajan, Hariprasad,
	Hemant Agrawal, Stokes, Ian, Jerin Jacob, Mcnamara, John,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, Yu,
	PingX, Xu, Qian Q, Raslan Darawsheh, NBU-Contact-Thomas Monjalon,
	Peng, Yuan, Chen, Zhaoyan



> -----Original Message-----
> From: Jiang, YuX <yux.jiang@intel.com>
> Sent: Wednesday, June 30, 2021 6:33 PM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>; stable@dpdk.org
> Cc: dev@dpdk.org; Abhishek Marathe <Abhishek.Marathe@microsoft.com>; Akhil Goyal <akhil.goyal@nxp.com>; Ali Alnubani
> <alialnu@nvidia.com>; Walker, Benjamin <benjamin.walker@intel.com>; David Christensen <drc@linux.vnet.ibm.com>;
> Govindharajan, Hariprasad <hariprasad.govindharajan@intel.com>; Hemant Agrawal <hemant.agrawal@nxp.com>; Stokes, Ian
> <ian.stokes@intel.com>; Jerin Jacob <jerinj@marvell.com>; Mcnamara, John <john.mcnamara@intel.com>; Ju-Hyoung Lee
> <juhlee@microsoft.com>; Kevin Traynor <ktraynor@redhat.com>; Luca Boccassi <bluca@debian.org>; Pei Zhang
> <pezhang@redhat.com>; Yu, PingX <pingx.yu@intel.com>; Xu, Qian Q <qian.q.xu@intel.com>; Raslan Darawsheh
> <rasland@nvidia.com>; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; Peng, Yuan <yuan.peng@intel.com>; Chen,
> Zhaoyan <zhaoyan.chen@intel.com>
> Subject: RE: [dpdk-dev] 20.11.2 patches review and test
> 
> All,
> Testing of DPDK v20.11.2-rc2 at Intel looks good: no critical issues were found, and all issues seen are known issues.
> The two issues below have been fixed in 20.11.2-rc2:
>   1) Fedora34 GCC11 and Clang12 build failed.
>   2) dcf_lifecycle/handle_acl_filter_05: the MAC address changed after port reset.
> 
> # Basic Intel(R) NIC testing
> *PF(i40e, ixgbe): test scenarios including rte_flow/TSO/Jumboframe/checksum offload/Tunnel, etc. Listed but not all.
> - The two known issues below were found.
>   1)https://bugs.dpdk.org/show_bug.cgi?id=687 : unit_tests_power/power_cpufreq: unit test failed. This issue was found in 21.05 and
> is not fixed yet.
>   2)ddp_gtp_qregion/fd_gtpu_ipv4_dstip: flow director does not work. This issue was found in 21.05 and fixed in 21.08.
>     Fix patch link: http://patches.dpdk.org/project/dpdk/patch/20210519032745.707639-1-stevex.yang@intel.com/
> *VF(i40e,ixgbe): test scenarios including vf-rte_flow/TSO/Jumboframe/checksum offload/Tunnel, Listed but not all.
> - No new issues are found.
> *PF/VF(ice): test scenarios including switch features/Flow Director/Advanced RSS/ACL/DCF/Flexible Descriptor and so on, Listed but
> not all.
> - The 3 known DPDK issues below were found.
>   1)rxtx_offload/rxoffload_port: Pkt1 can't be distributed to the same queue. This issue was found in 21.05 and fixed in 21.08.
>     Fix patch link: http://patches.dpdk.org/project/dpdk/patch/20210527064251.242076-1-dapengx.yu@intel.com/
>   2)cvl_advanced_iavf_rss: after changing the SCTP port value, the hash value remains unchanged. This issue was found in 20.11-rc3 and fixed in
> 21.02, but the fix belongs to a new 21.02 feature and won't be backported to LTS 20.11.
>   3)Can't create 512 ACL rules after creating a full mask switch rule. This issue also occurs in DPDK 20.11 and is not fixed yet.
> * Build: cover the build test combination with latest GCC/Clang/ICC version and the popular OS revision such as Ubuntu20.04,
> CentOS8.3 and so on. Listed but not all.
> - All passed.
> * Intel NIC single core/NIC performance: test scenarios including PF/VF single core performance test (AVX2+AVX512) and so on.
> Listed but not all.
> - All passed. No big data drop.
> 
> # Basic cryptodev and virtio testing
> * Virtio: both function and performance tests are covered, such as PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf
> testing, etc. Listed but not all.
> - One known issue, as below:
>   (1) The UDP fragmentation offload feature of the Virtio-net device can’t be turned on in the VM. This is a kernel issue; a Bugzilla report has been submitted:
> https://bugzilla.kernel.org/show_bug.cgi?id=207075, not fixed yet.
> * Cryptodev:
> - Function test: test scenarios including Cryptodev API testing/CompressDev ISA-L/QAT/ZLIB PMD Testing/FIPS, etc. Listed but not all.
>   - All passed.
> - Performance test: test scenarios including Throughput Performance/Cryptodev Latency, etc. Listed but not all.
>   - No big data drop.
> 
> Best regards,
> Yu Jiang

Thank you!

> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Xueming Li
> > Sent: Sunday, June 27, 2021 7:28 AM
> > To: stable@dpdk.org
> > Cc: dev@dpdk.org; Abhishek Marathe <Abhishek.Marathe@microsoft.com>;
> > Akhil Goyal <akhil.goyal@nxp.com>; Ali Alnubani <alialnu@nvidia.com>;
> > Walker, Benjamin <benjamin.walker@intel.com>; David Christensen
> > <drc@linux.vnet.ibm.com>; Govindharajan, Hariprasad
> > <hariprasad.govindharajan@intel.com>; Hemant Agrawal
> > <hemant.agrawal@nxp.com>; Stokes, Ian <ian.stokes@intel.com>; Jerin
> > Jacob <jerinj@marvell.com>; Mcnamara, John <john.mcnamara@intel.com>;
> > Ju-Hyoung Lee <juhlee@microsoft.com>; Kevin Traynor
> > <ktraynor@redhat.com>; Luca Boccassi <bluca@debian.org>; Pei Zhang
> > <pezhang@redhat.com>; Yu, PingX <pingx.yu@intel.com>; Xu, Qian Q
> > <qian.q.xu@intel.com>; Raslan Darawsheh <rasland@nvidia.com>; Thomas
> > Monjalon <thomas@monjalon.net>; Peng, Yuan <yuan.peng@intel.com>;
> > Chen, Zhaoyan <zhaoyan.chen@intel.com>; xuemingl@nvidia.com
> > Subject: [dpdk-dev] 20.11.2 patches review and test
> >
> > Hi all,
> >
> > Here is a list of patches targeted for stable release 20.11.2.
> >
> > The planned date for the final release is 6th July.
> >
> > Please help with testing and validation of your use cases and report
> > any issues/results with reply-all to this mail. For the final release
> > the fixes and reported validations will be added to the release notes.
> >
> > A release candidate tarball can be found at:
> >
> >     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc2
> >
> > These patches are located at branch 20.11 of dpdk-stable repo:
> >     https://dpdk.org/browse/dpdk-stable/
> >
> > Thanks.
> >
> > Xueming Li <xuemingl@nvidia.com>
> >
> > ---
> > Adam Dybkowski (3):
> >       common/qat: increase IM buffer size for GEN3
> >       compress/qat: enable compression on GEN3
> >       crypto/qat: fix null authentication request
> >
> > Ajit Khaparde (7):
> >       net/bnxt: fix RSS context cleanup
> >       net/bnxt: check kvargs parsing
> >       net/bnxt: fix resource cleanup
> >       doc: fix formatting in testpmd guide
> >       net/bnxt: fix mismatched type comparison in MAC restore
> >       net/bnxt: check PCI config read
> >       net/bnxt: fix mismatched type comparison in Rx
> >
> > Alvin Zhang (11):
> >       net/ice: fix VLAN filter with PF
> >       net/i40e: fix input set field mask
> >       net/igc: fix Rx RSS hash offload capability
> >       net/igc: fix Rx error counter for bad length
> >       net/e1000: fix Rx error counter for bad length
> >       net/e1000: fix max Rx packet size
> >       net/igc: fix Rx packet size
> >       net/ice: fix fast mbuf freeing
> >       net/iavf: fix VF to PF command failure handling
> >       net/i40e: fix VF RSS configuration
> >       net/igc: fix speed configuration
> >
> > Anatoly Burakov (3):
> >       fbarray: fix log message on truncation error
> >       power: do not skip saving original P-state governor
> >       power: save original ACPI governor always
> >
> > Andrew Boyer (1):
> >       net/ionic: fix completion type in lif init
> >
> > Andrew Rybchenko (4):
> >       net/failsafe: fix RSS hash offload reporting
> >       net/failsafe: report minimum and maximum MTU
> >       common/sfc_efx: remove GENEVE from supported tunnels
> >       net/sfc: fix mark support in EF100 native Rx datapath
> >
> > Andy Moreton (2):
> >       common/sfc_efx/base: limit reported MCDI response length
> >       common/sfc_efx/base: add missing MCDI response length checks
> >
> > Ankur Dwivedi (1):
> >       crypto/octeontx: fix session-less mode
> >
> > Apeksha Gupta (1):
> >       examples/l2fwd-crypto: skip masked devices
> >
> > Arek Kusztal (1):
> >       crypto/qat: fix offset for out-of-place scatter-gather
> >
> > Beilei Xing (1):
> >       net/i40evf: fix packet loss for X722
> >
> > Bing Zhao (1):
> >       net/mlx5: fix loopback for Direct Verbs queue
> >
> > Bruce Richardson (2):
> >       build: exclude meson files from examples installation
> >       raw/ioat: fix script for configuring small number of queues
> >
> > Chaoyong He (1):
> >       doc: fix multiport syntax in nfp guide
> >
> > Chenbo Xia (1):
> >       examples/vhost: check memory table query
> >
> > Chengchang Tang (20):
> >       net/hns3: fix HW buffer size on MTU update
> >       net/hns3: fix processing Tx offload flags
> >       net/hns3: fix Tx checksum for UDP packets with special port
> >       net/hns3: fix long task queue pairs reset time
> >       ethdev: validate input in module EEPROM dump
> >       ethdev: validate input in register info
> >       ethdev: validate input in EEPROM info
> >       net/hns3: fix rollback after setting PVID failure
> >       net/hns3: fix timing in resetting queues
> >       net/hns3: fix queue state when concurrent with reset
> >       net/hns3: fix configure FEC when concurrent with reset
> >       net/hns3: fix use of command status enumeration
> >       examples: add eal cleanup to examples
> >       net/bonding: fix adding itself as its slave
> >       net/hns3: fix timing in mailbox
> >       app/testpmd: fix max queue number for Tx offloads
> >       net/tap: fix interrupt vector array size
> >       net/bonding: fix socket ID check
> >       net/tap: check ioctl on restore
> >       examples/timer: fix time interval
> >
> > Chengwen Feng (50):
> >       net/hns3: fix flow counter value
> >       net/hns3: fix VF mailbox head field
> >       net/hns3: support get device version when dump register
> >       net/hns3: fix some packet types
> >       net/hns3: fix missing outer L4 UDP flag for VXLAN
> >       net/hns3: remove VLAN/QinQ ptypes from support list
> >       test: check thread creation
> >       common/dpaax: fix possible null pointer access
> >       examples/ethtool: remove unused parsing
> >       net/hns3: fix flow director lock
> >       net/e1000/base: fix timeout for shadow RAM write
> >       net/hns3: fix setting default MAC address in bonding of VF
> >       net/hns3: fix possible mismatched response of mailbox
> >       net/hns3: fix VF handling LSC event in secondary process
> >       net/hns3: fix verification of NEON support
> >       mbuf: check shared memory before dumping dynamic space
> >       eventdev: remove redundant thread name setting
> >       eventdev: fix memory leakage on thread creation failure
> >       net/kni: check init result
> >       net/hns3: fix mailbox error message
> >       net/hns3: fix processing link status message on PF
> >       net/hns3: remove unused mailbox macro and struct
> >       net/bonding: fix leak on remove
> >       net/hns3: fix handling link update
> >       net/i40e: fix negative VEB index
> >       net/i40e: remove redundant VSI check in Tx queue setup
> >       net/virtio: fix getline memory leakage
> >       net/hns3: log time delta in decimal format
> >       net/hns3: fix time delta calculation
> >       net/hns3: remove unused macros
> >       net/hns3: fix vector Rx burst limitation
> >       net/hns3: remove read when enabling TM QCN error event
> >       net/hns3: remove unused VMDq code
> >       net/hns3: increase readability in logs
> >       raw/ntb: check SPAD user index
> >       raw/ntb: check memory allocations
> >       ipc: check malloc sync reply result
> >       eal: fix service core list parsing
> >       ipc: use monotonic clock
> >       net/hns3: return error on PCI config write failure
> >       net/hns3: fix log on flow director clear
> >       net/hns3: clear hash map on flow director clear
> >       net/hns3: fix querying flow director counter for out param
> >       net/hns3: fix TM QCN error event report by MSI-X
> >       net/hns3: fix mailbox message ID in log
> >       net/hns3: fix secondary process request start/stop Rx/Tx
> >       net/hns3: fix ordering in secondary process initialization
> >       net/hns3: fail setting FEC if one bit mode is not supported
> >       net/mlx4: fix secondary process initialization ordering
> >       net/mlx5: fix secondary process initialization ordering
> >
> > Ciara Loftus (1):
> >       net/af_xdp: fix error handling during Rx queue setup
> >
> > Ciara Power (2):
> >       telemetry: fix race on callbacks list
> >       test/crypto: fix return value of a skipped test
> >
> > Conor Walsh (1):
> >       examples/l3fwd: fix LPM IPv6 subnets
> >
> > Cristian Dumitrescu (3):
> >       table: fix actions with different data size
> >       pipeline: fix instruction translation
> >       pipeline: fix endianness conversions
> >
> > Dapeng Yu (3):
> >       net/igc: remove MTU setting limitation
> >       net/e1000: remove MTU setting limitation
> >       examples/packet_ordering: fix port configuration
> >
> > David Christensen (1):
> >       config/ppc: reduce number of cores and NUMA nodes
> >
> > David Harton (1):
> >       net/ena: fix releasing Tx ring mbufs
> >
> > David Hunt (4):
> >       test/power: fix CPU frequency check
> >       test/power: add turbo mode to frequency check
> >       test/power: fix low frequency test when turbo enabled
> >       test/power: fix turbo test
> >
> > David Marchand (18):
> >       doc: fix sphinx rtd theme import in GHA
> >       service: clean references to removed symbol
> >       eal: fix evaluation of log level option
> >       ci: hook to GitHub Actions
> >       ci: enable v21 ABI checks
> >       ci: fix package installation in GitHub Actions
> >       ci: ignore APT update failure in GitHub Actions
> >       ci: catch coredumps
> >       vhost: fix offload flags in Rx path
> >       bus/fslmc: remove unused debug macro
> >       eal: fix leak in shared lib mode detection
> >       event/dpaa2: remove unused macros
> >       net/ice/base: fix memory allocation wrapper
> >       net/ice: fix leak on thread termination
> >       devtools: fix orphan symbols check with busybox
> >       net/vhost: restore pseudo TSO support
> >       net/ark: fix leak on thread termination
> >       build: fix drivers selection without Python
> >
> > Dekel Peled (1):
> >       common/mlx5: fix DevX read output buffer size
> >
> > Dmitry Kozlyuk (4):
> >       net/pcap: fix format string
> >       eal/windows: add missing SPDX license tag
> >       buildtools: fix all drivers disabled on Windows
> >       examples/rxtx_callbacks: fix port ID format specifier
> >
> > Ed Czeck (2):
> >       net/ark: update packet director initial state
> >       net/ark: refactor Rx buffer recovery
> >
> > Elad Nachman (2):
> >       kni: support async user request
> >       kni: fix kernel deadlock with bifurcated device
> >
> > Feifei Wang (2):
> >       net/i40e: fix parsing packet type for NEON
> >       test/trace: fix race on collected perf data
> >
> > Ferruh Yigit (9):
> >       power: remove duplicated symbols from map file
> >       log/linux: make default output stderr
> >       license: fix typos
> >       drivers/net: fix FW version query
> >       net/bnx2x: fix build with GCC 11
> >       net/bnx2x: fix build with GCC 11
> >       net/ice/base: fix build with GCC 11
> >       net/tap: fix build with GCC 11
> >       test/table: fix build with GCC 11
> >
> > Gregory Etelson (2):
> >       app/testpmd: fix tunnel offload flows cleanup
> >       net/mlx5: fix tunnel offload private items location
> >
> > Guoyang Zhou (1):
> >       net/hinic: fix crash in secondary process
> >
> > Haiyue Wang (1):
> >       net/ixgbe: fix Rx errors statistics for UDP checksum
> >
> > Harman Kalra (1):
> >       event/octeontx2: fix device reconfigure for single slot
> >
> > Heinrich Kuhn (1):
> >       net/nfp: fix reporting of RSS capabilities
> >
> > Hemant Agrawal (3):
> >       ethdev: add missing buses in device iterator
> >       crypto/dpaa_sec: affine the thread portal affinity
> >       crypto/dpaa2_sec: fix close and uninit functions
> >
> > Hongbo Zheng (9):
> >       app/testpmd: fix Tx/Rx descriptor query error log
> >       net/hns3: fix FLR miss detection
> >       net/hns3: delete redundant blank line
> >       bpf: fix JSLT validation
> >       common/sfc_efx/base: fix dereferencing null pointer
> >       power: fix sanity checks for guest channel read
> >       net/hns3: fix VF alive notification after config restore
> >       examples/l3fwd-power: fix empty poll thresholds
> >       net/hns3: fix concurrent interrupt handling
> >
> > Huisong Li (23):
> >       net/hns3: fix device capabilities for copper media type
> >       net/hns3: remove unused parameter markers
> >       net/hns3: fix reporting undefined speed
> >       net/hns3: fix link update when failed to get link info
> >       net/hns3: fix flow control exception
> >       app/testpmd: fix bitmap of link speeds when force speed
> >       net/hns3: fix flow control mode
> >       net/hns3: remove redundant mailbox response
> >       net/hns3: fix DCB mode check
> >       net/hns3: fix VMDq mode check
> >       net/hns3: fix mbuf leakage
> >       net/hns3: fix link status when port is stopped
> >       net/hns3: fix link speed when port is down
> >       app/testpmd: fix forward lcores number for DCB
> >       app/testpmd: fix DCB forwarding configuration
> >       app/testpmd: fix DCB re-configuration
> >       app/testpmd: verify DCB config during forward config
> >       net/hns3: fix Rx/Tx queue numbers check
> >       net/hns3: fix requested FC mode rollback
> >       net/hns3: remove meaningless packet buffer rollback
> >       net/hns3: fix DCB configuration
> >       net/hns3: fix DCB reconfiguration
> >       net/hns3: fix link speed when VF device is down
> >
> > Ibtisam Tariq (1):
> >       examples/vhost_crypto: remove unused short option
> >
> > Igor Chauskin (2):
> >       net/ena: switch memcpy to optimized version
> >       net/ena: fix parsing of large LLQ header device argument
> >
> > Igor Russkikh (2):
> >       net/qede: reduce log verbosity
> >       net/qede: accept bigger RSS table
> >
> > Ilya Maximets (1):
> >       net/virtio: fix interrupt unregistering for listening socket
> >
> > Ivan Malov (5):
> >       net/sfc: fix buffer size for flow parse
> >       net: fix comment in IPv6 header
> >       net/sfc: fix error path inconsistency
> >       common/sfc_efx/base: fix indication of MAE encap support
> >       net/sfc: fix outer rule rollback on error
> >
> > Jerin Jacob (1):
> >       examples: fix pkg-config override
> >
> > Jiawei Wang (4):
> >       app/testpmd: fix NVGRE encap configuration
> >       net/mlx5: fix resource release for mirror flow
> >       net/mlx5: fix RSS flow item expansion for GRE key
> >       net/mlx5: fix RSS flow item expansion for NVGRE
> >
> > Jiawei Zhu (1):
> >       net/mlx5: fix Rx segmented packets on mbuf starvation
> >
> > Jiawen Wu (4):
> >       net/txgbe: remove unused functions
> >       net/txgbe: fix Rx missed packet counter
> >       net/txgbe: update packet type
> >       net/txgbe: fix QinQ strip
> >
> > Jiayu Hu (2):
> >       vhost: fix queue initialization
> >       vhost: fix redundant vring status change notification
> >
> > Jie Wang (1):
> >       net/ice: fix VSI array out of bounds access
> >
> > John Daley (2):
> >       net/enic: fix flow initialization error handling
> >       net/enic: enable GENEVE offload via VNIC configuration
> >
> > Juraj Linkeš (1):
> >       eal/arm64: fix platform register bit
> >
> > Kai Ji (2):
> >       test/crypto: fix auth-cipher compare length in OOP
> >       test/crypto: copy offset data to OOP destination buffer
> >
> > Kalesh AP (23):
> >       net/bnxt: remove unused macro
> >       net/bnxt: fix VNIC configuration
> >       net/bnxt: fix firmware fatal error handling
> >       net/bnxt: fix FW readiness check during recovery
> >       net/bnxt: fix device readiness check
> >       net/bnxt: fix VF info allocation
> >       net/bnxt: fix HWRM and FW incompatibility handling
> >       net/bnxt: mute some failure logs
> >       app/testpmd: check MAC address query
> >       net/bnxt: fix PCI write check
> >       net/bnxt: fix link state operations
> >       net/bnxt: fix timesync when PTP is not supported
> >       net/bnxt: fix memory allocation for command response
> >       net/bnxt: fix double free in port start failure
> >       net/bnxt: fix configuring LRO
> >       net/bnxt: fix health check alarm cancellation
> >       net/bnxt: fix PTP support for Thor
> >       net/bnxt: fix ring count calculation for Thor
> >       net/bnxt: remove unnecessary forward declarations
> >       net/bnxt: remove unused function parameters
> >       net/bnxt: drop unused attribute
> >       net/bnxt: fix single PF per port check
> >       net/bnxt: prevent device access in error state
> >
> > Kamil Vojanec (1):
> >       net/mlx5/linux: fix firmware version
> >
> > Kevin Traynor (5):
> >       test/cmdline: fix inputs array
> >       test/crypto: fix build with GCC 11
> >       crypto/zuc: fix build with GCC 11
> >       test: fix build with GCC 11
> >       test/cmdline: silence clang 12 warning
> >
> > Konstantin Ananyev (1):
> >       acl: fix build with GCC 11
> >
> > Lance Richardson (8):
> >       net/bnxt: fix Rx buffer posting
> >       net/bnxt: fix Tx length hint threshold
> >       net/bnxt: fix handling of null flow mask
> >       test: fix TCP header initialization
> >       net/bnxt: fix Rx descriptor status
> >       net/bnxt: fix Rx queue count
> >       net/bnxt: fix dynamic VNIC count
> >       eal: fix memory mapping on 32-bit target
> >
> > Leyi Rong (1):
> >       net/iavf: fix packet length parsing in AVX512
> >
> > Li Zhang (1):
> >       net/mlx5: fix flow actions index in cache
> >
> > Luc Pelletier (2):
> >       eal: fix race in control thread creation
> >       eal: fix hang in control thread creation
> >
> > Marvin Liu (5):
> >       vhost: fix split ring potential buffer overflow
> >       vhost: fix packed ring potential buffer overflow
> >       vhost: fix batch dequeue potential buffer overflow
> >       vhost: fix initialization of temporary header
> >       vhost: fix initialization of async temporary header
> >
> > Matan Azrad (5):
> >       common/mlx5/linux: add glue function to query WQ
> >       common/mlx5: add DevX command to query WQ
> >       common/mlx5: add DevX commands for queue counters
> >       vdpa/mlx5: fix virtq cleaning
> >       vdpa/mlx5: fix device unplug
> >
> > Michael Baum (1):
> >       net/mlx5: fix flow age event triggering
> >
> > Michal Krawczyk (5):
> >       net/ena/base: improve style and comments
> >       net/ena/base: fix type conversions by explicit casting
> >       net/ena/base: destroy multiple wait events
> >       net/ena: fix crash with unsupported device argument
> >       net/ena: indicate Rx RSS hash presence
> >
> > Min Hu (Connor) (25):
> >       net/hns3: fix MTU config complexity
> >       net/hns3: update HiSilicon copyright syntax
> >       net/hns3: fix copyright date
> >       examples/ptpclient: remove wrong comment
> >       test/bpf: fix error message
> >       doc: fix HiSilicon copyright syntax
> >       net/hns3: remove unused macros
> >       net/hns3: remove unused macro
> >       app/eventdev: fix overflow in lcore list parsing
> >       test/kni: fix a comment
> >       test/kni: check init result
> >       net/hns3: fix typos on comments
> >       net/e1000: fix flow error message object
> >       app/testpmd: fix division by zero on socket memory dump
> >       net/kni: warn on stop failure
> >       app/bbdev: check memory allocation
> >       app/bbdev: fix HARQ error messages
> >       raw/skeleton: add missing check after setting attribute
> >       test/timer: check memzone allocation
> >       app/crypto-perf: check memory allocation
> >       examples/flow_classify: fix NUMA check of port and core
> >       examples/l2fwd-cat: fix NUMA check of port and core
> >       examples/skeleton: fix NUMA check of port and core
> >       test: check flow classifier creation
> >       test: fix division by zero
> >
> > Murphy Yang (3):
> >       net/ixgbe: fix RSS RETA being reset after port start
> >       net/i40e: fix flow director config after flow validate
> >       net/i40e: fix flow director for common pctypes
> >
> > Natanael Copa (5):
> >       common/dpaax/caamflib: fix build with musl
> >       bus/dpaa: fix 64-bit arch detection
> >       bus/dpaa: fix build with musl
> >       net/cxgbe: remove use of uint type
> >       app/testpmd: fix build with musl
> >
> > Nipun Gupta (1):
> >       bus/dpaa: fix statistics reading
> >
> > Nithin Dabilpuram (3):
> >       vfio: do not merge contiguous areas
> >       vfio: fix DMA mapping granularity for IOVA as VA
> >       test/mem: fix page size for external memory
> >
> > Olivier Matz (1):
> >       test/mempool: fix object initializer
> >
> > Pallavi Kadam (1):
> >       bus/pci: skip probing some Windows NDIS devices
> >
> > Pavan Nikhilesh (4):
> >       test/event: fix timeout accuracy
> >       app/eventdev: fix timeout accuracy
> >       app/eventdev: fix lcore parsing skipping last core
> >       event/octeontx2: fix XAQ pool reconfigure
> >
> > Pu Xu (1):
> >       ip_frag: fix fragmenting IPv4 packet with header option
> >
> > Qi Zhang (8):
> >       net/ice/base: fix payload indicator on ptype
> >       net/ice/base: fix uninitialized struct
> >       net/ice/base: cleanup filter list on error
> >       net/ice/base: fix memory allocation for MAC addresses
> >       net/iavf: fix TSO max segment size
> >       doc: fix matching versions in ice guide
> >       net/iavf: fix wrong Tx context descriptor
> >       common/iavf: fix duplicated offload bit
> >
> > Radha Mohan Chintakuntla (1):
> >       raw/octeontx2_dma: assign PCI device in DPI VF
> >
> > Raslan Darawsheh (1):
> >       ethdev: update flow item GTP QFI definition
> >
> > Richael Zhuang (2):
> >       test/power: add delay before checking CPU frequency
> >       test/power: round CPU frequency to check
> >
> > Robin Zhang (6):
> >       net/i40e: announce request queue capability in PF
> >       doc: update recommended versions for i40e
> >       net/i40e: fix lack of MAC type when set MAC address
> >       net/iavf: fix lack of MAC type when set MAC address
> >       net/iavf: fix primary MAC type when starting port
> >       net/i40e: fix primary MAC type when starting port
> >
> > Rohit Raj (3):
> >       net/dpaa2: fix getting link status
> >       net/dpaa: fix getting link status
> >       examples/l2fwd-crypto: fix packet length while decryption
> >
> > Roy Shterman (1):
> >       mem: fix freeing segments in --huge-unlink mode
> >
> > Satheesh Paul (1):
> >       net/octeontx2: fix VLAN filter
> >
> > Savinay Dharmappa (1):
> >       sched: fix traffic class oversubscription parameter
> >
> > Shijith Thotton (3):
> >       eventdev: fix case to initiate crypto adapter service
> >       event/octeontx2: fix crypto adapter queue pair operations
> >       event/octeontx2: configure crypto adapter xaq pool
> >
> > Siwar Zitouni (1):
> >       net/ice: fix disabling promiscuous mode
> >
> > Somnath Kotur (5):
> >       net/bnxt: fix xstats get
> >       net/bnxt: fix Rx and Tx timestamps
> >       net/bnxt: fix Tx timestamp init
> >       net/bnxt: refactor multi-queue Rx configuration
> >       net/bnxt: fix Rx timestamp when FIFO pending bit is set
> >
> > Stanislaw Kardach (6):
> >       test: proceed if timer subsystem already initialized
> >       stack: allow lock-free only on relevant architectures
> >       test/distributor: fix worker notification in burst mode
> >       test/distributor: fix burst flush on worker quit
> >       net/ena: remove endian swap functions
> >       net/ena: report default ring size
> >
> > Stephen Hemminger (2):
> >       kni: refactor user request processing
> >       net/bnxt: use prefix on global function
> >
> > Suanming Mou (1):
> >       net/mlx5: fix counter offset detection
> >
> > Tal Shnaiderman (2):
> >       eal/windows: fix default thread priority
> >       eal/windows: fix return codes of pthread shim layer
> >
> > Tengfei Zhang (1):
> >       net/pcap: fix file descriptor leak on close
> >
> > Thinh Tran (1):
> >       test: fix autotest handling of skipped tests
> >
> > Thomas Monjalon (18):
> >       bus/pci: fix Windows kernel driver categories
> >       eal: fix comment of OS-specific header files
> >       buildtools: fix build with busybox
> >       build: detect execinfo library on Linux
> >       build: remove redundant _GNU_SOURCE definitions
> >       eal: fix build with musl
> >       net/igc: remove use of uint type
> >       event/dlb: fix header includes for musl
> >       examples/bbdev: fix header include for musl
> >       drivers: fix log level after loading
> >       app/regex: fix usage text
> >       app/testpmd: fix usage text
> >       doc: fix names of UIO drivers
> >       doc: fix build with Sphinx 4
> >       bus/pci: support I/O port operations with musl
> >       app: fix exit messages
> >       regex/octeontx2: remove unused include directory
> >       doc: remove PDF requirements
> >
> > Tianyu Li (1):
> >       net/memif: fix Tx bps statistics for zero-copy
> >
> > Timothy McDaniel (2):
> >       event/dlb2: remove references to deferred scheduling
> >       doc: fix runtime options in DLB2 guide
> >
> > Tyler Retzlaff (1):
> >       eal: add C++ include guard for reciprocal header
> >
> > Vadim Podovinnikov (1):
> >       net/bonding: fix LACP system address check
> >
> > Venkat Duvvuru (1):
> >       net/bnxt: fix queues per VNIC
> >
> > Viacheslav Ovsiienko (16):
> >       net/mlx5: fix external buffer pool registration for Rx queue
> >       net/mlx5: fix metadata item validation for ingress flows
> >       net/mlx5: fix hashed list size for tunnel flow groups
> >       net/mlx5: fix UAR allocation diagnostics messages
> >       common/mlx5: add timestamp format support to DevX
> >       vdpa/mlx5: support timestamp format
> >       net/mlx5: fix Rx metadata leftovers
> >       net/mlx5: fix drop action for Direct Rules/Verbs
> >       net/mlx4: fix RSS action with null hash key
> >       net/mlx5: support timestamp format
> >       regex/mlx5: support timestamp format
> >       app/testpmd: fix segment number check
> >       net/mlx5: remove drop queue function prototypes
> >       net/mlx4: fix buffer leakage on device close
> >       net/mlx5: fix probing device in legacy bonding mode
> >       net/mlx5: fix receiving queue timestamp format
> >
> > Wei Huang (1):
> >       raw/ifpga: fix device name format
> >
> > Wenjun Wu (3):
> >       net/ice: check some functions return
> >       net/ice: fix RSS hash update
> >       net/ice: fix RSS for L2 packet
> >
> > Wenwu Ma (1):
> >       net/ice: fix illegal access when removing MAC filter
> >
> > Wenzhuo Lu (2):
> >       net/iavf: fix crash in AVX512
> >       net/ice: fix crash in AVX512
> >
> > Wisam Jaddo (1):
> >       app/flow-perf: fix encap/decap actions
> >
> > Xiao Wang (1):
> >       vdpa/ifc: check PCI config read
> >
> > Xiaoyu Min (4):
> >       net/mlx5: support RSS expansion for IPv6 GRE
> >       net/mlx5: fix shared inner RSS
> >       net/mlx5: fix missing shared RSS hash types
> >       net/mlx5: fix redundant flow after RSS expansion
> >
> > Xiaoyun Li (2):
> >       app/testpmd: remove unnecessary UDP tunnel check
> >       net/i40e: fix IPv4 fragment offload
> >
> > Xueming Li (2):
> >       version: 20.11.2-rc1
> >       net/virtio: fix vectorized Rx queue rearm
> >
> > Youri Querry (1):
> >       bus/fslmc: fix random portal hangs with qbman 5.0
> >
> > Yunjian Wang (5):
> >       vfio: fix API description
> >       net/mlx5: fix using flow tunnel before null check
> >       vfio: fix duplicated user mem map
> >       net/mlx4: fix leak when configured repeatedly
> >       net/mlx5: fix leak when configured repeatedly

* Re: [dpdk-dev] [PATCH] dmadev: introduce DMA device library
  2021-07-05 15:55  0%     ` Jerin Jacob
@ 2021-07-05 17:16  0%       ` Bruce Richardson
  2021-07-07  8:08  0%         ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2021-07-05 17:16 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Chengwen Feng, Thomas Monjalon, Ferruh Yigit, Jerin Jacob,
	dpdk-dev, Morten Brørup, Nipun Gupta, Hemant Agrawal,
	Maxime Coquelin, Honnappa Nagarahalli, David Marchand,
	Satananda Burla, Prasun Kapoor, Ananyev, Konstantin, liangma,
	Radha Mohan Chintakuntla

On Mon, Jul 05, 2021 at 09:25:34PM +0530, Jerin Jacob wrote:
> 
> On Mon, Jul 5, 2021 at 4:22 PM Bruce Richardson
> <bruce.richardson@intel.com> wrote:
> >
> > On Sun, Jul 04, 2021 at 03:00:30PM +0530, Jerin Jacob wrote:
> > > On Fri, Jul 2, 2021 at 6:51 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
> > > >
> > > > This patch introduces 'dmadevice' which is a generic type of DMA
> > > > device.
<snip>
> >
> > +1 and the terminology with regards to queues and channels. With our ioat
> > hardware, each HW queue was called a channel for instance.
> 
> Looks like <dmadev> <> <channel> can cover all the use cases. If the
> HW has more than 1 queue, each queue can be exposed as a separate dmadev.
> 

Fine for me.

However, just to confirm that Morten's suggestion of using a
(device-specific void *) channel pointer rather than dev_id + channel_id
pair of parameters won't work for you? You can't store a pointer or dev
index in the channel struct in the driver?

> 
<snip>
> > > > + *
> > > > + * If dma_cookie_t is >=0 it's a DMA operation request cookie, <0 it's a error
> > > > + * code.
> > > > + * When using cookies, comply with the following rules:
> > > > + * a) Cookies for each virtual queue are independent.
> > > > + * b) For a virt queue, the cookie are monotonically incremented, when it reach
> > > > + *    the INT_MAX, it wraps back to zero.
> >
> > I disagree with the INT_MAX (or INT32_MAX) value here. If we use that
> > value, it means that we cannot use implicit wrap-around inside the CPU and
> > have to check for the INT_MAX value. Better to:
> > 1. Specify that it wraps at UINT16_MAX which allows us to just use a
> > uint16_t internally and wrap-around automatically, or:
> > 2. Specify that it wraps at a power-of-2 value >= UINT16_MAX, giving
> > drivers the flexibility at what value to wrap around.
> 
> I think (2) is better than (1). I think it is even better to wrap around at the
> number of descriptors configured in dev_configure() (we can make this a power of 2).
> 

Interesting, I hadn't really considered that before. My only concern
would be if an app wants to keep values in the app ring for a while after
they have been returned from dmadev. I thought it easier to have the full
16-bit counter value returned to the user to give the most flexibility,
given that going from that to any power-of-2 ring size smaller is a trivial
operation.

Overall, while my ideal situation is to always have a 0..UINT16_MAX return
value from the function, I can live with your suggestion of wrapping at
ring_size, since drivers will likely do that internally anyway.
I think wrapping at INT32_MAX is too awkward and will be error prone since
we can't rely on hardware automatically wrapping to zero, nor on the driver
having pre-masked the value.
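
To illustrate the wrap-around point, here is a minimal sketch, assuming a
free-running 16-bit job index and a power-of-2 ring size. The struct and
function names are hypothetical and are not part of the proposed dmadev API.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-queue state, for illustration only. */
struct ring_state {
	uint16_t next_idx;  /* free-running job index, wraps at UINT16_MAX */
	uint16_t ring_mask; /* ring_size - 1, with ring_size a power of 2 */
};

static inline void
ring_state_init(struct ring_state *rs, uint16_t ring_size)
{
	/* A power-of-2 size lets a mask replace a modulo operation. */
	assert(ring_size != 0 && (ring_size & (ring_size - 1)) == 0);
	rs->next_idx = 0;
	rs->ring_mask = (uint16_t)(ring_size - 1);
}

/* Hand out the index for the next job; the uint16_t wraps to 0 by itself. */
static inline uint16_t
ring_next_idx(struct ring_state *rs)
{
	return rs->next_idx++;
}

/* Reduce a full 16-bit index to a slot in the (smaller) ring. */
static inline uint16_t
ring_slot(const struct ring_state *rs, uint16_t idx)
{
	return idx & rs->ring_mask;
}
```

With INT32_MAX as the wrap point, ring_next_idx() would need an explicit
compare-and-reset instead of relying on the type width.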

> >
> > > > + * c) The initial cookie of a virt queue is zero, after the device is stopped or
> > > > + *    reset, the virt queue's cookie needs to be reset to zero.
<snip>
> > >
> > > Please add some good amount of reserved bits and have API to init this
> > > structure for future ABI stability, say rte_dmadev_queue_config_init()
> > > or so.
> > >
> >
> > I don't think that is necessary. Since the config struct is used only as
> > parameter to the config function, any changes to it can be managed by
> > versioning that single function. Padding would only be necessary if we had
> > an array of these config structs somewhere.
> 
> OK.
> 
> For some reason, the versioning API looks ugly to me in code, whereas keeping
> some rsvd fields with an init function looks cool to me.
> 
> But I agree, function versioning works in this case. No need to find another API
> if it is not general DPDK API practice.
> 

The one thing I would suggest instead of the padding is, for the internal
APIs, to pass the struct size through, since we can't version those - and
for padding we can't know whether any replaced padding should be used or
not. Specifically:

	typedef int (*rte_dmadev_configure_t)(struct rte_dmadev *dev, struct
			rte_dmadev_conf *cfg, size_t cfg_size);

but for the public function:

	int
	rte_dmadev_configure(struct rte_dmadev *dev, struct
			rte_dmadev_conf *cfg)
	{
		...
		ret = dev->ops.configure(dev, cfg, sizeof(*cfg));
		...
	}

Then if we change the structure and version the config API, the driver can
tell from the size what struct version it is and act accordingly. Without
that, each time the struct changed, we'd have to add a new function pointer
to the device ops.
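
On the driver side, that could look something like the sketch below. The two
struct layouts, the extra "flags" field and the function names are all
hypothetical, made up purely to show the size-based dispatch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical original layout of the config struct. */
struct conf_v1 {
	uint16_t nb_hw_queues;
	uint16_t max_vqs;
};

/* Hypothetical newer layout: same leading fields plus one appended field. */
struct conf_v2 {
	uint16_t nb_hw_queues;
	uint16_t max_vqs;
	uint32_t flags;
};

/* The scheme only works if the size uniquely identifies the version. */
static_assert(sizeof(struct conf_v1) != sizeof(struct conf_v2),
	      "struct size must identify the struct version");

/* Hypothetical driver-private state filled in by configure. */
struct drv_state {
	uint16_t nb_hw_queues;
	uint16_t max_vqs;
	uint32_t flags;
};

static int
drv_configure(struct drv_state *st, const void *cfg, size_t cfg_size)
{
	if (cfg_size == sizeof(struct conf_v2)) {
		const struct conf_v2 *c = cfg;

		st->nb_hw_queues = c->nb_hw_queues;
		st->max_vqs = c->max_vqs;
		st->flags = c->flags;
		return 0;
	}
	if (cfg_size == sizeof(struct conf_v1)) {
		const struct conf_v1 *c = cfg;

		st->nb_hw_queues = c->nb_hw_queues;
		st->max_vqs = c->max_vqs;
		st->flags = 0; /* default for the field the old struct lacks */
		return 0;
	}
	return -EINVAL; /* unknown struct version */
}
```

The versioned public functions would each pass sizeof() of the struct version
they were built against, so a single driver callback serves all of them.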

> In other libraries, I have seen such an _init function that can be used
> for this as well as for filling default values
> (in some cases the implementation default is not zero),
> so that the application can avoid a memset for the param structure.
> I added rte_event_queue_default_conf_get() in the eventdev spec for this.
> 

I think that would largely have the same issues, unless it returned a
pointer to data inside the driver - and which therefore could not be
modified. Alternatively it would mean that the memory would have been
allocated in the driver and we would need to ensure proper cleanup
functions were called to free memory afterwards. Supporting having the
config parameter as a local variable I think makes things a lot easier.

> No strong opinion on this.
> 
> 
> 
> >
> > >
> > > > +
> > > > +/**
> > > > + * A structure used to retrieve information of a DMA virt queue.
> > > > + */
> > > > +struct rte_dmadev_queue_info {
> > > > +       enum dma_transfer_direction direction;
> > >
> > > A queue may support all directions so I think it should be a bitfield.
> > >
> > > > +       /**< Associated transfer direction */
> > > > +       uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> > > > +       uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> > > > +       uint64_t dev_flags; /**< Device specific flags */
> > > > +};
> > > > +
> > >
> > > > +__rte_experimental
> > > > +static inline dma_cookie_t
> > > > +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id,
> > > > +                  const struct dma_scatterlist *sg,
> > > > +                  uint32_t sg_len, uint64_t flags)
> > >
> > > I would like to change this as:
> > > rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id, const struct
> > > rte_dma_sg *src, uint32_t nb_src,
> > > const struct rte_dma_sg *dst, uint32_t nb_dst) or so, to allow a use case like
> > > a 30 MB src copy being split and written as 1 MB x 30 dst.
> > >

Out of interest, do you see much benefit (and in what way) from having the
scatter-gather support? Unlike sending 5 buffers in one packet rather than
5 buffers in 5 packets to a NIC, copying an array of memory in one op vs
multiple is functionally identical.

> > >
> > >
<snip>
> Got it. In order to save space in the first CL for the fastpath (saving 8B
> for the pointer) and to avoid
> function overhead, can we use one bit of the flags of the op function to
> enable the fence?
> 

The original ioat implementation did exactly that. However, I then
discovered an ambiguity: because a fence logically belongs between two
operations, does the fence flag on an operation mean "don't do any jobs
after this until this job has completed" or does it mean "don't start this
job until all previous jobs have completed"? [Or theoretically does it mean both :-)]
Naturally, some hardware does it the former way (i.e. fence flag goes on
last op before fence), while other hardware the latter way (i.e. fence flag
goes on first op after the fence). Therefore, since fencing is about
ordering *between* two (sets of) jobs, I decided that it should do exactly
that and go between two jobs, so there is no ambiguity!

However, I'm happy enough to switch to having a fence flag, but I think if
we do that, it should be put in the "first job after fence" case, because
it is always easier to modify a previously written job if we need to, than
to save the flag for a future one.
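
As a rough sketch of why the "first job after fence" convention is easy to
adapt: a driver for hardware that wants the bit on the last op before the
fence can simply patch the previously written descriptor. The flag names,
descriptor layout and ring size below are hypothetical, and the sketch only
handles the case where the previous descriptor has not yet been handed to
hardware.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SIZE 8u            /* hypothetical ring size, power of 2 */
#define RING_MASK (RING_SIZE - 1u)

#define OP_FLAG_FENCE (1u << 0) /* hypothetical API flag: wait for earlier jobs */
#define HW_DESC_FENCE (1u << 0) /* hypothetical HW bit: hold back later jobs */

static_assert((RING_SIZE & RING_MASK) == 0, "ring size must be a power of 2");

struct hw_desc {
	uint32_t ctrl;
};

struct hw_ring {
	struct hw_desc desc[RING_SIZE];
	uint16_t write_idx; /* free-running index of the next job */
	uint16_t pending;   /* jobs written but not yet handed to hardware */
};

/* Write one descriptor; op_flags uses the "first job after fence" meaning. */
static uint16_t
ring_enqueue(struct hw_ring *r, uint32_t op_flags)
{
	uint16_t idx = r->write_idx++;

	r->desc[idx & RING_MASK].ctrl = 0;
	/*
	 * Translate to "last job before fence" by setting the HW bit on the
	 * previous descriptor, which is still owned by software here.
	 */
	if ((op_flags & OP_FLAG_FENCE) && r->pending > 0)
		r->desc[(uint16_t)(idx - 1) & RING_MASK].ctrl |= HW_DESC_FENCE;
	r->pending++;
	return idx;
}
```

Going the other way (API flag on the last job before the fence, hardware bit on
the first job after) would need the driver to remember the flag until the next
enqueue, which is the extra state I'd like to avoid.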

Alternatively, if we keep the fence as a separate function, I'm happy
enough for it not to be on the same cacheline as the "hot" operations,
since fencing will always introduce a small penalty anyway.

> >
> > >
<snip>
> > > Since we have additional function call overhead in all the
> > > applications for this scheme, I would like to understand
> > > the use of doing this way vs enq does the doorbell implicitly from
> > > driver/application PoV?
> > >
> >
> > In our benchmarks it's just faster. When we tested it, the overhead of the
> > function calls was noticeably less than the cost of building up the
> > parameter array(s) for passing the jobs in as a burst. [We don't see this
> > cost with things like NIC I/O since DPDK tends to already have the mbuf
> > fully populated before the TX call anyway.]
> 
> OK. I agree with stack population.
> 
> My question was more about doing the doorbell update implicitly in enq. Is a
> doorbell write costly in other HW compared to a function call? In our HW, it is
> just a write of the number of instructions written to a register.
> 
> Also, we need to again access the internal PMD memory structure to find
> where to write etc if it is a separate function.
> 

The cost varies depending on a number of factors - even writing to a single
HW register can be very slow if that register is mapped as device
(uncacheable) memory, since (AFAIK) it will act as a full fence and wait
for the write to go all the way to hardware. For more modern HW, the cost
can be lighter. However, any cost of HW writes is going to be the same
whether it's a separate function call or not.

However, the main thing about the doorbell update is that it's a
once-per-burst thing, rather than a once-per-job. Therefore, even if you
have to re-read the struct memory (which is likely still somewhere in your
cores' cache), any extra small cost of doing so is to be amortized over the
cost of a whole burst of copies.
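
The once-per-burst pattern can be sketched with a mock device as below. None
of this touches real hardware; the struct, the function names and the ring
size are hypothetical stand-ins for the enqueue and perform calls.

```c
#include <assert.h>
#include <stdint.h>

#define MOCK_RING_SIZE 64u /* hypothetical ring size */

struct mock_dev {
	uint16_t write_idx;           /* next descriptor slot to fill */
	uint16_t doorbell_idx;        /* last index made visible to "hardware" */
	unsigned int doorbell_writes; /* count of (expensive) register writes */
};

/* Per-job work: fill a descriptor in memory only, no device access. */
static uint16_t
mock_enqueue_copy(struct mock_dev *d)
{
	return d->write_idx++;
}

/* Per-burst work: stands in for the single doorbell register write. */
static void
mock_perform(struct mock_dev *d)
{
	d->doorbell_idx = d->write_idx;
	d->doorbell_writes++;
}

/* Enqueue a whole burst, then ring the doorbell exactly once. */
static void
mock_submit_burst(struct mock_dev *d, uint16_t nb_jobs)
{
	uint16_t i;

	assert(nb_jobs <= MOCK_RING_SIZE);
	for (i = 0; i < nb_jobs; i++)
		(void)mock_enqueue_copy(d);
	mock_perform(d);
}
```

For a burst of 32 jobs this does one doorbell write rather than 32, which is
where the saving comes from when the register write is slow.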

> 
> >
> > >
<snip>
> > > > +
> > > > +/**
> > > > + * @warning
> > > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > > + *
> > > > + * Returns the number of operations that failed to complete.
> > > > + * NOTE: This API was used when rte_dmadev_completed has_error was set.
> > > > + *
> > > > + * @param dev_id
> > > > + *   The identifier of the device.
> > > > + * @param vq_id
> > > > + *   The identifier of virt queue.
> > > > + * @param nb_status
> > > > + *   Indicates the size  of status array.
> > > > + * @param[out] status
> > > > + *   The error code of operations that failed to complete.
> > > > + * @param[out] cookie
> > > > + *   The last failed completed operation's cookie.
> > > > + *
> > > > + * @return
> > > > + *   The number of operations that failed to complete.
> > > > + *
> > > > + * NOTE: The caller must ensure that the input parameter is valid and the
> > > > + *       corresponding device supports the operation.
> > > > + */
> > > > +__rte_experimental
> > > > +static inline uint16_t
> > > > +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id,
> > > > +                          const uint16_t nb_status, uint32_t *status,
> > > > +                          dma_cookie_t *cookie)
> > >
> > > IMO, it is better to move cookie/ring_idx to position 3.
> > > Why would it return an array of errors, since it is called after
> > > rte_dmadev_completed() reports
> > > has_error? Is it better to change it to
> > >
> > > rte_dmadev_error_status(uint16_t dev_id, uint16_t vq_id, dma_cookie_t
> > > *cookie,  uint32_t *status)
> > >
> > > I also think we may need to set status as a bitmask, enumerate all
> > > the combinations of error codes
> > > of all the drivers, and return a string from the driver like the existing rte_flow_error
> > >
> > > See
> > > struct rte_flow_error {
> > >         enum rte_flow_error_type type; /**< Cause field and error types. */
> > >         const void *cause; /**< Object responsible for the error. */
> > >         const char *message; /**< Human-readable error message. */
> > > };
> > >
> >
> > I think we need a multi-return value API here, as we may add operations in
> > future which have non-error status values to return. The obvious case is
> > DMA engines which support "compare" operations. In that case a successful
> > compare (as in there were no DMA or HW errors) can return "equal" or
> > "not-equal" as statuses. For general "copy" operations, the faster
> > completion op can be used to just return successful values (and only call
> > this status version on error), while apps using those compare ops or a
> > mixture of copy and compare ops, would always use the slower one that
> > returns status values for each and every op..
> >
> > The ioat APIs used 32-bit integer values for this status array so as to
> > allow e.g. 16-bits for error code and 16-bits for future status values. For
> > most operations there should be a fairly small set of things that can go
> > wrong, i.e. bad source address, bad destination address or invalid length.
> > Within that we may have a couple of specifics for why an address is bad,
> > but even so I don't think we need to start having multiple bit
> > combinations.
> 
> OK. What is the purpose of errors status? Is it for application printing it or
> Does the application need to take any action based on specific error requests?

It's largely for information purposes, but in the case of SVA/SVM errors
could occur due to the memory not being pinned, i.e. a page fault, in some
cases. If that happens, then it's up to the app to either touch the memory and
retry the copy, or to do a SW memcpy as a fallback.
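
A minimal sketch of the SW fallback path, assuming the status array entry
distinguishes a page fault from other errors. The enum values and the helper
name are hypothetical; no such codes have been defined for dmadev yet.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical status codes, for illustration only. */
enum copy_status {
	COPY_STATUS_OK = 0,
	COPY_STATUS_PAGE_FAULT = 1, /* memory was not pinned/present */
	COPY_STATUS_BAD_LENGTH = 2, /* beyond what HW can do in one op */
};

/*
 * Returns 0 if the HW copy succeeded, 1 if the copy was redone in software,
 * and -1 if the error has to be reported to the caller.
 */
static int
handle_copy_status(uint32_t status, void *dst, const void *src, size_t len)
{
	assert(dst != NULL && src != NULL);

	switch (status) {
	case COPY_STATUS_OK:
		return 0;
	case COPY_STATUS_PAGE_FAULT:
		/* The CPU access faults the pages in and completes the copy. */
		memcpy(dst, src, len);
		return 1;
	default:
		return -1;
	}
}
```

An app that prefers to keep using the hardware could instead touch the source
and destination pages and re-enqueue the job.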

In other error cases, I think it's good to tell the application if it's
passing around bad data, or data that is beyond the scope of hardware, e.g.
a copy that is beyond what can be done in a single transaction for a HW
instance. Given that there are always things that can go wrong, I think we
need some error reporting mechanism.

> If the former is scope, then we need to define the standard enum value
> for the error right?
> ie. uint32_t *status needs to change to enum rte_dma_error or so.
> 
Sure. Perhaps an error/status structure is also an option, where we
explicitly call out error info from status info.
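
Something along these lines, as a sketch only. The struct and helper names are
hypothetical, and putting the error code in the upper 16 bits of the 32-bit
word is my assumption here; the ioat APIs only reserved the two halves.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical result struct separating error info from status info. */
struct op_result {
	uint16_t error;  /* 0 means no DMA or HW error */
	uint16_t status; /* non-error outcome, e.g. a compare result */
};

/* Keep it the same size as the 32-bit status word it would replace. */
static_assert(sizeof(struct op_result) == sizeof(uint32_t),
	      "op_result must fit in one 32-bit status word");

/* Pack into a 32-bit word: error in bits 31..16, status in bits 15..0. */
static inline uint32_t
op_result_pack(struct op_result r)
{
	return ((uint32_t)r.error << 16) | r.status;
}

/* Split a 32-bit word back into its error and status halves. */
static inline struct op_result
op_result_unpack(uint32_t word)
{
	struct op_result r = {
		.error = (uint16_t)(word >> 16),
		.status = (uint16_t)(word & 0xffffu),
	};

	return r;
}
```

A compare op that completed without error would then report error == 0 and use
status to say "equal" or "not-equal".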

> 
> 
<snip to end>

/Bruce

* Re: [dpdk-dev] [PATCH] dmadev: introduce DMA device library
  2021-07-05 10:52  0%   ` Bruce Richardson
@ 2021-07-05 15:55  0%     ` Jerin Jacob
  2021-07-05 17:16  0%       ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2021-07-05 15:55 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Chengwen Feng, Thomas Monjalon, Ferruh Yigit, Jerin Jacob,
	dpdk-dev, Morten Brørup, Nipun Gupta, Hemant Agrawal,
	Maxime Coquelin, Honnappa Nagarahalli, David Marchand,
	Satananda Burla, Prasun Kapoor, Ananyev, Konstantin, liangma,
	Radha Mohan Chintakuntla


On Mon, Jul 5, 2021 at 4:22 PM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Sun, Jul 04, 2021 at 03:00:30PM +0530, Jerin Jacob wrote:
> > On Fri, Jul 2, 2021 at 6:51 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
> > >
> > > This patch introduces 'dmadevice' which is a generic type of DMA
> > > device.
> > >
> > > The APIs of dmadev library exposes some generic operations which can
> > > enable configuration and I/O with the DMA devices.
> > >
> > > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> >
> > Thanks for v1.
> >
> > I would suggest finalizing  lib/dmadev/rte_dmadev.h before doing the
> > implementation so that you don't need
> > to waste time on reworking the implementation.
> >
>
> I actually like having the .c file available too. Before we lock down the
> .h file and the API, I want to verify the performance of our drivers with
> the implementation, and having a working .c file is obviously necessary for
> that. So I appreciate having it as part of the RFC.

Ack.

>
> > Comments inline.
> >
> > > ---
> <snip>
> > > + *
> > > + * The DMA framework is built on the following abstraction model:
> > > + *
> > > + *     ------------    ------------
> > > + *     |virt-queue|    |virt-queue|
> > > + *     ------------    ------------
> > > + *            \           /
> > > + *             \         /
> > > + *              \       /
> > > + *            ------------     ------------
> > > + *            | HW-queue |     | HW-queue |
> > > + *            ------------     ------------
> > > + *                   \            /
> > > + *                    \          /
> > > + *                     \        /
> > > + *                     ----------
> > > + *                     | dmadev |
> > > + *                     ----------
> >
> > Continuing the discussion with @Morten Brørup , I think, we need to
> > finalize the model.
> >
>
> +1 and the terminology with regards to queues and channels. With our ioat
> hardware, each HW queue was called a channel for instance.

Looks like <dmadev> <> <channel> can cover all the use cases. If the
HW has more than 1 queue, each queue can be exposed as a separate dmadev.


>
> > > + *   a) The DMA operation request must be submitted to the virt queue, virt
> > > + *      queues must be created based on HW queues, the DMA device could have
> > > + *      multiple HW queues.
> > > + *   b) The virt queues on the same HW-queue could represent different contexts,
> > > + *      e.g. user could create virt-queue-0 on HW-queue-0 for mem-to-mem
> > > + *      transfer scenario, and create virt-queue-1 on the same HW-queue for
> > > + *      mem-to-dev transfer scenario.
> > > + *   NOTE: user could also create multiple virt queues for mem-to-mem transfer
> > > + *         scenario as long as the corresponding driver supports.
> > > + *
> > > + * The control plane APIs include configure/queue_setup/queue_release/start/
> > > + * stop/reset/close, in order to start device work, the call sequence must be
> > > + * as follows:
> > > + *     - rte_dmadev_configure()
> > > + *     - rte_dmadev_queue_setup()
> > > + *     - rte_dmadev_start()
> >
> > Please add reconfigure behaviour etc, Please check the
> > lib/regexdev/rte_regexdev.h
> > introduction. I have added similar ones so you could reuse as much as possible.
> >
> >
> > > + * The dataplane APIs include two parts:
> > > + *   a) The first part is the submission of operation requests:
> > > + *        - rte_dmadev_copy()
> > > + *        - rte_dmadev_copy_sg() - scatter-gather form of copy
> > > + *        - rte_dmadev_fill()
> > > + *        - rte_dmadev_fill_sg() - scatter-gather form of fill
> > > + *        - rte_dmadev_fence()   - add a fence force ordering between operations
> > > + *        - rte_dmadev_perform() - issue doorbell to hardware
> > > + *      These APIs could work with different virt queues which have different
> > > + *      contexts.
> > > + *      The first four APIs are used to submit the operation request to the virt
> > > + *      queue, if the submission is successful, a cookie (as type
> > > + *      'dma_cookie_t') is returned, otherwise a negative number is returned.
> > > + *   b) The second part is to obtain the result of requests:
> > > + *        - rte_dmadev_completed()
> > > + *            - return the number of operation requests completed successfully.
> > > + *        - rte_dmadev_completed_fails()
> > > + *            - return the number of operation requests failed to complete.
> > > + *
> > > + * The misc APIs include info_get/queue_info_get/stats/xstats/selftest, provide
> > > + * information query and self-test capabilities.
> > > + *
> > > + * About the dataplane APIs MT-safe, there are two dimensions:
> > > + *   a) For one virt queue, the submit/completion API could be MT-safe,
> > > + *      e.g. one thread do submit operation, another thread do completion
> > > + *      operation.
> > > + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_VQ.
> > > + *      If driver don't support it, it's up to the application to guarantee
> > > + *      MT-safe.
> > > + *   b) For multiple virt queues on the same HW queue, e.g. one thread do
> > > + *      operation on virt-queue-0, another thread do operation on virt-queue-1.
> > > + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_MVQ.
> > > + *      If driver don't support it, it's up to the application to guarantee
> > > + *      MT-safe.
> >
> > From an application PoV it may not be good to write portable
> > applications. Please check
> > latest thread with @Morten Brørup
> >
> > > + */
> > > +
> > > +#ifdef __cplusplus
> > > +extern "C" {
> > > +#endif
> > > +
> > > +#include <rte_common.h>
> > > +#include <rte_memory.h>
> > > +#include <rte_errno.h>
> > > +#include <rte_compat.h>
> >
> > Sort in alphabetical order.
> >
> > > +
> > > +/**
> > > + * dma_cookie_t - an opaque DMA cookie
> >
> > Since we are defining the behaviour, it is not opaque any more.
> > I think it is better to call it ring_idx or so.
> >
>
> +1 for ring index. We don't need a separate type for it though, just
> document the index as an unsigned return value.
>
> >
> > > +#define RTE_DMA_DEV_CAPA_MT_MVQ (1ull << 11) /**< Support MT-safe of multiple virt queues */
> >
> > Please add lots of @see for all symbols where they are being used, so that one
> > can understand the full scope of the
> > symbols. See the example below.
> >
> > #define RTE_REGEXDEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> > /**< RegEx device does support compiling the rules at runtime unlike
> >  * loading only the pre-built rule database using
> >  * struct rte_regexdev_config::rule_db in rte_regexdev_configure()
> >  *
> >  * @see struct rte_regexdev_config::rule_db, rte_regexdev_configure()
> >  * @see struct rte_regexdev_info::regexdev_capa
> >  */
> >
> > > + *
> > > + * If dma_cookie_t is >=0 it's a DMA operation request cookie, <0 it's a error
> > > + * code.
> > > + * When using cookies, comply with the following rules:
> > > + * a) Cookies for each virtual queue are independent.
> > > + * b) For a virt queue, the cookie are monotonically incremented, when it reach
> > > + *    the INT_MAX, it wraps back to zero.
>
> I disagree with the INT_MAX (or INT32_MAX) value here. If we use that
> value, it means that we cannot use implicit wrap-around inside the CPU and
> have to check for the INT_MAX value. Better to:
> 1. Specify that it wraps at UINT16_MAX which allows us to just use a
> uint16_t internally and wrap-around automatically, or:
> 2. Specify that it wraps at a power-of-2 value >= UINT16_MAX, giving
> drivers the flexibility at what value to wrap around.

I think (2) is better than (1). I think it is even better to wrap around at the
number of descriptors configured in dev_configure() (we can make this a power of 2).


>
> > > + * c) The initial cookie of a virt queue is zero, after the device is stopped or
> > > + *    reset, the virt queue's cookie needs to be reset to zero.
> > > + * Example:
> > > + *    step-1: start one dmadev
> > > + *    step-2: enqueue a copy operation, the cookie return is 0
> > > + *    step-3: enqueue a copy operation again, the cookie return is 1
> > > + *    ...
> > > + *    step-101: stop the dmadev
> > > + *    step-102: start the dmadev
> > > + *    step-103: enqueue a copy operation, the cookie return is 0
> > > + *    ...
> > > + */
> >
> > Good explanation.
> >
> > > +typedef int32_t dma_cookie_t;
> >
>
> As I mentioned before, I'd just remove this, and use regular int types,
> with "ring_idx" as the name.

+1

>
> >
> > > +
> > > +/**
> > > + * dma_scatterlist - can hold scatter DMA operation request
> > > + */
> > > +struct dma_scatterlist {
> >
> > I prefer to change scatterlist -> sg
> > i.e rte_dma_sg
> >
> > > +       void *src;
> > > +       void *dst;
> > > +       uint32_t length;
> > > +};
> > > +
> >
> > > +
> > > +/**
> > > + * A structure used to retrieve the contextual information of
> > > + * an DMA device
> > > + */
> > > +struct rte_dmadev_info {
> > > +       /**
> > > +        * Fields filled by framewok
> >
> > typo.
> >
> > > +        */
> > > +       struct rte_device *device; /**< Generic Device information */
> > > +       const char *driver_name; /**< Device driver name */
> > > +       int socket_id; /**< Socket ID where memory is allocated */
> > > +
> > > +       /**
> > > +        * Specification fields filled by driver
> > > +        */
> > > +       uint64_t dev_capa; /**< Device capabilities (RTE_DMA_DEV_CAPA_) */
> > > +       uint16_t max_hw_queues; /**< Maximum number of HW queues. */
> > > +       uint16_t max_vqs_per_hw_queue;
> > > +       /**< Maximum number of virt queues to allocate per HW queue */
> > > +       uint16_t max_desc;
> > > +       /**< Maximum allowed number of virt queue descriptors */
> > > +       uint16_t min_desc;
> > > +       /**< Minimum allowed number of virt queue descriptors */
> >
> > Please add max_nb_segs. i.e maximum number of segments supported.
> >
> > > +
> > > +       /**
> > > +        * Status fields filled by driver
> > > +        */
> > > +       uint16_t nb_hw_queues; /**< Number of HW queues configured */
> > > +       uint16_t nb_vqs; /**< Number of virt queues configured */
> > > +};
> > > > +
> > > +
> > > +/**
> > > + * dma_address_type
> > > + */
> > > +enum dma_address_type {
> > > +       DMA_ADDRESS_TYPE_IOVA, /**< Use IOVA as dma address */
> > > +       DMA_ADDRESS_TYPE_VA, /**< Use VA as dma address */
> > > +};
> > > +
> > > +/**
> > > + * A structure used to configure a DMA device.
> > > + */
> > > +struct rte_dmadev_conf {
> > > +       enum dma_address_type addr_type; /**< Address type to used */
> >
> > I think, there are 3 kinds of limitations/capabilities.
> >
> > When the system is configured as IOVA as VA
> > 1) Device supports any VA address like memory from rte_malloc(),
> > rte_memzone(), malloc, stack memory
> > 2) Device support only VA address from rte_malloc(), rte_memzone() i.e
> > memory backed by hugepage and added to DMA map.
> >
> > When the system is configured as IOVA as PA
> > 1) Devices support only PA addresses.
> >
> > IMO, the above needs to be advertised as a capability, and the application
> > needs to align with that.
> > I don't think the application requests the driver to work in any of the modes.
> >
> >
>
> I don't think we need this level of detail for addressing capabilities.
> Unless I'm missing something, the hardware should behave exactly as other
> hardware does taking in iova's.  If the user wants to check whether virtual
> addresses to pinned memory can be used directly, the user can call
> "rte_eal_iova_mode". We can't have a situation where some hardware uses one
> type of addresses and another hardware the other.
>
> Therefore, the only additional addressing capability we should need to
> report is that the hardware can use SVM/SVA and use virtual addresses not
> in hugepage memory.

+1.


>
> >
> > > +       uint16_t nb_hw_queues; /**< Number of HW-queues enable to use */
> > > +       uint16_t max_vqs; /**< Maximum number of virt queues to use */
> >
> > You need to document the maximum value allowed, etc., i.e. that it is
> > based on info_get(), and mention the corresponding field
> > in the info structure.
> >
> >
> > > +
> > > +/**
> > > + * dma_transfer_direction
> > > + */
> > > +enum dma_transfer_direction {
> >
> > rte_dma_transfer_direction
> >
> > > +       DMA_MEM_TO_MEM,
> > > +       DMA_MEM_TO_DEV,
> > > +       DMA_DEV_TO_MEM,
> > > +       DMA_DEV_TO_DEV,
> > > +};
> > > +
> > > +/**
> > > + * A structure used to configure a DMA virt queue.
> > > + */
> > > +struct rte_dmadev_queue_conf {
> > > +       enum dma_transfer_direction direction;
> >
> >
> > > +       /**< Associated transfer direction */
> > > +       uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> > > +       uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> > > +       uint64_t dev_flags; /**< Device specific flags */
> >
> > Use of this? Need more comments on this.
> > Since it is in slowpath, We can have non opaque names here based on
> > each driver capability.
> >
> >
> > > +       void *dev_ctx; /**< Device specific context */
> >
> > Use of this? Need more comments on this.
> >
>
> I think this should be dropped. We should not have any opaque
> device-specific info in these structs, rather if a particular device needs
> parameters we should call them out. Drivers for which it's not relevant can
> ignore them (and report same in capability if necessary). Since this is not
> a dataplane API, we aren't concerned too much about perf and can size the
> struct appropriately.
>
> >
> > Please add some good amount of reserved bits and have API to init this
> > structure for future ABI stability, say rte_dmadev_queue_config_init()
> > or so.
> >
>
> I don't think that is necessary. Since the config struct is used only as
> parameter to the config function, any changes to it can be managed by
> versioning that single function. Padding would only be necessary if we had
> an array of these config structs somewhere.

OK.

For some reason, the function versioning API looks ugly to me in code, whereas
keeping some rsvd fields together with an init function looks cleaner to me.

But I agree, function versioning works in this case. No need to find another API
if it is not general DPDK API practice.

In other libraries, I have seen such an _init function used for this as well
as for filling in default values (in some cases the implementation's default
values are not zero), so that the application can avoid a memset of the
param structure.
Added rte_event_queue_default_conf_get() in eventdev spec for this.

No strong opinion on this.
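
For reference, a minimal sketch of the default-conf-get pattern mentioned
above. The struct, field names, reserved array and default value are all
hypothetical (they are not in the patch); only the idea of a helper that
fills in defaults instead of an application-side memset comes from the
discussion.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical queue config; field names are illustrative, not from the patch. */
struct example_dma_queue_conf {
	uint16_t hw_queue_id; /* HW queue backing this virt queue */
	uint16_t nb_desc;     /* number of descriptors */
	uint64_t reserved[4]; /* reserved for future use, must be zero */
};

#define EXAMPLE_DMA_DEFAULT_NB_DESC 512 /* assumed non-zero driver default */

/* Fill conf with defaults so the application does not need its own memset(). */
static int
example_dma_queue_default_conf_get(struct example_dma_queue_conf *conf)
{
	if (conf == NULL)
		return -1;
	memset(conf, 0, sizeof(*conf));
	conf->nb_desc = EXAMPLE_DMA_DEFAULT_NB_DESC;
	return 0;
}
```

The application would call the helper first and then override only the
fields it cares about, so newly added fields pick up sane defaults.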



>
> >
> > > +
> > > +/**
> > > + * A structure used to retrieve information of a DMA virt queue.
> > > + */
> > > +struct rte_dmadev_queue_info {
> > > +       enum dma_transfer_direction direction;
> >
> > A queue may support all directions so I think it should be a bitfield.
> >
> > > +       /**< Associated transfer direction */
> > > +       uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> > > +       uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> > > +       uint64_t dev_flags; /**< Device specific flags */
> > > +};
> > > +
> >
> > > +__rte_experimental
> > > +static inline dma_cookie_t
> > > +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id,
> > > +                  const struct dma_scatterlist *sg,
> > > +                  uint32_t sg_len, uint64_t flags)
> >
> > I would like to change this as:
> > rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id, const struct
> > rte_dma_sg *src, uint32_t nb_src,
> > const struct rte_dma_sg *dst, uint32_t nb_dst) or so, to allow use cases
> > like a 30 MB src copy being split and written as 1 MB x 30 dst.
> >
> >
> >
> > > +{
> > > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > > +       return (*dev->copy_sg)(dev, vq_id, sg, sg_len, flags);
> > > +}
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Enqueue a fill operation onto the DMA virt queue
> > > + *
> > > + * This queues up a fill operation to be performed by hardware, but does not
> > > + * trigger hardware to begin that operation.
> > > + *
> > > + * @param dev_id
> > > + *   The identifier of the device.
> > > + * @param vq_id
> > > + *   The identifier of virt queue.
> > > + * @param pattern
> > > + *   The pattern to populate the destination buffer with.
> > > + * @param dst
> > > + *   The address of the destination buffer.
> > > + * @param length
> > > + *   The length of the destination buffer.
> > > + * @param flags
> > > + *   An opaque flags for this operation.
> >
> > PLEASE REMOVE opaque stuff from fastpath it will be a pain for
> > application writers as
> > they need to write multiple combinations of fastpath. flags are OK, if
> > we have a valid
> > generic flag now to control the transfer behavior.
> >
>
> +1. Flags need to be explicitly listed. If we don't have any flags for now,
> we can specify that the value must be given as zero and it's for future
> use.

OK.

>
> >
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Add a fence to force ordering between operations
> > > + *
> > > + * This adds a fence to a sequence of operations to enforce ordering, such that
> > > + * all operations enqueued before the fence must be completed before operations
> > > + * after the fence.
> > > + * NOTE: Since this fence may be added as a flag to the last operation enqueued,
> > > + * this API may not function correctly when called immediately after an
> > > + * "rte_dmadev_perform" call i.e. before any new operations are enqueued.
> > > + *
> > > + * @param dev_id
> > > + *   The identifier of the device.
> > > + * @param vq_id
> > > + *   The identifier of virt queue.
> > > + *
> > > + * @return
> > > + *   - =0: Successful add fence.
> > > + *   - <0: Failure to add fence.
> > > + *
> > > + * NOTE: The caller must ensure that the input parameter is valid and the
> > > + *       corresponding device supports the operation.
> > > + */
> > > +__rte_experimental
> > > +static inline int
> > > +rte_dmadev_fence(uint16_t dev_id, uint16_t vq_id)
> > > +{
> > > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > > +       return (*dev->fence)(dev, vq_id);
> > > +}
> >
> > Since HW submission is in a queue(FIFO) the ordering is always
> > maintained. Right?
> > Could you share more details and use case of fence() from
> > driver/application PoV?
> >
>
> There are different kinds of ordering to consider, ordering of completions
> and the ordering of operations. While jobs are reported as completed to the
> user in order, for performance, hardware may overlap individual jobs within
> a burst (or even across bursts). Therefore, we need a fence operation to
> inform hardware that one job should not be started until the other has
> fully completed.

Got it. In order to save space in the first CL of the fastpath (saving 8B
for the pointer) and to avoid
function call overhead, can we use one bit of the flags of the op function to
enable the fence?
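
A rough sketch of what a fence bit in the op flags could look like. The
flag name, descriptor layout and toy ring are hypothetical; the semantics
assumed here are that an op carrying the fence bit must not start until
all previously enqueued ops have completed.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_DMA_OP_FLAG_FENCE (UINT64_C(1) << 0) /* hypothetical flag bit */
#define EXAMPLE_RING_SIZE 8

/* Toy software descriptor; a real driver would write the HW format instead. */
struct example_desc {
	uint64_t src;
	uint64_t dst;
	uint32_t length;
	uint8_t fence; /* 1: earlier jobs must complete before this one starts */
};

struct example_vq {
	struct example_desc ring[EXAMPLE_RING_SIZE];
	uint16_t tail; /* next free slot */
};

/* Enqueue a copy; returns the slot index used, or -1 if the ring is full. */
static int
example_copy(struct example_vq *vq, uint64_t src, uint64_t dst,
	     uint32_t length, uint64_t flags)
{
	struct example_desc *d;

	if (vq->tail >= EXAMPLE_RING_SIZE)
		return -1;
	d = &vq->ring[vq->tail];
	d->src = src;
	d->dst = dst;
	d->length = length;
	d->fence = (flags & EXAMPLE_DMA_OP_FLAG_FENCE) != 0;
	return vq->tail++;
}
```

This removes the separate fence function pointer from the device struct
and the extra call; the cost is one reserved bit in the flags.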

>
> >
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Trigger hardware to begin performing enqueued operations
> > > + *
> > > + * This API is used to write the "doorbell" to the hardware to trigger it
> > > + * to begin the operations previously enqueued by rte_dmadev_copy/fill()
> > > + *
> > > + * @param dev_id
> > > + *   The identifier of the device.
> > > + * @param vq_id
> > > + *   The identifier of virt queue.
> > > + *
> > > + * @return
> > > + *   - =0: Successful trigger hardware.
> > > + *   - <0: Failure to trigger hardware.
> > > + *
> > > + * NOTE: The caller must ensure that the input parameter is valid and the
> > > + *       corresponding device supports the operation.
> > > + */
> > > +__rte_experimental
> > > +static inline int
> > > +rte_dmadev_perform(uint16_t dev_id, uint16_t vq_id)
> > > +{
> > > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > > +       return (*dev->perform)(dev, vq_id);
> > > +}
> >
> > Since we have additional function call overhead in all the
> > applications for this scheme, I would like to understand
> > the use of doing this way vs enq does the doorbell implicitly from
> > driver/application PoV?
> >
>
> In our benchmarks it's just faster. When we tested it, the overhead of the
> function calls was noticeably less than the cost of building up the
> parameter array(s) for passing the jobs in as a burst. [We don't see this
> cost with things like NIC I/O since DPDK tends to already have the mbuf
> fully populated before the TX call anyway.]

OK. I agree with stack population.

My question was more about doing an implicit doorbell update on enq. Is a
doorbell write costly in other HW compared to a function call? In our HW, it is
just a write of the number of instructions written into a register.

Also, we need to again access the internal PMD memory structure to find
where to write etc. if it is a separate function.
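
To illustrate the two options being compared, here is a toy model (no real
hardware access) that counts doorbell writes. Everything in it is
hypothetical, including the SUBMIT flag name; it only shows that a per-op
flag lets the application choose between ringing the doorbell on every
enqueue and batching several enqueues behind one doorbell write.

```c
#include <assert.h>
#include <stdint.h>

#define EXAMPLE_DMA_OP_FLAG_SUBMIT (UINT64_C(1) << 1) /* hypothetical flag bit */

struct example_dev {
	uint16_t pending;         /* jobs written but not yet announced to HW */
	uint32_t submitted;       /* jobs the "hardware" has been told about */
	uint32_t doorbell_writes; /* stands in for MMIO register writes */
};

/* Announce all pending jobs with a single (simulated) register write. */
static void
example_ring_doorbell(struct example_dev *dev)
{
	if (dev->pending == 0)
		return;
	dev->submitted += dev->pending;
	dev->pending = 0;
	dev->doorbell_writes++;
}

/* Enqueue one job; ring the doorbell implicitly if the flag asks for it. */
static void
example_enqueue(struct example_dev *dev, uint64_t flags)
{
	dev->pending++;
	if (flags & EXAMPLE_DMA_OP_FLAG_SUBMIT)
		example_ring_doorbell(dev);
}

/* Explicit doorbell, equivalent to the separate perform() call in the patch. */
static void
example_perform(struct example_dev *dev)
{
	example_ring_doorbell(dev);
}
```

Whether the batched form is actually faster depends on how expensive the
doorbell write is on a given device, which is the open question above.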


>
> >
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Returns the number of operations that have been successful completed.
> > > + *
> > > + * @param dev_id
> > > + *   The identifier of the device.
> > > + * @param vq_id
> > > + *   The identifier of virt queue.
> > > + * @param nb_cpls
> > > + *   The maximum number of completed operations that can be processed.
> > > + * @param[out] cookie
> > > + *   The last completed operation's cookie.
> > > + * @param[out] has_error
> > > + *   Indicates if there are transfer error.
> > > + *
> > > + * @return
> > > + *   The number of operations that successful completed.
> >
> > successfully
> >
> > > + *
> > > + * NOTE: The caller must ensure that the input parameter is valid and the
> > > + *       corresponding device supports the operation.
> > > + */
> > > +__rte_experimental
> > > +static inline uint16_t
> > > +rte_dmadev_completed(uint16_t dev_id, uint16_t vq_id, const uint16_t nb_cpls,
> > > +                    dma_cookie_t *cookie, bool *has_error)
> > > +{
> > > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > > > +       *has_error = false;
> > > +       return (*dev->completed)(dev, vq_id, nb_cpls, cookie, has_error);
> >
> > It may be better to have cookie/ring_idx as third argument.
> >
>
> No strong opinions here, but having it as in the code above means all
> input parameters come before all output, which makes sense to me.

+1

>
> > > +}
> > > +
> > > +/**
> > > + * @warning
> > > + * @b EXPERIMENTAL: this API may change without prior notice.
> > > + *
> > > + * Returns the number of operations that failed to complete.
> > > + * NOTE: This API was used when rte_dmadev_completed has_error was set.
> > > + *
> > > + * @param dev_id
> > > + *   The identifier of the device.
> > > + * @param vq_id
> > > + *   The identifier of virt queue.
> > > > + * @param nb_status
> > > + *   Indicates the size  of status array.
> > > + * @param[out] status
> > > + *   The error code of operations that failed to complete.
> > > + * @param[out] cookie
> > > + *   The last failed completed operation's cookie.
> > > + *
> > > + * @return
> > > + *   The number of operations that failed to complete.
> > > + *
> > > + * NOTE: The caller must ensure that the input parameter is valid and the
> > > + *       corresponding device supports the operation.
> > > + */
> > > +__rte_experimental
> > > +static inline uint16_t
> > > +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id,
> > > +                          const uint16_t nb_status, uint32_t *status,
> > > +                          dma_cookie_t *cookie)
> >
> > IMO, it is better to move cookie/ring_idx to position 3.
> > Why would it return an array of errors, since it is called after
> > rte_dmadev_completed() has
> > set has_error? Is it better to change it to
> >
> > rte_dmadev_error_status((uint16_t dev_id, uint16_t vq_id, dma_cookie_t
> > *cookie,  uint32_t *status)
> >
> > I also think we may need to make status a bitmask, enumerate all
> > the combinations of error codes
> > of all the drivers, and return a string from the driver, like the existing rte_flow_error
> >
> > See
> > struct rte_flow_error {
> >         enum rte_flow_error_type type; /**< Cause field and error types. */
> >         const void *cause; /**< Object responsible for the error. */
> >         const char *message; /**< Human-readable error message. */
> > };
> >
>
> I think we need a multi-return value API here, as we may add operations in
> future which have non-error status values to return. The obvious case is
> DMA engines which support "compare" operations. In that case a successful
> compare (as in there were no DMA or HW errors) can return "equal" or
> "not-equal" as statuses. For general "copy" operations, the faster
> completion op can be used to just return successful values (and only call
> this status version on error), while apps using those compare ops or a
> mixture of copy and compare ops, would always use the slower one that
> returns status values for each and every op..
>
> The ioat APIs used 32-bit integer values for this status array so as to
> allow e.g. 16-bits for error code and 16-bits for future status values. For
> most operations there should be a fairly small set of things that can go
> wrong, i.e. bad source address, bad destination address or invalid length.
> Within that we may have a couple of specifics for why an address is bad,
> but even so I don't think we need to start having multiple bit
> combinations.

OK. What is the purpose of the error status? Is it for the application to print, or
does the application need to take any action based on specific errors?

If the former is in scope, then we need to define standard enum values
for the errors, right?
i.e. uint32_t *status needs to change to enum rte_dma_error or so.
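
As a sketch of how a standard enum could coexist with the 32-bit status
word described above (16 bits of error code plus 16 bits for future status
values): the enum members, helper names and the choice of the low half for
the error code are all assumptions for illustration, not taken from the
patch or from the ioat driver.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical generic error codes, following the cases named in the thread. */
enum example_dma_error {
	EXAMPLE_DMA_ERR_NONE = 0,
	EXAMPLE_DMA_ERR_BAD_SRC_ADDR,
	EXAMPLE_DMA_ERR_BAD_DST_ADDR,
	EXAMPLE_DMA_ERR_BAD_LENGTH,
};

/* Hypothetical non-error results, e.g. for a future "compare" operation. */
enum example_dma_op_status {
	EXAMPLE_DMA_STATUS_EQUAL = 0,
	EXAMPLE_DMA_STATUS_NOT_EQUAL,
};

/* Assumed layout: low 16 bits = error code, high 16 bits = op status. */
static inline uint32_t
example_status_pack(enum example_dma_error err, uint16_t op_status)
{
	return ((uint32_t)op_status << 16) | (uint16_t)err;
}

static inline enum example_dma_error
example_status_error(uint32_t status)
{
	return (enum example_dma_error)(status & 0xffff);
}

static inline uint16_t
example_status_op(uint32_t status)
{
	return (uint16_t)(status >> 16);
}
```

An application that only prints errors can switch on the error half, while
one using compare ops would also look at the op status half.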



>
> > > +{
> > > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > > +       return (*dev->completed_fails)(dev, vq_id, nb_status, status, cookie);
> > > +}
> > > +
> > > +struct rte_dmadev_stats {
> > > +       uint64_t enqueue_fail_count;
> > > +       /**< Conut of all operations which failed enqueued */
> > > +       uint64_t enqueued_count;
> > > +       /**< Count of all operations which successful enqueued */
> > > +       uint64_t completed_fail_count;
> > > +       /**< Count of all operations which failed to complete */
> > > +       uint64_t completed_count;
> > > +       /**< Count of all operations which successful complete */
> > > +};
> >
> > We need to have capability API to tell which items are
> > updated/supported by the driver.
> >
>
> I also would remove the enqueue fail counts, since they are better counted
> by the app. If a driver reports 20,000 failures we have no way of knowing
> if that is 20,000 unique operations which failed to enqueue or a single
> operation which failed to enqueue 20,000 times but succeeded on attempt
> 20,001.
>
> >
> > > diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
> > > new file mode 100644
> > > index 0000000..a3afea2
> > > --- /dev/null
> > > +++ b/lib/dmadev/rte_dmadev_core.h
> > > @@ -0,0 +1,98 @@
> > > +/* SPDX-License-Identifier: BSD-3-Clause
> > > + * Copyright 2021 HiSilicon Limited.
> > > + */
> > > +
> > > +#ifndef _RTE_DMADEV_CORE_H_
> > > +#define _RTE_DMADEV_CORE_H_
> > > +
> > > +/**
> > > + * @file
> > > + *
> > > + * RTE DMA Device internal header.
> > > + *
> > > + * This header contains internal data types. But they are still part of the
> > > + * public API because they are used by inline public functions.
> > > + */
> > > +
> > > +struct rte_dmadev;
> > > +
> > > +typedef dma_cookie_t (*dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vq_id,
> > > +                                     void *src, void *dst,
> > > +                                     uint32_t length, uint64_t flags);
> > > +/**< @internal Function used to enqueue a copy operation. */
> >
> > To avoid namespace conflict(as it is public API) use rte_
> >
> >
> > > +
> > > +/**
> > > + * The data structure associated with each DMA device.
> > > + */
> > > +struct rte_dmadev {
> > > +       /**< Enqueue a copy operation onto the DMA device. */
> > > +       dmadev_copy_t copy;
> > > +       /**< Enqueue a scatter list copy operation onto the DMA device. */
> > > +       dmadev_copy_sg_t copy_sg;
> > > +       /**< Enqueue a fill operation onto the DMA device. */
> > > +       dmadev_fill_t fill;
> > > +       /**< Enqueue a scatter list fill operation onto the DMA device. */
> > > +       dmadev_fill_sg_t fill_sg;
> > > +       /**< Add a fence to force ordering between operations. */
> > > +       dmadev_fence_t fence;
> > > +       /**< Trigger hardware to begin performing enqueued operations. */
> > > +       dmadev_perform_t perform;
> > > +       /**< Returns the number of operations that successful completed. */
> > > +       dmadev_completed_t completed;
> > > +       /**< Returns the number of operations that failed to complete. */
> > > +       dmadev_completed_fails_t completed_fails;
> >
> > We need to limit fastpath items in 1 CL
> >
>
> I don't think that is going to be possible. I also would like to see
> numbers to check if we benefit much from having these fastpath ops separate
> from the regular ops.
>
> > > +
> > > +       void *dev_private; /**< PMD-specific private data */
> > > +       const struct rte_dmadev_ops *dev_ops; /**< Functions exported by PMD */
> > > +
> > > +       uint16_t dev_id; /**< Device ID for this instance */
> > > +       int socket_id; /**< Socket ID where memory is allocated */
> > > +       struct rte_device *device;
> > > +       /**< Device info. supplied during device initialization */
> > > +       const char *driver_name; /**< Driver info. supplied by probing */
> > > +       char name[RTE_DMADEV_NAME_MAX_LEN]; /**< Device name */
> > > +
> > > +       RTE_STD_C11
> > > +       uint8_t attached : 1; /**< Flag indicating the device is attached */
> > > +       uint8_t started : 1; /**< Device state: STARTED(1)/STOPPED(0) */
> >
> > Add a couple of reserved fields for future ABI stability.
> >
> > > +
> > > +} __rte_cache_aligned;
> > > +
> > > +extern struct rte_dmadev rte_dmadevices[];
> > > +


* [dpdk-dev] [PATCH v6 4/7] power: remove thread safety from PMD power API's
    2021-07-05 15:21  3%           ` [dpdk-dev] [PATCH v6 1/7] power_intrinsics: use callbacks for comparison Anatoly Burakov
@ 2021-07-05 15:21  3%           ` Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-07-05 15:21 UTC (permalink / raw)
  To: dev, David Hunt; +Cc: ciara.loftus, konstantin.ananyev

Currently, we expect that only one callback can be active at any given
moment, for a particular queue configuration, which is relatively easy
to implement in a thread-safe way. However, we're about to add support
for multiple queues per lcore, which will greatly increase the
possibility of various race conditions.

We could have used something like an RCU for this use case, but absent
a pressing need for thread safety we'll go the easy way and just
mandate that the APIs are to be called when all affected ports are
stopped, and document this limitation. This greatly simplifies the
`rte_power_monitor`-related code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Add check for stopped queue
    - Clarified doc message
    - Added release notes

 doc/guides/rel_notes/release_21_08.rst |   5 +
 lib/power/meson.build                  |   3 +
 lib/power/rte_power_pmd_mgmt.c         | 133 ++++++++++---------------
 lib/power/rte_power_pmd_mgmt.h         |   6 ++
 4 files changed, 67 insertions(+), 80 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index 9d1cfac395..f015c509fc 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -88,6 +88,11 @@ API Changes
 
 * eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
+* rte_power: The experimental PMD power management API is no longer considered
+  to be thread safe; all Rx queues affected by the API will now need to be
+  stopped before making any changes to the power management scheme.
+
+
 ABI Changes
 -----------
 
diff --git a/lib/power/meson.build b/lib/power/meson.build
index c1097d32f1..4f6a242364 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -21,4 +21,7 @@ headers = files(
         'rte_power_pmd_mgmt.h',
         'rte_power_guest_channel.h',
 )
+if cc.has_argument('-Wno-cast-qual')
+    cflags += '-Wno-cast-qual'
+endif
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c
index db03cbf420..9b95cf1794 100644
--- a/lib/power/rte_power_pmd_mgmt.c
+++ b/lib/power/rte_power_pmd_mgmt.c
@@ -40,8 +40,6 @@ struct pmd_queue_cfg {
 	/**< Callback mode for this queue */
 	const struct rte_eth_rxtx_callback *cur_cb;
 	/**< Callback instance */
-	volatile bool umwait_in_progress;
-	/**< are we currently sleeping? */
 	uint64_t empty_poll_stats;
 	/**< Number of empty polls */
 } __rte_cache_aligned;
@@ -92,30 +90,11 @@ clb_umwait(uint16_t port_id, uint16_t qidx, struct rte_mbuf **pkts __rte_unused,
 			struct rte_power_monitor_cond pmc;
 			uint16_t ret;
 
-			/*
-			 * we might get a cancellation request while being
-			 * inside the callback, in which case the wakeup
-			 * wouldn't work because it would've arrived too early.
-			 *
-			 * to get around this, we notify the other thread that
-			 * we're sleeping, so that it can spin until we're done.
-			 * unsolicited wakeups are perfectly safe.
-			 */
-			q_conf->umwait_in_progress = true;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-			/* check if we need to cancel sleep */
-			if (q_conf->pwr_mgmt_state == PMD_MGMT_ENABLED) {
-				/* use monitoring condition to sleep */
-				ret = rte_eth_get_monitor_addr(port_id, qidx,
-						&pmc);
-				if (ret == 0)
-					rte_power_monitor(&pmc, UINT64_MAX);
-			}
-			q_conf->umwait_in_progress = false;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
+			/* use monitoring condition to sleep */
+			ret = rte_eth_get_monitor_addr(port_id, qidx,
+					&pmc);
+			if (ret == 0)
+				rte_power_monitor(&pmc, UINT64_MAX);
 		}
 	} else
 		q_conf->empty_poll_stats = 0;
@@ -177,12 +156,24 @@ clb_scale_freq(uint16_t port_id, uint16_t qidx,
 	return nb_rx;
 }
 
+static int
+queue_stopped(const uint16_t port_id, const uint16_t queue_id)
+{
+	struct rte_eth_rxq_info qinfo;
+
+	if (rte_eth_rx_queue_info_get(port_id, queue_id, &qinfo) < 0)
+		return -1;
+
+	return qinfo.queue_state == RTE_ETH_QUEUE_STATE_STOPPED;
+}
+
 int
 rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		uint16_t queue_id, enum rte_power_pmd_mgmt_type mode)
 {
 	struct pmd_queue_cfg *queue_cfg;
 	struct rte_eth_dev_info info;
+	rte_rx_callback_fn clb;
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
@@ -203,6 +194,14 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		goto end;
 	}
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		ret = ret < 0 ? -EINVAL : -EBUSY;
+		goto end;
+	}
+
 	queue_cfg = &port_cfg[port_id][queue_id];
 
 	if (queue_cfg->pwr_mgmt_state != PMD_MGMT_DISABLED) {
@@ -232,17 +231,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->umwait_in_progress = false;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* ensure we update our state before callback starts */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_umwait, NULL);
+		clb = clb_umwait;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_SCALE:
@@ -269,16 +258,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id,
-				queue_id, clb_scale_freq, NULL);
+		clb = clb_scale_freq;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_PAUSE:
@@ -286,18 +266,21 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		if (global_data.tsc_per_us == 0)
 			calc_tsc();
 
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_pause, NULL);
+		clb = clb_pause;
 		break;
+	default:
+		RTE_LOG(DEBUG, POWER, "Invalid power management type\n");
+		ret = -EINVAL;
+		goto end;
 	}
+
+	/* initialize data before enabling the callback */
+	queue_cfg->empty_poll_stats = 0;
+	queue_cfg->cb_mode = mode;
+	queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
+	queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
+			clb, NULL);
+
 	ret = 0;
 end:
 	return ret;
@@ -308,12 +291,20 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		uint16_t port_id, uint16_t queue_id)
 {
 	struct pmd_queue_cfg *queue_cfg;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
 	if (lcore_id >= RTE_MAX_LCORE || queue_id >= RTE_MAX_QUEUES_PER_PORT)
 		return -EINVAL;
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		return ret < 0 ? -EINVAL : -EBUSY;
+	}
+
 	/* no need to check queue id as wrong queue id would not be enabled */
 	queue_cfg = &port_cfg[port_id][queue_id];
 
@@ -323,27 +314,8 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 	/* stop any callbacks from progressing */
 	queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
 
-	/* ensure we update our state before continuing */
-	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
 	switch (queue_cfg->cb_mode) {
-	case RTE_POWER_MGMT_TYPE_MONITOR:
-	{
-		bool exit = false;
-		do {
-			/*
-			 * we may request cancellation while the other thread
-			 * has just entered the callback but hasn't started
-			 * sleeping yet, so keep waking it up until we know it's
-			 * done sleeping.
-			 */
-			if (queue_cfg->umwait_in_progress)
-				rte_power_monitor_wakeup(lcore_id);
-			else
-				exit = true;
-		} while (!exit);
-	}
-	/* fall-through */
+	case RTE_POWER_MGMT_TYPE_MONITOR: /* fall-through */
 	case RTE_POWER_MGMT_TYPE_PAUSE:
 		rte_eth_remove_rx_callback(port_id, queue_id,
 				queue_cfg->cur_cb);
@@ -356,10 +328,11 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		break;
 	}
 	/*
-	 * we don't free the RX callback here because it is unsafe to do so
-	 * unless we know for a fact that all data plane threads have stopped.
+	 * the API doc mandates that the user stops all processing on affected
+	 * ports before calling any of these API's, so we can assume that the
+	 * callbacks can be freed. we're intentionally casting away const-ness.
 	 */
-	queue_cfg->cur_cb = NULL;
+	rte_free((void *)queue_cfg->cur_cb);
 
 	return 0;
 }
diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
index 7a0ac24625..444e7b8a66 100644
--- a/lib/power/rte_power_pmd_mgmt.h
+++ b/lib/power/rte_power_pmd_mgmt.h
@@ -43,6 +43,9 @@ enum rte_power_pmd_mgmt_type {
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue will be polled from.
  * @param port_id
@@ -69,6 +72,9 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id,
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue is polled from.
  * @param port_id
-- 
2.25.1



* [dpdk-dev] [PATCH v6 1/7] power_intrinsics: use callbacks for comparison
  @ 2021-07-05 15:21  3%           ` Anatoly Burakov
  2021-07-05 15:21  3%           ` [dpdk-dev] [PATCH v6 4/7] power: remove thread safety from PMD power API's Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-07-05 15:21 UTC (permalink / raw)
  To: dev, Timothy McDaniel, Beilei Xing, Jingjing Wu, Qiming Yang,
	Qi Zhang, Haiyue Wang, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Bruce Richardson, Konstantin Ananyev
  Cc: david.hunt, ciara.loftus

Previously, the semantics of power monitor were such that we were
checking current value against the expected value, and if they matched,
then the sleep was aborted. This is somewhat inflexible, because it only
allowed us to check for a specific value in a specific way.

This commit replaces the comparison with a user callback mechanism, so
that any PMD (or other code) using `rte_power_monitor()` can define
their own comparison semantics and decision making on how to detect the
need to abort the entering of power optimized state.

Existing implementations are adjusted to follow the new semantics.

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---

Notes:
    v4:
    - Return error if callback is set to NULL
    - Replace raw number with a macro in monitor condition opaque data
    
    v2:
    - Use callback mechanism for more flexibility
    - Address feedback from Konstantin

 doc/guides/rel_notes/release_21_08.rst        |  1 +
 drivers/event/dlb2/dlb2.c                     | 17 ++++++++--
 drivers/net/i40e/i40e_rxtx.c                  | 20 +++++++----
 drivers/net/iavf/iavf_rxtx.c                  | 20 +++++++----
 drivers/net/ice/ice_rxtx.c                    | 20 +++++++----
 drivers/net/ixgbe/ixgbe_rxtx.c                | 20 +++++++----
 drivers/net/mlx5/mlx5_rx.c                    | 17 ++++++++--
 .../include/generic/rte_power_intrinsics.h    | 33 +++++++++++++++----
 lib/eal/x86/rte_power_intrinsics.c            | 17 +++++-----
 9 files changed, 121 insertions(+), 44 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index a6ecfdf3ce..c84ac280f5 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -84,6 +84,7 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =======================================================
 
+* eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
 ABI Changes
 -----------
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index eca183753f..252bbd8d5e 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -3154,6 +3154,16 @@ dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 	}
 }
 
+#define CLB_MASK_IDX 0
+#define CLB_VAL_IDX 1
+static int
+dlb2_monitor_callback(const uint64_t val,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
+{
+	/* abort if the value matches */
+	return (val & opaque[CLB_MASK_IDX]) == opaque[CLB_VAL_IDX] ? -1 : 0;
+}
+
 static inline int
 dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 		  struct dlb2_eventdev_port *ev_port,
@@ -3194,8 +3204,11 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 			expected_value = 0;
 
 		pmc.addr = monitor_addr;
-		pmc.val = expected_value;
-		pmc.mask = qe_mask.raw_qe[1];
+		/* store expected value and comparison mask in opaque data */
+		pmc.opaque[CLB_VAL_IDX] = expected_value;
+		pmc.opaque[CLB_MASK_IDX] = qe_mask.raw_qe[1];
+		/* set up callback */
+		pmc.fn = dlb2_monitor_callback;
 		pmc.size = sizeof(uint64_t);
 
 		rte_power_monitor(&pmc, timeout + start_ticks);
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 6c58decece..081682f88b 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -81,6 +81,18 @@
 #define I40E_TX_OFFLOAD_SIMPLE_NOTSUP_MASK \
 		(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_SIMPLE_SUP_MASK)
 
+static int
+i40e_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -93,12 +105,8 @@ i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = i40e_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index 0361af0d85..7ed196ec22 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -57,6 +57,18 @@ iavf_proto_xtr_type_to_rxdid(uint8_t flex_type)
 				rxdid_map[flex_type] : IAVF_RXDID_COMMS_OVS_1;
 }
 
+static int
+iavf_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -69,12 +81,8 @@ iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = iavf_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
index fc9bb5a3e7..d12437d19d 100644
--- a/drivers/net/ice/ice_rxtx.c
+++ b/drivers/net/ice/ice_rxtx.c
@@ -27,6 +27,18 @@ uint64_t rte_net_ice_dynflag_proto_xtr_ipv6_flow_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_tcp_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_ip_offset_mask;
 
+static int
+ice_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -39,12 +51,8 @@ ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.status_error0;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
-	pmc->mask = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/* comparison callback */
+	pmc->fn = ice_monitor_callback;
 
 	/* register is 16-bit */
 	pmc->size = sizeof(uint16_t);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index d69f36e977..c814a28cb4 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1369,6 +1369,18 @@ const uint32_t
 		RTE_PTYPE_INNER_L3_IPV4_EXT | RTE_PTYPE_INNER_L4_UDP,
 };
 
+static int
+ixgbe_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -1381,12 +1393,8 @@ ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.upper.status_error;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
-	pmc->mask = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/* comparison callback */
+	pmc->fn = ixgbe_monitor_callback;
 
 	/* the registers are 32-bit */
 	pmc->size = sizeof(uint32_t);
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 777a1d6e45..17370b77dc 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -269,6 +269,18 @@ mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return rx_queue_count(rxq);
 }
 
+#define CLB_VAL_IDX 0
+#define CLB_MSK_IDX 1
+static int
+mlx_monitor_callback(const uint64_t value,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
+{
+	const uint64_t m = opaque[CLB_MSK_IDX];
+	const uint64_t v = opaque[CLB_VAL_IDX];
+
+	return (value & m) == v ? -1 : 0;
+}
+
 int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
 	struct mlx5_rxq_data *rxq = rx_queue;
@@ -282,8 +294,9 @@ int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 		return -rte_errno;
 	}
 	pmc->addr = &cqe->op_own;
-	pmc->val =  !!idx;
-	pmc->mask = MLX5_CQE_OWNER_MASK;
+	pmc->opaque[CLB_VAL_IDX] = !!idx;
+	pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_OWNER_MASK;
+	pmc->fn = mlx_monitor_callback;
 	pmc->size = sizeof(uint8_t);
 	return 0;
 }
diff --git a/lib/eal/include/generic/rte_power_intrinsics.h b/lib/eal/include/generic/rte_power_intrinsics.h
index dddca3d41c..c9aa52a86d 100644
--- a/lib/eal/include/generic/rte_power_intrinsics.h
+++ b/lib/eal/include/generic/rte_power_intrinsics.h
@@ -18,19 +18,38 @@
  * which are architecture-dependent.
  */
 
+/** Size of the opaque data in monitor condition */
+#define RTE_POWER_MONITOR_OPAQUE_SZ 4
+
+/**
+ * Callback definition for monitoring conditions. Callbacks with this signature
+ * will be used by `rte_power_monitor()` to check if the entering of power
+ * optimized state should be aborted.
+ *
+ * @param val
+ *   The value read from memory.
+ * @param opaque
+ *   Callback-specific data.
+ *
+ * @return
+ *   0 if entering of power optimized state should proceed
+ *   -1 if entering of power optimized state should be aborted
+ */
+typedef int (*rte_power_monitor_clb_t)(const uint64_t val,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ]);
 struct rte_power_monitor_cond {
 	volatile void *addr;  /**< Address to monitor for changes */
-	uint64_t val;         /**< If the `mask` is non-zero, location pointed
-	                       *   to by `addr` will be read and compared
-	                       *   against this value.
-	                       */
-	uint64_t mask;   /**< 64-bit mask to extract value read from `addr` */
-	uint8_t size;    /**< Data size (in bytes) that will be used to compare
-	                  *   expected value (`val`) with data read from the
+	uint8_t size;    /**< Data size (in bytes) that will be read from the
 	                  *   monitored memory location (`addr`). Can be 1, 2,
 	                  *   4, or 8. Supplying any other value will result in
 	                  *   an error.
 	                  */
+	rte_power_monitor_clb_t fn; /**< Callback to be used to check if
+	                             *   entering power optimized state should
+	                             *   be aborted.
+	                             */
+	uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ];
+	/**< Callback-specific data */
 };
 
 /**
diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 39ea9fdecd..66fea28897 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -76,6 +76,7 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
 	const unsigned int lcore_id = rte_lcore_id();
 	struct power_wait_status *s;
+	uint64_t cur_value;
 
 	/* prevent user from running this instruction if it's not supported */
 	if (!wait_supported)
@@ -91,6 +92,9 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (__check_val_size(pmc->size) < 0)
 		return -EINVAL;
 
+	if (pmc->fn == NULL)
+		return -EINVAL;
+
 	s = &wait_status[lcore_id];
 
 	/* update sleep address */
@@ -110,16 +114,11 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
 
-	/* if we have a comparison mask, we might not need to sleep at all */
-	if (pmc->mask) {
-		const uint64_t cur_value = __get_umwait_val(
-				pmc->addr, pmc->size);
-		const uint64_t masked = cur_value & pmc->mask;
+	cur_value = __get_umwait_val(pmc->addr, pmc->size);
 
-		/* if the masked value is already matching, abort */
-		if (masked == pmc->val)
-			goto end;
-	}
+	/* check if callback indicates we should abort */
+	if (pmc->fn(cur_value, pmc->opaque) != 0)
+		goto end;
 
 	/* execute UMWAIT */
 	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
-- 
2.25.1


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v8 2/2] bus/auxiliary: introduce auxiliary bus
  2021-07-05 14:57  0%         ` Thomas Monjalon
@ 2021-07-05 15:06  0%           ` Andrew Rybchenko
  0 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2021-07-05 15:06 UTC (permalink / raw)
  To: Thomas Monjalon, Xueming(Steven) Li, techboard
  Cc: dev, Wang Haiyue, Kinsella Ray, Parav Pandit, david.marchand

On 7/5/21 5:57 PM, Thomas Monjalon wrote:
> 05/07/2021 11:35, Andrew Rybchenko:
>> On 7/5/21 12:30 PM, Xueming(Steven) Li wrote:
>>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>>> I still don't understand if we really need to make the API a part of stable API/ABI in the future. Can it be internal?
>>>
>>> There was some discussion on this with Thomas in earlier version.
>>> Users might want to register/unregister their own PMD driver,
>>> Is this a valid scenario?
>>
>> Yes, it is true, but should DPDK care that much about
>> out-of-tree drivers? I'm just asking since I don't know
>> the techboard position on it.
> 
> I think there is a consensus to allow out-of-tree drivers
> without any compatibility commitment.
> 
> Some other bus drivers are exporting some API like in this patch.
> We could discuss again in techboard what to make internal.
> If it is decided to hide buses API, we could change all bus drivers
> later in DPDK 21.11.

OK, thanks.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v8 2/2] bus/auxiliary: introduce auxiliary bus
  2021-07-05  9:35  0%       ` Andrew Rybchenko
@ 2021-07-05 14:57  0%         ` Thomas Monjalon
  2021-07-05 15:06  0%           ` Andrew Rybchenko
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2021-07-05 14:57 UTC (permalink / raw)
  To: Xueming(Steven) Li, Andrew Rybchenko, techboard
  Cc: dev, Wang Haiyue, Kinsella Ray, Parav Pandit, david.marchand

05/07/2021 11:35, Andrew Rybchenko:
> On 7/5/21 12:30 PM, Xueming(Steven) Li wrote:
> > From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> >> I still don't understand if we really need to make the API a part of stable API/ABI in the future. Can it be internal?
> > 
> > There was some discussion on this with Thomas in earlier version.
> > Users might want to register/unregister their own PMD driver,
> > Is this a valid scenario?
> 
> Yes, it is true, but should DPDK care that much about
> out-of-tree drivers? I'm just asking since I don't know
> the techboard position on it.

I think there is a consensus to allow out-of-tree drivers
without any compatibility commitment.

Some other bus drivers are exporting some API like in this patch.
We could discuss again in techboard what to make internal.
If it is decided to hide buses API, we could change all bus drivers
later in DPDK 21.11.




^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH 21.11] telemetry: remove experimental tags from APIs
  @ 2021-07-05 10:58  3%   ` Bruce Richardson
  0 siblings, 0 replies; 200+ results
From: Bruce Richardson @ 2021-07-05 10:58 UTC (permalink / raw)
  To: Power, Ciara; +Cc: dev, Ray Kinsella

On Mon, Jul 05, 2021 at 11:09:38AM +0100, Power, Ciara wrote:
> 
> 
> >-----Original Message-----
> >From: Richardson, Bruce <bruce.richardson@intel.com>
> >Sent: Friday 2 July 2021 16:23
> >To: dev@dpdk.org
> >Cc: Ray Kinsella <mdr@ashroe.eu>; Power, Ciara <ciara.power@intel.com>;
> >Richardson, Bruce <bruce.richardson@intel.com>
> >Subject: [PATCH 21.11] telemetry: remove experimental tags from APIs
> >
> >The telemetry APIs have been present and unchanged for >1 year now, so
> >remove experimental tag from them.
> >
> >Signed-off-by: Bruce Richardson <bruce.richardson@intel.com>
> >---
> > lib/telemetry/rte_telemetry.h | 18 ------------------
> > lib/telemetry/version.map     |  2 +-
> > 2 files changed, 1 insertion(+), 19 deletions(-)
> >
> <snip>
> 
> Hi Bruce,
> 
> +1 for this change.
> 
> I think there are some experimental tags missing from this patch - the legacy telemetry functions that are in "metrics/rte_metrics_telemetry.h" currently have the tags too.

I'm not sure about making those part of the stable ABI.

> Also, there is a reference to the library being experimental in the Telemetry User Guide doc.
> 
I missed checking the "howto" doc on telemetry, yes. I'll include that in a
v2.


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] dmadev: introduce DMA device library
  2021-07-04  9:30  3% ` Jerin Jacob
@ 2021-07-05 10:52  0%   ` Bruce Richardson
  2021-07-05 15:55  0%     ` Jerin Jacob
  0 siblings, 1 reply; 200+ results
From: Bruce Richardson @ 2021-07-05 10:52 UTC (permalink / raw)
  To: Jerin Jacob
  Cc: Chengwen Feng, Thomas Monjalon, Ferruh Yigit, Jerin Jacob,
	dpdk-dev, Morten Brørup, Nipun Gupta, Hemant Agrawal,
	Maxime Coquelin, Honnappa Nagarahalli, David Marchand,
	Satananda Burla, Prasun Kapoor, Ananyev, Konstantin, liangma,
	Radha Mohan Chintakuntla

On Sun, Jul 04, 2021 at 03:00:30PM +0530, Jerin Jacob wrote:
> On Fri, Jul 2, 2021 at 6:51 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
> >
> > This patch introduces 'dmadevice' which is a generic type of DMA
> > device.
> >
> > The APIs of dmadev library exposes some generic operations which can
> > enable configuration and I/O with the DMA devices.
> >
> > Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>
> 
> Thanks for v1.
> 
> I would suggest finalizing lib/dmadev/rte_dmadev.h before doing the
> implementation so that you don't need
> to waste time on reworking the implementation.
> 

I actually like having the .c file available too. Before we lock down the
.h file and the API, I want to verify the performance of our drivers with
the implementation, and having a working .c file is obviously necessary for
that. So I appreciate having it as part of the RFC.

> Comments inline.
> 
> > ---
<snip>
> > + *
> > + * The DMA framework is built on the following abstraction model:
> > + *
> > + *     ------------    ------------
> > + *     |virt-queue|    |virt-queue|
> > + *     ------------    ------------
> > + *            \           /
> > + *             \         /
> > + *              \       /
> > + *            ------------     ------------
> > + *            | HW-queue |     | HW-queue |
> > + *            ------------     ------------
> > + *                   \            /
> > + *                    \          /
> > + *                     \        /
> > + *                     ----------
> > + *                     | dmadev |
> > + *                     ----------
> 
> Continuing the discussion with @Morten Brørup , I think, we need to
> finalize the model.
> 

+1 and the terminology with regards to queues and channels. With our ioat
hardware, each HW queue was called a channel for instance.

> > + *   a) The DMA operation request must be submitted to the virt queue, virt
> > + *      queues must be created based on HW queues, the DMA device could have
> > + *      multiple HW queues.
> > + *   b) The virt queues on the same HW-queue could represent different contexts,
> > + *      e.g. user could create virt-queue-0 on HW-queue-0 for mem-to-mem
> > + *      transfer scenario, and create virt-queue-1 on the same HW-queue for
> > + *      mem-to-dev transfer scenario.
> > + *   NOTE: user could also create multiple virt queues for mem-to-mem transfer
> > + *         scenario as long as the corresponding driver supports.
> > + *
> > + * The control plane APIs include configure/queue_setup/queue_release/start/
> > + * stop/reset/close, in order to start device work, the call sequence must be
> > + * as follows:
> > + *     - rte_dmadev_configure()
> > + *     - rte_dmadev_queue_setup()
> > + *     - rte_dmadev_start()
> 
> Please add reconfigure behaviour etc, Please check the
> lib/regexdev/rte_regexdev.h
> introduction. I have added similar ones so you could reuse as much as possible.
> 
> 
> > + * The dataplane APIs include two parts:
> > + *   a) The first part is the submission of operation requests:
> > + *        - rte_dmadev_copy()
> > + *        - rte_dmadev_copy_sg() - scatter-gather form of copy
> > + *        - rte_dmadev_fill()
> > + *        - rte_dmadev_fill_sg() - scatter-gather form of fill
> > + *        - rte_dmadev_fence()   - add a fence force ordering between operations
> > + *        - rte_dmadev_perform() - issue doorbell to hardware
> > + *      These APIs could work with different virt queues which have different
> > + *      contexts.
> > + *      The first four APIs are used to submit the operation request to the virt
> > + *      queue, if the submission is successful, a cookie (as type
> > + *      'dma_cookie_t') is returned, otherwise a negative number is returned.
> > + *   b) The second part is to obtain the result of requests:
> > + *        - rte_dmadev_completed()
> > + *            - return the number of operation requests completed successfully.
> > + *        - rte_dmadev_completed_fails()
> > + *            - return the number of operation requests failed to complete.
> > + *
> > + * The misc APIs include info_get/queue_info_get/stats/xstats/selftest, provide
> > + * information query and self-test capabilities.
> > + *
> > + * About the dataplane APIs MT-safe, there are two dimensions:
> > + *   a) For one virt queue, the submit/completion API could be MT-safe,
> > + *      e.g. one thread do submit operation, another thread do completion
> > + *      operation.
> > + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_VQ.
> > + *      If driver don't support it, it's up to the application to guarantee
> > + *      MT-safe.
> > + *   b) For multiple virt queues on the same HW queue, e.g. one thread do
> > + *      operation on virt-queue-0, another thread do operation on virt-queue-1.
> > + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_MVQ.
> > + *      If driver don't support it, it's up to the application to guarantee
> > + *      MT-safe.
> 
> From an application PoV, this may make it hard to write portable
> applications. Please check the
> latest thread with @Morten Brørup.
> 
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <rte_common.h>
> > +#include <rte_memory.h>
> > +#include <rte_errno.h>
> > +#include <rte_compat.h>
> 
> Sort in alphabetical order.
> 
> > +
> > +/**
> > + * dma_cookie_t - an opaque DMA cookie
> 
> Since we are defining the behaviour, it is not opaque any more.
> I think it is better to call it ring_idx or so.
> 

+1 for ring index. We don't need a separate type for it though, just
document the index as an unsigned return value.

> 
> > +#define RTE_DMA_DEV_CAPA_MT_MVQ (1ull << 11) /**< Support MT-safe of multiple virt queues */
> 
> Please add @see references for all symbols where they are being used, so that one
> can understand the full scope of the
> symbols. See the example below.
> 
> #define RTE_REGEXDEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
> /**< RegEx device does support compiling the rules at runtime unlike
>  * loading only the pre-built rule database using
>  * struct rte_regexdev_config::rule_db in rte_regexdev_configure()
>  *
>  * @see struct rte_regexdev_config::rule_db, rte_regexdev_configure()
>  * @see struct rte_regexdev_info::regexdev_capa
>  */
> 
> > + *
> > + * If dma_cookie_t is >=0 it's a DMA operation request cookie, <0 it's a error
> > + * code.
> > + * When using cookies, comply with the following rules:
> > + * a) Cookies for each virtual queue are independent.
> > + * b) For a virt queue, the cookie are monotonically incremented, when it reach
> > + *    the INT_MAX, it wraps back to zero.

I disagree with the INT_MAX (or INT32_MAX) value here. If we use that
value, it means that we cannot use implicit wrap-around inside the CPU and
have to check for the INT_MAX value. Better to:
1. Specify that it wraps at UINT16_MAX which allows us to just use a
uint16_t internally and wrap-around automatically, or:
2. Specify that it wraps at a power-of-2 value >= UINT16_MAX, giving
drivers the flexibility at what value to wrap around.
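
To illustrate option 1, here is a minimal sketch of a 16-bit ring index that
relies on implicit unsigned wrap-around. The names `ring_idx_state`,
`ring_idx_alloc` and `ring_idx_distance` are hypothetical and are not part of
the proposed dmadev API.

```c
#include <stdint.h>

/* Hypothetical per-virt-queue state: the next ring index to hand out. */
struct ring_idx_state {
	uint16_t next;
};

/*
 * Return the index for a newly enqueued operation. The uint16_t counter
 * wraps from UINT16_MAX back to 0 on its own, so no explicit comparison
 * against a limit is needed.
 */
static inline uint16_t
ring_idx_alloc(struct ring_idx_state *s)
{
	return s->next++;
}

/*
 * Number of index steps from 'older' to 'newer'. The cast back to uint16_t
 * keeps the subtraction modulo 2^16, so the result stays correct when
 * 'newer' has already wrapped past zero.
 */
static inline uint16_t
ring_idx_distance(uint16_t older, uint16_t newer)
{
	return (uint16_t)(newer - older);
}
```

With a wrap point of INT_MAX, by contrast, every allocation would need an
explicit check and reset, which is the extra cost described above.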

> > + * c) The initial cookie of a virt queue is zero, after the device is stopped or
> > + *    reset, the virt queue's cookie needs to be reset to zero.
> > + * Example:
> > + *    step-1: start one dmadev
> > + *    step-2: enqueue a copy operation, the cookie return is 0
> > + *    step-3: enqueue a copy operation again, the cookie return is 1
> > + *    ...
> > + *    step-101: stop the dmadev
> > + *    step-102: start the dmadev
> > + *    step-103: enqueue a copy operation, the cookie return is 0
> > + *    ...
> > + */
> 
> Good explanation.
> 
> > +typedef int32_t dma_cookie_t;
> 

As I mentioned before, I'd just remove this, and use regular int types,
with "ring_idx" as the name.

> 
> > +
> > +/**
> > + * dma_scatterlist - can hold scatter DMA operation request
> > + */
> > +struct dma_scatterlist {
> 
> I prefer to change scatterlist -> sg
> i.e rte_dma_sg
> 
> > +       void *src;
> > +       void *dst;
> > +       uint32_t length;
> > +};
> > +
> 
> > +
> > +/**
> > + * A structure used to retrieve the contextual information of
> > + * an DMA device
> > + */
> > +struct rte_dmadev_info {
> > +       /**
> > +        * Fields filled by framewok
> 
> typo.
> 
> > +        */
> > +       struct rte_device *device; /**< Generic Device information */
> > +       const char *driver_name; /**< Device driver name */
> > +       int socket_id; /**< Socket ID where memory is allocated */
> > +
> > +       /**
> > +        * Specification fields filled by driver
> > +        */
> > +       uint64_t dev_capa; /**< Device capabilities (RTE_DMA_DEV_CAPA_) */
> > +       uint16_t max_hw_queues; /**< Maximum number of HW queues. */
> > +       uint16_t max_vqs_per_hw_queue;
> > +       /**< Maximum number of virt queues to allocate per HW queue */
> > +       uint16_t max_desc;
> > +       /**< Maximum allowed number of virt queue descriptors */
> > +       uint16_t min_desc;
> > +       /**< Minimum allowed number of virt queue descriptors */
> 
> Please add max_nb_segs. i.e maximum number of segments supported.
> 
> > +
> > +       /**
> > +        * Status fields filled by driver
> > +        */
> > +       uint16_t nb_hw_queues; /**< Number of HW queues configured */
> > +       uint16_t nb_vqs; /**< Number of virt queues configured */
> > +};
> > + i
> > +
> > +/**
> > + * dma_address_type
> > + */
> > +enum dma_address_type {
> > +       DMA_ADDRESS_TYPE_IOVA, /**< Use IOVA as dma address */
> > +       DMA_ADDRESS_TYPE_VA, /**< Use VA as dma address */
> > +};
> > +
> > +/**
> > + * A structure used to configure a DMA device.
> > + */
> > +struct rte_dmadev_conf {
> > +       enum dma_address_type addr_type; /**< Address type to used */
> 
> I think, there are 3 kinds of limitations/capabilities.
> 
> When the system is configured as IOVA as VA
> 1) Device supports any VA address, like memory from rte_malloc(),
> rte_memzone(), malloc, or stack memory
> 2) Device supports only VA addresses from rte_malloc(), rte_memzone(), i.e.
> memory backed by hugepages and added to the DMA map.
> 
> When the system is configured as IOVA as PA
> 1) Devices support only PA addresses.
> 
> IMO, the above needs to be advertised as a capability and the application needs
> to align with that,
> and I don't think the application requests the driver to work in any of the modes.
> 
> 

I don't think we need this level of detail for addressing capabilities.
Unless I'm missing something, the hardware should behave exactly as other
hardware does taking in iova's.  If the user wants to check whether virtual
addresses to pinned memory can be used directly, the user can call
"rte_eal_iova_mode". We can't have a situation where some hardware uses one
type of addresses and another hardware the other.

Therefore, the only additional addressing capability we should need to
report is that the hardware can use SVM/SVA and use virtual addresses not
in hugepage memory.

> 
> > +       uint16_t nb_hw_queues; /**< Number of HW-queues enable to use */
> > +       uint16_t max_vqs; /**< Maximum number of virt queues to use */
> 
> You need to document what the max value allowed is, etc., i.e. it is based on
> info_get(), and mention the field
> in the info structure.
> 
> 
> > +
> > +/**
> > + * dma_transfer_direction
> > + */
> > +enum dma_transfer_direction {
> 
> rte_dma_transfer_direction
> 
> > +       DMA_MEM_TO_MEM,
> > +       DMA_MEM_TO_DEV,
> > +       DMA_DEV_TO_MEM,
> > +       DMA_DEV_TO_DEV,
> > +};
> > +
> > +/**
> > + * A structure used to configure a DMA virt queue.
> > + */
> > +struct rte_dmadev_queue_conf {
> > +       enum dma_transfer_direction direction;
> 
> 
> > +       /**< Associated transfer direction */
> > +       uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> > +       uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> > +       uint64_t dev_flags; /**< Device specific flags */
> 
> Use of this? Need more comments on this.
> Since it is in slowpath, We can have non opaque names here based on
> each driver capability.
> 
> 
> > +       void *dev_ctx; /**< Device specific context */
> 
> Use of this? Need more comments on this.
> 

I think this should be dropped. We should not have any opaque
device-specific info in these structs, rather if a particular device needs
parameters we should call them out. Drivers for which it's not relevant can
ignore them (and report same in capability if necessary). Since this is not
a dataplane API, we aren't concerned too much about perf and can size the
struct appropriately.

> 
> Please add some good amount of reserved bits and have API to init this
> structure for future ABI stability, say rte_dmadev_queue_config_init()
> or so.
> 

I don't think that is necessary. Since the config struct is used only as
parameter to the config function, any changes to it can be managed by
versioning that single function. Padding would only be necessary if we had
an array of these config structs somewhere.

> 
> > +
> > +/**
> > + * A structure used to retrieve information of a DMA virt queue.
> > + */
> > +struct rte_dmadev_queue_info {
> > +       enum dma_transfer_direction direction;
> 
> A queue may support all directions so I think it should be a bitfield.
> 
> > +       /**< Associated transfer direction */
> > +       uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> > +       uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> > +       uint64_t dev_flags; /**< Device specific flags */
> > +};
> > +
> 
> > +__rte_experimental
> > +static inline dma_cookie_t
> > +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id,
> > +                  const struct dma_scatterlist *sg,
> > +                  uint32_t sg_len, uint64_t flags)
> 
> I would like to change this as:
> rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id, const struct
> rte_dma_sg *src, uint32_t nb_src,
> const struct rte_dma_sg *dst, uint32_t nb_dst) or so, to allow use cases like
> a 30 MB src copy being split and written as 30 x 1 MB dst.
> 
> 
> 
> > +{
> > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +       return (*dev->copy_sg)(dev, vq_id, sg, sg_len, flags);
> > +}
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Enqueue a fill operation onto the DMA virt queue
> > + *
> > + * This queues up a fill operation to be performed by hardware, but does not
> > + * trigger hardware to begin that operation.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vq_id
> > + *   The identifier of virt queue.
> > + * @param pattern
> > + *   The pattern to populate the destination buffer with.
> > + * @param dst
> > + *   The address of the destination buffer.
> > + * @param length
> > + *   The length of the destination buffer.
> > + * @param flags
> > + *   An opaque flags for this operation.
> 
> PLEASE REMOVE opaque stuff from fastpath; it will be a pain for
> application writers as they need to write multiple combinations of
> fastpath. Flags are OK if we have a valid generic flag now to control
> the transfer behavior.
> 

+1. Flags need to be explicitly listed. If we don't have any flags for now,
we can specify that the value must be given as zero and it's for future
use.
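
To make the "must be zero" rule concrete, here is a minimal sketch of the
check a driver or the library could apply. The names DMA_OP_FLAGS_SUPPORTED
and dma_op_flags_check are hypothetical and are not part of the patch; the
assumption is simply that no fast-path flags are defined yet.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical: no fast-path flags exist yet, so every bit is reserved. */
#define DMA_OP_FLAGS_SUPPORTED UINT64_C(0)

/* Return 0 if flags are acceptable, -EINVAL if any reserved bit is set. */
static inline int
dma_op_flags_check(uint64_t flags)
{
	if ((flags & ~DMA_OP_FLAGS_SUPPORTED) != 0)
		return -EINVAL;
	return 0;
}
```

When a generic flag is added later, only the mask changes, and applications
that always passed zero keep working.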

> 
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Add a fence to force ordering between operations
> > + *
> > + * This adds a fence to a sequence of operations to enforce ordering, such that
> > + * all operations enqueued before the fence must be completed before operations
> > + * after the fence.
> > + * NOTE: Since this fence may be added as a flag to the last operation enqueued,
> > + * this API may not function correctly when called immediately after an
> > + * "rte_dmadev_perform" call i.e. before any new operations are enqueued.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vq_id
> > + *   The identifier of virt queue.
> > + *
> > + * @return
> > + *   - =0: Successful add fence.
> > + *   - <0: Failure to add fence.
> > + *
> > + * NOTE: The caller must ensure that the input parameter is valid and the
> > + *       corresponding device supports the operation.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_dmadev_fence(uint16_t dev_id, uint16_t vq_id)
> > +{
> > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +       return (*dev->fence)(dev, vq_id);
> > +}
> 
> Since HW submission is in a queue(FIFO) the ordering is always
> maintained. Right?
> Could you share more details and use case of fence() from
> driver/application PoV?
> 

There are different kinds of ordering to consider: ordering of completions
and ordering of operations. While jobs are reported as completed to the
user in order, for performance, hardware may overlap individual jobs within
a burst (or even across bursts). Therefore, we need a fence operation to
inform hardware that one job should not be started until the other has
fully completed.
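
As a rough illustration of what the fence means for ordering of operations,
the sketch below assigns each enqueued op to an "overlap group". This is a
toy model only: struct toy_op and toy_assign_groups are made-up names, and
real hardware does not expose anything like a group number.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical descriptor: only the field needed for the illustration. */
struct toy_op {
	bool fence_after; /* a fence was added right after this op */
};

/*
 * Assign each op an overlap group. Ops in the same group may be executed
 * concurrently by hardware; an op in group N+1 must not start until every
 * op in group N has fully completed.
 */
static void
toy_assign_groups(const struct toy_op *ops, size_t n, unsigned int *group)
{
	unsigned int cur = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		group[i] = cur;
		if (ops[i].fence_after)
			cur++;
	}
}
```

Without any fence, all ops land in group 0 and may overlap freely, even
though their completions are still reported in order.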

> 
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Trigger hardware to begin performing enqueued operations
> > + *
> > + * This API is used to write the "doorbell" to the hardware to trigger it
> > + * to begin the operations previously enqueued by rte_dmadev_copy/fill()
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vq_id
> > + *   The identifier of virt queue.
> > + *
> > + * @return
> > + *   - =0: Successful trigger hardware.
> > + *   - <0: Failure to trigger hardware.
> > + *
> > + * NOTE: The caller must ensure that the input parameter is valid and the
> > + *       corresponding device supports the operation.
> > + */
> > +__rte_experimental
> > +static inline int
> > +rte_dmadev_perform(uint16_t dev_id, uint16_t vq_id)
> > +{
> > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +       return (*dev->perform)(dev, vq_id);
> > +}
> 
> Since we have additional function call overhead in all the
> applications with this scheme, I would like to understand
> the benefit of doing it this way vs. having enq ring the doorbell
> implicitly, from driver/application PoV.
> 

In our benchmarks it's just faster. When we tested it, the overhead of the
function calls was noticeably less than the cost of building up the
parameter array(s) for passing the jobs in as a burst. [We don't see this
cost with things like NIC I/O since DPDK tends to already have the mbuf
fully populated before the TX call anyway.]
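
For illustration, the shape of the scheme is roughly as below. This is a toy
model with made-up names (struct toy_queue, toy_copy, toy_perform), not the
code from the patch or from any driver, and it ignores ring-full handling.
Each enqueue call writes one descriptor directly from the caller's values,
so the application never has to assemble a parameter array, and the doorbell
is written once per burst.

```c
#include <stdint.h>

#define TOY_RING_SIZE 64 /* hypothetical ring size, power of two */

struct toy_queue {
	uint64_t src[TOY_RING_SIZE];
	uint64_t dst[TOY_RING_SIZE];
	uint32_t len[TOY_RING_SIZE];
	uint16_t next_write;     /* index of the next descriptor to fill */
	uint16_t last_submitted; /* value last written to the "doorbell" */
	unsigned int doorbells;  /* doorbell writes, counted for illustration */
};

/* Write one descriptor; hardware is not notified. Returns the op index. */
static inline uint16_t
toy_copy(struct toy_queue *q, uint64_t src, uint64_t dst, uint32_t len)
{
	uint16_t idx = q->next_write++;
	uint16_t slot = idx & (TOY_RING_SIZE - 1);

	q->src[slot] = src;
	q->dst[slot] = dst;
	q->len[slot] = len;
	return idx;
}

/* Ring the doorbell once for everything enqueued since the last call. */
static inline void
toy_perform(struct toy_queue *q)
{
	if (q->last_submitted == q->next_write)
		return; /* nothing new to submit */
	q->last_submitted = q->next_write;
	q->doorbells++;
}
```

An application loop would call toy_copy() once per job as it discovers the
work and toy_perform() once at the end of the burst.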

> 
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Returns the number of operations that have been successful completed.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vq_id
> > + *   The identifier of virt queue.
> > + * @param nb_cpls
> > + *   The maximum number of completed operations that can be processed.
> > + * @param[out] cookie
> > + *   The last completed operation's cookie.
> > + * @param[out] has_error
> > + *   Indicates if there are transfer error.
> > + *
> > + * @return
> > + *   The number of operations that successful completed.
> 
> successfully
> 
> > + *
> > + * NOTE: The caller must ensure that the input parameter is valid and the
> > + *       corresponding device supports the operation.
> > + */
> > +__rte_experimental
> > +static inline uint16_t
> > +rte_dmadev_completed(uint16_t dev_id, uint16_t vq_id, const uint16_t nb_cpls,
> > +                    dma_cookie_t *cookie, bool *has_error)
> > +{
> > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +       *has_error = false;
> > +       return (*dev->completed)(dev, vq_id, nb_cpls, cookie, has_error);
> 
> It may be better to have cookie/ring_idx as third argument.
> 

No strong opinions here, but having it as in the code above means all
input parameters come before all output, which makes sense to me.

> > +}
> > +
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: this API may change without prior notice.
> > + *
> > + * Returns the number of operations that failed to complete.
> > + * NOTE: This API was used when rte_dmadev_completed has_error was set.
> > + *
> > + * @param dev_id
> > + *   The identifier of the device.
> > + * @param vq_id
> > + *   The identifier of virt queue.
> > + * @param nb_status
> > + *   Indicates the size  of status array.
> > + * @param[out] status
> > + *   The error code of operations that failed to complete.
> > + * @param[out] cookie
> > + *   The last failed completed operation's cookie.
> > + *
> > + * @return
> > + *   The number of operations that failed to complete.
> > + *
> > + * NOTE: The caller must ensure that the input parameter is valid and the
> > + *       corresponding device supports the operation.
> > + */
> > +__rte_experimental
> > +static inline uint16_t
> > +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id,
> > +                          const uint16_t nb_status, uint32_t *status,
> > +                          dma_cookie_t *cookie)
> 
> IMO, it is better to move cookie/ring_idx to position 3.
> Why would it return an array of errors, since it is called after
> rte_dmadev_completed() has set has_error? Would it be better to
> change it to
> 
> rte_dmadev_error_status(uint16_t dev_id, uint16_t vq_id, dma_cookie_t
> *cookie, uint32_t *status)
> 
> I also think we may need to make status a bitmask, enumerate all
> the combinations of error codes of all the drivers, and return a
> string from the driver, like the existing rte_flow_error.
> 
> See
> struct rte_flow_error {
>         enum rte_flow_error_type type; /**< Cause field and error types. */
>         const void *cause; /**< Object responsible for the error. */
>         const char *message; /**< Human-readable error message. */
> };
> 

I think we need a multi-return value API here, as we may add operations in
future which have non-error status values to return. The obvious case is
DMA engines which support "compare" operations. In that case a successful
compare (as in there were no DMA or HW errors) can return "equal" or
"not-equal" as statuses. For general "copy" operations, the faster
completion op can be used to just return successful values (and only call
this status version on error), while apps using those compare ops or a
mixture of copy and compare ops, would always use the slower one that
returns status values for each and every op.

The ioat APIs used 32-bit integer values for this status array so as to
allow e.g. 16-bits for error code and 16-bits for future status values. For
most operations there should be a fairly small set of things that can go
wrong, i.e. bad source address, bad destination address or invalid length.
Within that we may have a couple of specifics for why an address is bad,
but even so I don't think we need to start having multiple bit
combinations.
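
A minimal sketch of how such a 32-bit status word could be split, purely for
illustration. Which half holds which field, and all the toy_* names and enum
values, are assumptions made here; they are not taken from the ioat APIs or
from the patch.

```c
#include <stdint.h>

/* Hypothetical layout: low 16 bits = error code, high 16 bits = status. */
enum toy_dma_err {
	TOY_ERR_NONE = 0,
	TOY_ERR_BAD_SRC,
	TOY_ERR_BAD_DST,
	TOY_ERR_BAD_LEN,
};

/* Non-error results, e.g. for a future "compare" operation. */
enum toy_dma_result {
	TOY_RES_NONE = 0,
	TOY_RES_EQUAL,
	TOY_RES_NOT_EQUAL,
};

static inline uint32_t
toy_status_pack(uint16_t err, uint16_t res)
{
	return ((uint32_t)res << 16) | err;
}

static inline uint16_t
toy_status_err(uint32_t status)
{
	return (uint16_t)(status & 0xffff);
}

static inline uint16_t
toy_status_result(uint32_t status)
{
	return (uint16_t)(status >> 16);
}
```

With a layout like this, a successful compare that found a difference has
an error code of zero and a non-zero result, so "no error" and "not equal"
stay distinguishable without needing bit combinations.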

> > +{
> > +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> > +       return (*dev->completed_fails)(dev, vq_id, nb_status, status, cookie);
> > +}
> > +
> > +struct rte_dmadev_stats {
> > +       uint64_t enqueue_fail_count;
> > +       /**< Count of all operations which failed enqueued */
> > +       uint64_t enqueued_count;
> > +       /**< Count of all operations which successful enqueued */
> > +       uint64_t completed_fail_count;
> > +       /**< Count of all operations which failed to complete */
> > +       uint64_t completed_count;
> > +       /**< Count of all operations which successful complete */
> > +};
> 
> We need to have capability API to tell which items are
> updated/supported by the driver.
> 

I also would remove the enqueue fail counts, since they are better counted
by the app. If a driver reports 20,000 failures we have no way of knowing
if that is 20,000 unique operations which failed to enqueue or a single
operation which failed to enqueue 20,000 times but succeeded on attempt
20,001.
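
To show why the app is better placed to count these, here is a sketch of an
application-side retry wrapper that keeps the two numbers apart. Everything
in it is hypothetical (the toy_* names, the callback type and the flaky
backend used to exercise it); it is not proposed API.

```c
#include <stdint.h>

struct toy_app_stats {
	uint64_t failed_attempts; /* every enqueue call that failed */
	uint64_t ops_failed_once; /* unique ops that failed at least once */
	uint64_t ops_enqueued;    /* ops that were eventually enqueued */
};

/* Hypothetical enqueue callback: 0 on success, negative on failure. */
typedef int (*toy_enqueue_fn)(void *ctx);

/* Try to enqueue one op up to max_tries times. Returns 0 or -1. */
static int
toy_enqueue_with_retry(toy_enqueue_fn enq, void *ctx,
		       unsigned int max_tries, struct toy_app_stats *st)
{
	unsigned int tries;

	for (tries = 0; tries < max_tries; tries++) {
		if (enq(ctx) == 0) {
			st->ops_enqueued++;
			return 0;
		}
		if (tries == 0)
			st->ops_failed_once++;
		st->failed_attempts++;
	}
	return -1;
}

/* Toy backend: fails until the counter behind ctx reaches zero. */
static int
toy_flaky_enqueue(void *ctx)
{
	unsigned int *remaining = ctx;

	if (*remaining > 0) {
		(*remaining)--;
		return -1;
	}
	return 0;
}
```

A driver-side counter can only see the equivalent of failed_attempts; only
the caller knows whether consecutive failures belong to the same operation.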

> 
> > diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
> > new file mode 100644
> > index 0000000..a3afea2
> > --- /dev/null
> > +++ b/lib/dmadev/rte_dmadev_core.h
> > @@ -0,0 +1,98 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright 2021 HiSilicon Limited.
> > + */
> > +
> > +#ifndef _RTE_DMADEV_CORE_H_
> > +#define _RTE_DMADEV_CORE_H_
> > +
> > +/**
> > + * @file
> > + *
> > + * RTE DMA Device internal header.
> > + *
> > + * This header contains internal data types. But they are still part of the
> > + * public API because they are used by inline public functions.
> > + */
> > +
> > +struct rte_dmadev;
> > +
> > +typedef dma_cookie_t (*dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vq_id,
> > +                                     void *src, void *dst,
> > +                                     uint32_t length, uint64_t flags);
> > +/**< @internal Function used to enqueue a copy operation. */
> 
> To avoid namespace conflict(as it is public API) use rte_
> 
> 
> > +
> > +/**
> > + * The data structure associated with each DMA device.
> > + */
> > +struct rte_dmadev {
> > +       /**< Enqueue a copy operation onto the DMA device. */
> > +       dmadev_copy_t copy;
> > +       /**< Enqueue a scatter list copy operation onto the DMA device. */
> > +       dmadev_copy_sg_t copy_sg;
> > +       /**< Enqueue a fill operation onto the DMA device. */
> > +       dmadev_fill_t fill;
> > +       /**< Enqueue a scatter list fill operation onto the DMA device. */
> > +       dmadev_fill_sg_t fill_sg;
> > +       /**< Add a fence to force ordering between operations. */
> > +       dmadev_fence_t fence;
> > +       /**< Trigger hardware to begin performing enqueued operations. */
> > +       dmadev_perform_t perform;
> > +       /**< Returns the number of operations that successful completed. */
> > +       dmadev_completed_t completed;
> > +       /**< Returns the number of operations that failed to complete. */
> > +       dmadev_completed_fails_t completed_fails;
> 
> We need to limit fastpath items in 1 CL
> 

I don't think that is going to be possible. I also would like to see
numbers to check if we benefit much from having these fastpath ops separate
from the regular ops.

> > +
> > +       void *dev_private; /**< PMD-specific private data */
> > +       const struct rte_dmadev_ops *dev_ops; /**< Functions exported by PMD */
> > +
> > +       uint16_t dev_id; /**< Device ID for this instance */
> > +       int socket_id; /**< Socket ID where memory is allocated */
> > +       struct rte_device *device;
> > +       /**< Device info. supplied during device initialization */
> > +       const char *driver_name; /**< Driver info. supplied by probing */
> > +       char name[RTE_DMADEV_NAME_MAX_LEN]; /**< Device name */
> > +
> > +       RTE_STD_C11
> > +       uint8_t attached : 1; /**< Flag indicating the device is attached */
> > +       uint8_t started : 1; /**< Device state: STARTED(1)/STOPPED(0) */
> 
> Add a couple of reserved fields for future ABI stability.
> 
> > +
> > +} __rte_cache_aligned;
> > +
> > +extern struct rte_dmadev rte_dmadevices[];
> > +

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v8 2/2] bus/auxiliary: introduce auxiliary bus
  2021-07-05  9:30  0%     ` Xueming(Steven) Li
@ 2021-07-05  9:35  0%       ` Andrew Rybchenko
  2021-07-05 14:57  0%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Andrew Rybchenko @ 2021-07-05  9:35 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, Wang Haiyue, NBU-Contact-Thomas Monjalon, Kinsella Ray,
	Parav Pandit, Neil Horman

On 7/5/21 12:30 PM, Xueming(Steven) Li wrote:
> Hi Andrew,
> 
>> -----Original Message-----
>> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>> Sent: Monday, July 5, 2021 5:19 PM
>> To: Xueming(Steven) Li <xuemingl@nvidia.com>
>> Cc: dev@dpdk.org; Wang Haiyue <haiyue.wang@intel.com>; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; Kinsella Ray
>> <mdr@ashroe.eu>; Parav Pandit <parav@nvidia.com>; Neil Horman <nhorman@tuxdriver.com>
>> Subject: Re: [PATCH v8 2/2] bus/auxiliary: introduce auxiliary bus
>>
>> On 7/5/21 9:45 AM, Xueming Li wrote:
>>> Auxiliary bus [1] provides a way to split function into child-devices
>>> representing sub-domains of functionality. Each auxiliary device
>>> represents a part of its parent functionality.
>>>
>>> Auxiliary device is identified by unique device name, sysfs path:
>>>   /sys/bus/auxiliary/devices/<name>
>>>
>>> Devargs legacy syntax of auxiliary device:
>>>   -a auxiliary:<name>[,args...]
>>> Devargs generic syntax of auxiliary device:
>>>   -a bus=auxiliary,name=<name>/class=<class>/driver=<driver>[,args...]
>>>
>>> [1] kernel auxiliary bus document:
>>> https://www.kernel.org/doc/html/latest/driver-api/auxiliary_bus.html
>>>
>>> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
>>> Cc: Wang Haiyue <haiyue.wang@intel.com>
>>> Cc: Thomas Monjalon <thomas@monjalon.net>
>>> Cc: Kinsella Ray <mdr@ashroe.eu>
>>> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
>>
>> I still don't understand if we really need to make the API a part of stable API/ABI in the future. Can it be internal?
> 
> There was some discussion on this with Thomas in an earlier version.
> Users might want to register/unregister their own PMD driver.
> Is this a valid scenario?

Yes, it is true, but should DPDK care that much about
out-of-tree drivers? I'm just asking since I don't know
the techboard position on it.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v8 2/2] bus/auxiliary: introduce auxiliary bus
  2021-07-05  9:19  3%   ` Andrew Rybchenko
@ 2021-07-05  9:30  0%     ` Xueming(Steven) Li
  2021-07-05  9:35  0%       ` Andrew Rybchenko
  0 siblings, 1 reply; 200+ results
From: Xueming(Steven) Li @ 2021-07-05  9:30 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, Wang Haiyue, NBU-Contact-Thomas Monjalon, Kinsella Ray,
	Parav Pandit, Neil Horman

Hi Andrew,

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Monday, July 5, 2021 5:19 PM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: dev@dpdk.org; Wang Haiyue <haiyue.wang@intel.com>; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; Kinsella Ray
> <mdr@ashroe.eu>; Parav Pandit <parav@nvidia.com>; Neil Horman <nhorman@tuxdriver.com>
> Subject: Re: [PATCH v8 2/2] bus/auxiliary: introduce auxiliary bus
> 
> On 7/5/21 9:45 AM, Xueming Li wrote:
> > Auxiliary bus [1] provides a way to split function into child-devices
> > representing sub-domains of functionality. Each auxiliary device
> > represents a part of its parent functionality.
> >
> > Auxiliary device is identified by unique device name, sysfs path:
> >   /sys/bus/auxiliary/devices/<name>
> >
> > Devargs legacy syntax of auxiliary device:
> >   -a auxiliary:<name>[,args...]
> > Devargs generic syntax of auxiliary device:
> >   -a bus=auxiliary,name=<name>/class=<class>/driver=<driver>[,args...]
> >
> > [1] kernel auxiliary bus document:
> > https://www.kernel.org/doc/html/latest/driver-api/auxiliary_bus.html
> >
> > Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> > Cc: Wang Haiyue <haiyue.wang@intel.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>
> > Cc: Kinsella Ray <mdr@ashroe.eu>
> > Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> 
> I still don't understand if we really need to make the API a part of stable API/ABI in the future. Can it be internal?

There was some discussion on this with Thomas in an earlier version. Users might want to register/unregister their own PMD driver.
Is this a valid scenario?

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v8 2/2] bus/auxiliary: introduce auxiliary bus
  @ 2021-07-05  9:19  3%   ` Andrew Rybchenko
  2021-07-05  9:30  0%     ` Xueming(Steven) Li
  0 siblings, 1 reply; 200+ results
From: Andrew Rybchenko @ 2021-07-05  9:19 UTC (permalink / raw)
  To: Xueming Li
  Cc: dev, Wang Haiyue, Thomas Monjalon, Kinsella Ray, Parav Pandit,
	Neil Horman

On 7/5/21 9:45 AM, Xueming Li wrote:
> Auxiliary bus [1] provides a way to split function into child-devices
> representing sub-domains of functionality. Each auxiliary device
> represents a part of its parent functionality.
> 
> Auxiliary device is identified by unique device name, sysfs path:
>   /sys/bus/auxiliary/devices/<name>
> 
> Devargs legacy syntax of auxiliary device:
>   -a auxiliary:<name>[,args...]
> Devargs generic syntax of auxiliary device:
>   -a bus=auxiliary,name=<name>/class=<class>/driver=<driver>[,args...]
> 
> [1] kernel auxiliary bus document:
> https://www.kernel.org/doc/html/latest/driver-api/auxiliary_bus.html
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> Cc: Wang Haiyue <haiyue.wang@intel.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>
> Cc: Kinsella Ray <mdr@ashroe.eu>
> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

I still don't understand if we really need to make the API
a part of stable API/ABI in the future. Can it be internal?

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v3 19/20] net/sfc: support flow action COUNT in transfer rules
  2021-07-04 19:45  3%     ` Thomas Monjalon
@ 2021-07-05  8:41  0%       ` Andrew Rybchenko
  0 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2021-07-05  8:41 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: David Marchand, Bruce Richardson, dev, Igor Romanov,
	Andy Moreton, Ivan Malov

On 7/4/21 10:45 PM, Thomas Monjalon wrote:
> 02/07/2021 14:53, Andrew Rybchenko:
>> On 7/2/21 3:30 PM, Thomas Monjalon wrote:
>>> 02/07/2021 10:43, Andrew Rybchenko:
>>>> On 7/1/21 4:05 PM, Andrew Rybchenko wrote:
>>>>> On 7/1/21 3:34 PM, David Marchand wrote:
>>>>>> On Thu, Jul 1, 2021 at 11:22 AM Andrew Rybchenko
>>>>>> <andrew.rybchenko@oktetlabs.ru> wrote:
>>>>>>> The build works fine for me on FC34, but it has
>>>>>>> libatomic-11.1.1-3.fc34.x86_64 installed.
>>>>>> I first produced the issue on my "old" FC32.
>>>>>> Afaics, for FC33 and later, gcc now depends on libatomic and the
>>>>>> problem won't be noticed.
>>>>>> FC32 and before are EOL, but I then reproduced the issue on RHEL 8
>>>>>> (and Intel CI reported it on Centos 8 too).
>>>>> I see. Thanks for the clarification.
>>>>>
>>>>>>> I'd like to understand what we're trying to solve here.
>>>>>>> Are we trying to make meson to report the missing library
>>>>>>> correctly?
>>>>>>>
>>>>>>> If so, I think I can do simple check using cc.links()
>>>>>>> which will fail if the library is not found. I'll
>>>>>>> test that it works as expected if the library is not
>>>>>>> completely installed.
>>>>>>>
>>>>>> I tried below diff, and it works for me.
>>>>>> "works" as in net/sfc gets disabled without libatomic installed:
>>> [...]
>>>>>>  # for gcc compiles we need -latomic for 128-bit atomic ops
>>>>>>  if cc.get_id() == 'gcc'
>>>>>> +    code = '''#include <stdio.h>
>>>>>> +    void main() { printf("Atomilink me.\n"); }
>>>>>> +    '''
>>>>>> +    if not cc.links(code, args: '-latomic', name: 'libatomic link check')
>>>>>> +        build = false
>>>>>> +        reason = 'missing dependency, "libatomic"'
>>>>>> +        subdir_done()
>>>>>> +    endif
>>>>>>      ext_deps += cc.find_library('atomic')
>>>>>>  endif
>>>>> Many thanks, LGTM. I'll pick it up and add comments why
>>>>> it is checked this way.
>>>>>
>>>> I've sent v4 with the problem fixed. However, I'm afraid
>>>> build test systems should be updated to have libatomic
>>>> correctly installed. Otherwise, they do not really check
>>>> net/sfc build.
>>> When testing on old systems, sfc won't be tested anymore after this patchset.
>>> On recent systems, sfc should be enabled I guess.
>>> I don't see how to manage better, sorry.
>>>
>> I see. I thought that it is possible to install missing
>> package on corresponding systems to make build coverage
>> better.
>>
>> Now I automatically test build on problematic distros
>> with previously missing packages installed. So I have
>> internal build coverage anyway.
> David asked for installing libatomic:
> https://inbox.dpdk.org/ci/CAJFAV8xCNBL4yEZU0c=dJGYS+13QM7Uz7e2qnUkMuM7eaKKw+Q@mail.gmail.com/
>
> We should wait for it to be installed otherwise ABI check will fail.

Yes, I see. Thanks.


^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [RFC PATCH v4 0/3] Add PIE support for HQoS library
  2021-06-21  7:35  3%     ` [dpdk-dev] [RFC PATCH v3 " Liguzinski, WojciechX
@ 2021-07-05  8:04  3%       ` Liguzinski, WojciechX
  0 siblings, 0 replies; 200+ results
From: Liguzinski, WojciechX @ 2021-07-05  8:04 UTC (permalink / raw)
  To: dev, jasvinder.singh, cristian.dumitrescu; +Cc: savinay.dharmappa, megha.ajmera

The DPDK sched library has a mechanism that protects it from the bufferbloat problem,
a situation in which excess buffering in the network causes high latency and latency
variation. Currently, it supports RED for active queue management, which is designed
to control the queue length but does not control latency directly and is now being
obsoleted. However, more advanced queue management is required to address this problem
and provide the desired quality of service to users.

This solution (RFC) proposes the use of a new algorithm called "PIE" (Proportional Integral
controller Enhanced) that can effectively and directly control queuing latency to address
the bufferbloat problem.
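
For readers unfamiliar with PIE, the core of the controller can be sketched as
below. This is a simplified illustration of the basic drop-probability update
described in RFC 8033, without auto-tuning or burst allowance; it is not the code
in this series. The toy_* names are hypothetical and the constants are illustrative
values, with delays expressed in seconds.

```c
/* Illustrative controller gains and target queuing delay (seconds). */
#define TOY_PIE_ALPHA  0.125
#define TOY_PIE_BETA   1.25
#define TOY_PIE_TARGET 0.015

struct toy_pie {
	double drop_prob;  /* current drop probability, kept in [0, 1] */
	double qdelay_old; /* queuing delay seen at the previous update */
};

/*
 * Periodic update: the proportional term reacts to the distance from the
 * target delay, the second term reacts to the trend since the last update.
 */
static void
toy_pie_update(struct toy_pie *pie, double qdelay)
{
	double p = pie->drop_prob
		+ TOY_PIE_ALPHA * (qdelay - TOY_PIE_TARGET)
		+ TOY_PIE_BETA * (qdelay - pie->qdelay_old);

	if (p < 0.0)
		p = 0.0;
	if (p > 1.0)
		p = 1.0;

	pie->drop_prob = p;
	pie->qdelay_old = qdelay;
}
```

Unlike RED, the input here is the queuing delay rather than the queue length,
which is why the latency can be controlled directly.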

The implementation of this functionality modifies existing data structures in the
library, adds a new set of data structures, and adds PIE-related APIs.
This affects structures in the public API/ABI, which is why a deprecation notice is going
to be prepared and sent.

Liguzinski, WojciechX (3):
  sched: add PIE based congestion management
  example/qos_sched: add PIE support
  example/ip_pipeline: add PIE support

 config/rte_config.h                      |   1 -
 drivers/net/softnic/rte_eth_softnic_tm.c |   6 +-
 examples/ip_pipeline/tmgr.c              |   6 +-
 examples/qos_sched/app_thread.c          |   1 -
 examples/qos_sched/cfg_file.c            |  82 ++++-
 examples/qos_sched/init.c                |   7 +-
 examples/qos_sched/profile.cfg           | 196 +++++++----
 lib/sched/meson.build                    |  10 +-
 lib/sched/rte_pie.c                      |  82 +++++
 lib/sched/rte_pie.h                      | 393 +++++++++++++++++++++++
 lib/sched/rte_sched.c                    | 229 +++++++++----
 lib/sched/rte_sched.h                    |  53 ++-
 lib/sched/version.map                    |   3 +
 13 files changed, 888 insertions(+), 181 deletions(-)
 create mode 100644 lib/sched/rte_pie.c
 create mode 100644 lib/sched/rte_pie.h

-- 
2.17.1


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v6 2/2] bus/auxiliary: introduce auxiliary bus
  2021-07-04 16:13  3%   ` Andrew Rybchenko
@ 2021-07-05  5:47  0%     ` Xueming(Steven) Li
  0 siblings, 0 replies; 200+ results
From: Xueming(Steven) Li @ 2021-07-05  5:47 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: dev, Wang Haiyue, NBU-Contact-Thomas Monjalon, Kinsella Ray, Neil Horman

Hi Andrew,

Thanks very much for all the good suggestions; v7 is posted.

Best Regards,
Xueming

> -----Original Message-----
> From: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> Sent: Monday, July 5, 2021 12:13 AM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: dev@dpdk.org; Wang Haiyue <haiyue.wang@intel.com>; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>; Kinsella Ray
> <mdr@ashroe.eu>; Neil Horman <nhorman@tuxdriver.com>
> Subject: Re: [dpdk-dev] [PATCH v6 2/2] bus/auxiliary: introduce auxiliary bus
> 
> On 6/25/21 2:47 PM, Xueming Li wrote:
> > Auxiliary bus [1] provides a way to split function into child-devices
> > representing sub-domains of functionality. Each auxiliary device
> > represents a part of its parent functionality.
> >
> > Auxiliary device is identified by unique device name, sysfs path:
> >   /sys/bus/auxiliary/devices/<name>
> >
> > Devargs legacy syntax ofauxiliary device:
> 
> Missing space after 'of'
> 
> >   -a auxiliary:<name>[,args...]
> > Devargs generic syntax of auxiliary device:
> >   -a bus=auxiliary,name=<name>,,/class=<classs>,,/driver=<driver>,,
> 
> Are two commas above intentionall? What for?
> 
> >
> > [1] kernel auxiliary bus document:
> > https://www.kernel.org/doc/html/latest/driver-api/auxiliary_bus.html
> >
> > Signed-off-by: Xueming Li <xuemingl@nvidia.com>
> 
> With my below notes fixed:
> 
> Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>
> 
> > Cc: Wang Haiyue <haiyue.wang@intel.com>
> > Cc: Thomas Monjalon <thomas@monjalon.net>
> > Cc: Kinsella Ray <mdr@ashroe.eu>
> > ---
> >  MAINTAINERS                               |   5 +
> >  doc/guides/rel_notes/release_21_08.rst    |   6 +
> >  drivers/bus/auxiliary/auxiliary_common.c  | 411 ++++++++++++++++++++++
> >  drivers/bus/auxiliary/auxiliary_params.c  |  59 ++++
> >  drivers/bus/auxiliary/linux/auxiliary.c   | 141 ++++++++
> >  drivers/bus/auxiliary/meson.build         |  16 +
> >  drivers/bus/auxiliary/private.h           |  74 ++++
> >  drivers/bus/auxiliary/rte_bus_auxiliary.h | 201 +++++++++++
> >  drivers/bus/auxiliary/version.map         |   7 +
> >  drivers/bus/meson.build                   |   1 +
> >  10 files changed, 921 insertions(+)
> >  create mode 100644 drivers/bus/auxiliary/auxiliary_common.c
> >  create mode 100644 drivers/bus/auxiliary/auxiliary_params.c
> >  create mode 100644 drivers/bus/auxiliary/linux/auxiliary.c
> >  create mode 100644 drivers/bus/auxiliary/meson.build
> >  create mode 100644 drivers/bus/auxiliary/private.h
> >  create mode 100644 drivers/bus/auxiliary/rte_bus_auxiliary.h
> >  create mode 100644 drivers/bus/auxiliary/version.map
> >
> > diff --git a/MAINTAINERS b/MAINTAINERS
> > index 5877a16971..eaf691ca6a 100644
> > --- a/MAINTAINERS
> > +++ b/MAINTAINERS
> > @@ -525,6 +525,11 @@ F: doc/guides/mempool/octeontx2.rst
> >  Bus Drivers
> >  -----------
> >
> > +Auxiliary bus driver
> 
> Shouldn't it be EXPERIMENTAL?
> 
> > +M: Parav Pandit <parav@nvidia.com>
> > +M: Xueming Li <xuemingl@nvidia.com>
> > +F: drivers/bus/auxiliary/
> > +
> >  Intel FPGA bus
> >  M: Rosen Xu <rosen.xu@intel.com>
> >  F: drivers/bus/ifpga/
> > diff --git a/doc/guides/rel_notes/release_21_08.rst
> > b/doc/guides/rel_notes/release_21_08.rst
> > index a6ecfdf3ce..e7ef4c8a05 100644
> > --- a/doc/guides/rel_notes/release_21_08.rst
> > +++ b/doc/guides/rel_notes/release_21_08.rst
> > @@ -55,6 +55,12 @@ New Features
> >       Also, make sure to start the actual text at the margin.
> >       =======================================================
> >
> > +* **Added auxiliary bus support.**
> > +
> > +  Auxiliary bus provides a way to split function into child-devices
> > +  representing sub-domains of functionality. Each auxiliary device
> > +  represents a part of its parent functionality.
> > +
> >
> >  Removed Items
> >  -------------
> > diff --git a/drivers/bus/auxiliary/auxiliary_common.c
> > b/drivers/bus/auxiliary/auxiliary_common.c
> > new file mode 100644
> > index 0000000000..8a75306da5
> > --- /dev/null
> > +++ b/drivers/bus/auxiliary/auxiliary_common.c
> > @@ -0,0 +1,411 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> > + */
> > +
> > +#include <string.h>
> > +#include <inttypes.h>
> > +#include <stdint.h>
> > +#include <stdbool.h>
> > +#include <stdlib.h>
> > +#include <stdio.h>
> > +#include <sys/queue.h>
> > +#include <rte_errno.h>
> > +#include <rte_interrupts.h>
> > +#include <rte_log.h>
> > +#include <rte_bus.h>
> > +#include <rte_per_lcore.h>
> > +#include <rte_memory.h>
> > +#include <rte_eal.h>
> > +#include <rte_eal_paging.h>
> > +#include <rte_string_fns.h>
> > +#include <rte_common.h>
> > +#include <rte_devargs.h>
> > +
> > +#include "private.h"
> > +#include "rte_bus_auxiliary.h"
> > +
> > +static struct rte_devargs *
> > +auxiliary_devargs_lookup(const char *name)
> > +{
> > +	struct rte_devargs *devargs;
> > +
> > +	RTE_EAL_DEVARGS_FOREACH(RTE_BUS_AUXILIARY_NAME, devargs) {
> > +		if (strcmp(devargs->name, name) == 0)
> > +			return devargs;
> > +	}
> > +	return NULL;
> > +}
> > +
> > +/*
> > + * Test whether the auxiliary device exist
> 
> Missing full stop above.
> 
> > + *
> > + * Stub for OS not supporting auxiliary bus.
> > + */
> > +__rte_weak bool
> > +auxiliary_dev_exists(const char *name)
> > +{
> > +	RTE_SET_USED(name);
> > +	return false;
> > +}
> > +
> > +/*
> > + * Scan the devices in the auxiliary bus.
> > + *
> > + * Stub for OS not supporting auxiliary bus.
> > + */
> > +__rte_weak int
> > +auxiliary_scan(void)
> > +{
> > +	return 0;
> > +}
> > +
> > +/*
> > + * Update a device's devargs being scanned.
> > + *
> > + * @param aux_dev
> > + *	AUXILIARY device.
> > + */
> > +void
> > +auxiliary_on_scan(struct rte_auxiliary_device *aux_dev)
> > +{
> > +	aux_dev->device.devargs = auxiliary_devargs_lookup(aux_dev->name);
> > +}
> > +
> > +/*
> > + * Match the auxiliary driver and device using driver function.
> > + */
> > +bool
> > +auxiliary_match(const struct rte_auxiliary_driver *aux_drv,
> > +		const struct rte_auxiliary_device *aux_dev)
> > +{
> > +	if (aux_drv->match == NULL)
> > +		return false;
> > +	return aux_drv->match(aux_dev->name);
> > +}
> > +
> > +/*
> > + * Call the probe() function of the driver.
> > + */
> > +static int
> > +rte_auxiliary_probe_one_driver(struct rte_auxiliary_driver *drv,
> > +			       struct rte_auxiliary_device *dev)
> > +{
> > +	enum rte_iova_mode iova_mode;
> > +	int ret;
> > +
> > +	if ((drv == NULL) || (dev == NULL))
> 
> Unnecessary internal parenthesis.
> 
> > +		return -EINVAL;
> > +
> > +	/* Check if driver supports it. */
> > +	if (!auxiliary_match(drv, dev))
> > +		/* Match of device and driver failed */
> > +		return 1;
> > +
> > +	/* No initialization when marked as blocked, return without error. */
> > +	if (dev->device.devargs != NULL &&
> > +	    dev->device.devargs->policy == RTE_DEV_BLOCKED) {
> > +		AUXILIARY_LOG(INFO, "Device is blocked, not initializing");
> > +		return -1;
> > +	}
> > +
> > +	if (dev->device.numa_node < 0) {
> > +		AUXILIARY_LOG(INFO, "Device is not NUMA-aware, defaulting socket to 0");
> 
> socket -> NUMA node
> 
> > +		dev->device.numa_node = 0;
> > +	}
> > +
> > +	iova_mode = rte_eal_iova_mode();
> > +	if ((drv->drv_flags & RTE_AUXILIARY_DRV_NEED_IOVA_AS_VA) > 0 &&
> > +	    iova_mode != RTE_IOVA_VA) {
> > +		AUXILIARY_LOG(ERR, "Driver %s expecting VA IOVA mode but current mode is PA, not initializing",
> > +			      drv->driver.name);
> > +		return -EINVAL;
> > +	}
> > +
> > +	dev->driver = drv;
> > +
> > +	AUXILIARY_LOG(INFO, "Probe auxiliary driver: %s device: %s (socket %i)",
> 
> socket -> NUMA node
> 
> > +		      drv->driver.name, dev->name, dev->device.numa_node);
> > +	ret = drv->probe(drv, dev);
> > +	if (ret != 0)
> > +		dev->driver = NULL;
> > +	else
> > +		dev->device.driver = &drv->driver;
> > +
> > +	return ret;
> > +}
> > +
> > +/*
> > + * Call the remove() function of the driver.
> > + */
> > +static int
> > +rte_auxiliary_driver_remove_dev(struct rte_auxiliary_device *dev)
> > +{
> > +	struct rte_auxiliary_driver *drv;
> > +	int ret = 0;
> > +
> > +	if (dev == NULL)
> > +		return -EINVAL;
> > +
> > +	drv = dev->driver;
> > +
> > +	AUXILIARY_LOG(DEBUG, "Driver %s remove auxiliary device %s on NUMA socket %i",
> 
> socket -> node
> 
> > +		      drv->driver.name, dev->name, dev->device.numa_node);
> > +
> > +	if (drv->remove != NULL) {
> > +		ret = drv->remove(dev);
> > +		if (ret < 0)
> > +			return ret;
> > +	}
> > +
> > +	/* clear driver structure */
> > +	dev->driver = NULL;
> > +	dev->device.driver = NULL;
> > +
> > +	return 0;
> > +}
> > +
> > +/*
> > + * Call the probe() function of all registered driver for the given device.
> > + * Return < 0 if initialization failed.
> > + * Return 1 if no driver is found for this device.
> > + */
> > +static int
> > +auxiliary_probe_all_drivers(struct rte_auxiliary_device *dev)
> > +{
> > +	struct rte_auxiliary_driver *drv;
> > +	int rc;
> > +
> > +	if (dev == NULL)
> > +		return -EINVAL;
> > +
> > +	FOREACH_DRIVER_ON_AUXILIARY_BUS(drv) {
> > +		if (!drv->match(dev->name))
> > +			continue;
> > +
> > +		rc = rte_auxiliary_probe_one_driver(drv, dev);
> > +		if (rc < 0)
> > +			/* negative value is an error */
> > +			return rc;
> > +		if (rc > 0)
> > +			/* positive value means driver doesn't support it */
> > +			continue;
> > +		return 0;
> > +	}
> > +	return 1;
> > +}
> > +
> > +/*
> > + * Scan the content of the auxiliary bus, and call the probe function for
> > + * all registered drivers to try to probe discovered devices.
> > + */
> > +static int
> > +auxiliary_probe(void)
> > +{
> > +	struct rte_auxiliary_device *dev = NULL;
> > +	size_t probed = 0, failed = 0;
> > +	int ret = 0;
> > +
> > +	FOREACH_DEVICE_ON_AUXILIARY_BUS(dev) {
> > +		probed++;
> > +
> > +		ret = auxiliary_probe_all_drivers(dev);
> > +		if (ret < 0) {
> > +			if (ret != -EEXIST) {
> > +				AUXILIARY_LOG(ERR, "Requested device %s cannot be used",
> > +					      dev->name);
> > +				rte_errno = errno;
> > +				failed++;
> > +			}
> > +			ret = 0;
> > +		}
> > +	}
> > +
> > +	return (probed && probed == failed) ? -1 : 0;
> > +}
> > +
> > +static int
> > +auxiliary_parse(const char *name, void *addr)
> > +{
> > +	struct rte_auxiliary_driver *drv = NULL;
> > +	const char **out = addr;
> > +
> > +	/* Allow empty device name "auxiliary:" to bypass entire bus scan. */
> > +	if (strlen(name) == 0)
> > +		return 0;
> > +
> > +	FOREACH_DRIVER_ON_AUXILIARY_BUS(drv) {
> > +		if (drv->match(name))
> > +			break;
> > +	}
> > +	if (drv != NULL && addr != NULL)
> > +		*out = name;
> > +	return drv != NULL ? 0 : -1;
> > +}
> > +
> > +/* Register a driver */
> > +void
> > +rte_auxiliary_register(struct rte_auxiliary_driver *driver)
> > +{
> > +	TAILQ_INSERT_TAIL(&auxiliary_bus.driver_list, driver, next);
> > +	driver->bus = &auxiliary_bus;
> > +}
> > +
> > +/* Unregister a driver */
> > +void
> > +rte_auxiliary_unregister(struct rte_auxiliary_driver *driver)
> > +{
> > +	TAILQ_REMOVE(&auxiliary_bus.driver_list, driver, next);
> > +	driver->bus = NULL;
> > +}
> > +
> > +/* Add a device to auxiliary bus */
> > +void
> > +auxiliary_add_device(struct rte_auxiliary_device *aux_dev)
> > +{
> > +	TAILQ_INSERT_TAIL(&auxiliary_bus.device_list, aux_dev, next);
> > +}
> > +
> > +/* Insert a device into a predefined position in auxiliary bus */
> > +void
> > +auxiliary_insert_device(struct rte_auxiliary_device *exist_aux_dev,
> > +			struct rte_auxiliary_device *new_aux_dev)
> > +{
> > +	TAILQ_INSERT_BEFORE(exist_aux_dev, new_aux_dev, next);
> > +}
> > +
> > +/* Remove a device from auxiliary bus */
> > +static void
> > +rte_auxiliary_remove_device(struct rte_auxiliary_device *auxiliary_dev)
> > +{
> > +	TAILQ_REMOVE(&auxiliary_bus.device_list, auxiliary_dev, next);
> > +}
> > +
> > +static struct rte_device *
> > +auxiliary_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
> > +		      const void *data)
> > +{
> > +	const struct rte_auxiliary_device *pstart;
> > +	struct rte_auxiliary_device *adev;
> > +
> > +	if (start != NULL) {
> > +		pstart = RTE_DEV_TO_AUXILIARY_CONST(start);
> > +		adev = TAILQ_NEXT(pstart, next);
> > +	} else {
> > +		adev = TAILQ_FIRST(&auxiliary_bus.device_list);
> > +	}
> > +	while (adev != NULL) {
> > +		if (cmp(&adev->device, data) == 0)
> > +			return &adev->device;
> > +		adev = TAILQ_NEXT(adev, next);
> > +	}
> > +	return NULL;
> > +}
> > +
> > +static int
> > +auxiliary_plug(struct rte_device *dev)
> > +{
> > +	if (!auxiliary_dev_exists(dev->name))
> > +		return -ENOENT;
> > +	return auxiliary_probe_all_drivers(RTE_DEV_TO_AUXILIARY(dev));
> > +}
> > +
> > +static int
> > +auxiliary_unplug(struct rte_device *dev)
> > +{
> > +	struct rte_auxiliary_device *adev;
> > +	int ret;
> > +
> > +	adev = RTE_DEV_TO_AUXILIARY(dev);
> > +	ret = rte_auxiliary_driver_remove_dev(adev);
> > +	if (ret == 0) {
> > +		rte_auxiliary_remove_device(adev);
> > +		rte_devargs_remove(dev->devargs);
> > +		free(adev);
> > +	}
> > +	return ret;
> > +}
> > +
> > +static int
> > +auxiliary_dma_map(struct rte_device *dev, void *addr, uint64_t iova, size_t len)
> > +{
> > +	struct rte_auxiliary_device *aux_dev = RTE_DEV_TO_AUXILIARY(dev);
> > +
> > +	if (dev == NULL || aux_dev->driver == NULL) {
> > +		rte_errno = EINVAL;
> > +		return -1;
> > +	}
> > +	if (aux_dev->driver->dma_map == NULL) {
> > +		rte_errno = ENOTSUP;
> > +		return -1;
> > +	}
> > +	return aux_dev->driver->dma_map(aux_dev, addr, iova, len);
> > +}
> > +
> > +static int
> > +auxiliary_dma_unmap(struct rte_device *dev, void *addr, uint64_t iova,
> > +		    size_t len)
> > +{
> > +	struct rte_auxiliary_device *aux_dev = RTE_DEV_TO_AUXILIARY(dev);
> > +
> > +	if (dev == NULL || aux_dev->driver == NULL) {
> > +		rte_errno = EINVAL;
> > +		return -1;
> > +	}
> > +	if (aux_dev->driver->dma_unmap == NULL) {
> > +		rte_errno = ENOTSUP;
> > +		return -1;
> > +	}
> > +	return aux_dev->driver->dma_unmap(aux_dev, addr, iova, len);
> > +}
> > +
> > +bool
> > +auxiliary_is_ignored_device(const char *name)
> > +{
> > +	struct rte_devargs *devargs = auxiliary_devargs_lookup(name);
> > +
> > +	switch (auxiliary_bus.bus.conf.scan_mode) {
> > +	case RTE_BUS_SCAN_ALLOWLIST:
> > +		if (devargs && devargs->policy == RTE_DEV_ALLOWED)
> > +			return false;
> > +		break;
> > +	case RTE_BUS_SCAN_UNDEFINED:
> > +	case RTE_BUS_SCAN_BLOCKLIST:
> > +		if (devargs == NULL || devargs->policy != RTE_DEV_BLOCKED)
> > +			return false;
> > +		break;
> > +	}
> > +	return true;
> > +}
> > +
> > +static enum rte_iova_mode
> > +auxiliary_get_iommu_class(void)
> > +{
> > +	const struct rte_auxiliary_driver *drv;
> > +
> > +	FOREACH_DRIVER_ON_AUXILIARY_BUS(drv) {
> > +		if ((drv->drv_flags & RTE_AUXILIARY_DRV_NEED_IOVA_AS_VA) > 0)
> > +			return RTE_IOVA_VA;
> > +	}
> > +
> > +	return RTE_IOVA_DC;
> > +}
> > +
> > +struct rte_auxiliary_bus auxiliary_bus = {
> > +	.bus = {
> > +		.scan = auxiliary_scan,
> > +		.probe = auxiliary_probe,
> > +		.find_device = auxiliary_find_device,
> > +		.plug = auxiliary_plug,
> > +		.unplug = auxiliary_unplug,
> > +		.parse = auxiliary_parse,
> > +		.dma_map = auxiliary_dma_map,
> > +		.dma_unmap = auxiliary_dma_unmap,
> > +		.get_iommu_class = auxiliary_get_iommu_class,
> > +		.dev_iterate = auxiliary_dev_iterate,
> > +	},
> > +	.device_list = TAILQ_HEAD_INITIALIZER(auxiliary_bus.device_list),
> > +	.driver_list = TAILQ_HEAD_INITIALIZER(auxiliary_bus.driver_list),
> > +};
> > +
> > +RTE_REGISTER_BUS(auxiliary, auxiliary_bus.bus);
> > +RTE_LOG_REGISTER_DEFAULT(auxiliary_bus_logtype, NOTICE);
> > diff --git a/drivers/bus/auxiliary/auxiliary_params.c b/drivers/bus/auxiliary/auxiliary_params.c
> > new file mode 100644
> > index 0000000000..cd3fa56cb4
> > --- /dev/null
> > +++ b/drivers/bus/auxiliary/auxiliary_params.c
> > @@ -0,0 +1,59 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> > + */
> > +
> > +#include <string.h>
> > +
> > +#include <rte_bus.h>
> > +#include <rte_dev.h>
> > +#include <rte_errno.h>
> > +#include <rte_kvargs.h>
> > +
> > +#include "private.h"
> > +#include "rte_bus_auxiliary.h"
> > +
> > +enum auxiliary_params {
> > +	RTE_AUXILIARY_PARAM_NAME,
> > +};
> > +
> > +static const char * const auxiliary_params_keys[] = {
> > +	[RTE_AUXILIARY_PARAM_NAME] = "name",
> > +};
> > +
> > +static int
> > +auxiliary_dev_match(const struct rte_device *dev,
> > +	      const void *_kvlist)
> > +{
> > +	const struct rte_kvargs *kvlist = _kvlist;
> > +	int ret;
> > +
> > +	ret = rte_kvargs_process(kvlist,
> > +			auxiliary_params_keys[RTE_AUXILIARY_PARAM_NAME],
> > +			rte_kvargs_strcmp, (void *)(uintptr_t)dev->name);
> > +
> > +	return ret != 0 ? -1 : 0;
> > +}
> > +
> > +void *
> > +auxiliary_dev_iterate(const void *start,
> > +		    const char *str,
> > +		    const struct rte_dev_iterator *it __rte_unused)
> > +{
> > +	rte_bus_find_device_t find_device;
> > +	struct rte_kvargs *kvargs = NULL;
> > +	struct rte_device *dev;
> > +
> > +	if (str != NULL) {
> > +		kvargs = rte_kvargs_parse(str, auxiliary_params_keys);
> > +		if (kvargs == NULL) {
> > +			AUXILIARY_LOG(ERR, "cannot parse argument list %s",
> > +				      str);
> > +			rte_errno = EINVAL;
> > +			return NULL;
> > +		}
> > +	}
> > +	find_device = auxiliary_bus.bus.find_device;
> > +	dev = find_device(start, auxiliary_dev_match, kvargs);
> > +	rte_kvargs_free(kvargs);
> > +	return dev;
> > +}
> > diff --git a/drivers/bus/auxiliary/linux/auxiliary.c b/drivers/bus/auxiliary/linux/auxiliary.c
> > new file mode 100644
> > index 0000000000..8464487971
> > --- /dev/null
> > +++ b/drivers/bus/auxiliary/linux/auxiliary.c
> > @@ -0,0 +1,141 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> > + */
> > +
> > +#include <string.h>
> > +#include <dirent.h>
> > +
> > +#include <rte_log.h>
> > +#include <rte_bus.h>
> > +#include <rte_malloc.h>
> > +#include <rte_devargs.h>
> > +#include <rte_memcpy.h>
> > +#include <eal_filesystem.h>
> > +
> > +#include "../rte_bus_auxiliary.h"
> > +#include "../private.h"
> > +
> > +#define AUXILIARY_SYSFS_PATH "/sys/bus/auxiliary/devices"
> > +
> > +/* Scan one auxiliary sysfs entry, and fill the devices list from it. */
> > +static int
> > +auxiliary_scan_one(const char *dirname, const char *name)
> > +{
> > +	struct rte_auxiliary_device *dev;
> > +	struct rte_auxiliary_device *dev2;
> > +	char filename[PATH_MAX];
> > +	unsigned long tmp;
> > +	int ret;
> > +
> > +	dev = malloc(sizeof(*dev));
> > +	if (dev == NULL)
> > +		return -1;
> > +
> > +	memset(dev, 0, sizeof(*dev));
> > +	if (rte_strscpy(dev->name, name, sizeof(dev->name)) < 0) {
> > +		free(dev);
> > +		return -1;
> > +	}
> > +	dev->device.name = dev->name;
> > +	dev->device.bus = &auxiliary_bus.bus;
> > +
> > +	/* Get NUMA node, default to 0 if not present */
> > +	snprintf(filename, sizeof(filename), "%s/%s/numa_node",
> > +		 dirname, name);
> > +	if (access(filename, F_OK) != -1) {
> > +		if (eal_parse_sysfs_value(filename, &tmp) == 0)
> > +			dev->device.numa_node = tmp;
> > +		else
> > +			dev->device.numa_node = -1;
> > +	} else {
> > +		dev->device.numa_node = 0;
> > +	}
> > +
> > +	auxiliary_on_scan(dev);
> > +
> > +	/* Device is valid, add in list (sorted) */
> > +	TAILQ_FOREACH(dev2, &auxiliary_bus.device_list, next) {
> > +		ret = strcmp(dev->name, dev2->name);
> > +		if (ret > 0)
> > +			continue;
> > +		if (ret < 0) {
> > +			auxiliary_insert_device(dev2, dev);
> > +		} else { /* already registered */
> > +			if (rte_dev_is_probed(&dev2->device) &&
> > +			    dev2->device.devargs != dev->device.devargs) {
> > +				/* To probe device with new devargs. */
> > +				rte_devargs_remove(dev2->device.devargs);
> > +				auxiliary_on_scan(dev2);
> > +			}
> > +			free(dev);
> > +		}
> > +		return 0;
> > +	}
> > +	auxiliary_add_device(dev);
> > +	return 0;
> > +}
> > +
> > +/*
> > + * Test whether the auxiliary device exist
> 
> Missing full stop above.
> 
> > + */
> > +bool
> > +auxiliary_dev_exists(const char *name)
> > +{
> > +	DIR *dir;
> > +	char dirname[PATH_MAX];
> > +
> > +	snprintf(dirname, sizeof(dirname), "%s/%s",
> > +		 AUXILIARY_SYSFS_PATH, name);
> > +	dir = opendir(dirname);
> > +	if (dir == NULL)
> > +		return false;
> > +	closedir(dir);
> > +	return true;
> > +}
> > +
> > +/*
> > + * Scan the devices in the auxiliary bus
> 
> Missing full stop above.
> 
> > + */
> > +int
> > +auxiliary_scan(void)
> > +{
> > +	struct dirent *e;
> > +	DIR *dir;
> > +	char dirname[PATH_MAX];
> > +	struct rte_auxiliary_driver *drv;
> > +
> > +	dir = opendir(AUXILIARY_SYSFS_PATH);
> > +	if (dir == NULL) {
> > +		AUXILIARY_LOG(INFO, "%s not found, is auxiliary module loaded?",
> > +			      AUXILIARY_SYSFS_PATH);
> > +		return 0;
> > +	}
> > +
> > +	while ((e = readdir(dir)) != NULL) {
> > +		if (e->d_name[0] == '.')
> > +			continue;
> > +
> > +		if (auxiliary_is_ignored_device(e->d_name))
> > +			continue;
> > +
> > +		snprintf(dirname, sizeof(dirname), "%s/%s",
> > +			 AUXILIARY_SYSFS_PATH, e->d_name);
> > +
> > +		/* Ignore if no driver can handle. */
> > +		FOREACH_DRIVER_ON_AUXILIARY_BUS(drv) {
> > +			if (drv->match(e->d_name))
> > +				break;
> > +		}
> > +		if (drv == NULL)
> > +			continue;
> > +
> > +		if (auxiliary_scan_one(dirname, e->d_name) < 0)
> > +			goto error;
> > +	}
> > +	closedir(dir);
> > +	return 0;
> > +
> > +error:
> > +	closedir(dir);
> > +	return -1;
> > +}
> > diff --git a/drivers/bus/auxiliary/meson.build b/drivers/bus/auxiliary/meson.build
> > new file mode 100644
> > index 0000000000..357550eff7
> > --- /dev/null
> > +++ b/drivers/bus/auxiliary/meson.build
> > @@ -0,0 +1,16 @@
> > +# SPDX-License-Identifier: BSD-3-Clause
> > +# Copyright (c) 2021 NVIDIA Corporation & Affiliates
> > +
> > +headers = files(
> > +        'rte_bus_auxiliary.h',
> > +)
> > +sources = files(
> > +        'auxiliary_common.c',
> > +        'auxiliary_params.c',
> > +)
> > +if is_linux
> > +    sources += files(
> > +        'linux/auxiliary.c',
> > +    )
> > +endif
> > +deps += ['kvargs']
> > diff --git a/drivers/bus/auxiliary/private.h b/drivers/bus/auxiliary/private.h
> > new file mode 100644
> > index 0000000000..cb3e849993
> > --- /dev/null
> > +++ b/drivers/bus/auxiliary/private.h
> > @@ -0,0 +1,74 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> > + */
> > +
> > +#ifndef AUXILIARY_PRIVATE_H
> 
> Maybe add a BUS_ prefix at least?
> 
> > +#define AUXILIARY_PRIVATE_H
> > +
> > +#include <stdbool.h>
> > +#include <stdio.h>
> > +
> > +#include "rte_bus_auxiliary.h"
> > +
> > +extern struct rte_auxiliary_bus auxiliary_bus;
> > +extern int auxiliary_bus_logtype;
> > +
> > +#define AUXILIARY_LOG(level, ...) \
> > +	rte_log(RTE_LOG_ ## level, auxiliary_bus_logtype, \
> > +		RTE_FMT("auxiliary bus: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
> > +			RTE_FMT_TAIL(__VA_ARGS__,)))
> > +
> > +/* Auxiliary bus iterators */
> > +#define FOREACH_DEVICE_ON_AUXILIARY_BUS(p) \
> > +		TAILQ_FOREACH(p, &(auxiliary_bus.device_list), next)
> > +
> > +#define FOREACH_DRIVER_ON_AUXILIARY_BUS(p) \
> > +		TAILQ_FOREACH(p, &(auxiliary_bus.driver_list), next)
> > +
> > +bool auxiliary_dev_exists(const char *name);
> > +
> > +/*
> > + * Scan the content of the auxiliary bus, and the devices in the devices
> > + * list.
> > + */
> > +int auxiliary_scan(void);
> > +
> > +/*
> > + * Update a device being scanned.
> > + */
> > +void auxiliary_on_scan(struct rte_auxiliary_device *aux_dev);
> > +
> > +/*
> > + * Validate whether a device with given auxiliary device should be ignored
> > + * or not.
> > + */
> > +bool auxiliary_is_ignored_device(const char *name);
> > +
> > +/*
> > + * Add an auxiliary device to the auxiliary bus (append to auxiliary device
> > + * list). This function also updates the bus references of the auxiliary
> > + * device and the generic device object embedded within.
> > + */
> > +void auxiliary_add_device(struct rte_auxiliary_device *aux_dev);
> > +
> > +/*
> > + * Insert an auxiliary device in the auxiliary bus at a particular location
> > + * in the device list. It also updates the auxiliary bus reference of the
> > + * new devices to be inserted.
> > + */
> > +void auxiliary_insert_device(struct rte_auxiliary_device *exist_aux_dev,
> > +			     struct rte_auxiliary_device *new_aux_dev);
> > +
> > +/*
> > + * Match the auxiliary driver and device by driver function
> 
> Missing full stop.
> 
> > + */
> > +bool auxiliary_match(const struct rte_auxiliary_driver *aux_drv,
> > +		     const struct rte_auxiliary_device *aux_dev);
> > +
> > +/*
> > + * Iterate over devices, matching any device against the provided string
> 
> Missing full stop.
> 
> > + */
> > +void *auxiliary_dev_iterate(const void *start, const char *str,
> > +			    const struct rte_dev_iterator *it);
> > +
> > +#endif /* AUXILIARY_PRIVATE_H */
> > diff --git a/drivers/bus/auxiliary/rte_bus_auxiliary.h b/drivers/bus/auxiliary/rte_bus_auxiliary.h
> > new file mode 100644
> > index 0000000000..16b147e387
> > --- /dev/null
> > +++ b/drivers/bus/auxiliary/rte_bus_auxiliary.h
> > @@ -0,0 +1,201 @@
> > +/* SPDX-License-Identifier: BSD-3-Clause
> > + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> > + */
> > +
> > +#ifndef RTE_BUS_AUXILIARY_H
> > +#define RTE_BUS_AUXILIARY_H
> > +
> > +/**
> > + * @file
> > + *
> > + * Auxiliary Bus Interface.
> > + */
> > +
> > +#ifdef __cplusplus
> > +extern "C" {
> > +#endif
> > +
> > +#include <stdio.h>
> > +#include <stdlib.h>
> > +#include <limits.h>
> > +#include <errno.h>
> > +#include <sys/queue.h>
> > +#include <stdint.h>
> > +#include <inttypes.h>
> > +
> > +#include <rte_debug.h>
> > +#include <rte_interrupts.h>
> > +#include <rte_dev.h>
> > +#include <rte_bus.h>
> > +#include <rte_kvargs.h>
> > +
> > +#define RTE_BUS_AUXILIARY_NAME "auxiliary"
> > +
> > +/* Forward declarations */
> > +struct rte_auxiliary_driver;
> > +struct rte_auxiliary_bus;
> > +struct rte_auxiliary_device;
> > +
> > +/**
> > + * Match function for the driver to decide if device can be handled.
> > + *
> > + * @param name
> > + *   Pointer to the auxiliary device name.
> > + * @return
> > + *   Whether the driver can handle the auxiliary device.
> > + */
> > +typedef bool(rte_auxiliary_match_t)(const char *name);
> > +
> > +/**
> > + * Initialization function for the driver called during auxiliary probing.
> > + *
> > + * @param drv
> > + *   Pointer to the auxiliary driver.
> > + * @param dev
> > + *   Pointer to the auxiliary device.
> > + * @return
> > + *   - 0 On success.
> > + *   - Negative value and rte_errno is set otherwise.
> > + */
> > +typedef int(rte_auxiliary_probe_t)(struct rte_auxiliary_driver *drv,
> > +				    struct rte_auxiliary_device *dev);
> > +
> > +/**
> > + * Uninitialization function for the driver called during hotplugging.
> > + *
> > + * @param dev
> > + *   Pointer to the auxiliary device.
> > + * @return
> > + *   - 0 On success.
> > + *   - Negative value and rte_errno is set otherwise.
> > + */
> > +typedef int (rte_auxiliary_remove_t)(struct rte_auxiliary_device *dev);
> > +
> > +/**
> > + * Driver-specific DMA mapping. After a successful call the device
> > + * will be able to read/write from/to this segment.
> > + *
> > + * @param dev
> > + *   Pointer to the auxiliary device.
> > + * @param addr
> > + *   Starting virtual address of memory to be mapped.
> > + * @param iova
> > + *   Starting IOVA address of memory to be mapped.
> > + * @param len
> > + *   Length of memory segment being mapped.
> > + * @return
> > + *   - 0 On success.
> > + *   - Negative value and rte_errno is set otherwise.
> > + */
> > +typedef int (rte_auxiliary_dma_map_t)(struct rte_auxiliary_device *dev,
> > +				       void *addr, uint64_t iova, size_t len);
> > +
> > +/**
> > + * Driver-specific DMA un-mapping. After a successful call the device
> > + * will not be able to read/write from/to this segment.
> > + *
> > + * @param dev
> > + *   Pointer to the auxiliary device.
> > + * @param addr
> > + *   Starting virtual address of memory to be unmapped.
> > + * @param iova
> > + *   Starting IOVA address of memory to be unmapped.
> > + * @param len
> > + *   Length of memory segment being unmapped.
> > + * @return
> > + *   - 0 On success.
> > + *   - Negative value and rte_errno is set otherwise.
> > + */
> > +typedef int (rte_auxiliary_dma_unmap_t)(struct rte_auxiliary_device *dev,
> > +					 void *addr, uint64_t iova, size_t len);
> > +
> > +/**
> > + * A structure describing an auxiliary device.
> > + */
> > +struct rte_auxiliary_device {
> > +	TAILQ_ENTRY(rte_auxiliary_device) next;   /**< Next probed device. */
> > +	struct rte_device device;                 /**< Inherit core device */
> > +	char name[RTE_DEV_NAME_MAX_LEN + 1];      /**< ASCII device name */
> > +	struct rte_intr_handle intr_handle;       /**< Interrupt handle */
> > +	struct rte_auxiliary_driver *driver;      /**< Device driver */
> > +};
> > +
> > +/** List of auxiliary devices */
> > +TAILQ_HEAD(rte_auxiliary_device_list, rte_auxiliary_device);
> > +/** List of auxiliary drivers */
> > +TAILQ_HEAD(rte_auxiliary_driver_list, rte_auxiliary_driver);
> 
> Shouldn't we hide rte_auxiliary_device inside the library, taking
> API/ABI stability into account? Or will it be DPDK-internal anyway? If
> so, it should be made INTERNAL from the very beginning.
> 
> > +
> > +/**
> > + * Structure describing the auxiliary bus
> > + */
> > +struct rte_auxiliary_bus {
> > +	struct rte_bus bus;                  /**< Inherit the generic class */
> > +	struct rte_auxiliary_device_list device_list;  /**< List of devices */
> > +	struct rte_auxiliary_driver_list driver_list;  /**< List of drivers */
> > +};
> 
> It looks internal. The following forward declaration should be
> sufficient to build.
> 
> struct rte_auxiliary_bus;
> 
> 
> > +
> > +/**
> > + * A structure describing an auxiliary driver.
> > + */
> > +struct rte_auxiliary_driver {
> > +	TAILQ_ENTRY(rte_auxiliary_driver) next; /**< Next in list. */
> > +	struct rte_driver driver;             /**< Inherit core driver. */
> > +	struct rte_auxiliary_bus *bus;        /**< Auxiliary bus reference. */
> > +	rte_auxiliary_match_t *match;         /**< Device match function. */
> > +	rte_auxiliary_probe_t *probe;         /**< Device probe function. */
> > +	rte_auxiliary_remove_t *remove;       /**< Device remove function. */
> > +	rte_auxiliary_dma_map_t *dma_map;     /**< Device DMA map function. */
> > +	rte_auxiliary_dma_unmap_t *dma_unmap; /**< Device DMA unmap function. */
> > +	uint32_t drv_flags;                   /**< Flags RTE_AUXILIARY_DRV_*. */
> > +};
> > +
> > +/**
> > + * @internal
> > + * Helper macro for drivers that need to convert to struct rte_auxiliary_device.
> > + */
> > +#define RTE_DEV_TO_AUXILIARY(ptr) \
> > +	container_of(ptr, struct rte_auxiliary_device, device)
> > +
> > +#define RTE_DEV_TO_AUXILIARY_CONST(ptr) \
> > +	container_of(ptr, const struct rte_auxiliary_device, device)
> > +
> > +#define RTE_ETH_DEV_TO_AUXILIARY(eth_dev) \
> > +	RTE_DEV_TO_AUXILIARY((eth_dev)->device)
> > +
> > +/** Device driver needs IOVA as VA and cannot work with IOVA as PA */
> > +#define RTE_AUXILIARY_DRV_NEED_IOVA_AS_VA 0x002
> > +
> > +/**
> 
> Don't we need an EXPERIMENTAL notice here?
> 
> > + * Register an auxiliary driver.
> > + *
> > + * @param driver
> > + *   A pointer to a rte_auxiliary_driver structure describing the driver
> > + *   to be registered.
> > + */
> > +__rte_experimental
> > +void rte_auxiliary_register(struct rte_auxiliary_driver *driver);
> > +
> > +/** Helper for auxiliary device registration from driver instance */
> > +#define RTE_PMD_REGISTER_AUXILIARY(nm, auxiliary_drv) \
> > +	RTE_INIT(auxiliaryinitfn_ ##nm) \
> > +	{ \
> > +		(auxiliary_drv).driver.name = RTE_STR(nm); \
> > +		rte_auxiliary_register(&(auxiliary_drv)); \
> > +	} \
> > +	RTE_PMD_EXPORT_NAME(nm, __COUNTER__)
> > +
> > +/**
> 
> Don't we need an EXPERIMENTAL notice here?
> 
> > + * Unregister an auxiliary driver.
> > + *
> > + * @param driver
> > + *   A pointer to a rte_auxiliary_driver structure describing the driver
> > + *   to be unregistered.
> > + */
> > +__rte_experimental
> > +void rte_auxiliary_unregister(struct rte_auxiliary_driver *driver);
> > +
> > +#ifdef __cplusplus
> > +}
> > +#endif
> > +
> > +#endif /* RTE_BUS_AUXILIARY_H */
> > diff --git a/drivers/bus/auxiliary/version.map b/drivers/bus/auxiliary/version.map
> > new file mode 100644
> > index 0000000000..a52260657c
> > --- /dev/null
> > +++ b/drivers/bus/auxiliary/version.map
> > @@ -0,0 +1,7 @@
> > +EXPERIMENTAL {
> > +	global:
> > +
> > +	# added in 21.08
> > +	rte_auxiliary_register;
> > +	rte_auxiliary_unregister;
> > +};
> > diff --git a/drivers/bus/meson.build b/drivers/bus/meson.build
> > index 410058de3a..45eab5233d 100644
> > --- a/drivers/bus/meson.build
> > +++ b/drivers/bus/meson.build
> > @@ -2,6 +2,7 @@
> >  # Copyright(c) 2017 Intel Corporation
> >
> >  drivers = [
> > +        'auxiliary',
> >          'dpaa',
> >          'fslmc',
> >          'ifpga',
> >
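
For readers skimming the thread, the allowlist/blocklist decision made by
auxiliary_is_ignored_device() in the quoted patch can be reduced to a small
standalone function. This is only a sketch: the enums below are stand-ins
invented for illustration (they are not the DPDK definitions), and
POLICY_NONE models the case where no devargs entry exists for the device.

```c
#include <stdbool.h>

/* Stand-in types for illustration only; not the DPDK enums. */
enum scan_mode { SCAN_UNDEFINED, SCAN_ALLOWLIST, SCAN_BLOCKLIST };
enum dev_policy { POLICY_NONE, POLICY_ALLOWED, POLICY_BLOCKED };

/*
 * Return true when the device should be skipped during the bus scan.
 * Mirrors the switch in the quoted auxiliary_is_ignored_device().
 */
static bool
is_ignored(enum scan_mode mode, enum dev_policy policy)
{
	switch (mode) {
	case SCAN_ALLOWLIST:
		/* Only explicitly allowed devices are kept. */
		return policy != POLICY_ALLOWED;
	case SCAN_UNDEFINED:
	case SCAN_BLOCKLIST:
		/* Everything is kept unless explicitly blocked. */
		return policy == POLICY_BLOCKED;
	}
	/* Unknown scan mode: ignore the device, as the patch does. */
	return true;
}
```

With these stand-ins, a device without devargs is skipped in allowlist mode
but scanned in blocklist or undefined mode.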



* Re: [dpdk-dev] [PATCH v3 19/20] net/sfc: support flow action COUNT in transfer rules
  @ 2021-07-04 19:45  3%     ` Thomas Monjalon
  2021-07-05  8:41  0%       ` Andrew Rybchenko
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2021-07-04 19:45 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: David Marchand, Bruce Richardson, dev, Igor Romanov,
	Andy Moreton, Ivan Malov

02/07/2021 14:53, Andrew Rybchenko:
> On 7/2/21 3:30 PM, Thomas Monjalon wrote:
> > 02/07/2021 10:43, Andrew Rybchenko:
> >> On 7/1/21 4:05 PM, Andrew Rybchenko wrote:
> >>> On 7/1/21 3:34 PM, David Marchand wrote:
> >>>> On Thu, Jul 1, 2021 at 11:22 AM Andrew Rybchenko
> >>>> <andrew.rybchenko@oktetlabs.ru> wrote:
> >>>>> The build works fine for me on FC34, but it has
> >>>>> libatomic-11.1.1-3.fc34.x86_64 installed.
> >>>>
> >>>> I first produced the issue on my "old" FC32.
> >>>> Afaics, for FC33 and later, gcc now depends on libatomic and the
> >>>> problem won't be noticed.
> >>>> FC32 and before are EOL, but I then reproduced the issue on RHEL 8
> >>>> (and Intel CI reported it on Centos 8 too).
> >>>
> >>> I see. Thanks for the clarification.
> >>>
> >>>>>
> >>>>> I'd like to understand what we're trying to solve here.
> >>>>> Are we trying to make meson report the missing library
> >>>>> correctly?
> >>>>>
> >>>>> If so, I think I can do a simple check using cc.links()
> >>>>> which will fail if the library is not found. I'll
> >>>>> test that it works as expected if the library is not
> >>>>> completely installed.
> >>>>>
> >>>>
> >>>> I tried below diff, and it works for me.
> >>>> "works" as in net/sfc gets disabled without libatomic installed:
> > [...]
> >>>>  # for gcc compiles we need -latomic for 128-bit atomic ops
> >>>>  if cc.get_id() == 'gcc'
> >>>> +    code = '''#include <stdio.h>
> >>>> +    void main() { printf("Atomilink me.\n"); }
> >>>> +    '''
> >>>> +    if not cc.links(code, args: '-latomic', name: 'libatomic link check')
> >>>> +        build = false
> >>>> +        reason = 'missing dependency, "libatomic"'
> >>>> +        subdir_done()
> >>>> +    endif
> >>>>      ext_deps += cc.find_library('atomic')
> >>>>  endif
> >>>
> >>> Many thanks, LGTM. I'll pick it up and add comments why
> >>> it is checked this way.
> >>>
> >>
> >> I've sent v4 with the problem fixed. However, I'm afraid
> >> build test systems should be updated to have libatomic
> >> correctly installed. Otherwise, they do not really check
> >> net/sfc build.
> > 
> > When testing on old systems, sfc won't be tested anymore after this patchset.
> > On recent systems, sfc should be enabled I guess.
> > I don't see how to manage better, sorry.
> > 
> 
> I see. I thought it would be possible to install the missing
> package on the corresponding systems to improve build
> coverage.
> 
> Now I automatically test build on problematic distros
> with previously missing packages installed. So I have
> internal build coverage anyway.

David asked for installing libatomic:
https://inbox.dpdk.org/ci/CAJFAV8xCNBL4yEZU0c=dJGYS+13QM7Uz7e2qnUkMuM7eaKKw+Q@mail.gmail.com/

We should wait for it to be installed; otherwise the ABI check will fail.





* Re: [dpdk-dev] [PATCH v6 2/2] bus/auxiliary: introduce auxiliary bus
  @ 2021-07-04 16:13  3%   ` Andrew Rybchenko
  2021-07-05  5:47  0%     ` Xueming(Steven) Li
  0 siblings, 1 reply; 200+ results
From: Andrew Rybchenko @ 2021-07-04 16:13 UTC (permalink / raw)
  To: Xueming Li; +Cc: dev, Wang Haiyue, Thomas Monjalon, Kinsella Ray, Neil Horman

On 6/25/21 2:47 PM, Xueming Li wrote:
> Auxiliary bus [1] provides a way to split function into child-devices
> representing sub-domains of functionality. Each auxiliary device
> represents a part of its parent functionality.
> 
> Auxiliary device is identified by unique device name, sysfs path:
>   /sys/bus/auxiliary/devices/<name>
> 
> Devargs legacy syntax ofauxiliary device:

Missing space after 'of'

>   -a auxiliary:<name>[,args...]
> Devargs generic syntax of auxiliary device:
>   -a bus=auxiliary,name=<name>,,/class=<classs>,,/driver=<driver>,,

Are the two commas above intentional? What for?
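
For illustration, the legacy form quoted above, auxiliary:<name>[,args...],
reads as a bus prefix, a device name, and an optional comma-separated
argument tail. A minimal sketch of that reading follows, assuming the name
itself contains no comma. The helper name and the sample device string used
with it are hypothetical, and this is not how EAL actually parses devargs.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of DPDK: split "auxiliary:<name>[,args...]"
 * into the device name (copied into 'name') and the argument tail ('*args'
 * points into 'in', or at "" when no arguments are given).
 * Returns 0 on success, -1 on a bad prefix or a name that does not fit.
 */
static int
split_legacy_devargs(const char *in, char *name, size_t name_len,
		     const char **args)
{
	static const char prefix[] = "auxiliary:";
	const size_t plen = sizeof(prefix) - 1;
	const char *start, *comma;
	size_t n;

	if (in == NULL || strncmp(in, prefix, plen) != 0)
		return -1;

	start = in + plen;
	comma = strchr(start, ',');
	n = (comma != NULL) ? (size_t)(comma - start) : strlen(start);
	if (n >= name_len)
		return -1; /* no room for the name plus terminator */

	memcpy(name, start, n);
	name[n] = '\0';
	*args = (comma != NULL) ? comma + 1 : "";
	return 0;
}
```

For example, a made-up input such as "auxiliary:mlx5_core.sf.1,foo=1" would
yield the name "mlx5_core.sf.1" and the argument tail "foo=1".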

> 
> [1] kernel auxiliary bus document:
> https://www.kernel.org/doc/html/latest/driver-api/auxiliary_bus.html
> 
> Signed-off-by: Xueming Li <xuemingl@nvidia.com>

With my below notes fixed:

Acked-by: Andrew Rybchenko <andrew.rybchenko@oktetlabs.ru>

> Cc: Wang Haiyue <haiyue.wang@intel.com>
> Cc: Thomas Monjalon <thomas@monjalon.net>
> Cc: Kinsella Ray <mdr@ashroe.eu>
> ---
>  MAINTAINERS                               |   5 +
>  doc/guides/rel_notes/release_21_08.rst    |   6 +
>  drivers/bus/auxiliary/auxiliary_common.c  | 411 ++++++++++++++++++++++
>  drivers/bus/auxiliary/auxiliary_params.c  |  59 ++++
>  drivers/bus/auxiliary/linux/auxiliary.c   | 141 ++++++++
>  drivers/bus/auxiliary/meson.build         |  16 +
>  drivers/bus/auxiliary/private.h           |  74 ++++
>  drivers/bus/auxiliary/rte_bus_auxiliary.h | 201 +++++++++++
>  drivers/bus/auxiliary/version.map         |   7 +
>  drivers/bus/meson.build                   |   1 +
>  10 files changed, 921 insertions(+)
>  create mode 100644 drivers/bus/auxiliary/auxiliary_common.c
>  create mode 100644 drivers/bus/auxiliary/auxiliary_params.c
>  create mode 100644 drivers/bus/auxiliary/linux/auxiliary.c
>  create mode 100644 drivers/bus/auxiliary/meson.build
>  create mode 100644 drivers/bus/auxiliary/private.h
>  create mode 100644 drivers/bus/auxiliary/rte_bus_auxiliary.h
>  create mode 100644 drivers/bus/auxiliary/version.map
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5877a16971..eaf691ca6a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -525,6 +525,11 @@ F: doc/guides/mempool/octeontx2.rst
>  Bus Drivers
>  -----------
>  
> +Auxiliary bus driver

Shouldn't it be EXPERIMENTAL?

> +M: Parav Pandit <parav@nvidia.com>
> +M: Xueming Li <xuemingl@nvidia.com>
> +F: drivers/bus/auxiliary/
> +
>  Intel FPGA bus
>  M: Rosen Xu <rosen.xu@intel.com>
>  F: drivers/bus/ifpga/
> diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
> index a6ecfdf3ce..e7ef4c8a05 100644
> --- a/doc/guides/rel_notes/release_21_08.rst
> +++ b/doc/guides/rel_notes/release_21_08.rst
> @@ -55,6 +55,12 @@ New Features
>       Also, make sure to start the actual text at the margin.
>       =======================================================
>  
> +* **Added auxiliary bus support.**
> +
> +  Auxiliary bus provides a way to split function into child-devices
> +  representing sub-domains of functionality. Each auxiliary device
> +  represents a part of its parent functionality.
> +
>  
>  Removed Items
>  -------------
> diff --git a/drivers/bus/auxiliary/auxiliary_common.c b/drivers/bus/auxiliary/auxiliary_common.c
> new file mode 100644
> index 0000000000..8a75306da5
> --- /dev/null
> +++ b/drivers/bus/auxiliary/auxiliary_common.c
> @@ -0,0 +1,411 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> + */
> +
> +#include <string.h>
> +#include <inttypes.h>
> +#include <stdint.h>
> +#include <stdbool.h>
> +#include <stdlib.h>
> +#include <stdio.h>
> +#include <sys/queue.h>
> +#include <rte_errno.h>
> +#include <rte_interrupts.h>
> +#include <rte_log.h>
> +#include <rte_bus.h>
> +#include <rte_per_lcore.h>
> +#include <rte_memory.h>
> +#include <rte_eal.h>
> +#include <rte_eal_paging.h>
> +#include <rte_string_fns.h>
> +#include <rte_common.h>
> +#include <rte_devargs.h>
> +
> +#include "private.h"
> +#include "rte_bus_auxiliary.h"
> +
> +static struct rte_devargs *
> +auxiliary_devargs_lookup(const char *name)
> +{
> +	struct rte_devargs *devargs;
> +
> +	RTE_EAL_DEVARGS_FOREACH(RTE_BUS_AUXILIARY_NAME, devargs) {
> +		if (strcmp(devargs->name, name) == 0)
> +			return devargs;
> +	}
> +	return NULL;
> +}
> +
> +/*
> + * Test whether the auxiliary device exist

Missing full stop above.

> + *
> + * Stub for OS not supporting auxiliary bus.
> + */
> +__rte_weak bool
> +auxiliary_dev_exists(const char *name)
> +{
> +	RTE_SET_USED(name);
> +	return false;
> +}
> +
> +/*
> + * Scan the devices in the auxiliary bus.
> + *
> + * Stub for OS not supporting auxiliary bus.
> + */
> +__rte_weak int
> +auxiliary_scan(void)
> +{
> +	return 0;
> +}
> +
> +/*
> + * Update a device's devargs being scanned.
> + *
> + * @param aux_dev
> + *	AUXILIARY device.
> + */
> +void
> +auxiliary_on_scan(struct rte_auxiliary_device *aux_dev)
> +{
> +	aux_dev->device.devargs = auxiliary_devargs_lookup(aux_dev->name);
> +}
> +
> +/*
> + * Match the auxiliary driver and device using driver function.
> + */
> +bool
> +auxiliary_match(const struct rte_auxiliary_driver *aux_drv,
> +		const struct rte_auxiliary_device *aux_dev)
> +{
> +	if (aux_drv->match == NULL)
> +		return false;
> +	return aux_drv->match(aux_dev->name);
> +}
> +
> +/*
> + * Call the probe() function of the driver.
> + */
> +static int
> +rte_auxiliary_probe_one_driver(struct rte_auxiliary_driver *drv,
> +			       struct rte_auxiliary_device *dev)
> +{
> +	enum rte_iova_mode iova_mode;
> +	int ret;
> +
> +	if ((drv == NULL) || (dev == NULL))

Unnecessary internal parenthesis.

> +		return -EINVAL;
> +
> +	/* Check if driver supports it. */
> +	if (!auxiliary_match(drv, dev))
> +		/* Match of device and driver failed */
> +		return 1;
> +
> +	/* No initialization when marked as blocked, return without error. */
> +	if (dev->device.devargs != NULL &&
> +	    dev->device.devargs->policy == RTE_DEV_BLOCKED) {
> +		AUXILIARY_LOG(INFO, "Device is blocked, not initializing");
> +		return -1;
> +	}
> +
> +	if (dev->device.numa_node < 0) {
> +		AUXILIARY_LOG(INFO, "Device is not NUMA-aware, defaulting socket to 0");

socket -> NUMA node

> +		dev->device.numa_node = 0;
> +	}
> +
> +	iova_mode = rte_eal_iova_mode();
> +	if ((drv->drv_flags & RTE_AUXILIARY_DRV_NEED_IOVA_AS_VA) > 0 &&
> +	    iova_mode != RTE_IOVA_VA) {
> +		AUXILIARY_LOG(ERR, "Driver %s expecting VA IOVA mode but current mode is PA, not initializing",
> +			      drv->driver.name);
> +		return -EINVAL;
> +	}
> +
> +	dev->driver = drv;
> +
> +	AUXILIARY_LOG(INFO, "Probe auxiliary driver: %s device: %s (socket %i)",

socket -> NUMA node

> +		      drv->driver.name, dev->name, dev->device.numa_node);
> +	ret = drv->probe(drv, dev);
> +	if (ret != 0)
> +		dev->driver = NULL;
> +	else
> +		dev->device.driver = &drv->driver;
> +
> +	return ret;
> +}
> +
> +/*
> + * Call the remove() function of the driver.
> + */
> +static int
> +rte_auxiliary_driver_remove_dev(struct rte_auxiliary_device *dev)
> +{
> +	struct rte_auxiliary_driver *drv;
> +	int ret = 0;
> +
> +	if (dev == NULL)
> +		return -EINVAL;
> +
> +	drv = dev->driver;
> +
> +	AUXILIARY_LOG(DEBUG, "Driver %s remove auxiliary device %s on NUMA socket %i",

socket -> node

> +		      drv->driver.name, dev->name, dev->device.numa_node);
> +
> +	if (drv->remove != NULL) {
> +		ret = drv->remove(dev);
> +		if (ret < 0)
> +			return ret;
> +	}
> +
> +	/* clear driver structure */
> +	dev->driver = NULL;
> +	dev->device.driver = NULL;
> +
> +	return 0;
> +}
> +
> +/*
> + * Call the probe() function of all registered driver for the given device.
> + * Return < 0 if initialization failed.
> + * Return 1 if no driver is found for this device.
> + */
> +static int
> +auxiliary_probe_all_drivers(struct rte_auxiliary_device *dev)
> +{
> +	struct rte_auxiliary_driver *drv;
> +	int rc;
> +
> +	if (dev == NULL)
> +		return -EINVAL;
> +
> +	FOREACH_DRIVER_ON_AUXILIARY_BUS(drv) {
> +		if (!drv->match(dev->name))
> +			continue;
> +
> +		rc = rte_auxiliary_probe_one_driver(drv, dev);
> +		if (rc < 0)
> +			/* negative value is an error */
> +			return rc;
> +		if (rc > 0)
> +			/* positive value means driver doesn't support it */
> +			continue;
> +		return 0;
> +	}
> +	return 1;
> +}
> +
> +/*
> + * Scan the content of the auxiliary bus, and call the probe function for
> + * all registered drivers to try to probe discovered devices.
> + */
> +static int
> +auxiliary_probe(void)
> +{
> +	struct rte_auxiliary_device *dev = NULL;
> +	size_t probed = 0, failed = 0;
> +	int ret = 0;
> +
> +	FOREACH_DEVICE_ON_AUXILIARY_BUS(dev) {
> +		probed++;
> +
> +		ret = auxiliary_probe_all_drivers(dev);
> +		if (ret < 0) {
> +			if (ret != -EEXIST) {
> +				AUXILIARY_LOG(ERR, "Requested device %s cannot be used",
> +					      dev->name);
> +				rte_errno = errno;
> +				failed++;
> +			}
> +			ret = 0;
> +		}
> +	}
> +
> +	return (probed && probed == failed) ? -1 : 0;
> +}
> +
> +static int
> +auxiliary_parse(const char *name, void *addr)
> +{
> +	struct rte_auxiliary_driver *drv = NULL;
> +	const char **out = addr;
> +
> +	/* Allow empty device name "auxiliary:" to bypass entire bus scan. */
> +	if (strlen(name) == 0)
> +		return 0;
> +
> +	FOREACH_DRIVER_ON_AUXILIARY_BUS(drv) {
> +		if (drv->match(name))
> +			break;
> +	}
> +	if (drv != NULL && addr != NULL)
> +		*out = name;
> +	return drv != NULL ? 0 : -1;
> +}
> +
> +/* Register a driver */
> +void
> +rte_auxiliary_register(struct rte_auxiliary_driver *driver)
> +{
> +	TAILQ_INSERT_TAIL(&auxiliary_bus.driver_list, driver, next);
> +	driver->bus = &auxiliary_bus;
> +}
> +
> +/* Unregister a driver */
> +void
> +rte_auxiliary_unregister(struct rte_auxiliary_driver *driver)
> +{
> +	TAILQ_REMOVE(&auxiliary_bus.driver_list, driver, next);
> +	driver->bus = NULL;
> +}
> +
> +/* Add a device to auxiliary bus */
> +void
> +auxiliary_add_device(struct rte_auxiliary_device *aux_dev)
> +{
> +	TAILQ_INSERT_TAIL(&auxiliary_bus.device_list, aux_dev, next);
> +}
> +
> +/* Insert a device into a predefined position in auxiliary bus */
> +void
> +auxiliary_insert_device(struct rte_auxiliary_device *exist_aux_dev,
> +			struct rte_auxiliary_device *new_aux_dev)
> +{
> +	TAILQ_INSERT_BEFORE(exist_aux_dev, new_aux_dev, next);
> +}
> +
> +/* Remove a device from auxiliary bus */
> +static void
> +rte_auxiliary_remove_device(struct rte_auxiliary_device *auxiliary_dev)
> +{
> +	TAILQ_REMOVE(&auxiliary_bus.device_list, auxiliary_dev, next);
> +}
> +
> +static struct rte_device *
> +auxiliary_find_device(const struct rte_device *start, rte_dev_cmp_t cmp,
> +		      const void *data)
> +{
> +	const struct rte_auxiliary_device *pstart;
> +	struct rte_auxiliary_device *adev;
> +
> +	if (start != NULL) {
> +		pstart = RTE_DEV_TO_AUXILIARY_CONST(start);
> +		adev = TAILQ_NEXT(pstart, next);
> +	} else {
> +		adev = TAILQ_FIRST(&auxiliary_bus.device_list);
> +	}
> +	while (adev != NULL) {
> +		if (cmp(&adev->device, data) == 0)
> +			return &adev->device;
> +		adev = TAILQ_NEXT(adev, next);
> +	}
> +	return NULL;
> +}
> +
> +static int
> +auxiliary_plug(struct rte_device *dev)
> +{
> +	if (!auxiliary_dev_exists(dev->name))
> +		return -ENOENT;
> +	return auxiliary_probe_all_drivers(RTE_DEV_TO_AUXILIARY(dev));
> +}
> +
> +static int
> +auxiliary_unplug(struct rte_device *dev)
> +{
> +	struct rte_auxiliary_device *adev;
> +	int ret;
> +
> +	adev = RTE_DEV_TO_AUXILIARY(dev);
> +	ret = rte_auxiliary_driver_remove_dev(adev);
> +	if (ret == 0) {
> +		rte_auxiliary_remove_device(adev);
> +		rte_devargs_remove(dev->devargs);
> +		free(adev);
> +	}
> +	return ret;
> +}
> +
> +static int
> +auxiliary_dma_map(struct rte_device *dev, void *addr, uint64_t iova, size_t len)
> +{
> +	struct rte_auxiliary_device *aux_dev = RTE_DEV_TO_AUXILIARY(dev);
> +
> +	if (dev == NULL || aux_dev->driver == NULL) {
> +		rte_errno = EINVAL;
> +		return -1;
> +	}
> +	if (aux_dev->driver->dma_map == NULL) {
> +		rte_errno = ENOTSUP;
> +		return -1;
> +	}
> +	return aux_dev->driver->dma_map(aux_dev, addr, iova, len);
> +}
> +
> +static int
> +auxiliary_dma_unmap(struct rte_device *dev, void *addr, uint64_t iova,
> +		    size_t len)
> +{
> +	struct rte_auxiliary_device *aux_dev = RTE_DEV_TO_AUXILIARY(dev);
> +
> +	if (dev == NULL || aux_dev->driver == NULL) {
> +		rte_errno = EINVAL;
> +		return -1;
> +	}
> +	if (aux_dev->driver->dma_unmap == NULL) {
> +		rte_errno = ENOTSUP;
> +		return -1;
> +	}
> +	return aux_dev->driver->dma_unmap(aux_dev, addr, iova, len);
> +}
> +
> +bool
> +auxiliary_is_ignored_device(const char *name)
> +{
> +	struct rte_devargs *devargs = auxiliary_devargs_lookup(name);
> +
> +	switch (auxiliary_bus.bus.conf.scan_mode) {
> +	case RTE_BUS_SCAN_ALLOWLIST:
> +		if (devargs && devargs->policy == RTE_DEV_ALLOWED)
> +			return false;
> +		break;
> +	case RTE_BUS_SCAN_UNDEFINED:
> +	case RTE_BUS_SCAN_BLOCKLIST:
> +		if (devargs == NULL || devargs->policy != RTE_DEV_BLOCKED)
> +			return false;
> +		break;
> +	}
> +	return true;
> +}
> +
> +static enum rte_iova_mode
> +auxiliary_get_iommu_class(void)
> +{
> +	const struct rte_auxiliary_driver *drv;
> +
> +	FOREACH_DRIVER_ON_AUXILIARY_BUS(drv) {
> +		if ((drv->drv_flags & RTE_AUXILIARY_DRV_NEED_IOVA_AS_VA) > 0)
> +			return RTE_IOVA_VA;
> +	}
> +
> +	return RTE_IOVA_DC;
> +}
> +
> +struct rte_auxiliary_bus auxiliary_bus = {
> +	.bus = {
> +		.scan = auxiliary_scan,
> +		.probe = auxiliary_probe,
> +		.find_device = auxiliary_find_device,
> +		.plug = auxiliary_plug,
> +		.unplug = auxiliary_unplug,
> +		.parse = auxiliary_parse,
> +		.dma_map = auxiliary_dma_map,
> +		.dma_unmap = auxiliary_dma_unmap,
> +		.get_iommu_class = auxiliary_get_iommu_class,
> +		.dev_iterate = auxiliary_dev_iterate,
> +	},
> +	.device_list = TAILQ_HEAD_INITIALIZER(auxiliary_bus.device_list),
> +	.driver_list = TAILQ_HEAD_INITIALIZER(auxiliary_bus.driver_list),
> +};
> +
> +RTE_REGISTER_BUS(auxiliary, auxiliary_bus.bus);
> +RTE_LOG_REGISTER_DEFAULT(auxiliary_bus_logtype, NOTICE);
> diff --git a/drivers/bus/auxiliary/auxiliary_params.c b/drivers/bus/auxiliary/auxiliary_params.c
> new file mode 100644
> index 0000000000..cd3fa56cb4
> --- /dev/null
> +++ b/drivers/bus/auxiliary/auxiliary_params.c
> @@ -0,0 +1,59 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> + */
> +
> +#include <string.h>
> +
> +#include <rte_bus.h>
> +#include <rte_dev.h>
> +#include <rte_errno.h>
> +#include <rte_kvargs.h>
> +
> +#include "private.h"
> +#include "rte_bus_auxiliary.h"
> +
> +enum auxiliary_params {
> +	RTE_AUXILIARY_PARAM_NAME,
> +};
> +
> +static const char * const auxiliary_params_keys[] = {
> +	[RTE_AUXILIARY_PARAM_NAME] = "name",
> +};
> +
> +static int
> +auxiliary_dev_match(const struct rte_device *dev,
> +	      const void *_kvlist)
> +{
> +	const struct rte_kvargs *kvlist = _kvlist;
> +	int ret;
> +
> +	ret = rte_kvargs_process(kvlist,
> +			auxiliary_params_keys[RTE_AUXILIARY_PARAM_NAME],
> +			rte_kvargs_strcmp, (void *)(uintptr_t)dev->name);
> +
> +	return ret != 0 ? -1 : 0;
> +}
> +
> +void *
> +auxiliary_dev_iterate(const void *start,
> +		    const char *str,
> +		    const struct rte_dev_iterator *it __rte_unused)
> +{
> +	rte_bus_find_device_t find_device;
> +	struct rte_kvargs *kvargs = NULL;
> +	struct rte_device *dev;
> +
> +	if (str != NULL) {
> +		kvargs = rte_kvargs_parse(str, auxiliary_params_keys);
> +		if (kvargs == NULL) {
> +			AUXILIARY_LOG(ERR, "cannot parse argument list %s",
> +				      str);
> +			rte_errno = EINVAL;
> +			return NULL;
> +		}
> +	}
> +	find_device = auxiliary_bus.bus.find_device;
> +	dev = find_device(start, auxiliary_dev_match, kvargs);
> +	rte_kvargs_free(kvargs);
> +	return dev;
> +}
> diff --git a/drivers/bus/auxiliary/linux/auxiliary.c b/drivers/bus/auxiliary/linux/auxiliary.c
> new file mode 100644
> index 0000000000..8464487971
> --- /dev/null
> +++ b/drivers/bus/auxiliary/linux/auxiliary.c
> @@ -0,0 +1,141 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> + */
> +
> +#include <string.h>
> +#include <dirent.h>
> +
> +#include <rte_log.h>
> +#include <rte_bus.h>
> +#include <rte_malloc.h>
> +#include <rte_devargs.h>
> +#include <rte_memcpy.h>
> +#include <eal_filesystem.h>
> +
> +#include "../rte_bus_auxiliary.h"
> +#include "../private.h"
> +
> +#define AUXILIARY_SYSFS_PATH "/sys/bus/auxiliary/devices"
> +
> +/* Scan one auxiliary sysfs entry, and fill the devices list from it. */
> +static int
> +auxiliary_scan_one(const char *dirname, const char *name)
> +{
> +	struct rte_auxiliary_device *dev;
> +	struct rte_auxiliary_device *dev2;
> +	char filename[PATH_MAX];
> +	unsigned long tmp;
> +	int ret;
> +
> +	dev = malloc(sizeof(*dev));
> +	if (dev == NULL)
> +		return -1;
> +
> +	memset(dev, 0, sizeof(*dev));
> +	if (rte_strscpy(dev->name, name, sizeof(dev->name)) < 0) {
> +		free(dev);
> +		return -1;
> +	}
> +	dev->device.name = dev->name;
> +	dev->device.bus = &auxiliary_bus.bus;
> +
> +	/* Get NUMA node, default to 0 if not present */
> +	snprintf(filename, sizeof(filename), "%s/%s/numa_node",
> +		 dirname, name);
> +	if (access(filename, F_OK) != -1) {
> +		if (eal_parse_sysfs_value(filename, &tmp) == 0)
> +			dev->device.numa_node = tmp;
> +		else
> +			dev->device.numa_node = -1;
> +	} else {
> +		dev->device.numa_node = 0;
> +	}
> +
> +	auxiliary_on_scan(dev);
> +
> +	/* Device is valid, add in list (sorted) */
> +	TAILQ_FOREACH(dev2, &auxiliary_bus.device_list, next) {
> +		ret = strcmp(dev->name, dev2->name);
> +		if (ret > 0)
> +			continue;
> +		if (ret < 0) {
> +			auxiliary_insert_device(dev2, dev);
> +		} else { /* already registered */
> +			if (rte_dev_is_probed(&dev2->device) &&
> +			    dev2->device.devargs != dev->device.devargs) {
> +				/* To probe device with new devargs. */
> +				rte_devargs_remove(dev2->device.devargs);
> +				auxiliary_on_scan(dev2);
> +			}
> +			free(dev);
> +		}
> +		return 0;
> +	}
> +	auxiliary_add_device(dev);
> +	return 0;
> +}
> +
> +/*
> + * Test whether the auxiliary device exist

Missing full stop above.

> + */
> +bool
> +auxiliary_dev_exists(const char *name)
> +{
> +	DIR *dir;
> +	char dirname[PATH_MAX];
> +
> +	snprintf(dirname, sizeof(dirname), "%s/%s",
> +		 AUXILIARY_SYSFS_PATH, name);
> +	dir = opendir(dirname);
> +	if (dir == NULL)
> +		return false;
> +	closedir(dir);
> +	return true;
> +}
> +
> +/*
> + * Scan the devices in the auxiliary bus

Missing full stop above.

> + */
> +int
> +auxiliary_scan(void)
> +{
> +	struct dirent *e;
> +	DIR *dir;
> +	char dirname[PATH_MAX];
> +	struct rte_auxiliary_driver *drv;
> +
> +	dir = opendir(AUXILIARY_SYSFS_PATH);
> +	if (dir == NULL) {
> +		AUXILIARY_LOG(INFO, "%s not found, is auxiliary module loaded?",
> +			      AUXILIARY_SYSFS_PATH);
> +		return 0;
> +	}
> +
> +	while ((e = readdir(dir)) != NULL) {
> +		if (e->d_name[0] == '.')
> +			continue;
> +
> +		if (auxiliary_is_ignored_device(e->d_name))
> +			continue;
> +
> +		snprintf(dirname, sizeof(dirname), "%s/%s",
> +			 AUXILIARY_SYSFS_PATH, e->d_name);
> +
> +		/* Ignore if no driver can handle. */
> +		FOREACH_DRIVER_ON_AUXILIARY_BUS(drv) {
> +			if (drv->match(e->d_name))
> +				break;
> +		}
> +		if (drv == NULL)
> +			continue;
> +
> +		if (auxiliary_scan_one(dirname, e->d_name) < 0)
> +			goto error;
> +	}
> +	closedir(dir);
> +	return 0;
> +
> +error:
> +	closedir(dir);
> +	return -1;
> +}
> diff --git a/drivers/bus/auxiliary/meson.build b/drivers/bus/auxiliary/meson.build
> new file mode 100644
> index 0000000000..357550eff7
> --- /dev/null
> +++ b/drivers/bus/auxiliary/meson.build
> @@ -0,0 +1,16 @@
> +# SPDX-License-Identifier: BSD-3-Clause
> +# Copyright (c) 2021 NVIDIA Corporation & Affiliates
> +
> +headers = files(
> +        'rte_bus_auxiliary.h',
> +)
> +sources = files(
> +        'auxiliary_common.c',
> +        'auxiliary_params.c',
> +)
> +if is_linux
> +    sources += files(
> +        'linux/auxiliary.c',
> +    )
> +endif
> +deps += ['kvargs']
> diff --git a/drivers/bus/auxiliary/private.h b/drivers/bus/auxiliary/private.h
> new file mode 100644
> index 0000000000..cb3e849993
> --- /dev/null
> +++ b/drivers/bus/auxiliary/private.h
> @@ -0,0 +1,74 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> + */
> +
> +#ifndef AUXILIARY_PRIVATE_H

Maybe add a BUS_ prefix at least?

> +#define AUXILIARY_PRIVATE_H
> +
> +#include <stdbool.h>
> +#include <stdio.h>
> +
> +#include "rte_bus_auxiliary.h"
> +
> +extern struct rte_auxiliary_bus auxiliary_bus;
> +extern int auxiliary_bus_logtype;
> +
> +#define AUXILIARY_LOG(level, ...) \
> +	rte_log(RTE_LOG_ ## level, auxiliary_bus_logtype, \
> +		RTE_FMT("auxiliary bus: " RTE_FMT_HEAD(__VA_ARGS__,) "\n", \
> +			RTE_FMT_TAIL(__VA_ARGS__,)))
> +
> +/* Auxiliary bus iterators */
> +#define FOREACH_DEVICE_ON_AUXILIARY_BUS(p) \
> +		TAILQ_FOREACH(p, &(auxiliary_bus.device_list), next)
> +
> +#define FOREACH_DRIVER_ON_AUXILIARY_BUS(p) \
> +		TAILQ_FOREACH(p, &(auxiliary_bus.driver_list), next)
> +
> +bool auxiliary_dev_exists(const char *name);
> +
> +/*
> + * Scan the content of the auxiliary bus, and the devices in the devices
> + * list.
> + */
> +int auxiliary_scan(void);
> +
> +/*
> + * Update a device being scanned.
> + */
> +void auxiliary_on_scan(struct rte_auxiliary_device *aux_dev);
> +
> +/*
> + * Validate whether a device with given auxiliary device should be ignored
> + * or not.
> + */
> +bool auxiliary_is_ignored_device(const char *name);
> +
> +/*
> + * Add an auxiliary device to the auxiliary bus (append to auxiliary device
> + * list). This function also updates the bus references of the auxiliary
> + * device and the generic device object embedded within.
> + */
> +void auxiliary_add_device(struct rte_auxiliary_device *aux_dev);
> +
> +/*
> + * Insert an auxiliary device in the auxiliary bus at a particular location
> + * in the device list. It also updates the auxiliary bus reference of the
> + * new devices to be inserted.
> + */
> +void auxiliary_insert_device(struct rte_auxiliary_device *exist_aux_dev,
> +			     struct rte_auxiliary_device *new_aux_dev);
> +
> +/*
> + * Match the auxiliary driver and device by driver function

Missing full stop.

> + */
> +bool auxiliary_match(const struct rte_auxiliary_driver *aux_drv,
> +		     const struct rte_auxiliary_device *aux_dev);
> +
> +/*
> + * Iterate over devices, matching any device against the provided string

Missing full stop.

> + */
> +void *auxiliary_dev_iterate(const void *start, const char *str,
> +			    const struct rte_dev_iterator *it);
> +
> +#endif /* AUXILIARY_PRIVATE_H */
> diff --git a/drivers/bus/auxiliary/rte_bus_auxiliary.h b/drivers/bus/auxiliary/rte_bus_auxiliary.h
> new file mode 100644
> index 0000000000..16b147e387
> --- /dev/null
> +++ b/drivers/bus/auxiliary/rte_bus_auxiliary.h
> @@ -0,0 +1,201 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright (c) 2021 NVIDIA Corporation & Affiliates
> + */
> +
> +#ifndef RTE_BUS_AUXILIARY_H
> +#define RTE_BUS_AUXILIARY_H
> +
> +/**
> + * @file
> + *
> + * Auxiliary Bus Interface.
> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <stdio.h>
> +#include <stdlib.h>
> +#include <limits.h>
> +#include <errno.h>
> +#include <sys/queue.h>
> +#include <stdint.h>
> +#include <inttypes.h>
> +
> +#include <rte_debug.h>
> +#include <rte_interrupts.h>
> +#include <rte_dev.h>
> +#include <rte_bus.h>
> +#include <rte_kvargs.h>
> +
> +#define RTE_BUS_AUXILIARY_NAME "auxiliary"
> +
> +/* Forward declarations */
> +struct rte_auxiliary_driver;
> +struct rte_auxiliary_bus;
> +struct rte_auxiliary_device;
> +
> +/**
> + * Match function for the driver to decide if device can be handled.
> + *
> + * @param name
> + *   Pointer to the auxiliary device name.
> + * @return
> + *   Whether the driver can handle the auxiliary device.
> + */
> +typedef bool(rte_auxiliary_match_t)(const char *name);
> +
> +/**
> + * Initialization function for the driver called during auxiliary probing.
> + *
> + * @param drv
> + *   Pointer to the auxiliary driver.
> + * @param dev
> + *   Pointer to the auxiliary device.
> + * @return
> + *   - 0 On success.
> + *   - Negative value and rte_errno is set otherwise.
> + */
> +typedef int(rte_auxiliary_probe_t)(struct rte_auxiliary_driver *drv,
> +				    struct rte_auxiliary_device *dev);
> +
> +/**
> + * Uninitialization function for the driver called during hotplugging.
> + *
> + * @param dev
> + *   Pointer to the auxiliary device.
> + * @return
> + *   - 0 On success.
> + *   - Negative value and rte_errno is set otherwise.
> + */
> +typedef int (rte_auxiliary_remove_t)(struct rte_auxiliary_device *dev);
> +
> +/**
> + * Driver-specific DMA mapping. After a successful call the device
> + * will be able to read/write from/to this segment.
> + *
> + * @param dev
> + *   Pointer to the auxiliary device.
> + * @param addr
> + *   Starting virtual address of memory to be mapped.
> + * @param iova
> + *   Starting IOVA address of memory to be mapped.
> + * @param len
> + *   Length of memory segment being mapped.
> + * @return
> + *   - 0 On success.
> + *   - Negative value and rte_errno is set otherwise.
> + */
> +typedef int (rte_auxiliary_dma_map_t)(struct rte_auxiliary_device *dev,
> +				       void *addr, uint64_t iova, size_t len);
> +
> +/**
> + * Driver-specific DMA un-mapping. After a successful call the device
> + * will not be able to read/write from/to this segment.
> + *
> + * @param dev
> + *   Pointer to the auxiliary device.
> + * @param addr
> + *   Starting virtual address of memory to be unmapped.
> + * @param iova
> + *   Starting IOVA address of memory to be unmapped.
> + * @param len
> + *   Length of memory segment being unmapped.
> + * @return
> + *   - 0 On success.
> + *   - Negative value and rte_errno is set otherwise.
> + */
> +typedef int (rte_auxiliary_dma_unmap_t)(struct rte_auxiliary_device *dev,
> +					 void *addr, uint64_t iova, size_t len);
> +
> +/**
> + * A structure describing an auxiliary device.
> + */
> +struct rte_auxiliary_device {
> +	TAILQ_ENTRY(rte_auxiliary_device) next;   /**< Next probed device. */
> +	struct rte_device device;                 /**< Inherit core device */
> +	char name[RTE_DEV_NAME_MAX_LEN + 1];      /**< ASCII device name */
> +	struct rte_intr_handle intr_handle;       /**< Interrupt handle */
> +	struct rte_auxiliary_driver *driver;      /**< Device driver */
> +};
> +
> +/** List of auxiliary devices */
> +TAILQ_HEAD(rte_auxiliary_device_list, rte_auxiliary_device);
> +/** List of auxiliary drivers */
> +TAILQ_HEAD(rte_auxiliary_driver_list, rte_auxiliary_driver);

Shouldn't we hide rte_auxiliary_device inside the library, taking
API/ABI stability into account? Or will it be DPDK-internal anyway? If
so, it should be marked INTERNAL from the very beginning.

> +
> +/**
> + * Structure describing the auxiliary bus
> + */
> +struct rte_auxiliary_bus {
> +	struct rte_bus bus;                  /**< Inherit the generic class */
> +	struct rte_auxiliary_device_list device_list;  /**< List of devices */
> +	struct rte_auxiliary_driver_list driver_list;  /**< List of drivers */
> +};

It looks internal. The following forward declaration should be
sufficient to build.

struct rte_auxiliary_bus;
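
A minimal, self-contained sketch of the opaque-type pattern this relies on.
All demo_* names are hypothetical and are not part of the patch; the point
is only that a structure holding a pointer to the bus builds against a
forward declaration, so the full definition can stay private to the bus
implementation.

```c
#include <assert.h>
#include <stddef.h>

/* "Public header" part: drivers only see the forward declaration. */
struct demo_bus;                    /* hypothetical opaque bus type */

struct demo_driver {
	struct demo_bus *bus;       /* pointer to an incomplete type is legal */
	const char *name;
};

void demo_register(struct demo_driver *drv);
size_t demo_driver_count(void);

/* "Private" part: the full definition lives in the bus code only. */
struct demo_bus {
	size_t nb_drivers;
};

static struct demo_bus the_bus;

/* Attach the driver to the single bus instance and count it. */
void
demo_register(struct demo_driver *drv)
{
	assert(drv != NULL);
	drv->bus = &the_bus;
	the_bus.nb_drivers++;
}

/* Accessor, so callers never need the layout of struct demo_bus. */
size_t
demo_driver_count(void)
{
	return the_bus.nb_drivers;
}
```

With this split, the layout of the bus structure can change without touching
anything a driver was compiled against, which is the API/ABI concern raised
above.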


> +
> +/**
> + * A structure describing an auxiliary driver.
> + */
> +struct rte_auxiliary_driver {
> +	TAILQ_ENTRY(rte_auxiliary_driver) next; /**< Next in list. */
> +	struct rte_driver driver;             /**< Inherit core driver. */
> +	struct rte_auxiliary_bus *bus;        /**< Auxiliary bus reference. */
> +	rte_auxiliary_match_t *match;         /**< Device match function. */
> +	rte_auxiliary_probe_t *probe;         /**< Device probe function. */
> +	rte_auxiliary_remove_t *remove;       /**< Device remove function. */
> +	rte_auxiliary_dma_map_t *dma_map;     /**< Device DMA map function. */
> +	rte_auxiliary_dma_unmap_t *dma_unmap; /**< Device DMA unmap function. */
> +	uint32_t drv_flags;                   /**< Flags RTE_AUXILIARY_DRV_*. */
> +};
> +
> +/**
> + * @internal
> + * Helper macro for drivers that need to convert to struct rte_auxiliary_device.
> + */
> +#define RTE_DEV_TO_AUXILIARY(ptr) \
> +	container_of(ptr, struct rte_auxiliary_device, device)
> +
> +#define RTE_DEV_TO_AUXILIARY_CONST(ptr) \
> +	container_of(ptr, const struct rte_auxiliary_device, device)
> +
> +#define RTE_ETH_DEV_TO_AUXILIARY(eth_dev) \
> +	RTE_DEV_TO_AUXILIARY((eth_dev)->device)
> +
> +/** Device driver needs IOVA as VA and cannot work with IOVA as PA */
> +#define RTE_AUXILIARY_DRV_NEED_IOVA_AS_VA 0x002
> +
> +/**

Don't we need an EXPERIMENTAL notice here?

> + * Register an auxiliary driver.
> + *
> + * @param driver
> + *   A pointer to a rte_auxiliary_driver structure describing the driver
> + *   to be registered.
> + */
> +__rte_experimental
> +void rte_auxiliary_register(struct rte_auxiliary_driver *driver);
> +
> +/** Helper for auxiliary device registration from driver instance */
> +#define RTE_PMD_REGISTER_AUXILIARY(nm, auxiliary_drv) \
> +	RTE_INIT(auxiliaryinitfn_ ##nm) \
> +	{ \
> +		(auxiliary_drv).driver.name = RTE_STR(nm); \
> +		rte_auxiliary_register(&(auxiliary_drv)); \
> +	} \
> +	RTE_PMD_EXPORT_NAME(nm, __COUNTER__)
> +
> +/**

Don't we need an EXPERIMENTAL notice here?

> + * Unregister an auxiliary driver.
> + *
> + * @param driver
> + *   A pointer to a rte_auxiliary_driver structure describing the driver
> + *   to be unregistered.
> + */
> +__rte_experimental
> +void rte_auxiliary_unregister(struct rte_auxiliary_driver *driver);
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* RTE_BUS_AUXILIARY_H */
> diff --git a/drivers/bus/auxiliary/version.map b/drivers/bus/auxiliary/version.map
> new file mode 100644
> index 0000000000..a52260657c
> --- /dev/null
> +++ b/drivers/bus/auxiliary/version.map
> @@ -0,0 +1,7 @@
> +EXPERIMENTAL {
> +	global:
> +
> +	# added in 21.08
> +	rte_auxiliary_register;
> +	rte_auxiliary_unregister;
> +};
> diff --git a/drivers/bus/meson.build b/drivers/bus/meson.build
> index 410058de3a..45eab5233d 100644
> --- a/drivers/bus/meson.build
> +++ b/drivers/bus/meson.build
> @@ -2,6 +2,7 @@
>  # Copyright(c) 2017 Intel Corporation
>  
>  drivers = [
> +        'auxiliary',
>          'dpaa',
>          'fslmc',
>          'ifpga',
> 


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] dmadev: introduce DMA device library
  @ 2021-07-04  9:30  3% ` Jerin Jacob
  2021-07-05 10:52  0%   ` Bruce Richardson
  0 siblings, 1 reply; 200+ results
From: Jerin Jacob @ 2021-07-04  9:30 UTC (permalink / raw)
  To: Chengwen Feng
  Cc: Thomas Monjalon, Ferruh Yigit, Richardson, Bruce, Jerin Jacob,
	dpdk-dev, Morten Brørup, Nipun Gupta, Hemant Agrawal,
	Maxime Coquelin, Honnappa Nagarahalli, David Marchand,
	Satananda Burla, Prasun Kapoor, Ananyev, Konstantin, liangma,
	Radha Mohan Chintakuntla

On Fri, Jul 2, 2021 at 6:51 PM Chengwen Feng <fengchengwen@huawei.com> wrote:
>
> This patch introduces 'dmadevice' which is a generic type of DMA
> device.
>
> The APIs of dmadev library exposes some generic operations which can
> enable configuration and I/O with the DMA devices.
>
> Signed-off-by: Chengwen Feng <fengchengwen@huawei.com>

Thanks for v1.

I would suggest finalizing lib/dmadev/rte_dmadev.h before doing the
implementation, so that you don't need to waste time reworking the
implementation.

Comments inline.

> ---
>  MAINTAINERS                  |   4 +
>  config/rte_config.h          |   3 +
>  lib/dmadev/meson.build       |   6 +
>  lib/dmadev/rte_dmadev.c      | 438 +++++++++++++++++++++
>  lib/dmadev/rte_dmadev.h      | 919 +++++++++++++++++++++++++++++++++++++++++++
>  lib/dmadev/rte_dmadev_core.h |  98 +++++
>  lib/dmadev/rte_dmadev_pmd.h  | 210 ++++++++++
>  lib/dmadev/version.map       |  32 ++

The doxygen update is missing. See doc/api/doxy-api.conf.in
Use meson -Denable_docs=true to verify the generated doxygen doc.

>  lib/meson.build              |   1 +
>  9 files changed, 1711 insertions(+)
>  create mode 100644 lib/dmadev/meson.build
>  create mode 100644 lib/dmadev/rte_dmadev.c
>  create mode 100644 lib/dmadev/rte_dmadev.h
>  create mode 100644 lib/dmadev/rte_dmadev_core.h
>  create mode 100644 lib/dmadev/rte_dmadev_pmd.h
>  create mode 100644 lib/dmadev/version.map
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 4347555..2019783 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -496,6 +496,10 @@ F: drivers/raw/skeleton/
>  F: app/test/test_rawdev.c
>  F: doc/guides/prog_guide/rawdev.rst
>

Add EXPERIMENTAL

> +Dma device API
> +M: Chengwen Feng <fengchengwen@huawei.com>
> +F: lib/dmadev/
> +
>

> new file mode 100644
> index 0000000..a94e839
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev.c
> @@ -0,0 +1,438 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 HiSilicon Limited.
> + */
> +
> +#include <ctype.h>
> +#include <stdlib.h>
> +#include <string.h>
> +#include <stdint.h>
> +
> +#include <rte_log.h>
> +#include <rte_debug.h>
> +#include <rte_dev.h>
> +#include <rte_memory.h>
> +#include <rte_memzone.h>
> +#include <rte_malloc.h>
> +#include <rte_errno.h>
> +#include <rte_string_fns.h>

Sort in alphabetical order.

> +
> +#include "rte_dmadev.h"
> +#include "rte_dmadev_pmd.h"
> +
> +struct rte_dmadev rte_dmadevices[RTE_DMADEV_MAX_DEVS];

# Please check whether you have missed any multiprocess angle.
lib/regexdev/rte_regexdev.c is the latest device class implemented in DPDK;
please check the *rte_regexdev_shared_data scheme.


# Missing dynamic log for this library.


> diff --git a/lib/dmadev/rte_dmadev.h b/lib/dmadev/rte_dmadev.h
> new file mode 100644
> index 0000000..f74fc6a
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev.h
> @@ -0,0 +1,919 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 HiSilicon Limited.

It would be nice to add other companies' names who have contributed to
the specification.

> + */
> +
> +#ifndef _RTE_DMADEV_H_
> +#define _RTE_DMADEV_H_
> +
> +/**
> + * @file rte_dmadev.h
> + *
> + * RTE DMA (Direct Memory Access) device APIs.
> + *
> + * The generic DMA device diagram:
> + *
> + *            ------------     ------------
> + *            | HW-queue |     | HW-queue |
> + *            ------------     ------------
> + *                   \            /
> + *                    \          /
> + *                     \        /
> + *                  ----------------
> + *                  |dma-controller|
> + *                  ----------------
> + *
> + *   The DMA could have multiple HW-queues, each HW-queue could have multiple
> + *   capabilities, e.g. whether to support fill operation, supported DMA
> + *   transfter direction and etc.

Typo: transfter -> transfer.

> + *
> + * The DMA framework is built on the following abstraction model:
> + *
> + *     ------------    ------------
> + *     |virt-queue|    |virt-queue|
> + *     ------------    ------------
> + *            \           /
> + *             \         /
> + *              \       /
> + *            ------------     ------------
> + *            | HW-queue |     | HW-queue |
> + *            ------------     ------------
> + *                   \            /
> + *                    \          /
> + *                     \        /
> + *                     ----------
> + *                     | dmadev |
> + *                     ----------

Continuing the discussion with @Morten Brørup, I think we need to
finalize the model.

> + *   a) The DMA operation request must be submitted to the virt queue, virt
> + *      queues must be created based on HW queues, the DMA device could have
> + *      multiple HW queues.
> + *   b) The virt queues on the same HW-queue could represent different contexts,
> + *      e.g. user could create virt-queue-0 on HW-queue-0 for mem-to-mem
> + *      transfer scenario, and create virt-queue-1 on the same HW-queue for
> + *      mem-to-dev transfer scenario.
> + *   NOTE: user could also create multiple virt queues for mem-to-mem transfer
> + *         scenario as long as the corresponding driver supports.
> + *
> + * The control plane APIs include configure/queue_setup/queue_release/start/
> + * stop/reset/close, in order to start device work, the call sequence must be
> + * as follows:
> + *     - rte_dmadev_configure()
> + *     - rte_dmadev_queue_setup()
> + *     - rte_dmadev_start()

Please add the reconfigure behaviour etc. Please check the introduction in
lib/regexdev/rte_regexdev.h.
I have added similar text there, so you could reuse as much as possible.


> + * The dataplane APIs include two parts:
> + *   a) The first part is the submission of operation requests:
> + *        - rte_dmadev_copy()
> + *        - rte_dmadev_copy_sg() - scatter-gather form of copy
> + *        - rte_dmadev_fill()
> + *        - rte_dmadev_fill_sg() - scatter-gather form of fill
> + *        - rte_dmadev_fence()   - add a fence force ordering between operations
> + *        - rte_dmadev_perform() - issue doorbell to hardware
> + *      These APIs could work with different virt queues which have different
> + *      contexts.
> + *      The first four APIs are used to submit the operation request to the virt
> + *      queue, if the submission is successful, a cookie (as type
> + *      'dma_cookie_t') is returned, otherwise a negative number is returned.
> + *   b) The second part is to obtain the result of requests:
> + *        - rte_dmadev_completed()
> + *            - return the number of operation requests completed successfully.
> + *        - rte_dmadev_completed_fails()
> + *            - return the number of operation requests failed to complete.
> + *
> + * The misc APIs include info_get/queue_info_get/stats/xstats/selftest, provide
> + * information query and self-test capabilities.
> + *
> + * About the dataplane APIs MT-safe, there are two dimensions:
> + *   a) For one virt queue, the submit/completion API could be MT-safe,
> + *      e.g. one thread do submit operation, another thread do completion
> + *      operation.
> + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_VQ.
> + *      If driver don't support it, it's up to the application to guarantee
> + *      MT-safe.
> + *   b) For multiple virt queues on the same HW queue, e.g. one thread do
> + *      operation on virt-queue-0, another thread do operation on virt-queue-1.
> + *      If driver support it, then declare RTE_DMA_DEV_CAPA_MT_MVQ.
> + *      If driver don't support it, it's up to the application to guarantee
> + *      MT-safe.

From an application PoV, this may make it hard to write portable
applications. Please check the
latest thread with @Morten Brørup.

> + */
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <rte_common.h>
> +#include <rte_memory.h>
> +#include <rte_errno.h>
> +#include <rte_compat.h>

Sort in alphabetical order.

> +
> +/**
> + * dma_cookie_t - an opaque DMA cookie

Since we are defining the behaviour, it is not opaque any more.
I think it is better to call it ring_idx or so.


> +#define RTE_DMA_DEV_CAPA_MT_MVQ (1ull << 11) /**< Support MT-safe of multiple virt queues */

Please add plenty of @see references for every symbol, pointing to where
it is used, so that one can understand the full scope of the
symbols. See the example below.

#define RTE_REGEXDEV_CAPA_RUNTIME_COMPILATION_F (1ULL << 0)
/**< RegEx device does support compiling the rules at runtime unlike
 * loading only the pre-built rule database using
 * struct rte_regexdev_config::rule_db in rte_regexdev_configure()
 *
 * @see struct rte_regexdev_config::rule_db, rte_regexdev_configure()
 * @see struct rte_regexdev_info::regexdev_capa
 */

> + *
> + * If dma_cookie_t is >=0 it's a DMA operation request cookie, <0 it's a error
> + * code.
> + * When using cookies, comply with the following rules:
> + * a) Cookies for each virtual queue are independent.
> + * b) For a virt queue, the cookie are monotonically incremented, when it reach
> + *    the INT_MAX, it wraps back to zero.
> + * c) The initial cookie of a virt queue is zero, after the device is stopped or
> + *    reset, the virt queue's cookie needs to be reset to zero.
> + * Example:
> + *    step-1: start one dmadev
> + *    step-2: enqueue a copy operation, the cookie return is 0
> + *    step-3: enqueue a copy operation again, the cookie return is 1
> + *    ...
> + *    step-101: stop the dmadev
> + *    step-102: start the dmadev
> + *    step-103: enqueue a copy operation, the cookie return is 0
> + *    ...
> + */

Good explanation.
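
The cookie rules quoted above can be sketched as follows. This is only an
illustration under stated assumptions, not code from the patch: struct
vq_state, vq_reset() and vq_next_cookie() are hypothetical names, and
INT32_MAX is used because dma_cookie_t is an int32_t (it equals INT_MAX
where int is 32 bits wide).

```c
#include <stdint.h>

typedef int32_t dma_cookie_t;

/* Hypothetical per-virt-queue state; not part of the patch. */
struct vq_state {
	dma_cookie_t next; /* cookie to hand out on the next enqueue */
};

/* Rule c): the first cookie after start, stop or reset is zero. */
static inline void
vq_reset(struct vq_state *vq)
{
	vq->next = 0;
}

/* Rule b): cookies increase by one and wrap from the maximum back to
 * zero, so a valid cookie is never negative and cannot be confused
 * with an error code.
 */
static inline dma_cookie_t
vq_next_cookie(struct vq_state *vq)
{
	dma_cookie_t c = vq->next;

	vq->next = (c == INT32_MAX) ? 0 : c + 1;
	return c;
}
```

Rule a) (independent cookies per virt queue) follows from keeping one such
state object per virt queue.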

> +typedef int32_t dma_cookie_t;


> +
> +/**
> + * dma_scatterlist - can hold scatter DMA operation request
> + */
> +struct dma_scatterlist {

I prefer to change scatterlist -> sg
i.e rte_dma_sg

> +       void *src;
> +       void *dst;
> +       uint32_t length;
> +};
> +

> +
> +/**
> + * A structure used to retrieve the contextual information of
> + * an DMA device
> + */
> +struct rte_dmadev_info {
> +       /**
> +        * Fields filled by framewok

typo: "framewok" should be "framework".

> +        */
> +       struct rte_device *device; /**< Generic Device information */
> +       const char *driver_name; /**< Device driver name */
> +       int socket_id; /**< Socket ID where memory is allocated */
> +
> +       /**
> +        * Specification fields filled by driver
> +        */
> +       uint64_t dev_capa; /**< Device capabilities (RTE_DMA_DEV_CAPA_) */
> +       uint16_t max_hw_queues; /**< Maximum number of HW queues. */
> +       uint16_t max_vqs_per_hw_queue;
> +       /**< Maximum number of virt queues to allocate per HW queue */
> +       uint16_t max_desc;
> +       /**< Maximum allowed number of virt queue descriptors */
> +       uint16_t min_desc;
> +       /**< Minimum allowed number of virt queue descriptors */

Please add max_nb_segs. i.e maximum number of segments supported.

> +
> +       /**
> +        * Status fields filled by driver
> +        */
> +       uint16_t nb_hw_queues; /**< Number of HW queues configured */
> +       uint16_t nb_vqs; /**< Number of virt queues configured */
> +};
> +
> +
> +/**
> + * dma_address_type
> + */
> +enum dma_address_type {
> +       DMA_ADDRESS_TYPE_IOVA, /**< Use IOVA as dma address */
> +       DMA_ADDRESS_TYPE_VA, /**< Use VA as dma address */
> +};
> +
> +/**
> + * A structure used to configure a DMA device.
> + */
> +struct rte_dmadev_conf {
> +       enum dma_address_type addr_type; /**< Address type to used */

I think there are three kinds of limitations/capabilities.

When the system is configured as IOVA as VA:
1) Device supports any VA address, e.g. memory from rte_malloc(),
rte_memzone(), malloc or the stack.
2) Device supports only VA addresses from rte_malloc(), rte_memzone(), i.e.
memory backed by hugepages and added to the DMA map.

When the system is configured as IOVA as PA:
1) Device supports only PA addresses.

IMO, the above needs to be advertised as a capability, and the application
needs to align with that.
I don't think the application should request the driver to work in any of
the modes.
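
The three cases can be sketched as capability bits that the application
checks before choosing a buffer. This is a minimal sketch only: the
DMA_CAPA_ADDR_* names, enum buf_origin and app_can_pass_va() are
hypothetical and do not come from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical capability bits; names and values are illustrative only. */
#define DMA_CAPA_ADDR_ANY_VA    (1ULL << 0) /* any VA: malloc, stack, ... */
#define DMA_CAPA_ADDR_MAPPED_VA (1ULL << 1) /* only hugepage-backed, DMA-mapped VA */
#define DMA_CAPA_ADDR_PA        (1ULL << 2) /* only physical addresses */

/* Where the application buffer came from (hypothetical classification). */
enum buf_origin {
	BUF_HEAP_OR_STACK, /* malloc() or stack memory */
	BUF_RTE_MALLOC,    /* rte_malloc()/memzone memory */
};

/* Can the application hand a plain virtual address to the device? */
static inline bool
app_can_pass_va(uint64_t dev_capa, enum buf_origin origin)
{
	if (dev_capa & DMA_CAPA_ADDR_ANY_VA)
		return true;
	if (dev_capa & DMA_CAPA_ADDR_MAPPED_VA)
		return origin == BUF_RTE_MALLOC;
	/* PA-only device: the application must translate to an IOVA first. */
	return false;
}
```

The point of the sketch is the direction of the contract: the driver
advertises, and the application adapts.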



> +       uint16_t nb_hw_queues; /**< Number of HW-queues enable to use */
> +       uint16_t max_vqs; /**< Maximum number of virt queues to use */

You need to document what the maximum allowed value is, i.e. that it is
based on info_get(), and mention the corresponding field
in the info structure.


> +
> +/**
> + * dma_transfer_direction
> + */
> +enum dma_transfer_direction {

rte_dma_transfer_direction

> +       DMA_MEM_TO_MEM,
> +       DMA_MEM_TO_DEV,
> +       DMA_DEV_TO_MEM,
> +       DMA_DEV_TO_DEV,
> +};
> +
> +/**
> + * A structure used to configure a DMA virt queue.
> + */
> +struct rte_dmadev_queue_conf {
> +       enum dma_transfer_direction direction;


> +       /**< Associated transfer direction */
> +       uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> +       uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> +       uint64_t dev_flags; /**< Device specific flags */

What is the use of this? It needs more comments.
Since it is in the slow path, we can have non-opaque names here based on
each driver's capability.


> +       void *dev_ctx; /**< Device specific context */

What is the use of this? It needs more comments.


Please add a good amount of reserved bits and have an API to init this
structure for future ABI stability, say rte_dmadev_queue_config_init()
or so.
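
A minimal sketch of that pattern, assuming a simplified structure: the
type queue_conf_sketch, its reserved members and queue_conf_sketch_init()
are hypothetical, and only hw_queue_id and nb_desc are taken from the
quoted patch.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical layout; the reserved members are an assumption. */
struct queue_conf_sketch {
	uint16_t hw_queue_id;     /* HW queue on which to create the virt queue */
	uint16_t nb_desc;         /* number of descriptors */
	uint32_t direction;       /* transfer direction */
	uint64_t reserved_64s[4]; /* must be zero; room for future fields */
	void *reserved_ptrs[2];   /* must be NULL; room for future pointers */
};

/* Init helper: applications call this before filling in fields, so any
 * member added later in the reserved space starts from a known value.
 */
static inline void
queue_conf_sketch_init(struct queue_conf_sketch *conf)
{
	memset(conf, 0, sizeof(*conf));
}
```

Because the structure size stays fixed and unused space is always zeroed,
a later release can give meaning to a reserved member without breaking
applications built against the older layout.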


> +
> +/**
> + * A structure used to retrieve information of a DMA virt queue.
> + */
> +struct rte_dmadev_queue_info {
> +       enum dma_transfer_direction direction;

A queue may support all directions so I think it should be a bitfield.
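
For illustration, the directions could be expressed as bit flags so that
one queue can report several of them. The DMA_DIR_* names and
queue_supports() below are hypothetical; the patch itself defines a plain
enum.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag names; one bit per transfer direction. */
#define DMA_DIR_MEM_TO_MEM (1u << 0)
#define DMA_DIR_MEM_TO_DEV (1u << 1)
#define DMA_DIR_DEV_TO_MEM (1u << 2)
#define DMA_DIR_DEV_TO_DEV (1u << 3)

/* True if every direction in 'wanted' is present in 'supported'. */
static inline bool
queue_supports(uint32_t supported, uint32_t wanted)
{
	return (supported & wanted) == wanted;
}
```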

> +       /**< Associated transfer direction */
> +       uint16_t hw_queue_id; /**< The HW queue on which to create virt queue */
> +       uint16_t nb_desc; /**< Number of descriptor for this virt queue */
> +       uint64_t dev_flags; /**< Device specific flags */
> +};
> +

> +__rte_experimental
> +static inline dma_cookie_t
> +rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id,
> +                  const struct dma_scatterlist *sg,
> +                  uint32_t sg_len, uint64_t flags)

I would like to change this to:
rte_dmadev_copy_sg(uint16_t dev_id, uint16_t vq_id, const struct
rte_dma_sg *src, uint32_t nb_src,
const struct rte_dma_sg *dst, uint32_t nb_dst) or so, to allow use cases
where, for example, a 30 MB source is split and written as 30 x 1 MB
destinations.
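
With separate source and destination lists, the two sides no longer pair
up one to one, so a natural sanity check is that both lists describe the
same number of bytes. A small sketch under assumptions: struct
dma_sg_sketch, sg_total_len() and sg_lists_match() are hypothetical
helpers, not part of the patch or of the proposed signature.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical one-sided segment: an address plus a length. */
struct dma_sg_sketch {
	void *addr;
	uint32_t length;
};

/* Sum the segment lengths; 64-bit so many 32-bit lengths cannot overflow. */
static inline uint64_t
sg_total_len(const struct dma_sg_sketch *sg, uint32_t nb)
{
	uint64_t total = 0;
	uint32_t i;

	for (i = 0; i < nb; i++)
		total += sg[i].length;
	return total;
}

/* A copy is only well defined if both lists cover the same byte count. */
static inline bool
sg_lists_match(const struct dma_sg_sketch *src, uint32_t nb_src,
	       const struct dma_sg_sketch *dst, uint32_t nb_dst)
{
	return sg_total_len(src, nb_src) == sg_total_len(dst, nb_dst);
}
```

In the 30 MB example, nb_src is 1 and nb_dst is 30, and the check passes
because both sides add up to 30 MB.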



> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +       return (*dev->copy_sg)(dev, vq_id, sg, sg_len, flags);
> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Enqueue a fill operation onto the DMA virt queue
> + *
> + * This queues up a fill operation to be performed by hardware, but does not
> + * trigger hardware to begin that operation.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + * @param pattern
> + *   The pattern to populate the destination buffer with.
> + * @param dst
> + *   The address of the destination buffer.
> + * @param length
> + *   The length of the destination buffer.
> + * @param flags
> + *   An opaque flags for this operation.

PLEASE REMOVE the opaque stuff from the fast path. It will be a pain for
application writers, as
they need to write multiple combinations of the fast path. Flags are OK if
we have a valid
generic flag now to control the transfer behavior.


> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Add a fence to force ordering between operations
> + *
> + * This adds a fence to a sequence of operations to enforce ordering, such that
> + * all operations enqueued before the fence must be completed before operations
> + * after the fence.
> + * NOTE: Since this fence may be added as a flag to the last operation enqueued,
> + * this API may not function correctly when called immediately after an
> + * "rte_dmadev_perform" call i.e. before any new operations are enqueued.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + *
> + * @return
> + *   - =0: Successful add fence.
> + *   - <0: Failure to add fence.
> + *
> + * NOTE: The caller must ensure that the input parameter is valid and the
> + *       corresponding device supports the operation.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_fence(uint16_t dev_id, uint16_t vq_id)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +       return (*dev->fence)(dev, vq_id);
> +}

Since HW submission is in a queue (FIFO), the ordering is always
maintained, right?
Could you share more details and the use case of fence() from the
driver/application PoV?


> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Trigger hardware to begin performing enqueued operations
> + *
> + * This API is used to write the "doorbell" to the hardware to trigger it
> + * to begin the operations previously enqueued by rte_dmadev_copy/fill()
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + *
> + * @return
> + *   - =0: Successful trigger hardware.
> + *   - <0: Failure to trigger hardware.
> + *
> + * NOTE: The caller must ensure that the input parameter is valid and the
> + *       corresponding device supports the operation.
> + */
> +__rte_experimental
> +static inline int
> +rte_dmadev_perform(uint16_t dev_id, uint16_t vq_id)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +       return (*dev->perform)(dev, vq_id);
> +}

Since this scheme adds function call overhead in all
applications, I would like to understand
the benefit of doing it this way versus having enqueue ring the doorbell
implicitly, from the driver/application PoV.
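
To make the two schemes concrete, here is a mock that only counts doorbell
writes. It touches no hardware and is not taken from the patch; struct
mock_vq and the mock_* functions are hypothetical. Whether fewer doorbell
writes outweighs the extra call is exactly the open question, so treat
this as a way to frame the comparison rather than an answer.

```c
#include <stdint.h>

/* Mock queue that only counts events; no real hardware is touched. */
struct mock_vq {
	uint32_t pending;   /* descriptors written but not yet announced */
	uint32_t doorbells; /* number of doorbell writes issued */
};

/* Explicit scheme, step 1: write a descriptor, do not notify hardware. */
static inline void
mock_enqueue(struct mock_vq *vq)
{
	vq->pending++;
}

/* Explicit scheme, step 2: one doorbell announces all pending work. */
static inline void
mock_perform(struct mock_vq *vq)
{
	if (vq->pending == 0)
		return;
	vq->pending = 0;
	vq->doorbells++;
}

/* Implicit scheme: every enqueue rings the doorbell itself. */
static inline void
mock_enqueue_implicit(struct mock_vq *vq)
{
	mock_enqueue(vq);
	mock_perform(vq);
}
```

For a burst of 32 operations, the explicit scheme issues one doorbell
write and the implicit scheme issues 32.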


> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of operations that have been successful completed.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + * @param nb_cpls
> + *   The maximum number of completed operations that can be processed.
> + * @param[out] cookie
> + *   The last completed operation's cookie.
> + * @param[out] has_error
> + *   Indicates if there are transfer error.
> + *
> + * @return
> + *   The number of operations that successful completed.

successfully

> + *
> + * NOTE: The caller must ensure that the input parameter is valid and the
> + *       corresponding device supports the operation.
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_dmadev_completed(uint16_t dev_id, uint16_t vq_id, const uint16_t nb_cpls,
> +                    dma_cookie_t *cookie, bool *has_error)
> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +       has_error = false;
> +       return (*dev->completed)(dev, vq_id, nb_cpls, cookie, has_error);

It may be better to have cookie/ring_idx as third argument.

> +}
> +
> +/**
> + * @warning
> + * @b EXPERIMENTAL: this API may change without prior notice.
> + *
> + * Returns the number of operations that failed to complete.
> + * NOTE: This API was used when rte_dmadev_completed has_error was set.
> + *
> + * @param dev_id
> + *   The identifier of the device.
> + * @param vq_id
> + *   The identifier of virt queue.
> + * @param nb_status
> + *   Indicates the size  of status array.
> + * @param[out] status
> + *   The error code of operations that failed to complete.
> + * @param[out] cookie
> + *   The last failed completed operation's cookie.
> + *
> + * @return
> + *   The number of operations that failed to complete.
> + *
> + * NOTE: The caller must ensure that the input parameter is valid and the
> + *       corresponding device supports the operation.
> + */
> +__rte_experimental
> +static inline uint16_t
> +rte_dmadev_completed_fails(uint16_t dev_id, uint16_t vq_id,
> +                          const uint16_t nb_status, uint32_t *status,
> +                          dma_cookie_t *cookie)

IMO, it is better to move cookie/ring_idx to the third position.
Why would it return an array of errors, since it is called after
rte_dmadev_completed() has set
has_error? Would it be better to change it to

rte_dmadev_error_status(uint16_t dev_id, uint16_t vq_id, dma_cookie_t
*cookie, uint32_t *status)

I also think we may need to define status as a bitmask, enumerate all
the combinations of error codes
of all the drivers, and return a string from the driver, like the existing
rte_flow_error.

See
struct rte_flow_error {
        enum rte_flow_error_type type; /**< Cause field and error types. */
        const void *cause; /**< Object responsible for the error. */
        const char *message; /**< Human-readable error message. */
};
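
A rough sketch of a status bitmask with a string lookup, in the spirit of
the message field above. The DMA_STATUS_* bits and dma_status_str() are
hypothetical; a real list would have to be collected from the drivers.

```c
#include <stdint.h>

/* Hypothetical status bits; real drivers would define their own set. */
#define DMA_STATUS_INVALID_SRC_ADDR (1u << 0)
#define DMA_STATUS_INVALID_DST_ADDR (1u << 1)
#define DMA_STATUS_INVALID_LENGTH   (1u << 2)
#define DMA_STATUS_BUS_ERROR        (1u << 3)

/* Return a human-readable message for the lowest set error bit. */
static inline const char *
dma_status_str(uint32_t status)
{
	if (status == 0)
		return "no error";
	if (status & DMA_STATUS_INVALID_SRC_ADDR)
		return "invalid source address";
	if (status & DMA_STATUS_INVALID_DST_ADDR)
		return "invalid destination address";
	if (status & DMA_STATUS_INVALID_LENGTH)
		return "invalid length";
	if (status & DMA_STATUS_BUS_ERROR)
		return "bus error";
	return "unknown error";
}
```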

> +{
> +       struct rte_dmadev *dev = &rte_dmadevices[dev_id];
> +       return (*dev->completed_fails)(dev, vq_id, nb_status, status, cookie);
> +}
> +
> +struct rte_dmadev_stats {
> +       uint64_t enqueue_fail_count;
> +       /**< Conut of all operations which failed enqueued */
> +       uint64_t enqueued_count;
> +       /**< Count of all operations which successful enqueued */
> +       uint64_t completed_fail_count;
> +       /**< Count of all operations which failed to complete */
> +       uint64_t completed_count;
> +       /**< Count of all operations which successful complete */
> +};

We need a capability API to tell which items are
updated/supported by the driver.


> diff --git a/lib/dmadev/rte_dmadev_core.h b/lib/dmadev/rte_dmadev_core.h
> new file mode 100644
> index 0000000..a3afea2
> --- /dev/null
> +++ b/lib/dmadev/rte_dmadev_core.h
> @@ -0,0 +1,98 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright 2021 HiSilicon Limited.
> + */
> +
> +#ifndef _RTE_DMADEV_CORE_H_
> +#define _RTE_DMADEV_CORE_H_
> +
> +/**
> + * @file
> + *
> + * RTE DMA Device internal header.
> + *
> + * This header contains internal data types. But they are still part of the
> + * public API because they are used by inline public functions.
> + */
> +
> +struct rte_dmadev;
> +
> +typedef dma_cookie_t (*dmadev_copy_t)(struct rte_dmadev *dev, uint16_t vq_id,
> +                                     void *src, void *dst,
> +                                     uint32_t length, uint64_t flags);
> +/**< @internal Function used to enqueue a copy operation. */

To avoid namespace conflicts (as it is a public API), use the rte_ prefix.


> +
> +/**
> + * The data structure associated with each DMA device.
> + */
> +struct rte_dmadev {
> +       /**< Enqueue a copy operation onto the DMA device. */
> +       dmadev_copy_t copy;
> +       /**< Enqueue a scatter list copy operation onto the DMA device. */
> +       dmadev_copy_sg_t copy_sg;
> +       /**< Enqueue a fill operation onto the DMA device. */
> +       dmadev_fill_t fill;
> +       /**< Enqueue a scatter list fill operation onto the DMA device. */
> +       dmadev_fill_sg_t fill_sg;
> +       /**< Add a fence to force ordering between operations. */
> +       dmadev_fence_t fence;
> +       /**< Trigger hardware to begin performing enqueued operations. */
> +       dmadev_perform_t perform;
> +       /**< Returns the number of operations that successful completed. */
> +       dmadev_completed_t completed;
> +       /**< Returns the number of operations that failed to complete. */
> +       dmadev_completed_fails_t completed_fails;

We need to limit the fast-path items to one cache line (CL).
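
One way to enforce this at build time is a compile-time size check. The
sketch below assumes a 64-byte cache line (some platforms use 128) and
uses a single hypothetical function pointer type, fastpath_fn_t, in place
of the eight typed pointers from the patch; struct
dmadev_fastpath_sketch is likewise hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumption: 64-byte cache line */

/* Placeholder signature standing in for the typed fast-path callbacks. */
typedef int (*fastpath_fn_t)(void *dev, uint16_t vq_id);

/* Only the pointers used on the data path, grouped at the start. */
struct dmadev_fastpath_sketch {
	fastpath_fn_t copy;
	fastpath_fn_t copy_sg;
	fastpath_fn_t fill;
	fastpath_fn_t fill_sg;
	fastpath_fn_t fence;
	fastpath_fn_t perform;
	fastpath_fn_t completed;
	fastpath_fn_t completed_fails;
};

/* Fails the build if a new fast-path member spills into a second line. */
static_assert(sizeof(struct dmadev_fastpath_sketch) <= CACHE_LINE_SIZE,
	      "fast-path pointers must fit in one cache line");
```

With 8-byte pointers the eight callbacks fill the line exactly, so
anything else on the fast path (for example the private data pointer)
would need one of them to move out.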

> +
> +       void *dev_private; /**< PMD-specific private data */
> +       const struct rte_dmadev_ops *dev_ops; /**< Functions exported by PMD */
> +
> +       uint16_t dev_id; /**< Device ID for this instance */
> +       int socket_id; /**< Socket ID where memory is allocated */
> +       struct rte_device *device;
> +       /**< Device info. supplied during device initialization */
> +       const char *driver_name; /**< Driver info. supplied by probing */
> +       char name[RTE_DMADEV_NAME_MAX_LEN]; /**< Device name */
> +
> +       RTE_STD_C11
> +       uint8_t attached : 1; /**< Flag indicating the device is attached */
> +       uint8_t started : 1; /**< Device state: STARTED(1)/STOPPED(0) */

Add a couple of reserved fields for future ABI stability.

> +
> +} __rte_cache_aligned;
> +
> +extern struct rte_dmadev rte_dmadevices[];
> +


* Re: [dpdk-dev] [PATCH v3 19/20] net/sfc: support flow action COUNT in transfer rules
  2021-07-02 13:37  3%               ` David Marchand
@ 2021-07-02 13:39  0%                 ` Andrew Rybchenko
  0 siblings, 0 replies; 200+ results
From: Andrew Rybchenko @ 2021-07-02 13:39 UTC (permalink / raw)
  To: David Marchand
  Cc: Bruce Richardson, Thomas Monjalon, dev, Igor Romanov,
	Andy Moreton, Ivan Malov

On 7/2/21 4:37 PM, David Marchand wrote:
> On Fri, Jul 2, 2021 at 10:43 AM Andrew Rybchenko
> <andrew.rybchenko@oktetlabs.ru> wrote:
>> I've send v4 with the problem fixed. However, I'm afraid
>> build test systems should be updated to have libatomic
>> correctly installed. Otherwise, they do not really check
>> net/sfc build.
> 
> CI systems must be updated if they check ABI.
> And in general, we want them to continue testing net/sfc.
> I sent a mail to ask for this.

Many thanks, David



* Re: [dpdk-dev] [PATCH v3 19/20] net/sfc: support flow action COUNT in transfer rules
  @ 2021-07-02 13:37  3%               ` David Marchand
  2021-07-02 13:39  0%                 ` Andrew Rybchenko
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2021-07-02 13:37 UTC (permalink / raw)
  To: Andrew Rybchenko
  Cc: Bruce Richardson, Thomas Monjalon, dev, Igor Romanov,
	Andy Moreton, Ivan Malov

On Fri, Jul 2, 2021 at 10:43 AM Andrew Rybchenko
<andrew.rybchenko@oktetlabs.ru> wrote:
> I've send v4 with the problem fixed. However, I'm afraid
> build test systems should be updated to have libatomic
> correctly installed. Otherwise, they do not really check
> net/sfc build.

CI systems must be updated if they check ABI.
And in general, we want them to continue testing net/sfc.
I sent a mail to ask for this.


-- 
David Marchand



* Re: [dpdk-dev] [dpdk-techboard] ABI/API stability towards drivers
  2021-07-02  8:00  8% [dpdk-dev] ABI/API stability towards drivers Morten Brørup
  2021-07-02  9:45  7% ` [dpdk-dev] [dpdk-techboard] " Ferruh Yigit
@ 2021-07-02 12:26  4% ` Thomas Monjalon
  2021-07-07 18:46  8% ` [dpdk-dev] " Tyler Retzlaff
  2 siblings, 0 replies; 200+ results
From: Thomas Monjalon @ 2021-07-02 12:26 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dpdk-techboard, dpdk-dev

02/07/2021 10:00, Morten Brørup:
> Regarding the ongoing ABI stability project, it is suggested to export driver interfaces as internal.
> 
> What are we targeting regarding ABI and API stability towards drivers?

No stability for driver interface.
It is recommended to make drivers internal.
If a driver is kept external to DPDK, there is a maintenance cost.




* Re: [dpdk-dev] [dpdk-techboard] ABI/API stability towards drivers
  2021-07-02  8:00  8% [dpdk-dev] ABI/API stability towards drivers Morten Brørup
@ 2021-07-02  9:45  7% ` Ferruh Yigit
  2021-07-02 12:26  4% ` Thomas Monjalon
  2021-07-07 18:46  8% ` [dpdk-dev] " Tyler Retzlaff
  2 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2021-07-02  9:45 UTC (permalink / raw)
  To: Morten Brørup, dpdk-techboard; +Cc: dpdk-dev

On 7/2/2021 10:00 AM, Morten Brørup wrote:
> Regarding the ongoing ABI stability project, it is suggested to export driver interfaces as internal.
> 
> What are we targeting regarding ABI and API stability towards drivers?
> 

Hi Morten,

It is about some device abstraction libraries, like cryptodev, exposing the
internal driver to library interface to the application. And any change on them
causing an unnecessary ABI break.

So target is not drivers, but hide everything from application that only needs
to be between lib and driver.


* [dpdk-dev] ABI/API stability towards drivers
@ 2021-07-02  8:00  8% Morten Brørup
  2021-07-02  9:45  7% ` [dpdk-dev] [dpdk-techboard] " Ferruh Yigit
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Morten Brørup @ 2021-07-02  8:00 UTC (permalink / raw)
  To: dpdk-techboard; +Cc: dpdk-dev

Regarding the ongoing ABI stability project, it is suggested to export driver interfaces as internal.

What are we targeting regarding ABI and API stability towards drivers?

-Morten



* Re: [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs
  2021-07-01 15:09  4%         ` Tyler Retzlaff
@ 2021-07-02  6:30  4%           ` Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-07-02  6:30 UTC (permalink / raw)
  To: Tyler Retzlaff; +Cc: dev, ferruh.yigit, thomas, david.marchand, stephen



On 01/07/2021 16:09, Tyler Retzlaff wrote:
> On Thu, Jul 01, 2021 at 11:19:27AM +0100, Kinsella, Ray wrote:
>>
>>
>> On 30/06/2021 20:56, Tyler Retzlaff wrote:
>>> On Tue, Jun 29, 2021 at 07:38:05PM +0100, Kinsella, Ray wrote:
>>>>
>>>>
>>>>>> +Promotion to stable
>>>>>> +~~~~~~~~~~~~~~~~~~~
>>>>>> +
>>>>>> +Ordinarily APIs marked as ``experimental`` will be promoted to the stable API
>>>>>> +once a maintainer and/or the original contributor is satisfied that the API is
>>>>>> +reasonably mature. In exceptional circumstances, should an API still be
>>>>>
>>>>> this seems vague and arbitrary. is there a way we can have a more
>>>>> quantitative metric for what "reasonably mature" means.
>>>>>
>>>>>> +classified as ``experimental`` after two years and is without any prospect of
>>>>>> +becoming part of the stable API. The API will then become a candidate for
>>>>>> +removal, to avoid the acculumation of abandoned symbols.
>>>>>
>>>>> i think with the above comment the basis for removal then depends on
>>>>> whatever metric is used to determine maturity. 
>>>>> if it is still changing
>>>>> then it seems like it is useful and still evolving so perhaps should not
>>>>> be removed but hasn't changed but doesn't meet the metric for being made
>>>>> stable then perhaps it becomes a candidate for removal.
>>>>
>>>> Good idea. 
>>>>
>>>> I think it is reasonable to add a clause that indicates that any change 
>>>> to the "API signature" would reset the clock.
>>>
>>> a time based strategy works but i guess the follow-on to that is how is
>>> the clock tracked and how does it get updated? i don't think trying to
>>> troll through git history will be effective.
>>>
>>> one nit, i think "api signature" doesn't cover all cases of what i would
>>> regard as change. i would prefer to define it as "no change where api/abi
>>> compatibility or semantic change occurred"? which is a lot more strict
>>> but in practice is necessary to support binaries when abi/api is stable.
>>>
>>> i.e. if a recompile is necessary with or without code change then it's a
>>> change.
>>
>> Having thought a bit ... this becomes a bit problematic.
>>
>> Many data-structures in DPDK are nested, 
>> these can have a ripple effect when changed - a change to mbuf is a good example.
>>
>> What I saying is ...
>> I don't think changes in ABI due to in-direct reasons should count.
>> If there is a change due to a deliberate change in the ABI signature 
>> that is fine, reset the clock.
>>
>>
>> If there is a change due to some nested data-structure, 
>> 3-levels down changing in my book that doesn't count. 
> 
> it has to count otherwise dpdk's abi stability promise for major version
> releases is meaningless. or are you suggesting it doesn't count for the
> purpose of determining whether or not an experimental api/abi has
> changed?
"it doesn't count for the purpose of determining whether or not an experimental api/abi has changed?".

Exactly - that is what I meant - apologies if I was unclear. 
In this case the change is not a deliberate act, 
in that it is not really happening because of any maturing of the ABI.

> 
>> As that may or may not have been deliberate, and is almost impossible to police. 
>>
>> Checking anything but a deliberate change to the ABI signature,
>> would be practically impossible IMHO. 
> 
> well, it isn't impossible but it does take knowledge, mechanism and
> process maintain the abi for a major version.

100% agree with this statement.

What do you think of the v3?


 


* [dpdk-dev] DPDK Release Status Meeting 01/07/2021
@ 2021-07-01 16:30  4% Mcnamara, John
  0 siblings, 0 replies; 200+ results
From: Mcnamara, John @ 2021-07-01 16:30 UTC (permalink / raw)
  To: dev; +Cc: thomas, Yigit, Ferruh

Release status meeting minutes {Date}
=====================================
:Date: 1 July 2021
:toc:

.Agenda:
* Release Dates
* Subtrees
* Roadmaps
* LTS
* Defects
* Opens

.Participants:
* Broadcom
* Canonical
* Debian/Microsoft
* Intel
* Marvell
* Nvidia
* Red Hat


Release Dates
-------------

* `v21.08` dates
  - Proposal/V1:    Wednesday, 2 June (completed)
  - -rc1:           Wednesday, 7 July
  - Release:        Tuesday,   3 August

* Note: We need to hold to the early August release date since
  several of the maintainers will be on holidays after that.

* `v21.11` dates (proposed and subject to discussion)
  - Proposal/V1:    Friday, 10 September
  - -rc1:           Friday, 15 October
  - Release:        Friday, 19 November

Subtrees
--------

* main
  - Backlog is a little big at the moment. RC1 will probably slip to Wednesday 7th July.
  - Most subtrees PRs are ready or close to ready.
  - Still waiting update on Solarflare patches.
  - New auxiliary bus patch series should go into this release.

* next-net
  - Testpmd patchset for Windows.
  - Looking at net/sfc patches.

* next-crypto
  - 4 new PMDs in this release:
    ** CNXK - reviewed - awaiting final version for RC1.
    ** MLX - still in progress. New version will be sent today.
    ** Intel QAT - under review.
    ** NXP baseband - requires new version.

* next-eventdev
  - PR for RC1 will be completed today.

* next-virtio
  - PR posted yesterday.

* next-net-brcm
 - All patches in sub-tree waiting to be pulled.

* next-net-intel
  - Proceeding okay. No issues

* next-net-mlx
  - PR not pulled due to comments that need to be addressed.
  - New version sent today.

* next-net-mrvl
  - Pull request for RC1 sent.


LTS
---

* `v19.11` (next version is `v19.11.9`)
  - RC3 tagged.
  - Target release date July 2, however there are some late reported
    MLX regressions that are under investigation.
  - There are 2 other known issues:
    ** Plenty of GCC11 and Clang build issues were fixed, but 19.11.9
       is not yet compatible with clang 12.0.0. Fixes are discussed
       and a potential 3 backports identified for 19.11.10:
       https://bugs.dpdk.org/show_bug.cgi?id=733
    ** Due to a kernel patch backport in SUSE Linux Enterprise Server 15
       SP3 6, compilation of kni fails there:
       https://bugs.dpdk.org/show_bug.cgi?id=728

* `v20.11` (next version is `v20.11.2`)
  - RC2 released
  - Some test reports coming in (Intel, MLX)
  - 6 July is proposed release date.

* Distros
  - v20.11 in Debian 11
  - v20.11 in Ubuntu 21.04


Defects
-------

* Bugzilla links, 'Bugs',  added for hosted projects
  - https://www.dpdk.org/hosted-projects/


Opens
-----

* There is an ongoing initiative around ABI stability which was
  discussed in the Tech Board call. A workgroup has come up
  with a list of critical and major changes required to let us
  extend the ABI without as much disruption. For example:

  ** export driver interfaces as internal
  ** hide more structs (may require uninlining)
  ** split big structs + new feature-specific functions Major
  ** remove enum maximums
  ** reserved space initialized to 0
  ** reserved flags cleared

* We need to fill details and volunteers in this table:
  https://docs.google.com/spreadsheets/d/1betlC000ua5SsSiJIcC54mCCCJnW6voH5Dqv9UxeyfE/edit?usp=sharing

* The DPDK North America Summit will be on July 12-13. Registration is free.
  https://events.linuxfoundation.org/dpdk-summit-north-america/



.DPDK Release Status Meetings
*****
The DPDK Release Status Meeting is intended for DPDK Committers to discuss the status of the master tree and sub-trees, and for project managers to track progress or milestone dates.

The meeting occurs every Thursday at 8:30 UTC on https://meet.jit.si/DPDK

If you wish to attend, just send an email to "John McNamara <john.mcnamara@intel.com>" for the invite.
*****

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs
  2021-07-01 10:19  4%       ` Kinsella, Ray
@ 2021-07-01 15:09  4%         ` Tyler Retzlaff
  2021-07-02  6:30  4%           ` Kinsella, Ray
  0 siblings, 1 reply; 200+ results
From: Tyler Retzlaff @ 2021-07-01 15:09 UTC (permalink / raw)
  To: Kinsella, Ray; +Cc: dev, ferruh.yigit, thomas, david.marchand, stephen

On Thu, Jul 01, 2021 at 11:19:27AM +0100, Kinsella, Ray wrote:
> 
> 
> On 30/06/2021 20:56, Tyler Retzlaff wrote:
> > On Tue, Jun 29, 2021 at 07:38:05PM +0100, Kinsella, Ray wrote:
> >>
> >>
> >>>> +Promotion to stable
> >>>> +~~~~~~~~~~~~~~~~~~~
> >>>> +
> >>>> +Ordinarily APIs marked as ``experimental`` will be promoted to the stable API
> >>>> +once a maintainer and/or the original contributor is satisfied that the API is
> >>>> +reasonably mature. In exceptional circumstances, should an API still be
> >>>
> >>> this seems vague and arbitrary. is there a way we can have a more
> >>> quantitative metric for what "reasonably mature" means.
> >>>
> >>>> +classified as ``experimental`` after two years and is without any prospect of
> >>>> +becoming part of the stable API. The API will then become a candidate for
> >>>> +removal, to avoid the accumulation of abandoned symbols.
> >>>
> >>> i think with the above comment the basis for removal then depends on
> >>> whatever metric is used to determine maturity. 
> >>> if it is still changing
> >>> then it seems like it is useful and still evolving so perhaps should not
> >>> be removed but hasn't changed but doesn't meet the metric for being made
> >>> stable then perhaps it becomes a candidate for removal.
> >>
> >> Good idea. 
> >>
> >> I think it is reasonable to add a clause that indicates that any change 
> >> to the "API signature" would reset the clock.
> > 
> > a time based strategy works but i guess the follow-on to that is how is
> > the clock tracked and how does it get updated? i don't think trying to
> > troll through git history will be effective.
> > 
> > one nit, i think "api signature" doesn't cover all cases of what i would
> > regard as change. i would prefer to define it as "no change where api/abi
> > compatibility or semantic change occurred"? which is a lot more strict
> > but in practice is necessary to support binaries when abi/api is stable.
> > 
> > i.e. if a recompile is necessary with or without code change then it's a
> > change.
> 
> Having thought a bit ... this becomes a bit problematic.
> 
> Many data-structures in DPDK are nested, 
> these can have a ripple effect when changed - a change to mbuf is a good example.
> 
> What I am saying is ...
> I don't think changes in ABI due to indirect reasons should count.
> If there is a change due to a deliberate change in the ABI signature,
> that is fine, reset the clock.
> 
> If there is a change due to some nested data-structure changing
> 3 levels down, in my book that doesn't count.

it has to count otherwise dpdk's abi stability promise for major version
releases is meaningless. or are you suggesting it doesn't count for the
purpose of determining whether or not an experimental api/abi has
changed?

> As that may or may not have been deliberate, and is almost impossible to police. 
> 
> Checking anything but a deliberate change to the ABI signature,
> would be practically impossible IMHO. 

well, it isn't impossible but it does take knowledge, mechanism and
process to maintain the abi for a major version.

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs
  2021-07-01  7:56  0%       ` Ferruh Yigit
@ 2021-07-01 14:45  4%         ` Tyler Retzlaff
  0 siblings, 0 replies; 200+ results
From: Tyler Retzlaff @ 2021-07-01 14:45 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: Kinsella, Ray, dev, thomas, david.marchand, stephen

On Thu, Jul 01, 2021 at 08:56:22AM +0100, Ferruh Yigit wrote:
> On 6/30/2021 8:56 PM, Tyler Retzlaff wrote:
> > On Tue, Jun 29, 2021 at 07:38:05PM +0100, Kinsella, Ray wrote:
> >>
> >>
> >>>> +Promotion to stable
> >>>> +~~~~~~~~~~~~~~~~~~~
> >>>> +
> >>>> +Ordinarily APIs marked as ``experimental`` will be promoted to the stable API
> >>>> +once a maintainer and/or the original contributor is satisfied that the API is
> >>>> +reasonably mature. In exceptional circumstances, should an API still be
> >>>
> >>> this seems vague and arbitrary. is there a way we can have a more
> >>> quantitative metric for what "reasonably mature" means.
> >>>
> >>>> +classified as ``experimental`` after two years and is without any prospect of
> >>>> +becoming part of the stable API. The API will then become a candidate for
> >>>> +removal, to avoid the accumulation of abandoned symbols.
> >>>
> >>> i think with the above comment the basis for removal then depends on
> >>> whatever metric is used to determine maturity. 
> >>> if it is still changing
> >>> then it seems like it is useful and still evolving so perhaps should not
> >>> be removed but hasn't changed but doesn't meet the metric for being made
> >>> stable then perhaps it becomes a candidate for removal.
> >>
> >> Good idea. 
> >>
> >> I think it is reasonable to add a clause that indicates that any change 
> >> to the "API signature" would reset the clock.
> > 
> > a time based strategy works but i guess the follow-on to that is how is
> > the clock tracked and how does it get updated? i don't think trying to
> > troll through git history will be effective.
> > 
> 
> We are grouping the new experimental APIs in the version file based on the
> release they are added with a comment, thanks to Dave. Like:
> 
>         # added in 19.02
>         rte_extmem_attach;
>         rte_extmem_detach;
>         rte_extmem_register;
>         rte_extmem_unregister;
> 
>         # added in 19.05
>         rte_dev_dma_map;
>         rte_dev_dma_unmap;
>         ....
> 
> Please check 'lib/eal/version.map' as a sample.
> 
> This enables us to easily see the release in which experimental APIs were added.

this is fine but the subject being discussed is oriented around how long
an api/abi has been unchanged to identify it as a candidate for qualifying
it as stable (not experimental). are you suggesting that if api/abi changes
then it is moved to the -current version to "restart the clock" as it were?


^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v3] doc: policy on the promotion of experimental APIs
  2021-06-29 16:00 21% [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs Ray Kinsella
  2021-06-29 16:28  3% ` Tyler Retzlaff
  2021-07-01 10:31 23% ` [dpdk-dev] [PATCH v2] " Ray Kinsella
@ 2021-07-01 10:38 23% ` Ray Kinsella
  2021-07-07 18:32  0%   ` Tyler Retzlaff
  2 siblings, 1 reply; 200+ results
From: Ray Kinsella @ 2021-07-01 10:38 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, roretzla, ferruh.yigit, thomas,
	david.marchand, stephen, Ray Kinsella

Clarifying the ABI policy on the promotion of experimental APIs to stable.
We have a fair number of APIs that have been experimental for more than
2 years. This policy amendment indicates that these APIs should be
promoted or removed, or should at least form a conversation between the
maintainer and original contributor.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
v2: addressing comments on abi expiry from Tyler Retzlaff.
v3: addressing typos in the git commit message

 doc/guides/contributing/abi_policy.rst | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
index 4ad87dbfed..840c295e5d 100644
--- a/doc/guides/contributing/abi_policy.rst
+++ b/doc/guides/contributing/abi_policy.rst
@@ -26,9 +26,10 @@ General Guidelines
    symbols is managed with :ref:`ABI Versioning <abi_versioning>`.
 #. The removal of symbols is considered an :ref:`ABI breakage <abi_breakages>`,
    once approved these will form part of the next ABI version.
-#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may
-   be changed or removed without prior notice, as they are not considered part
-   of an ABI version.
+#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may be
+   changed or removed without prior notice, as they are not considered part of
+   an ABI version. The :ref:`experimental <experimental_apis>` status of an API
+   is not an indefinite state.
 #. Updates to the :ref:`minimum hardware requirements <hw_rqmts>`, which drop
    support for hardware which was previously supported, should be treated as an
    ABI change.
@@ -358,3 +359,18 @@ Libraries
 Libraries marked as ``experimental`` are entirely not considered part of an ABI
 version.
 All functions in such libraries may be changed or removed without prior notice.
+
+Promotion to stable
+~~~~~~~~~~~~~~~~~~~
+
+Ordinarily APIs marked as ``experimental`` will be promoted to the stable ABI
+once a maintainer and/or the original contributor is satisfied that the API is
+reasonably mature. In exceptional circumstances, should an API still be
+classified as ``experimental`` after two years and be without any prospect of
+becoming part of the stable API, the API will then become a candidate for
+removal, to avoid the accumulation of abandoned symbols.
+
+Should an API's Binary Interface change during the two-year period, usually due
+to a direct change in the API's signature, it is reasonable for the expiry
+clock to reset. The promotion or removal of symbols will typically form part of
+a conversation between the maintainer and the original contributor.
-- 
2.26.2


^ permalink raw reply	[relevance 23%]

* [dpdk-dev] [PATCH v2] doc: policy on promotion of experimental APIs
  2021-06-29 16:00 21% [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs Ray Kinsella
  2021-06-29 16:28  3% ` Tyler Retzlaff
@ 2021-07-01 10:31 23% ` Ray Kinsella
  2021-07-01 10:38 23% ` [dpdk-dev] [PATCH v3] doc: policy on the " Ray Kinsella
  2 siblings, 0 replies; 200+ results
From: Ray Kinsella @ 2021-07-01 10:31 UTC (permalink / raw)
  To: dev
  Cc: bruce.richardson, john.mcnamara, roretzla, ferruh.yigit, thomas,
	david.marchand, stephen, Ray Kinsella

Clarifying the ABI policy on the promotion of experimental APIs to stable.
We have a fair number of APIs that have been experimental for more than
2 years. This policy amendment indicates that these APIs should be
promoted or removed, or should at least form a conversation between the
maintainer and original contributor.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
v2: addressing comments on abi expiry from Tyler Retzlaff.

 doc/guides/contributing/abi_policy.rst | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
index 4ad87dbfed..840c295e5d 100644
--- a/doc/guides/contributing/abi_policy.rst
+++ b/doc/guides/contributing/abi_policy.rst
@@ -26,9 +26,10 @@ General Guidelines
    symbols is managed with :ref:`ABI Versioning <abi_versioning>`.
 #. The removal of symbols is considered an :ref:`ABI breakage <abi_breakages>`,
    once approved these will form part of the next ABI version.
-#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may
-   be changed or removed without prior notice, as they are not considered part
-   of an ABI version.
+#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may be
+   changed or removed without prior notice, as they are not considered part of
+   an ABI version. The :ref:`experimental <experimental_apis>` status of an API
+   is not an indefinite state.
 #. Updates to the :ref:`minimum hardware requirements <hw_rqmts>`, which drop
    support for hardware which was previously supported, should be treated as an
    ABI change.
@@ -358,3 +359,18 @@ Libraries
 Libraries marked as ``experimental`` are entirely not considered part of an ABI
 version.
 All functions in such libraries may be changed or removed without prior notice.
+
+Promotion to stable
+~~~~~~~~~~~~~~~~~~~
+
+Ordinarily APIs marked as ``experimental`` will be promoted to the stable ABI
+once a maintainer and/or the original contributor is satisfied that the API is
+reasonably mature. In exceptional circumstances, should an API still be
+classified as ``experimental`` after two years and be without any prospect of
+becoming part of the stable API, the API will then become a candidate for
+removal, to avoid the accumulation of abandoned symbols.
+
+Should an API's Binary Interface change during the two-year period, usually due
+to a direct change in the API's signature, it is reasonable for the expiry
+clock to reset. The promotion or removal of symbols will typically form part of
+a conversation between the maintainer and the original contributor.
-- 
2.26.2


^ permalink raw reply	[relevance 23%]

* Re: [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs
  2021-06-30 19:56  4%     ` Tyler Retzlaff
  2021-07-01  7:56  0%       ` Ferruh Yigit
@ 2021-07-01 10:19  4%       ` Kinsella, Ray
  2021-07-01 15:09  4%         ` Tyler Retzlaff
  1 sibling, 1 reply; 200+ results
From: Kinsella, Ray @ 2021-07-01 10:19 UTC (permalink / raw)
  To: Tyler Retzlaff; +Cc: dev, ferruh.yigit, thomas, david.marchand, stephen



On 30/06/2021 20:56, Tyler Retzlaff wrote:
> On Tue, Jun 29, 2021 at 07:38:05PM +0100, Kinsella, Ray wrote:
>>
>>
>>>> +Promotion to stable
>>>> +~~~~~~~~~~~~~~~~~~~
>>>> +
>>>> +Ordinarily APIs marked as ``experimental`` will be promoted to the stable API
>>>> +once a maintainer and/or the original contributor is satisfied that the API is
>>>> +reasonably mature. In exceptional circumstances, should an API still be
>>>
>>> this seems vague and arbitrary. is there a way we can have a more
>>> quantitative metric for what "reasonably mature" means.
>>>
>>>> +classified as ``experimental`` after two years and is without any prospect of
>>>> +becoming part of the stable API. The API will then become a candidate for
>>>> +removal, to avoid the accumulation of abandoned symbols.
>>>
>>> i think with the above comment the basis for removal then depends on
>>> whatever metric is used to determine maturity. 
>>> if it is still changing
>>> then it seems like it is useful and still evolving so perhaps should not
>>> be removed but hasn't changed but doesn't meet the metric for being made
>>> stable then perhaps it becomes a candidate for removal.
>>
>> Good idea. 
>>
>> I think it is reasonable to add a clause that indicates that any change 
>> to the "API signature" would reset the clock.
> 
> a time based strategy works but i guess the follow-on to that is how is
> the clock tracked and how does it get updated? i don't think trying to
> troll through git history will be effective.
> 
> one nit, i think "api signature" doesn't cover all cases of what i would
> regard as change. i would prefer to define it as "no change where api/abi
> compatibility or semantic change occurred"? which is a lot more strict
> but in practice is necessary to support binaries when abi/api is stable.
> 
> i.e. if a recompile is necessary with or without code change then it's a
> change.

Having thought a bit ... this becomes a bit problematic.

Many data-structures in DPDK are nested, 
these can have a ripple effect when changed - a change to mbuf is a good example.

What I am saying is ...
I don't think changes in ABI due to indirect reasons should count.
If there is a change due to a deliberate change in the ABI signature,
that is fine, reset the clock.

If there is a change due to some nested data-structure changing
3 levels down, in my book that doesn't count.
As that may or may not have been deliberate, and is almost impossible to police. 

Checking anything but a deliberate change to the ABI signature,
would be practically impossible IMHO. 
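
To illustrate the ripple effect described above, here is a minimal sketch
using Python's ctypes to model C struct layout. The struct names and fields
are hypothetical and are not taken from DPDK; the only point is that the outer
struct's declaration stays identical while its layout still changes.

```python
import ctypes


class InnerV1(ctypes.Structure):
    # Hypothetical nested struct before the change.
    _fields_ = [("len", ctypes.c_uint16)]


class InnerV2(ctypes.Structure):
    # Same hypothetical struct after an unrelated patch adds one field.
    _fields_ = [("len", ctypes.c_uint16), ("flags", ctypes.c_uint64)]


def make_outer(inner):
    """Build the outer struct; its own declaration is the same in both cases."""
    class Outer(ctypes.Structure):
        _fields_ = [("hdr", inner), ("port", ctypes.c_uint16)]
    return Outer


OuterV1 = make_outer(InnerV1)
OuterV2 = make_outer(InnerV2)

# The offset of 'port' and the total size both move, although nobody
# touched the outer declaration. Exact V2 numbers depend on the platform.
print("v1: port offset", OuterV1.port.offset, "size", ctypes.sizeof(OuterV1))
print("v2: port offset", OuterV2.port.offset, "size", ctypes.sizeof(OuterV2))
```

A binary built against the V1 layout would read 'port' from the wrong offset
when handed a V2 object, which is why this kind of indirect change is hard to
spot from the API signature alone.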

> 
>>
>> However equally any changes to the implementation do not reset the clock.
>>
>> Would that work?
> 
> that works for me.

v2 on the way.

> 
>>
>>>
>>>> +
>>>> +The promotion or removal of symbols will typically form part of a conversation
>>>> +between the maintainer and the original contributor.
>>>
>>> this should extend beyond just symbols. there are other changes that
>>> impact the abi where exported symbols don't change. e.g. additions to
>>> return values sets.
>>>
>>> thanks for working on this.
>>>

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs
  2021-06-30 19:56  4%     ` Tyler Retzlaff
@ 2021-07-01  7:56  0%       ` Ferruh Yigit
  2021-07-01 14:45  4%         ` Tyler Retzlaff
  2021-07-01 10:19  4%       ` Kinsella, Ray
  1 sibling, 1 reply; 200+ results
From: Ferruh Yigit @ 2021-07-01  7:56 UTC (permalink / raw)
  To: Tyler Retzlaff, Kinsella, Ray; +Cc: dev, thomas, david.marchand, stephen

On 6/30/2021 8:56 PM, Tyler Retzlaff wrote:
> On Tue, Jun 29, 2021 at 07:38:05PM +0100, Kinsella, Ray wrote:
>>
>>
>>>> +Promotion to stable
>>>> +~~~~~~~~~~~~~~~~~~~
>>>> +
>>>> +Ordinarily APIs marked as ``experimental`` will be promoted to the stable API
>>>> +once a maintainer and/or the original contributor is satisfied that the API is
>>>> +reasonably mature. In exceptional circumstances, should an API still be
>>>
>>> this seems vague and arbitrary. is there a way we can have a more
>>> quantitative metric for what "reasonably mature" means.
>>>
>>>> +classified as ``experimental`` after two years and is without any prospect of
>>>> +becoming part of the stable API. The API will then become a candidate for
>>>> +removal, to avoid the accumulation of abandoned symbols.
>>>
>>> i think with the above comment the basis for removal then depends on
>>> whatever metric is used to determine maturity. 
>>> if it is still changing
>>> then it seems like it is useful and still evolving so perhaps should not
>>> be removed but hasn't changed but doesn't meet the metric for being made
>>> stable then perhaps it becomes a candidate for removal.
>>
>> Good idea. 
>>
>> I think it is reasonable to add a clause that indicates that any change 
>> to the "API signature" would reset the clock.
> 
> a time based strategy works but i guess the follow-on to that is how is
> the clock tracked and how does it get updated? i don't think trying to
> troll through git history will be effective.
> 

We are grouping the new experimental APIs in the version file based on the
release they are added with a comment, thanks to Dave. Like:

        # added in 19.02
        rte_extmem_attach;
        rte_extmem_detach;
        rte_extmem_register;
        rte_extmem_unregister;

        # added in 19.05
        rte_dev_dma_map;
        rte_dev_dma_unmap;
        ....

Please check 'lib/eal/version.map' as a sample.

This enables us to easily see the release in which experimental APIs were added.
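
As a rough sketch of how those comments could be consumed by a script (Python,
for illustration only): the sample text below is a hypothetical excerpt in the
style of 'lib/eal/version.map', and the helper names are assumptions, not an
existing DPDK tool.

```python
import re

# Hypothetical excerpt in the style of lib/eal/version.map.
SAMPLE = """\
EXPERIMENTAL {
	global:

	# added in 19.02
	rte_extmem_attach;
	rte_extmem_detach;
	rte_extmem_register;
	rte_extmem_unregister;

	# added in 19.05
	rte_dev_dma_map;
	rte_dev_dma_unmap;
};
"""

ADDED_RE = re.compile(r"#\s*added in (\d+)\.(\d+)")
SYMBOL_RE = re.compile(r"^(\w+);$")


def symbols_by_release(text):
    """Map (year, month) release tuples to the experimental symbols under them."""
    groups = {}
    in_experimental = False
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("EXPERIMENTAL"):
            in_experimental, current = True, None
            continue
        if line.startswith("}"):
            in_experimental = False
            continue
        if not in_experimental:
            continue
        match = ADDED_RE.search(line)
        if match:
            current = (int(match.group(1)), int(match.group(2)))
            continue
        match = SYMBOL_RE.match(line)
        if match and current is not None:
            groups.setdefault(current, []).append(match.group(1))
    return groups


def releases_older_than(groups, now, years=2):
    """Return release keys that are at least `years` old relative to `now`."""
    now_months = now[0] * 12 + now[1]
    return sorted(r for r in groups
                  if now_months - (r[0] * 12 + r[1]) >= years * 12)


groups = symbols_by_release(SAMPLE)
print(releases_older_than(groups, now=(21, 8)))
```

Note that this only tells us when a symbol was added, not when it last
changed, so on its own it cannot say whether the clock should have been reset.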

> one nit, i think "api signature" doesn't cover all cases of what i would
> regard as change. i would prefer to define it as "no change where api/abi
> compatibility or semantic change occurred"? which is a lot more strict
> but in practice is necessary to support binaries when abi/api is stable.
> 
> i.e. if a recompile is necessary with or without code change then it's a
> change.
> 
>>
>> However equally any changes to the implementation do not reset the clock.
>>
>> Would that work?
> 
> that works for me.
> 
>>
>>>
>>>> +
>>>> +The promotion or removal of symbols will typically form part of a conversation
>>>> +between the maintainer and the original contributor.
>>>
>>> this should extend beyond just symbols. there are other changes that
>>> impact the abi where exported symbols don't change. e.g. additions to
>>> return values sets.
>>>
>>> thanks for working on this.
>>>


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs
  2021-06-29 18:38  0%   ` Kinsella, Ray
@ 2021-06-30 19:56  4%     ` Tyler Retzlaff
  2021-07-01  7:56  0%       ` Ferruh Yigit
  2021-07-01 10:19  4%       ` Kinsella, Ray
  0 siblings, 2 replies; 200+ results
From: Tyler Retzlaff @ 2021-06-30 19:56 UTC (permalink / raw)
  To: Kinsella, Ray; +Cc: dev, ferruh.yigit, thomas, david.marchand, stephen

On Tue, Jun 29, 2021 at 07:38:05PM +0100, Kinsella, Ray wrote:
> 
> 
> >> +Promotion to stable
> >> +~~~~~~~~~~~~~~~~~~~
> >> +
> >> +Ordinarily APIs marked as ``experimental`` will be promoted to the stable API
> >> +once a maintainer and/or the original contributor is satisfied that the API is
> >> +reasonably mature. In exceptional circumstances, should an API still be
> > 
> > this seems vague and arbitrary. is there a way we can have a more
> > quantitative metric for what "reasonably mature" means.
> > 
> >> +classified as ``experimental`` after two years and is without any prospect of
> >> +becoming part of the stable API. The API will then become a candidate for
> >> +removal, to avoid the accumulation of abandoned symbols.
> > 
> > i think with the above comment the basis for removal then depends on
> > whatever metric is used to determine maturity. 
> > if it is still changing
> > then it seems like it is useful and still evolving so perhaps should not
> > be removed but hasn't changed but doesn't meet the metric for being made
> > stable then perhaps it becomes a candidate for removal.
> 
> Good idea. 
> 
> I think it is reasonable to add a clause that indicates that any change 
> to the "API signature" would reset the clock.

a time based strategy works but i guess the follow-on to that is how is
the clock tracked and how does it get updated? i don't think trying to
troll through git history will be effective.

one nit, i think "api signature" doesn't cover all cases of what i would
regard as change. i would prefer to define it as "no change where api/abi
compatibility or semantic change occurred"? which is a lot more strict
but in practice is necessary to support binaries when abi/api is stable.

i.e. if a recompile is necessary with or without code change then it's a
change.

> 
> However equally any changes to the implementation do not reset the clock.
> 
> Would that work?

that works for me.

> 
> > 
> >> +
> >> +The promotion or removal of symbols will typically form part of a conversation
> >> +between the maintainer and the original contributor.
> > 
> > this should extend beyond just symbols. there are other changes that
> > impact the abi where exported symbols don't change. e.g. additions to
> > > return values sets.
> > >
> > thanks for working on this.
> > 

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [dpdk-ci] [PATCH v2 2/2] drivers: add octeontx crypto adapter data path
  @ 2021-06-30 16:23  4%       ` Brandon Lo
  0 siblings, 0 replies; 200+ results
From: Brandon Lo @ 2021-06-30 16:23 UTC (permalink / raw)
  To: Akhil Goyal
  Cc: Shijith Thotton, dev, ci, Pavan Nikhilesh Bhagavatula,
	Anoob Joseph, Jerin Jacob Kollanukkaran, abhinandan.gujjar,
	Ankur Dwivedi

Hi Akhil,

I believe the FreeBSD 13 failure appeared because new requirements
were added for drivers/event/octeontx.
The ABI reference was taken at the v21.05 release which was able to
build this driver at the time.
I will try to look for a way to produce a real ABI test.

Thanks,
Brandon

On Wed, Jun 30, 2021 at 4:54 AM Akhil Goyal <gakhil@marvell.com> wrote:
>
> > Added support for crypto adapter OP_FORWARD mode.
> >
> > As OcteonTx CPT crypto completions could be out of order, each crypto op
> > is enqueued to CPT, dequeued from CPT and enqueued to SSO one-by-one.
> >
> > Signed-off-by: Shijith Thotton <sthotton@marvell.com>
> > ---
> This patch shows a CI warning for FreeBSD, but I was not able to locate the error/warning in the logs.
> Can anybody confirm what the issue is?
>
> http://mails.dpdk.org/archives/test-report/2021-June/200637.html
>
> Regards,
> Akhil



-- 

Brandon Lo

UNH InterOperability Laboratory

21 Madbury Rd, Suite 100, Durham, NH 03824

blo@iol.unh.edu

www.iol.unh.edu

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-26 23:28  1% Xueming Li
@ 2021-06-30 10:33  0% ` Jiang, YuX
  2021-07-06  2:37  0%   ` Xueming(Steven) Li
  2021-07-06  3:26  0% ` [dpdk-dev] [dpdk-stable] " Kalesh Anakkur Purayil
  1 sibling, 1 reply; 200+ results
From: Jiang, YuX @ 2021-06-30 10:33 UTC (permalink / raw)
  To: Xueming Li, stable
  Cc: dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani, Walker,
	Benjamin, David Christensen, Govindharajan, Hariprasad,
	Hemant Agrawal, Stokes, Ian, Jerin Jacob, Mcnamara, John,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, Yu,
	PingX, Xu, Qian Q, Raslan Darawsheh, Thomas Monjalon, Peng, Yuan,
	Chen, Zhaoyan

All,
Testing with dpdk v20.11.2-rc2 from Intel looks good, no critical issue is found. All issues found are known issues.
The two issues below have been fixed in 20.11.2-rc2:
  1) Fedora34 GCC11 and Clang12 build failed.
  2) dcf_lifecycle/handle_acl_filter_05: after reset port the mac changed.

# Basic Intel(R) NIC testing
* PF(i40e, ixgbe): test scenarios including rte_flow/TSO/Jumboframe/checksum offload/Tunnel, etc. (list not exhaustive).
- The two known issues below are found.
  1) https://bugs.dpdk.org/show_bug.cgi?id=687 : unit_tests_power/power_cpufreq: unit test failed. This issue was found in 21.05 and is not fixed yet.
  2) ddp_gtp_qregion/fd_gtpu_ipv4_dstip: flow director does not work. This issue was found in 21.05 and fixed in 21.08.
    Fix patch link: http://patches.dpdk.org/project/dpdk/patch/20210519032745.707639-1-stevex.yang@intel.com/
* VF(i40e, ixgbe): test scenarios including vf-rte_flow/TSO/Jumboframe/checksum offload/Tunnel, etc. (list not exhaustive).
- No new issues are found.
* PF/VF(ice): test scenarios including switch features/Flow Director/Advanced RSS/ACL/DCF/Flexible Descriptor and so on (list not exhaustive).
- The 3 known DPDK issues below are found.
  1) rxtx_offload/rxoffload_port: Pkt1 can't be distributed to the same queue. This issue was found in 21.05 and fixed in 21.08.
    Fix patch link: http://patches.dpdk.org/project/dpdk/patch/20210527064251.242076-1-dapengx.yu@intel.com/
  2) cvl_advanced_iavf_rss: after changing the SCTP port value, the hash value remains unchanged. This issue was found in 20.11-rc3 and fixed in 21.02, but the fix belongs to a new 21.02 feature and won't be backported to LTS 20.11.
  3) Can't create 512 acl rules after creating a full mask switch rule. This issue also occurs in dpdk 20.11 and is not fixed yet.
* Build: covers the build test combinations with the latest GCC/Clang/ICC versions and popular OS revisions such as Ubuntu20.04, CentOS8.3 and so on (list not exhaustive).
- All passed.
* Intel NIC single core/NIC performance: test scenarios including PF/VF single core performance test (AVX2+AVX512) and so on (list not exhaustive).
- All passed. No big data drop.

# Basic cryptodev and virtio testing
* Virtio: both function and performance tests are covered, such as PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf testing, etc. (list not exhaustive).
- One known issue as below:
  1) The UDP fragmentation offload feature of the Virtio-net device can't be turned on in the VM. This is a kernel issue; a bugzilla has been submitted: https://bugzilla.kernel.org/show_bug.cgi?id=207075, not fixed yet.
* Cryptodev:
- Function test: test scenarios including Cryptodev API testing/CompressDev ISA-L/QAT/ZLIB PMD Testing/FIPS, etc. (list not exhaustive).
  - All passed.
- Performance test: test scenarios including Throughput Performance/Cryptodev Latency, etc. (list not exhaustive).
  - No big data drop.

Best regards,
Yu Jiang

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Xueming Li
> Sent: Sunday, June 27, 2021 7:28 AM
> To: stable@dpdk.org
> Cc: dev@dpdk.org; Abhishek Marathe <Abhishek.Marathe@microsoft.com>;
> Akhil Goyal <akhil.goyal@nxp.com>; Ali Alnubani <alialnu@nvidia.com>;
> Walker, Benjamin <benjamin.walker@intel.com>; David Christensen
> <drc@linux.vnet.ibm.com>; Govindharajan, Hariprasad
> <hariprasad.govindharajan@intel.com>; Hemant Agrawal
> <hemant.agrawal@nxp.com>; Stokes, Ian <ian.stokes@intel.com>; Jerin
> Jacob <jerinj@marvell.com>; Mcnamara, John <john.mcnamara@intel.com>;
> Ju-Hyoung Lee <juhlee@microsoft.com>; Kevin Traynor
> <ktraynor@redhat.com>; Luca Boccassi <bluca@debian.org>; Pei Zhang
> <pezhang@redhat.com>; Yu, PingX <pingx.yu@intel.com>; Xu, Qian Q
> <qian.q.xu@intel.com>; Raslan Darawsheh <rasland@nvidia.com>; Thomas
> Monjalon <thomas@monjalon.net>; Peng, Yuan <yuan.peng@intel.com>;
> Chen, Zhaoyan <zhaoyan.chen@intel.com>; xuemingl@nvidia.com
> Subject: [dpdk-dev] 20.11.2 patches review and test
> 
> Hi all,
> 
> Here is a list of patches targeted for stable release 20.11.2.
> 
> The planned date for the final release is 6th July.
> 
> Please help with testing and validation of your use cases and report any
> issues/results with reply-all to this mail. For the final release the fixes and
> reported validations will be added to the release notes.
> 
> A release candidate tarball can be found at:
> 
>     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc2
> 
> These patches are located at branch 20.11 of dpdk-stable repo:
>     https://dpdk.org/browse/dpdk-stable/
> 
> Thanks.
> 
> Xueming Li <xuemingl@nvidia.com>
> 
> ---
> Adam Dybkowski (3):
>       common/qat: increase IM buffer size for GEN3
>       compress/qat: enable compression on GEN3
>       crypto/qat: fix null authentication request
> 
> Ajit Khaparde (7):
>       net/bnxt: fix RSS context cleanup
>       net/bnxt: check kvargs parsing
>       net/bnxt: fix resource cleanup
>       doc: fix formatting in testpmd guide
>       net/bnxt: fix mismatched type comparison in MAC restore
>       net/bnxt: check PCI config read
>       net/bnxt: fix mismatched type comparison in Rx
> 
> Alvin Zhang (11):
>       net/ice: fix VLAN filter with PF
>       net/i40e: fix input set field mask
>       net/igc: fix Rx RSS hash offload capability
>       net/igc: fix Rx error counter for bad length
>       net/e1000: fix Rx error counter for bad length
>       net/e1000: fix max Rx packet size
>       net/igc: fix Rx packet size
>       net/ice: fix fast mbuf freeing
>       net/iavf: fix VF to PF command failure handling
>       net/i40e: fix VF RSS configuration
>       net/igc: fix speed configuration
> 
> Anatoly Burakov (3):
>       fbarray: fix log message on truncation error
>       power: do not skip saving original P-state governor
>       power: save original ACPI governor always
> 
> Andrew Boyer (1):
>       net/ionic: fix completion type in lif init
> 
> Andrew Rybchenko (4):
>       net/failsafe: fix RSS hash offload reporting
>       net/failsafe: report minimum and maximum MTU
>       common/sfc_efx: remove GENEVE from supported tunnels
>       net/sfc: fix mark support in EF100 native Rx datapath
> 
> Andy Moreton (2):
>       common/sfc_efx/base: limit reported MCDI response length
>       common/sfc_efx/base: add missing MCDI response length checks
> 
> Ankur Dwivedi (1):
>       crypto/octeontx: fix session-less mode
> 
> Apeksha Gupta (1):
>       examples/l2fwd-crypto: skip masked devices
> 
> Arek Kusztal (1):
>       crypto/qat: fix offset for out-of-place scatter-gather
> 
> Beilei Xing (1):
>       net/i40evf: fix packet loss for X722
> 
> Bing Zhao (1):
>       net/mlx5: fix loopback for Direct Verbs queue
> 
> Bruce Richardson (2):
>       build: exclude meson files from examples installation
>       raw/ioat: fix script for configuring small number of queues
> 
> Chaoyong He (1):
>       doc: fix multiport syntax in nfp guide
> 
> Chenbo Xia (1):
>       examples/vhost: check memory table query
> 
> Chengchang Tang (20):
>       net/hns3: fix HW buffer size on MTU update
>       net/hns3: fix processing Tx offload flags
>       net/hns3: fix Tx checksum for UDP packets with special port
>       net/hns3: fix long task queue pairs reset time
>       ethdev: validate input in module EEPROM dump
>       ethdev: validate input in register info
>       ethdev: validate input in EEPROM info
>       net/hns3: fix rollback after setting PVID failure
>       net/hns3: fix timing in resetting queues
>       net/hns3: fix queue state when concurrent with reset
>       net/hns3: fix configure FEC when concurrent with reset
>       net/hns3: fix use of command status enumeration
>       examples: add eal cleanup to examples
>       net/bonding: fix adding itself as its slave
>       net/hns3: fix timing in mailbox
>       app/testpmd: fix max queue number for Tx offloads
>       net/tap: fix interrupt vector array size
>       net/bonding: fix socket ID check
>       net/tap: check ioctl on restore
>       examples/timer: fix time interval
> 
> Chengwen Feng (50):
>       net/hns3: fix flow counter value
>       net/hns3: fix VF mailbox head field
>       net/hns3: support get device version when dump register
>       net/hns3: fix some packet types
>       net/hns3: fix missing outer L4 UDP flag for VXLAN
>       net/hns3: remove VLAN/QinQ ptypes from support list
>       test: check thread creation
>       common/dpaax: fix possible null pointer access
>       examples/ethtool: remove unused parsing
>       net/hns3: fix flow director lock
>       net/e1000/base: fix timeout for shadow RAM write
>       net/hns3: fix setting default MAC address in bonding of VF
>       net/hns3: fix possible mismatched response of mailbox
>       net/hns3: fix VF handling LSC event in secondary process
>       net/hns3: fix verification of NEON support
>       mbuf: check shared memory before dumping dynamic space
>       eventdev: remove redundant thread name setting
>       eventdev: fix memory leakage on thread creation failure
>       net/kni: check init result
>       net/hns3: fix mailbox error message
>       net/hns3: fix processing link status message on PF
>       net/hns3: remove unused mailbox macro and struct
>       net/bonding: fix leak on remove
>       net/hns3: fix handling link update
>       net/i40e: fix negative VEB index
>       net/i40e: remove redundant VSI check in Tx queue setup
>       net/virtio: fix getline memory leakage
>       net/hns3: log time delta in decimal format
>       net/hns3: fix time delta calculation
>       net/hns3: remove unused macros
>       net/hns3: fix vector Rx burst limitation
>       net/hns3: remove read when enabling TM QCN error event
>       net/hns3: remove unused VMDq code
>       net/hns3: increase readability in logs
>       raw/ntb: check SPAD user index
>       raw/ntb: check memory allocations
>       ipc: check malloc sync reply result
>       eal: fix service core list parsing
>       ipc: use monotonic clock
>       net/hns3: return error on PCI config write failure
>       net/hns3: fix log on flow director clear
>       net/hns3: clear hash map on flow director clear
>       net/hns3: fix querying flow director counter for out param
>       net/hns3: fix TM QCN error event report by MSI-X
>       net/hns3: fix mailbox message ID in log
>       net/hns3: fix secondary process request start/stop Rx/Tx
>       net/hns3: fix ordering in secondary process initialization
>       net/hns3: fail setting FEC if one bit mode is not supported
>       net/mlx4: fix secondary process initialization ordering
>       net/mlx5: fix secondary process initialization ordering
> 
> Ciara Loftus (1):
>       net/af_xdp: fix error handling during Rx queue setup
> 
> Ciara Power (2):
>       telemetry: fix race on callbacks list
>       test/crypto: fix return value of a skipped test
> 
> Conor Walsh (1):
>       examples/l3fwd: fix LPM IPv6 subnets
> 
> Cristian Dumitrescu (3):
>       table: fix actions with different data size
>       pipeline: fix instruction translation
>       pipeline: fix endianness conversions
> 
> Dapeng Yu (3):
>       net/igc: remove MTU setting limitation
>       net/e1000: remove MTU setting limitation
>       examples/packet_ordering: fix port configuration
> 
> David Christensen (1):
>       config/ppc: reduce number of cores and NUMA nodes
> 
> David Harton (1):
>       net/ena: fix releasing Tx ring mbufs
> 
> David Hunt (4):
>       test/power: fix CPU frequency check
>       test/power: add turbo mode to frequency check
>       test/power: fix low frequency test when turbo enabled
>       test/power: fix turbo test
> 
> David Marchand (18):
>       doc: fix sphinx rtd theme import in GHA
>       service: clean references to removed symbol
>       eal: fix evaluation of log level option
>       ci: hook to GitHub Actions
>       ci: enable v21 ABI checks
>       ci: fix package installation in GitHub Actions
>       ci: ignore APT update failure in GitHub Actions
>       ci: catch coredumps
>       vhost: fix offload flags in Rx path
>       bus/fslmc: remove unused debug macro
>       eal: fix leak in shared lib mode detection
>       event/dpaa2: remove unused macros
>       net/ice/base: fix memory allocation wrapper
>       net/ice: fix leak on thread termination
>       devtools: fix orphan symbols check with busybox
>       net/vhost: restore pseudo TSO support
>       net/ark: fix leak on thread termination
>       build: fix drivers selection without Python
> 
> Dekel Peled (1):
>       common/mlx5: fix DevX read output buffer size
> 
> Dmitry Kozlyuk (4):
>       net/pcap: fix format string
>       eal/windows: add missing SPDX license tag
>       buildtools: fix all drivers disabled on Windows
>       examples/rxtx_callbacks: fix port ID format specifier
> 
> Ed Czeck (2):
>       net/ark: update packet director initial state
>       net/ark: refactor Rx buffer recovery
> 
> Elad Nachman (2):
>       kni: support async user request
>       kni: fix kernel deadlock with bifurcated device
> 
> Feifei Wang (2):
>       net/i40e: fix parsing packet type for NEON
>       test/trace: fix race on collected perf data
> 
> Ferruh Yigit (9):
>       power: remove duplicated symbols from map file
>       log/linux: make default output stderr
>       license: fix typos
>       drivers/net: fix FW version query
>       net/bnx2x: fix build with GCC 11
>       net/bnx2x: fix build with GCC 11
>       net/ice/base: fix build with GCC 11
>       net/tap: fix build with GCC 11
>       test/table: fix build with GCC 11
> 
> Gregory Etelson (2):
>       app/testpmd: fix tunnel offload flows cleanup
>       net/mlx5: fix tunnel offload private items location
> 
> Guoyang Zhou (1):
>       net/hinic: fix crash in secondary process
> 
> Haiyue Wang (1):
>       net/ixgbe: fix Rx errors statistics for UDP checksum
> 
> Harman Kalra (1):
>       event/octeontx2: fix device reconfigure for single slot
> 
> Heinrich Kuhn (1):
>       net/nfp: fix reporting of RSS capabilities
> 
> Hemant Agrawal (3):
>       ethdev: add missing buses in device iterator
>       crypto/dpaa_sec: affine the thread portal affinity
>       crypto/dpaa2_sec: fix close and uninit functions
> 
> Hongbo Zheng (9):
>       app/testpmd: fix Tx/Rx descriptor query error log
>       net/hns3: fix FLR miss detection
>       net/hns3: delete redundant blank line
>       bpf: fix JSLT validation
>       common/sfc_efx/base: fix dereferencing null pointer
>       power: fix sanity checks for guest channel read
>       net/hns3: fix VF alive notification after config restore
>       examples/l3fwd-power: fix empty poll thresholds
>       net/hns3: fix concurrent interrupt handling
> 
> Huisong Li (23):
>       net/hns3: fix device capabilities for copper media type
>       net/hns3: remove unused parameter markers
>       net/hns3: fix reporting undefined speed
>       net/hns3: fix link update when failed to get link info
>       net/hns3: fix flow control exception
>       app/testpmd: fix bitmap of link speeds when force speed
>       net/hns3: fix flow control mode
>       net/hns3: remove redundant mailbox response
>       net/hns3: fix DCB mode check
>       net/hns3: fix VMDq mode check
>       net/hns3: fix mbuf leakage
>       net/hns3: fix link status when port is stopped
>       net/hns3: fix link speed when port is down
>       app/testpmd: fix forward lcores number for DCB
>       app/testpmd: fix DCB forwarding configuration
>       app/testpmd: fix DCB re-configuration
>       app/testpmd: verify DCB config during forward config
>       net/hns3: fix Rx/Tx queue numbers check
>       net/hns3: fix requested FC mode rollback
>       net/hns3: remove meaningless packet buffer rollback
>       net/hns3: fix DCB configuration
>       net/hns3: fix DCB reconfiguration
>       net/hns3: fix link speed when VF device is down
> 
> Ibtisam Tariq (1):
>       examples/vhost_crypto: remove unused short option
> 
> Igor Chauskin (2):
>       net/ena: switch memcpy to optimized version
>       net/ena: fix parsing of large LLQ header device argument
> 
> Igor Russkikh (2):
>       net/qede: reduce log verbosity
>       net/qede: accept bigger RSS table
> 
> Ilya Maximets (1):
>       net/virtio: fix interrupt unregistering for listening socket
> 
> Ivan Malov (5):
>       net/sfc: fix buffer size for flow parse
>       net: fix comment in IPv6 header
>       net/sfc: fix error path inconsistency
>       common/sfc_efx/base: fix indication of MAE encap support
>       net/sfc: fix outer rule rollback on error
> 
> Jerin Jacob (1):
>       examples: fix pkg-config override
> 
> Jiawei Wang (4):
>       app/testpmd: fix NVGRE encap configuration
>       net/mlx5: fix resource release for mirror flow
>       net/mlx5: fix RSS flow item expansion for GRE key
>       net/mlx5: fix RSS flow item expansion for NVGRE
> 
> Jiawei Zhu (1):
>       net/mlx5: fix Rx segmented packets on mbuf starvation
> 
> Jiawen Wu (4):
>       net/txgbe: remove unused functions
>       net/txgbe: fix Rx missed packet counter
>       net/txgbe: update packet type
>       net/txgbe: fix QinQ strip
> 
> Jiayu Hu (2):
>       vhost: fix queue initialization
>       vhost: fix redundant vring status change notification
> 
> Jie Wang (1):
>       net/ice: fix VSI array out of bounds access
> 
> John Daley (2):
>       net/enic: fix flow initialization error handling
>       net/enic: enable GENEVE offload via VNIC configuration
> 
> Juraj Linkeš (1):
>       eal/arm64: fix platform register bit
> 
> Kai Ji (2):
>       test/crypto: fix auth-cipher compare length in OOP
>       test/crypto: copy offset data to OOP destination buffer
> 
> Kalesh AP (23):
>       net/bnxt: remove unused macro
>       net/bnxt: fix VNIC configuration
>       net/bnxt: fix firmware fatal error handling
>       net/bnxt: fix FW readiness check during recovery
>       net/bnxt: fix device readiness check
>       net/bnxt: fix VF info allocation
>       net/bnxt: fix HWRM and FW incompatibility handling
>       net/bnxt: mute some failure logs
>       app/testpmd: check MAC address query
>       net/bnxt: fix PCI write check
>       net/bnxt: fix link state operations
>       net/bnxt: fix timesync when PTP is not supported
>       net/bnxt: fix memory allocation for command response
>       net/bnxt: fix double free in port start failure
>       net/bnxt: fix configuring LRO
>       net/bnxt: fix health check alarm cancellation
>       net/bnxt: fix PTP support for Thor
>       net/bnxt: fix ring count calculation for Thor
>       net/bnxt: remove unnecessary forward declarations
>       net/bnxt: remove unused function parameters
>       net/bnxt: drop unused attribute
>       net/bnxt: fix single PF per port check
>       net/bnxt: prevent device access in error state
> 
> Kamil Vojanec (1):
>       net/mlx5/linux: fix firmware version
> 
> Kevin Traynor (5):
>       test/cmdline: fix inputs array
>       test/crypto: fix build with GCC 11
>       crypto/zuc: fix build with GCC 11
>       test: fix build with GCC 11
>       test/cmdline: silence clang 12 warning
> 
> Konstantin Ananyev (1):
>       acl: fix build with GCC 11
> 
> Lance Richardson (8):
>       net/bnxt: fix Rx buffer posting
>       net/bnxt: fix Tx length hint threshold
>       net/bnxt: fix handling of null flow mask
>       test: fix TCP header initialization
>       net/bnxt: fix Rx descriptor status
>       net/bnxt: fix Rx queue count
>       net/bnxt: fix dynamic VNIC count
>       eal: fix memory mapping on 32-bit target
> 
> Leyi Rong (1):
>       net/iavf: fix packet length parsing in AVX512
> 
> Li Zhang (1):
>       net/mlx5: fix flow actions index in cache
> 
> Luc Pelletier (2):
>       eal: fix race in control thread creation
>       eal: fix hang in control thread creation
> 
> Marvin Liu (5):
>       vhost: fix split ring potential buffer overflow
>       vhost: fix packed ring potential buffer overflow
>       vhost: fix batch dequeue potential buffer overflow
>       vhost: fix initialization of temporary header
>       vhost: fix initialization of async temporary header
> 
> Matan Azrad (5):
>       common/mlx5/linux: add glue function to query WQ
>       common/mlx5: add DevX command to query WQ
>       common/mlx5: add DevX commands for queue counters
>       vdpa/mlx5: fix virtq cleaning
>       vdpa/mlx5: fix device unplug
> 
> Michael Baum (1):
>       net/mlx5: fix flow age event triggering
> 
> Michal Krawczyk (5):
>       net/ena/base: improve style and comments
>       net/ena/base: fix type conversions by explicit casting
>       net/ena/base: destroy multiple wait events
>       net/ena: fix crash with unsupported device argument
>       net/ena: indicate Rx RSS hash presence
> 
> Min Hu (Connor) (25):
>       net/hns3: fix MTU config complexity
>       net/hns3: update HiSilicon copyright syntax
>       net/hns3: fix copyright date
>       examples/ptpclient: remove wrong comment
>       test/bpf: fix error message
>       doc: fix HiSilicon copyright syntax
>       net/hns3: remove unused macros
>       net/hns3: remove unused macro
>       app/eventdev: fix overflow in lcore list parsing
>       test/kni: fix a comment
>       test/kni: check init result
>       net/hns3: fix typos on comments
>       net/e1000: fix flow error message object
>       app/testpmd: fix division by zero on socket memory dump
>       net/kni: warn on stop failure
>       app/bbdev: check memory allocation
>       app/bbdev: fix HARQ error messages
>       raw/skeleton: add missing check after setting attribute
>       test/timer: check memzone allocation
>       app/crypto-perf: check memory allocation
>       examples/flow_classify: fix NUMA check of port and core
>       examples/l2fwd-cat: fix NUMA check of port and core
>       examples/skeleton: fix NUMA check of port and core
>       test: check flow classifier creation
>       test: fix division by zero
> 
> Murphy Yang (3):
>       net/ixgbe: fix RSS RETA being reset after port start
>       net/i40e: fix flow director config after flow validate
>       net/i40e: fix flow director for common pctypes
> 
> Natanael Copa (5):
>       common/dpaax/caamflib: fix build with musl
>       bus/dpaa: fix 64-bit arch detection
>       bus/dpaa: fix build with musl
>       net/cxgbe: remove use of uint type
>       app/testpmd: fix build with musl
> 
> Nipun Gupta (1):
>       bus/dpaa: fix statistics reading
> 
> Nithin Dabilpuram (3):
>       vfio: do not merge contiguous areas
>       vfio: fix DMA mapping granularity for IOVA as VA
>       test/mem: fix page size for external memory
> 
> Olivier Matz (1):
>       test/mempool: fix object initializer
> 
> Pallavi Kadam (1):
>       bus/pci: skip probing some Windows NDIS devices
> 
> Pavan Nikhilesh (4):
>       test/event: fix timeout accuracy
>       app/eventdev: fix timeout accuracy
>       app/eventdev: fix lcore parsing skipping last core
>       event/octeontx2: fix XAQ pool reconfigure
> 
> Pu Xu (1):
>       ip_frag: fix fragmenting IPv4 packet with header option
> 
> Qi Zhang (8):
>       net/ice/base: fix payload indicator on ptype
>       net/ice/base: fix uninitialized struct
>       net/ice/base: cleanup filter list on error
>       net/ice/base: fix memory allocation for MAC addresses
>       net/iavf: fix TSO max segment size
>       doc: fix matching versions in ice guide
>       net/iavf: fix wrong Tx context descriptor
>       common/iavf: fix duplicated offload bit
> 
> Radha Mohan Chintakuntla (1):
>       raw/octeontx2_dma: assign PCI device in DPI VF
> 
> Raslan Darawsheh (1):
>       ethdev: update flow item GTP QFI definition
> 
> Richael Zhuang (2):
>       test/power: add delay before checking CPU frequency
>       test/power: round CPU frequency to check
> 
> Robin Zhang (6):
>       net/i40e: announce request queue capability in PF
>       doc: update recommended versions for i40e
>       net/i40e: fix lack of MAC type when set MAC address
>       net/iavf: fix lack of MAC type when set MAC address
>       net/iavf: fix primary MAC type when starting port
>       net/i40e: fix primary MAC type when starting port
> 
> Rohit Raj (3):
>       net/dpaa2: fix getting link status
>       net/dpaa: fix getting link status
>       examples/l2fwd-crypto: fix packet length while decryption
> 
> Roy Shterman (1):
>       mem: fix freeing segments in --huge-unlink mode
> 
> Satheesh Paul (1):
>       net/octeontx2: fix VLAN filter
> 
> Savinay Dharmappa (1):
>       sched: fix traffic class oversubscription parameter
> 
> Shijith Thotton (3):
>       eventdev: fix case to initiate crypto adapter service
>       event/octeontx2: fix crypto adapter queue pair operations
>       event/octeontx2: configure crypto adapter xaq pool
> 
> Siwar Zitouni (1):
>       net/ice: fix disabling promiscuous mode
> 
> Somnath Kotur (5):
>       net/bnxt: fix xstats get
>       net/bnxt: fix Rx and Tx timestamps
>       net/bnxt: fix Tx timestamp init
>       net/bnxt: refactor multi-queue Rx configuration
>       net/bnxt: fix Rx timestamp when FIFO pending bit is set
> 
> Stanislaw Kardach (6):
>       test: proceed if timer subsystem already initialized
>       stack: allow lock-free only on relevant architectures
>       test/distributor: fix worker notification in burst mode
>       test/distributor: fix burst flush on worker quit
>       net/ena: remove endian swap functions
>       net/ena: report default ring size
> 
> Stephen Hemminger (2):
>       kni: refactor user request processing
>       net/bnxt: use prefix on global function
> 
> Suanming Mou (1):
>       net/mlx5: fix counter offset detection
> 
> Tal Shnaiderman (2):
>       eal/windows: fix default thread priority
>       eal/windows: fix return codes of pthread shim layer
> 
> Tengfei Zhang (1):
>       net/pcap: fix file descriptor leak on close
> 
> Thinh Tran (1):
>       test: fix autotest handling of skipped tests
> 
> Thomas Monjalon (18):
>       bus/pci: fix Windows kernel driver categories
>       eal: fix comment of OS-specific header files
>       buildtools: fix build with busybox
>       build: detect execinfo library on Linux
>       build: remove redundant _GNU_SOURCE definitions
>       eal: fix build with musl
>       net/igc: remove use of uint type
>       event/dlb: fix header includes for musl
>       examples/bbdev: fix header include for musl
>       drivers: fix log level after loading
>       app/regex: fix usage text
>       app/testpmd: fix usage text
>       doc: fix names of UIO drivers
>       doc: fix build with Sphinx 4
>       bus/pci: support I/O port operations with musl
>       app: fix exit messages
>       regex/octeontx2: remove unused include directory
>       doc: remove PDF requirements
> 
> Tianyu Li (1):
>       net/memif: fix Tx bps statistics for zero-copy
> 
> Timothy McDaniel (2):
>       event/dlb2: remove references to deferred scheduling
>       doc: fix runtime options in DLB2 guide
> 
> Tyler Retzlaff (1):
>       eal: add C++ include guard for reciprocal header
> 
> Vadim Podovinnikov (1):
>       net/bonding: fix LACP system address check
> 
> Venkat Duvvuru (1):
>       net/bnxt: fix queues per VNIC
> 
> Viacheslav Ovsiienko (16):
>       net/mlx5: fix external buffer pool registration for Rx queue
>       net/mlx5: fix metadata item validation for ingress flows
>       net/mlx5: fix hashed list size for tunnel flow groups
>       net/mlx5: fix UAR allocation diagnostics messages
>       common/mlx5: add timestamp format support to DevX
>       vdpa/mlx5: support timestamp format
>       net/mlx5: fix Rx metadata leftovers
>       net/mlx5: fix drop action for Direct Rules/Verbs
>       net/mlx4: fix RSS action with null hash key
>       net/mlx5: support timestamp format
>       regex/mlx5: support timestamp format
>       app/testpmd: fix segment number check
>       net/mlx5: remove drop queue function prototypes
>       net/mlx4: fix buffer leakage on device close
>       net/mlx5: fix probing device in legacy bonding mode
>       net/mlx5: fix receiving queue timestamp format
> 
> Wei Huang (1):
>       raw/ifpga: fix device name format
> 
> Wenjun Wu (3):
>       net/ice: check some functions return
>       net/ice: fix RSS hash update
>       net/ice: fix RSS for L2 packet
> 
> Wenwu Ma (1):
>       net/ice: fix illegal access when removing MAC filter
> 
> Wenzhuo Lu (2):
>       net/iavf: fix crash in AVX512
>       net/ice: fix crash in AVX512
> 
> Wisam Jaddo (1):
>       app/flow-perf: fix encap/decap actions
> 
> Xiao Wang (1):
>       vdpa/ifc: check PCI config read
> 
> Xiaoyu Min (4):
>       net/mlx5: support RSS expansion for IPv6 GRE
>       net/mlx5: fix shared inner RSS
>       net/mlx5: fix missing shared RSS hash types
>       net/mlx5: fix redundant flow after RSS expansion
> 
> Xiaoyun Li (2):
>       app/testpmd: remove unnecessary UDP tunnel check
>       net/i40e: fix IPv4 fragment offload
> 
> Xueming Li (2):
>       version: 20.11.2-rc1
>       net/virtio: fix vectorized Rx queue rearm
> 
> Youri Querry (1):
>       bus/fslmc: fix random portal hangs with qbman 5.0
> 
> Yunjian Wang (5):
>       vfio: fix API description
>       net/mlx5: fix using flow tunnel before null check
>       vfio: fix duplicated user mem map
>       net/mlx4: fix leak when configured repeatedly
>       net/mlx5: fix leak when configured repeatedly

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs
  2021-06-29 16:28  3% ` Tyler Retzlaff
@ 2021-06-29 18:38  0%   ` Kinsella, Ray
  2021-06-30 19:56  4%     ` Tyler Retzlaff
  0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2021-06-29 18:38 UTC (permalink / raw)
  To: Tyler Retzlaff; +Cc: dev, ferruh.yigit, thomas, david.marchand, stephen



On 29/06/2021 17:28, Tyler Retzlaff wrote:
> On Tue, Jun 29, 2021 at 05:00:31PM +0100, Ray Kinsella wrote:
>> Clarifying the ABI policy on the promotion of experimental APIs to stable.
>> We have a fair number of APIs that have been experimental for more than
>> 2 years. This policy amendment indicates that these APIs should be
>> promoted or removed, or should at least form a conversation between the
>> maintainer and original contributor.
>>
>> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
>> ---
>>  doc/guides/contributing/abi_policy.rst | 20 +++++++++++++++++---
>>  1 file changed, 17 insertions(+), 3 deletions(-)
>>
>> diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
>> index 4ad87dbfed..58bc45b8a5 100644
>> --- a/doc/guides/contributing/abi_policy.rst
>> +++ b/doc/guides/contributing/abi_policy.rst
>> @@ -26,9 +26,10 @@ General Guidelines
>>     symbols is managed with :ref:`ABI Versioning <abi_versioning>`.
>>  #. The removal of symbols is considered an :ref:`ABI breakage <abi_breakages>`,
>>     once approved these will form part of the next ABI version.
>> -#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may
>> -   be changed or removed without prior notice, as they are not considered part
>> -   of an ABI version.
>> +#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may be
>> +   changed or removed without prior notice, as they are not considered part of
>> +   an ABI version. The :ref:`experimental <experimental_apis>` status of an API
>> +   is not an indefinite state.
>>  #. Updates to the :ref:`minimum hardware requirements <hw_rqmts>`, which drop
>>     support for hardware which was previously supported, should be treated as an
>>     ABI change.
>> @@ -358,3 +359,16 @@ Libraries
>>  Libraries marked as ``experimental`` are entirely not considered part of an ABI
>>  version.
>>  All functions in such libraries may be changed or removed without prior notice.
>> +
>> +Promotion to stable
>> +~~~~~~~~~~~~~~~~~~~
>> +
>> +Ordinarily APIs marked as ``experimental`` will be promoted to the stable API
>> +once a maintainer and/or the original contributor is satisfied that the API is
>> +reasonably mature. In exceptional circumstances, should an API still be
> 
> this seems vague and arbitrary. is there a way we can have a more
> quantitative metric for what "reasonably mature" means.
> 
>> +classified as ``experimental`` after two years and is without any prospect of
>> +becoming part of the stable API. The API will then become a candidate for
>> +removal, to avoid the accumulation of abandoned symbols.
> 
> i think with the above comment the basis for removal then depends on
> whatever metric is used to determine maturity. if it is still changing
> then it seems like it is useful and still evolving, so perhaps it should
> not be removed; but if it hasn't changed and doesn't meet the metric for
> being made stable, then perhaps it becomes a candidate for removal.

Good idea. 

I think it is reasonable to add a clause that indicates that any change 
to the "API signature" would reset the clock.

However equally any changes to the implementation do not reset the clock.

Would that work?
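
A minimal sketch of the clock rule proposed above, assuming the two-year
window from the patch text. It is not DPDK code: the class, field, and
function names are hypothetical and only illustrate that a signature change
restarts the clock while an implementation-only change does not.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumption: "two years" from the patch text, approximated as 730 days.
EXPERIMENTAL_LIMIT_DAYS = 2 * 365


@dataclass
class ExperimentalApi:
    """Hypothetical record for one API marked experimental."""
    name: str
    introduced: date
    last_signature_change: Optional[date] = None  # None if never changed

    def clock_start(self) -> date:
        # The clock runs from the most recent signature change, if any,
        # otherwise from the date the API was introduced.
        return self.last_signature_change or self.introduced

    def record_change(self, when: date, signature_changed: bool) -> None:
        # Only a change to the API signature resets the clock;
        # implementation-only changes leave it untouched.
        if signature_changed:
            self.last_signature_change = when

    def is_review_candidate(self, today: date) -> bool:
        # True once the API has been experimental, with an unchanged
        # signature, for at least the limit.
        return (today - self.clock_start()).days >= EXPERIMENTAL_LIMIT_DAYS
```

With this shape, an API introduced in mid-2019 would be up for review by
mid-2021 unless its signature changed in the meantime; a bug fix in the
implementation would not defer that review.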

> 
>> +
>> +The promotion or removal of symbols will typically form part of a conversation
>> +between the maintainer and the original contributor.
> 
> this should extend beyond just symbols. there are other changes that
> impact the abi where exported symbols don't change. e.g. additions to
> return values sets.
> 
> thanks for working on this.
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in eal lib
  2021-06-24 12:14  0% ` David Marchand
  2021-06-24 12:15  0%   ` Kinsella, Ray
@ 2021-06-29 16:50  0%   ` Tyler Retzlaff
  1 sibling, 0 replies; 200+ results
From: Tyler Retzlaff @ 2021-06-29 16:50 UTC (permalink / raw)
  To: David Marchand
  Cc: Kinsella, Ray, Thomas Monjalon, Stephen Hemminger, Burakov,
	Anatoly, dpdk-dev

On Thu, Jun 24, 2021 at 02:14:16PM +0200, David Marchand wrote:
> On Thu, Jun 24, 2021 at 12:31 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
> >
> > Hi Anatoly & Thomas,
> >
> > The following eal experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point.
> 
> Just an additional comment.
> Marking stable is not the only choice.
> We can also consider hiding such symbols (marking internal) if there
> is no clear usecase out of DPDK.

+1

there has to be a very strong/clear case for promotion to public.

> 
> 
> -- 
> David Marchand

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs
  2021-06-29 16:00 21% [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs Ray Kinsella
@ 2021-06-29 16:28  3% ` Tyler Retzlaff
  2021-06-29 18:38  0%   ` Kinsella, Ray
  2021-07-01 10:31 23% ` [dpdk-dev] [PATCH v2] " Ray Kinsella
  2021-07-01 10:38 23% ` [dpdk-dev] [PATCH v3] doc: policy on the " Ray Kinsella
  2 siblings, 1 reply; 200+ results
From: Tyler Retzlaff @ 2021-06-29 16:28 UTC (permalink / raw)
  To: Ray Kinsella; +Cc: dev, ferruh.yigit, thomas, david.marchand, stephen

On Tue, Jun 29, 2021 at 05:00:31PM +0100, Ray Kinsella wrote:
> Clarifying the ABI policy on the promotion of experimental APIs to stable.
> We have a fair number of APIs that have been experimental for more than
> 2 years. This policy amendment indicates that these APIs should be
> promoted or removed, or should at least form a conversation between the
> maintainer and original contributor.
> 
> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> ---
>  doc/guides/contributing/abi_policy.rst | 20 +++++++++++++++++---
>  1 file changed, 17 insertions(+), 3 deletions(-)
> 
> diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
> index 4ad87dbfed..58bc45b8a5 100644
> --- a/doc/guides/contributing/abi_policy.rst
> +++ b/doc/guides/contributing/abi_policy.rst
> @@ -26,9 +26,10 @@ General Guidelines
>     symbols is managed with :ref:`ABI Versioning <abi_versioning>`.
>  #. The removal of symbols is considered an :ref:`ABI breakage <abi_breakages>`,
>     once approved these will form part of the next ABI version.
> -#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may
> -   be changed or removed without prior notice, as they are not considered part
> -   of an ABI version.
> +#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may be
> +   changed or removed without prior notice, as they are not considered part of
> +   an ABI version. The :ref:`experimental <experimental_apis>` status of an API
> +   is not an indefinite state.
>  #. Updates to the :ref:`minimum hardware requirements <hw_rqmts>`, which drop
>     support for hardware which was previously supported, should be treated as an
>     ABI change.
> @@ -358,3 +359,16 @@ Libraries
>  Libraries marked as ``experimental`` are entirely not considered part of an ABI
>  version.
>  All functions in such libraries may be changed or removed without prior notice.
> +
> +Promotion to stable
> +~~~~~~~~~~~~~~~~~~~
> +
> +Ordinarily APIs marked as ``experimental`` will be promoted to the stable API
> +once a maintainer and/or the original contributor is satisfied that the API is
> +reasonably mature. In exceptional circumstances, should an API still be

this seems vague and arbitrary. is there a way we can have a more
quantitative metric for what "reasonably mature" means.

> +classified as ``experimental`` after two years and is without any prospect of
> +becoming part of the stable API. The API will then become a candidate for
> +removal, to avoid the accumulation of abandoned symbols.

i think with the above comment the basis for removal then depends on
whatever metric is used to determine maturity. if it is still changing
then it seems like it is useful and still evolving, so perhaps it should
not be removed; but if it hasn't changed and doesn't meet the metric for
being made stable, then perhaps it becomes a candidate for removal.

> +
> +The promotion or removal of symbols will typically form part of a conversation
> +between the maintainer and the original contributor.

this should extend beyond just symbols. there are other changes that
impact the abi where exported symbols don't change. e.g. additions to
return values sets.

thanks for working on this.

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v1] doc: policy on promotion of experimental APIs
@ 2021-06-29 16:00 21% Ray Kinsella
  2021-06-29 16:28  3% ` Tyler Retzlaff
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Ray Kinsella @ 2021-06-29 16:00 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, thomas, david.marchand, stephen, Ray Kinsella

Clarify the ABI policy on the promotion of experimental APIs to stable.
We have a fair number of APIs that have been experimental for more than
2 years. This policy amendment indicates that these APIs should be
promoted or removed, or should at least form a conversation between the
maintainer and original contributor.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
 doc/guides/contributing/abi_policy.rst | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)

diff --git a/doc/guides/contributing/abi_policy.rst b/doc/guides/contributing/abi_policy.rst
index 4ad87dbfed..58bc45b8a5 100644
--- a/doc/guides/contributing/abi_policy.rst
+++ b/doc/guides/contributing/abi_policy.rst
@@ -26,9 +26,10 @@ General Guidelines
    symbols is managed with :ref:`ABI Versioning <abi_versioning>`.
 #. The removal of symbols is considered an :ref:`ABI breakage <abi_breakages>`,
    once approved these will form part of the next ABI version.
-#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may
-   be changed or removed without prior notice, as they are not considered part
-   of an ABI version.
+#. Libraries or APIs marked as :ref:`experimental <experimental_apis>` may be
+   changed or removed without prior notice, as they are not considered part of
+   an ABI version. The :ref:`experimental <experimental_apis>` status of an API
+   is not an indefinite state.
 #. Updates to the :ref:`minimum hardware requirements <hw_rqmts>`, which drop
    support for hardware which was previously supported, should be treated as an
    ABI change.
@@ -358,3 +359,16 @@ Libraries
 Libraries marked as ``experimental`` are entirely not considered part of an ABI
 version.
 All functions in such libraries may be changed or removed without prior notice.
+
+Promotion to stable
+~~~~~~~~~~~~~~~~~~~
+
+Ordinarily APIs marked as ``experimental`` will be promoted to the stable API
+once a maintainer and/or the original contributor is satisfied that the API is
+reasonably mature. In exceptional circumstances, should an API still be
+classified as ``experimental`` after two years and be without any prospect of
+becoming part of the stable API, the API will then become a candidate for
+removal, to avoid the accumulation of abandoned symbols.
+
+The promotion or removal of symbols will typically form part of a conversation
+between the maintainer and the original contributor.
-- 
2.26.2



* [dpdk-dev] [PATCH v5 4/7] power: remove thread safety from PMD power API's
    2021-06-29 15:48  3%         ` [dpdk-dev] [PATCH v5 1/7] power_intrinsics: use callbacks for comparison Anatoly Burakov
@ 2021-06-29 15:48  3%         ` Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-06-29 15:48 UTC (permalink / raw)
  To: dev, David Hunt; +Cc: konstantin.ananyev, ciara.loftus

Currently, we expect that only one callback can be active at any given
moment, for a particular queue configuration, which is relatively easy
to implement in a thread-safe way. However, we're about to add support
for multiple queues per lcore, which will greatly increase the
possibility of various race conditions.

We could have used something like an RCU for this use case, but absent
a pressing need for thread safety we'll go the easy way and just
mandate that the APIs are to be called when all affected ports are
stopped, and document this limitation. This greatly simplifies the
`rte_power_monitor`-related code.
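
For illustration only, the "must be stopped" contract can be sketched
outside of DPDK. The types and functions below (`model_queue`,
`model_queue_stopped()`, `model_pmgmt_enable()`, `example_sequence()`) are
hypothetical stand-ins, not part of this patch; they only mirror the
tri-state check and the -EINVAL/-EBUSY mapping the patch introduces. In a
real application the queue would presumably be stopped and restarted with
the ethdev calls (e.g. `rte_eth_dev_rx_queue_stop()`).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the ethdev queue state; not DPDK's definitions. */
enum queue_state { QUEUE_STOPPED = 0, QUEUE_STARTED = 1 };

/* Hypothetical model of a single Rx queue. */
struct model_queue {
	int valid;               /* 0 models an invalid port/queue id */
	enum queue_state state;
	int pmgmt_enabled;
};

/* 1 if stopped, 0 if still running, -1 if the queue is invalid. */
static int
model_queue_stopped(const struct model_queue *q)
{
	if (q == NULL || !q->valid)
		return -1;
	return q->state == QUEUE_STOPPED;
}

/* Refuse to change the power management scheme unless the queue is stopped. */
static int
model_pmgmt_enable(struct model_queue *q)
{
	const int ret = model_queue_stopped(q);

	if (ret != 1) {
		/* error means invalid queue, 0 means queue wasn't stopped */
		return ret < 0 ? -EINVAL : -EBUSY;
	}
	q->pmgmt_enabled = 1;
	return 0;
}

/* Caller-side sequence the commit message mandates: stop, configure, restart. */
static void
example_sequence(void)
{
	struct model_queue q = { .valid = 1, .state = QUEUE_STARTED };

	assert(model_pmgmt_enable(&q) == -EBUSY); /* still running: refused */
	q.state = QUEUE_STOPPED;                  /* stands in for stopping the queue */
	assert(model_pmgmt_enable(&q) == 0);
	q.state = QUEUE_STARTED;                  /* stands in for restarting it */
}
```

The point of the design is that no fences or "sleep in progress" flags are
needed once the caller guarantees that no data plane thread is polling the
queue while its configuration changes.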

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Add check for stopped queue
    - Clarified doc message
    - Added release notes

 doc/guides/rel_notes/release_21_08.rst |   5 +
 lib/power/meson.build                  |   3 +
 lib/power/rte_power_pmd_mgmt.c         | 133 ++++++++++---------------
 lib/power/rte_power_pmd_mgmt.h         |   6 ++
 4 files changed, 67 insertions(+), 80 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index 9d1cfac395..f015c509fc 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -88,6 +88,11 @@ API Changes
 
 * eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
+* rte_power: The experimental PMD power management API is no longer considered
+  to be thread safe; all Rx queues affected by the API will now need to be
+  stopped before making any changes to the power management scheme.
+
+
 ABI Changes
 -----------
 
diff --git a/lib/power/meson.build b/lib/power/meson.build
index c1097d32f1..4f6a242364 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -21,4 +21,7 @@ headers = files(
         'rte_power_pmd_mgmt.h',
         'rte_power_guest_channel.h',
 )
+if cc.has_argument('-Wno-cast-qual')
+    cflags += '-Wno-cast-qual'
+endif
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c
index db03cbf420..9b95cf1794 100644
--- a/lib/power/rte_power_pmd_mgmt.c
+++ b/lib/power/rte_power_pmd_mgmt.c
@@ -40,8 +40,6 @@ struct pmd_queue_cfg {
 	/**< Callback mode for this queue */
 	const struct rte_eth_rxtx_callback *cur_cb;
 	/**< Callback instance */
-	volatile bool umwait_in_progress;
-	/**< are we currently sleeping? */
 	uint64_t empty_poll_stats;
 	/**< Number of empty polls */
 } __rte_cache_aligned;
@@ -92,30 +90,11 @@ clb_umwait(uint16_t port_id, uint16_t qidx, struct rte_mbuf **pkts __rte_unused,
 			struct rte_power_monitor_cond pmc;
 			uint16_t ret;
 
-			/*
-			 * we might get a cancellation request while being
-			 * inside the callback, in which case the wakeup
-			 * wouldn't work because it would've arrived too early.
-			 *
-			 * to get around this, we notify the other thread that
-			 * we're sleeping, so that it can spin until we're done.
-			 * unsolicited wakeups are perfectly safe.
-			 */
-			q_conf->umwait_in_progress = true;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-			/* check if we need to cancel sleep */
-			if (q_conf->pwr_mgmt_state == PMD_MGMT_ENABLED) {
-				/* use monitoring condition to sleep */
-				ret = rte_eth_get_monitor_addr(port_id, qidx,
-						&pmc);
-				if (ret == 0)
-					rte_power_monitor(&pmc, UINT64_MAX);
-			}
-			q_conf->umwait_in_progress = false;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
+			/* use monitoring condition to sleep */
+			ret = rte_eth_get_monitor_addr(port_id, qidx,
+					&pmc);
+			if (ret == 0)
+				rte_power_monitor(&pmc, UINT64_MAX);
 		}
 	} else
 		q_conf->empty_poll_stats = 0;
@@ -177,12 +156,24 @@ clb_scale_freq(uint16_t port_id, uint16_t qidx,
 	return nb_rx;
 }
 
+static int
+queue_stopped(const uint16_t port_id, const uint16_t queue_id)
+{
+	struct rte_eth_rxq_info qinfo;
+
+	if (rte_eth_rx_queue_info_get(port_id, queue_id, &qinfo) < 0)
+		return -1;
+
+	return qinfo.queue_state == RTE_ETH_QUEUE_STATE_STOPPED;
+}
+
 int
 rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		uint16_t queue_id, enum rte_power_pmd_mgmt_type mode)
 {
 	struct pmd_queue_cfg *queue_cfg;
 	struct rte_eth_dev_info info;
+	rte_rx_callback_fn clb;
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
@@ -203,6 +194,14 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		goto end;
 	}
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		ret = ret < 0 ? -EINVAL : -EBUSY;
+		goto end;
+	}
+
 	queue_cfg = &port_cfg[port_id][queue_id];
 
 	if (queue_cfg->pwr_mgmt_state != PMD_MGMT_DISABLED) {
@@ -232,17 +231,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->umwait_in_progress = false;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* ensure we update our state before callback starts */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_umwait, NULL);
+		clb = clb_umwait;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_SCALE:
@@ -269,16 +258,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id,
-				queue_id, clb_scale_freq, NULL);
+		clb = clb_scale_freq;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_PAUSE:
@@ -286,18 +266,21 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		if (global_data.tsc_per_us == 0)
 			calc_tsc();
 
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_pause, NULL);
+		clb = clb_pause;
 		break;
+	default:
+		RTE_LOG(DEBUG, POWER, "Invalid power management type\n");
+		ret = -EINVAL;
+		goto end;
 	}
+
+	/* initialize data before enabling the callback */
+	queue_cfg->empty_poll_stats = 0;
+	queue_cfg->cb_mode = mode;
+	queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
+	queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
+			clb, NULL);
+
 	ret = 0;
 end:
 	return ret;
@@ -308,12 +291,20 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		uint16_t port_id, uint16_t queue_id)
 {
 	struct pmd_queue_cfg *queue_cfg;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
 	if (lcore_id >= RTE_MAX_LCORE || queue_id >= RTE_MAX_QUEUES_PER_PORT)
 		return -EINVAL;
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		return ret < 0 ? -EINVAL : -EBUSY;
+	}
+
 	/* no need to check queue id as wrong queue id would not be enabled */
 	queue_cfg = &port_cfg[port_id][queue_id];
 
@@ -323,27 +314,8 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 	/* stop any callbacks from progressing */
 	queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
 
-	/* ensure we update our state before continuing */
-	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
 	switch (queue_cfg->cb_mode) {
-	case RTE_POWER_MGMT_TYPE_MONITOR:
-	{
-		bool exit = false;
-		do {
-			/*
-			 * we may request cancellation while the other thread
-			 * has just entered the callback but hasn't started
-			 * sleeping yet, so keep waking it up until we know it's
-			 * done sleeping.
-			 */
-			if (queue_cfg->umwait_in_progress)
-				rte_power_monitor_wakeup(lcore_id);
-			else
-				exit = true;
-		} while (!exit);
-	}
-	/* fall-through */
+	case RTE_POWER_MGMT_TYPE_MONITOR: /* fall-through */
 	case RTE_POWER_MGMT_TYPE_PAUSE:
 		rte_eth_remove_rx_callback(port_id, queue_id,
 				queue_cfg->cur_cb);
@@ -356,10 +328,11 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		break;
 	}
 	/*
-	 * we don't free the RX callback here because it is unsafe to do so
-	 * unless we know for a fact that all data plane threads have stopped.
+	 * the API doc mandates that the user stops all processing on affected
+	 * ports before calling any of these API's, so we can assume that the
+	 * callbacks can be freed. we're intentionally casting away const-ness.
 	 */
-	queue_cfg->cur_cb = NULL;
+	rte_free((void *)queue_cfg->cur_cb);
 
 	return 0;
 }
diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
index 7a0ac24625..444e7b8a66 100644
--- a/lib/power/rte_power_pmd_mgmt.h
+++ b/lib/power/rte_power_pmd_mgmt.h
@@ -43,6 +43,9 @@ enum rte_power_pmd_mgmt_type {
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue will be polled from.
  * @param port_id
@@ -69,6 +72,9 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id,
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue is polled from.
  * @param port_id
-- 
2.25.1



* [dpdk-dev] [PATCH v5 1/7] power_intrinsics: use callbacks for comparison
  @ 2021-06-29 15:48  3%         ` Anatoly Burakov
  2021-06-29 15:48  3%         ` [dpdk-dev] [PATCH v5 4/7] power: remove thread safety from PMD power API's Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-06-29 15:48 UTC (permalink / raw)
  To: dev, Timothy McDaniel, Beilei Xing, Jingjing Wu, Qiming Yang,
	Qi Zhang, Haiyue Wang, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Bruce Richardson, Konstantin Ananyev
  Cc: david.hunt, ciara.loftus

Previously, the semantics of power monitor were such that we were
checking current value against the expected value, and if they matched,
then the sleep was aborted. This is somewhat inflexible, because it only
allowed us to check for a specific value in a specific way.

This commit replaces the comparison with a user callback mechanism, so
that any PMD (or other code) using `rte_power_monitor()` can define
its own comparison semantics and decide for itself when entering the
power optimized state should be aborted.

Existing implementations are adjusted to follow the new semantics.
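
As a rough, self-contained sketch of the new semantics (not the DPDK
headers): `monitor_cond`, `monitor_clb_t` and `MONITOR_OPAQUE_SZ` below are
local stand-ins that mirror the definitions added by this patch, and
`would_sleep()` is a hypothetical helper that models only the check done
before the sleep instruction, not the sleep itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins mirroring the patch's definitions; NOT the DPDK headers. */
#define MONITOR_OPAQUE_SZ 4

typedef int (*monitor_clb_t)(const uint64_t val,
		const uint64_t opaque[MONITOR_OPAQUE_SZ]);

struct monitor_cond {
	volatile void *addr;  /* address to monitor */
	uint8_t size;         /* 1, 2, 4 or 8 bytes */
	monitor_clb_t fn;     /* decides whether to abort the sleep */
	uint64_t opaque[MONITOR_OPAQUE_SZ]; /* callback-specific data */
};

#define CLB_VAL_IDX 0
#define CLB_MSK_IDX 1

/* Abort (-1) once the masked value matches the expected value, else 0. */
static int
masked_eq_callback(const uint64_t val,
		const uint64_t opaque[MONITOR_OPAQUE_SZ])
{
	const uint64_t m = opaque[CLB_MSK_IDX];
	const uint64_t v = opaque[CLB_VAL_IDX];

	return (val & m) == v ? -1 : 0;
}

/*
 * Hypothetical helper: returns 1 if the sleep would proceed, 0 if the
 * callback asked to abort, -EINVAL on a NULL callback or a bad size.
 */
static int
would_sleep(const struct monitor_cond *pmc)
{
	uint64_t cur;

	assert(pmc != NULL);

	if (pmc->fn == NULL)
		return -EINVAL;

	switch (pmc->size) {
	case 1: cur = *(volatile const uint8_t *)pmc->addr; break;
	case 2: cur = *(volatile const uint16_t *)pmc->addr; break;
	case 4: cur = *(volatile const uint32_t *)pmc->addr; break;
	case 8: cur = *(volatile const uint64_t *)pmc->addr; break;
	default: return -EINVAL;
	}

	/* any non-zero return from the callback aborts the sleep */
	return pmc->fn(cur, pmc->opaque) != 0 ? 0 : 1;
}
```

A driver that needs a different test (for example "value changed" rather
than "value equals") would only swap the callback and what it stores in the
opaque array; the monitoring code itself would stay the same.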

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---

Notes:
    v4:
    - Return error if callback is set to NULL
    - Replace raw number with a macro in monitor condition opaque data
    
    v2:
    - Use callback mechanism for more flexibility
    - Address feedback from Konstantin

 doc/guides/rel_notes/release_21_08.rst        |  1 +
 drivers/event/dlb2/dlb2.c                     | 17 ++++++++--
 drivers/net/i40e/i40e_rxtx.c                  | 20 +++++++----
 drivers/net/iavf/iavf_rxtx.c                  | 20 +++++++----
 drivers/net/ice/ice_rxtx.c                    | 20 +++++++----
 drivers/net/ixgbe/ixgbe_rxtx.c                | 20 +++++++----
 drivers/net/mlx5/mlx5_rx.c                    | 17 ++++++++--
 .../include/generic/rte_power_intrinsics.h    | 33 +++++++++++++++----
 lib/eal/x86/rte_power_intrinsics.c            | 17 +++++-----
 9 files changed, 121 insertions(+), 44 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index a6ecfdf3ce..c84ac280f5 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -84,6 +84,7 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =======================================================
 
+* eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
 ABI Changes
 -----------
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index eca183753f..252bbd8d5e 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -3154,6 +3154,16 @@ dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 	}
 }
 
+#define CLB_MASK_IDX 0
+#define CLB_VAL_IDX 1
+static int
+dlb2_monitor_callback(const uint64_t val,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
+{
+	/* abort if the value matches */
+	return (val & opaque[CLB_MASK_IDX]) == opaque[CLB_VAL_IDX] ? -1 : 0;
+}
+
 static inline int
 dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 		  struct dlb2_eventdev_port *ev_port,
@@ -3194,8 +3204,11 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 			expected_value = 0;
 
 		pmc.addr = monitor_addr;
-		pmc.val = expected_value;
-		pmc.mask = qe_mask.raw_qe[1];
+		/* store expected value and comparison mask in opaque data */
+		pmc.opaque[CLB_VAL_IDX] = expected_value;
+		pmc.opaque[CLB_MASK_IDX] = qe_mask.raw_qe[1];
+		/* set up callback */
+		pmc.fn = dlb2_monitor_callback;
 		pmc.size = sizeof(uint64_t);
 
 		rte_power_monitor(&pmc, timeout + start_ticks);
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 6c58decece..081682f88b 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -81,6 +81,18 @@
 #define I40E_TX_OFFLOAD_SIMPLE_NOTSUP_MASK \
 		(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_SIMPLE_SUP_MASK)
 
+static int
+i40e_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -93,12 +105,8 @@ i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = i40e_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index 0361af0d85..7ed196ec22 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -57,6 +57,18 @@ iavf_proto_xtr_type_to_rxdid(uint8_t flex_type)
 				rxdid_map[flex_type] : IAVF_RXDID_COMMS_OVS_1;
 }
 
+static int
+iavf_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -69,12 +81,8 @@ iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = iavf_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
index fc9bb5a3e7..d12437d19d 100644
--- a/drivers/net/ice/ice_rxtx.c
+++ b/drivers/net/ice/ice_rxtx.c
@@ -27,6 +27,18 @@ uint64_t rte_net_ice_dynflag_proto_xtr_ipv6_flow_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_tcp_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_ip_offset_mask;
 
+static int
+ice_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -39,12 +51,8 @@ ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.status_error0;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
-	pmc->mask = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/* comparison callback */
+	pmc->fn = ice_monitor_callback;
 
 	/* register is 16-bit */
 	pmc->size = sizeof(uint16_t);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index d69f36e977..c814a28cb4 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1369,6 +1369,18 @@ const uint32_t
 		RTE_PTYPE_INNER_L3_IPV4_EXT | RTE_PTYPE_INNER_L4_UDP,
 };
 
+static int
+ixgbe_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -1381,12 +1393,8 @@ ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.upper.status_error;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
-	pmc->mask = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/* comparison callback */
+	pmc->fn = ixgbe_monitor_callback;
 
 	/* the registers are 32-bit */
 	pmc->size = sizeof(uint32_t);
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 777a1d6e45..17370b77dc 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -269,6 +269,18 @@ mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return rx_queue_count(rxq);
 }
 
+#define CLB_VAL_IDX 0
+#define CLB_MSK_IDX 1
+static int
+mlx_monitor_callback(const uint64_t value,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
+{
+	const uint64_t m = opaque[CLB_MSK_IDX];
+	const uint64_t v = opaque[CLB_VAL_IDX];
+
+	return (value & m) == v ? -1 : 0;
+}
+
 int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
 	struct mlx5_rxq_data *rxq = rx_queue;
@@ -282,8 +294,9 @@ int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 		return -rte_errno;
 	}
 	pmc->addr = &cqe->op_own;
-	pmc->val =  !!idx;
-	pmc->mask = MLX5_CQE_OWNER_MASK;
+	pmc->opaque[CLB_VAL_IDX] = !!idx;
+	pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_OWNER_MASK;
+	pmc->fn = mlx_monitor_callback;
 	pmc->size = sizeof(uint8_t);
 	return 0;
 }
diff --git a/lib/eal/include/generic/rte_power_intrinsics.h b/lib/eal/include/generic/rte_power_intrinsics.h
index dddca3d41c..c9aa52a86d 100644
--- a/lib/eal/include/generic/rte_power_intrinsics.h
+++ b/lib/eal/include/generic/rte_power_intrinsics.h
@@ -18,19 +18,38 @@
  * which are architecture-dependent.
  */
 
+/** Size of the opaque data in monitor condition */
+#define RTE_POWER_MONITOR_OPAQUE_SZ 4
+
+/**
+ * Callback definition for monitoring conditions. Callbacks with this signature
+ * will be used by `rte_power_monitor()` to check if the entering of power
+ * optimized state should be aborted.
+ *
+ * @param val
+ *   The value read from memory.
+ * @param opaque
+ *   Callback-specific data.
+ *
+ * @return
+ *   0 if entering of power optimized state should proceed
+ *   -1 if entering of power optimized state should be aborted
+ */
+typedef int (*rte_power_monitor_clb_t)(const uint64_t val,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ]);
 struct rte_power_monitor_cond {
 	volatile void *addr;  /**< Address to monitor for changes */
-	uint64_t val;         /**< If the `mask` is non-zero, location pointed
-	                       *   to by `addr` will be read and compared
-	                       *   against this value.
-	                       */
-	uint64_t mask;   /**< 64-bit mask to extract value read from `addr` */
-	uint8_t size;    /**< Data size (in bytes) that will be used to compare
-	                  *   expected value (`val`) with data read from the
+	uint8_t size;    /**< Data size (in bytes) that will be read from the
 	                  *   monitored memory location (`addr`). Can be 1, 2,
 	                  *   4, or 8. Supplying any other value will result in
 	                  *   an error.
 	                  */
+	rte_power_monitor_clb_t fn; /**< Callback to be used to check if
+	                             *   entering power optimized state should
+	                             *   be aborted.
+	                             */
+	uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ];
+	/**< Callback-specific data */
 };
 
 /**
diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 39ea9fdecd..66fea28897 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -76,6 +76,7 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
 	const unsigned int lcore_id = rte_lcore_id();
 	struct power_wait_status *s;
+	uint64_t cur_value;
 
 	/* prevent user from running this instruction if it's not supported */
 	if (!wait_supported)
@@ -91,6 +92,9 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (__check_val_size(pmc->size) < 0)
 		return -EINVAL;
 
+	if (pmc->fn == NULL)
+		return -EINVAL;
+
 	s = &wait_status[lcore_id];
 
 	/* update sleep address */
@@ -110,16 +114,11 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
 
-	/* if we have a comparison mask, we might not need to sleep at all */
-	if (pmc->mask) {
-		const uint64_t cur_value = __get_umwait_val(
-				pmc->addr, pmc->size);
-		const uint64_t masked = cur_value & pmc->mask;
+	cur_value = __get_umwait_val(pmc->addr, pmc->size);
 
-		/* if the masked value is already matching, abort */
-		if (masked == pmc->val)
-			goto end;
-	}
+	/* check if callback indicates we should abort */
+	if (pmc->fn(cur_value, pmc->opaque) != 0)
+		goto end;
 
 	/* execute UMWAIT */
 	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
-- 
2.25.1



* [dpdk-dev] [PATCH v4 4/7] power: remove thread safety from PMD power API's
    2021-06-28 15:54  3%       ` [dpdk-dev] [PATCH v4 1/7] power_intrinsics: use callbacks for comparison Anatoly Burakov
@ 2021-06-28 15:54  3%       ` Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-06-28 15:54 UTC (permalink / raw)
  To: dev, David Hunt; +Cc: konstantin.ananyev, ciara.loftus

Currently, we expect that only one callback can be active at any given
moment, for a particular queue configuration, which is relatively easy
to implement in a thread-safe way. However, we're about to add support
for multiple queues per lcore, which will greatly increase the
possibility of various race conditions.

We could have used something like an RCU for this use case, but absent
a pressing need for thread safety we'll go the easy way and just
mandate that the APIs are to be called when all affected ports are
stopped, and document this limitation. This greatly simplifies the
`rte_power_monitor`-related code.

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Add check for stopped queue
    - Clarified doc message
    - Added release notes

 doc/guides/rel_notes/release_21_08.rst |   5 +
 lib/power/meson.build                  |   3 +
 lib/power/rte_power_pmd_mgmt.c         | 133 ++++++++++---------------
 lib/power/rte_power_pmd_mgmt.h         |   6 ++
 4 files changed, 67 insertions(+), 80 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index 9d1cfac395..f015c509fc 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -88,6 +88,11 @@ API Changes
 
 * eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
+* rte_power: The experimental PMD power management API is no longer considered
+  to be thread safe; all Rx queues affected by the API will now need to be
+  stopped before making any changes to the power management scheme.
+
+
 ABI Changes
 -----------
 
diff --git a/lib/power/meson.build b/lib/power/meson.build
index c1097d32f1..4f6a242364 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -21,4 +21,7 @@ headers = files(
         'rte_power_pmd_mgmt.h',
         'rte_power_guest_channel.h',
 )
+if cc.has_argument('-Wno-cast-qual')
+    cflags += '-Wno-cast-qual'
+endif
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c
index db03cbf420..9b95cf1794 100644
--- a/lib/power/rte_power_pmd_mgmt.c
+++ b/lib/power/rte_power_pmd_mgmt.c
@@ -40,8 +40,6 @@ struct pmd_queue_cfg {
 	/**< Callback mode for this queue */
 	const struct rte_eth_rxtx_callback *cur_cb;
 	/**< Callback instance */
-	volatile bool umwait_in_progress;
-	/**< are we currently sleeping? */
 	uint64_t empty_poll_stats;
 	/**< Number of empty polls */
 } __rte_cache_aligned;
@@ -92,30 +90,11 @@ clb_umwait(uint16_t port_id, uint16_t qidx, struct rte_mbuf **pkts __rte_unused,
 			struct rte_power_monitor_cond pmc;
 			uint16_t ret;
 
-			/*
-			 * we might get a cancellation request while being
-			 * inside the callback, in which case the wakeup
-			 * wouldn't work because it would've arrived too early.
-			 *
-			 * to get around this, we notify the other thread that
-			 * we're sleeping, so that it can spin until we're done.
-			 * unsolicited wakeups are perfectly safe.
-			 */
-			q_conf->umwait_in_progress = true;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-			/* check if we need to cancel sleep */
-			if (q_conf->pwr_mgmt_state == PMD_MGMT_ENABLED) {
-				/* use monitoring condition to sleep */
-				ret = rte_eth_get_monitor_addr(port_id, qidx,
-						&pmc);
-				if (ret == 0)
-					rte_power_monitor(&pmc, UINT64_MAX);
-			}
-			q_conf->umwait_in_progress = false;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
+			/* use monitoring condition to sleep */
+			ret = rte_eth_get_monitor_addr(port_id, qidx,
+					&pmc);
+			if (ret == 0)
+				rte_power_monitor(&pmc, UINT64_MAX);
 		}
 	} else
 		q_conf->empty_poll_stats = 0;
@@ -177,12 +156,24 @@ clb_scale_freq(uint16_t port_id, uint16_t qidx,
 	return nb_rx;
 }
 
+static int
+queue_stopped(const uint16_t port_id, const uint16_t queue_id)
+{
+	struct rte_eth_rxq_info qinfo;
+
+	if (rte_eth_rx_queue_info_get(port_id, queue_id, &qinfo) < 0)
+		return -1;
+
+	return qinfo.queue_state == RTE_ETH_QUEUE_STATE_STOPPED;
+}
+
 int
 rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		uint16_t queue_id, enum rte_power_pmd_mgmt_type mode)
 {
 	struct pmd_queue_cfg *queue_cfg;
 	struct rte_eth_dev_info info;
+	rte_rx_callback_fn clb;
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
@@ -203,6 +194,14 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		goto end;
 	}
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		ret = ret < 0 ? -EINVAL : -EBUSY;
+		goto end;
+	}
+
 	queue_cfg = &port_cfg[port_id][queue_id];
 
 	if (queue_cfg->pwr_mgmt_state != PMD_MGMT_DISABLED) {
@@ -232,17 +231,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->umwait_in_progress = false;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* ensure we update our state before callback starts */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_umwait, NULL);
+		clb = clb_umwait;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_SCALE:
@@ -269,16 +258,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id,
-				queue_id, clb_scale_freq, NULL);
+		clb = clb_scale_freq;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_PAUSE:
@@ -286,18 +266,21 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		if (global_data.tsc_per_us == 0)
 			calc_tsc();
 
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_pause, NULL);
+		clb = clb_pause;
 		break;
+	default:
+		RTE_LOG(DEBUG, POWER, "Invalid power management type\n");
+		ret = -EINVAL;
+		goto end;
 	}
+
+	/* initialize data before enabling the callback */
+	queue_cfg->empty_poll_stats = 0;
+	queue_cfg->cb_mode = mode;
+	queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
+	queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
+			clb, NULL);
+
 	ret = 0;
 end:
 	return ret;
@@ -308,12 +291,20 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		uint16_t port_id, uint16_t queue_id)
 {
 	struct pmd_queue_cfg *queue_cfg;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
 	if (lcore_id >= RTE_MAX_LCORE || queue_id >= RTE_MAX_QUEUES_PER_PORT)
 		return -EINVAL;
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		return ret < 0 ? -EINVAL : -EBUSY;
+	}
+
 	/* no need to check queue id as wrong queue id would not be enabled */
 	queue_cfg = &port_cfg[port_id][queue_id];
 
@@ -323,27 +314,8 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 	/* stop any callbacks from progressing */
 	queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
 
-	/* ensure we update our state before continuing */
-	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
 	switch (queue_cfg->cb_mode) {
-	case RTE_POWER_MGMT_TYPE_MONITOR:
-	{
-		bool exit = false;
-		do {
-			/*
-			 * we may request cancellation while the other thread
-			 * has just entered the callback but hasn't started
-			 * sleeping yet, so keep waking it up until we know it's
-			 * done sleeping.
-			 */
-			if (queue_cfg->umwait_in_progress)
-				rte_power_monitor_wakeup(lcore_id);
-			else
-				exit = true;
-		} while (!exit);
-	}
-	/* fall-through */
+	case RTE_POWER_MGMT_TYPE_MONITOR: /* fall-through */
 	case RTE_POWER_MGMT_TYPE_PAUSE:
 		rte_eth_remove_rx_callback(port_id, queue_id,
 				queue_cfg->cur_cb);
@@ -356,10 +328,11 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		break;
 	}
 	/*
-	 * we don't free the RX callback here because it is unsafe to do so
-	 * unless we know for a fact that all data plane threads have stopped.
+	 * the API doc mandates that the user stops all processing on affected
+	 * ports before calling any of these API's, so we can assume that the
+	 * callbacks can be freed. we're intentionally casting away const-ness.
 	 */
-	queue_cfg->cur_cb = NULL;
+	rte_free((void *)queue_cfg->cur_cb);
 
 	return 0;
 }
diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
index 7a0ac24625..444e7b8a66 100644
--- a/lib/power/rte_power_pmd_mgmt.h
+++ b/lib/power/rte_power_pmd_mgmt.h
@@ -43,6 +43,9 @@ enum rte_power_pmd_mgmt_type {
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue will be polled from.
  * @param port_id
@@ -69,6 +72,9 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id,
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue is polled from.
  * @param port_id
-- 
2.25.1



* [dpdk-dev] [PATCH v4 1/7] power_intrinsics: use callbacks for comparison
  @ 2021-06-28 15:54  3%       ` Anatoly Burakov
  2021-06-28 15:54  3%       ` [dpdk-dev] [PATCH v4 4/7] power: remove thread safety from PMD power API's Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-06-28 15:54 UTC (permalink / raw)
  To: dev, Timothy McDaniel, Beilei Xing, Jingjing Wu, Qiming Yang,
	Qi Zhang, Haiyue Wang, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Bruce Richardson, Konstantin Ananyev
  Cc: david.hunt, ciara.loftus

Previously, `rte_power_monitor()` compared the current value against an
expected value and aborted the sleep if they matched. This was somewhat
inflexible, because it only allowed checking for one specific value in
one specific way.

This commit replaces the comparison with a user callback mechanism, so
that any PMD (or other code) using `rte_power_monitor()` can define its
own comparison semantics and decide when entering the power optimized
state should be aborted.

Existing implementations are adjusted to follow the new semantics.
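
As a rough illustration of the new semantics, the sketch below restates
the old mask/value comparison as a callback. It is standalone and does
not include any DPDK headers: `monitor_cond`, `monitor_clb_t`,
`MONITOR_OPAQUE_SZ` and `monitor_should_abort()` are hypothetical
stand-ins that only mirror the shape of the definitions in this patch.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins mirroring the patch; these are not the real DPDK types. */
#define MONITOR_OPAQUE_SZ 4

typedef int (*monitor_clb_t)(const uint64_t val,
		const uint64_t opaque[MONITOR_OPAQUE_SZ]);

struct monitor_cond {
	volatile void *addr; /* address to read the value from */
	uint8_t size;        /* 1, 2, 4 or 8 bytes */
	monitor_clb_t fn;    /* decides whether to abort the sleep */
	uint64_t opaque[MONITOR_OPAQUE_SZ]; /* callback-specific data */
};

#define CLB_VAL_IDX 0
#define CLB_MSK_IDX 1

/* The old "masked value equals expected value" check, as a callback. */
static int
masked_equal_clb(const uint64_t val,
		const uint64_t opaque[MONITOR_OPAQUE_SZ])
{
	const uint64_t m = opaque[CLB_MSK_IDX];
	const uint64_t v = opaque[CLB_VAL_IDX];

	return (val & m) == v ? -1 : 0;
}

/*
 * Hypothetical helper covering only the pre-sleep check:
 * returns -EINVAL on bad input, 1 if the sleep would be aborted,
 * 0 if it would proceed.
 */
static int
monitor_should_abort(const struct monitor_cond *pmc)
{
	uint64_t cur;

	if (pmc == NULL || pmc->addr == NULL || pmc->fn == NULL)
		return -EINVAL;

	switch (pmc->size) {
	case sizeof(uint8_t):
		cur = *(const volatile uint8_t *)pmc->addr;
		break;
	case sizeof(uint16_t):
		cur = *(const volatile uint16_t *)pmc->addr;
		break;
	case sizeof(uint32_t):
		cur = *(const volatile uint32_t *)pmc->addr;
		break;
	case sizeof(uint64_t):
		cur = *(const volatile uint64_t *)pmc->addr;
		break;
	default:
		return -EINVAL;
	}

	/* any non-zero return from the callback means "do not sleep" */
	return pmc->fn(cur, pmc->opaque) != 0;
}
```

A caller fills in `addr`, `size`, `fn` and whatever `opaque` slots its
callback reads; the monitor code itself no longer needs to know what a
"match" means.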

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
---

Notes:
    v4:
    - Return error if callback is set to NULL
    - Replace raw number with a macro in monitor condition opaque data
    
    v2:
    - Use callback mechanism for more flexibility
    - Address feedback from Konstantin

 doc/guides/rel_notes/release_21_08.rst        |  1 +
 drivers/event/dlb2/dlb2.c                     | 17 ++++++++--
 drivers/net/i40e/i40e_rxtx.c                  | 20 +++++++----
 drivers/net/iavf/iavf_rxtx.c                  | 20 +++++++----
 drivers/net/ice/ice_rxtx.c                    | 20 +++++++----
 drivers/net/ixgbe/ixgbe_rxtx.c                | 20 +++++++----
 drivers/net/mlx5/mlx5_rx.c                    | 17 ++++++++--
 .../include/generic/rte_power_intrinsics.h    | 33 +++++++++++++++----
 lib/eal/x86/rte_power_intrinsics.c            | 17 +++++-----
 9 files changed, 121 insertions(+), 44 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index a6ecfdf3ce..c84ac280f5 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -84,6 +84,7 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =======================================================
 
+* eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
 ABI Changes
 -----------
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index eca183753f..252bbd8d5e 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -3154,6 +3154,16 @@ dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 	}
 }
 
+#define CLB_MASK_IDX 0
+#define CLB_VAL_IDX 1
+static int
+dlb2_monitor_callback(const uint64_t val,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
+{
+	/* abort if the value matches */
+	return (val & opaque[CLB_MASK_IDX]) == opaque[CLB_VAL_IDX] ? -1 : 0;
+}
+
 static inline int
 dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 		  struct dlb2_eventdev_port *ev_port,
@@ -3194,8 +3204,11 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 			expected_value = 0;
 
 		pmc.addr = monitor_addr;
-		pmc.val = expected_value;
-		pmc.mask = qe_mask.raw_qe[1];
+		/* store expected value and comparison mask in opaque data */
+		pmc.opaque[CLB_VAL_IDX] = expected_value;
+		pmc.opaque[CLB_MASK_IDX] = qe_mask.raw_qe[1];
+		/* set up callback */
+		pmc.fn = dlb2_monitor_callback;
 		pmc.size = sizeof(uint64_t);
 
 		rte_power_monitor(&pmc, timeout + start_ticks);
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 6c58decece..081682f88b 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -81,6 +81,18 @@
 #define I40E_TX_OFFLOAD_SIMPLE_NOTSUP_MASK \
 		(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_SIMPLE_SUP_MASK)
 
+static int
+i40e_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -93,12 +105,8 @@ i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = i40e_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index 0361af0d85..7ed196ec22 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -57,6 +57,18 @@ iavf_proto_xtr_type_to_rxdid(uint8_t flex_type)
 				rxdid_map[flex_type] : IAVF_RXDID_COMMS_OVS_1;
 }
 
+static int
+iavf_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -69,12 +81,8 @@ iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = iavf_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
index fc9bb5a3e7..d12437d19d 100644
--- a/drivers/net/ice/ice_rxtx.c
+++ b/drivers/net/ice/ice_rxtx.c
@@ -27,6 +27,18 @@ uint64_t rte_net_ice_dynflag_proto_xtr_ipv6_flow_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_tcp_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_ip_offset_mask;
 
+static int
+ice_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -39,12 +51,8 @@ ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.status_error0;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
-	pmc->mask = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/* comparison callback */
+	pmc->fn = ice_monitor_callback;
 
 	/* register is 16-bit */
 	pmc->size = sizeof(uint16_t);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index d69f36e977..c814a28cb4 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1369,6 +1369,18 @@ const uint32_t
 		RTE_PTYPE_INNER_L3_IPV4_EXT | RTE_PTYPE_INNER_L4_UDP,
 };
 
+static int
+ixgbe_monitor_callback(const uint64_t value,
+		const uint64_t arg[RTE_POWER_MONITOR_OPAQUE_SZ] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -1381,12 +1393,8 @@ ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.upper.status_error;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
-	pmc->mask = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/* comparison callback */
+	pmc->fn = ixgbe_monitor_callback;
 
 	/* the registers are 32-bit */
 	pmc->size = sizeof(uint32_t);
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 777a1d6e45..17370b77dc 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -269,6 +269,18 @@ mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return rx_queue_count(rxq);
 }
 
+#define CLB_VAL_IDX 0
+#define CLB_MSK_IDX 1
+static int
+mlx_monitor_callback(const uint64_t value,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ])
+{
+	const uint64_t m = opaque[CLB_MSK_IDX];
+	const uint64_t v = opaque[CLB_VAL_IDX];
+
+	return (value & m) == v ? -1 : 0;
+}
+
 int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
 	struct mlx5_rxq_data *rxq = rx_queue;
@@ -282,8 +294,9 @@ int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 		return -rte_errno;
 	}
 	pmc->addr = &cqe->op_own;
-	pmc->val =  !!idx;
-	pmc->mask = MLX5_CQE_OWNER_MASK;
+	pmc->opaque[CLB_VAL_IDX] = !!idx;
+	pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_OWNER_MASK;
+	pmc->fn = mlx_monitor_callback;
 	pmc->size = sizeof(uint8_t);
 	return 0;
 }
diff --git a/lib/eal/include/generic/rte_power_intrinsics.h b/lib/eal/include/generic/rte_power_intrinsics.h
index dddca3d41c..c9aa52a86d 100644
--- a/lib/eal/include/generic/rte_power_intrinsics.h
+++ b/lib/eal/include/generic/rte_power_intrinsics.h
@@ -18,19 +18,38 @@
  * which are architecture-dependent.
  */
 
+/** Size of the opaque data in monitor condition */
+#define RTE_POWER_MONITOR_OPAQUE_SZ 4
+
+/**
+ * Callback definition for monitoring conditions. Callbacks with this signature
+ * will be used by `rte_power_monitor()` to check if the entering of power
+ * optimized state should be aborted.
+ *
+ * @param val
+ *   The value read from memory.
+ * @param opaque
+ *   Callback-specific data.
+ *
+ * @return
+ *   0 if entering of power optimized state should proceed
+ *   -1 if entering of power optimized state should be aborted
+ */
+typedef int (*rte_power_monitor_clb_t)(const uint64_t val,
+		const uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ]);
 struct rte_power_monitor_cond {
 	volatile void *addr;  /**< Address to monitor for changes */
-	uint64_t val;         /**< If the `mask` is non-zero, location pointed
-	                       *   to by `addr` will be read and compared
-	                       *   against this value.
-	                       */
-	uint64_t mask;   /**< 64-bit mask to extract value read from `addr` */
-	uint8_t size;    /**< Data size (in bytes) that will be used to compare
-	                  *   expected value (`val`) with data read from the
+	uint8_t size;    /**< Data size (in bytes) that will be read from the
 	                  *   monitored memory location (`addr`). Can be 1, 2,
 	                  *   4, or 8. Supplying any other value will result in
 	                  *   an error.
 	                  */
+	rte_power_monitor_clb_t fn; /**< Callback to be used to check if
+	                             *   entering power optimized state should
+	                             *   be aborted.
+	                             */
+	uint64_t opaque[RTE_POWER_MONITOR_OPAQUE_SZ];
+	/**< Callback-specific data */
 };
 
 /**
diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 39ea9fdecd..66fea28897 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -76,6 +76,7 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	const uint32_t tsc_h = (uint32_t)(tsc_timestamp >> 32);
 	const unsigned int lcore_id = rte_lcore_id();
 	struct power_wait_status *s;
+	uint64_t cur_value;
 
 	/* prevent user from running this instruction if it's not supported */
 	if (!wait_supported)
@@ -91,6 +92,9 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	if (__check_val_size(pmc->size) < 0)
 		return -EINVAL;
 
+	if (pmc->fn == NULL)
+		return -EINVAL;
+
 	s = &wait_status[lcore_id];
 
 	/* update sleep address */
@@ -110,16 +114,11 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
 
-	/* if we have a comparison mask, we might not need to sleep at all */
-	if (pmc->mask) {
-		const uint64_t cur_value = __get_umwait_val(
-				pmc->addr, pmc->size);
-		const uint64_t masked = cur_value & pmc->mask;
+	cur_value = __get_umwait_val(pmc->addr, pmc->size);
 
-		/* if the masked value is already matching, abort */
-		if (masked == pmc->val)
-			goto end;
-	}
+	/* check if callback indicates we should abort */
+	if (pmc->fn(cur_value, pmc->opaque) != 0)
+		goto end;
 
 	/* execute UMWAIT */
 	asm volatile(".byte 0xf2, 0x0f, 0xae, 0xf7;"
-- 
2.25.1



* [dpdk-dev] [PATCH v3 4/7] power: remove thread safety from PMD power API's
    2021-06-28 12:41  3%     ` [dpdk-dev] [PATCH v3 1/7] power_intrinsics: use callbacks for comparison Anatoly Burakov
@ 2021-06-28 12:41  3%     ` Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-06-28 12:41 UTC (permalink / raw)
  To: dev, David Hunt; +Cc: ciara.loftus

Currently, we expect that only one callback can be active at any given
moment for a particular queue configuration, which is relatively easy
to implement in a thread-safe way. However, we're about to add support
for multiple queues per lcore, which will greatly increase the
possibility of various race conditions.

We could have used something like RCU for this use case, but absent a
pressing need for thread safety we'll go the easy way and just mandate
that the APIs are to be called when all affected ports are stopped, and
document this limitation. This greatly simplifies the
`rte_power_monitor`-related code.
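
The resulting contract can be sketched in isolation as follows. This is
a standalone mock, not the library code: `mock_state`,
`mock_queue_stopped()` and `mock_pmgmt_queue_enable()` are hypothetical
names, and only the mapping of the tri-state check onto -EINVAL and
-EBUSY follows the patch.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ethdev queue state. */
enum mock_queue_state { MOCK_QUEUE_STOPPED = 0, MOCK_QUEUE_STARTED = 1 };

#define MOCK_NB_QUEUES 2
static enum mock_queue_state mock_state[MOCK_NB_QUEUES] = {
	MOCK_QUEUE_STARTED, MOCK_QUEUE_STOPPED
};

/* -1: unknown queue, 1: queue is stopped, 0: queue is running */
static int
mock_queue_stopped(uint16_t queue_id)
{
	if (queue_id >= MOCK_NB_QUEUES)
		return -1;

	return mock_state[queue_id] == MOCK_QUEUE_STOPPED;
}

static int
mock_pmgmt_queue_enable(uint16_t queue_id)
{
	const int ret = mock_queue_stopped(queue_id);

	/* error means invalid queue, 0 means queue wasn't stopped */
	if (ret != 1)
		return ret < 0 ? -EINVAL : -EBUSY;

	/*
	 * Nothing polls a stopped queue, so state can be changed here
	 * without fences or "sleep in progress" flags.
	 */
	return 0;
}
```

In application terms this means stopping the Rx queue, changing the
power management scheme, and only then starting the queue again.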

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Add check for stopped queue
    - Clarified doc message
    - Added release notes

 doc/guides/rel_notes/release_21_08.rst |   5 +
 lib/power/meson.build                  |   3 +
 lib/power/rte_power_pmd_mgmt.c         | 133 ++++++++++---------------
 lib/power/rte_power_pmd_mgmt.h         |   6 ++
 4 files changed, 67 insertions(+), 80 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index 9d1cfac395..f015c509fc 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -88,6 +88,11 @@ API Changes
 
 * eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
+* rte_power: The experimental PMD power management API is no longer considered
+  to be thread safe; all Rx queues affected by the API will now need to be
+  stopped before making any changes to the power management scheme.
+
+
 ABI Changes
 -----------
 
diff --git a/lib/power/meson.build b/lib/power/meson.build
index c1097d32f1..4f6a242364 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -21,4 +21,7 @@ headers = files(
         'rte_power_pmd_mgmt.h',
         'rte_power_guest_channel.h',
 )
+if cc.has_argument('-Wno-cast-qual')
+    cflags += '-Wno-cast-qual'
+endif
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c
index db03cbf420..9b95cf1794 100644
--- a/lib/power/rte_power_pmd_mgmt.c
+++ b/lib/power/rte_power_pmd_mgmt.c
@@ -40,8 +40,6 @@ struct pmd_queue_cfg {
 	/**< Callback mode for this queue */
 	const struct rte_eth_rxtx_callback *cur_cb;
 	/**< Callback instance */
-	volatile bool umwait_in_progress;
-	/**< are we currently sleeping? */
 	uint64_t empty_poll_stats;
 	/**< Number of empty polls */
 } __rte_cache_aligned;
@@ -92,30 +90,11 @@ clb_umwait(uint16_t port_id, uint16_t qidx, struct rte_mbuf **pkts __rte_unused,
 			struct rte_power_monitor_cond pmc;
 			uint16_t ret;
 
-			/*
-			 * we might get a cancellation request while being
-			 * inside the callback, in which case the wakeup
-			 * wouldn't work because it would've arrived too early.
-			 *
-			 * to get around this, we notify the other thread that
-			 * we're sleeping, so that it can spin until we're done.
-			 * unsolicited wakeups are perfectly safe.
-			 */
-			q_conf->umwait_in_progress = true;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-			/* check if we need to cancel sleep */
-			if (q_conf->pwr_mgmt_state == PMD_MGMT_ENABLED) {
-				/* use monitoring condition to sleep */
-				ret = rte_eth_get_monitor_addr(port_id, qidx,
-						&pmc);
-				if (ret == 0)
-					rte_power_monitor(&pmc, UINT64_MAX);
-			}
-			q_conf->umwait_in_progress = false;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
+			/* use monitoring condition to sleep */
+			ret = rte_eth_get_monitor_addr(port_id, qidx,
+					&pmc);
+			if (ret == 0)
+				rte_power_monitor(&pmc, UINT64_MAX);
 		}
 	} else
 		q_conf->empty_poll_stats = 0;
@@ -177,12 +156,24 @@ clb_scale_freq(uint16_t port_id, uint16_t qidx,
 	return nb_rx;
 }
 
+static int
+queue_stopped(const uint16_t port_id, const uint16_t queue_id)
+{
+	struct rte_eth_rxq_info qinfo;
+
+	if (rte_eth_rx_queue_info_get(port_id, queue_id, &qinfo) < 0)
+		return -1;
+
+	return qinfo.queue_state == RTE_ETH_QUEUE_STATE_STOPPED;
+}
+
 int
 rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		uint16_t queue_id, enum rte_power_pmd_mgmt_type mode)
 {
 	struct pmd_queue_cfg *queue_cfg;
 	struct rte_eth_dev_info info;
+	rte_rx_callback_fn clb;
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
@@ -203,6 +194,14 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		goto end;
 	}
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		ret = ret < 0 ? -EINVAL : -EBUSY;
+		goto end;
+	}
+
 	queue_cfg = &port_cfg[port_id][queue_id];
 
 	if (queue_cfg->pwr_mgmt_state != PMD_MGMT_DISABLED) {
@@ -232,17 +231,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->umwait_in_progress = false;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* ensure we update our state before callback starts */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_umwait, NULL);
+		clb = clb_umwait;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_SCALE:
@@ -269,16 +258,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id,
-				queue_id, clb_scale_freq, NULL);
+		clb = clb_scale_freq;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_PAUSE:
@@ -286,18 +266,21 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		if (global_data.tsc_per_us == 0)
 			calc_tsc();
 
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_pause, NULL);
+		clb = clb_pause;
 		break;
+	default:
+		RTE_LOG(DEBUG, POWER, "Invalid power management type\n");
+		ret = -EINVAL;
+		goto end;
 	}
+
+	/* initialize data before enabling the callback */
+	queue_cfg->empty_poll_stats = 0;
+	queue_cfg->cb_mode = mode;
+	queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
+	queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
+			clb, NULL);
+
 	ret = 0;
 end:
 	return ret;
@@ -308,12 +291,20 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		uint16_t port_id, uint16_t queue_id)
 {
 	struct pmd_queue_cfg *queue_cfg;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
 	if (lcore_id >= RTE_MAX_LCORE || queue_id >= RTE_MAX_QUEUES_PER_PORT)
 		return -EINVAL;
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		return ret < 0 ? -EINVAL : -EBUSY;
+	}
+
 	/* no need to check queue id as wrong queue id would not be enabled */
 	queue_cfg = &port_cfg[port_id][queue_id];
 
@@ -323,27 +314,8 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 	/* stop any callbacks from progressing */
 	queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
 
-	/* ensure we update our state before continuing */
-	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
 	switch (queue_cfg->cb_mode) {
-	case RTE_POWER_MGMT_TYPE_MONITOR:
-	{
-		bool exit = false;
-		do {
-			/*
-			 * we may request cancellation while the other thread
-			 * has just entered the callback but hasn't started
-			 * sleeping yet, so keep waking it up until we know it's
-			 * done sleeping.
-			 */
-			if (queue_cfg->umwait_in_progress)
-				rte_power_monitor_wakeup(lcore_id);
-			else
-				exit = true;
-		} while (!exit);
-	}
-	/* fall-through */
+	case RTE_POWER_MGMT_TYPE_MONITOR: /* fall-through */
 	case RTE_POWER_MGMT_TYPE_PAUSE:
 		rte_eth_remove_rx_callback(port_id, queue_id,
 				queue_cfg->cur_cb);
@@ -356,10 +328,11 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		break;
 	}
 	/*
-	 * we don't free the RX callback here because it is unsafe to do so
-	 * unless we know for a fact that all data plane threads have stopped.
+	 * the API doc mandates that the user stops all processing on affected
+	 * ports before calling any of these API's, so we can assume that the
+	 * callbacks can be freed. we're intentionally casting away const-ness.
 	 */
-	queue_cfg->cur_cb = NULL;
+	rte_free((void *)queue_cfg->cur_cb);
 
 	return 0;
 }
diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
index 7a0ac24625..444e7b8a66 100644
--- a/lib/power/rte_power_pmd_mgmt.h
+++ b/lib/power/rte_power_pmd_mgmt.h
@@ -43,6 +43,9 @@ enum rte_power_pmd_mgmt_type {
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue will be polled from.
  * @param port_id
@@ -69,6 +72,9 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id,
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue is polled from.
  * @param port_id
-- 
2.25.1



* [dpdk-dev] [PATCH v3 1/7] power_intrinsics: use callbacks for comparison
  @ 2021-06-28 12:41  3%     ` Anatoly Burakov
  2021-06-28 12:41  3%     ` [dpdk-dev] [PATCH v3 4/7] power: remove thread safety from PMD power API's Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-06-28 12:41 UTC (permalink / raw)
  To: dev, Timothy McDaniel, Beilei Xing, Jingjing Wu, Qiming Yang,
	Qi Zhang, Haiyue Wang, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Bruce Richardson, Konstantin Ananyev
  Cc: david.hunt, ciara.loftus

Previously, `rte_power_monitor()` compared the current value against an
expected value and aborted the sleep if they matched. This was somewhat
inflexible, because it only allowed checking for one specific value.

This commit replaces the comparison with a user callback mechanism, so
that any PMD (or other code) using `rte_power_monitor()` can define its
own comparison semantics and decide when entering the power optimized
state should be aborted.

Existing implementations are adjusted to follow the new semantics.
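
To show what the extra flexibility allows, here are two callbacks that
the old value/mask pair could not express directly. They are purely
illustrative and do not appear in this series: `MONITOR_OPAQUE_SZ`,
`value_changed_clb()` and `any_flag_set_clb()` are hypothetical names,
and only the return convention (0 to proceed, -1 to abort) follows the
patch.

```c
#include <stdint.h>

/* Stand-in for the size of the opaque array used by the patch. */
#define MONITOR_OPAQUE_SZ 4

#define SNAPSHOT_IDX 0
#define FLAGS_IDX 0

/* Abort the sleep if the value differs from a snapshot taken earlier. */
static int
value_changed_clb(const uint64_t val,
		const uint64_t opaque[MONITOR_OPAQUE_SZ])
{
	return val != opaque[SNAPSHOT_IDX] ? -1 : 0;
}

/* Abort the sleep if any of several flag bits is already set. */
static int
any_flag_set_clb(const uint64_t val,
		const uint64_t opaque[MONITOR_OPAQUE_SZ])
{
	return (val & opaque[FLAGS_IDX]) != 0 ? -1 : 0;
}
```

Both fit the same callback signature, so the monitor code needs no
knowledge of which condition a given driver cares about.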

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Use callback mechanism for more flexibility
    - Address feedback from Konstantin

 doc/guides/rel_notes/release_21_08.rst        |  1 +
 drivers/event/dlb2/dlb2.c                     | 16 ++++++++--
 drivers/net/i40e/i40e_rxtx.c                  | 19 ++++++++----
 drivers/net/iavf/iavf_rxtx.c                  | 19 ++++++++----
 drivers/net/ice/ice_rxtx.c                    | 19 ++++++++----
 drivers/net/ixgbe/ixgbe_rxtx.c                | 19 ++++++++----
 drivers/net/mlx5/mlx5_rx.c                    | 16 ++++++++--
 .../include/generic/rte_power_intrinsics.h    | 29 ++++++++++++++-----
 lib/eal/x86/rte_power_intrinsics.c            |  9 ++----
 9 files changed, 106 insertions(+), 41 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index a6ecfdf3ce..c84ac280f5 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -84,6 +84,7 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =======================================================
 
+* eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
 ABI Changes
 -----------
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index eca183753f..14dfac257c 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -3154,6 +3154,15 @@ dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 	}
 }
 
+#define CLB_MASK_IDX 0
+#define CLB_VAL_IDX 1
+static int
+dlb2_monitor_callback(const uint64_t val, const uint64_t opaque[4])
+{
+	/* abort if the value matches */
+	return (val & opaque[CLB_MASK_IDX]) == opaque[CLB_VAL_IDX] ? -1 : 0;
+}
+
 static inline int
 dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 		  struct dlb2_eventdev_port *ev_port,
@@ -3194,8 +3203,11 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 			expected_value = 0;
 
 		pmc.addr = monitor_addr;
-		pmc.val = expected_value;
-		pmc.mask = qe_mask.raw_qe[1];
+		/* store expected value and comparison mask in opaque data */
+		pmc.opaque[CLB_VAL_IDX] = expected_value;
+		pmc.opaque[CLB_MASK_IDX] = qe_mask.raw_qe[1];
+		/* set up callback */
+		pmc.fn = dlb2_monitor_callback;
 		pmc.size = sizeof(uint64_t);
 
 		rte_power_monitor(&pmc, timeout + start_ticks);
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 6c58decece..45f3fbf4ec 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -81,6 +81,17 @@
 #define I40E_TX_OFFLOAD_SIMPLE_NOTSUP_MASK \
 		(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_SIMPLE_SUP_MASK)
 
+static int
+i40e_monitor_callback(const uint64_t value, const uint64_t arg[4] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -93,12 +104,8 @@ i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = i40e_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index 0361af0d85..6e12ecce07 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -57,6 +57,17 @@ iavf_proto_xtr_type_to_rxdid(uint8_t flex_type)
 				rxdid_map[flex_type] : IAVF_RXDID_COMMS_OVS_1;
 }
 
+static int
+iavf_monitor_callback(const uint64_t value, const uint64_t arg[4] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -69,12 +80,8 @@ iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = iavf_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
index fc9bb5a3e7..278eb4b9a1 100644
--- a/drivers/net/ice/ice_rxtx.c
+++ b/drivers/net/ice/ice_rxtx.c
@@ -27,6 +27,17 @@ uint64_t rte_net_ice_dynflag_proto_xtr_ipv6_flow_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_tcp_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_ip_offset_mask;
 
+static int
+ice_monitor_callback(const uint64_t value, const uint64_t arg[4] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -39,12 +50,8 @@ ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.status_error0;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
-	pmc->mask = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/* comparison callback */
+	pmc->fn = ice_monitor_callback;
 
 	/* register is 16-bit */
 	pmc->size = sizeof(uint16_t);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index d69f36e977..0c5045d9dc 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1369,6 +1369,17 @@ const uint32_t
 		RTE_PTYPE_INNER_L3_IPV4_EXT | RTE_PTYPE_INNER_L4_UDP,
 };
 
+static int
+ixgbe_monitor_callback(const uint64_t value, const uint64_t arg[4] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -1381,12 +1392,8 @@ ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.upper.status_error;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
-	pmc->mask = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/* comparison callback */
+	pmc->fn = ixgbe_monitor_callback;
 
 	/* the registers are 32-bit */
 	pmc->size = sizeof(uint32_t);
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 777a1d6e45..57f6ca1467 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -269,6 +269,17 @@ mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return rx_queue_count(rxq);
 }
 
+#define CLB_VAL_IDX 0
+#define CLB_MSK_IDX 1
+static int
+mlx_monitor_callback(const uint64_t value, const uint64_t opaque[4])
+{
+	const uint64_t m = opaque[CLB_MSK_IDX];
+	const uint64_t v = opaque[CLB_VAL_IDX];
+
+	return (value & m) == v ? -1 : 0;
+}
+
 int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
 	struct mlx5_rxq_data *rxq = rx_queue;
@@ -282,8 +293,9 @@ int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 		return -rte_errno;
 	}
 	pmc->addr = &cqe->op_own;
-	pmc->val =  !!idx;
-	pmc->mask = MLX5_CQE_OWNER_MASK;
+	pmc->opaque[CLB_VAL_IDX] = !!idx;
+	pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_OWNER_MASK;
+	pmc->fn = mlx_monitor_callback;
 	pmc->size = sizeof(uint8_t);
 	return 0;
 }
diff --git a/lib/eal/include/generic/rte_power_intrinsics.h b/lib/eal/include/generic/rte_power_intrinsics.h
index dddca3d41c..046667ade6 100644
--- a/lib/eal/include/generic/rte_power_intrinsics.h
+++ b/lib/eal/include/generic/rte_power_intrinsics.h
@@ -18,19 +18,34 @@
  * which are architecture-dependent.
  */
 
+/**
+ * Callback definition for monitoring conditions. Callbacks with this signature
+ * will be used by `rte_power_monitor()` to check if the entering of power
+ * optimized state should be aborted.
+ *
+ * @param val
+ *   The value read from memory.
+ * @param opaque
+ *   Callback-specific data.
+ *
+ * @return
+ *   0 if entering of power optimized state should proceed
+ *   -1 if entering of power optimized state should be aborted
+ */
+typedef int (*rte_power_monitor_clb_t)(const uint64_t val,
+		const uint64_t opaque[4]);
 struct rte_power_monitor_cond {
 	volatile void *addr;  /**< Address to monitor for changes */
-	uint64_t val;         /**< If the `mask` is non-zero, location pointed
-	                       *   to by `addr` will be read and compared
-	                       *   against this value.
-	                       */
-	uint64_t mask;   /**< 64-bit mask to extract value read from `addr` */
-	uint8_t size;    /**< Data size (in bytes) that will be used to compare
-	                  *   expected value (`val`) with data read from the
+	uint8_t size;    /**< Data size (in bytes) that will be read from the
 	                  *   monitored memory location (`addr`). Can be 1, 2,
 	                  *   4, or 8. Supplying any other value will result in
 	                  *   an error.
 	                  */
+	rte_power_monitor_clb_t fn; /**< Callback to be used to check if
+	                             *   entering power optimized state should
+	                             *   be aborted.
+	                             */
+	uint64_t opaque[4]; /**< Callback-specific data */
 };
 
 /**
diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 39ea9fdecd..3c5c9ce7ad 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -110,14 +110,11 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
 
-	/* if we have a comparison mask, we might not need to sleep at all */
-	if (pmc->mask) {
+	/* if we have a callback, we might not need to sleep at all */
+	if (pmc->fn) {
 		const uint64_t cur_value = __get_umwait_val(
 				pmc->addr, pmc->size);
-		const uint64_t masked = cur_value & pmc->mask;
-
-		/* if the masked value is already matching, abort */
-		if (masked == pmc->val)
+		if (pmc->fn(cur_value, pmc->opaque) != 0)
 			goto end;
 	}
 
-- 
2.25.1
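
For readers following the change without a DPDK tree at hand, the pre-sleep check introduced above can be sketched in plain C. This is only an illustration under stated assumptions: every `demo_`/`DEMO_` name is hypothetical, the struct is a stand-in that mirrors the fields added by the patch rather than the real `rte_power_monitor_cond`, and no UMONITOR/UMWAIT is involved. It shows how the old `(value & mask) == val` comparison can be expressed as a callback fed from `opaque[]`, in the same way the dlb2 and mlx5 hunks do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the callback typedef in the patch (hypothetical name). */
typedef int (*demo_monitor_clb_t)(const uint64_t val,
		const uint64_t opaque[4]);

/* Stand-in mirroring the fields of the reworked condition struct. */
struct demo_monitor_cond {
	volatile void *addr;   /* address to monitor for changes */
	uint8_t size;          /* bytes to read: 1, 2, 4 or 8 */
	demo_monitor_clb_t fn; /* returns -1 to abort the sleep, 0 to proceed */
	uint64_t opaque[4];    /* callback-specific data */
};

/* Hypothetical slot assignment inside opaque[], chosen by the driver. */
#define DEMO_CLB_VAL_IDX 0
#define DEMO_CLB_MSK_IDX 1

/* Expresses the former `(value & mask) == val` check as a callback. */
static int
demo_masked_equal_cb(const uint64_t value, const uint64_t opaque[4])
{
	const uint64_t m = opaque[DEMO_CLB_MSK_IDX];
	const uint64_t v = opaque[DEMO_CLB_VAL_IDX];

	return (value & m) == v ? -1 : 0;
}

/* Read `size` bytes from `addr`, zero-extended to 64 bits. */
static uint64_t
demo_read(volatile void *addr, uint8_t size)
{
	switch (size) {
	case sizeof(uint8_t):
		return *(volatile uint8_t *)addr;
	case sizeof(uint16_t):
		return *(volatile uint16_t *)addr;
	case sizeof(uint32_t):
		return *(volatile uint32_t *)addr;
	case sizeof(uint64_t):
		return *(volatile uint64_t *)addr;
	default:
		assert(0 && "size must be 1, 2, 4 or 8");
		return 0;
	}
}

/*
 * Mimics only the check done before sleeping: returns 1 if the caller
 * would go on to enter the power-optimized state, 0 if the callback
 * asked to abort. With no callback set, the check is skipped.
 */
static int
demo_would_sleep(const struct demo_monitor_cond *pmc)
{
	if (pmc->fn != NULL) {
		const uint64_t cur = demo_read(pmc->addr, pmc->size);

		if (pmc->fn(cur, pmc->opaque) != 0)
			return 0;
	}
	return 1;
}
```

A caller would point `addr` at the monitored word, store the expected value and mask in `opaque[]`, set `fn`, and then consult `demo_would_sleep()`. Keeping the comparison in a callback lets each driver hard-code its own condition (as the i40e, iavf, ice and ixgbe hunks do) or parameterize it through `opaque[]`.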



* Re: [dpdk-dev] Experimental symbols in kni lib
  2021-06-25 13:26  0%     ` Igor Ryzhov
@ 2021-06-28 12:23  0%       ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2021-06-28 12:23 UTC (permalink / raw)
  To: Igor Ryzhov; +Cc: Kinsella, Ray, Thomas Monjalon, Stephen Hemminger, dpdk-dev

On 6/25/2021 2:26 PM, Igor Ryzhov wrote:
> Hi Ferruh, all,
> 
> Let's please discuss another approach to setting KNI link status before
> making this API stable:
> http://patches.dpdk.org/project/dpdk/patch/20190925093623.18419-1-iryzhov@nfware.com/
> 
> I explained the problem with the current implementation there.
> More than that, using the ioctl approach also makes it possible to set speed
> and duplex and use them to implement the get_link_ksettings callback.
> I can send patches for both features.
> 

Hi Igor, I agree that we should discuss your patch before promoting the API.
I will comment on the outstanding patch.

> Igor
> 
> On Thu, Jun 24, 2021 at 4:54 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
> 
>> Sounds more than reasonable, +1 from me.
>>
>> Ray K
>>
>> On 24/06/2021 14:24, Ferruh Yigit wrote:
>>> On 6/24/2021 11:42 AM, Kinsella, Ray wrote:
>>>> Hi Ferruh,
>>>>
>>>> The following kni experimental symbols are present in both the v21.05 and
>>>> v19.11 releases. These symbols should be considered for promotion to stable
>>>> as part of the v22 ABI in DPDK 21.11, as they have been experimental for
>>>> at least 2 years at this point.
>>>>
>>>>  * rte_kni_update_link
>>>>
>>>> Ray K
>>>>
>>>
>>> Hi Ray,
>>>
>>> Thanks for follow up.
>>>
>>> I just checked the API and am planning a small behavior update to it.
>>> If the update is accepted, I suggest keeping the API experimental for
>>> 21.08 too, but it can mature in v21.11.
>>>
>>> Thanks,
>>> ferruh
>>>
>>



* [dpdk-dev] 20.11.2 patches review and test
@ 2021-06-26 23:28  1% Xueming Li
  2021-06-30 10:33  0% ` Jiang, YuX
  2021-07-06  3:26  0% ` [dpdk-dev] [dpdk-stable] " Kalesh Anakkur Purayil
  0 siblings, 2 replies; 200+ results
From: Xueming Li @ 2021-06-26 23:28 UTC (permalink / raw)
  To: stable
  Cc: dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani,
	benjamin.walker, David Christensen, hariprasad.govindharajan,
	Hemant Agrawal, Ian Stokes, Jerin Jacob, John McNamara,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, pingx.yu,
	qian.q.xu, Raslan Darawsheh, Thomas Monjalon, yuan.peng,
	zhaoyan.chen, xuemingl

Hi all,

Here is a list of patches targeted for stable release 20.11.2.

The planned date for the final release is 6th July.

Please help with testing and validation of your use cases and report
any issues/results with reply-all to this mail. For the final release
the fixes and reported validations will be added to the release notes.

A release candidate tarball can be found at:

    https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc2

These patches are located at branch 20.11 of dpdk-stable repo:
    https://dpdk.org/browse/dpdk-stable/

Thanks.

Xueming Li <xuemingl@nvidia.com>

---
Adam Dybkowski (3):
      common/qat: increase IM buffer size for GEN3
      compress/qat: enable compression on GEN3
      crypto/qat: fix null authentication request

Ajit Khaparde (7):
      net/bnxt: fix RSS context cleanup
      net/bnxt: check kvargs parsing
      net/bnxt: fix resource cleanup
      doc: fix formatting in testpmd guide
      net/bnxt: fix mismatched type comparison in MAC restore
      net/bnxt: check PCI config read
      net/bnxt: fix mismatched type comparison in Rx

Alvin Zhang (11):
      net/ice: fix VLAN filter with PF
      net/i40e: fix input set field mask
      net/igc: fix Rx RSS hash offload capability
      net/igc: fix Rx error counter for bad length
      net/e1000: fix Rx error counter for bad length
      net/e1000: fix max Rx packet size
      net/igc: fix Rx packet size
      net/ice: fix fast mbuf freeing
      net/iavf: fix VF to PF command failure handling
      net/i40e: fix VF RSS configuration
      net/igc: fix speed configuration

Anatoly Burakov (3):
      fbarray: fix log message on truncation error
      power: do not skip saving original P-state governor
      power: save original ACPI governor always

Andrew Boyer (1):
      net/ionic: fix completion type in lif init

Andrew Rybchenko (4):
      net/failsafe: fix RSS hash offload reporting
      net/failsafe: report minimum and maximum MTU
      common/sfc_efx: remove GENEVE from supported tunnels
      net/sfc: fix mark support in EF100 native Rx datapath

Andy Moreton (2):
      common/sfc_efx/base: limit reported MCDI response length
      common/sfc_efx/base: add missing MCDI response length checks

Ankur Dwivedi (1):
      crypto/octeontx: fix session-less mode

Apeksha Gupta (1):
      examples/l2fwd-crypto: skip masked devices

Arek Kusztal (1):
      crypto/qat: fix offset for out-of-place scatter-gather

Beilei Xing (1):
      net/i40evf: fix packet loss for X722

Bing Zhao (1):
      net/mlx5: fix loopback for Direct Verbs queue

Bruce Richardson (2):
      build: exclude meson files from examples installation
      raw/ioat: fix script for configuring small number of queues

Chaoyong He (1):
      doc: fix multiport syntax in nfp guide

Chenbo Xia (1):
      examples/vhost: check memory table query

Chengchang Tang (20):
      net/hns3: fix HW buffer size on MTU update
      net/hns3: fix processing Tx offload flags
      net/hns3: fix Tx checksum for UDP packets with special port
      net/hns3: fix long task queue pairs reset time
      ethdev: validate input in module EEPROM dump
      ethdev: validate input in register info
      ethdev: validate input in EEPROM info
      net/hns3: fix rollback after setting PVID failure
      net/hns3: fix timing in resetting queues
      net/hns3: fix queue state when concurrent with reset
      net/hns3: fix configure FEC when concurrent with reset
      net/hns3: fix use of command status enumeration
      examples: add eal cleanup to examples
      net/bonding: fix adding itself as its slave
      net/hns3: fix timing in mailbox
      app/testpmd: fix max queue number for Tx offloads
      net/tap: fix interrupt vector array size
      net/bonding: fix socket ID check
      net/tap: check ioctl on restore
      examples/timer: fix time interval

Chengwen Feng (50):
      net/hns3: fix flow counter value
      net/hns3: fix VF mailbox head field
      net/hns3: support get device version when dump register
      net/hns3: fix some packet types
      net/hns3: fix missing outer L4 UDP flag for VXLAN
      net/hns3: remove VLAN/QinQ ptypes from support list
      test: check thread creation
      common/dpaax: fix possible null pointer access
      examples/ethtool: remove unused parsing
      net/hns3: fix flow director lock
      net/e1000/base: fix timeout for shadow RAM write
      net/hns3: fix setting default MAC address in bonding of VF
      net/hns3: fix possible mismatched response of mailbox
      net/hns3: fix VF handling LSC event in secondary process
      net/hns3: fix verification of NEON support
      mbuf: check shared memory before dumping dynamic space
      eventdev: remove redundant thread name setting
      eventdev: fix memory leakage on thread creation failure
      net/kni: check init result
      net/hns3: fix mailbox error message
      net/hns3: fix processing link status message on PF
      net/hns3: remove unused mailbox macro and struct
      net/bonding: fix leak on remove
      net/hns3: fix handling link update
      net/i40e: fix negative VEB index
      net/i40e: remove redundant VSI check in Tx queue setup
      net/virtio: fix getline memory leakage
      net/hns3: log time delta in decimal format
      net/hns3: fix time delta calculation
      net/hns3: remove unused macros
      net/hns3: fix vector Rx burst limitation
      net/hns3: remove read when enabling TM QCN error event
      net/hns3: remove unused VMDq code
      net/hns3: increase readability in logs
      raw/ntb: check SPAD user index
      raw/ntb: check memory allocations
      ipc: check malloc sync reply result
      eal: fix service core list parsing
      ipc: use monotonic clock
      net/hns3: return error on PCI config write failure
      net/hns3: fix log on flow director clear
      net/hns3: clear hash map on flow director clear
      net/hns3: fix querying flow director counter for out param
      net/hns3: fix TM QCN error event report by MSI-X
      net/hns3: fix mailbox message ID in log
      net/hns3: fix secondary process request start/stop Rx/Tx
      net/hns3: fix ordering in secondary process initialization
      net/hns3: fail setting FEC if one bit mode is not supported
      net/mlx4: fix secondary process initialization ordering
      net/mlx5: fix secondary process initialization ordering

Ciara Loftus (1):
      net/af_xdp: fix error handling during Rx queue setup

Ciara Power (2):
      telemetry: fix race on callbacks list
      test/crypto: fix return value of a skipped test

Conor Walsh (1):
      examples/l3fwd: fix LPM IPv6 subnets

Cristian Dumitrescu (3):
      table: fix actions with different data size
      pipeline: fix instruction translation
      pipeline: fix endianness conversions

Dapeng Yu (3):
      net/igc: remove MTU setting limitation
      net/e1000: remove MTU setting limitation
      examples/packet_ordering: fix port configuration

David Christensen (1):
      config/ppc: reduce number of cores and NUMA nodes

David Harton (1):
      net/ena: fix releasing Tx ring mbufs

David Hunt (4):
      test/power: fix CPU frequency check
      test/power: add turbo mode to frequency check
      test/power: fix low frequency test when turbo enabled
      test/power: fix turbo test

David Marchand (18):
      doc: fix sphinx rtd theme import in GHA
      service: clean references to removed symbol
      eal: fix evaluation of log level option
      ci: hook to GitHub Actions
      ci: enable v21 ABI checks
      ci: fix package installation in GitHub Actions
      ci: ignore APT update failure in GitHub Actions
      ci: catch coredumps
      vhost: fix offload flags in Rx path
      bus/fslmc: remove unused debug macro
      eal: fix leak in shared lib mode detection
      event/dpaa2: remove unused macros
      net/ice/base: fix memory allocation wrapper
      net/ice: fix leak on thread termination
      devtools: fix orphan symbols check with busybox
      net/vhost: restore pseudo TSO support
      net/ark: fix leak on thread termination
      build: fix drivers selection without Python

Dekel Peled (1):
      common/mlx5: fix DevX read output buffer size

Dmitry Kozlyuk (4):
      net/pcap: fix format string
      eal/windows: add missing SPDX license tag
      buildtools: fix all drivers disabled on Windows
      examples/rxtx_callbacks: fix port ID format specifier

Ed Czeck (2):
      net/ark: update packet director initial state
      net/ark: refactor Rx buffer recovery

Elad Nachman (2):
      kni: support async user request
      kni: fix kernel deadlock with bifurcated device

Feifei Wang (2):
      net/i40e: fix parsing packet type for NEON
      test/trace: fix race on collected perf data

Ferruh Yigit (9):
      power: remove duplicated symbols from map file
      log/linux: make default output stderr
      license: fix typos
      drivers/net: fix FW version query
      net/bnx2x: fix build with GCC 11
      net/bnx2x: fix build with GCC 11
      net/ice/base: fix build with GCC 11
      net/tap: fix build with GCC 11
      test/table: fix build with GCC 11

Gregory Etelson (2):
      app/testpmd: fix tunnel offload flows cleanup
      net/mlx5: fix tunnel offload private items location

Guoyang Zhou (1):
      net/hinic: fix crash in secondary process

Haiyue Wang (1):
      net/ixgbe: fix Rx errors statistics for UDP checksum

Harman Kalra (1):
      event/octeontx2: fix device reconfigure for single slot

Heinrich Kuhn (1):
      net/nfp: fix reporting of RSS capabilities

Hemant Agrawal (3):
      ethdev: add missing buses in device iterator
      crypto/dpaa_sec: affine the thread portal affinity
      crypto/dpaa2_sec: fix close and uninit functions

Hongbo Zheng (9):
      app/testpmd: fix Tx/Rx descriptor query error log
      net/hns3: fix FLR miss detection
      net/hns3: delete redundant blank line
      bpf: fix JSLT validation
      common/sfc_efx/base: fix dereferencing null pointer
      power: fix sanity checks for guest channel read
      net/hns3: fix VF alive notification after config restore
      examples/l3fwd-power: fix empty poll thresholds
      net/hns3: fix concurrent interrupt handling

Huisong Li (23):
      net/hns3: fix device capabilities for copper media type
      net/hns3: remove unused parameter markers
      net/hns3: fix reporting undefined speed
      net/hns3: fix link update when failed to get link info
      net/hns3: fix flow control exception
      app/testpmd: fix bitmap of link speeds when force speed
      net/hns3: fix flow control mode
      net/hns3: remove redundant mailbox response
      net/hns3: fix DCB mode check
      net/hns3: fix VMDq mode check
      net/hns3: fix mbuf leakage
      net/hns3: fix link status when port is stopped
      net/hns3: fix link speed when port is down
      app/testpmd: fix forward lcores number for DCB
      app/testpmd: fix DCB forwarding configuration
      app/testpmd: fix DCB re-configuration
      app/testpmd: verify DCB config during forward config
      net/hns3: fix Rx/Tx queue numbers check
      net/hns3: fix requested FC mode rollback
      net/hns3: remove meaningless packet buffer rollback
      net/hns3: fix DCB configuration
      net/hns3: fix DCB reconfiguration
      net/hns3: fix link speed when VF device is down

Ibtisam Tariq (1):
      examples/vhost_crypto: remove unused short option

Igor Chauskin (2):
      net/ena: switch memcpy to optimized version
      net/ena: fix parsing of large LLQ header device argument

Igor Russkikh (2):
      net/qede: reduce log verbosity
      net/qede: accept bigger RSS table

Ilya Maximets (1):
      net/virtio: fix interrupt unregistering for listening socket

Ivan Malov (5):
      net/sfc: fix buffer size for flow parse
      net: fix comment in IPv6 header
      net/sfc: fix error path inconsistency
      common/sfc_efx/base: fix indication of MAE encap support
      net/sfc: fix outer rule rollback on error

Jerin Jacob (1):
      examples: fix pkg-config override

Jiawei Wang (4):
      app/testpmd: fix NVGRE encap configuration
      net/mlx5: fix resource release for mirror flow
      net/mlx5: fix RSS flow item expansion for GRE key
      net/mlx5: fix RSS flow item expansion for NVGRE

Jiawei Zhu (1):
      net/mlx5: fix Rx segmented packets on mbuf starvation

Jiawen Wu (4):
      net/txgbe: remove unused functions
      net/txgbe: fix Rx missed packet counter
      net/txgbe: update packet type
      net/txgbe: fix QinQ strip

Jiayu Hu (2):
      vhost: fix queue initialization
      vhost: fix redundant vring status change notification

Jie Wang (1):
      net/ice: fix VSI array out of bounds access

John Daley (2):
      net/enic: fix flow initialization error handling
      net/enic: enable GENEVE offload via VNIC configuration

Juraj Linkeš (1):
      eal/arm64: fix platform register bit

Kai Ji (2):
      test/crypto: fix auth-cipher compare length in OOP
      test/crypto: copy offset data to OOP destination buffer

Kalesh AP (23):
      net/bnxt: remove unused macro
      net/bnxt: fix VNIC configuration
      net/bnxt: fix firmware fatal error handling
      net/bnxt: fix FW readiness check during recovery
      net/bnxt: fix device readiness check
      net/bnxt: fix VF info allocation
      net/bnxt: fix HWRM and FW incompatibility handling
      net/bnxt: mute some failure logs
      app/testpmd: check MAC address query
      net/bnxt: fix PCI write check
      net/bnxt: fix link state operations
      net/bnxt: fix timesync when PTP is not supported
      net/bnxt: fix memory allocation for command response
      net/bnxt: fix double free in port start failure
      net/bnxt: fix configuring LRO
      net/bnxt: fix health check alarm cancellation
      net/bnxt: fix PTP support for Thor
      net/bnxt: fix ring count calculation for Thor
      net/bnxt: remove unnecessary forward declarations
      net/bnxt: remove unused function parameters
      net/bnxt: drop unused attribute
      net/bnxt: fix single PF per port check
      net/bnxt: prevent device access in error state

Kamil Vojanec (1):
      net/mlx5/linux: fix firmware version

Kevin Traynor (5):
      test/cmdline: fix inputs array
      test/crypto: fix build with GCC 11
      crypto/zuc: fix build with GCC 11
      test: fix build with GCC 11
      test/cmdline: silence clang 12 warning

Konstantin Ananyev (1):
      acl: fix build with GCC 11

Lance Richardson (8):
      net/bnxt: fix Rx buffer posting
      net/bnxt: fix Tx length hint threshold
      net/bnxt: fix handling of null flow mask
      test: fix TCP header initialization
      net/bnxt: fix Rx descriptor status
      net/bnxt: fix Rx queue count
      net/bnxt: fix dynamic VNIC count
      eal: fix memory mapping on 32-bit target

Leyi Rong (1):
      net/iavf: fix packet length parsing in AVX512

Li Zhang (1):
      net/mlx5: fix flow actions index in cache

Luc Pelletier (2):
      eal: fix race in control thread creation
      eal: fix hang in control thread creation

Marvin Liu (5):
      vhost: fix split ring potential buffer overflow
      vhost: fix packed ring potential buffer overflow
      vhost: fix batch dequeue potential buffer overflow
      vhost: fix initialization of temporary header
      vhost: fix initialization of async temporary header

Matan Azrad (5):
      common/mlx5/linux: add glue function to query WQ
      common/mlx5: add DevX command to query WQ
      common/mlx5: add DevX commands for queue counters
      vdpa/mlx5: fix virtq cleaning
      vdpa/mlx5: fix device unplug

Michael Baum (1):
      net/mlx5: fix flow age event triggering

Michal Krawczyk (5):
      net/ena/base: improve style and comments
      net/ena/base: fix type conversions by explicit casting
      net/ena/base: destroy multiple wait events
      net/ena: fix crash with unsupported device argument
      net/ena: indicate Rx RSS hash presence

Min Hu (Connor) (25):
      net/hns3: fix MTU config complexity
      net/hns3: update HiSilicon copyright syntax
      net/hns3: fix copyright date
      examples/ptpclient: remove wrong comment
      test/bpf: fix error message
      doc: fix HiSilicon copyright syntax
      net/hns3: remove unused macros
      net/hns3: remove unused macro
      app/eventdev: fix overflow in lcore list parsing
      test/kni: fix a comment
      test/kni: check init result
      net/hns3: fix typos on comments
      net/e1000: fix flow error message object
      app/testpmd: fix division by zero on socket memory dump
      net/kni: warn on stop failure
      app/bbdev: check memory allocation
      app/bbdev: fix HARQ error messages
      raw/skeleton: add missing check after setting attribute
      test/timer: check memzone allocation
      app/crypto-perf: check memory allocation
      examples/flow_classify: fix NUMA check of port and core
      examples/l2fwd-cat: fix NUMA check of port and core
      examples/skeleton: fix NUMA check of port and core
      test: check flow classifier creation
      test: fix division by zero

Murphy Yang (3):
      net/ixgbe: fix RSS RETA being reset after port start
      net/i40e: fix flow director config after flow validate
      net/i40e: fix flow director for common pctypes

Natanael Copa (5):
      common/dpaax/caamflib: fix build with musl
      bus/dpaa: fix 64-bit arch detection
      bus/dpaa: fix build with musl
      net/cxgbe: remove use of uint type
      app/testpmd: fix build with musl

Nipun Gupta (1):
      bus/dpaa: fix statistics reading

Nithin Dabilpuram (3):
      vfio: do not merge contiguous areas
      vfio: fix DMA mapping granularity for IOVA as VA
      test/mem: fix page size for external memory

Olivier Matz (1):
      test/mempool: fix object initializer

Pallavi Kadam (1):
      bus/pci: skip probing some Windows NDIS devices

Pavan Nikhilesh (4):
      test/event: fix timeout accuracy
      app/eventdev: fix timeout accuracy
      app/eventdev: fix lcore parsing skipping last core
      event/octeontx2: fix XAQ pool reconfigure

Pu Xu (1):
      ip_frag: fix fragmenting IPv4 packet with header option

Qi Zhang (8):
      net/ice/base: fix payload indicator on ptype
      net/ice/base: fix uninitialized struct
      net/ice/base: cleanup filter list on error
      net/ice/base: fix memory allocation for MAC addresses
      net/iavf: fix TSO max segment size
      doc: fix matching versions in ice guide
      net/iavf: fix wrong Tx context descriptor
      common/iavf: fix duplicated offload bit

Radha Mohan Chintakuntla (1):
      raw/octeontx2_dma: assign PCI device in DPI VF

Raslan Darawsheh (1):
      ethdev: update flow item GTP QFI definition

Richael Zhuang (2):
      test/power: add delay before checking CPU frequency
      test/power: round CPU frequency to check

Robin Zhang (6):
      net/i40e: announce request queue capability in PF
      doc: update recommended versions for i40e
      net/i40e: fix lack of MAC type when set MAC address
      net/iavf: fix lack of MAC type when set MAC address
      net/iavf: fix primary MAC type when starting port
      net/i40e: fix primary MAC type when starting port

Rohit Raj (3):
      net/dpaa2: fix getting link status
      net/dpaa: fix getting link status
      examples/l2fwd-crypto: fix packet length while decryption

Roy Shterman (1):
      mem: fix freeing segments in --huge-unlink mode

Satheesh Paul (1):
      net/octeontx2: fix VLAN filter

Savinay Dharmappa (1):
      sched: fix traffic class oversubscription parameter

Shijith Thotton (3):
      eventdev: fix case to initiate crypto adapter service
      event/octeontx2: fix crypto adapter queue pair operations
      event/octeontx2: configure crypto adapter xaq pool

Siwar Zitouni (1):
      net/ice: fix disabling promiscuous mode

Somnath Kotur (5):
      net/bnxt: fix xstats get
      net/bnxt: fix Rx and Tx timestamps
      net/bnxt: fix Tx timestamp init
      net/bnxt: refactor multi-queue Rx configuration
      net/bnxt: fix Rx timestamp when FIFO pending bit is set

Stanislaw Kardach (6):
      test: proceed if timer subsystem already initialized
      stack: allow lock-free only on relevant architectures
      test/distributor: fix worker notification in burst mode
      test/distributor: fix burst flush on worker quit
      net/ena: remove endian swap functions
      net/ena: report default ring size

Stephen Hemminger (2):
      kni: refactor user request processing
      net/bnxt: use prefix on global function

Suanming Mou (1):
      net/mlx5: fix counter offset detection

Tal Shnaiderman (2):
      eal/windows: fix default thread priority
      eal/windows: fix return codes of pthread shim layer

Tengfei Zhang (1):
      net/pcap: fix file descriptor leak on close

Thinh Tran (1):
      test: fix autotest handling of skipped tests

Thomas Monjalon (18):
      bus/pci: fix Windows kernel driver categories
      eal: fix comment of OS-specific header files
      buildtools: fix build with busybox
      build: detect execinfo library on Linux
      build: remove redundant _GNU_SOURCE definitions
      eal: fix build with musl
      net/igc: remove use of uint type
      event/dlb: fix header includes for musl
      examples/bbdev: fix header include for musl
      drivers: fix log level after loading
      app/regex: fix usage text
      app/testpmd: fix usage text
      doc: fix names of UIO drivers
      doc: fix build with Sphinx 4
      bus/pci: support I/O port operations with musl
      app: fix exit messages
      regex/octeontx2: remove unused include directory
      doc: remove PDF requirements

Tianyu Li (1):
      net/memif: fix Tx bps statistics for zero-copy

Timothy McDaniel (2):
      event/dlb2: remove references to deferred scheduling
      doc: fix runtime options in DLB2 guide

Tyler Retzlaff (1):
      eal: add C++ include guard for reciprocal header

Vadim Podovinnikov (1):
      net/bonding: fix LACP system address check

Venkat Duvvuru (1):
      net/bnxt: fix queues per VNIC

Viacheslav Ovsiienko (16):
      net/mlx5: fix external buffer pool registration for Rx queue
      net/mlx5: fix metadata item validation for ingress flows
      net/mlx5: fix hashed list size for tunnel flow groups
      net/mlx5: fix UAR allocation diagnostics messages
      common/mlx5: add timestamp format support to DevX
      vdpa/mlx5: support timestamp format
      net/mlx5: fix Rx metadata leftovers
      net/mlx5: fix drop action for Direct Rules/Verbs
      net/mlx4: fix RSS action with null hash key
      net/mlx5: support timestamp format
      regex/mlx5: support timestamp format
      app/testpmd: fix segment number check
      net/mlx5: remove drop queue function prototypes
      net/mlx4: fix buffer leakage on device close
      net/mlx5: fix probing device in legacy bonding mode
      net/mlx5: fix receiving queue timestamp format

Wei Huang (1):
      raw/ifpga: fix device name format

Wenjun Wu (3):
      net/ice: check some functions return
      net/ice: fix RSS hash update
      net/ice: fix RSS for L2 packet

Wenwu Ma (1):
      net/ice: fix illegal access when removing MAC filter

Wenzhuo Lu (2):
      net/iavf: fix crash in AVX512
      net/ice: fix crash in AVX512

Wisam Jaddo (1):
      app/flow-perf: fix encap/decap actions

Xiao Wang (1):
      vdpa/ifc: check PCI config read

Xiaoyu Min (4):
      net/mlx5: support RSS expansion for IPv6 GRE
      net/mlx5: fix shared inner RSS
      net/mlx5: fix missing shared RSS hash types
      net/mlx5: fix redundant flow after RSS expansion

Xiaoyun Li (2):
      app/testpmd: remove unnecessary UDP tunnel check
      net/i40e: fix IPv4 fragment offload

Xueming Li (2):
      version: 20.11.2-rc1
      net/virtio: fix vectorized Rx queue rearm

Youri Querry (1):
      bus/fslmc: fix random portal hangs with qbman 5.0

Yunjian Wang (5):
      vfio: fix API description
      net/mlx5: fix using flow tunnel before null check
      vfio: fix duplicated user mem map
      net/mlx4: fix leak when configured repeatedly
      net/mlx5: fix leak when configured repeatedly


* [dpdk-dev] 20.11.2 patches review and test
@ 2021-06-26 23:08  1% Xueming Li
From: Xueming Li @ 2021-06-26 23:08 UTC (permalink / raw)
  To: stable
  Cc: dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani,
	benjamin.walker, David Christensen, hariprasad.govindharajan,
	Hemant Agrawal, Ian Stokes, Jerin Jacob, John McNamara,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, pingx.yu,
	qian.q.xu, Raslan Darawsheh, Thomas Monjalon, yuan.peng,
	zhaoyan.chen, xuemingl

Hi all,

Here is a list of patches targeted for stable release 20.11.2.

The planned date for the final release is 6th July.

Please help with testing and validation of your use cases, and report
any issues or results by replying to all on this mail. For the final
release, the fixes and reported validations will be added to the
release notes.

A release candidate tarball can be found at:

    https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc2

These patches are located at branch 20.11 of dpdk-stable repo:
    https://dpdk.org/browse/dpdk-stable/

Thanks.

Xueming Li <xuemingl@nvidia.com>

---
Adam Dybkowski (3):
      common/qat: increase IM buffer size for GEN3
      compress/qat: enable compression on GEN3
      crypto/qat: fix null authentication request

Ajit Khaparde (7):
      net/bnxt: fix RSS context cleanup
      net/bnxt: check kvargs parsing
      net/bnxt: fix resource cleanup
      doc: fix formatting in testpmd guide
      net/bnxt: fix mismatched type comparison in MAC restore
      net/bnxt: check PCI config read
      net/bnxt: fix mismatched type comparison in Rx

Alvin Zhang (11):
      net/ice: fix VLAN filter with PF
      net/i40e: fix input set field mask
      net/igc: fix Rx RSS hash offload capability
      net/igc: fix Rx error counter for bad length
      net/e1000: fix Rx error counter for bad length
      net/e1000: fix max Rx packet size
      net/igc: fix Rx packet size
      net/ice: fix fast mbuf freeing
      net/iavf: fix VF to PF command failure handling
      net/i40e: fix VF RSS configuration
      net/igc: fix speed configuration

Anatoly Burakov (3):
      fbarray: fix log message on truncation error
      power: do not skip saving original P-state governor
      power: save original ACPI governor always

Andrew Boyer (1):
      net/ionic: fix completion type in lif init

Andrew Rybchenko (4):
      net/failsafe: fix RSS hash offload reporting
      net/failsafe: report minimum and maximum MTU
      common/sfc_efx: remove GENEVE from supported tunnels
      net/sfc: fix mark support in EF100 native Rx datapath

Andy Moreton (2):
      common/sfc_efx/base: limit reported MCDI response length
      common/sfc_efx/base: add missing MCDI response length checks

Ankur Dwivedi (1):
      crypto/octeontx: fix session-less mode

Apeksha Gupta (1):
      examples/l2fwd-crypto: skip masked devices

Arek Kusztal (1):
      crypto/qat: fix offset for out-of-place scatter-gather

Beilei Xing (1):
      net/i40evf: fix packet loss for X722

Bing Zhao (1):
      net/mlx5: fix loopback for Direct Verbs queue

Bruce Richardson (2):
      build: exclude meson files from examples installation
      raw/ioat: fix script for configuring small number of queues

Chaoyong He (1):
      doc: fix multiport syntax in nfp guide

Chenbo Xia (1):
      examples/vhost: check memory table query

Chengchang Tang (20):
      net/hns3: fix HW buffer size on MTU update
      net/hns3: fix processing Tx offload flags
      net/hns3: fix Tx checksum for UDP packets with special port
      net/hns3: fix long task queue pairs reset time
      ethdev: validate input in module EEPROM dump
      ethdev: validate input in register info
      ethdev: validate input in EEPROM info
      net/hns3: fix rollback after setting PVID failure
      net/hns3: fix timing in resetting queues
      net/hns3: fix queue state when concurrent with reset
      net/hns3: fix configure FEC when concurrent with reset
      net/hns3: fix use of command status enumeration
      examples: add eal cleanup to examples
      net/bonding: fix adding itself as its slave
      net/hns3: fix timing in mailbox
      app/testpmd: fix max queue number for Tx offloads
      net/tap: fix interrupt vector array size
      net/bonding: fix socket ID check
      net/tap: check ioctl on restore
      examples/timer: fix time interval

Chengwen Feng (50):
      net/hns3: fix flow counter value
      net/hns3: fix VF mailbox head field
      net/hns3: support get device version when dump register
      net/hns3: fix some packet types
      net/hns3: fix missing outer L4 UDP flag for VXLAN
      net/hns3: remove VLAN/QinQ ptypes from support list
      test: check thread creation
      common/dpaax: fix possible null pointer access
      examples/ethtool: remove unused parsing
      net/hns3: fix flow director lock
      net/e1000/base: fix timeout for shadow RAM write
      net/hns3: fix setting default MAC address in bonding of VF
      net/hns3: fix possible mismatched response of mailbox
      net/hns3: fix VF handling LSC event in secondary process
      net/hns3: fix verification of NEON support
      mbuf: check shared memory before dumping dynamic space
      eventdev: remove redundant thread name setting
      eventdev: fix memory leakage on thread creation failure
      net/kni: check init result
      net/hns3: fix mailbox error message
      net/hns3: fix processing link status message on PF
      net/hns3: remove unused mailbox macro and struct
      net/bonding: fix leak on remove
      net/hns3: fix handling link update
      net/i40e: fix negative VEB index
      net/i40e: remove redundant VSI check in Tx queue setup
      net/virtio: fix getline memory leakage
      net/hns3: log time delta in decimal format
      net/hns3: fix time delta calculation
      net/hns3: remove unused macros
      net/hns3: fix vector Rx burst limitation
      net/hns3: remove read when enabling TM QCN error event
      net/hns3: remove unused VMDq code
      net/hns3: increase readability in logs
      raw/ntb: check SPAD user index
      raw/ntb: check memory allocations
      ipc: check malloc sync reply result
      eal: fix service core list parsing
      ipc: use monotonic clock
      net/hns3: return error on PCI config write failure
      net/hns3: fix log on flow director clear
      net/hns3: clear hash map on flow director clear
      net/hns3: fix querying flow director counter for out param
      net/hns3: fix TM QCN error event report by MSI-X
      net/hns3: fix mailbox message ID in log
      net/hns3: fix secondary process request start/stop Rx/Tx
      net/hns3: fix ordering in secondary process initialization
      net/hns3: fail setting FEC if one bit mode is not supported
      net/mlx4: fix secondary process initialization ordering
      net/mlx5: fix secondary process initialization ordering

Ciara Loftus (1):
      net/af_xdp: fix error handling during Rx queue setup

Ciara Power (2):
      telemetry: fix race on callbacks list
      test/crypto: fix return value of a skipped test

Conor Walsh (1):
      examples/l3fwd: fix LPM IPv6 subnets

Cristian Dumitrescu (3):
      table: fix actions with different data size
      pipeline: fix instruction translation
      pipeline: fix endianness conversions

Dapeng Yu (3):
      net/igc: remove MTU setting limitation
      net/e1000: remove MTU setting limitation
      examples/packet_ordering: fix port configuration

David Christensen (1):
      config/ppc: reduce number of cores and NUMA nodes

David Harton (1):
      net/ena: fix releasing Tx ring mbufs

David Hunt (4):
      test/power: fix CPU frequency check
      test/power: add turbo mode to frequency check
      test/power: fix low frequency test when turbo enabled
      test/power: fix turbo test

David Marchand (18):
      doc: fix sphinx rtd theme import in GHA
      service: clean references to removed symbol
      eal: fix evaluation of log level option
      ci: hook to GitHub Actions
      ci: enable v21 ABI checks
      ci: fix package installation in GitHub Actions
      ci: ignore APT update failure in GitHub Actions
      ci: catch coredumps
      vhost: fix offload flags in Rx path
      bus/fslmc: remove unused debug macro
      eal: fix leak in shared lib mode detection
      event/dpaa2: remove unused macros
      net/ice/base: fix memory allocation wrapper
      net/ice: fix leak on thread termination
      devtools: fix orphan symbols check with busybox
      net/vhost: restore pseudo TSO support
      net/ark: fix leak on thread termination
      build: fix drivers selection without Python

Dekel Peled (1):
      common/mlx5: fix DevX read output buffer size

Dmitry Kozlyuk (4):
      net/pcap: fix format string
      eal/windows: add missing SPDX license tag
      buildtools: fix all drivers disabled on Windows
      examples/rxtx_callbacks: fix port ID format specifier

Ed Czeck (2):
      net/ark: update packet director initial state
      net/ark: refactor Rx buffer recovery

Elad Nachman (2):
      kni: support async user request
      kni: fix kernel deadlock with bifurcated device

Feifei Wang (2):
      net/i40e: fix parsing packet type for NEON
      test/trace: fix race on collected perf data

Ferruh Yigit (9):
      power: remove duplicated symbols from map file
      log/linux: make default output stderr
      license: fix typos
      drivers/net: fix FW version query
      net/bnx2x: fix build with GCC 11
      net/bnx2x: fix build with GCC 11
      net/ice/base: fix build with GCC 11
      net/tap: fix build with GCC 11
      test/table: fix build with GCC 11

Gregory Etelson (2):
      app/testpmd: fix tunnel offload flows cleanup
      net/mlx5: fix tunnel offload private items location

Guoyang Zhou (1):
      net/hinic: fix crash in secondary process

Haiyue Wang (1):
      net/ixgbe: fix Rx errors statistics for UDP checksum

Harman Kalra (1):
      event/octeontx2: fix device reconfigure for single slot

Heinrich Kuhn (1):
      net/nfp: fix reporting of RSS capabilities

Hemant Agrawal (3):
      ethdev: add missing buses in device iterator
      crypto/dpaa_sec: affine the thread portal affinity
      crypto/dpaa2_sec: fix close and uninit functions

Hongbo Zheng (9):
      app/testpmd: fix Tx/Rx descriptor query error log
      net/hns3: fix FLR miss detection
      net/hns3: delete redundant blank line
      bpf: fix JSLT validation
      common/sfc_efx/base: fix dereferencing null pointer
      power: fix sanity checks for guest channel read
      net/hns3: fix VF alive notification after config restore
      examples/l3fwd-power: fix empty poll thresholds
      net/hns3: fix concurrent interrupt handling

Huisong Li (23):
      net/hns3: fix device capabilities for copper media type
      net/hns3: remove unused parameter markers
      net/hns3: fix reporting undefined speed
      net/hns3: fix link update when failed to get link info
      net/hns3: fix flow control exception
      app/testpmd: fix bitmap of link speeds when force speed
      net/hns3: fix flow control mode
      net/hns3: remove redundant mailbox response
      net/hns3: fix DCB mode check
      net/hns3: fix VMDq mode check
      net/hns3: fix mbuf leakage
      net/hns3: fix link status when port is stopped
      net/hns3: fix link speed when port is down
      app/testpmd: fix forward lcores number for DCB
      app/testpmd: fix DCB forwarding configuration
      app/testpmd: fix DCB re-configuration
      app/testpmd: verify DCB config during forward config
      net/hns3: fix Rx/Tx queue numbers check
      net/hns3: fix requested FC mode rollback
      net/hns3: remove meaningless packet buffer rollback
      net/hns3: fix DCB configuration
      net/hns3: fix DCB reconfiguration
      net/hns3: fix link speed when VF device is down

Ibtisam Tariq (1):
      examples/vhost_crypto: remove unused short option

Igor Chauskin (2):
      net/ena: switch memcpy to optimized version
      net/ena: fix parsing of large LLQ header device argument

Igor Russkikh (2):
      net/qede: reduce log verbosity
      net/qede: accept bigger RSS table

Ilya Maximets (1):
      net/virtio: fix interrupt unregistering for listening socket

Ivan Malov (5):
      net/sfc: fix buffer size for flow parse
      net: fix comment in IPv6 header
      net/sfc: fix error path inconsistency
      common/sfc_efx/base: fix indication of MAE encap support
      net/sfc: fix outer rule rollback on error

Jerin Jacob (1):
      examples: fix pkg-config override

Jiawei Wang (4):
      app/testpmd: fix NVGRE encap configuration
      net/mlx5: fix resource release for mirror flow
      net/mlx5: fix RSS flow item expansion for GRE key
      net/mlx5: fix RSS flow item expansion for NVGRE

Jiawei Zhu (1):
      net/mlx5: fix Rx segmented packets on mbuf starvation

Jiawen Wu (4):
      net/txgbe: remove unused functions
      net/txgbe: fix Rx missed packet counter
      net/txgbe: update packet type
      net/txgbe: fix QinQ strip

Jiayu Hu (2):
      vhost: fix queue initialization
      vhost: fix redundant vring status change notification

Jie Wang (1):
      net/ice: fix VSI array out of bounds access

John Daley (2):
      net/enic: fix flow initialization error handling
      net/enic: enable GENEVE offload via VNIC configuration

Juraj Linkeš (1):
      eal/arm64: fix platform register bit

Kai Ji (2):
      test/crypto: fix auth-cipher compare length in OOP
      test/crypto: copy offset data to OOP destination buffer

Kalesh AP (23):
      net/bnxt: remove unused macro
      net/bnxt: fix VNIC configuration
      net/bnxt: fix firmware fatal error handling
      net/bnxt: fix FW readiness check during recovery
      net/bnxt: fix device readiness check
      net/bnxt: fix VF info allocation
      net/bnxt: fix HWRM and FW incompatibility handling
      net/bnxt: mute some failure logs
      app/testpmd: check MAC address query
      net/bnxt: fix PCI write check
      net/bnxt: fix link state operations
      net/bnxt: fix timesync when PTP is not supported
      net/bnxt: fix memory allocation for command response
      net/bnxt: fix double free in port start failure
      net/bnxt: fix configuring LRO
      net/bnxt: fix health check alarm cancellation
      net/bnxt: fix PTP support for Thor
      net/bnxt: fix ring count calculation for Thor
      net/bnxt: remove unnecessary forward declarations
      net/bnxt: remove unused function parameters
      net/bnxt: drop unused attribute
      net/bnxt: fix single PF per port check
      net/bnxt: prevent device access in error state

Kamil Vojanec (1):
      net/mlx5/linux: fix firmware version

Kevin Traynor (5):
      test/cmdline: fix inputs array
      test/crypto: fix build with GCC 11
      crypto/zuc: fix build with GCC 11
      test: fix build with GCC 11
      test/cmdline: silence clang 12 warning

Konstantin Ananyev (1):
      acl: fix build with GCC 11

Lance Richardson (8):
      net/bnxt: fix Rx buffer posting
      net/bnxt: fix Tx length hint threshold
      net/bnxt: fix handling of null flow mask
      test: fix TCP header initialization
      net/bnxt: fix Rx descriptor status
      net/bnxt: fix Rx queue count
      net/bnxt: fix dynamic VNIC count
      eal: fix memory mapping on 32-bit target

Leyi Rong (1):
      net/iavf: fix packet length parsing in AVX512

Li Zhang (1):
      net/mlx5: fix flow actions index in cache

Luc Pelletier (2):
      eal: fix race in control thread creation
      eal: fix hang in control thread creation

Marvin Liu (5):
      vhost: fix split ring potential buffer overflow
      vhost: fix packed ring potential buffer overflow
      vhost: fix batch dequeue potential buffer overflow
      vhost: fix initialization of temporary header
      vhost: fix initialization of async temporary header

Matan Azrad (5):
      common/mlx5/linux: add glue function to query WQ
      common/mlx5: add DevX command to query WQ
      common/mlx5: add DevX commands for queue counters
      vdpa/mlx5: fix virtq cleaning
      vdpa/mlx5: fix device unplug

Michael Baum (1):
      net/mlx5: fix flow age event triggering

Michal Krawczyk (5):
      net/ena/base: improve style and comments
      net/ena/base: fix type conversions by explicit casting
      net/ena/base: destroy multiple wait events
      net/ena: fix crash with unsupported device argument
      net/ena: indicate Rx RSS hash presence

Min Hu (Connor) (25):
      net/hns3: fix MTU config complexity
      net/hns3: update HiSilicon copyright syntax
      net/hns3: fix copyright date
      examples/ptpclient: remove wrong comment
      test/bpf: fix error message
      doc: fix HiSilicon copyright syntax
      net/hns3: remove unused macros
      net/hns3: remove unused macro
      app/eventdev: fix overflow in lcore list parsing
      test/kni: fix a comment
      test/kni: check init result
      net/hns3: fix typos on comments
      net/e1000: fix flow error message object
      app/testpmd: fix division by zero on socket memory dump
      net/kni: warn on stop failure
      app/bbdev: check memory allocation
      app/bbdev: fix HARQ error messages
      raw/skeleton: add missing check after setting attribute
      test/timer: check memzone allocation
      app/crypto-perf: check memory allocation
      examples/flow_classify: fix NUMA check of port and core
      examples/l2fwd-cat: fix NUMA check of port and core
      examples/skeleton: fix NUMA check of port and core
      test: check flow classifier creation
      test: fix division by zero

Murphy Yang (3):
      net/ixgbe: fix RSS RETA being reset after port start
      net/i40e: fix flow director config after flow validate
      net/i40e: fix flow director for common pctypes

Natanael Copa (5):
      common/dpaax/caamflib: fix build with musl
      bus/dpaa: fix 64-bit arch detection
      bus/dpaa: fix build with musl
      net/cxgbe: remove use of uint type
      app/testpmd: fix build with musl

Nipun Gupta (1):
      bus/dpaa: fix statistics reading

Nithin Dabilpuram (3):
      vfio: do not merge contiguous areas
      vfio: fix DMA mapping granularity for IOVA as VA
      test/mem: fix page size for external memory

Olivier Matz (1):
      test/mempool: fix object initializer

Pallavi Kadam (1):
      bus/pci: skip probing some Windows NDIS devices

Pavan Nikhilesh (4):
      test/event: fix timeout accuracy
      app/eventdev: fix timeout accuracy
      app/eventdev: fix lcore parsing skipping last core
      event/octeontx2: fix XAQ pool reconfigure

Pu Xu (1):
      ip_frag: fix fragmenting IPv4 packet with header option

Qi Zhang (8):
      net/ice/base: fix payload indicator on ptype
      net/ice/base: fix uninitialized struct
      net/ice/base: cleanup filter list on error
      net/ice/base: fix memory allocation for MAC addresses
      net/iavf: fix TSO max segment size
      doc: fix matching versions in ice guide
      net/iavf: fix wrong Tx context descriptor
      common/iavf: fix duplicated offload bit

Radha Mohan Chintakuntla (1):
      raw/octeontx2_dma: assign PCI device in DPI VF

Raslan Darawsheh (1):
      ethdev: update flow item GTP QFI definition

Richael Zhuang (2):
      test/power: add delay before checking CPU frequency
      test/power: round CPU frequency to check

Robin Zhang (6):
      net/i40e: announce request queue capability in PF
      doc: update recommended versions for i40e
      net/i40e: fix lack of MAC type when set MAC address
      net/iavf: fix lack of MAC type when set MAC address
      net/iavf: fix primary MAC type when starting port
      net/i40e: fix primary MAC type when starting port

Rohit Raj (3):
      net/dpaa2: fix getting link status
      net/dpaa: fix getting link status
      examples/l2fwd-crypto: fix packet length while decryption

Roy Shterman (1):
      mem: fix freeing segments in --huge-unlink mode

Satheesh Paul (1):
      net/octeontx2: fix VLAN filter

Savinay Dharmappa (1):
      sched: fix traffic class oversubscription parameter

Shijith Thotton (3):
      eventdev: fix case to initiate crypto adapter service
      event/octeontx2: fix crypto adapter queue pair operations
      event/octeontx2: configure crypto adapter xaq pool

Siwar Zitouni (1):
      net/ice: fix disabling promiscuous mode

Somnath Kotur (5):
      net/bnxt: fix xstats get
      net/bnxt: fix Rx and Tx timestamps
      net/bnxt: fix Tx timestamp init
      net/bnxt: refactor multi-queue Rx configuration
      net/bnxt: fix Rx timestamp when FIFO pending bit is set

Stanislaw Kardach (6):
      test: proceed if timer subsystem already initialized
      stack: allow lock-free only on relevant architectures
      test/distributor: fix worker notification in burst mode
      test/distributor: fix burst flush on worker quit
      net/ena: remove endian swap functions
      net/ena: report default ring size

Stephen Hemminger (2):
      kni: refactor user request processing
      net/bnxt: use prefix on global function

Suanming Mou (1):
      net/mlx5: fix counter offset detection

Tal Shnaiderman (2):
      eal/windows: fix default thread priority
      eal/windows: fix return codes of pthread shim layer

Tengfei Zhang (1):
      net/pcap: fix file descriptor leak on close

Thinh Tran (1):
      test: fix autotest handling of skipped tests

Thomas Monjalon (18):
      bus/pci: fix Windows kernel driver categories
      eal: fix comment of OS-specific header files
      buildtools: fix build with busybox
      build: detect execinfo library on Linux
      build: remove redundant _GNU_SOURCE definitions
      eal: fix build with musl
      net/igc: remove use of uint type
      event/dlb: fix header includes for musl
      examples/bbdev: fix header include for musl
      drivers: fix log level after loading
      app/regex: fix usage text
      app/testpmd: fix usage text
      doc: fix names of UIO drivers
      doc: fix build with Sphinx 4
      bus/pci: support I/O port operations with musl
      app: fix exit messages
      regex/octeontx2: remove unused include directory
      doc: remove PDF requirements

Tianyu Li (1):
      net/memif: fix Tx bps statistics for zero-copy

Timothy McDaniel (2):
      event/dlb2: remove references to deferred scheduling
      doc: fix runtime options in DLB2 guide

Tyler Retzlaff (1):
      eal: add C++ include guard for reciprocal header

Vadim Podovinnikov (1):
      net/bonding: fix LACP system address check

Venkat Duvvuru (1):
      net/bnxt: fix queues per VNIC

Viacheslav Ovsiienko (16):
      net/mlx5: fix external buffer pool registration for Rx queue
      net/mlx5: fix metadata item validation for ingress flows
      net/mlx5: fix hashed list size for tunnel flow groups
      net/mlx5: fix UAR allocation diagnostics messages
      common/mlx5: add timestamp format support to DevX
      vdpa/mlx5: support timestamp format
      net/mlx5: fix Rx metadata leftovers
      net/mlx5: fix drop action for Direct Rules/Verbs
      net/mlx4: fix RSS action with null hash key
      net/mlx5: support timestamp format
      regex/mlx5: support timestamp format
      app/testpmd: fix segment number check
      net/mlx5: remove drop queue function prototypes
      net/mlx4: fix buffer leakage on device close
      net/mlx5: fix probing device in legacy bonding mode
      net/mlx5: fix receiving queue timestamp format

Wei Huang (1):
      raw/ifpga: fix device name format

Wenjun Wu (3):
      net/ice: check some functions return
      net/ice: fix RSS hash update
      net/ice: fix RSS for L2 packet

Wenwu Ma (1):
      net/ice: fix illegal access when removing MAC filter

Wenzhuo Lu (2):
      net/iavf: fix crash in AVX512
      net/ice: fix crash in AVX512

Wisam Jaddo (1):
      app/flow-perf: fix encap/decap actions

Xiao Wang (1):
      vdpa/ifc: check PCI config read

Xiaoyu Min (4):
      net/mlx5: support RSS expansion for IPv6 GRE
      net/mlx5: fix shared inner RSS
      net/mlx5: fix missing shared RSS hash types
      net/mlx5: fix redundant flow after RSS expansion

Xiaoyun Li (2):
      app/testpmd: remove unnecessary UDP tunnel check
      net/i40e: fix IPv4 fragment offload

Xueming Li (2):
      version: 20.11.2-rc1
      net/virtio: fix vectorized Rx queue rearm

Youri Querry (1):
      bus/fslmc: fix random portal hangs with qbman 5.0

Yunjian Wang (5):
      vfio: fix API description
      net/mlx5: fix using flow tunnel before null check
      vfio: fix duplicated user mem map
      net/mlx4: fix leak when configured repeatedly
      net/mlx5: fix leak when configured repeatedly


* [dpdk-dev] 20.11.2 patches review and test
@ 2021-06-26 15:41  1% Xueming(Steven) Li
From: Xueming(Steven) Li @ 2021-06-26 15:41 UTC (permalink / raw)
  To: stable
  Cc: dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani,
	benjamin.walker, David Christensen, hariprasad.govindharajan,
	Hemant Agrawal, Ian Stokes, Jerin Jacob, John McNamara,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, pingx.yu,
	qian.q.xu, Raslan Darawsheh, NBU-Contact-Thomas Monjalon,
	yuan.peng, zhaoyan.chen

Hi all,

Here is a list of patches targeted for stable release 20.11.2.

The planned date for the final release is 6th July.

Please help with testing and validation of your use cases, and report
any issues or results by replying to all on this mail. For the final
release, the fixes and reported validations will be added to the
release notes.

A release candidate tarball can be found at:

    https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc2

These patches are located at branch 20.11 of dpdk-stable repo:
    https://dpdk.org/browse/dpdk-stable/

Thanks.

Xueming Li <xuemingl@nvidia.com>

---
Adam Dybkowski (3):
      common/qat: increase IM buffer size for GEN3
      compress/qat: enable compression on GEN3
      crypto/qat: fix null authentication request

Ajit Khaparde (7):
      net/bnxt: fix RSS context cleanup
      net/bnxt: check kvargs parsing
      net/bnxt: fix resource cleanup
      doc: fix formatting in testpmd guide
      net/bnxt: fix mismatched type comparison in MAC restore
      net/bnxt: check PCI config read
      net/bnxt: fix mismatched type comparison in Rx

Alvin Zhang (11):
      net/ice: fix VLAN filter with PF
      net/i40e: fix input set field mask
      net/igc: fix Rx RSS hash offload capability
      net/igc: fix Rx error counter for bad length
      net/e1000: fix Rx error counter for bad length
      net/e1000: fix max Rx packet size
      net/igc: fix Rx packet size
      net/ice: fix fast mbuf freeing
      net/iavf: fix VF to PF command failure handling
      net/i40e: fix VF RSS configuration
      net/igc: fix speed configuration

Anatoly Burakov (3):
      fbarray: fix log message on truncation error
      power: do not skip saving original P-state governor
      power: save original ACPI governor always

Andrew Boyer (1):
      net/ionic: fix completion type in lif init

Andrew Rybchenko (4):
      net/failsafe: fix RSS hash offload reporting
      net/failsafe: report minimum and maximum MTU
      common/sfc_efx: remove GENEVE from supported tunnels
      net/sfc: fix mark support in EF100 native Rx datapath

Andy Moreton (2):
      common/sfc_efx/base: limit reported MCDI response length
      common/sfc_efx/base: add missing MCDI response length checks

Ankur Dwivedi (1):
      crypto/octeontx: fix session-less mode

Apeksha Gupta (1):
      examples/l2fwd-crypto: skip masked devices

Arek Kusztal (1):
      crypto/qat: fix offset for out-of-place scatter-gather

Beilei Xing (1):
      net/i40evf: fix packet loss for X722

Bing Zhao (1):
      net/mlx5: fix loopback for Direct Verbs queue

Bruce Richardson (2):
      build: exclude meson files from examples installation
      raw/ioat: fix script for configuring small number of queues

Chaoyong He (1):
      doc: fix multiport syntax in nfp guide

Chenbo Xia (1):
      examples/vhost: check memory table query

Chengchang Tang (20):
      net/hns3: fix HW buffer size on MTU update
      net/hns3: fix processing Tx offload flags
      net/hns3: fix Tx checksum for UDP packets with special port
      net/hns3: fix long task queue pairs reset time
      ethdev: validate input in module EEPROM dump
      ethdev: validate input in register info
      ethdev: validate input in EEPROM info
      net/hns3: fix rollback after setting PVID failure
      net/hns3: fix timing in resetting queues
      net/hns3: fix queue state when concurrent with reset
      net/hns3: fix configure FEC when concurrent with reset
      net/hns3: fix use of command status enumeration
      examples: add eal cleanup to examples
      net/bonding: fix adding itself as its slave
      net/hns3: fix timing in mailbox
      app/testpmd: fix max queue number for Tx offloads
      net/tap: fix interrupt vector array size
      net/bonding: fix socket ID check
      net/tap: check ioctl on restore
      examples/timer: fix time interval

Chengwen Feng (50):
      net/hns3: fix flow counter value
      net/hns3: fix VF mailbox head field
      net/hns3: support get device version when dump register
      net/hns3: fix some packet types
      net/hns3: fix missing outer L4 UDP flag for VXLAN
      net/hns3: remove VLAN/QinQ ptypes from support list
      test: check thread creation
      common/dpaax: fix possible null pointer access
      examples/ethtool: remove unused parsing
      net/hns3: fix flow director lock
      net/e1000/base: fix timeout for shadow RAM write
      net/hns3: fix setting default MAC address in bonding of VF
      net/hns3: fix possible mismatched response of mailbox
      net/hns3: fix VF handling LSC event in secondary process
      net/hns3: fix verification of NEON support
      mbuf: check shared memory before dumping dynamic space
      eventdev: remove redundant thread name setting
      eventdev: fix memory leakage on thread creation failure
      net/kni: check init result
      net/hns3: fix mailbox error message
      net/hns3: fix processing link status message on PF
      net/hns3: remove unused mailbox macro and struct
      net/bonding: fix leak on remove
      net/hns3: fix handling link update
      net/i40e: fix negative VEB index
      net/i40e: remove redundant VSI check in Tx queue setup
      net/virtio: fix getline memory leakage
      net/hns3: log time delta in decimal format
      net/hns3: fix time delta calculation
      net/hns3: remove unused macros
      net/hns3: fix vector Rx burst limitation
      net/hns3: remove read when enabling TM QCN error event
      net/hns3: remove unused VMDq code
      net/hns3: increase readability in logs
      raw/ntb: check SPAD user index
      raw/ntb: check memory allocations
      ipc: check malloc sync reply result
      eal: fix service core list parsing
      ipc: use monotonic clock
      net/hns3: return error on PCI config write failure
      net/hns3: fix log on flow director clear
      net/hns3: clear hash map on flow director clear
      net/hns3: fix querying flow director counter for out param
      net/hns3: fix TM QCN error event report by MSI-X
      net/hns3: fix mailbox message ID in log
      net/hns3: fix secondary process request start/stop Rx/Tx
      net/hns3: fix ordering in secondary process initialization
      net/hns3: fail setting FEC if one bit mode is not supported
      net/mlx4: fix secondary process initialization ordering
      net/mlx5: fix secondary process initialization ordering

Ciara Loftus (1):
      net/af_xdp: fix error handling during Rx queue setup

Ciara Power (2):
      telemetry: fix race on callbacks list
      test/crypto: fix return value of a skipped test

Conor Walsh (1):
      examples/l3fwd: fix LPM IPv6 subnets

Cristian Dumitrescu (3):
      table: fix actions with different data size
      pipeline: fix instruction translation
      pipeline: fix endianness conversions

Dapeng Yu (3):
      net/igc: remove MTU setting limitation
      net/e1000: remove MTU setting limitation
      examples/packet_ordering: fix port configuration

David Christensen (1):
      config/ppc: reduce number of cores and NUMA nodes

David Harton (1):
      net/ena: fix releasing Tx ring mbufs

David Hunt (4):
      test/power: fix CPU frequency check
      test/power: add turbo mode to frequency check
      test/power: fix low frequency test when turbo enabled
      test/power: fix turbo test

David Marchand (18):
      doc: fix sphinx rtd theme import in GHA
      service: clean references to removed symbol
      eal: fix evaluation of log level option
      ci: hook to GitHub Actions
      ci: enable v21 ABI checks
      ci: fix package installation in GitHub Actions
      ci: ignore APT update failure in GitHub Actions
      ci: catch coredumps
      vhost: fix offload flags in Rx path
      bus/fslmc: remove unused debug macro
      eal: fix leak in shared lib mode detection
      event/dpaa2: remove unused macros
      net/ice/base: fix memory allocation wrapper
      net/ice: fix leak on thread termination
      devtools: fix orphan symbols check with busybox
      net/vhost: restore pseudo TSO support
      net/ark: fix leak on thread termination
      build: fix drivers selection without Python

Dekel Peled (1):
      common/mlx5: fix DevX read output buffer size

Dmitry Kozlyuk (4):
      net/pcap: fix format string
      eal/windows: add missing SPDX license tag
      buildtools: fix all drivers disabled on Windows
      examples/rxtx_callbacks: fix port ID format specifier

Ed Czeck (2):
      net/ark: update packet director initial state
      net/ark: refactor Rx buffer recovery

Elad Nachman (2):
      kni: support async user request
      kni: fix kernel deadlock with bifurcated device

Feifei Wang (2):
      net/i40e: fix parsing packet type for NEON
      test/trace: fix race on collected perf data

Ferruh Yigit (9):
      power: remove duplicated symbols from map file
      log/linux: make default output stderr
      license: fix typos
      drivers/net: fix FW version query
      net/bnx2x: fix build with GCC 11
      net/bnx2x: fix build with GCC 11
      net/ice/base: fix build with GCC 11
      net/tap: fix build with GCC 11
      test/table: fix build with GCC 11

Gregory Etelson (2):
      app/testpmd: fix tunnel offload flows cleanup
      net/mlx5: fix tunnel offload private items location

Guoyang Zhou (1):
      net/hinic: fix crash in secondary process

Haiyue Wang (1):
      net/ixgbe: fix Rx errors statistics for UDP checksum

Harman Kalra (1):
      event/octeontx2: fix device reconfigure for single slot

Heinrich Kuhn (1):
      net/nfp: fix reporting of RSS capabilities

Hemant Agrawal (3):
      ethdev: add missing buses in device iterator
      crypto/dpaa_sec: affine the thread portal affinity
      crypto/dpaa2_sec: fix close and uninit functions

Hongbo Zheng (9):
      app/testpmd: fix Tx/Rx descriptor query error log
      net/hns3: fix FLR miss detection
      net/hns3: delete redundant blank line
      bpf: fix JSLT validation
      common/sfc_efx/base: fix dereferencing null pointer
      power: fix sanity checks for guest channel read
      net/hns3: fix VF alive notification after config restore
      examples/l3fwd-power: fix empty poll thresholds
      net/hns3: fix concurrent interrupt handling

Huisong Li (23):
      net/hns3: fix device capabilities for copper media type
      net/hns3: remove unused parameter markers
      net/hns3: fix reporting undefined speed
      net/hns3: fix link update when failed to get link info
      net/hns3: fix flow control exception
      app/testpmd: fix bitmap of link speeds when force speed
      net/hns3: fix flow control mode
      net/hns3: remove redundant mailbox response
      net/hns3: fix DCB mode check
      net/hns3: fix VMDq mode check
      net/hns3: fix mbuf leakage
      net/hns3: fix link status when port is stopped
      net/hns3: fix link speed when port is down
      app/testpmd: fix forward lcores number for DCB
      app/testpmd: fix DCB forwarding configuration
      app/testpmd: fix DCB re-configuration
      app/testpmd: verify DCB config during forward config
      net/hns3: fix Rx/Tx queue numbers check
      net/hns3: fix requested FC mode rollback
      net/hns3: remove meaningless packet buffer rollback
      net/hns3: fix DCB configuration
      net/hns3: fix DCB reconfiguration
      net/hns3: fix link speed when VF device is down

Ibtisam Tariq (1):
      examples/vhost_crypto: remove unused short option

Igor Chauskin (2):
      net/ena: switch memcpy to optimized version
      net/ena: fix parsing of large LLQ header device argument

Igor Russkikh (2):
      net/qede: reduce log verbosity
      net/qede: accept bigger RSS table

Ilya Maximets (1):
      net/virtio: fix interrupt unregistering for listening socket

Ivan Malov (5):
      net/sfc: fix buffer size for flow parse
      net: fix comment in IPv6 header
      net/sfc: fix error path inconsistency
      common/sfc_efx/base: fix indication of MAE encap support
      net/sfc: fix outer rule rollback on error

Jerin Jacob (1):
      examples: fix pkg-config override

Jiawei Wang (4):
      app/testpmd: fix NVGRE encap configuration
      net/mlx5: fix resource release for mirror flow
      net/mlx5: fix RSS flow item expansion for GRE key
      net/mlx5: fix RSS flow item expansion for NVGRE

Jiawei Zhu (1):
      net/mlx5: fix Rx segmented packets on mbuf starvation

Jiawen Wu (4):
      net/txgbe: remove unused functions
      net/txgbe: fix Rx missed packet counter
      net/txgbe: update packet type
      net/txgbe: fix QinQ strip

Jiayu Hu (2):
      vhost: fix queue initialization
      vhost: fix redundant vring status change notification

Jie Wang (1):
      net/ice: fix VSI array out of bounds access

John Daley (2):
      net/enic: fix flow initialization error handling
      net/enic: enable GENEVE offload via VNIC configuration

Juraj Linkeš (1):
      eal/arm64: fix platform register bit

Kai Ji (2):
      test/crypto: fix auth-cipher compare length in OOP
      test/crypto: copy offset data to OOP destination buffer

Kalesh AP (23):
      net/bnxt: remove unused macro
      net/bnxt: fix VNIC configuration
      net/bnxt: fix firmware fatal error handling
      net/bnxt: fix FW readiness check during recovery
      net/bnxt: fix device readiness check
      net/bnxt: fix VF info allocation
      net/bnxt: fix HWRM and FW incompatibility handling
      net/bnxt: mute some failure logs
      app/testpmd: check MAC address query
      net/bnxt: fix PCI write check
      net/bnxt: fix link state operations
      net/bnxt: fix timesync when PTP is not supported
      net/bnxt: fix memory allocation for command response
      net/bnxt: fix double free in port start failure
      net/bnxt: fix configuring LRO
      net/bnxt: fix health check alarm cancellation
      net/bnxt: fix PTP support for Thor
      net/bnxt: fix ring count calculation for Thor
      net/bnxt: remove unnecessary forward declarations
      net/bnxt: remove unused function parameters
      net/bnxt: drop unused attribute
      net/bnxt: fix single PF per port check
      net/bnxt: prevent device access in error state

Kamil Vojanec (1):
      net/mlx5/linux: fix firmware version

Kevin Traynor (5):
      test/cmdline: fix inputs array
      test/crypto: fix build with GCC 11
      crypto/zuc: fix build with GCC 11
      test: fix build with GCC 11
      test/cmdline: silence clang 12 warning

Konstantin Ananyev (1):
      acl: fix build with GCC 11

Lance Richardson (8):
      net/bnxt: fix Rx buffer posting
      net/bnxt: fix Tx length hint threshold
      net/bnxt: fix handling of null flow mask
      test: fix TCP header initialization
      net/bnxt: fix Rx descriptor status
      net/bnxt: fix Rx queue count
      net/bnxt: fix dynamic VNIC count
      eal: fix memory mapping on 32-bit target

Leyi Rong (1):
      net/iavf: fix packet length parsing in AVX512

Li Zhang (1):
      net/mlx5: fix flow actions index in cache

Luc Pelletier (2):
      eal: fix race in control thread creation
      eal: fix hang in control thread creation

Marvin Liu (5):
      vhost: fix split ring potential buffer overflow
      vhost: fix packed ring potential buffer overflow
      vhost: fix batch dequeue potential buffer overflow
      vhost: fix initialization of temporary header
      vhost: fix initialization of async temporary header

Matan Azrad (5):
      common/mlx5/linux: add glue function to query WQ
      common/mlx5: add DevX command to query WQ
      common/mlx5: add DevX commands for queue counters
      vdpa/mlx5: fix virtq cleaning
      vdpa/mlx5: fix device unplug

Michael Baum (1):
      net/mlx5: fix flow age event triggering

Michal Krawczyk (5):
      net/ena/base: improve style and comments
      net/ena/base: fix type conversions by explicit casting
      net/ena/base: destroy multiple wait events
      net/ena: fix crash with unsupported device argument
      net/ena: indicate Rx RSS hash presence

Min Hu (Connor) (25):
      net/hns3: fix MTU config complexity
      net/hns3: update HiSilicon copyright syntax
      net/hns3: fix copyright date
      examples/ptpclient: remove wrong comment
      test/bpf: fix error message
      doc: fix HiSilicon copyright syntax
      net/hns3: remove unused macros
      net/hns3: remove unused macro
      app/eventdev: fix overflow in lcore list parsing
      test/kni: fix a comment
      test/kni: check init result
      net/hns3: fix typos on comments
      net/e1000: fix flow error message object
      app/testpmd: fix division by zero on socket memory dump
      net/kni: warn on stop failure
      app/bbdev: check memory allocation
      app/bbdev: fix HARQ error messages
      raw/skeleton: add missing check after setting attribute
      test/timer: check memzone allocation
      app/crypto-perf: check memory allocation
      examples/flow_classify: fix NUMA check of port and core
      examples/l2fwd-cat: fix NUMA check of port and core
      examples/skeleton: fix NUMA check of port and core
      test: check flow classifier creation
      test: fix division by zero

Murphy Yang (3):
      net/ixgbe: fix RSS RETA being reset after port start
      net/i40e: fix flow director config after flow validate
      net/i40e: fix flow director for common pctypes

Natanael Copa (5):
      common/dpaax/caamflib: fix build with musl
      bus/dpaa: fix 64-bit arch detection
      bus/dpaa: fix build with musl
      net/cxgbe: remove use of uint type
      app/testpmd: fix build with musl

Nipun Gupta (1):
      bus/dpaa: fix statistics reading

Nithin Dabilpuram (3):
      vfio: do not merge contiguous areas
      vfio: fix DMA mapping granularity for IOVA as VA
      test/mem: fix page size for external memory

Olivier Matz (1):
      test/mempool: fix object initializer

Pallavi Kadam (1):
      bus/pci: skip probing some Windows NDIS devices

Pavan Nikhilesh (4):
      test/event: fix timeout accuracy
      app/eventdev: fix timeout accuracy
      app/eventdev: fix lcore parsing skipping last core
      event/octeontx2: fix XAQ pool reconfigure

Pu Xu (1):
      ip_frag: fix fragmenting IPv4 packet with header option

Qi Zhang (8):
      net/ice/base: fix payload indicator on ptype
      net/ice/base: fix uninitialized struct
      net/ice/base: cleanup filter list on error
      net/ice/base: fix memory allocation for MAC addresses
      net/iavf: fix TSO max segment size
      doc: fix matching versions in ice guide
      net/iavf: fix wrong Tx context descriptor
      common/iavf: fix duplicated offload bit

Radha Mohan Chintakuntla (1):
      raw/octeontx2_dma: assign PCI device in DPI VF

Raslan Darawsheh (1):
      ethdev: update flow item GTP QFI definition

Richael Zhuang (2):
      test/power: add delay before checking CPU frequency
      test/power: round CPU frequency to check

Robin Zhang (6):
      net/i40e: announce request queue capability in PF
      doc: update recommended versions for i40e
      net/i40e: fix lack of MAC type when set MAC address
      net/iavf: fix lack of MAC type when set MAC address
      net/iavf: fix primary MAC type when starting port
      net/i40e: fix primary MAC type when starting port

Rohit Raj (3):
      net/dpaa2: fix getting link status
      net/dpaa: fix getting link status
      examples/l2fwd-crypto: fix packet length while decryption

Roy Shterman (1):
      mem: fix freeing segments in --huge-unlink mode

Satheesh Paul (1):
      net/octeontx2: fix VLAN filter

Savinay Dharmappa (1):
      sched: fix traffic class oversubscription parameter

Shijith Thotton (3):
      eventdev: fix case to initiate crypto adapter service
      event/octeontx2: fix crypto adapter queue pair operations
      event/octeontx2: configure crypto adapter xaq pool

Siwar Zitouni (1):
      net/ice: fix disabling promiscuous mode

Somnath Kotur (5):
      net/bnxt: fix xstats get
      net/bnxt: fix Rx and Tx timestamps
      net/bnxt: fix Tx timestamp init
      net/bnxt: refactor multi-queue Rx configuration
      net/bnxt: fix Rx timestamp when FIFO pending bit is set

Stanislaw Kardach (6):
      test: proceed if timer subsystem already initialized
      stack: allow lock-free only on relevant architectures
      test/distributor: fix worker notification in burst mode
      test/distributor: fix burst flush on worker quit
      net/ena: remove endian swap functions
      net/ena: report default ring size

Stephen Hemminger (2):
      kni: refactor user request processing
      net/bnxt: use prefix on global function

Suanming Mou (1):
      net/mlx5: fix counter offset detection

Tal Shnaiderman (2):
      eal/windows: fix default thread priority
      eal/windows: fix return codes of pthread shim layer

Tengfei Zhang (1):
      net/pcap: fix file descriptor leak on close

Thinh Tran (1):
      test: fix autotest handling of skipped tests

Thomas Monjalon (18):
      bus/pci: fix Windows kernel driver categories
      eal: fix comment of OS-specific header files
      buildtools: fix build with busybox
      build: detect execinfo library on Linux
      build: remove redundant _GNU_SOURCE definitions
      eal: fix build with musl
      net/igc: remove use of uint type
      event/dlb: fix header includes for musl
      examples/bbdev: fix header include for musl
      drivers: fix log level after loading
      app/regex: fix usage text
      app/testpmd: fix usage text
      doc: fix names of UIO drivers
      doc: fix build with Sphinx 4
      bus/pci: support I/O port operations with musl
      app: fix exit messages
      regex/octeontx2: remove unused include directory
      doc: remove PDF requirements

Tianyu Li (1):
      net/memif: fix Tx bps statistics for zero-copy

Timothy McDaniel (2):
      event/dlb2: remove references to deferred scheduling
      doc: fix runtime options in DLB2 guide

Tyler Retzlaff (1):
      eal: add C++ include guard for reciprocal header

Vadim Podovinnikov (1):
      net/bonding: fix LACP system address check

Venkat Duvvuru (1):
      net/bnxt: fix queues per VNIC

Viacheslav Ovsiienko (16):
      net/mlx5: fix external buffer pool registration for Rx queue
      net/mlx5: fix metadata item validation for ingress flows
      net/mlx5: fix hashed list size for tunnel flow groups
      net/mlx5: fix UAR allocation diagnostics messages
      common/mlx5: add timestamp format support to DevX
      vdpa/mlx5: support timestamp format
      net/mlx5: fix Rx metadata leftovers
      net/mlx5: fix drop action for Direct Rules/Verbs
      net/mlx4: fix RSS action with null hash key
      net/mlx5: support timestamp format
      regex/mlx5: support timestamp format
      app/testpmd: fix segment number check
      net/mlx5: remove drop queue function prototypes
      net/mlx4: fix buffer leakage on device close
      net/mlx5: fix probing device in legacy bonding mode
      net/mlx5: fix receiving queue timestamp format

Wei Huang (1):
      raw/ifpga: fix device name format

Wenjun Wu (3):
      net/ice: check some functions return
      net/ice: fix RSS hash update
      net/ice: fix RSS for L2 packet

Wenwu Ma (1):
      net/ice: fix illegal access when removing MAC filter

Wenzhuo Lu (2):
      net/iavf: fix crash in AVX512
      net/ice: fix crash in AVX512

Wisam Jaddo (1):
      app/flow-perf: fix encap/decap actions

Xiao Wang (1):
      vdpa/ifc: check PCI config read

Xiaoyu Min (4):
      net/mlx5: support RSS expansion for IPv6 GRE
      net/mlx5: fix shared inner RSS
      net/mlx5: fix missing shared RSS hash types
      net/mlx5: fix redundant flow after RSS expansion

Xiaoyun Li (2):
      app/testpmd: remove unnecessary UDP tunnel check
      net/i40e: fix IPv4 fragment offload

Xueming Li (2):
      version: 20.11.2-rc1
      net/virtio: fix vectorized Rx queue rearm

Youri Querry (1):
      bus/fslmc: fix random portal hangs with qbman 5.0

Yunjian Wang (5):
      vfio: fix API description
      net/mlx5: fix using flow tunnel before null check
      vfio: fix duplicated user mem map
      net/mlx4: fix leak when configured repeatedly
      net/mlx5: fix leak when configured repeatedly


* [dpdk-dev] [PATCH v2 4/7] power: remove thread safety from PMD power API's
    2021-06-25 14:00  3%   ` [dpdk-dev] [PATCH v2 1/7] power_intrinsics: use callbacks for comparison Anatoly Burakov
@ 2021-06-25 14:00  3%   ` Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-06-25 14:00 UTC (permalink / raw)
  To: dev, David Hunt; +Cc: ciara.loftus

Currently, we expect that only one callback can be active at any given
moment, for a particular queue configuration, which is relatively easy
to implement in a thread-safe way. However, we're about to add support
for multiple queues per lcore, which will greatly increase the
possibility of various race conditions.

We could have used something like RCU for this use case, but absent a
pressing need for thread safety we'll go the easy way: mandate that the
APIs are called only when all affected ports are stopped, and document
this limitation. This greatly simplifies the
`rte_power_monitor`-related code.
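
As a rough illustration of the new calling contract (this is not code from
the patch), the stopped-queue check can be modelled as a tri-state result
that is mapped to an errno-style return value. The sketch is self-contained:
`struct model_queue`, `model_queue_state()` and `model_pmgmt_precheck()` are
hypothetical stand-ins, and only the -EINVAL / -EBUSY mapping mirrors what
the diff below does with the result of `queue_stopped()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-queue state reported by the ethdev layer. */
struct model_queue {
	bool valid;   /* false models a failed queue info lookup */
	bool stopped; /* true once the application has stopped the queue */
};

/*
 * Tri-state check in the style of queue_stopped() from the patch:
 * -1 on lookup failure, 1 if the queue is stopped, 0 if it is running.
 */
static int
model_queue_state(const struct model_queue *q)
{
	if (q == NULL || !q->valid)
		return -1;
	return q->stopped ? 1 : 0;
}

/*
 * Map the tri-state result to the value an enable/disable call would
 * return before it touches any per-queue configuration.
 */
static int
model_pmgmt_precheck(const struct model_queue *q)
{
	int ret = model_queue_state(q);

	if (ret != 1) {
		/* error means invalid queue, 0 means queue wasn't stopped */
		return ret < 0 ? -EINVAL : -EBUSY;
	}
	return 0;
}
```

In this model, a caller that gets -EBUSY is expected to stop the queue and
retry, while -EINVAL indicates that the queue itself could not be looked up.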

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Add check for stopped queue
    - Clarified doc message
    - Added release notes

 doc/guides/rel_notes/release_21_08.rst |   5 +
 lib/power/meson.build                  |   3 +
 lib/power/rte_power_pmd_mgmt.c         | 133 ++++++++++---------------
 lib/power/rte_power_pmd_mgmt.h         |   6 ++
 4 files changed, 67 insertions(+), 80 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index 9d1cfac395..f015c509fc 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -88,6 +88,11 @@ API Changes
 
 * eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
+* rte_power: The experimental PMD power management API is no longer considered
+  to be thread safe; all Rx queues affected by the API will now need to be
+  stopped before making any changes to the power management scheme.
+
+
 ABI Changes
 -----------
 
diff --git a/lib/power/meson.build b/lib/power/meson.build
index c1097d32f1..4f6a242364 100644
--- a/lib/power/meson.build
+++ b/lib/power/meson.build
@@ -21,4 +21,7 @@ headers = files(
         'rte_power_pmd_mgmt.h',
         'rte_power_guest_channel.h',
 )
+if cc.has_argument('-Wno-cast-qual')
+    cflags += '-Wno-cast-qual'
+endif
 deps += ['timer', 'ethdev']
diff --git a/lib/power/rte_power_pmd_mgmt.c b/lib/power/rte_power_pmd_mgmt.c
index db03cbf420..9b95cf1794 100644
--- a/lib/power/rte_power_pmd_mgmt.c
+++ b/lib/power/rte_power_pmd_mgmt.c
@@ -40,8 +40,6 @@ struct pmd_queue_cfg {
 	/**< Callback mode for this queue */
 	const struct rte_eth_rxtx_callback *cur_cb;
 	/**< Callback instance */
-	volatile bool umwait_in_progress;
-	/**< are we currently sleeping? */
 	uint64_t empty_poll_stats;
 	/**< Number of empty polls */
 } __rte_cache_aligned;
@@ -92,30 +90,11 @@ clb_umwait(uint16_t port_id, uint16_t qidx, struct rte_mbuf **pkts __rte_unused,
 			struct rte_power_monitor_cond pmc;
 			uint16_t ret;
 
-			/*
-			 * we might get a cancellation request while being
-			 * inside the callback, in which case the wakeup
-			 * wouldn't work because it would've arrived too early.
-			 *
-			 * to get around this, we notify the other thread that
-			 * we're sleeping, so that it can spin until we're done.
-			 * unsolicited wakeups are perfectly safe.
-			 */
-			q_conf->umwait_in_progress = true;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-			/* check if we need to cancel sleep */
-			if (q_conf->pwr_mgmt_state == PMD_MGMT_ENABLED) {
-				/* use monitoring condition to sleep */
-				ret = rte_eth_get_monitor_addr(port_id, qidx,
-						&pmc);
-				if (ret == 0)
-					rte_power_monitor(&pmc, UINT64_MAX);
-			}
-			q_conf->umwait_in_progress = false;
-
-			rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
+			/* use monitoring condition to sleep */
+			ret = rte_eth_get_monitor_addr(port_id, qidx,
+					&pmc);
+			if (ret == 0)
+				rte_power_monitor(&pmc, UINT64_MAX);
 		}
 	} else
 		q_conf->empty_poll_stats = 0;
@@ -177,12 +156,24 @@ clb_scale_freq(uint16_t port_id, uint16_t qidx,
 	return nb_rx;
 }
 
+static int
+queue_stopped(const uint16_t port_id, const uint16_t queue_id)
+{
+	struct rte_eth_rxq_info qinfo;
+
+	if (rte_eth_rx_queue_info_get(port_id, queue_id, &qinfo) < 0)
+		return -1;
+
+	return qinfo.queue_state == RTE_ETH_QUEUE_STATE_STOPPED;
+}
+
 int
 rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		uint16_t queue_id, enum rte_power_pmd_mgmt_type mode)
 {
 	struct pmd_queue_cfg *queue_cfg;
 	struct rte_eth_dev_info info;
+	rte_rx_callback_fn clb;
 	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
@@ -203,6 +194,14 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		goto end;
 	}
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		ret = ret < 0 ? -EINVAL : -EBUSY;
+		goto end;
+	}
+
 	queue_cfg = &port_cfg[port_id][queue_id];
 
 	if (queue_cfg->pwr_mgmt_state != PMD_MGMT_DISABLED) {
@@ -232,17 +231,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->umwait_in_progress = false;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* ensure we update our state before callback starts */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_umwait, NULL);
+		clb = clb_umwait;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_SCALE:
@@ -269,16 +258,7 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 			ret = -ENOTSUP;
 			goto end;
 		}
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id,
-				queue_id, clb_scale_freq, NULL);
+		clb = clb_scale_freq;
 		break;
 	}
 	case RTE_POWER_MGMT_TYPE_PAUSE:
@@ -286,18 +266,21 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id, uint16_t port_id,
 		if (global_data.tsc_per_us == 0)
 			calc_tsc();
 
-		/* initialize data before enabling the callback */
-		queue_cfg->empty_poll_stats = 0;
-		queue_cfg->cb_mode = mode;
-		queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
-
-		/* this is not necessary here, but do it anyway */
-		rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
-		queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
-				clb_pause, NULL);
+		clb = clb_pause;
 		break;
+	default:
+		RTE_LOG(DEBUG, POWER, "Invalid power management type\n");
+		ret = -EINVAL;
+		goto end;
 	}
+
+	/* initialize data before enabling the callback */
+	queue_cfg->empty_poll_stats = 0;
+	queue_cfg->cb_mode = mode;
+	queue_cfg->pwr_mgmt_state = PMD_MGMT_ENABLED;
+	queue_cfg->cur_cb = rte_eth_add_rx_callback(port_id, queue_id,
+			clb, NULL);
+
 	ret = 0;
 end:
 	return ret;
@@ -308,12 +291,20 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		uint16_t port_id, uint16_t queue_id)
 {
 	struct pmd_queue_cfg *queue_cfg;
+	int ret;
 
 	RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
 
 	if (lcore_id >= RTE_MAX_LCORE || queue_id >= RTE_MAX_QUEUES_PER_PORT)
 		return -EINVAL;
 
+	/* check if the queue is stopped */
+	ret = queue_stopped(port_id, queue_id);
+	if (ret != 1) {
+		/* error means invalid queue, 0 means queue wasn't stopped */
+		return ret < 0 ? -EINVAL : -EBUSY;
+	}
+
 	/* no need to check queue id as wrong queue id would not be enabled */
 	queue_cfg = &port_cfg[port_id][queue_id];
 
@@ -323,27 +314,8 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 	/* stop any callbacks from progressing */
 	queue_cfg->pwr_mgmt_state = PMD_MGMT_DISABLED;
 
-	/* ensure we update our state before continuing */
-	rte_atomic_thread_fence(__ATOMIC_SEQ_CST);
-
 	switch (queue_cfg->cb_mode) {
-	case RTE_POWER_MGMT_TYPE_MONITOR:
-	{
-		bool exit = false;
-		do {
-			/*
-			 * we may request cancellation while the other thread
-			 * has just entered the callback but hasn't started
-			 * sleeping yet, so keep waking it up until we know it's
-			 * done sleeping.
-			 */
-			if (queue_cfg->umwait_in_progress)
-				rte_power_monitor_wakeup(lcore_id);
-			else
-				exit = true;
-		} while (!exit);
-	}
-	/* fall-through */
+	case RTE_POWER_MGMT_TYPE_MONITOR: /* fall-through */
 	case RTE_POWER_MGMT_TYPE_PAUSE:
 		rte_eth_remove_rx_callback(port_id, queue_id,
 				queue_cfg->cur_cb);
@@ -356,10 +328,11 @@ rte_power_ethdev_pmgmt_queue_disable(unsigned int lcore_id,
 		break;
 	}
 	/*
-	 * we don't free the RX callback here because it is unsafe to do so
-	 * unless we know for a fact that all data plane threads have stopped.
+	 * the API doc mandates that the user stops all processing on affected
+	 * ports before calling any of these API's, so we can assume that the
+	 * callbacks can be freed. we're intentionally casting away const-ness.
 	 */
-	queue_cfg->cur_cb = NULL;
+	rte_free((void *)queue_cfg->cur_cb);
 
 	return 0;
 }
diff --git a/lib/power/rte_power_pmd_mgmt.h b/lib/power/rte_power_pmd_mgmt.h
index 7a0ac24625..444e7b8a66 100644
--- a/lib/power/rte_power_pmd_mgmt.h
+++ b/lib/power/rte_power_pmd_mgmt.h
@@ -43,6 +43,9 @@ enum rte_power_pmd_mgmt_type {
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue will be polled from.
  * @param port_id
@@ -69,6 +72,9 @@ rte_power_ethdev_pmgmt_queue_enable(unsigned int lcore_id,
  *
  * @note This function is not thread-safe.
  *
+ * @warning This function must be called when all affected Ethernet queues are
+ *   stopped and no Rx/Tx is in progress!
+ *
  * @param lcore_id
  *   The lcore the Rx queue is polled from.
  * @param port_id
-- 
2.25.1



* [dpdk-dev] [PATCH v2 1/7] power_intrinsics: use callbacks for comparison
  @ 2021-06-25 14:00  3%   ` Anatoly Burakov
  2021-06-25 14:00  3%   ` [dpdk-dev] [PATCH v2 4/7] power: remove thread safety from PMD power API's Anatoly Burakov
    2 siblings, 0 replies; 200+ results
From: Anatoly Burakov @ 2021-06-25 14:00 UTC (permalink / raw)
  To: dev, Timothy McDaniel, Beilei Xing, Jingjing Wu, Qiming Yang,
	Qi Zhang, Haiyue Wang, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Bruce Richardson, Konstantin Ananyev
  Cc: david.hunt, ciara.loftus

Previously, the semantics of power monitor were such that we checked
the current value against the expected value, and if they matched, the
sleep was aborted. This is somewhat inflexible, because it only allowed
us to check for a specific value.

This commit replaces the comparison with a user callback mechanism, so
that any PMD (or other code) using `rte_power_monitor()` can define its
own comparison semantics and decide for itself when entering the
power-optimized state should be aborted.

Existing implementations are adjusted to follow the new semantics.

Suggested-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Use callback mechanism for more flexibility
    - Address feedback from Konstantin
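
For illustration only (not part of this patch): a minimal standalone
sketch of the callback semantics described in the commit message. The
names below (`monitor_cond`, `masked_equal_cb`, `any_bit_set_cb`,
`should_abort_sleep`, `monitor_demo_selfcheck`) are hypothetical
stand-ins that only mirror the shape of `struct rte_power_monitor_cond`.
No DPDK headers are used, and the real x86 implementation also arms
UMONITOR/UMWAIT, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for rte_power_monitor_clb_t: return 0 to let the
 * sleep proceed, non-zero (-1 by convention) to abort it. */
typedef int (*monitor_clb_t)(const uint64_t val, const uint64_t opaque[4]);

/* Hypothetical stand-in for struct rte_power_monitor_cond. */
struct monitor_cond {
	volatile void *addr;  /* address to monitor */
	uint8_t size;         /* bytes to read: 1, 2, 4 or 8 */
	monitor_clb_t fn;     /* comparison callback, may be NULL */
	uint64_t opaque[4];   /* callback-specific data */
};

#define CLB_VAL_IDX 0
#define CLB_MSK_IDX 1

/* Old behaviour expressed as a callback: abort on masked equality. */
static int
masked_equal_cb(const uint64_t val, const uint64_t opaque[4])
{
	return (val & opaque[CLB_MSK_IDX]) == opaque[CLB_VAL_IDX] ? -1 : 0;
}

/* A condition the old val/mask pair could not express with a multi-bit
 * mask: abort if any of the masked bits is set. */
static int
any_bit_set_cb(const uint64_t val, const uint64_t opaque[4])
{
	return (val & opaque[CLB_MSK_IDX]) != 0 ? -1 : 0;
}

/* Read `size` bytes from the monitored location, widened to 64 bits. */
static uint64_t
read_monitored(const volatile void *p, uint8_t size)
{
	switch (size) {
	case 1: return *(const volatile uint8_t *)p;
	case 2: return *(const volatile uint16_t *)p;
	case 4: return *(const volatile uint32_t *)p;
	case 8: return *(const volatile uint64_t *)p;
	default: return 0; /* the real API rejects other sizes */
	}
}

/* Returns 1 if the sleep should be skipped, 0 if it would proceed. */
static int
should_abort_sleep(const struct monitor_cond *pmc)
{
	uint64_t cur;

	if (pmc->fn == NULL)
		return 0; /* no callback: nothing to check before sleeping */
	cur = read_monitored(pmc->addr, pmc->size);
	return pmc->fn(cur, pmc->opaque) != 0;
}

/* Small self-check exercising both callbacks. */
static void
monitor_demo_selfcheck(void)
{
	uint64_t word = 0;
	struct monitor_cond pmc = {
		.addr = &word, .size = sizeof(word), .fn = masked_equal_cb,
	};

	pmc.opaque[CLB_VAL_IDX] = 0x1;
	pmc.opaque[CLB_MSK_IDX] = 0x1;
	assert(should_abort_sleep(&pmc) == 0); /* bit clear: sleep */
	word = 0x3;
	assert(should_abort_sleep(&pmc) == 1); /* (3 & 1) == 1: abort */

	pmc.fn = any_bit_set_cb;
	pmc.opaque[CLB_MSK_IDX] = 0xF0;
	assert(should_abort_sleep(&pmc) == 0); /* no masked bit set */
	word = 0x10;
	assert(should_abort_sleep(&pmc) == 1); /* masked bit set */

	pmc.fn = NULL;
	assert(should_abort_sleep(&pmc) == 0); /* no callback: sleep */
}
```

Calling `monitor_demo_selfcheck()` from a test program's `main()` runs
the checks. The point of the sketch is only that the decision moves out
of the library and into a driver-supplied function, with `opaque[]`
carrying whatever per-queue data that function needs.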

 doc/guides/rel_notes/release_21_08.rst        |  1 +
 drivers/event/dlb2/dlb2.c                     | 16 ++++++++--
 drivers/net/i40e/i40e_rxtx.c                  | 19 ++++++++----
 drivers/net/iavf/iavf_rxtx.c                  | 19 ++++++++----
 drivers/net/ice/ice_rxtx.c                    | 19 ++++++++----
 drivers/net/ixgbe/ixgbe_rxtx.c                | 19 ++++++++----
 drivers/net/mlx5/mlx5_rx.c                    | 16 ++++++++--
 .../include/generic/rte_power_intrinsics.h    | 29 ++++++++++++++-----
 lib/eal/x86/rte_power_intrinsics.c            |  9 ++----
 9 files changed, 106 insertions(+), 41 deletions(-)

diff --git a/doc/guides/rel_notes/release_21_08.rst b/doc/guides/rel_notes/release_21_08.rst
index a6ecfdf3ce..c84ac280f5 100644
--- a/doc/guides/rel_notes/release_21_08.rst
+++ b/doc/guides/rel_notes/release_21_08.rst
@@ -84,6 +84,7 @@ API Changes
    Also, make sure to start the actual text at the margin.
    =======================================================
 
+* eal: the ``rte_power_intrinsics`` API changed to use a callback mechanism.
 
 ABI Changes
 -----------
diff --git a/drivers/event/dlb2/dlb2.c b/drivers/event/dlb2/dlb2.c
index eca183753f..14dfac257c 100644
--- a/drivers/event/dlb2/dlb2.c
+++ b/drivers/event/dlb2/dlb2.c
@@ -3154,6 +3154,15 @@ dlb2_port_credits_inc(struct dlb2_port *qm_port, int num)
 	}
 }
 
+#define CLB_MASK_IDX 0
+#define CLB_VAL_IDX 1
+static int
+dlb2_monitor_callback(const uint64_t val, const uint64_t opaque[4])
+{
+	/* abort if the value matches */
+	return (val & opaque[CLB_MASK_IDX]) == opaque[CLB_VAL_IDX] ? -1 : 0;
+}
+
 static inline int
 dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 		  struct dlb2_eventdev_port *ev_port,
@@ -3194,8 +3203,11 @@ dlb2_dequeue_wait(struct dlb2_eventdev *dlb2,
 			expected_value = 0;
 
 		pmc.addr = monitor_addr;
-		pmc.val = expected_value;
-		pmc.mask = qe_mask.raw_qe[1];
+		/* store expected value and comparison mask in opaque data */
+		pmc.opaque[CLB_VAL_IDX] = expected_value;
+		pmc.opaque[CLB_MASK_IDX] = qe_mask.raw_qe[1];
+		/* set up callback */
+		pmc.fn = dlb2_monitor_callback;
 		pmc.size = sizeof(uint64_t);
 
 		rte_power_monitor(&pmc, timeout + start_ticks);
diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 6c58decece..45f3fbf4ec 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -81,6 +81,17 @@
 #define I40E_TX_OFFLOAD_SIMPLE_NOTSUP_MASK \
 		(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_SIMPLE_SUP_MASK)
 
+static int
+i40e_monitor_callback(const uint64_t value, const uint64_t arg[4] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -93,12 +104,8 @@ i40e_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << I40E_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = i40e_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/iavf/iavf_rxtx.c b/drivers/net/iavf/iavf_rxtx.c
index 0361af0d85..6e12ecce07 100644
--- a/drivers/net/iavf/iavf_rxtx.c
+++ b/drivers/net/iavf/iavf_rxtx.c
@@ -57,6 +57,17 @@ iavf_proto_xtr_type_to_rxdid(uint8_t flex_type)
 				rxdid_map[flex_type] : IAVF_RXDID_COMMS_OVS_1;
 }
 
+static int
+iavf_monitor_callback(const uint64_t value, const uint64_t arg[4] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -69,12 +80,8 @@ iavf_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.qword1.status_error_len;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
-	pmc->mask = rte_cpu_to_le_64(1 << IAVF_RX_DESC_STATUS_DD_SHIFT);
+	/* comparison callback */
+	pmc->fn = iavf_monitor_callback;
 
 	/* registers are 64-bit */
 	pmc->size = sizeof(uint64_t);
diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
index fc9bb5a3e7..278eb4b9a1 100644
--- a/drivers/net/ice/ice_rxtx.c
+++ b/drivers/net/ice/ice_rxtx.c
@@ -27,6 +27,17 @@ uint64_t rte_net_ice_dynflag_proto_xtr_ipv6_flow_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_tcp_mask;
 uint64_t rte_net_ice_dynflag_proto_xtr_ip_offset_mask;
 
+static int
+ice_monitor_callback(const uint64_t value, const uint64_t arg[4] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -39,12 +50,8 @@ ice_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.status_error0;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
-	pmc->mask = rte_cpu_to_le_16(1 << ICE_RX_FLEX_DESC_STATUS0_DD_S);
+	/* comparison callback */
+	pmc->fn = ice_monitor_callback;
 
 	/* register is 16-bit */
 	pmc->size = sizeof(uint16_t);
diff --git a/drivers/net/ixgbe/ixgbe_rxtx.c b/drivers/net/ixgbe/ixgbe_rxtx.c
index d69f36e977..0c5045d9dc 100644
--- a/drivers/net/ixgbe/ixgbe_rxtx.c
+++ b/drivers/net/ixgbe/ixgbe_rxtx.c
@@ -1369,6 +1369,17 @@ const uint32_t
 		RTE_PTYPE_INNER_L3_IPV4_EXT | RTE_PTYPE_INNER_L4_UDP,
 };
 
+static int
+ixgbe_monitor_callback(const uint64_t value, const uint64_t arg[4] __rte_unused)
+{
+	const uint64_t m = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/*
+	 * we expect the DD bit to be set to 1 if this descriptor was already
+	 * written to.
+	 */
+	return (value & m) == m ? -1 : 0;
+}
+
 int
 ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
@@ -1381,12 +1392,8 @@ ixgbe_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 	/* watch for changes in status bit */
 	pmc->addr = &rxdp->wb.upper.status_error;
 
-	/*
-	 * we expect the DD bit to be set to 1 if this descriptor was already
-	 * written to.
-	 */
-	pmc->val = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
-	pmc->mask = rte_cpu_to_le_32(IXGBE_RXDADV_STAT_DD);
+	/* comparison callback */
+	pmc->fn = ixgbe_monitor_callback;
 
 	/* the registers are 32-bit */
 	pmc->size = sizeof(uint32_t);
diff --git a/drivers/net/mlx5/mlx5_rx.c b/drivers/net/mlx5/mlx5_rx.c
index 6cd71a44eb..f31a1ec839 100644
--- a/drivers/net/mlx5/mlx5_rx.c
+++ b/drivers/net/mlx5/mlx5_rx.c
@@ -269,6 +269,17 @@ mlx5_rx_queue_count(struct rte_eth_dev *dev, uint16_t rx_queue_id)
 	return rx_queue_count(rxq);
 }
 
+#define CLB_VAL_IDX 0
+#define CLB_MSK_IDX 1
+static int
+mlx_monitor_callback(const uint64_t value, const uint64_t opaque[4])
+{
+	const uint64_t m = opaque[CLB_MSK_IDX];
+	const uint64_t v = opaque[CLB_VAL_IDX];
+
+	return (value & m) == v ? -1 : 0;
+}
+
 int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 {
 	struct mlx5_rxq_data *rxq = rx_queue;
@@ -282,8 +293,9 @@ int mlx5_get_monitor_addr(void *rx_queue, struct rte_power_monitor_cond *pmc)
 		return -rte_errno;
 	}
 	pmc->addr = &cqe->op_own;
-	pmc->val =  !!idx;
-	pmc->mask = MLX5_CQE_OWNER_MASK;
+	pmc->opaque[CLB_VAL_IDX] = !!idx;
+	pmc->opaque[CLB_MSK_IDX] = MLX5_CQE_OWNER_MASK;
+	pmc->fn = mlx_monitor_callback;
 	pmc->size = sizeof(uint8_t);
 	return 0;
 }
diff --git a/lib/eal/include/generic/rte_power_intrinsics.h b/lib/eal/include/generic/rte_power_intrinsics.h
index dddca3d41c..046667ade6 100644
--- a/lib/eal/include/generic/rte_power_intrinsics.h
+++ b/lib/eal/include/generic/rte_power_intrinsics.h
@@ -18,19 +18,34 @@
  * which are architecture-dependent.
  */
 
+/**
+ * Callback definition for monitoring conditions. Callbacks with this signature
+ * will be used by `rte_power_monitor()` to check if the entering of power
+ * optimized state should be aborted.
+ *
+ * @param val
+ *   The value read from memory.
+ * @param opaque
+ *   Callback-specific data.
+ *
+ * @return
+ *   0 if entering of power optimized state should proceed
+ *   -1 if entering of power optimized state should be aborted
+ */
+typedef int (*rte_power_monitor_clb_t)(const uint64_t val,
+		const uint64_t opaque[4]);
 struct rte_power_monitor_cond {
 	volatile void *addr;  /**< Address to monitor for changes */
-	uint64_t val;         /**< If the `mask` is non-zero, location pointed
-	                       *   to by `addr` will be read and compared
-	                       *   against this value.
-	                       */
-	uint64_t mask;   /**< 64-bit mask to extract value read from `addr` */
-	uint8_t size;    /**< Data size (in bytes) that will be used to compare
-	                  *   expected value (`val`) with data read from the
+	uint8_t size;    /**< Data size (in bytes) that will be read from the
 	                  *   monitored memory location (`addr`). Can be 1, 2,
 	                  *   4, or 8. Supplying any other value will result in
 	                  *   an error.
 	                  */
+	rte_power_monitor_clb_t fn; /**< Callback to be used to check if
+	                             *   entering power optimized state should
+	                             *   be aborted.
+	                             */
+	uint64_t opaque[4]; /**< Callback-specific data */
 };
 
 /**
diff --git a/lib/eal/x86/rte_power_intrinsics.c b/lib/eal/x86/rte_power_intrinsics.c
index 39ea9fdecd..3c5c9ce7ad 100644
--- a/lib/eal/x86/rte_power_intrinsics.c
+++ b/lib/eal/x86/rte_power_intrinsics.c
@@ -110,14 +110,11 @@ rte_power_monitor(const struct rte_power_monitor_cond *pmc,
 	/* now that we've put this address into monitor, we can unlock */
 	rte_spinlock_unlock(&s->lock);
 
-	/* if we have a comparison mask, we might not need to sleep at all */
-	if (pmc->mask) {
+	/* if we have a callback, we might not need to sleep at all */
+	if (pmc->fn) {
 		const uint64_t cur_value = __get_umwait_val(
 				pmc->addr, pmc->size);
-		const uint64_t masked = cur_value & pmc->mask;
-
-		/* if the masked value is already matching, abort */
-		if (masked == pmc->val)
+		if (pmc->fn(cur_value, pmc->opaque) != 0)
 			goto end;
 	}
 
-- 
2.25.1


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] Experimental symbols in kni lib
  2021-06-24 13:54  0%   ` Kinsella, Ray
@ 2021-06-25 13:26  0%     ` Igor Ryzhov
  2021-06-28 12:23  0%       ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Igor Ryzhov @ 2021-06-25 13:26 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: Kinsella, Ray, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Ferruh, all,

Let's please discuss another approach to setting KNI link status before
making this API stable:
http://patches.dpdk.org/project/dpdk/patch/20190925093623.18419-1-iryzhov@nfware.com/

I explained the problem with the current implementation there.
Moreover, the ioctl approach also makes it possible to set speed and
duplex and use them to implement the get_link_ksettings callback.
I can send patches for both features.

Igor

On Thu, Jun 24, 2021 at 4:54 PM Kinsella, Ray <mdr@ashroe.eu> wrote:

> Sounds more than reasonable, +1 from me.
>
> Ray K
>
> On 24/06/2021 14:24, Ferruh Yigit wrote:
> > On 6/24/2021 11:42 AM, Kinsella, Ray wrote:
> >> Hi Ferruh,
> >>
> >> The following kni experimental symbols are present in both v21.05 and
> >> v19.11 release. These symbols should be considered for promotion to stable
> >> as part of the v22 ABI in DPDK 21.11, as they have been experimental for >=
> >> 2yrs at this point.
> >>
> >>  * rte_kni_update_link
> >>
> >> Ray K
> >>
> >
> > Hi Ray,
> >
> > Thanks for the follow-up.
> >
> > I just checked the API and am planning a small behavior update to it.
> > If the update is accepted, I suggest keeping the API experimental for
> > 21.08 too, but it can mature in v21.11.
> >
> > Thanks,
> > ferruh
> >
>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in compressdev lib
  2021-06-25  7:49  0% ` David Marchand
@ 2021-06-25  9:14  0%   ` Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-25  9:14 UTC (permalink / raw)
  To: David Marchand, Fiona Trahe, Ashish Gupta
  Cc: Thomas Monjalon, Stephen Hemminger, dpdk-dev



On 25/06/2021 08:49, David Marchand wrote:
> On Thu, Jun 24, 2021 at 12:33 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>>
>> Hi Fiona & Ashish,
>>
>> The following compressdev experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point.
>>
>>  * rte_compressdev_capability_get
>>  * rte_compressdev_close
>>  * rte_compressdev_configure
>>  * rte_compressdev_count
>>  * rte_compressdev_dequeue_burst
>>  * rte_compressdev_devices_get
>>  * rte_compressdev_enqueue_burst
>>  * rte_compressdev_get_dev_id
>>  * rte_compressdev_get_feature_name
>>  * rte_compressdev_info_get
>>  * rte_compressdev_name_get
>>  * rte_compressdev_pmd_allocate
>>  * rte_compressdev_pmd_create
>>  * rte_compressdev_pmd_destroy
>>  * rte_compressdev_pmd_get_named_dev
>>  * rte_compressdev_pmd_parse_input_args
>>  * rte_compressdev_pmd_release_device
>>  * rte_compressdev_private_xform_create
>>  * rte_compressdev_private_xform_free
>>  * rte_compressdev_queue_pair_count
>>  * rte_compressdev_queue_pair_setup
>>  * rte_compressdev_socket_id
>>  * rte_compressdev_start
>>  * rte_compressdev_stats_get
>>  * rte_compressdev_stats_reset
>>  * rte_compressdev_stop
>>  * rte_compressdev_stream_create
>>  * rte_compressdev_stream_free
>>  * rte_comp_get_feature_name
>>  * rte_comp_op_alloc
>>  * rte_comp_op_bulk_alloc
>>  * rte_comp_op_bulk_free
>>  * rte_comp_op_free
>>  * rte_comp_op_pool_create
>>
> 
> Part of the symbols listed here are driver-only (at least the *_pmd_*
> symbols) and should be marked internal.
> 
+1 agreed. 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v1] doc: update ABI in MAINTAINERS file
  2021-06-22 15:50 12% [dpdk-dev] [PATCH v1] doc: update ABI in MAINTAINERS file Ray Kinsella
@ 2021-06-25  8:08  7% ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2021-06-25  8:08 UTC (permalink / raw)
  To: Ray Kinsella, dev; +Cc: stephen, thomas, ktraynor, bruce.richardson

On 6/22/2021 4:50 PM, Ray Kinsella wrote:
> Update to ABI MAINTAINERS.
> 
> Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
> ---
>  MAINTAINERS | 1 -
>  1 file changed, 1 deletion(-)
> 
> diff --git a/MAINTAINERS b/MAINTAINERS
> index 5877a16971..dab8883a4f 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -117,7 +117,6 @@ F: .ci/
>  
>  ABI Policy & Versioning
>  M: Ray Kinsella <mdr@ashroe.eu>
> -M: Neil Horman <nhorman@tuxdriver.com>
>  F: lib/eal/include/rte_compat.h
>  F: lib/eal/include/rte_function_versioning.h
>  F: doc/guides/contributing/abi_*.rst
> 

Acked-by: Ferruh Yigit <ferruh.yigit@intel.com>

Tried to reach out to Neil multiple times for ABI issues without success.

^ permalink raw reply	[relevance 7%]

* Re: [dpdk-dev] Experimental symbols in compressdev lib
  2021-06-24 10:32  3% [dpdk-dev] Experimental symbols in compressdev lib Kinsella, Ray
  2021-06-24 10:55  0% ` Trahe, Fiona
@ 2021-06-25  7:49  0% ` David Marchand
  2021-06-25  9:14  0%   ` Kinsella, Ray
  1 sibling, 1 reply; 200+ results
From: David Marchand @ 2021-06-25  7:49 UTC (permalink / raw)
  To: Fiona Trahe, Ashish Gupta
  Cc: Kinsella, Ray, Thomas Monjalon, Stephen Hemminger, dpdk-dev

On Thu, Jun 24, 2021 at 12:33 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>
> Hi Fiona & Ashish,
>
> The following compressdev experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point.
>
>  * rte_compressdev_capability_get
>  * rte_compressdev_close
>  * rte_compressdev_configure
>  * rte_compressdev_count
>  * rte_compressdev_dequeue_burst
>  * rte_compressdev_devices_get
>  * rte_compressdev_enqueue_burst
>  * rte_compressdev_get_dev_id
>  * rte_compressdev_get_feature_name
>  * rte_compressdev_info_get
>  * rte_compressdev_name_get
>  * rte_compressdev_pmd_allocate
>  * rte_compressdev_pmd_create
>  * rte_compressdev_pmd_destroy
>  * rte_compressdev_pmd_get_named_dev
>  * rte_compressdev_pmd_parse_input_args
>  * rte_compressdev_pmd_release_device
>  * rte_compressdev_private_xform_create
>  * rte_compressdev_private_xform_free
>  * rte_compressdev_queue_pair_count
>  * rte_compressdev_queue_pair_setup
>  * rte_compressdev_socket_id
>  * rte_compressdev_start
>  * rte_compressdev_stats_get
>  * rte_compressdev_stats_reset
>  * rte_compressdev_stop
>  * rte_compressdev_stream_create
>  * rte_compressdev_stream_free
>  * rte_comp_get_feature_name
>  * rte_comp_op_alloc
>  * rte_comp_op_bulk_alloc
>  * rte_comp_op_bulk_free
>  * rte_comp_op_free
>  * rte_comp_op_pool_create
>

Part of the symbols listed here are driver-only (at least the *_pmd_*
symbols) and should be marked internal.


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in bbdev lib
  2021-06-24 10:35  3% [dpdk-dev] Experimental symbols in bbdev lib Kinsella, Ray
  2021-06-24 15:42  3% ` Chautru, Nicolas
@ 2021-06-25  7:48  0% ` David Marchand
  1 sibling, 0 replies; 200+ results
From: David Marchand @ 2021-06-25  7:48 UTC (permalink / raw)
  To: Nicolas Chautru
  Cc: Kinsella, Ray, Thomas Monjalon, Stephen Hemminger, dpdk-dev,
	Maxime Coquelin

On Thu, Jun 24, 2021 at 12:35 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>
> Hi Nicolas
>
> The following bbdev experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point.
>
> * rte_bbdev_allocate
> * rte_bbdev_callback_register
> * rte_bbdev_callback_unregister
> * rte_bbdev_close
> * rte_bbdev_count
> * rte_bbdev_dec_op_alloc_bulk
> * rte_bbdev_dec_op_free_bulk
> * rte_bbdev_dequeue_dec_ops
> * rte_bbdev_dequeue_enc_ops
> * rte_bbdev_devices
> * rte_bbdev_enc_op_alloc_bulk
> * rte_bbdev_enc_op_free_bulk
> * rte_bbdev_enqueue_dec_ops
> * rte_bbdev_enqueue_enc_ops
> * rte_bbdev_find_next
> * rte_bbdev_get_named_dev
> * rte_bbdev_info_get
> * rte_bbdev_intr_enable
> * rte_bbdev_is_valid
> * rte_bbdev_op_pool_create
> * rte_bbdev_op_type_str
> * rte_bbdev_pmd_callback_process
> * rte_bbdev_queue_configure
> * rte_bbdev_queue_info_get
> * rte_bbdev_queue_intr_ctl
> * rte_bbdev_queue_intr_disable
> * rte_bbdev_queue_intr_enable
> * rte_bbdev_queue_start
> * rte_bbdev_queue_stop
> * rte_bbdev_release
> * rte_bbdev_setup_queues
> * rte_bbdev_start
> * rte_bbdev_stats_get
> * rte_bbdev_stats_reset
> * rte_bbdev_stop

Regardless of removing the experimental status on this API, part of
the symbols listed here are driver-only and should be marked internal.


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in bbdev lib
  2021-06-24 15:42  3% ` Chautru, Nicolas
@ 2021-06-24 19:27  3%   ` Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 19:27 UTC (permalink / raw)
  To: Chautru, Nicolas, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Nicolas,

I could equally ask whether there is any concern with this being a tracked ABI?
The API has seen zero changes in two years - IMHO we'd need a very good reason not to standardize it,
as there have been ample opportunities for others to chime in.

git log --format=oneline --follow v19.11..v21.05 -- lib/bbdev/version.map
99a2dd955fba6e4cc23b77d590a033650ced9c45 lib: remove librte_ prefix from directory names
63b3907833d87288bbc74f370e22f2929ec34594 build: remove library name from version map file name

Ray K

On 24/06/2021 16:42, Chautru, Nicolas wrote:
> Hi Ray, 
> 
> That request was considered for 20.11. But this was deferred by the community while waiting for other vendors who may be willing to contribute their own PMDs.
> Any specific concern with this not being on a tracked ABI?
> 
> Thanks
> Nic
> 
> 
>> -----Original Message-----
>> From: Kinsella, Ray <mdr@ashroe.eu>
>> Sent: Thursday, June 24, 2021 3:35 AM
>> To: Chautru, Nicolas <nicolas.chautru@intel.com>; Thomas Monjalon
>> <thomas@monjalon.net>; Stephen Hemminger
>> <stephen@networkplumber.org>; dpdk-dev <dev@dpdk.org>
>> Subject: Experimental symbols in bbdev lib
>>
>> Hi Nicolas
>>
>> The following bbdev experimental symbols are present in both v21.05 and
>> v19.11 release. These symbols should be considered for promotion to stable
>> as part of the v22 ABI in DPDK 21.11, as they have been experimental for >=
>> 2yrs at this point.
>>
>> * rte_bbdev_allocate
>> * rte_bbdev_callback_register
>> * rte_bbdev_callback_unregister
>> * rte_bbdev_close
>> * rte_bbdev_count
>> * rte_bbdev_dec_op_alloc_bulk
>> * rte_bbdev_dec_op_free_bulk
>> * rte_bbdev_dequeue_dec_ops
>> * rte_bbdev_dequeue_enc_ops
>> * rte_bbdev_devices
>> * rte_bbdev_enc_op_alloc_bulk
>> * rte_bbdev_enc_op_free_bulk
>> * rte_bbdev_enqueue_dec_ops
>> * rte_bbdev_enqueue_enc_ops
>> * rte_bbdev_find_next
>> * rte_bbdev_get_named_dev
>> * rte_bbdev_info_get
>> * rte_bbdev_intr_enable
>> * rte_bbdev_is_valid
>> * rte_bbdev_op_pool_create
>> * rte_bbdev_op_type_str
>> * rte_bbdev_pmd_callback_process
>> * rte_bbdev_queue_configure
>> * rte_bbdev_queue_info_get
>> * rte_bbdev_queue_intr_ctl
>> * rte_bbdev_queue_intr_disable
>> * rte_bbdev_queue_intr_enable
>> * rte_bbdev_queue_start
>> * rte_bbdev_queue_stop
>> * rte_bbdev_release
>> * rte_bbdev_setup_queues
>> * rte_bbdev_start
>> * rte_bbdev_stats_get
>> * rte_bbdev_stats_reset
>> * rte_bbdev_stop
>>
>> Ray K

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] Experimental symbols in sched lib
  2021-06-24 10:33  3% [dpdk-dev] Experimental symbols in sched lib Kinsella, Ray
@ 2021-06-24 19:21  0% ` Singh, Jasvinder
  0 siblings, 0 replies; 200+ results
From: Singh, Jasvinder @ 2021-06-24 19:21 UTC (permalink / raw)
  To: Kinsella, Ray
  Cc: Dumitrescu, Cristian, Thomas Monjalon, Stephen Hemminger, dpdk-dev



> On 24 Jun 2021, at 11:33, Kinsella, Ray <mdr@ashroe.eu> wrote:
> 
> Hi Cristian & Jasvinder,
> 
> The following sched experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 
> 
> * rte_sched_subport_pipe_profile_add
> 
> Ray K

I’ll send a patch to remove the experimental tag. Thanks for the heads up.
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in bbdev lib
  2021-06-24 10:35  3% [dpdk-dev] Experimental symbols in bbdev lib Kinsella, Ray
@ 2021-06-24 15:42  3% ` Chautru, Nicolas
  2021-06-24 19:27  3%   ` Kinsella, Ray
  2021-06-25  7:48  0% ` David Marchand
  1 sibling, 1 reply; 200+ results
From: Chautru, Nicolas @ 2021-06-24 15:42 UTC (permalink / raw)
  To: Kinsella, Ray, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Ray, 

That request was considered for 20.11. But this was deferred by the community while waiting for other vendors who may be willing to contribute their own PMDs.
Any specific concern with this not being on a tracked ABI?

Thanks
Nic


> -----Original Message-----
> From: Kinsella, Ray <mdr@ashroe.eu>
> Sent: Thursday, June 24, 2021 3:35 AM
> To: Chautru, Nicolas <nicolas.chautru@intel.com>; Thomas Monjalon
> <thomas@monjalon.net>; Stephen Hemminger
> <stephen@networkplumber.org>; dpdk-dev <dev@dpdk.org>
> Subject: Experimental symbols in bbdev lib
> 
> Hi Nicolas
> 
> The following bbdev experimental symbols are present in both v21.05 and
> v19.11 release. These symbols should be considered for promotion to stable
> as part of the v22 ABI in DPDK 21.11, as they have been experimental for >=
> 2yrs at this point.
> 
> * rte_bbdev_allocate
> * rte_bbdev_callback_register
> * rte_bbdev_callback_unregister
> * rte_bbdev_close
> * rte_bbdev_count
> * rte_bbdev_dec_op_alloc_bulk
> * rte_bbdev_dec_op_free_bulk
> * rte_bbdev_dequeue_dec_ops
> * rte_bbdev_dequeue_enc_ops
> * rte_bbdev_devices
> * rte_bbdev_enc_op_alloc_bulk
> * rte_bbdev_enc_op_free_bulk
> * rte_bbdev_enqueue_dec_ops
> * rte_bbdev_enqueue_enc_ops
> * rte_bbdev_find_next
> * rte_bbdev_get_named_dev
> * rte_bbdev_info_get
> * rte_bbdev_intr_enable
> * rte_bbdev_is_valid
> * rte_bbdev_op_pool_create
> * rte_bbdev_op_type_str
> * rte_bbdev_pmd_callback_process
> * rte_bbdev_queue_configure
> * rte_bbdev_queue_info_get
> * rte_bbdev_queue_intr_ctl
> * rte_bbdev_queue_intr_disable
> * rte_bbdev_queue_intr_enable
> * rte_bbdev_queue_start
> * rte_bbdev_queue_stop
> * rte_bbdev_release
> * rte_bbdev_setup_queues
> * rte_bbdev_start
> * rte_bbdev_stats_get
> * rte_bbdev_stats_reset
> * rte_bbdev_stop
> 
> Ray K

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] Experimental symbols in kni lib
  2021-06-24 13:24  0% ` Ferruh Yigit
@ 2021-06-24 13:54  0%   ` Kinsella, Ray
  2021-06-25 13:26  0%     ` Igor Ryzhov
  0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2021-06-24 13:54 UTC (permalink / raw)
  To: Ferruh Yigit, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Sounds more than reasonable, +1 from me.

Ray K

On 24/06/2021 14:24, Ferruh Yigit wrote:
> On 6/24/2021 11:42 AM, Kinsella, Ray wrote:
>> Hi Ferruh, 
>>
>> The following kni experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 
>>
>>  * rte_kni_update_link
>>
>> Ray K
>>
> 
> Hi Ray,
> 
> Thanks for the follow-up.
> 
> I just checked the API and am planning a small behavior update to it.
> If the update is accepted, I suggest keeping the API experimental for 21.08 too,
> but it can mature in v21.11.
> 
> Thanks,
> ferruh
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in kni lib
  2021-06-24 10:42  3% [dpdk-dev] Experimental symbols in kni lib Kinsella, Ray
@ 2021-06-24 13:24  0% ` Ferruh Yigit
  2021-06-24 13:54  0%   ` Kinsella, Ray
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2021-06-24 13:24 UTC (permalink / raw)
  To: Kinsella, Ray, Thomas Monjalon, Stephen Hemminger, dpdk-dev

On 6/24/2021 11:42 AM, Kinsella, Ray wrote:
> Hi Ferruh, 
> 
> The following kni experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 
> 
>  * rte_kni_update_link
> 
> Ray K
> 

Hi Ray,

Thanks for the follow-up.

I just checked the API and am planning a small behavior update to it.
If the update is accepted, I suggest keeping the API experimental for 21.08 too,
but it can mature in v21.11.

Thanks,
ferruh

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [EXT] Re: Experimental symbols in security lib
  2021-06-24 10:49  0% ` Kinsella, Ray
@ 2021-06-24 12:22  0%   ` Akhil Goyal
  0 siblings, 0 replies; 200+ results
From: Akhil Goyal @ 2021-06-24 12:22 UTC (permalink / raw)
  To: Kinsella, Ray, Declan Doherty, Thomas Monjalon,
	Stephen Hemminger, dpdk-dev
  Cc: Anoob Joseph, Konstantin Ananyev, Hemant Agrawal,
	Nithin Kumar Dabilpuram, Fan Zhang, matan

Hi Ray,
> ----------------------------------------------------------------------
> (correcting Goyals address, apologies for the resend)
> 
> On 24/06/2021 11:28, Kinsella, Ray wrote:
> > Hi Declan and Goyal,
> >
> > The following security experimental symbols are present in both v21.05
> and v19.11 release. These symbols should be considered for promotion to
> stable as part of the v22 ABI in DPDK 21.11, as they have been experimental
> for >= 2yrs at this point.

Thanks for the reminder; I plan to move these to the stable API in the 21.11 timeframe.
Adding more people in cc in case of any objections.

> >
> >  * rte_security_get_userdata
> >  * rte_security_session_stats_get
> >  * rte_security_session_update
> >


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in eal lib
  2021-06-24 12:14  0% ` David Marchand
@ 2021-06-24 12:15  0%   ` Kinsella, Ray
  2021-06-29 16:50  0%   ` Tyler Retzlaff
  1 sibling, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 12:15 UTC (permalink / raw)
  To: David Marchand
  Cc: Thomas Monjalon, Stephen Hemminger, Burakov, Anatoly, dpdk-dev

Good point; that call is very much up to the lib maintainer.

Ray K

On 24/06/2021 13:14, David Marchand wrote:
> On Thu, Jun 24, 2021 at 12:31 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>>
>> Hi Anatoly & Thomas,
>>
>> The following eal experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point.
> 
> Just an additional comment.
> Marking stable is not the only choice.
> We can also consider hiding such symbols (marking internal) if there
> is no clear use case outside of DPDK.
> 
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in eal lib
  2021-06-24 10:31  3% [dpdk-dev] Experimental symbols in eal lib Kinsella, Ray
@ 2021-06-24 12:14  0% ` David Marchand
  2021-06-24 12:15  0%   ` Kinsella, Ray
  2021-06-29 16:50  0%   ` Tyler Retzlaff
  0 siblings, 2 replies; 200+ results
From: David Marchand @ 2021-06-24 12:14 UTC (permalink / raw)
  To: Kinsella, Ray
  Cc: Thomas Monjalon, Stephen Hemminger, Burakov, Anatoly, dpdk-dev

On Thu, Jun 24, 2021 at 12:31 PM Kinsella, Ray <mdr@ashroe.eu> wrote:
>
> Hi Anatoly & Thomas,
>
> The following eal experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point.

Just an additional comment.
Marking stable is not the only choice.
We can also consider hiding such symbols (marking internal) if there
is no clear use case outside of DPDK.


-- 
David Marchand


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in vhost lib
  2021-06-24 10:30  3% [dpdk-dev] Experimental symbols in vhost lib Kinsella, Ray
@ 2021-06-24 11:04  0% ` Xia, Chenbo
  0 siblings, 0 replies; 200+ results
From: Xia, Chenbo @ 2021-06-24 11:04 UTC (permalink / raw)
  To: Kinsella, Ray, Maxime Coquelin, Thomas Monjalon,
	Stephen Hemminger, dpdk-dev

Hi Ray,

> -----Original Message-----
> From: Kinsella, Ray <mdr@ashroe.eu>
> Sent: Thursday, June 24, 2021 6:30 PM
> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Xia, Chenbo
> <chenbo.xia@intel.com>; Thomas Monjalon <thomas@monjalon.net>; Stephen
> Hemminger <stephen@networkplumber.org>; dpdk-dev <dev@dpdk.org>
> Subject: Experimental symbols in vhost lib
> 
> Hi Maxime and Chenbo,
> 
> The following vhost experimental symbols are present in both v21.05 and v19.11
> release. These symbols should be considered for promotion to stable as part of
> the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this
> point.

[...]

Thanks for the heads up! I will discuss with Maxime on the experimental symbols.

Chenbo

> Ray K


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in compressdev lib
  2021-06-24 10:32  3% [dpdk-dev] Experimental symbols in compressdev lib Kinsella, Ray
@ 2021-06-24 10:55  0% ` Trahe, Fiona
  2021-06-25  7:49  0% ` David Marchand
  1 sibling, 0 replies; 200+ results
From: Trahe, Fiona @ 2021-06-24 10:55 UTC (permalink / raw)
  To: Kinsella, Ray, Ashish Gupta, Thomas Monjalon, Stephen Hemminger,
	dpdk-dev
  Cc: Trahe, Fiona

Hi Ray,
Sounds reasonable; however, I'm not currently working on this project, so I will have to leave it to others to propose.
Fiona
 

> -----Original Message-----
> From: Kinsella, Ray <mdr@ashroe.eu>
> Sent: Thursday, June 24, 2021 11:33 AM
> To: Trahe, Fiona <fiona.trahe@intel.com>; Ashish Gupta <ashish.gupta@marvell.com>; Thomas
> Monjalon <thomas@monjalon.net>; Stephen Hemminger <stephen@networkplumber.org>; dpdk-dev
> <dev@dpdk.org>
> Subject: Experimental symbols in compressdev lib
> 
> Hi Fiona & Ashish,
> 
> The following compressdev experimental symbols are present in both v21.05 and v19.11 release.
> These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as
> they have been experimental for >= 2yrs at this point.
> 
>  * rte_compressdev_capability_get
>  * rte_compressdev_close
>  * rte_compressdev_configure
>  * rte_compressdev_count
>  * rte_compressdev_dequeue_burst
>  * rte_compressdev_devices_get
>  * rte_compressdev_enqueue_burst
>  * rte_compressdev_get_dev_id
>  * rte_compressdev_get_feature_name
>  * rte_compressdev_info_get
>  * rte_compressdev_name_get
>  * rte_compressdev_pmd_allocate
>  * rte_compressdev_pmd_create
>  * rte_compressdev_pmd_destroy
>  * rte_compressdev_pmd_get_named_dev
>  * rte_compressdev_pmd_parse_input_args
>  * rte_compressdev_pmd_release_device
>  * rte_compressdev_private_xform_create
>  * rte_compressdev_private_xform_free
>  * rte_compressdev_queue_pair_count
>  * rte_compressdev_queue_pair_setup
>  * rte_compressdev_socket_id
>  * rte_compressdev_start
>  * rte_compressdev_stats_get
>  * rte_compressdev_stats_reset
>  * rte_compressdev_stop
>  * rte_compressdev_stream_create
>  * rte_compressdev_stream_free
>  * rte_comp_get_feature_name
>  * rte_comp_op_alloc
>  * rte_comp_op_bulk_alloc
>  * rte_comp_op_bulk_free
>  * rte_comp_op_free
>  * rte_comp_op_pool_create
> 
> Ray K


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in security lib
  2021-06-24 10:28  3% [dpdk-dev] Experimental symbols in security lib Kinsella, Ray
@ 2021-06-24 10:49  0% ` Kinsella, Ray
  2021-06-24 12:22  0%   ` [dpdk-dev] [EXT] " Akhil Goyal
  0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:49 UTC (permalink / raw)
  To: Declan Doherty, Thomas Monjalon, Stephen Hemminger, dpdk-dev,
	Akhil Goyal

(Correcting Goyal's address; apologies for the resend.)

On 24/06/2021 11:28, Kinsella, Ray wrote:
> Hi Declan and Goyal, 
> 
> The following security experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 
> 
>  * rte_security_get_userdata
>  * rte_security_session_stats_get
>  * rte_security_session_update
> 
> Ray K
> 

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] Experimental symbols in hash lib
       [not found]     <c6c3ce36-9585-6fcb-8899-719d6b8a368b@ashroe.eu>
@ 2021-06-24 10:47  0% ` Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:47 UTC (permalink / raw)
  To: Yipeng Wang, Sameh Gobriel, Richardson, Bruce, Medvedkin,
	Vladimir, dpdk-dev

+ dpdk dev

(missed the dev list the first time, apologies).

On 24/06/2021 11:41, Kinsella, Ray wrote:
> Hi Yipeng, Sameh, Bruce and Vladimir, 
> 
> The following hash experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 
> 
>  * rte_hash_free_key_with_position
> 
> Ray K
> 

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] Experimental symbols in fib lib
@ 2021-06-24 10:46  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:46 UTC (permalink / raw)
  To: Medvedkin, Vladimir, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Vladimir, 

The following fib experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_fib_add
 * rte_fib_create
 * rte_fib_delete
 * rte_fib_find_existing
 * rte_fib_free
 * rte_fib_lookup_bulk
 * rte_fib_get_dp
 * rte_fib_get_rib
 * rte_fib6_add
 * rte_fib6_create
 * rte_fib6_delete
 * rte_fib6_find_existing
 * rte_fib6_free
 * rte_fib6_lookup_bulk
 * rte_fib6_get_dp
 * rte_fib6_get_rib

Ray K
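
For illustration only: the criterion used in these notices (a symbol that is
listed as experimental in both the v19.11 and the v21.05 map) amounts to a
set intersection over the EXPERIMENTAL blocks of two version.map files. The
sketch below is not the script that produced the lists above; the function
names are hypothetical and the parsing assumes the simple
"NAME { global: sym; ... };" layout with '#' comments.

```python
import re

# Matches one "NAME { ... };" block of a linker version script.
SECTION_RE = re.compile(r"(\w+)\s*\{(.*?)\}\s*;", re.DOTALL)


def experimental_symbols(map_text):
    """Return the set of symbols in the EXPERIMENTAL block(s) of a map."""
    symbols = set()
    for name, body in SECTION_RE.findall(map_text):
        if name != "EXPERIMENTAL":
            continue
        for line in body.splitlines():
            line = line.split("#", 1)[0].strip()  # drop comments
            # Skip labels such as "global:" and wildcard lines like "local: *;".
            if line.endswith(";") and ":" not in line:
                symbols.add(line[:-1].strip())
    return symbols


def promotion_candidates(old_map_text, new_map_text):
    """Symbols that are experimental in both the old and the new map."""
    old = experimental_symbols(old_map_text)
    new = experimental_symbols(new_map_text)
    return sorted(old & new)
```

Called with the text of a library's v19.11 and v21.05 version.map, this
would return the names still experimental in both, which is the kind of
list shown in these mails.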


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in metrics lib
@ 2021-06-24 10:44  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:44 UTC (permalink / raw)
  To: Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Thomas, 

The following metrics experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_metrics_deinit

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in kni lib
@ 2021-06-24 10:42  3% Kinsella, Ray
  2021-06-24 13:24  0% ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:42 UTC (permalink / raw)
  To: Yigit, Ferruh, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Ferruh, 

The following kni experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_kni_update_link

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in power lib
@ 2021-06-24 10:39  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:39 UTC (permalink / raw)
  To: David Hunt, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi David,

The following power experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_empty_poll_detection
 * rte_power_empty_poll_stat_fetch
 * rte_power_empty_poll_stat_free
 * rte_power_empty_poll_stat_init
 * rte_power_empty_poll_stat_update
 * rte_power_guest_channel_receive_msg
 * rte_power_poll_stat_fetch
 * rte_power_poll_stat_update

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental Symbols in kvargs
@ 2021-06-24 10:36  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:36 UTC (permalink / raw)
  To: Olivier Matz, Stephen Hemminger, Thomas Monjalon, dpdk-dev

Hi Olivier,

The following kvargs experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

* rte_kvargs_parse_delim
* rte_kvargs_strcmp

Ray K

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental Symbols in ethdev lib
@ 2021-06-24 10:36  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:36 UTC (permalink / raw)
  To: Thomas Monjalon, Yigit, Ferruh, Andrew Rybchenko, dpdk-dev

Hi Thomas, Ferruh and Andrew,

The following ethdev experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_mtr_capabilities_get
 * rte_mtr_create
 * rte_mtr_destroy
 * rte_mtr_meter_disable
 * rte_mtr_meter_dscp_table_update
 * rte_mtr_meter_enable
 * rte_mtr_meter_profile_add
 * rte_mtr_meter_profile_delete
 * rte_mtr_meter_profile_update
 * rte_mtr_stats_read
 * rte_mtr_stats_update
 * rte_eth_dev_is_removed
 * rte_eth_dev_owner_delete
 * rte_eth_dev_owner_get
 * rte_eth_dev_owner_new
 * rte_eth_dev_owner_set
 * rte_eth_dev_owner_unset
 * rte_eth_dev_get_module_eeprom
 * rte_eth_dev_get_module_info
 * rte_eth_dev_rx_intr_ctl_q_get_fd
 * rte_flow_conv
 * rte_eth_find_next_of
 * rte_eth_find_next_sibling
 * rte_eth_read_clock
 * rte_eth_dev_hairpin_capability_get
 * rte_eth_rx_burst_mode_get
 * rte_eth_rx_hairpin_queue_setup
 * rte_eth_tx_burst_mode_get
 * rte_eth_tx_hairpin_queue_setup
 * rte_flow_dynf_metadata_offs
 * rte_flow_dynf_metadata_mask
 * rte_flow_dynf_metadata_register
 * rte_eth_dev_set_ptypes

Ray K

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in bbdev lib
@ 2021-06-24 10:35  3% Kinsella, Ray
  2021-06-24 15:42  3% ` Chautru, Nicolas
  2021-06-25  7:48  0% ` David Marchand
  0 siblings, 2 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:35 UTC (permalink / raw)
  To: Nicolas Chautru, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Nicolas

The following bbdev experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

* rte_bbdev_allocate
* rte_bbdev_callback_register
* rte_bbdev_callback_unregister
* rte_bbdev_close
* rte_bbdev_count
* rte_bbdev_dec_op_alloc_bulk
* rte_bbdev_dec_op_free_bulk
* rte_bbdev_dequeue_dec_ops
* rte_bbdev_dequeue_enc_ops
* rte_bbdev_devices
* rte_bbdev_enc_op_alloc_bulk
* rte_bbdev_enc_op_free_bulk
* rte_bbdev_enqueue_dec_ops
* rte_bbdev_enqueue_enc_ops
* rte_bbdev_find_next
* rte_bbdev_get_named_dev
* rte_bbdev_info_get
* rte_bbdev_intr_enable
* rte_bbdev_is_valid
* rte_bbdev_op_pool_create
* rte_bbdev_op_type_str
* rte_bbdev_pmd_callback_process
* rte_bbdev_queue_configure
* rte_bbdev_queue_info_get
* rte_bbdev_queue_intr_ctl
* rte_bbdev_queue_intr_disable
* rte_bbdev_queue_intr_enable
* rte_bbdev_queue_start
* rte_bbdev_queue_stop
* rte_bbdev_release
* rte_bbdev_setup_queues
* rte_bbdev_start
* rte_bbdev_stats_get
* rte_bbdev_stats_reset
* rte_bbdev_stop

Ray K

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in ip_frag
@ 2021-06-24 10:34  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:34 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Konstantin

The following ip_frag experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

* rte_frag_table_del_expired_entries

Ray K

^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in pipeline lib
@ 2021-06-24 10:34  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:34 UTC (permalink / raw)
  To: Cristian Dumitrescu, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Cristian,

The following pipeline experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

* rte_port_in_action_create
* rte_port_in_action_free
* rte_port_in_action_params_get
* rte_port_in_action_profile_action_register
* rte_port_in_action_profile_create
* rte_port_in_action_profile_free
* rte_port_in_action_profile_freeze
* rte_table_action_apply
* rte_table_action_create
* rte_table_action_dscp_table_update
* rte_table_action_free
* rte_table_action_meter_profile_add
* rte_table_action_meter_profile_delete
* rte_table_action_meter_read
* rte_table_action_profile_action_register
* rte_table_action_profile_create
* rte_table_action_profile_free
* rte_table_action_profile_freeze
* rte_table_action_stats_read
* rte_table_action_table_params_get
* rte_table_action_time_read
* rte_table_action_ttl_read
* rte_table_action_crypto_sym_session_get

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in rib lib
@ 2021-06-24 10:34  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:34 UTC (permalink / raw)
  To: Medvedkin, Vladimir, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Vladimir

The following rib experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

* rte_rib_create
* rte_rib_find_existing
* rte_rib_free
* rte_rib_get_depth
* rte_rib_get_ext
* rte_rib_get_ip
* rte_rib_get_nh
* rte_rib_get_nxt
* rte_rib_insert
* rte_rib_lookup
* rte_rib_lookup_parent
* rte_rib_lookup_exact
* rte_rib_set_nh
* rte_rib_remove
* rte_rib6_create
* rte_rib6_find_existing
* rte_rib6_free
* rte_rib6_get_depth
* rte_rib6_get_ext
* rte_rib6_get_ip
* rte_rib6_get_nh
* rte_rib6_get_nxt
* rte_rib6_insert
* rte_rib6_lookup
* rte_rib6_lookup_parent
* rte_rib6_lookup_exact
* rte_rib6_set_nh
* rte_rib6_remove

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in cryptodev lib
@ 2021-06-24 10:33  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:33 UTC (permalink / raw)
  To: Declan Doherty, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Declan,

The following cryptodev experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_cryptodev_asym_capability_get
 * rte_cryptodev_asym_get_header_session_size
 * rte_cryptodev_asym_get_private_session_size
 * rte_cryptodev_asym_get_xform_enum
 * rte_cryptodev_asym_session_clear
 * rte_cryptodev_asym_session_create
 * rte_cryptodev_asym_session_free
 * rte_cryptodev_asym_session_init
 * rte_cryptodev_asym_xform_capability_check_modlen
 * rte_cryptodev_asym_xform_capability_check_optype
 * rte_cryptodev_sym_get_existing_header_session_size
 * rte_cryptodev_sym_session_get_user_data
 * rte_cryptodev_sym_session_pool_create
 * rte_cryptodev_sym_session_set_user_data
 * rte_crypto_asym_op_strings
 * rte_crypto_asym_xform_strings

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in sched lib
@ 2021-06-24 10:33  3% Kinsella, Ray
  2021-06-24 19:21  0% ` Singh, Jasvinder
  0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:33 UTC (permalink / raw)
  To: Cristian Dumitrescu, Thomas Monjalon, Stephen Hemminger, Singh,
	Jasvinder, dpdk-dev

Hi Cristian & Jasvinder,

The following sched experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_sched_subport_pipe_profile_add

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in compressdev lib
@ 2021-06-24 10:32  3% Kinsella, Ray
  2021-06-24 10:55  0% ` Trahe, Fiona
  2021-06-25  7:49  0% ` David Marchand
  0 siblings, 2 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:32 UTC (permalink / raw)
  To: Fiona Trahe, Ashish Gupta, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Fiona & Ashish,

The following compressdev experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_compressdev_capability_get
 * rte_compressdev_close
 * rte_compressdev_configure
 * rte_compressdev_count
 * rte_compressdev_dequeue_burst
 * rte_compressdev_devices_get
 * rte_compressdev_enqueue_burst
 * rte_compressdev_get_dev_id
 * rte_compressdev_get_feature_name
 * rte_compressdev_info_get
 * rte_compressdev_name_get
 * rte_compressdev_pmd_allocate
 * rte_compressdev_pmd_create
 * rte_compressdev_pmd_destroy
 * rte_compressdev_pmd_get_named_dev
 * rte_compressdev_pmd_parse_input_args
 * rte_compressdev_pmd_release_device
 * rte_compressdev_private_xform_create
 * rte_compressdev_private_xform_free
 * rte_compressdev_queue_pair_count
 * rte_compressdev_queue_pair_setup
 * rte_compressdev_socket_id
 * rte_compressdev_start
 * rte_compressdev_stats_get
 * rte_compressdev_stats_reset
 * rte_compressdev_stop
 * rte_compressdev_stream_create
 * rte_compressdev_stream_free
 * rte_comp_get_feature_name
 * rte_comp_op_alloc
 * rte_comp_op_bulk_alloc
 * rte_comp_op_bulk_free
 * rte_comp_op_free
 * rte_comp_op_pool_create

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in port lib
@ 2021-06-24 10:31  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:31 UTC (permalink / raw)
  To: Cristian Dumitrescu, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Cristian

The following port experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_port_eventdev_writer_nodrop_ops
 * rte_port_eventdev_writer_ops

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in eal lib
@ 2021-06-24 10:31  3% Kinsella, Ray
  2021-06-24 12:14  0% ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:31 UTC (permalink / raw)
  To: Thomas Monjalon, Stephen Hemminger, Burakov, Anatoly, dpdk-dev

Hi Anatoly & Thomas, 

The following eal experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_mp_action_register
 * rte_mp_action_unregister
 * rte_mp_reply
 * rte_mp_sendmsg
 * rte_dev_event_callback_register
 * rte_dev_event_callback_unregister
 * rte_dev_event_monitor_start
 * rte_dev_event_monitor_stop
 * rte_fbarray_attach
 * rte_fbarray_destroy
 * rte_fbarray_detach
 * rte_fbarray_dump_metadata
 * rte_fbarray_find_contig_free
 * rte_fbarray_find_contig_used
 * rte_fbarray_find_idx
 * rte_fbarray_find_next_free
 * rte_fbarray_find_next_n_free
 * rte_fbarray_find_next_n_used
 * rte_fbarray_find_next_used
 * rte_fbarray_get
 * rte_fbarray_init
 * rte_fbarray_is_used
 * rte_fbarray_set_free
 * rte_fbarray_set_used
 * rte_log_register_type_and_pick_level
 * rte_malloc_dump_heaps
 * rte_mem_alloc_validator_register
 * rte_mem_alloc_validator_unregister
 * rte_mem_check_dma_mask
 * rte_mem_event_callback_register
 * rte_mem_event_callback_unregister
 * rte_mem_iova2virt
 * rte_mem_virt2memseg
 * rte_mem_virt2memseg_list
 * rte_memseg_contig_walk
 * rte_memseg_list_walk
 * rte_memseg_walk
 * rte_mp_request_async
 * rte_mp_request_sync
 * rte_class_find
 * rte_class_find_by_name
 * rte_class_register
 * rte_class_unregister
 * rte_dev_iterator_init
 * rte_dev_iterator_next
 * rte_fbarray_find_prev_free
 * rte_fbarray_find_prev_n_free
 * rte_fbarray_find_prev_n_used
 * rte_fbarray_find_prev_used
 * rte_fbarray_find_rev_contig_free
 * rte_fbarray_find_rev_contig_used
 * rte_memseg_contig_walk_thread_unsafe
 * rte_memseg_list_walk_thread_unsafe
 * rte_memseg_walk_thread_unsafe
 * rte_delay_us_sleep
 * rte_dev_event_callback_process
 * rte_dev_hotplug_handle_disable
 * rte_dev_hotplug_handle_enable
 * rte_malloc_heap_create
 * rte_malloc_heap_destroy
 * rte_malloc_heap_get_socket
 * rte_malloc_heap_memory_add
 * rte_malloc_heap_memory_attach
 * rte_malloc_heap_memory_detach
 * rte_malloc_heap_memory_remove
 * rte_malloc_heap_socket_is_external
 * rte_mem_check_dma_mask_thread_unsafe
 * rte_mem_set_dma_mask
 * rte_memseg_get_fd
 * rte_memseg_get_fd_offset
 * rte_memseg_get_fd_offset_thread_unsafe
 * rte_memseg_get_fd_thread_unsafe
 * rte_extmem_attach
 * rte_extmem_detach
 * rte_extmem_register
 * rte_extmem_unregister
 * rte_dev_dma_map
 * rte_dev_dma_unmap
 * rte_fbarray_find_biggest_free
 * rte_fbarray_find_biggest_used
 * rte_fbarray_find_rev_biggest_free
 * rte_fbarray_find_rev_biggest_used
 * rte_intr_callback_unregister_pending
 * rte_realloc_socket
 * rte_intr_ack
 * rte_lcore_cpuset
 * rte_lcore_to_cpu_id
 * rte_mcfg_timer_lock
 * rte_mcfg_timer_unlock
 * rte_rand_max

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in flow_classify lib
@ 2021-06-24 10:30  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:30 UTC (permalink / raw)
  To: Iremonger, Bernard, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Bernard, 

The following flow_classify experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_flow_classifier_create
 * rte_flow_classifier_free
 * rte_flow_classifier_query
 * rte_flow_classify_table_create
 * rte_flow_classify_table_entry_add
 * rte_flow_classify_table_entry_delete
 * rte_flow_classify_validate

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in vhost lib
@ 2021-06-24 10:30  3% Kinsella, Ray
  2021-06-24 11:04  0% ` Xia, Chenbo
  0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:30 UTC (permalink / raw)
  To: Maxime Coquelin, Chenbo Xia, Thomas Monjalon, Stephen Hemminger,
	dpdk-dev

Hi Maxime and Chenbo, 

The following vhost experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_vhost_driver_get_protocol_features
 * rte_vhost_driver_get_queue_num
 * rte_vhost_crypto_create
 * rte_vhost_crypto_free
 * rte_vhost_crypto_fetch_requests
 * rte_vhost_crypto_finalize_requests
 * rte_vhost_crypto_set_zero_copy
 * rte_vhost_va_from_guest_pa
 * rte_vhost_extern_callback_register
 * rte_vhost_driver_set_protocol_features
 * rte_vhost_set_inflight_desc_split
 * rte_vhost_set_inflight_desc_packed
 * rte_vhost_set_last_inflight_io_split
 * rte_vhost_set_last_inflight_io_packed
 * rte_vhost_clr_inflight_desc_split
 * rte_vhost_clr_inflight_desc_packed
 * rte_vhost_get_vhost_ring_inflight
 * rte_vhost_get_vring_base_from_inflight	

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in mbuf lib
@ 2021-06-24 10:29  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:29 UTC (permalink / raw)
  To: Olivier Matz, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Olivier,

The following mbuf experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_mbuf_check
 * rte_mbuf_dynfield_lookup
 * rte_mbuf_dynfield_register
 * rte_mbuf_dynfield_register_offset
 * rte_mbuf_dynflag_lookup
 * rte_mbuf_dynflag_register
 * rte_mbuf_dynflag_register_bitnum
 * rte_mbuf_dyn_dump
 * rte_pktmbuf_copy
 * rte_pktmbuf_free_bulk

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in net lib
@ 2021-06-24 10:29  3% Kinsella, Ray
  0 siblings, 0 replies; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:29 UTC (permalink / raw)
  To: Olivier Matz, Thomas Monjalon, Stephen Hemminger, dpdk-dev

Hi Olivier,

The following net experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_net_make_rarp_packet
 * rte_net_skip_ip6_ext
 * rte_ether_unformat_addr 

Ray K


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] Experimental symbols in security lib
@ 2021-06-24 10:28  3% Kinsella, Ray
  2021-06-24 10:49  0% ` Kinsella, Ray
  0 siblings, 1 reply; 200+ results
From: Kinsella, Ray @ 2021-06-24 10:28 UTC (permalink / raw)
  To: Declan Doherty, Akhil Goyal, Thomas Monjalon, Stephen Hemminger,
	dpdk-dev

Hi Declan and Goyal, 

The following security experimental symbols are present in both v21.05 and v19.11 release. These symbols should be considered for promotion to stable as part of the v22 ABI in DPDK 21.11, as they have been experimental for >= 2yrs at this point. 

 * rte_security_get_userdata
 * rte_security_session_stats_get
 * rte_security_session_update

Ray K


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v4 2/2] bus/auxiliary: introduce auxiliary bus
  2021-06-24  6:37  3%         ` Thomas Monjalon
@ 2021-06-24  8:42  3%           ` Xueming(Steven) Li
  0 siblings, 0 replies; 200+ results
From: Xueming(Steven) Li @ 2021-06-24  8:42 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon
  Cc: Parav Pandit, dev, Wang Haiyue, Kinsella Ray, david.marchand,
	ferruh.yigit

Thanks for the clarification; I will update this in the next version.
________________________________
From: Thomas Monjalon <thomas@monjalon.net>
Sent: Thursday, June 24, 2021 2:37:19 PM
To: Xueming(Steven) Li <xuemingl@nvidia.com>
Cc: Parav Pandit <parav@nvidia.com>; dev@dpdk.org <dev@dpdk.org>; Wang Haiyue <haiyue.wang@intel.com>; Kinsella Ray <mdr@ashroe.eu>; david.marchand@redhat.com <david.marchand@redhat.com>; ferruh.yigit@intel.com <ferruh.yigit@intel.com>
Subject: Re: [dpdk-dev] [PATCH v4 2/2] bus/auxiliary: introduce auxiliary bus

23/06/2021 16:52, Xueming(Steven) Li:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 23/06/2021 01:50, Xueming(Steven) Li:
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > 13/06/2021 14:58, Xueming Li:
> > > > > --- /dev/null
> > > > > +++ b/drivers/bus/auxiliary/version.map
> > > > > @@ -0,0 +1,7 @@
> > > > > +EXPERIMENTAL {
> > > > > +     global:
> > > > > +
> > > > > +     # added in 21.08
> > > > > +     rte_auxiliary_register;
> > > > > +     rte_auxiliary_unregister;
> > > > > +};
> > > >
> > > > After more thoughts, shouldn't it be an internal symbol?
> > > > It is used only by DPDK drivers.
> > >
> > > So users will not be able to compose their own driver and register
> > > > with auxiliary bus?
> >
> > Yes, that's an interesting question actually.
> > We can continue with experimental/stable status of driver ABI, but we should invent a new ABI flag like DRIVER, so there is no stability
> > policy on such symbol.
>
> I don't quite understand: why would we want to export the function without an ABI guarantee? The API shouldn't change frequently, IMHO.

Sorry, my message was not clear.
I am OK to keep "EXPERIMENTAL" in this patch.
But in the future, we don't want to make the driver interface part
of the stable ABI because it makes evolution harder for no good reason:
nobody is asking for a stable interface with drivers.



^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v4 2/2] bus/auxiliary: introduce auxiliary bus
  2021-06-23 14:52  3%       ` Xueming(Steven) Li
@ 2021-06-24  6:37  3%         ` Thomas Monjalon
  2021-06-24  8:42  3%           ` Xueming(Steven) Li
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2021-06-24  6:37 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: Parav Pandit, dev, Wang Haiyue, Kinsella Ray, david.marchand,
	ferruh.yigit

23/06/2021 16:52, Xueming(Steven) Li:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 23/06/2021 01:50, Xueming(Steven) Li:
> > > From: Thomas Monjalon <thomas@monjalon.net>
> > > > 13/06/2021 14:58, Xueming Li:
> > > > > --- /dev/null
> > > > > +++ b/drivers/bus/auxiliary/version.map
> > > > > @@ -0,0 +1,7 @@
> > > > > +EXPERIMENTAL {
> > > > > +	global:
> > > > > +
> > > > > +	# added in 21.08
> > > > > +	rte_auxiliary_register;
> > > > > +	rte_auxiliary_unregister;
> > > > > +};
> > > >
> > > > After more thoughts, shouldn't it be an internal symbol?
> > > > It is used only by DPDK drivers.
> > >
> > > So users will not be able to compose their own driver and register
> > > > with auxiliary bus?
> > 
> > Yes, that's an interesting question actually.
> > We can continue with experimental/stable status of driver ABI, but we should invent a new ABI flag like DRIVER, so there is no stability
> > policy on such symbol.
> 
> I don't quite understand: why would we want to export the function without an ABI guarantee? The API shouldn't change frequently, IMHO.

Sorry, my message was not clear.
I am OK to keep "EXPERIMENTAL" in this patch.
But in the future, we don't want to make the driver interface part
of the stable ABI because it makes evolution harder for no good reason:
nobody is asking for a stable interface with drivers.



^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v4 2/2] bus/auxiliary: introduce auxiliary bus
  2021-06-23  8:15  4%     ` Thomas Monjalon
@ 2021-06-23 14:52  3%       ` Xueming(Steven) Li
  2021-06-24  6:37  3%         ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Xueming(Steven) Li @ 2021-06-23 14:52 UTC (permalink / raw)
  To: NBU-Contact-Thomas Monjalon
  Cc: Parav Pandit, dev, Wang Haiyue, Kinsella Ray, david.marchand,
	ferruh.yigit



> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Wednesday, June 23, 2021 4:15 PM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: Parav Pandit <parav@nvidia.com>; dev@dpdk.org; Wang Haiyue <haiyue.wang@intel.com>; Kinsella Ray <mdr@ashroe.eu>;
> david.marchand@redhat.com; ferruh.yigit@intel.com
> Subject: Re: [dpdk-dev] [PATCH v4 2/2] bus/auxiliary: introduce auxiliary bus
> 
> 23/06/2021 01:50, Xueming(Steven) Li:
> > From: Thomas Monjalon <thomas@monjalon.net>
> > > 13/06/2021 14:58, Xueming Li:
> > > > --- /dev/null
> > > > +++ b/drivers/bus/auxiliary/version.map
> > > > @@ -0,0 +1,7 @@
> > > > +EXPERIMENTAL {
> > > > +	global:
> > > > +
> > > > +	# added in 21.08
> > > > +	rte_auxiliary_register;
> > > > +	rte_auxiliary_unregister;
> > > > +};
> > >
> > > After more thoughts, shouldn't it be an internal symbol?
> > > It is used only by DPDK drivers.
> >
> > So users will not be able to compose their own driver and register
> > with auxiliary bus?
> 
> Yes, that's an interesting question actually.
> We can continue with experimental/stable status of driver ABI, but we should invent a new ABI flag like DRIVER, so there is no stability
> policy on such symbol.

I don't quite understand: why would we want to export the function without an ABI guarantee? The API shouldn't change frequently, IMHO.

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v4 2/2] bus/auxiliary: introduce auxiliary bus
  @ 2021-06-23  8:15  4%     ` Thomas Monjalon
  2021-06-23 14:52  3%       ` Xueming(Steven) Li
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2021-06-23  8:15 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: Parav Pandit, dev, Wang Haiyue, Kinsella Ray, david.marchand,
	ferruh.yigit

23/06/2021 01:50, Xueming(Steven) Li:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 13/06/2021 14:58, Xueming Li:
> > > --- /dev/null
> > > +++ b/drivers/bus/auxiliary/version.map
> > > @@ -0,0 +1,7 @@
> > > +EXPERIMENTAL {
> > > +	global:
> > > +
> > > +	# added in 21.08
> > > +	rte_auxiliary_register;
> > > +	rte_auxiliary_unregister;
> > > +};
> > 
> > After more thoughts, shouldn't it be an internal symbol?
> > It is used only by DPDK drivers.
> 
> So users will not be able to compose their own driver and register with auxiliary bus?

Yes, that's an interesting question actually.
We can continue with experimental/stable status of driver ABI,
but we should invent a new ABI flag like DRIVER,
so there is no stability policy on such symbols.



^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v1] doc: update ABI in MAINTAINERS file
@ 2021-06-22 15:50 12% Ray Kinsella
  2021-06-25  8:08  7% ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Ray Kinsella @ 2021-06-22 15:50 UTC (permalink / raw)
  To: dev; +Cc: stephen, ferruh.yigit, thomas, ktraynor, bruce.richardson, mdr

Update to ABI MAINTAINERS.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
 MAINTAINERS | 1 -
 1 file changed, 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5877a16971..dab8883a4f 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -117,7 +117,6 @@ F: .ci/
 
 ABI Policy & Versioning
 M: Ray Kinsella <mdr@ashroe.eu>
-M: Neil Horman <nhorman@tuxdriver.com>
 F: lib/eal/include/rte_compat.h
 F: lib/eal/include/rte_function_versioning.h
 F: doc/guides/contributing/abi_*.rst
-- 
2.26.2


^ permalink raw reply	[relevance 12%]

* [dpdk-dev] [PATCH v5] devtools: script to track map symbols
  2021-06-18 16:36  5% [dpdk-dev] [PATCH] devtools: script to track map symbols Ray Kinsella
  2021-06-21 15:25  6% ` [dpdk-dev] [PATCH v3] " Ray Kinsella
  2021-06-21 15:35  6% ` [dpdk-dev] [PATCH v4] " Ray Kinsella
@ 2021-06-22 10:19  6% ` Ray Kinsella
  2 siblings, 0 replies; 200+ results
From: Ray Kinsella @ 2021-06-22 10:19 UTC (permalink / raw)
  To: dev; +Cc: stephen, ferruh.yigit, thomas, ktraynor, bruce.richardson, mdr

Script to track growth of stable and experimental symbols
over releases since v19.11.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
v2: reworked to fix pylint errors
v3: sent with the correct in-reply-to
v4: fix typos picked up by the CI
v5: fix terminal_size & directory args

 devtools/count_symbols.py | 262 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 262 insertions(+)
 create mode 100755 devtools/count_symbols.py

diff --git a/devtools/count_symbols.py b/devtools/count_symbols.py
new file mode 100755
index 0000000000..96990f609f
--- /dev/null
+++ b/devtools/count_symbols.py
@@ -0,0 +1,262 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 Intel Corporation
+'''Tool to count the number of symbols in each DPDK release'''
+from pathlib import Path
+import sys
+import os
+import subprocess
+import argparse
+import re
+import datetime
+
+try:
+    from parsley import makeGrammar
+except ImportError:
+    print('This script uses the package Parsley to parse C Mapfiles.\n'
+          'This can be installed with \"pip install parsley".')
+    sys.exit()
+
+MAP_GRAMMAR = r"""
+
+ws = (' ' | '\r' | '\n' | '\t')*
+
+ABI_VER = ({})
+DPDK_VER = ('DPDK_' ABI_VER)
+ABI_NAME = ('INTERNAL' | 'EXPERIMENTAL' | DPDK_VER)
+comment = '#' (~'\n' anything)+ '\n'
+symbol = (~(';' | '}}' | '#') anything )+:c ';' -> ''.join(c)
+global = 'global:'
+local = 'local: *;'
+symbols = comment* symbol:s ws comment* -> s
+
+abi = (abi_section+):m -> dict(m)
+abi_section = (ws ABI_NAME:e ws '{{' ws global* (~local ws symbols)*:s ws local* ws '}}' ws DPDK_VER* ';' ws) -> (e,s)
+"""
+
+def get_abi_versions():
+    '''Returns a string of possible dpdk abi versions'''
+
+    year = datetime.date.today().year - 2000
+    tags = " |".join(['\'{}\''.format(i) \
+                     for i in reversed(range(21, year + 1)) ])
+    tags  = tags + ' | \'20.0.1\' | \'20.0\' | \'20\''
+
+    return tags
+
+def get_dpdk_releases():
+    '''Returns a list of dpdk release tags names  since v19.11'''
+
+    year = datetime.date.today().year - 2000
+    year_range = "|".join("{}".format(i) for i in range(19,year + 1))
+    pattern = re.compile(r'^\"v(' +  year_range + r')\.\d{2}\"$')
+
+    cmd = ['git', 'for-each-ref', '--sort=taggerdate', '--format', '"%(tag)"']
+    try:
+        result = subprocess.run(cmd, \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE,
+                                check=True)
+    except subprocess.CalledProcessError:
+        print("Failed to interrogate git for release tags")
+        sys.exit()
+
+
+    tags = result.stdout.decode('utf-8').split('\n')
+
+    # find the non-rcs between now and v19.11
+    tags = [ tag.replace('\"','') \
+             for tag in reversed(tags) \
+             if pattern.match(tag) ][:-3]
+
+    return tags
+
+def fix_directory_name(path):
+    '''Prepend librte to the source directory name'''
+    mapfilepath1 = str(path.parent.name)
+    mapfilepath2 = str(path.parents[1])
+    mapfilepath = mapfilepath2 + '/librte_' + mapfilepath1
+
+    return mapfilepath
+
+def directory_renamed(path, rel):
+    '''Fix removal of the librte_ from the directory names'''
+
+    mapfilepath = fix_directory_name(path)
+    tagfile = '{}:{}/{}'.format(rel, mapfilepath,  path.name)
+
+    try:
+        result = subprocess.run(['git', 'show', tagfile], \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE,
+                                check=True)
+    except subprocess.CalledProcessError:
+        result = None
+
+    return result
+
+def mapfile_renamed(path, rel):
+    '''Fix renaming of the map file'''
+    newfile = None
+
+    result = subprocess.run(['git', 'ls-tree', \
+                             rel, str(path.parent) + '/'], \
+                            stdout=subprocess.PIPE, \
+                            stderr=subprocess.PIPE,
+                            check=True)
+    dentries = result.stdout.decode('utf-8')
+    dentries = dentries.split('\n')
+
+    # filter entries looking for the map file
+    dentries = [dentry for dentry in dentries if dentry.endswith('.map')]
+    if len(dentries) > 1 or len(dentries) == 0:
+        return None
+
+    dparts = dentries[0].split('/')
+    newfile = dparts[len(dparts) - 1]
+
+    if newfile is not None:
+        tagfile = '{}:{}/{}'.format(rel, path.parent, newfile)
+
+        try:
+            result = subprocess.run(['git', 'show', tagfile], \
+                                    stdout=subprocess.PIPE, \
+                                    stderr=subprocess.PIPE,
+                                    check=True)
+        except subprocess.CalledProcessError:
+            result = None
+
+    else:
+        result = None
+
+    return result
+
+def mapfile_and_directory_renamed(path, rel):
+    '''Fix renaming of the map file & the source directory'''
+    mapfilepath = Path("{}/{}".format(fix_directory_name(path),path.name))
+
+    return mapfile_renamed(mapfilepath, rel)
+
+def get_terminal_rows():
+    '''Find the number of rows in the terminal'''
+
+    return os.get_terminal_size().lines
+
+class FormatOutput():
+    '''Format the output to supported formats'''
+    output_fmt = ""
+    column_fmt = ""
+
+    def __init__(self, format_output, dpdk_releases):
+        self.OUTPUT_FORMATS[format_output](self,dpdk_releases)
+        self.column_titles = ['mapfile'] +  dpdk_releases
+
+        self.terminal_rows = get_terminal_rows()
+        self.row = 0
+
+    def set_terminal_output(self,dpdk_rel):
+        '''Set the output format to fixed-width terminal columns'''
+
+        self.output_fmt = '{:<50}' + \
+            ''.join(['{:<6}{:<6}'] * (len(dpdk_rel)))
+        self.column_fmt = '{:50}' + \
+            ''.join(['{:<12}'] * (len(dpdk_rel)))
+
+    def set_csv_output(self,dpdk_rel):
+        '''Set the output format to Comma Separated Values'''
+
+        self.output_fmt = '{},' + \
+            ','.join(['{},{}'] * (len(dpdk_rel)))
+        self.column_fmt = '{},' + \
+            ','.join(['{},'] * (len(dpdk_rel)))
+
+    def print_columns(self):
+        '''Print column rows with release names'''
+        print(self.column_fmt.format(*self.column_titles))
+        self.row += 1
+
+    def print_row(self,symbols):
+        '''Print row of symbol values'''
+        print(self.output_fmt.format(*symbols))
+        self.row += 1
+
+        if((self.terminal_rows>0) and ((self.row % self.terminal_rows) == 0)):
+            self.print_columns()
+
+    OUTPUT_FORMATS = { None: set_terminal_output, \
+                       'terminal': set_terminal_output, \
+                       'csv': set_csv_output }
+
+SRC_DIRECTORIES = 'drivers,lib'
+IGNORE_SECTIONS = ['EXPERIMENTAL','INTERNAL']
+FIX_STRATEGIES = [directory_renamed, \
+                  mapfile_renamed, \
+                  mapfile_and_directory_renamed]
+
+def count_release_symbols(map_parser, release, mapfile_path):
+    '''Count the symbols for a given release and mapfile'''
+    csym = [0] * 2
+    abi_sections = None
+
+    tagfile = '{}:{}'.format(release,mapfile_path)
+    try:
+        result = subprocess.run(['git', 'show', tagfile], \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE,
+                                check=True)
+    except subprocess.CalledProcessError:
+        result = None
+
+    for fix_strategy in FIX_STRATEGIES:
+        if result is not None:
+            break
+        result = fix_strategy(mapfile_path, release)
+
+    if result is not None:
+        mapfile = result.stdout.decode('utf-8')
+        abi_sections = map_parser(mapfile).abi()
+
+    if abi_sections is not None:
+        # which versions are present, and we care about
+        found_ver = [ver \
+                     for ver in abi_sections \
+                     if ver not in IGNORE_SECTIONS]
+
+        for ver in found_ver:
+            csym[0] += len(abi_sections[ver])
+
+        # count experimental symbols
+        if 'EXPERIMENTAL' in abi_sections:
+            csym[1] = len(abi_sections['EXPERIMENTAL'])
+
+    return csym
+
+def main():
+    '''Main entry point'''
+
+    parser = argparse.ArgumentParser(description='Count symbols in DPDK Libs')
+    parser.add_argument('--format-output', choices=['terminal','csv'], \
+                        default='terminal')
+    parser.add_argument('--directory', choices=SRC_DIRECTORIES.split(','),
+                        default=SRC_DIRECTORIES)
+    args = parser.parse_args()
+
+    dpdk_releases = get_dpdk_releases()
+    format_output = FormatOutput(args.format_output, dpdk_releases)
+
+    map_grammar = MAP_GRAMMAR.format(get_abi_versions())
+    map_parser = makeGrammar(map_grammar, {})
+
+    format_output.print_columns()
+    for src_dir in args.directory.split(','):
+        for path in Path(src_dir).rglob('*.map'):
+            relsym = [str(path)]
+
+            for release in dpdk_releases:
+                csym = count_release_symbols(map_parser, release, path)
+                relsym += csym
+
+            format_output.print_row(relsym)
+
+if __name__ == '__main__':
+    main()
-- 
2.26.2
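
A note on the terminal_size change in v5: os.get_terminal_size() queries
standard output and raises OSError when stdout is not a terminal, for
example when the csv output is redirected to a file. The sketch below is
one possible guard and is not part of the patch. The `default` parameter
is an assumption, and so is the choice of 0, which relies on the
"terminal_rows>0" test in print_row() to skip the repeated header.

```python
import os
import sys


def get_terminal_rows(default=0):
    '''Return the terminal height in rows, or `default` when unknown.'''
    # os.get_terminal_size() inspects stdout and raises OSError when
    # stdout is a pipe or a regular file, so check first and still
    # guard the call itself.
    if not sys.stdout.isatty():
        return default
    try:
        return os.get_terminal_size().lines
    except OSError:
        return default


if __name__ == '__main__':
    print(get_terminal_rows())
```

With a guard of this kind, "count_symbols.py --format-output csv > out.csv"
would print the column titles once instead of failing at start-up.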


^ permalink raw reply	[relevance 6%]

* Re: [dpdk-dev] [PATCH v1 1/2] devtools: add relative path support for ABI compatibility check
  2021-06-01  1:56 17% ` [dpdk-dev] [PATCH v1 1/2] devtools: add " Feifei Wang
  2021-06-22  2:08  4%   ` [dpdk-dev] Re: " Feifei Wang
@ 2021-06-22  9:19  4%   ` Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Bruce Richardson @ 2021-06-22  9:19 UTC (permalink / raw)
  To: Feifei Wang; +Cc: dev, nd, Phil Yang, Juraj Linkeš, Ruifeng Wang

On Tue, Jun 01, 2021 at 09:56:52AM +0800, Feifei Wang wrote:
> From: Phil Yang <phil.yang@arm.com>
> 
> Because dpdk guide does not limit the relative path for ABI
> compatibility check, users maybe set 'DPDK_ABI_REF_DIR' as a relative
> path:
> 
> ~/dpdk/devtools$ DPDK_ABI_REF_VERSION=v19.11 DPDK_ABI_REF_DIR=build-gcc-shared
> ./test-meson-builds.sh
> 
> And if the DESTDIR is not an absolute path, ninja complains:
> + install_target build-gcc-shared/v19.11/build build-gcc-shared/v19.11/build-gcc-shared
> + rm -rf build-gcc-shared/v19.11/build-gcc-shared
> + echo 'DESTDIR=build-gcc-shared/v19.11/build-gcc-shared ninja -C build-gcc-shared/v19.11/build install'
> + DESTDIR=build-gcc-shared/v19.11/build-gcc-shared
> + ninja -C build-gcc-shared/v19.11/build install
> ...
> ValueError: dst_dir must be absolute, got build-gcc-shared/v19.11/build-gcc-shared/usr/local/share/dpdk/
> examples/bbdev_app
> ...
> Error: install directory 'build-gcc-shared/v19.11/build-gcc-shared' does not exist.
> 
> To fix this, add relative path support using 'readlink -f'.
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> Reviewed-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  devtools/test-meson-builds.sh | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/devtools/test-meson-builds.sh b/devtools/test-meson-builds.sh
> index daf817ac3e..43b906598d 100755
> --- a/devtools/test-meson-builds.sh
> +++ b/devtools/test-meson-builds.sh
> @@ -168,7 +168,8 @@ build () # <directory> <target cc | cross file> <ABI check> [meson options]
>  	config $srcdir $builds_dir/$targetdir $cross --werror $*
>  	compile $builds_dir/$targetdir
>  	if [ -n "$DPDK_ABI_REF_VERSION" -a "$abicheck" = ABI ] ; then
> -		abirefdir=${DPDK_ABI_REF_DIR:-reference}/$DPDK_ABI_REF_VERSION
> +		abirefdir=$(readlink -f \
> +			${DPDK_ABI_REF_DIR:-reference}/$DPDK_ABI_REF_VERSION)
>  		if [ ! -d $abirefdir/$targetdir ]; then
>  			# clone current sources
>  			if [ ! -d $abirefdir/src ]; then

This looks like a simple enough change.

Acked-by: Bruce Richardson <bruce.richardson@intel.com>
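
For readers more at home in Python than shell, the effect of the
'readlink -f' change can be sketched as follows. This is an illustration
only: abi_ref_dir() and its parameters are hypothetical names, and
Path.resolve() is slightly more lenient than 'readlink -f' about path
components that do not exist yet.

```python
from pathlib import Path


def abi_ref_dir(ref_dir, ref_version, default='reference'):
    '''Return the ABI reference directory as an absolute path.'''
    # Mirrors ${DPDK_ABI_REF_DIR:-reference}: fall back to the default
    # when the variable is unset or empty.
    base = ref_dir or default
    # resolve() anchors a relative path at the current directory and
    # follows symlinks, so the result is always absolute.
    return Path(base, ref_version).resolve()


if __name__ == '__main__':
    print(abi_ref_dir('build-gcc-shared', 'v19.11'))
```

The value later used as DESTDIR is derived from this directory, so making
it absolute here is what keeps the install step from rejecting it.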

^ permalink raw reply	[relevance 4%]

* [dpdk-dev] Re: [PATCH v1 1/2] devtools: add relative path support for ABI compatibility check
  2021-06-01  1:56 17% ` [dpdk-dev] [PATCH v1 1/2] devtools: add " Feifei Wang
@ 2021-06-22  2:08  4%   ` Feifei Wang
  2021-06-22  9:19  4%   ` [dpdk-dev] " Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Feifei Wang @ 2021-06-22  2:08 UTC (permalink / raw)
  To: Feifei Wang, Bruce Richardson
  Cc: dev, nd, Phil Yang, Juraj Linkeš, Ruifeng Wang, nd

Hi, Bruce

Would you please help review this patch series?
Thanks.

Best Regards
Feifei

> -----Original Message-----
> From: Feifei Wang <feifei.wang2@arm.com>
> Sent: June 1, 2021 9:57
> To: Bruce Richardson <bruce.richardson@intel.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Feifei Wang <Feifei.Wang2@arm.com>; Juraj Linkeš
> <juraj.linkes@pantheon.tech>; Ruifeng Wang <Ruifeng.Wang@arm.com>
> Subject: [PATCH v1 1/2] devtools: add relative path support for ABI
> compatibility check
> 
> From: Phil Yang <phil.yang@arm.com>
> 
> Because dpdk guide does not limit the relative path for ABI compatibility
> check, users maybe set 'DPDK_ABI_REF_DIR' as a relative
> path:
> 
> ~/dpdk/devtools$ DPDK_ABI_REF_VERSION=v19.11 DPDK_ABI_REF_DIR=build-gcc-shared
> ./test-meson-builds.sh
> 
> And if the DESTDIR is not an absolute path, ninja complains:
> + install_target build-gcc-shared/v19.11/build build-gcc-shared/v19.11/build-gcc-shared
> + rm -rf build-gcc-shared/v19.11/build-gcc-shared
> + echo 'DESTDIR=build-gcc-shared/v19.11/build-gcc-shared ninja -C build-gcc-shared/v19.11/build install'
> + DESTDIR=build-gcc-shared/v19.11/build-gcc-shared
> + ninja -C build-gcc-shared/v19.11/build install
> ...
> ValueError: dst_dir must be absolute, got build-gcc-shared/v19.11/build-gcc-shared/usr/local/share/dpdk/
> examples/bbdev_app
> ...
> Error: install directory 'build-gcc-shared/v19.11/build-gcc-shared' does not exist.
> 
> To fix this, add relative path support using 'readlink -f'.
> 
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
> Reviewed-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
> Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  devtools/test-meson-builds.sh | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/devtools/test-meson-builds.sh b/devtools/test-meson-builds.sh
> index daf817ac3e..43b906598d 100755
> --- a/devtools/test-meson-builds.sh
> +++ b/devtools/test-meson-builds.sh
> @@ -168,7 +168,8 @@ build () # <directory> <target cc | cross file> <ABI check> [meson options]
>  	config $srcdir $builds_dir/$targetdir $cross --werror $*
>  	compile $builds_dir/$targetdir
>  	if [ -n "$DPDK_ABI_REF_VERSION" -a "$abicheck" = ABI ] ; then
> -		abirefdir=${DPDK_ABI_REF_DIR:-reference}/$DPDK_ABI_REF_VERSION
> +		abirefdir=$(readlink -f \
> +			${DPDK_ABI_REF_DIR:-reference}/$DPDK_ABI_REF_VERSION)
>  		if [ ! -d $abirefdir/$targetdir ]; then
>  			# clone current sources
>  			if [ ! -d $abirefdir/src ]; then
> --
> 2.25.1


^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v4] devtools: script to track map symbols
  2021-06-18 16:36  5% [dpdk-dev] [PATCH] devtools: script to track map symbols Ray Kinsella
  2021-06-21 15:25  6% ` [dpdk-dev] [PATCH v3] " Ray Kinsella
@ 2021-06-21 15:35  6% ` Ray Kinsella
  2021-06-22 10:19  6% ` [dpdk-dev] [PATCH v5] " Ray Kinsella
  2 siblings, 0 replies; 200+ results
From: Ray Kinsella @ 2021-06-21 15:35 UTC (permalink / raw)
  To: dev; +Cc: stephen, ferruh.yigit, thomas, ktraynor, bruce.richardson, mdr

Script to track growth of stable and experimental symbols
over releases since v19.11.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
v2: reworked to fix pylint errors
v3: sent with the correct in-reply-to
v4: fix typos picked up by the CI

 devtools/count_symbols.py | 262 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 262 insertions(+)
 create mode 100755 devtools/count_symbols.py

diff --git a/devtools/count_symbols.py b/devtools/count_symbols.py
new file mode 100755
index 0000000000..6194df0318
--- /dev/null
+++ b/devtools/count_symbols.py
@@ -0,0 +1,262 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 Intel Corporation
+'''Tool to count the number of symbols in each DPDK release'''
+from pathlib import Path
+import sys
+import os
+import subprocess
+import argparse
+import re
+import datetime
+
+try:
+    from parsley import makeGrammar
+except ImportError:
+    print('This script uses the package Parsley to parse C Mapfiles.\n'
+          'This can be installed with \"pip install parsley".')
+    sys.exit()
+
+MAP_GRAMMAR = r"""
+
+ws = (' ' | '\r' | '\n' | '\t')*
+
+ABI_VER = ({})
+DPDK_VER = ('DPDK_' ABI_VER)
+ABI_NAME = ('INTERNAL' | 'EXPERIMENTAL' | DPDK_VER)
+comment = '#' (~'\n' anything)+ '\n'
+symbol = (~(';' | '}}' | '#') anything )+:c ';' -> ''.join(c)
+global = 'global:'
+local = 'local: *;'
+symbols = comment* symbol:s ws comment* -> s
+
+abi = (abi_section+):m -> dict(m)
+abi_section = (ws ABI_NAME:e ws '{{' ws global* (~local ws symbols)*:s ws local* ws '}}' ws DPDK_VER* ';' ws) -> (e,s)
+"""
+
+def get_abi_versions():
+    '''Returns a string of possible dpdk abi versions'''
+
+    year = datetime.date.today().year - 2000
+    tags = " |".join(['\'{}\''.format(i) \
+                     for i in reversed(range(21, year + 1)) ])
+    tags  = tags + ' | \'20.0.1\' | \'20.0\' | \'20\''
+
+    return tags
+
+def get_dpdk_releases():
+    '''Returns a list of dpdk release tags names  since v19.11'''
+
+    year = datetime.date.today().year - 2000
+    year_range = "|".join("{}".format(i) for i in range(19,year + 1))
+    pattern = re.compile(r'^\"v(' +  year_range + r')\.\d{2}\"$')
+
+    cmd = ['git', 'for-each-ref', '--sort=taggerdate', '--format', '"%(tag)"']
+    try:
+        result = subprocess.run(cmd, \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE,
+                                check=True)
+    except subprocess.CalledProcessError:
+        print("Failed to interrogate git for release tags")
+        sys.exit()
+
+    tags = result.stdout.decode('utf-8').split('\n')
+
+    # find the non-rcs between now and v19.11
+    tags = [ tag.replace('\"','') \
+             for tag in reversed(tags) \
+             if pattern.match(tag) ][:-3]
+
+    return tags
+
+def fix_directory_name(path):
+    '''Prepend librte to the source directory name'''
+    mapfilepath1 = str(path.parent.name)
+    mapfilepath2 = str(path.parents[1])
+    mapfilepath = mapfilepath2 + '/librte_' + mapfilepath1
+
+    return mapfilepath
+
+def directory_renamed(path, rel):
+    '''Fix removal of the librte_ from the directory names'''
+
+    mapfilepath = fix_directory_name(path)
+    tagfile = '{}:{}/{}'.format(rel, mapfilepath,  path.name)
+
+    try:
+        result = subprocess.run(['git', 'show', tagfile], \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE,
+                                check=True)
+    except subprocess.CalledProcessError:
+        result = None
+
+    return result
+
+def mapfile_renamed(path, rel):
+    '''Fix renaming of the map file'''
+    newfile = None
+
+    result = subprocess.run(['git', 'ls-tree', \
+                             rel, str(path.parent) + '/'], \
+                            stdout=subprocess.PIPE, \
+                            stderr=subprocess.PIPE,
+                            check=True)
+    dentries = result.stdout.decode('utf-8')
+    dentries = dentries.split('\n')
+
+    # filter entries looking for the map file
+    dentries = [dentry for dentry in dentries if dentry.endswith('.map')]
+    if len(dentries) > 1 or len(dentries) == 0:
+        return None
+
+    dparts = dentries[0].split('/')
+    newfile = dparts[len(dparts) - 1]
+
+    if newfile is not None:
+        tagfile = '{}:{}/{}'.format(rel, path.parent, newfile)
+
+        try:
+            result = subprocess.run(['git', 'show', tagfile], \
+                                    stdout=subprocess.PIPE, \
+                                    stderr=subprocess.PIPE,
+                                    check=True)
+        except subprocess.CalledProcessError:
+            result = None
+
+    else:
+        result = None
+
+    return result
+
+def mapfile_and_directory_renamed(path, rel):
+    '''Fix renaming of the map file & the source directory'''
+    mapfilepath = Path("{}/{}".format(fix_directory_name(path),path.name))
+
+    return mapfile_renamed(mapfilepath, rel)
+
+def get_terminal_rows():
+    '''Find the number of rows in the terminal'''
+
+    rows, _ = os.popen('stty size', 'r').read().split()
+    return int(rows)
+
+class FormatOutput():
+    '''Format the output to supported formats'''
+    output_fmt = ""
+    column_fmt = ""
+
+    def __init__(self, format_output, dpdk_releases):
+        self.OUTPUT_FORMATS[format_output](self,dpdk_releases)
+        self.column_titles = ['mapfile'] +  dpdk_releases
+
+        self.terminal_rows = get_terminal_rows()
+        self.row = 0
+
+    def set_terminal_output(self,dpdk_rel):
+        '''Set the output format to fixed-width terminal columns'''
+
+        self.output_fmt = '{:<50}' + \
+            ''.join(['{:<6}{:<6}'] * (len(dpdk_rel)))
+        self.column_fmt = '{:50}' + \
+            ''.join(['{:<12}'] * (len(dpdk_rel)))
+
+    def set_csv_output(self,dpdk_rel):
+        '''Set the output format to Comma Separated Values'''
+
+        self.output_fmt = '{},' + \
+            ','.join(['{},{}'] * (len(dpdk_rel)))
+        self.column_fmt = '{},' + \
+            ','.join(['{},'] * (len(dpdk_rel)))
+
+    def print_columns(self):
+        '''Print column rows with release names'''
+        print(self.column_fmt.format(*self.column_titles))
+        self.row += 1
+
+    def print_row(self,symbols):
+        '''Print row of symbol values'''
+        print(self.output_fmt.format(*symbols))
+        self.row += 1
+
+        if((self.terminal_rows>0) and ((self.row % self.terminal_rows) == 0)):
+            self.print_columns()
+
+    OUTPUT_FORMATS = { None: set_terminal_output, \
+                       'terminal': set_terminal_output, \
+                       'csv': set_csv_output }
+
+SRC_DIRECTORIES = 'drivers, lib'
+IGNORE_SECTIONS = ['EXPERIMENTAL','INTERNAL']
+FIX_STRATEGIES = [directory_renamed, \
+                  mapfile_renamed, \
+                  mapfile_and_directory_renamed]
+
+def count_release_symbols(map_parser, release, mapfile_path):
+    '''Count the symbols for a given release and mapfile'''
+    csym = [0] * 2
+    abi_sections = None
+
+    tagfile = '{}:{}'.format(release,mapfile_path)
+    try:
+        result = subprocess.run(['git', 'show', tagfile], \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE,
+                                check=True)
+    except subprocess.CalledProcessError:
+        result = None
+
+    for fix_strategy in FIX_STRATEGIES:
+        if result is not None:
+            break
+        result = fix_strategy(mapfile_path, release)
+
+    if result is not None:
+        mapfile = result.stdout.decode('utf-8')
+        abi_sections = map_parser(mapfile).abi()
+
+    if abi_sections is not None:
+        # which versions are present, and we care about
+        found_ver = [ver \
+                     for ver in abi_sections \
+                     if ver not in IGNORE_SECTIONS]
+
+        for ver in found_ver:
+            csym[0] += len(abi_sections[ver])
+
+        # count experimental symbols
+        if 'EXPERIMENTAL' in abi_sections:
+            csym[1] = len(abi_sections['EXPERIMENTAL'])
+
+    return csym
+
+def main():
+    '''Main entry point'''
+
+    parser = argparse.ArgumentParser(description='Count symbols in DPDK Libs')
+    parser.add_argument('--format-output', choices=['terminal','csv'], \
+                        default='terminal')
+    parser.add_argument('--directory', choices=SRC_DIRECTORIES,
+                        default=SRC_DIRECTORIES)
+    args = parser.parse_args()
+
+    dpdk_releases = get_dpdk_releases()
+    format_output = FormatOutput(args.format_output, dpdk_releases)
+
+    map_grammar = MAP_GRAMMAR.format(get_abi_versions())
+    map_parser = makeGrammar(map_grammar, {})
+
+    format_output.print_columns()
+    for src_dir in args.directory.split(','):
+        for path in Path(src_dir).rglob('*.map'):
+            relsym = [str(path)]
+
+            for release in dpdk_releases:
+                csym = count_release_symbols(map_parser, release, path)
+                relsym += csym
+
+            format_output.print_row(relsym)
+
+if __name__ == '__main__':
+    main()
-- 
2.26.2
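
A note on the directory arguments in this version: SRC_DIRECTORIES is the
string 'drivers, lib', with a space after the comma, and the same string
is both split on ',' in main() and handed to argparse as 'choices'. The
sketch below only illustrates the plain string semantics involved. The
helper split_directories() is a hypothetical name and is not part of the
patch.

```python
V4_SRC_DIRECTORIES = 'drivers, lib'   # value used in this version


def split_directories(value):
    '''Split a comma-separated directory list, ignoring stray spaces.'''
    return [part.strip() for part in value.split(',') if part.strip()]


if __name__ == '__main__':
    # A bare split keeps the leading space, so the second entry would
    # name a directory called ' lib' rather than 'lib'.
    print(V4_SRC_DIRECTORIES.split(','))
    # Membership against a str is a substring test, not a list lookup.
    print('li' in V4_SRC_DIRECTORIES)
    print('li' in split_directories(V4_SRC_DIRECTORIES))
```

Passing an explicit list as 'choices' and keeping the default free of
spaces avoids both surprises.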


^ permalink raw reply	[relevance 6%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-21 15:32  0%                         ` Ferruh Yigit
@ 2021-06-21 15:37  0%                           ` Ananyev, Konstantin
  0 siblings, 0 replies; 200+ results
From: Ananyev, Konstantin @ 2021-06-21 15:37 UTC (permalink / raw)
  To: Yigit, Ferruh, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil


> 
> On 6/21/2021 3:42 PM, Ananyev, Konstantin wrote:
> >
> >>>>>>>> One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
> >>>>>>>> 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
> >>>>>>>> We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
> >>>>>>>> any regressions.
> >>>>>>>> That could still be flat array with max_size specified at application startup.
> >>>>>>>> 2. Hide rest of rte_ethdev struct in .c.
> >>>>>>>> That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
> >>>>>>>> (flat array, vector, hash, linked list) without ABI/API breakages.
> >>>>>>>>
> >>>>>>>> Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
> >>>>>>>> (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
> >>>>>>>> Probably some macro can be provided to simplify it.
> >>>>>>>>
> >>>>>>>
> >>>>>>> We are already planning some tasks for ABI stability for v21.11, I think
> >>>>>>> splitting 'struct rte_eth_dev' can be part of that task, it enables hiding more
> >>>>>>> internal data.
> >>>>>>
> >>>>>> Ok, sounds good.
> >>>>>>
> >>>>>>>
> >>>>>>>> The only significant complication I can foresee with implementing that approach:
> >>>>>>>> we'll need an array of 'fast' function pointers per queue, not per device as we have now
> >>>>>>>> (to avoid extra indirection for callback implementation).
> >>>>>>>> Though as a bonus we'll have the ability to use different RX/TX functions per queue.
> >>>>>>>>
> >>>>>>>
> >>>>>>> What do you think about splitting the Rx/Tx callbacks into their own struct too?
> >>>>>>>
> >>>>>>> Overall 'rte_eth_dev' can be split into three as:
> >>>>>>> 1. rte_eth_dev
> >>>>>>> 2. rte_eth_dev_burst
> >>>>>>> 3. rte_eth_dev_cb
> >>>>>>>
> >>>>>>> And we can hide 1 from applications even with the inline functions.
> >>>>>>
> >>>>>> As discussed off-line, I think:
> >>>>>> it is possible.
> >>>>>> My absolute preference would be to have just 1/2 (with CB hidden).
> >>>>>
> >>>>> How can we hide the callbacks, since they are used by the inline burst functions?
> >>>>
> >>>> I probably owe a better explanation of what I meant in the first mail.
> >>>> Otherwise it sounds confusing.
> >>>> I'll try to write a more detailed one in the next few days.
> >>>
> >>> Actually I gave it another thought over the weekend, and maybe we can
> >>> hide rte_eth_dev_cb in an even simpler way. I'll use eth_rx_burst() as
> >>> an example, but the same principle applies to other 'fast' functions.
> >>>
> >>>  1. Needed changes for PMDs rx_pkt_burst():
> >>>     a) change function prototype to accept 'uint16_t port_id' and 'uint16_t queue_id',
> >>>          instead of current 'void *'.
> >>>     b) Each PMD rx_pkt_burst() will have to call rte_eth_rx_epilog() function at return.
> >>>          This  inline function will do all CB calls for that queue.
> >>>
> >>> To be more specific, let's say we have some PMD, xyz, with this RX function:
> >>>
> >>> uint16_t
> >>> xyz_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
> >>> {
> >>>      struct xyz_rx_queue *rxq = rx_queue;
> >>>      uint16_t nb_rx = 0;
> >>>
> >>>      /* do actual stuff here */
> >>>     ....
> >>>     return nb_rx;
> >>> }
> >>>
> >>> It will be transformed to:
> >>>
> >>> uint16_t
> >>> xyz_recv_pkts(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
> >>> {
> >>>          struct xyz_rx_queue *rxq;
> >>>          uint16_t nb_rx;
> >>>
> >>>          rxq = _rte_eth_rx_prolog(port_id, queue_id);
> >>>          if (rxq == NULL)
> >>>              return 0;
> >>>          nb_rx = _xyz_real_recv_pkts(rxq, rx_pkts, nb_pkts);
> >>>          return _rte_eth_rx_epilog(port_id, queue_id, rx_pkts, nb_pkts);
> >>> }
> >>>
> >>> And somewhere in ethdev_private.h:
> >>>
> >>> static inline void *
> >>> _rte_eth_rx_prolog(uint16_t port_id, uint16_t queue_id);
> >>> {
> >>>    struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> >>>
> >>> #ifdef RTE_ETHDEV_DEBUG_RX
> >>>         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> >>>         RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, NULL);
> >>>
> >>>         if (queue_id >= dev->data->nb_rx_queues) {
> >>>                 RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", queue_id);
> >>>                 return NULL;
> >>>         }
> >>> #endif
> >>>   return dev->data->rx_queues[queue_id];
> >>> }
> >>>
> >>> static inline uint16_t
> >>> _rte_eth_rx_epilog(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
> >>> {
> >>>     struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> >>>
> >>> #ifdef RTE_ETHDEV_RXTX_CALLBACKS
> >>>         struct rte_eth_rxtx_callback *cb;
> >>>
> >>>         /* __ATOMIC_RELEASE memory order was used when the
> >>>          * call back was inserted into the list.
> >>>          * Since there is a clear dependency between loading
> >>>          * cb and cb->fn/cb->next, __ATOMIC_ACQUIRE memory order is
> >>>          * not required.
> >>>          */
> >>>         cb = __atomic_load_n(&dev->post_rx_burst_cbs[queue_id],
> >>>                                 __ATOMIC_RELAXED);
> >>>
> >>>         if (unlikely(cb != NULL)) {
> >>>                 do {
> >>>                         nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
> >>>                                                 nb_pkts, cb->param);
> >>>                         cb = cb->next;
> >>>                 } while (cb != NULL);
> >>>         }
> >>> #endif
> >>>
> >>>         rte_ethdev_trace_rx_burst(port_id, queue_id, (void **)rx_pkts, nb_rx);
> >>>         return nb_rx;
> >>>  }
> >>>
> >>> Now, as you said above, in rte_ethdev.h we will keep only a flat array
> >>> with pointers to 'fast' functions:
> >>> struct {
> >>>      eth_rx_burst_t             rx_pkt_burst;
> >>>       eth_tx_burst_t             tx_pkt_burst;
> >>>       eth_tx_prep_t              tx_pkt_prepare;
> >>>      .....
> >>> } rte_eth_dev_burst[];
> >>>
> >>> And rte_eth_rx_burst() will look like:
> >>>
> >>> static inline uint16_t
> >>> rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
> >>>                  struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
> >>> {
> >>>     if (port_id >= RTE_MAX_ETHPORTS)
> >>>         return 0;
> >>>    return rte_eth_dev_burst[port_id].rx_pkt_burst(port_id, queue_id, rx_pkts, nb_pkts);
> >>> }
> >>>
> >>> Yes, it will require changes in *all* PMDs, but as I said before the changes will be mechanical ones.
> >>>
> >>
> >> I did not like the idea of pushing the responsibility for calling Rx/Tx callbacks to
> >> the drivers; I think it should be in the ethdev layer.
> >
> > Well, I'd say it is an ethdev layer function that has to be called by PMD 😊
> >
> >>
> >> What about making 'rte_eth_rx_epilog' an API and call from 'rte_eth_rx_burst()',
> >> which will add another function call for Rx/Tx callback but shouldn't affect the
> >> Rx/Tx burst.
> >
> > But then we either need to expose call-back information to the user or pay the penalty
> > for extra function call, correct?
> >
> 
> Right. As a middle ground, we can keep the Rx/Tx burst functions inline, but make
> the Rx/Tx callback part a real function, so we take the hit only for callbacks.

To avoid the hit we need to expose CB data to the user,
at least the number of callbacks currently installed for each queue.


^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v3] devtools: script to track map symbols
  2021-06-18 16:36  5% [dpdk-dev] [PATCH] devtools: script to track map symbols Ray Kinsella
@ 2021-06-21 15:25  6% ` Ray Kinsella
  2021-06-21 15:35  6% ` [dpdk-dev] [PATCH v4] " Ray Kinsella
  2021-06-22 10:19  6% ` [dpdk-dev] [PATCH v5] " Ray Kinsella
  2 siblings, 0 replies; 200+ results
From: Ray Kinsella @ 2021-06-21 15:25 UTC (permalink / raw)
  To: dev; +Cc: stephen, ferruh.yigit, thomas, ktraynor, bruce.richardson, mdr

Script to track growth of stable and experimental symbols
over releases since v19.11.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
v2: reworked to fix pylint errors
v3: sent with the correct in-reply-to
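
The release selection in get_dpdk_releases() can be sketched in isolation as
follows. This is a minimal sketch, not the code from the patch: the helper
name filter_release_tags and the sample tag list are hypothetical, and the
explicit (year, month) comparison against v19.11 is a substitution for the
'[:-3]' slice the script applies to the reversed tag list. It assumes the
tags arrive oldest first, as 'git for-each-ref --sort=taggerdate' would list
them.

```python
import re


def filter_release_tags(tags, current_year):
    """Return final (non-rc) release tags from v19.11 onwards, newest first.

    'tags' is assumed to be ordered oldest first; 'current_year' is the
    two-digit year, for example 21 for 2021.
    """
    pattern = re.compile(r'^v(\d{2})\.(\d{2})$')
    releases = []
    for tag in tags:
        match = pattern.match(tag)
        if match is None:
            # release candidates (v21.05-rc1) and stable point
            # releases (v20.11.1) do not match and are skipped
            continue
        year, month = int(match.group(1)), int(match.group(2))
        if (year, month) >= (19, 11) and year <= current_year:
            releases.append(tag)
    return list(reversed(releases))


# hypothetical tag list, oldest first
SAMPLE_TAGS = ['v19.08', 'v19.11-rc1', 'v19.11', 'v20.02', 'v20.11.1', 'v21.05']
print(filter_release_tags(SAMPLE_TAGS, 21))
```

Comparing the version numbers directly avoids depending on exactly three
pre-19.11 tags matching the year range, at the cost of hard-coding the
starting release.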

 devtools/count_symbols.py | 262 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 262 insertions(+)
 create mode 100755 devtools/count_symbols.py

diff --git a/devtools/count_symbols.py b/devtools/count_symbols.py
new file mode 100755
index 0000000000..30be09754f
--- /dev/null
+++ b/devtools/count_symbols.py
@@ -0,0 +1,262 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 Intel Corporation
+'''Tool to count the number of symbols in each DPDK release'''
+from pathlib import Path
+import sys
+import os
+import subprocess
+import argparse
+import re
+import datetime
+
+try:
+    from parsley import makeGrammar
+except ImportError:
+    print('This script uses the package Parsley to parse C Mapfiles.\n'
+          'This can be installed with \"pip install parsley".')
+    sys.exit(1)
+
+MAP_GRAMMAR = r"""
+
+ws = (' ' | '\r' | '\n' | '\t')*
+
+ABI_VER = ({})
+DPDK_VER = ('DPDK_' ABI_VER)
+ABI_NAME = ('INTERNAL' | 'EXPERIMENTAL' | DPDK_VER)
+comment = '#' (~'\n' anything)+ '\n'
+symbol = (~(';' | '}}' | '#') anything )+:c ';' -> ''.join(c)
+global = 'global:'
+local = 'local: *;'
+symbols = comment* symbol:s ws comment* -> s
+
+abi = (abi_section+):m -> dict(m)
+abi_section = (ws ABI_NAME:e ws '{{' ws global* (~local ws symbols)*:s ws local* ws '}}' ws DPDK_VER* ';' ws) -> (e,s)
+"""
+
+def get_abi_versions():
+    '''Returns a string of possible dpdk abi versions'''
+
+    year = datetime.date.today().year - 2000
+    tags = " |".join(['\'{}\''.format(i) \
+                     for i in reversed(range(21, year + 1)) ])
+    tags  = tags + ' | \'20.0.1\' | \'20.0\' | \'20\''
+
+    return tags
+
+def get_dpdk_releases():
+    '''Returns a list of dpdk release tag names since v19.11'''
+
+    year = datetime.date.today().year - 2000
+    year_range = "|".join("{}".format(i) for i in range(19,year + 1))
+    pattern = re.compile(r'^\"v(' +  year_range + r')\.\d{2}\"$')
+
+    cmd = ['git', 'for-each-ref', '--sort=taggerdate', '--format', '"%(tag)"']
+    try:
+        result = subprocess.run(cmd, \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE,
+                                check=True)
+    except subprocess.CalledProcessError:
+        print("Failed to interrogate git for release tags")
+        sys.exit(1)
+
+    tags = result.stdout.decode('utf-8').split('\n')
+
+    # find the non-rcs between now and v19.11
+    tags = [ tag.replace('\"','') \
+             for tag in reversed(tags) \
+             if pattern.match(tag) ][:-3]
+
+    return tags
+
+def fix_directory_name(path):
+    '''Prepend librte to the source directory name'''
+    mapfilepath1 = str(path.parent.name)
+    mapfilepath2 = str(path.parents[1])
+    mapfilepath = mapfilepath2 + '/librte_' + mapfilepath1
+
+    return mapfilepath
+
+def directory_renamed(path, rel):
+    '''Fix removal of the librte_ from the directory names'''
+
+    mapfilepath = fix_directory_name(path)
+    tagfile = '{}:{}/{}'.format(rel, mapfilepath,  path.name)
+
+    try:
+        result = subprocess.run(['git', 'show', tagfile], \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE,
+                                check=True)
+    except subprocess.CalledProcessError:
+        result = None
+
+    return result
+
+def mapfile_renamed(path, rel):
+    '''Fix renaming of map files'''
+    newfile = None
+
+    result = subprocess.run(['git', 'ls-tree', \
+                             rel, str(path.parent) + '/'], \
+                            stdout=subprocess.PIPE, \
+                            stderr=subprocess.PIPE,
+                            check=True)
+    dentries = result.stdout.decode('utf-8')
+    dentries = dentries.split('\n')
+
+    # filter entries looking for the map file
+    dentries = [dentry for dentry in dentries if dentry.endswith('.map')]
+    if len(dentries) > 1 or len(dentries) == 0:
+        return None
+
+    dparts = dentries[0].split('/')
+    newfile = dparts[len(dparts) - 1]
+
+    if newfile is not None:
+        tagfile = '{}:{}/{}'.format(rel, path.parent, newfile)
+
+        try:
+            result = subprocess.run(['git', 'show', tagfile], \
+                                    stdout=subprocess.PIPE, \
+                                    stderr=subprocess.PIPE,
+                                    check=True)
+        except subprocess.CalledProcessError:
+            result = None
+
+    else:
+        result = None
+
+    return result
+
+def mapfile_and_directory_renamed(path, rel):
+    '''Fix renaming of the map file & the source directory'''
+    mapfilepath = Path("{}/{}".format(fix_directory_name(path),path.name))
+
+    return mapfile_renamed(mapfilepath, rel)
+
+def get_terminal_rows():
+    '''Find the number of rows in the terminal'''
+
+    rows, _ = os.popen('stty size', 'r').read().split()
+    return int(rows)
+
+class FormatOutput():
+    '''Format the output to supported formats'''
+    output_fmt = ""
+    column_fmt = ""
+
+    def __init__(self, format_output, dpdk_releases):
+        self.OUTPUT_FORMATS[format_output](self,dpdk_releases)
+        self.column_titles = ['mapfile'] +  dpdk_releases
+
+        self.terminal_rows = get_terminal_rows()
+        self.row = 0
+
+    def set_terminal_output(self,dpdk_rel):
+        '''Set the output format to fixed-width columns for the terminal'''
+
+        self.output_fmt = '{:<50}' + \
+            ''.join(['{:<6}{:<6}'] * (len(dpdk_rel)))
+        self.column_fmt = '{:50}' + \
+            ''.join(['{:<12}'] * (len(dpdk_rel)))
+
+    def set_csv_output(self,dpdk_rel):
+        '''Set the output format to Comma Separated Values'''
+
+        self.output_fmt = '{},' + \
+            ','.join(['{},{}'] * (len(dpdk_rel)))
+        self.column_fmt = '{},' + \
+            ','.join(['{},'] * (len(dpdk_rel)))
+
+    def print_columns(self):
+        '''Print column rows with release names'''
+        print(self.column_fmt.format(*self.column_titles))
+        self.row += 1
+
+    def print_row(self,symbols):
+        '''Print row of symbol values'''
+        print(self.output_fmt.format(*symbols))
+        self.row += 1
+
+        if((self.terminal_rows>0) and ((self.row % self.terminal_rows) == 0)):
+            self.print_columns()
+
+    OUTPUT_FORMATS = { None: set_terminal_output, \
+                       'terminal': set_terminal_output, \
+                       'csv': set_csv_output }
+
+SRC_DIRECTORIES = 'drivers,lib'
+IGNORE_SECTIONS = ['EXPERIMENTAL','INTERNAL']
+FIX_STRATEGIES = [directory_renamed, \
+                  mapfile_renamed, \
+                  mapfile_and_directory_renamed]
+
+def count_release_symbols(map_parser, release, mapfile_path):
+    '''Count the symbols for a given release and mapfile'''
+    csym = [0] * 2
+    abi_sections = None
+
+    tagfile = '{}:{}'.format(release,mapfile_path)
+    try:
+        result = subprocess.run(['git', 'show', tagfile], \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE,
+                                check=True)
+    except subprocess.CalledProcessError:
+        result = None
+
+    for fix_strategy in FIX_STRATEGIES:
+        if result is not None:
+            break
+        result = fix_strategy(mapfile_path, release)
+
+    if result is not None:
+        mapfile = result.stdout.decode('utf-8')
+        abi_sections = map_parser(mapfile).abi()
+
+    if abi_sections is not None:
+        # which versions are present, and we care about
+        found_ver = [ver \
+                     for ver in abi_sections \
+                     if ver not in IGNORE_SECTIONS]
+
+        for ver in found_ver:
+            csym[0] += len(abi_sections[ver])
+
+        # count experimental symbols
+        if 'EXPERIMENTAL' in abi_sections:
+            csym[1] = len(abi_sections['EXPERIMENTAL'])
+
+    return csym
+
+def main():
+    '''Main entry point'''
+
+    parser = argparse.ArgumentParser(description='Count symbols in DPDK Libs')
+    parser.add_argument('--format-output', choices=['terminal','csv'], \
+                        default='terminal')
+    parser.add_argument('--directory', choices=SRC_DIRECTORIES,
+                        default=SRC_DIRECTORIES)
+    args = parser.parse_args()
+
+    dpdk_releases = get_dpdk_releases()
+    format_output = FormatOutput(args.format_output, dpdk_releases)
+
+    map_grammar = MAP_GRAMMAR.format(get_abi_versions())
+    map_parser = makeGrammar(map_grammar, {})
+
+    format_output.print_columns()
+    for src_dir in args.directory.split(','):
+        for path in Path(src_dir).rglob('*.map'):
+            relsym = [str(path)]
+
+            for release in dpdk_releases:
+                csym = count_release_symbols(map_parser, release, path)
+                relsym += csym
+
+            format_output.print_row(relsym)
+
+if __name__ == '__main__':
+    main()
-- 
2.26.2


^ permalink raw reply	[relevance 6%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-21 14:42  0%                       ` Ananyev, Konstantin
@ 2021-06-21 15:32  0%                         ` Ferruh Yigit
  2021-06-21 15:37  0%                           ` Ananyev, Konstantin
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2021-06-21 15:32 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil

On 6/21/2021 3:42 PM, Ananyev, Konstantin wrote:
> 
>>>>>>>> One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
>>>>>>>> 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
>>>>>>>> We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
>>>>>>>> any regressions.
>>>>>>>> That could still be flat array with max_size specified at application startup.
>>>>>>>> 2. Hide rest of rte_ethdev struct in .c.
>>>>>>>> That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
>>>>>>>> (flat array, vector, hash, linked list) without ABI/API breakages.
>>>>>>>>
>>>>>>>> Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
>>>>>>>> (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
>>>>>>>> Probably some macro can be provided to simplify it.
>>>>>>>>
>>>>>>>
>>>>>>> We are already planning some tasks for ABI stability for v21.11, I think
>>>>>>> splitting 'struct rte_eth_dev' can be part of that task, it enables hiding more
>>>>>>> internal data.
>>>>>>
>>>>>> Ok, sounds good.
>>>>>>
>>>>>>>
>>>>>>>> The only significant complication I can foresee with implementing that approach -
>>>>>>>> we'll need an array of 'fast' function pointers per queue, not per device as we have now
>>>>>>>> (to avoid extra indirection for callback implementation).
>>>>>>>> Though as a bonus we'll have the ability to use different RX/TX functions per queue.
>>>>>>>>
>>>>>>>
>>>>>>> What do you think split Rx/Tx callback into its own struct too?
>>>>>>>
>>>>>>> Overall 'rte_eth_dev' can be split into three as:
>>>>>>> 1. rte_eth_dev
>>>>>>> 2. rte_eth_dev_burst
>>>>>>> 3. rte_eth_dev_cb
>>>>>>>
>>>>>>> And we can hide 1 from applications even with the inline functions.
>>>>>>
>>>>>> As discussed off-line, I think:
>>>>>> it is possible.
>>>>>> My absolute preference would be to have just 1/2 (with CB hidden).
>>>>>
>>>>> How can we hide the callbacks since they are used by inline burst functions.
>>>>
>>>> I probably owe a better explanation of what I meant in the first mail.
>>>> Otherwise it sounds confusing.
>>>> I'll try to write a more detailed one in next few days.
>>>
>>> Actually I gave it another thought over weekend, and might be we can
>>> hide rte_eth_dev_cb even in a simpler way. I'd use eth_rx_burst() as
>>> an example, but the same principle applies to other 'fast' functions.
>>>
>>>  1. Needed changes for PMDs rx_pkt_burst():
>>>     a) change function prototype to accept 'uint16_t port_id' and 'uint16_t queue_id',
>>>          instead of current 'void *'.
>>>     b) Each PMD rx_pkt_burst() will have to call rte_eth_rx_epilog() function at return.
>>>          This  inline function will do all CB calls for that queue.
>>>
>>> To be more specific, let say we have some PMD: xyz with RX function:
>>>
>>> uint16_t
>>> xyz_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
>>> {
>>>      struct xyz_rx_queue *rxq = rx_queue;
>>>      uint16_t nb_rx = 0;
>>>
>>>      /* do actual stuff here */
>>>     ....
>>>     return nb_rx;
>>> }
>>>
>>> It will be transformed to:
>>>
>>> uint16_t
>>> xyz_recv_pkts(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
>>> {
>>>          struct xyz_rx_queue *rxq;
>>>          uint16_t nb_rx;
>>>
>>>          rxq = _rte_eth_rx_prolog(port_id, queue_id);
>>>          if (rxq == NULL)
>>>              return 0;
>>>          nb_rx = _xyz_real_recv_pkts(rxq, rx_pkts, nb_pkts);
>>>          return _rte_eth_rx_epilog(port_id, queue_id, rx_pkts, nb_pkts);
>>> }
>>>
>>> And somewhere in ethdev_private.h:
>>>
>>> static inline void *
>>> _rte_eth_rx_prolog(uint16_t port_id, uint16_t queue_id)
>>> {
>>>    struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>>>
>>> #ifdef RTE_ETHDEV_DEBUG_RX
>>>         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
>>>         RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, NULL);
>>>
>>>         if (queue_id >= dev->data->nb_rx_queues) {
>>>                 RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", queue_id);
>>>                 return NULL;
>>>         }
>>> #endif
>>>   return dev->data->rx_queues[queue_id];
>>> }
>>>
>>> static inline uint16_t
>>> _rte_eth_rx_epilog(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
>>> {
>>>     struct rte_eth_dev *dev = &rte_eth_devices[port_id];
>>>
>>> #ifdef RTE_ETHDEV_RXTX_CALLBACKS
>>>         struct rte_eth_rxtx_callback *cb;
>>>
>>>         /* __ATOMIC_RELEASE memory order was used when the
>>>          * call back was inserted into the list.
>>>          * Since there is a clear dependency between loading
>>>          * cb and cb->fn/cb->next, __ATOMIC_ACQUIRE memory order is
>>>          * not required.
>>>          */
>>>         cb = __atomic_load_n(&dev->post_rx_burst_cbs[queue_id],
>>>                                 __ATOMIC_RELAXED);
>>>
>>>         if (unlikely(cb != NULL)) {
>>>                 do {
>>>                         nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
>>>                                                 nb_pkts, cb->param);
>>>                         cb = cb->next;
>>>                 } while (cb != NULL);
>>>         }
>>> #endif
>>>
>>>         rte_ethdev_trace_rx_burst(port_id, queue_id, (void **)rx_pkts, nb_rx);
>>>         return nb_rx;
>>>  }
>>>
>>> Now, as you said above, in rte_ethdev.h we will keep only a flat array
>>> with pointers to 'fast' functions:
>>> struct {
>>>      eth_rx_burst_t             rx_pkt_burst;
>>>       eth_tx_burst_t             tx_pkt_burst;
>>>       eth_tx_prep_t              tx_pkt_prepare;
>>>      .....
>>> } rte_eth_dev_burst[];
>>>
>>> And rte_eth_rx_burst() will look like:
>>>
>>> static inline uint16_t
>>> rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
>>>                  struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
>>> {
>>>     if (port_id >= RTE_MAX_ETHPORTS)
>>>         return 0;
>>>    return rte_eth_dev_burst[port_id].rx_pkt_burst(port_id, queue_id, rx_pkts, nb_pkts);
>>> }
>>>
>>> Yes, it will require changes in *all* PMDs, but as I said before the changes will be mechanical ones.
>>>
>>
>> I did not like the idea of pushing the responsibility for calling Rx/Tx callbacks to
>> the drivers; I think it should be in the ethdev layer.
> 
> Well, I'd say it is an ethdev layer function that has to be called by PMD 😊
> 
>>
>> What about making 'rte_eth_rx_epilog' an API and call from 'rte_eth_rx_burst()',
>> which will add another function call for Rx/Tx callback but shouldn't affect the
>> Rx/Tx burst.
> 
> But then we either need to expose call-back information to the user or pay the penalty
> for extra function call, correct?
> 

Right. As a middle ground, we can keep the Rx/Tx burst functions inline, but make
the Rx/Tx callback part a real function, so we take the hit only for callbacks.


^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH] devtools: script to track map symbols
  @ 2021-06-21 15:11  5% ` Ray Kinsella
  0 siblings, 0 replies; 200+ results
From: Ray Kinsella @ 2021-06-21 15:11 UTC (permalink / raw)
  To: dev; +Cc: stephen, ferruh.yigit, thomas, ktraynor, bruce.richardson, mdr

Script to track growth of stable and experimental symbols
over releases since v19.11.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
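
The directory_renamed() fallback in this patch looks a map file up under its
older 'librte_'-prefixed directory when 'git show' cannot find it at the
current path. A minimal sketch of that path rewrite is given here; the helper
names legacy_map_path and candidate_git_objects are hypothetical, and the tag
and path in the usage line are only illustrative.

```python
from pathlib import PurePosixPath


def legacy_map_path(path):
    """Return where a map file lived before 'librte_' was dropped
    from its directory name, e.g. lib/eal -> lib/librte_eal."""
    path = PurePosixPath(path)
    legacy_dir = path.parents[1] / ('librte_' + path.parent.name)
    return str(legacy_dir / path.name)


def candidate_git_objects(release, path):
    """List '<tag>:<path>' object names to try with 'git show',
    current layout first, legacy layout second."""
    return ['{}:{}'.format(release, path),
            '{}:{}'.format(release, legacy_map_path(path))]


print(candidate_git_objects('v20.11', 'lib/eal/version.map'))
```

Building the candidates up front keeps the lookup order explicit; the patch
itself tries each strategy only after the previous one has failed.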
 devtools/count_symbols.py | 230 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 230 insertions(+)
 create mode 100755 devtools/count_symbols.py

diff --git a/devtools/count_symbols.py b/devtools/count_symbols.py
new file mode 100755
index 0000000000..7b29651044
--- /dev/null
+++ b/devtools/count_symbols.py
@@ -0,0 +1,230 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 Intel Corporation
+from pathlib import Path
+import sys, os
+import subprocess
+import argparse
+import re
+import datetime
+
+try:
+        from parsley import makeGrammar
+except ImportError:
+        print('This script uses the package Parsley to parse C Mapfiles.\n'
+              'This can be installed with \"pip install parsley".')
+        sys.exit(1)
+
+symbolMapGrammar = r"""
+
+ws = (' ' | '\r' | '\n' | '\t')*
+
+ABI_VER = ({})
+DPDK_VER = ('DPDK_' ABI_VER)
+ABI_NAME = ('INTERNAL' | 'EXPERIMENTAL' | DPDK_VER)
+comment = '#' (~'\n' anything)+ '\n'
+symbol = (~(';' | '}}' | '#') anything )+:c ';' -> ''.join(c)
+global = 'global:'
+local = 'local: *;'
+symbols = comment* symbol:s ws comment* -> s
+
+abi = (abi_section+):m -> dict(m)
+abi_section = (ws ABI_NAME:e ws '{{' ws global* (~local ws symbols)*:s ws local* ws '}}' ws DPDK_VER* ';' ws) -> (e,s)
+"""
+
+#abi_ver = ['21', '20.0.1', '20.0', '20']
+
+def get_abi_versions():
+    year = datetime.date.today().year - 2000
+    s=" |".join(['\'{}\''.format(i) for i in reversed(range(21, year + 1)) ])
+    s = s + ' | \'20.0.1\' | \'20.0\' | \'20\''
+
+    return s
+
+def get_dpdk_releases():
+    year = datetime.date.today().year - 2000
+    s="|".join("{}".format(i) for i in range(19,year + 1))
+    pattern = re.compile(r'^\"v(' + s + r')\.\d{2}\"$')
+
+    cmd = ['git', 'for-each-ref', '--sort=taggerdate', '--format', '"%(tag)"']
+    result = subprocess.run(cmd, \
+                            stdout=subprocess.PIPE, \
+                            stderr=subprocess.PIPE)
+    if result.stderr.startswith(b'fatal'):
+        sys.exit('Failed to interrogate git for release tags')
+
+    tags = result.stdout.decode('utf-8').split('\n')
+
+    # find the non-rcs between now and v19.11
+    tags = [ tag.replace('\"','') \
+             for tag in reversed(tags) \
+             if pattern.match(tag) ][:-3]
+
+    return tags
+
+
+def get_terminal_rows():
+    rows, _ = os.popen('stty size', 'r').read().split()
+    return int(rows)
+
+def fix_directory_name(path):
+    mapfilepath1 = str(path.parent.name)
+    mapfilepath2 = str(path.parents[1])
+    mapfilepath = mapfilepath2 + '/librte_' + mapfilepath1
+
+    return mapfilepath
+
+# fix removal of the librte_ from the directory names
+def directory_renamed(path, rel):
+    mapfilepath = fix_directory_name(path)
+    tagfile = '{}:{}/{}'.format(rel, mapfilepath,  path.name)
+
+    result = subprocess.run(['git', 'show', tagfile], \
+                            stdout=subprocess.PIPE, \
+                            stderr=subprocess.PIPE)
+    if result.stderr.startswith(b'fatal'):
+        result = None
+
+    return result
+
+# fix renaming of map files
+def mapfile_renamed(path, rel):
+    newfile = None
+
+    result = subprocess.run(['git', 'ls-tree', \
+                             rel, str(path.parent) + '/'], \
+                            stdout=subprocess.PIPE, \
+                            stderr=subprocess.PIPE)
+    dentries = result.stdout.decode('utf-8')
+    dentries = dentries.split('\n')
+
+    # filter entries looking for the map file
+    dentries = [dentry for dentry in dentries if dentry.endswith('.map')]
+    if len(dentries) > 1 or len(dentries) == 0:
+        return None
+
+    dparts = dentries[0].split('/')
+    newfile = dparts[len(dparts) - 1]
+
+    if(newfile is not None):
+        tagfile = '{}:{}/{}'.format(rel, path.parent, newfile)
+
+        result = subprocess.run(['git', 'show', tagfile], \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE)
+        if result.stderr.startswith(b'fatal'):
+            result = None
+
+    else:
+        result = None
+
+    return result
+
+# renaming of the map file & renaming of directory
+def mapfile_and_directory_renamed(path, rel):
+    mapfilepath = Path("{}/{}".format(fix_directory_name(path),path.name))
+
+    return mapfile_renamed(mapfilepath, rel)
+
+fix_strategies = [directory_renamed, \
+                  mapfile_renamed, \
+                  mapfile_and_directory_renamed]
+
+fmt = col_fmt = ""
+
+def set_terminal_output(dpdk_rel):
+    global fmt, col_fmt
+
+    fmt = '{:<50}'
+    col_fmt = fmt
+    for rel in dpdk_rel:
+        fmt += '{:<6}{:<6}'
+        col_fmt += '{:<12}'
+
+def set_csv_output(dpdk_rel):
+    global fmt, col_fmt
+
+    fmt = '{},'
+    col_fmt = fmt
+    for rel in dpdk_rel:
+        fmt += '{},{},'
+        col_fmt += '{},,'
+
+output_formats = { None: set_terminal_output, \
+                   'terminal': set_terminal_output, \
+                   'csv': set_csv_output }
+directories = 'drivers,lib'
+
+def main():
+    global fmt, col_fmt, symbolMapGrammar
+
+    parser = argparse.ArgumentParser(description='Count symbols in DPDK Libs')
+    parser.add_argument('--format-output', choices=['terminal','csv'], \
+                        default='terminal')
+    parser.add_argument('--directory', choices=directories,
+                        default=directories)
+    args = parser.parse_args()
+
+    dpdk_rel = get_dpdk_releases()
+
+    # set the output format
+    output_formats[args.format_output](dpdk_rel)
+
+    column_titles = ['mapfile'] + dpdk_rel
+    print(col_fmt.format(*column_titles))
+
+    symbolMapGrammar = symbolMapGrammar.format(get_abi_versions())
+    MAPParser = makeGrammar(symbolMapGrammar, {})
+
+    terminal_rows = get_terminal_rows()
+    row = 0
+
+    for src_dir in args.directory.split(','):
+        for path in Path(src_dir).rglob('*.map'):
+            csym = [0] * 2
+            relsym = [str(path)]
+
+            for rel in dpdk_rel:
+                i = csym[0] = csym[1] = 0
+                abi_sections = None
+
+                tagfile = '{}:{}'.format(rel,path)
+                result = subprocess.run(['git', 'show', tagfile], \
+                                        stdout=subprocess.PIPE, \
+                                        stderr=subprocess.PIPE)
+
+                if result.stderr.startswith(b'fatal'):
+                    result = None
+
+                while(result is None and i < len(fix_strategies)):
+                    result = fix_strategies[i](path, rel)
+                    i += 1
+
+                if result is not None:
+                    mapfile = result.stdout.decode('utf-8')
+                    abi_sections = MAPParser(mapfile).abi()
+
+                if abi_sections is not None:
+                    # which versions are present, and we care about
+                    ignore = ['EXPERIMENTAL','INTERNAL']
+                    found_ver = [ver \
+                                 for ver in abi_sections \
+                                 if ver not in ignore]
+
+                    for ver in found_ver:
+                        csym[0] += len(abi_sections[ver])
+
+                    # count experimental symbols
+                    if 'EXPERIMENTAL' in abi_sections:
+                        csym[1] = len(abi_sections['EXPERIMENTAL'])
+
+                relsym += csym
+
+            print(fmt.format(*relsym))
+            row += 1
+
+            if((terminal_rows>0) and ((row % terminal_rows) == 0)):
+                print(col_fmt.format(*column_titles))
+
+if __name__ == '__main__':
+        main()
-- 
2.26.2


^ permalink raw reply	[relevance 5%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-21 14:05  0%                     ` Ferruh Yigit
@ 2021-06-21 14:42  0%                       ` Ananyev, Konstantin
  2021-06-21 15:32  0%                         ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2021-06-21 14:42 UTC (permalink / raw)
  To: Yigit, Ferruh, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil


> >>>>>> One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
> >>>>>> 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
> >>>>>> We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
> >>>>>> any regressions.
> >>>>>> That could still be flat array with max_size specified at application startup.
> >>>>>> 2. Hide rest of rte_ethdev struct in .c.
> >>>>>> That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
> >>>>>> (flat array, vector, hash, linked list) without ABI/API breakages.
> >>>>>>
> >>>>>> Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
> >>>>>> (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
> >>>>>> Probably some macro can be provided to simplify it.
> >>>>>>
> >>>>>
> >>>>> We are already planning some tasks for ABI stability for v21.11, I think
> >>>>> splitting 'struct rte_eth_dev' can be part of that task, it enables hiding more
> >>>>> internal data.
> >>>>
> >>>> Ok, sounds good.
> >>>>
> >>>>>
> >>>>>> The only significant complication I can foresee with implementing that approach -
> >>>>>> we'll need an array of 'fast' function pointers per queue, not per device as we have now
> >>>>>> (to avoid extra indirection for callback implementation).
> >>>>>> Though as a bonus we'll have the ability to use different RX/TX functions per queue.
> >>>>>>
> >>>>>
> >>>>> What do you think split Rx/Tx callback into its own struct too?
> >>>>>
> >>>>> Overall 'rte_eth_dev' can be split into three as:
> >>>>> 1. rte_eth_dev
> >>>>> 2. rte_eth_dev_burst
> >>>>> 3. rte_eth_dev_cb
> >>>>>
> >>>>> And we can hide 1 from applications even with the inline functions.
> >>>>
> >>>> As discussed off-line, I think:
> >>>> it is possible.
> >>>> My absolute preference would be to have just 1/2 (with CB hidden).
> >>>
> >>> How can we hide the callbacks since they are used by inline burst functions.
> >>
> >> I probably owe a better explanation of what I meant in the first mail.
> >> Otherwise it sounds confusing.
> >> I'll try to write a more detailed one in next few days.
> >
> > Actually I gave it another thought over weekend, and might be we can
> > hide rte_eth_dev_cb even in a simpler way. I'd use eth_rx_burst() as
> > an example, but the same principle applies to other 'fast' functions.
> >
> >  1. Needed changes for PMDs rx_pkt_burst():
> >     a) change function prototype to accept 'uint16_t port_id' and 'uint16_t queue_id',
> >          instead of current 'void *'.
> >     b) Each PMD rx_pkt_burst() will have to call rte_eth_rx_epilog() function at return.
> >          This  inline function will do all CB calls for that queue.
> >
> > To be more specific, let say we have some PMD: xyz with RX function:
> >
> > uint16_t
> > xyz_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
> > {
> >      struct xyz_rx_queue *rxq = rx_queue;
> >      uint16_t nb_rx = 0;
> >
> >      /* do actual stuff here */
> >     ....
> >     return nb_rx;
> > }
> >
> > It will be transformed to:
> >
> > uint16_t
> > xyz_recv_pkts(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
> > {
> >          struct xyz_rx_queue *rxq;
> >          uint16_t nb_rx;
> >
> >          rxq = _rte_eth_rx_prolog(port_id, queue_id);
> >          if (rxq == NULL)
> >              return 0;
> >          nb_rx = _xyz_real_recv_pkts(rxq, rx_pkts, nb_pkts);
> >          return _rte_eth_rx_epilog(port_id, queue_id, rx_pkts, nb_pkts);
> > }
> >
> > And somewhere in ethdev_private.h:
> >
> > static inline void *
> > _rte_eth_rx_prolog(uint16_t port_id, uint16_t queue_id);
> > {
> >    struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> >
> > #ifdef RTE_ETHDEV_DEBUG_RX
> >         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
> >         RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, NULL);
> >
> >         if (queue_id >= dev->data->nb_rx_queues) {
> >                 RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", queue_id);
> >                 return NULL;
> >         }
> > #endif
> >   return dev->data->rx_queues[queue_id];
> > }
> >
> > static inline uint16_t
> > _rte_eth_rx_epilog(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, const uint16_t nb_pkts);
> > {
> >     struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> >
> > #ifdef RTE_ETHDEV_RXTX_CALLBACKS
> >         struct rte_eth_rxtx_callback *cb;
> >
> >         /* __ATOMIC_RELEASE memory order was used when the
> >          * call back was inserted into the list.
> >          * Since there is a clear dependency between loading
> >          * cb and cb->fn/cb->next, __ATOMIC_ACQUIRE memory order is
> >          * not required.
> >          */
> >         cb = __atomic_load_n(&dev->post_rx_burst_cbs[queue_id],
> >                                 __ATOMIC_RELAXED);
> >
> >         if (unlikely(cb != NULL)) {
> >                 do {
> >                         nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
> >                                                 nb_pkts, cb->param);
> >                         cb = cb->next;
> >                 } while (cb != NULL);
> >         }
> > #endif
> >
> >         rte_ethdev_trace_rx_burst(port_id, queue_id, (void **)rx_pkts, nb_rx);
> >         return nb_rx;
> >  }
> >
> > Now, as you said above, in rte_ethdev.h we will keep only a flat array
> > with pointers to 'fast' functions:
> > struct {
> >      eth_rx_burst_t             rx_pkt_burst
> >       eth_tx_burst_t             tx_pkt_burst;
> >       eth_tx_prep_t              tx_pkt_prepare;
> >      .....
> > } rte_eth_dev_burst[];
> >
> > And rte_eth_rx_burst() will look like:
> >
> > static inline uint16_t
> > rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
> >                  struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
> > {
> >     if (port_id >= RTE_MAX_ETHPORTS)
> >         return 0;
> >    return rte_eth_dev_burst[port_id](port_id, queue_id, rx_pkts, nb_pkts);
> > }
> >
> > Yes, it will require changes in *all* PMDs, but as I said before the changes will be a mechanic ones.
> >
> 
> I did not like the idea to push to calling Rx/TX callbacks responsibility to the
> drivers, I think it should be in the ethdev layer.

Well, I'd say it is an ethdev layer function that has to be called by PMD 😊

> 
> What about making 'rte_eth_rx_epilog' an API and call from 'rte_eth_rx_burst()',
> which will add another function call for Rx/Tx callback but shouldn't affect the
> Rx/Tx burst.

But then we either need to expose callback information to the user or pay the penalty
for an extra function call, correct?



^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-21 11:06  0%                   ` Ananyev, Konstantin
@ 2021-06-21 14:05  0%                     ` Ferruh Yigit
  2021-06-21 14:42  0%                       ` Ananyev, Konstantin
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2021-06-21 14:05 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil

On 6/21/2021 12:06 PM, Ananyev, Konstantin wrote:
> 
> Hi everyone,
> 
>>>>>> One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
>>>>>> 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
>>>>>> We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
>>>>>> any regressions.
>>>>>> That could still be flat array with max_size specified at application startup.
>>>>>> 2. Hide rest of rte_ethdev struct in .c.
>>>>>> That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
>>>>>> (flat array, vector, hash, linked list) without ABI/API breakages.
>>>>>>
>>>>>> Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
>>>>>> (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
>>>>>> Probably some macro can be provided to simplify it.
>>>>>>
>>>>>
>>>>> We are already planning some tasks for ABI stability for v21.11, I think
>>>>> splitting 'struct rte_eth_dev' can be part of that task, it enables hiding more
>>>>> internal data.
>>>>
>>>> Ok, sounds good.
>>>>
>>>>>
>>>>>> The only significant complication I can foresee with implementing that approach -
>>>>>> we'll need a an array of 'fast' function pointers per queue, not per device as we have now
>>>>>> (to avoid extra indirection for callback implementation).
>>>>>> Though as a bonus we'll have ability to use different RX/TX funcions per queue.
>>>>>>
>>>>>
>>>>> What do you think split Rx/Tx callback into its own struct too?
>>>>>
>>>>> Overall 'rte_eth_dev' can be split into three as:
>>>>> 1. rte_eth_dev
>>>>> 2. rte_eth_dev_burst
>>>>> 3. rte_eth_dev_cb
>>>>>
>>>>> And we can hide 1 from applications even with the inline functions.
>>>>
>>>> As discussed off-line, I think:
>>>> it is possible.
>>>> My absolute preference would be to have just 1/2 (with CB hidden).
>>>
>>> How can we hide the callbacks since they are used by inline burst functions.
>>
>> I probably I owe a better explanation to what I meant in first mail.
>> Otherwise it sounds confusing.
>> I'll try to write a more detailed one in next few days.
> 
> Actually I gave it another thought over weekend, and might be we can
> hide rte_eth_dev_cb even in a simpler way. I'd use eth_rx_burst() as
> an example, but the same principle applies to other 'fast' functions.
> 
>  1. Needed changes for PMDs rx_pkt_burst():
>     a) change function prototype to accept 'uint16_t port_id' and 'uint16_t queue_id',
>          instead of current 'void *'.
>     b) Each PMD rx_pkt_burst() will have to call rte_eth_rx_epilog() function at return.
>          This  inline function will do all CB calls for that queue.
> 
> To be more specific, let say we have some PMD: xyz with RX function:
> 
> uint16_t
> xyz_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
> {
>      struct xyz_rx_queue *rxq = rx_queue;
>      uint16_t nb_rx = 0;
> 
>      /* do actual stuff here */
>     ....
>     return nb_rx;
> }
> 
> It will be transformed to:
> 
> uint16_t
> xyz_recv_pkts(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
> {
>          struct xyz_rx_queue *rxq;
>          uint16_t nb_rx;
> 
>          rxq = _rte_eth_rx_prolog(port_id, queue_id);
>          if (rxq == NULL)
>              return 0;
>          nb_rx = _xyz_real_recv_pkts(rxq, rx_pkts, nb_pkts);
>          return _rte_eth_rx_epilog(port_id, queue_id, rx_pkts, nb_pkts);
> }
> 
> And somewhere in ethdev_private.h:
> 
> static inline void *
> _rte_eth_rx_prolog(uint16_t port_id, uint16_t queue_id);
> {
>    struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> 
> #ifdef RTE_ETHDEV_DEBUG_RX
>         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
>         RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, NULL);
> 
>         if (queue_id >= dev->data->nb_rx_queues) {
>                 RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", queue_id);
>                 return NULL;
>         }
> #endif
>   return dev->data->rx_queues[queue_id];
> }
> 
> static inline uint16_t
> _rte_eth_rx_epilog(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, const uint16_t nb_pkts);
> {
>     struct rte_eth_dev *dev = &rte_eth_devices[port_id];
> 
> #ifdef RTE_ETHDEV_RXTX_CALLBACKS
>         struct rte_eth_rxtx_callback *cb;
> 
>         /* __ATOMIC_RELEASE memory order was used when the
>          * call back was inserted into the list.
>          * Since there is a clear dependency between loading
>          * cb and cb->fn/cb->next, __ATOMIC_ACQUIRE memory order is
>          * not required.
>          */
>         cb = __atomic_load_n(&dev->post_rx_burst_cbs[queue_id],
>                                 __ATOMIC_RELAXED);
> 
>         if (unlikely(cb != NULL)) {
>                 do {
>                         nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
>                                                 nb_pkts, cb->param);
>                         cb = cb->next;
>                 } while (cb != NULL);
>         }
> #endif
> 
>         rte_ethdev_trace_rx_burst(port_id, queue_id, (void **)rx_pkts, nb_rx);
>         return nb_rx;
>  }
> 
> Now, as you said above, in rte_ethdev.h we will keep only a flat array
> with pointers to 'fast' functions:
> struct {
>      eth_rx_burst_t             rx_pkt_burst
>       eth_tx_burst_t             tx_pkt_burst;
>       eth_tx_prep_t              tx_pkt_prepare;
>      .....
> } rte_eth_dev_burst[];
> 
> And rte_eth_rx_burst() will look like:
> 
> static inline uint16_t
> rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
>                  struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
> {
>     if (port_id >= RTE_MAX_ETHPORTS)
>         return 0;
>    return rte_eth_dev_burst[port_id](port_id, queue_id, rx_pkts, nb_pkts);
> }
> 
> Yes, it will require changes in *all* PMDs, but as I said before the changes will be a mechanic ones.
> 

I do not like the idea of pushing the responsibility for calling Rx/Tx callbacks to the
drivers; I think it should stay in the ethdev layer.

What about making 'rte_eth_rx_epilog' an API and calling it from 'rte_eth_rx_burst()'?
That will add another function call for the Rx/Tx callbacks, but it shouldn't affect the
Rx/Tx burst itself.
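
To make the alternative concrete, here is a minimal standalone sketch of the
layering I have in mind. It is not DPDK code: 'struct pkt', 'dev_burst[]',
'eth_rx_epilog()' and 'fake_recv()' are hypothetical stand-ins, and the epilog
body is reduced to a counter so the extra call is visible. The point is only
that the driver stays unaware of callbacks, while the inline wrapper pays one
out-of-line call per burst.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 4

struct pkt;	/* stand-in for struct rte_mbuf */

typedef uint16_t (*rx_burst_fn)(uint16_t port_id, uint16_t queue_id,
				struct pkt **pkts, uint16_t nb_pkts);

/* Public fast-path table: the only per-port data the application sees. */
static struct {
	rx_burst_fn rx_pkt_burst;
} dev_burst[MAX_PORTS];

/* Counts epilog invocations, only to make the per-burst cost observable. */
static unsigned int epilog_calls;

/*
 * Would live in a .c file of the ethdev layer, so the callback lists
 * stay hidden. A real version would walk the post-Rx callbacks here.
 */
uint16_t eth_rx_epilog(uint16_t port_id, uint16_t queue_id,
		       struct pkt **pkts, uint16_t nb_rx, uint16_t nb_pkts);

uint16_t
eth_rx_epilog(uint16_t port_id, uint16_t queue_id,
	      struct pkt **pkts, uint16_t nb_rx, uint16_t nb_pkts)
{
	(void)port_id;
	(void)queue_id;
	(void)pkts;
	(void)nb_pkts;
	epilog_calls++;
	return nb_rx;
}

/* Inline wrapper: driver burst first, then the ethdev-layer epilog. */
static inline uint16_t
eth_rx_burst(uint16_t port_id, uint16_t queue_id,
	     struct pkt **pkts, uint16_t nb_pkts)
{
	uint16_t nb_rx;

	if (port_id >= MAX_PORTS || dev_burst[port_id].rx_pkt_burst == NULL)
		return 0;
	nb_rx = dev_burst[port_id].rx_pkt_burst(port_id, queue_id,
						pkts, nb_pkts);
	return eth_rx_epilog(port_id, queue_id, pkts, nb_rx, nb_pkts);
}

/* Toy driver burst function: pretends every requested packet arrived. */
static uint16_t
fake_recv(uint16_t port_id, uint16_t queue_id,
	  struct pkt **pkts, uint16_t nb_pkts)
{
	(void)port_id;
	(void)queue_id;
	(void)pkts;
	return nb_pkts;
}
```

Note that the epilog is entered on every burst, even when no callback is
installed for the queue; that is the cost of keeping the callback data private.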

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-18 10:41  0%                 ` Ananyev, Konstantin
  2021-06-18 10:49  0%                   ` Ferruh Yigit
@ 2021-06-21 11:06  0%                   ` Ananyev, Konstantin
  2021-06-21 14:05  0%                     ` Ferruh Yigit
  1 sibling, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2021-06-21 11:06 UTC (permalink / raw)
  To: Yigit, Ferruh, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil


Hi everyone,
 
> > >>> One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
> > >>> 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
> > >>> We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
> > >>> any regressions.
> > >>> That could still be flat array with max_size specified at application startup.
> > >>> 2. Hide rest of rte_ethdev struct in .c.
> > >>> That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
> > >>> (flat array, vector, hash, linked list) without ABI/API breakages.
> > >>>
> > >>> Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
> > >>> (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
> > >>> Probably some macro can be provided to simplify it.
> > >>>
> > >>
> > >> We are already planning some tasks for ABI stability for v21.11, I think
> > >> splitting 'struct rte_eth_dev' can be part of that task, it enables hiding more
> > >> internal data.
> > >
> > > Ok, sounds good.
> > >
> > >>
> > >>> The only significant complication I can foresee with implementing that approach -
> > >>> we'll need a an array of 'fast' function pointers per queue, not per device as we have now
> > >>> (to avoid extra indirection for callback implementation).
> > >>> Though as a bonus we'll have ability to use different RX/TX funcions per queue.
> > >>>
> > >>
> > >> What do you think split Rx/Tx callback into its own struct too?
> > >>
> > >> Overall 'rte_eth_dev' can be split into three as:
> > >> 1. rte_eth_dev
> > >> 2. rte_eth_dev_burst
> > >> 3. rte_eth_dev_cb
> > >>
> > >> And we can hide 1 from applications even with the inline functions.
> > >
> > > As discussed off-line, I think:
> > > it is possible.
> > > My absolute preference would be to have just 1/2 (with CB hidden).
> >
> > How can we hide the callbacks since they are used by inline burst functions.
> 
> I probably I owe a better explanation to what I meant in first mail.
> Otherwise it sounds confusing.
> I'll try to write a more detailed one in next few days.

Actually, I gave it another thought over the weekend, and maybe we can
hide rte_eth_dev_cb in an even simpler way. I'll use eth_rx_burst() as
an example, but the same principle applies to the other 'fast' functions.

 1. Needed changes for PMDs rx_pkt_burst():
    a) change function prototype to accept 'uint16_t port_id' and 'uint16_t queue_id',
         instead of current 'void *'.
    b) Each PMD rx_pkt_burst() will have to call the rte_eth_rx_epilog() function on return.
         This inline function will do all CB calls for that queue.

To be more specific, let's say we have some PMD, xyz, with this RX function:

uint16_t
xyz_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
     struct xyz_rx_queue *rxq = rx_queue;
     uint16_t nb_rx = 0;

     /* do actual stuff here */
    ....
    return nb_rx; 
}

It will be transformed to:

uint16_t
xyz_recv_pkts(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
         struct xyz_rx_queue *rxq;
         uint16_t nb_rx;

         rxq = _rte_eth_rx_prolog(port_id, queue_id);
         if (rxq == NULL)
             return 0;
         nb_rx = _xyz_real_recv_pkts(rxq, rx_pkts, nb_pkts);
         return _rte_eth_rx_epilog(port_id, queue_id, rx_pkts, nb_rx, nb_pkts);
}

And somewhere in ethdev_private.h:

static inline void *
_rte_eth_rx_prolog(uint16_t port_id, uint16_t queue_id)
{
   struct rte_eth_dev *dev = &rte_eth_devices[port_id];

#ifdef RTE_ETHDEV_DEBUG_RX
        RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, NULL);
        RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, NULL);

        if (queue_id >= dev->data->nb_rx_queues) {
                RTE_ETHDEV_LOG(ERR, "Invalid RX queue_id=%u\n", queue_id);
                return NULL;
        }
#endif
  return dev->data->rx_queues[queue_id];
}

static inline uint16_t
_rte_eth_rx_epilog(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts,
                   uint16_t nb_rx, const uint16_t nb_pkts)
{
    struct rte_eth_dev *dev = &rte_eth_devices[port_id];
 
#ifdef RTE_ETHDEV_RXTX_CALLBACKS
        struct rte_eth_rxtx_callback *cb;

        /* __ATOMIC_RELEASE memory order was used when the
         * call back was inserted into the list.
         * Since there is a clear dependency between loading
         * cb and cb->fn/cb->next, __ATOMIC_ACQUIRE memory order is
         * not required.
         */
        cb = __atomic_load_n(&dev->post_rx_burst_cbs[queue_id],
                                __ATOMIC_RELAXED);

        if (unlikely(cb != NULL)) {
                do {
                        nb_rx = cb->fn.rx(port_id, queue_id, rx_pkts, nb_rx,
                                                nb_pkts, cb->param);
                        cb = cb->next;
                } while (cb != NULL);
        }
#endif

        rte_ethdev_trace_rx_burst(port_id, queue_id, (void **)rx_pkts, nb_rx);
        return nb_rx;
 }

Now, as you said above, in rte_ethdev.h we will keep only a flat array
with pointers to 'fast' functions:
struct {
      eth_rx_burst_t             rx_pkt_burst;
      eth_tx_burst_t             tx_pkt_burst;
      eth_tx_prep_t              tx_pkt_prepare;
      .....
} rte_eth_dev_burst[];

And rte_eth_rx_burst() will look like:

static inline uint16_t
rte_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
                 struct rte_mbuf **rx_pkts, const uint16_t nb_pkts)
{
    if (port_id >= RTE_MAX_ETHPORTS)
        return 0;
   return rte_eth_dev_burst[port_id].rx_pkt_burst(port_id, queue_id, rx_pkts, nb_pkts);
}

Yes, it will require changes in *all* PMDs, but as I said before, the changes will be mechanical ones.
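
For anyone who wants to play with the idea outside of DPDK, the whole flow can
be modelled in one standalone file. This is only a sketch under loud
assumptions: every name below ('struct pkt', 'rx_prolog', 'rx_epilog',
'dev_burst', 'drop_last_cb', the 'xyz' toy driver) is a hypothetical stand-in,
the atomics, debug checks and tracing are left out, and the table sizes are
arbitrary. It shows the driver entry point taking port_id/queue_id, the
callback list staying private to the ethdev side, and the public wrapper
touching only the flat table of function pointers.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS  4
#define MAX_QUEUES 4

struct pkt;	/* stand-in for struct rte_mbuf */

typedef uint16_t (*rx_cb_fn)(uint16_t port_id, uint16_t queue_id,
			     struct pkt **pkts, uint16_t nb_rx,
			     uint16_t max_pkts, void *param);

struct rx_cb {	/* stand-in for struct rte_eth_rxtx_callback */
	struct rx_cb *next;
	rx_cb_fn fn;
	void *param;
};

/* Hidden part: queue data and callback lists, private to ethdev and PMDs. */
static void *rx_queues[MAX_PORTS][MAX_QUEUES];
static struct rx_cb *post_rx_cbs[MAX_PORTS][MAX_QUEUES];

/* Translate (port_id, queue_id) into the driver's queue pointer. */
static inline void *
rx_prolog(uint16_t port_id, uint16_t queue_id)
{
	if (port_id >= MAX_PORTS || queue_id >= MAX_QUEUES)
		return NULL;
	return rx_queues[port_id][queue_id];
}

/* Run every post-Rx callback registered for this queue. */
static inline uint16_t
rx_epilog(uint16_t port_id, uint16_t queue_id, struct pkt **pkts,
	  uint16_t nb_rx, uint16_t nb_pkts)
{
	struct rx_cb *cb;

	for (cb = post_rx_cbs[port_id][queue_id]; cb != NULL; cb = cb->next)
		nb_rx = cb->fn(port_id, queue_id, pkts, nb_rx, nb_pkts,
			       cb->param);
	return nb_rx;
}

/* Toy driver: the queue just counts how many packets are waiting. */
struct xyz_rxq {
	uint16_t avail;
};

static uint16_t
xyz_real_recv(struct xyz_rxq *rxq, struct pkt **pkts, uint16_t nb_pkts)
{
	uint16_t n = rxq->avail < nb_pkts ? rxq->avail : nb_pkts;

	(void)pkts;
	rxq->avail = (uint16_t)(rxq->avail - n);
	return n;
}

/* Driver entry point with the new prototype. */
static uint16_t
xyz_recv_pkts(uint16_t port_id, uint16_t queue_id, struct pkt **pkts,
	      uint16_t nb_pkts)
{
	struct xyz_rxq *rxq = rx_prolog(port_id, queue_id);
	uint16_t nb_rx;

	if (rxq == NULL)
		return 0;
	nb_rx = xyz_real_recv(rxq, pkts, nb_pkts);
	return rx_epilog(port_id, queue_id, pkts, nb_rx, nb_pkts);
}

/* Public part: flat table of 'fast' function pointers, one per port. */
typedef uint16_t (*rx_burst_t)(uint16_t port_id, uint16_t queue_id,
			       struct pkt **pkts, uint16_t nb_pkts);

static struct {
	rx_burst_t rx_pkt_burst;
} dev_burst[MAX_PORTS];

static inline uint16_t
eth_rx_burst(uint16_t port_id, uint16_t queue_id, struct pkt **pkts,
	     uint16_t nb_pkts)
{
	if (port_id >= MAX_PORTS || dev_burst[port_id].rx_pkt_burst == NULL)
		return 0;
	return dev_burst[port_id].rx_pkt_burst(port_id, queue_id, pkts,
					       nb_pkts);
}

/* Sample callback: reports one packet fewer than it was given. */
static uint16_t
drop_last_cb(uint16_t port_id, uint16_t queue_id, struct pkt **pkts,
	     uint16_t nb_rx, uint16_t max_pkts, void *param)
{
	(void)port_id;
	(void)queue_id;
	(void)pkts;
	(void)max_pkts;
	(void)param;
	return nb_rx > 0 ? (uint16_t)(nb_rx - 1) : 0;
}
```

The application side only ever sees dev_burst[] and eth_rx_burst(); the layout
of rx_queues[] and post_rx_cbs[] can change without the caller noticing.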

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [RFC PATCH v3 0/3] Add PIE support for HQoS library
  2021-06-15  9:01  3%   ` [dpdk-dev] [RFC PATCH v2 " Liguzinski, WojciechX
@ 2021-06-21  7:35  3%     ` Liguzinski, WojciechX
  2021-07-05  8:04  3%       ` [dpdk-dev] [RFC PATCH v4 " Liguzinski, WojciechX
  0 siblings, 1 reply; 200+ results
From: Liguzinski, WojciechX @ 2021-06-21  7:35 UTC (permalink / raw)
  To: dev, jasvinder.singh, cristian.dumitrescu; +Cc: savinay.dharmappa, megha.ajmera

The DPDK sched library is equipped with a mechanism that protects it from the bufferbloat
problem, a situation in which excess buffering in the network causes high latency and
latency variation. Currently, it supports RED for active queue management, which is
designed to control the queue length but does not control latency directly and is now
being obsoleted. However, more advanced queue management is required to address this
problem and provide the desired quality of service to users.

This RFC proposes using a new algorithm called "PIE" (Proportional Integral
controller Enhanced), which can effectively and directly control queuing latency to
address the bufferbloat problem.
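
As a rough illustration of how PIE controls latency directly, the core of the
algorithm is a periodic update of the drop probability from the measured
queuing delay. The sketch below follows the basic update rule described in
RFC 8033 and leaves out its auto-tuning and burst allowance. The names
'pie_state', 'pie_params' and 'pie_update' are hypothetical and are not the
rte_pie.h API added by this series.

```c
/* Hypothetical names; not the rte_pie.h API from this patch set. */
struct pie_state {
	double drop_prob;	/* current drop probability, in [0, 1] */
	double qdelay_old;	/* queuing delay at the previous update, seconds */
};

struct pie_params {
	double target;	/* latency target, seconds */
	double alpha;	/* weight of the deviation from the target */
	double beta;	/* weight of the delay trend since the last update */
};

/*
 * Periodic update: the probability grows while the delay is above the
 * target or rising, and shrinks while it is below the target or falling.
 */
static double
pie_update(struct pie_state *s, const struct pie_params *p, double qdelay)
{
	double prob = s->drop_prob +
		      p->alpha * (qdelay - p->target) +
		      p->beta * (qdelay - s->qdelay_old);

	/* Keep the result a valid probability. */
	if (prob < 0.0)
		prob = 0.0;
	if (prob > 1.0)
		prob = 1.0;

	s->drop_prob = prob;
	s->qdelay_old = qdelay;
	return prob;
}
```

Unlike RED, nothing here depends on the queue length; the controller reacts
only to how long packets are actually waiting.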

The implementation modifies existing data structures in the library, adds a new set
of data structures, and adds PIE-related APIs.
This affects structures in the public API/ABI, so a deprecation notice is going
to be prepared and sent.

Liguzinski, WojciechX (3):
  sched: add PIE based congestion management
  example/qos_sched: add PIE support
  example/ip_pipeline: add PIE support

 config/rte_config.h                      |   1 -
 drivers/net/softnic/rte_eth_softnic_tm.c |   6 +-
 examples/ip_pipeline/tmgr.c              |   6 +-
 examples/qos_sched/app_thread.c          |   1 -
 examples/qos_sched/cfg_file.c            |  82 ++++-
 examples/qos_sched/init.c                |   7 +-
 examples/qos_sched/profile.cfg           | 196 ++++++++----
 lib/sched/meson.build                    |  10 +-
 lib/sched/rte_pie.c                      |  78 +++++
 lib/sched/rte_pie.h                      | 388 +++++++++++++++++++++++
 lib/sched/rte_sched.c                    | 229 +++++++++----
 lib/sched/rte_sched.h                    |  53 +++-
 12 files changed, 876 insertions(+), 181 deletions(-)
 create mode 100644 lib/sched/rte_pie.c
 create mode 100644 lib/sched/rte_pie.h

-- 
2.17.1


^ permalink raw reply	[relevance 3%]

* [dpdk-dev] [PATCH v2 0/6] Enable the internal EAL thread API
    2021-06-18 21:54  4% ` [dpdk-dev] [PATCH 2/6] eal: add function for control thread creation Narcisa Ana Maria Vasile
@ 2021-06-19  1:57  4% ` Narcisa Ana Maria Vasile
  2021-06-19  1:57  4%   ` [dpdk-dev] [PATCH v2 2/6] eal: add function for control thread creation Narcisa Ana Maria Vasile
  1 sibling, 1 reply; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-19  1:57 UTC (permalink / raw)
  To: dev, thomas, dmitry.kozliuk, khot, navasile, dmitrym, roretzla,
	talshn, ocardona
  Cc: bruce.richardson, david.marchand, pallavi.kadam

From: Narcisa Vasile <navasile@microsoft.com>

This patchset enables the new EAL thread API.
The newly defined thread attributes, priority and affinity,
are used in eal/windows when creating the threads. Similarly,
some changes have been made in eal/linux/eal.c and eal/freebsd/eal.c
to initialize the priority to a default value and set thread attributes.

The user is offered the option of either using the rte_thread_* API or
a 3rd party thread library, through a meson flag
called "use_external_thread_lib".
By default, this flag is set to FALSE, which means Windows libraries
and applications will use the EAL rte_thread_* API 
defined in windows/rte_thread.c for managing threads.
When the flag is set to TRUE, the common/rte_thread.c file is compiled
and an external thread library is used.

This patchset adds a new function for creating control threads that
uses the new thread API.
It enables the usage of the new function in Windows code and common code.
The old function is kept to avoid an ABI break. However, its definition
is commented out on Windows, since the pthread_t and pthread_attr_t
arguments that it receives have been replaced with the new API on Windows.
This allows testing the "eal: Add EAL API for threading" series that this
patchset depends on.

The ethdev lib also contains some changes that break the ABI.
Enabling the new EAL thread API will probably require going through
the proper process of ABI changes.

Depends-on: series-17402 ("eal: Add EAL API for threading")

v2:
- fix typo in SetThreadDescription_type function pointer
- add Depends-on on all patches to fix apply errors.
- modify cover letter

Narcisa Vasile (6):
  eal: add function that sets thread name
  eal: add function for control thread creation
  Enable the new EAL thread API in app, drivers and examples
  lib: enable the new EAL thread API
  eal: set affinity and priority attributes
  Allow choice between internal EAL thread API and external lib

 app/test/process.h                            |   8 +-
 app/test/test_lcores.c                        |  18 +-
 app/test/test_link_bonding.c                  |  14 +-
 app/test/test_lpm_perf.c                      |  12 +-
 config/meson.build                            |   1 -
 drivers/bus/dpaa/base/qbman/bman_driver.c     |   5 +-
 drivers/bus/dpaa/base/qbman/dpaa_sys.c        |  14 +-
 drivers/bus/dpaa/base/qbman/process.c         |   6 +-
 drivers/bus/dpaa/dpaa_bus.c                   |  14 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c      |  19 +-
 drivers/common/dpaax/compat.h                 |   2 +-
 drivers/common/mlx5/windows/mlx5_common_os.h  |   1 +
 drivers/compress/mlx5/mlx5_compress.c         |  10 +-
 drivers/event/dlb2/dlb2.c                     |   2 +-
 drivers/event/dlb2/pf/base/dlb2_osdep.h       |   7 +-
 drivers/mempool/dpaa/dpaa_mempool.c           |   2 +-
 drivers/net/af_xdp/rte_eth_af_xdp.c           |  18 +-
 drivers/net/ark/ark_ethdev.c                  |   4 +-
 drivers/net/ark/ark_pktgen.c                  |   4 +-
 drivers/net/atlantic/atl_ethdev.c             |   4 +-
 drivers/net/atlantic/atl_types.h              |   4 +-
 .../net/atlantic/hw_atl/hw_atl_utils_fw2x.c   |  26 +--
 drivers/net/axgbe/axgbe_common.h              |   2 +-
 drivers/net/axgbe/axgbe_dev.c                 |   8 +-
 drivers/net/axgbe/axgbe_ethdev.c              |   8 +-
 drivers/net/axgbe/axgbe_ethdev.h              |   8 +-
 drivers/net/axgbe/axgbe_i2c.c                 |   4 +-
 drivers/net/axgbe/axgbe_mdio.c                |   8 +-
 drivers/net/axgbe/axgbe_phy_impl.c            |   6 +-
 drivers/net/bnxt/bnxt.h                       |  16 +-
 drivers/net/bnxt/bnxt_cpr.c                   |   4 +-
 drivers/net/bnxt/bnxt_ethdev.c                |  54 ++---
 drivers/net/bnxt/bnxt_irq.c                   |   8 +-
 drivers/net/bnxt/bnxt_reps.c                  |  10 +-
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c            |  34 ++--
 drivers/net/bnxt/tf_ulp/bnxt_ulp.h            |   4 +-
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.c          |  28 +--
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.h          |   2 +-
 drivers/net/dpaa/dpaa_ethdev.c                |   2 +-
 drivers/net/dpaa/dpaa_rxtx.c                  |   2 +-
 drivers/net/ena/base/ena_plat_dpdk.h          |  15 +-
 drivers/net/enic/enic.h                       |   2 +-
 drivers/net/ice/ice_dcf_parent.c              |   8 +-
 drivers/net/ixgbe/ixgbe_ethdev.c              |   6 +-
 drivers/net/ixgbe/ixgbe_ethdev.h              |   2 +-
 drivers/net/mlx5/linux/mlx5_os.c              |   2 +-
 drivers/net/mlx5/mlx5.c                       |  20 +-
 drivers/net/mlx5/mlx5.h                       |   2 +-
 drivers/net/mlx5/mlx5_txpp.c                  |   8 +-
 drivers/net/mlx5/windows/mlx5_flow_os.c       |  10 +-
 drivers/net/mlx5/windows/mlx5_os.c            |   2 +-
 drivers/net/qede/base/bcm_osal.h              |   8 +-
 drivers/net/vhost/rte_eth_vhost.c             |  24 +--
 .../net/virtio/virtio_user/virtio_user_dev.c  |  30 +--
 .../net/virtio/virtio_user/virtio_user_dev.h  |   2 +-
 drivers/vdpa/ifc/ifcvf_vdpa.c                 |  49 +++--
 drivers/vdpa/mlx5/mlx5_vdpa.c                 |  24 +--
 drivers/vdpa/mlx5/mlx5_vdpa.h                 |   4 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c           |  51 ++---
 examples/kni/main.c                           |   1 +
 .../pthread_shim/pthread_shim.h               |   1 +
 lib/eal/common/eal_common_options.c           |   6 +-
 lib/eal/common/eal_common_thread.c            | 105 +++++++++-
 lib/eal/common/eal_common_trace.c             |   1 +
 lib/eal/common/eal_private.h                  |   2 +-
 lib/eal/common/eal_thread.h                   |   6 +
 lib/eal/common/malloc_mp.c                    |   2 +
 lib/eal/common/rte_thread.c                   |  17 ++
 lib/eal/freebsd/eal.c                         |  53 +++--
 lib/eal/freebsd/eal_alarm.c                   |  12 +-
 lib/eal/freebsd/eal_interrupts.c              |   6 +-
 lib/eal/freebsd/eal_thread.c                  |  10 +-
 lib/eal/include/rte_lcore.h                   |   6 +
 lib/eal/include/rte_per_lcore.h               |   2 +-
 lib/eal/include/rte_thread.h                  |  45 ++++
 lib/eal/linux/eal.c                           |  55 +++--
 lib/eal/linux/eal_alarm.c                     |  10 +-
 lib/eal/linux/eal_interrupts.c                |   8 +-
 lib/eal/linux/eal_thread.c                    |  11 +-
 lib/eal/linux/eal_timer.c                     |   6 +-
 lib/eal/version.map                           |   6 +-
 lib/eal/windows/eal.c                         |  44 +++-
 lib/eal/windows/eal_interrupts.c              |  10 +-
 lib/eal/windows/eal_thread.c                  |  35 +---
 lib/eal/windows/eal_windows.h                 |  10 -
 lib/eal/windows/include/pthread.h             | 192 ------------------
 lib/eal/windows/include/rte_windows.h         |   1 +
 lib/eal/windows/meson.build                   |   7 +-
 lib/eal/windows/rte_thread.c                  |  60 ++++++
 lib/ethdev/rte_ethdev.c                       |   4 +-
 lib/ethdev/rte_ethdev_core.h                  |   4 +-
 lib/ethdev/rte_flow.c                         |   4 +-
 lib/eventdev/rte_event_eth_rx_adapter.c       |   1 +
 lib/vhost/vhost.c                             |   1 +
 meson_options.txt                             |   2 +
 95 files changed, 764 insertions(+), 654 deletions(-)
 delete mode 100644 lib/eal/windows/include/pthread.h

-- 
2.31.0.vfs.0.1


^ permalink raw reply	[relevance 4%]

* [dpdk-dev] [PATCH v2 2/6] eal: add function for control thread creation
  2021-06-19  1:57  4% ` [dpdk-dev] [PATCH v2 0/6] Enable the internal EAL thread API Narcisa Ana Maria Vasile
@ 2021-06-19  1:57  4%   ` Narcisa Ana Maria Vasile
  0 siblings, 0 replies; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-19  1:57 UTC (permalink / raw)
  To: dev, thomas, dmitry.kozliuk, khot, navasile, dmitrym, roretzla,
	talshn, ocardona
  Cc: bruce.richardson, david.marchand, pallavi.kadam

From: Narcisa Vasile <navasile@microsoft.com>

The existing rte_ctrl_thread_create() function will be replaced
with rte_thread_ctrl_thread_create() that uses the internal
EAL thread API.

This patch only introduces the new control thread creation
function. Replacing the old function needs to be done according
to the ABI change procedures, to avoid an ABI break.

Depends-on: series-17402 ("eal: Add EAL API for threading")

Signed-off-by: Narcisa Vasile <navasile@microsoft.com>
---
 lib/eal/common/eal_common_thread.c | 81 ++++++++++++++++++++++++++++++
 lib/eal/include/rte_thread.h       | 27 ++++++++++
 lib/eal/version.map                |  1 +
 3 files changed, 109 insertions(+)

diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
index 1a52f42a2b..79545c67d9 100644
--- a/lib/eal/common/eal_common_thread.c
+++ b/lib/eal/common/eal_common_thread.c
@@ -259,6 +259,87 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
 	return -ret;
 }
 
+struct rte_thread_ctrl_ctx {
+	rte_thread_func start_routine;
+	void *arg;
+	const char *name;
+};
+
+static void *ctrl_thread_wrapper(void *arg)
+{
+	struct internal_config *conf = eal_get_internal_configuration();
+	rte_cpuset_t *cpuset = &conf->ctrl_cpuset;
+	struct rte_thread_ctrl_ctx *ctx = arg;
+	rte_thread_func start_routine = ctx->start_routine;
+	void *routine_arg = ctx->arg;
+
+	__rte_thread_init(rte_lcore_id(), cpuset);
+
+	if (ctx->name != NULL) {
+		if (rte_thread_name_set(rte_thread_self(), ctx->name) < 0)
+			RTE_LOG(DEBUG, EAL, "Cannot set name for ctrl thread\n");
+	}
+
+	free(arg);
+
+	return start_routine(routine_arg);
+}
+
+int
+rte_thread_ctrl_thread_create(rte_thread_t *thread, const char *name,
+		rte_thread_func start_routine, void *arg)
+{
+	int ret;
+	rte_thread_attr_t attr;
+	struct internal_config *conf = eal_get_internal_configuration();
+	rte_cpuset_t *cpuset = &conf->ctrl_cpuset;
+	struct rte_thread_ctrl_ctx *ctx = NULL;
+
+	if (start_routine == NULL) {
+		ret = EINVAL;
+		goto cleanup;
+	}
+
+	ctx = malloc(sizeof(*ctx));
+	if (ctx == NULL) {
+		ret = ENOMEM;
+		goto cleanup;
+	}
+
+	ctx->start_routine = start_routine;
+	ctx->arg = arg;
+	ctx->name = name;
+
+	ret = rte_thread_attr_init(&attr);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "Cannot init ctrl thread attributes\n");
+		goto cleanup;
+	}
+
+	ret = rte_thread_attr_set_affinity(&attr, cpuset);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "Cannot set affinity attribute for ctrl thread\n");
+		goto cleanup;
+	}
+	ret = rte_thread_attr_set_priority(&attr, RTE_THREAD_PRIORITY_NORMAL);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "Cannot set priority attribute for ctrl thread\n");
+		goto cleanup;
+	}
+
+	ret = rte_thread_create(thread, &attr, ctrl_thread_wrapper, ctx);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "Cannot create ctrl thread\n");
+		goto cleanup;
+	}
+
+	return 0;
+
+cleanup:
+	free(ctx);
+	return ret;
+}
+
 int
 rte_thread_register(void)
 {
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index c65cfd8c9e..4da800ae27 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -457,6 +457,33 @@ int rte_thread_barrier_destroy(rte_thread_barrier *barrier);
 __rte_experimental
 int rte_thread_name_set(rte_thread_t thread_id, const char *name);
 
+/**
+ * Create a control thread.
+ *
+ * Set affinity and thread name. The affinity of the new thread is based
+ * on the CPU affinity retrieved at the time rte_eal_init() was called,
+ * the dataplane and service lcores are then excluded.
+ *
+ * @param thread
+ *   Filled with the thread id of the newly created thread.
+ *
+ * @param name
+ *   The name of the control thread (max 16 characters including '\0').
+ *
+ * @param start_routine
+ *   Function to be executed by the new thread.
+ *
+ * @param arg
+ *   Argument passed to start_routine.
+ *
+ * @return
+ *   On success, return 0;
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_ctrl_thread_create(rte_thread_t *thread, const char *name,
+		rte_thread_func start_routine, void *arg);
+
 /**
  * Create a TLS data key visible to all threads in the process.
  * the created key is later used to get/set a value.
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 2a566c04af..02455a1c8d 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -444,6 +444,7 @@ EXPERIMENTAL {
 	rte_thread_barrier_wait;
 	rte_thread_barrier_destroy;
 	rte_thread_name_set;
+	rte_thread_ctrl_thread_create;
 };
 
 INTERNAL {
-- 
2.31.0.vfs.0.1



* [dpdk-dev] [PATCH 2/6] eal: add function for control thread creation
  @ 2021-06-18 21:54  4% ` Narcisa Ana Maria Vasile
  2021-06-19  1:57  4% ` [dpdk-dev] [PATCH v2 0/6] Enable the internal EAL thread API Narcisa Ana Maria Vasile
  1 sibling, 0 replies; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-18 21:54 UTC (permalink / raw)
  To: dev, thomas, dmitry.kozliuk, khot, navasile, dmitrym, roretzla,
	talshn, ocardona
  Cc: bruce.richardson, david.marchand, pallavi.kadam

From: Narcisa Vasile <navasile@microsoft.com>

The existing rte_ctrl_thread_create() function will be replaced
with rte_thread_ctrl_thread_create() that uses the internal
EAL thread API.

This patch only introduces the new control thread creation
function. Replacing of the old function needs to be done according
to the ABI change procedures, to avoid an ABI break.

Signed-off-by: Narcisa Vasile <navasile@microsoft.com>
---
 lib/eal/common/eal_common_thread.c | 81 ++++++++++++++++++++++++++++++
 lib/eal/include/rte_thread.h       | 27 ++++++++++
 lib/eal/version.map                |  1 +
 3 files changed, 109 insertions(+)

diff --git a/lib/eal/common/eal_common_thread.c b/lib/eal/common/eal_common_thread.c
index 1a52f42a2b..79545c67d9 100644
--- a/lib/eal/common/eal_common_thread.c
+++ b/lib/eal/common/eal_common_thread.c
@@ -259,6 +259,87 @@ rte_ctrl_thread_create(pthread_t *thread, const char *name,
 	return -ret;
 }
 
+struct rte_thread_ctrl_ctx {
+	rte_thread_func start_routine;
+	void *arg;
+	const char *name;
+};
+
+static void *ctrl_thread_wrapper(void *arg)
+{
+	struct internal_config *conf = eal_get_internal_configuration();
+	rte_cpuset_t *cpuset = &conf->ctrl_cpuset;
+	struct rte_thread_ctrl_ctx *ctx = arg;
+	rte_thread_func start_routine = ctx->start_routine;
+	void *routine_arg = ctx->arg;
+
+	__rte_thread_init(rte_lcore_id(), cpuset);
+
+	if (ctx->name != NULL) {
+		if (rte_thread_name_set(rte_thread_self(), ctx->name) < 0)
+			RTE_LOG(DEBUG, EAL, "Cannot set name for ctrl thread\n");
+	}
+
+	free(arg);
+
+	return start_routine(routine_arg);
+}
+
+int
+rte_thread_ctrl_thread_create(rte_thread_t *thread, const char *name,
+		rte_thread_func start_routine, void *arg)
+{
+	int ret;
+	rte_thread_attr_t attr;
+	struct internal_config *conf = eal_get_internal_configuration();
+	rte_cpuset_t *cpuset = &conf->ctrl_cpuset;
+	struct rte_thread_ctrl_ctx *ctx = NULL;
+
+	if (start_routine == NULL) {
+		ret = EINVAL;
+		goto cleanup;
+	}
+
+	ctx = malloc(sizeof(*ctx));
+	if (ctx == NULL) {
+		ret = ENOMEM;
+		goto cleanup;
+	}
+
+	ctx->start_routine = start_routine;
+	ctx->arg = arg;
+	ctx->name = name;
+
+	ret = rte_thread_attr_init(&attr);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "Cannot init ctrl thread attributes\n");
+		goto cleanup;
+	}
+
+	ret = rte_thread_attr_set_affinity(&attr, cpuset);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "Cannot set affinity attribute for ctrl thread\n");
+		goto cleanup;
+	}
+	ret = rte_thread_attr_set_priority(&attr, RTE_THREAD_PRIORITY_NORMAL);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "Cannot set priority attribute for ctrl thread\n");
+		goto cleanup;
+	}
+
+	ret = rte_thread_create(thread, &attr, ctrl_thread_wrapper, ctx);
+	if (ret != 0) {
+		RTE_LOG(DEBUG, EAL, "Cannot create ctrl thread\n");
+		goto cleanup;
+	}
+
+	return 0;
+
+cleanup:
+	free(ctx);
+	return ret;
+}
+
 int
 rte_thread_register(void)
 {
diff --git a/lib/eal/include/rte_thread.h b/lib/eal/include/rte_thread.h
index c65cfd8c9e..4da800ae27 100644
--- a/lib/eal/include/rte_thread.h
+++ b/lib/eal/include/rte_thread.h
@@ -457,6 +457,33 @@ int rte_thread_barrier_destroy(rte_thread_barrier *barrier);
 __rte_experimental
 int rte_thread_name_set(rte_thread_t thread_id, const char *name);
 
+/**
+ * Create a control thread.
+ *
+ * Set affinity and thread name. The affinity of the new thread is based
+ * on the CPU affinity retrieved at the time rte_eal_init() was called,
+ * the dataplane and service lcores are then excluded.
+ *
+ * @param thread
+ *   Filled with the thread id of the new created thread.
+ *
+ * @param name
+ *   The name of the control thread (max 16 characters including '\0').
+ *
+ * @param start_routine
+ *   Function to be executed by the new thread.
+ *
+ * @param arg
+ *   Argument passed to start_routine.
+ *
+ * @return
+ *   On success, return 0;
+ *   On failure, return a positive errno-style error number.
+ */
+__rte_experimental
+int rte_thread_ctrl_thread_create(rte_thread_t *thread, const char *name,
+		rte_thread_func start_routine, void *arg);
+
 /**
  * Create a TLS data key visible to all threads in the process.
  * the created key is later used to get/set a value.
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 2a566c04af..02455a1c8d 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -444,6 +444,7 @@ EXPERIMENTAL {
 	rte_thread_barrier_wait;
 	rte_thread_barrier_destroy;
 	rte_thread_name_set;
+	rte_thread_ctrl_thread_create;
 };
 
 INTERNAL {
-- 
2.31.0.vfs.0.1



* Re: [dpdk-dev] [PATCH v9 10/10] Enable the new EAL thread API
  2021-06-08  7:45  5%           ` David Marchand
@ 2021-06-18 21:53  0%             ` Narcisa Ana Maria Vasile
  0 siblings, 0 replies; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-18 21:53 UTC (permalink / raw)
  To: David Marchand
  Cc: dev, Thomas Monjalon, Dmitry Kozlyuk, Khoa To, navasile,
	Dmitry Malloy (MESHCHANINOV),
	roretzla, Tal Shnaiderman, Omar Cardona, Bruce Richardson,
	Pallavi Kadam

On Tue, Jun 08, 2021 at 09:45:44AM +0200, David Marchand wrote:
> On Tue, Jun 8, 2021 at 7:50 AM Narcisa Ana Maria Vasile
> <navasile@linux.microsoft.com> wrote:
> >
> > On Fri, Jun 04, 2021 at 04:44:34PM -0700, Narcisa Ana Maria Vasile wrote:
> > > From: Narcisa Vasile <navasile@microsoft.com>
> > >
> > > Rename pthread_* occurrences with the new rte_thread_* API.
> > > Enable the new API in the build system.
> > >
> > > Signed-off-by: Narcisa Vasile <navasile@microsoft.com>
> > > ---
> >
> > I'll send v10.
> > Can someone please help with an example on how to check for ABI breaks? Thank you!
> >
> > I've run:
> > DPDK_ABI_REF_VERSION=v21.05 DPDK_ABI_REF_DIR=~/ref ./devtools/test-meson-builds.sh
> > which doesn't give any warnings about the ABI break.
> 
> This should work the way you tried if you have working toolchains and
> libabigail installed.
> Something is off in your env.
> 
> Side note: ovsrobot is out these days (we have some trouble in one of
> RH labs and it happens ovsrobot is hosted there), but you could try
> with a github repo of yours + GHA, and the ABI failure should be
> caught too.
> 
> 
> I just tried on my rhel7 (gcc 4.8.5 + libabigail 1.8.2) with your
> series applied.
> $ DPDK_ABI_REF_VERSION=v21.05
> DPDK_ABI_REF_DIR=~/git/pub/dpdk.org/reference
> ./devtools/test-meson-builds.sh
> ...
> Error: ABI issue reported for 'abidiff --suppr
> /home/dmarchan/git/pub/dpdk.org/devtools/../devtools/libabigail.abignore
> --no-added-syms --headers-dir1
> /home/dmarchan/git/pub/dpdk.org/reference/v21.05/build-gcc-shared/usr/local/include
> --headers-dir2 /home/dmarchan/git/pub/dpdk.org/build-gcc-shared/install/usr/local/include
> /home/dmarchan/git/pub/dpdk.org/reference/v21.05/build-gcc-shared/dump/librte_eal.dump
> /home/dmarchan/git/pub/dpdk.org/build-gcc-shared/install/dump/librte_eal.dump'
> ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged
> this as a potential issue).
> 
> 
> $ abidiff --suppr
> /home/dmarchan/git/pub/dpdk.org/devtools/../devtools/libabigail.abignore
> --no-added-syms --headers-dir1
> /home/dmarchan/git/pub/dpdk.org/reference/v21.05/build-gcc-shared/usr/local/include
> --headers-dir2 /home/dmarchan/git/pub/dpdk.org/build-gcc-shared/install/usr/local/include
> /home/dmarchan/git/pub/dpdk.org/reference/v21.05/build-gcc-shared/dump/librte_eal.dump
> /home/dmarchan/git/pub/dpdk.org/build-gcc-shared/install/dump/librte_eal.dump
> Functions changes summary: 0 Removed, 2 Changed (1 filtered out), 0
> Added (20 filtered out) functions
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> 
> 2 functions with some indirect sub-type change:
> 
>   [C] 'function int rte_ctrl_thread_create(pthread_t*, const char*,
> const pthread_attr_t*, void* (void*)*, void*)' at rte_lcore.h:443:1
> has some indirect sub-type changes:
>     parameter 1 of type 'pthread_t*' changed:
>       in pointed to type 'typedef pthread_t' at rte_thread.h:42:1:
>         typedef name changed from pthread_t to rte_thread_t at rte_thread.h:42:1
>         underlying type 'unsigned long int' changed:
>           entity changed from 'unsigned long int' to 'struct
> rte_thread_tag' at rte_thread.h:40:1
>           type size hasn't changed
>     parameter 3 of type 'const pthread_attr_t*' changed:
>       in pointed to type 'const pthread_attr_t':
>         'const pthread_attr_t' changed to 'const rte_thread_attr_t'
> 
>   [C] 'function int rte_thread_setname(pthread_t, const char*)' at
> rte_lcore.h:377:1 has some indirect sub-type changes:
>     parameter 1 of type 'typedef pthread_t' changed:
>       typedef name changed from pthread_t to rte_thread_t at rte_thread.h:42:1
>       underlying type 'unsigned long int' changed:
>         entity changed from 'unsigned long int' to 'struct
> rte_thread_tag' at rte_thread.h:40:1
>         type size hasn't changed
> 
> 
> 
> Can you check that in your env build-gcc-shared/ and the build
> directory for references are configured with debug symbols?
> You should see:
> $ meson configure build-gcc-shared | awk '$1=="buildtype" {print $2}'
> debugoptimized
> $ meson configure reference/v21.05/build | awk '$1=="buildtype" {print $2}'
> debugoptimized
> 
> 
Thank you very much David! There was something wrong with my local reference.
Using your commands, I am able to run the tools now.
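
As a rough cross-check, the awk filter David suggested can be sketched in
Python. This is only an illustration: the helper name `meson_option_value`
and the sample text are made up here, and the sample is shaped merely to
match what the awk expression expects (option name in the first field,
value in the second).

```python
def meson_option_value(configure_output, option):
    """Return the second whitespace-separated field of the first line whose
    first field equals `option`, or None if no such line exists.

    Mirrors: awk '$1=="<option>" {print $2}'
    """
    for line in configure_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == option:
            return fields[1]
    return None


# Hypothetical excerpt, not real `meson configure` output.
sample = """\
  Core options    Current Value
  buildtype       debugoptimized
  debug           true
"""

if __name__ == "__main__":
    value = meson_option_value(sample, "buildtype")
    # Both the reference build and the new build should report this value,
    # otherwise abidiff has no debug information to compare.
    print("buildtype:", value)
```

Running the same helper over the output for both build directories makes a
mismatch (for example a reference built without debug symbols) easy to spot.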
> 


* [dpdk-dev] [PATCH v10 0/9] eal: Add EAL API for threading
  2021-06-04 23:44  2%     ` [dpdk-dev] [PATCH v9 " Narcisa Ana Maria Vasile
    @ 2021-06-18 21:26  3%       ` Narcisa Ana Maria Vasile
  2 siblings, 0 replies; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-18 21:26 UTC (permalink / raw)
  To: dev, thomas, dmitry.kozliuk, khot, navasile, dmitrym, roretzla,
	talshn, ocardona
  Cc: bruce.richardson, david.marchand, pallavi.kadam

From: Narcisa Vasile <navasile@microsoft.com>

EAL thread API

**Problem Statement**
DPDK currently uses the pthread interface to create and manage threads.
Windows does not support the POSIX thread programming model,
so it currently relies on a header file that hides the Windows
calls under pthread matched interfaces.
Given that EAL should isolate the environment specifics from
the applications and libraries and mediate all the communication
with the operating systems, a new EAL interface
is needed for thread management.

**Goals**
* Introduce a generic EAL API for threading support that will remove
  the current Windows pthread.h shim.
* Replace references to pthread_* across the DPDK codebase with the new
  rte_thread_* API.
* Allow users to choose between using the rte_thread_* API or a
  3rd party thread library through a configuration option.

**Design plan**
New API main files:
* rte_thread.h (lib/eal/include)
* rte_thread.c (lib/eal/windows)
* rte_thread.c (lib/eal/common)

**A schematic example of the design**
--------------------------------------------------
lib/eal/include/rte_thread.h
int rte_thread_create();

lib/eal/common/rte_thread.c
int rte_thread_create()
{
	return pthread_create();
}

lib/eal/windows/rte_thread.c
int rte_thread_create()
{
	return CreateThread();
}
-----------------------------------------------------

**Thread attributes**

When or after a thread is created, specific characteristics of the thread
can be adjusted. Given that the thread characteristics that are of interest
for DPDK applications are affinity and priority, the following structure
that represents thread attributes has been defined:

typedef struct
{
	enum rte_thread_priority priority;
	rte_cpuset_t cpuset;
} rte_thread_attr_t;

The *rte_thread_create()* function can optionally receive
an rte_thread_attr_t object that will cause the thread to be created
with the affinity and priority described by the attributes object.
If no rte_thread_attr_t is passed (parameter is NULL),
the default affinity and priority are used.
An rte_thread_attr_t object can also be set to the default values
by calling *rte_thread_attr_init()*.

*Priority* is represented through an enum that currently advertises
two values for priority:
	- RTE_THREAD_PRIORITY_NORMAL
	- RTE_THREAD_PRIORITY_REALTIME_CRITICAL
The enum can be extended to allow for multiple priority levels.
rte_thread_set_priority      - sets the priority of a thread
rte_thread_attr_set_priority - updates an rte_thread_attr_t object
                               with a new value for priority

The user can choose the thread priority through an EAL parameter
when starting an application. If the EAL parameter is not used,
the per-platform default value for thread priority is used.
Otherwise, one of the following options can be set:
 --thread-prio normal
 --thread-prio realtime

Example:
./dpdk-l2fwd -l 0-3 -n 4 --thread-prio normal -- -q 8 -p ffff
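
A minimal sketch of how such an option could be resolved, written in Python
for brevity. The helper name `parse_thread_prio` and its `platform_default`
argument are hypothetical and not part of this series; the real parsing is
done in C inside EAL. The sketch only illustrates the rule described above:
use the value given on the command line, otherwise fall back to the platform
default, and stop looking at the "--" that separates EAL arguments from
application arguments.

```python
ALLOWED_PRIORITIES = ("normal", "realtime")


def parse_thread_prio(argv, platform_default="normal"):
    """Return the priority selected by --thread-prio, or the default.

    `platform_default` is an assumption standing in for whatever default
    the platform would choose.
    """
    for i, arg in enumerate(argv):
        if arg == "--":
            # Everything after "--" belongs to the application, not to EAL.
            break
        if arg == "--thread-prio":
            if i + 1 >= len(argv) or argv[i + 1] not in ALLOWED_PRIORITIES:
                raise ValueError("--thread-prio expects 'normal' or 'realtime'")
            return argv[i + 1]
    return platform_default


if __name__ == "__main__":
    args = "-l 0-3 -n 4 --thread-prio normal -- -q 8 -p ffff".split()
    print(parse_thread_prio(args))
```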

*Affinity* is described by the already known "rte_cpuset_t" type.
rte_thread_attr_set/get_affinity - sets/gets the affinity field in a
                                   rte_thread_attr_t object
rte_thread_set/get_affinity      - sets/gets the affinity of a thread

**Errors**
A translation function that maps Windows error codes to errno-style
error codes is provided. 
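
To illustrate the idea only, here is a small table-driven sketch in Python.
The numeric Windows error codes are the documented system error values, but
the choice of which codes to handle, the errno each one maps to, the
fallback, and the name `translate_win_error` are all assumptions made for
this example; the actual translation function in the series is written in C
and may map codes differently.

```python
import errno

# Windows system error codes (numeric values as documented for winerror.h).
ERROR_SUCCESS = 0
ERROR_ACCESS_DENIED = 5
ERROR_INVALID_HANDLE = 6
ERROR_NOT_ENOUGH_MEMORY = 8
ERROR_OUTOFMEMORY = 14
ERROR_INVALID_PARAMETER = 87

# Hypothetical mapping, chosen for illustration only.
_WIN_TO_ERRNO = {
    ERROR_SUCCESS: 0,
    ERROR_ACCESS_DENIED: errno.EACCES,
    ERROR_INVALID_HANDLE: errno.EBADF,
    ERROR_NOT_ENOUGH_MEMORY: errno.ENOMEM,
    ERROR_OUTOFMEMORY: errno.ENOMEM,
    ERROR_INVALID_PARAMETER: errno.EINVAL,
}


def translate_win_error(code, fallback=errno.EINVAL):
    """Map a Windows error code to a positive errno-style value.

    Unknown codes fall back to `fallback` (an assumption of this sketch).
    """
    return _WIN_TO_ERRNO.get(code, fallback)


if __name__ == "__main__":
    for code in (ERROR_SUCCESS, ERROR_INVALID_PARAMETER, 12345):
        print(code, "->", translate_win_error(code))
```

Keeping the mapping in one table means callers on every platform only ever
see errno-style values, which matches the return convention documented for
the new API.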

**Future work**
The long term plan is for EAL to provide full threading support:
* Add support for condition variables
* Add support for pthread_mutex_trylock
* Additional functionality offered by pthread_*
  (such as pthread_setname_np, etc.)

v10:
 - Remove patch no. 10. It will be broken down in subpatches 
   and sent as a different patchset that depends on this one.
   This is done due to the ABI breaks that would be caused by patch 10.
 - Replace unix/rte_thread.c with common/rte_thread.c
 - Remove initializations that may prevent compiler from issuing useful
   warnings.
 - Remove rte_thread_types.h and rte_windows_thread_types.h
 - Remove unneeded priority macros (EAL_THREAD_PRIORITY*)
 - Remove functions that retrieve thread handle from process handle
 - Remove rte_thread_cancel() until same behavior is obtained on
   all platforms.
 - Fix rte_thread_detach() function description,
   return value and remove empty line.
 - Reimplement mutex functions. Add compatible representation for mutex
   identifier. Add macro to replace static mutex initialization instances.
 - Fix commit messages (lines too long, remove unicode symbols)

v9:
- Sign patches

v8:
- Rebase
- Add rte_thread_detach() API
- Set default priority, when user did not specify a value

v7:
Based on DmitryK's review:
- Change thread id representation
- Change mutex id representation
- Implement static mutex initializer for Windows
- Change barrier identifier representation
- Improve commit messages
- Add missing doxygen comments
- Split error translation function
- Improve name for affinity function
- Remove cpuset_size parameter
- Fix eal_create_cpu_map function
- Map EAL priority values to OS specific values
- Add thread wrapper for start routine
- Do not export rte_thread_cancel() on Windows
- Cleanup, fix comments, fix typos.

v6:
- improve error-translation function
- call the error translation function in rte_thread_value_get()

v5:
- update cover letter with more details on the priority argument

v4:
- fix function description
- rebase

v3:
- rebase

v2:
- revert changes that break ABI 
- break up changes into smaller patches
- fix coding style issues
- fix issues with errors
- fix parameter type in examples/kni.c


Narcisa Vasile (9):
  eal: add basic threading functions
  eal: add thread attributes
  eal/windows: translate Windows errors to errno-style errors
  eal: implement functions for thread affinity management
  eal: implement thread priority management functions
  eal: add thread lifetime management
  eal: implement functions for mutex management
  eal: implement functions for thread barrier management
  eal: add EAL argument for setting thread priority

 lib/eal/common/eal_common_options.c |  28 +-
 lib/eal/common/eal_internal_cfg.h   |   2 +
 lib/eal/common/eal_options.h        |   2 +
 lib/eal/common/meson.build          |   1 +
 lib/eal/common/rte_thread.c         | 445 +++++++++++++++++++++
 lib/eal/include/rte_thread.h        | 406 ++++++++++++++++++-
 lib/eal/unix/meson.build            |   1 -
 lib/eal/unix/rte_thread.c           |  92 -----
 lib/eal/version.map                 |  20 +
 lib/eal/windows/eal_lcore.c         | 176 ++++++---
 lib/eal/windows/eal_windows.h       |  10 +
 lib/eal/windows/include/sched.h     |   2 +-
 lib/eal/windows/rte_thread.c        | 588 ++++++++++++++++++++++++++--
 13 files changed, 1599 insertions(+), 174 deletions(-)
 create mode 100644 lib/eal/common/rte_thread.c
 delete mode 100644 lib/eal/unix/rte_thread.c

-- 
2.31.0.vfs.0.1



* [dpdk-dev] [PATCH] devtools: script to track map symbols
@ 2021-06-18 16:36  5% Ray Kinsella
  2021-06-21 15:25  6% ` [dpdk-dev] [PATCH v3] " Ray Kinsella
                   ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Ray Kinsella @ 2021-06-18 16:36 UTC (permalink / raw)
  To: dev; +Cc: ferruh.yigit, thomas, ktraynor, bruce.richardson, mdr

Script to track growth of stable and experimental symbols
over releases since v19.11.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
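
Note for readers: the core of the script is counting the symbols listed in
each section of a version.map file. The sketch below shows that idea using
only the standard `re` module instead of the Parsley grammar used in the
patch, so it is a simplified stand-in and not the patch's parser. The
function name `count_map_symbols` and the sample map text are hypothetical.

```python
import re

# A section looks like: NAME { ...entries... } [PARENT];
SECTION_RE = re.compile(r'(\w[\w.]*)\s*\{(.*?)\}', re.S)


def count_map_symbols(map_text):
    """Return {section name: number of symbols} for a linker version map."""
    # Drop '#' comments first so they cannot be mistaken for entries.
    text = re.sub(r'#[^\n]*', '', map_text)
    counts = {}
    for name, body in SECTION_RE.findall(text):
        symbols = []
        for entry in body.split(';'):
            # Remove a leading 'global:' or 'local:' label, if any.
            entry = re.sub(r'^(global|local)\s*:\s*', '', entry.strip()).strip()
            # 'local: *;' leaves a bare wildcard, which is not a symbol.
            if entry and entry != '*':
                symbols.append(entry)
        counts[name] = len(symbols)
    return counts


SAMPLE_MAP = """
DPDK_21 {
    global:

    rte_eal_init;
    rte_eal_cleanup;

    local: *;
};

EXPERIMENTAL {
    global:

    # added in 21.08
    rte_thread_ctrl_thread_create;
};
"""

if __name__ == "__main__":
    print(count_map_symbols(SAMPLE_MAP))
```

The script in the patch sums the counts of all DPDK_* sections into a
"stable" figure and reports the EXPERIMENTAL section separately; the same
can be done on top of the dictionary returned here.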
 devtools/count_symbols.py | 230 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 230 insertions(+)
 create mode 100755 devtools/count_symbols.py

diff --git a/devtools/count_symbols.py b/devtools/count_symbols.py
new file mode 100755
index 0000000000..7b29651044
--- /dev/null
+++ b/devtools/count_symbols.py
@@ -0,0 +1,230 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: BSD-3-Clause
+# Copyright(c) 2021 Intel Corporation
+from pathlib import Path
+import sys, os
+import subprocess
+import argparse
+import re
+import datetime
+
+try:
+        from parsley import makeGrammar
+except ImportError:
+        print('This script uses the package Parsley to parse C Mapfiles.\n'
+              'This can be installed with \"pip install parsley".')
+        exit()
+
+symbolMapGrammar = r"""
+
+ws = (' ' | '\r' | '\n' | '\t')*
+
+ABI_VER = ({})
+DPDK_VER = ('DPDK_' ABI_VER)
+ABI_NAME = ('INTERNAL' | 'EXPERIMENTAL' | DPDK_VER)
+comment = '#' (~'\n' anything)+ '\n'
+symbol = (~(';' | '}}' | '#') anything )+:c ';' -> ''.join(c)
+global = 'global:'
+local = 'local: *;'
+symbols = comment* symbol:s ws comment* -> s
+
+abi = (abi_section+):m -> dict(m)
+abi_section = (ws ABI_NAME:e ws '{{' ws global* (~local ws symbols)*:s ws local* ws '}}' ws DPDK_VER* ';' ws) -> (e,s)
+"""
+
+#abi_ver = ['21', '20.0.1', '20.0', '20']
+
+def get_abi_versions():
+    year = datetime.date.today().year - 2000
+    s=" |".join(['\'{}\''.format(i) for i in reversed(range(21, year + 1)) ])
+    s = s + ' | \'20.0.1\' | \'20.0\' | \'20\''
+
+    return s
+
+def get_dpdk_releases():
+    year = datetime.date.today().year - 2000
+    s="|".join("{}".format(i) for i in range(19,year + 1))
+    pattern = re.compile(r'^"v(' + s + r')\.\d{2}"$')
+
+    cmd = ['git', 'for-each-ref', '--sort=taggerdate', '--format', '"%(tag)"']
+    result = subprocess.run(cmd, \
+                            stdout=subprocess.PIPE, \
+                            stderr=subprocess.PIPE)
+    if result.stderr.startswith(b'fatal'):
+        sys.exit(result.stderr.decode('utf-8'))
+
+    tags = result.stdout.decode('utf-8').split('\n')
+
+    # find the non-rcs between now and v19.11
+    tags = [ tag.replace('\"','') \
+             for tag in reversed(tags) \
+             if pattern.match(tag) ][:-3]
+
+    return tags
+
+
+def get_terminal_rows():
+    rows, _ = os.popen('stty size', 'r').read().split()
+    return int(rows)
+
+def fix_directory_name(path):
+    mapfilepath1 = str(path.parent.name)
+    mapfilepath2 = str(path.parents[1])
+    mapfilepath = mapfilepath2 + '/librte_' + mapfilepath1
+
+    return mapfilepath
+
+# fix removal of the librte_ from the directory names
+def directory_renamed(path, rel):
+    mapfilepath = fix_directory_name(path)
+    tagfile = '{}:{}/{}'.format(rel, mapfilepath,  path.name)
+
+    result = subprocess.run(['git', 'show', tagfile], \
+                            stdout=subprocess.PIPE, \
+                            stderr=subprocess.PIPE)
+    if result.stderr.startswith(b'fatal'):
+        result = None
+
+    return result
+
+# fix renaming of map files
+def mapfile_renamed(path, rel):
+    newfile = None
+
+    result = subprocess.run(['git', 'ls-tree', \
+                             rel, str(path.parent) + '/'], \
+                            stdout=subprocess.PIPE, \
+                            stderr=subprocess.PIPE)
+    dentries = result.stdout.decode('utf-8')
+    dentries = dentries.split('\n')
+
+    # filter entries looking for the map file
+    dentries = [dentry for dentry in dentries if dentry.endswith('.map')]
+    if len(dentries) > 1 or len(dentries) == 0:
+        return None
+
+    dparts = dentries[0].split('/')
+    newfile = dparts[len(dparts) - 1]
+
+    if(newfile is not None):
+        tagfile = '{}:{}/{}'.format(rel, path.parent, newfile)
+
+        result = subprocess.run(['git', 'show', tagfile], \
+                                stdout=subprocess.PIPE, \
+                                stderr=subprocess.PIPE)
+        if result.stderr.startswith(b'fatal'):
+            result = None
+
+    else:
+        result = None
+
+    return result
+
+# renaming of the map file & renaming of directory
+def mapfile_and_directory_renamed(path, rel):
+    mapfilepath = Path("{}/{}".format(fix_directory_name(path),path.name))
+
+    return mapfile_renamed(mapfilepath, rel)
+
+fix_strategies = [directory_renamed, \
+                  mapfile_renamed, \
+                  mapfile_and_directory_renamed]
+
+fmt = col_fmt = ""
+
+def set_terminal_output(dpdk_rel):
+    global fmt, col_fmt
+
+    fmt = '{:<50}'
+    col_fmt = fmt
+    for rel in dpdk_rel:
+        fmt += '{:<6}{:<6}'
+        col_fmt += '{:<12}'
+
+def set_csv_output(dpdk_rel):
+    global fmt, col_fmt
+
+    fmt = '{},'
+    col_fmt = fmt
+    for rel in dpdk_rel:
+        fmt += '{},{},'
+        col_fmt += '{},,'
+
+output_formats = { None: set_terminal_output, \
+                   'terminal': set_terminal_output, \
+                   'csv': set_csv_output }
+directories = 'drivers,lib'
+
+def main():
+    global fmt, col_fmt, symbolMapGrammar
+
+    parser = argparse.ArgumentParser(description='Count symbols in DPDK Libs')
+    parser.add_argument('--format-output', choices=['terminal','csv'], \
+                        default='terminal')
+    parser.add_argument('--directory', choices=directories,
+                        default=directories)
+    args = parser.parse_args()
+
+    dpdk_rel = get_dpdk_releases()
+
+    # set the output format
+    output_formats[args.format_output](dpdk_rel)
+
+    column_titles = ['mapfile'] + dpdk_rel
+    print(col_fmt.format(*column_titles))
+
+    symbolMapGrammar = symbolMapGrammar.format(get_abi_versions())
+    MAPParser = makeGrammar(symbolMapGrammar, {})
+
+    terminal_rows = get_terminal_rows()
+    row = 0
+
+    for src_dir in args.directory.split(','):
+        for path in Path(src_dir).rglob('*.map'):
+            csym = [0] * 2
+            relsym = [str(path)]
+
+            for rel in dpdk_rel:
+                i = csym[0] = csym[1] = 0
+                abi_sections = None
+
+                tagfile = '{}:{}'.format(rel,path)
+                result = subprocess.run(['git', 'show', tagfile], \
+                                        stdout=subprocess.PIPE, \
+                                        stderr=subprocess.PIPE)
+
+                if result.stderr.startswith(b'fatal'):
+                    result = None
+
+                while(result is None and i < len(fix_strategies)):
+                    result = fix_strategies[i](path, rel)
+                    i += 1
+
+                if result is not None:
+                    mapfile = result.stdout.decode('utf-8')
+                    abi_sections = MAPParser(mapfile).abi()
+
+                if abi_sections is not None:
+                    # which versions are present, and we care about
+                    ignore = ['EXPERIMENTAL','INTERNAL']
+                    found_ver = [ver \
+                                 for ver in abi_sections \
+                                 if ver not in ignore]
+
+                    for ver in found_ver:
+                        csym[0] += len(abi_sections[ver])
+
+                    # count experimental symbols
+                    if 'EXPERIMENTAL' in abi_sections:
+                        csym[1] = len(abi_sections['EXPERIMENTAL'])
+
+                relsym += csym
+
+            print(fmt.format(*relsym))
+            row += 1
+
+            if((terminal_rows>0) and ((row % terminal_rows) == 0)):
+                print(col_fmt.format(*column_titles))
+
+if __name__ == '__main__':
+        main()
-- 
2.26.2
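
The release selection in get_dpdk_releases() boils down to keeping tags of
the form vYY.MM and dropping release candidates and point releases. A
minimal sketch on an in-memory list, assuming a hypothetical helper
`release_tags` with explicit year bounds (the patch itself reads the tags
from `git for-each-ref`, derives the upper bound from the current date, and
matches the tags with their surrounding double quotes):

```python
import re


def release_tags(tags, first_year=19, last_year=21):
    """Keep only tags that look like 'vYY.MM' for the given year range."""
    years = "|".join(str(y) for y in range(first_year, last_year + 1))
    # Anchored at both ends, so '-rcN' suffixes and '.N' point releases fail.
    pattern = re.compile(r'^v(' + years + r')\.\d{2}$')
    return [tag for tag in tags if pattern.match(tag)]


# Sample tag names for illustration only.
SAMPLE_TAGS = [
    "v18.11",
    "v19.11",
    "v20.02-rc1",
    "v20.02",
    "v20.11",
    "v20.11.1",
    "v21.05-rc4",
    "v21.05",
]

if __name__ == "__main__":
    print(release_tags(SAMPLE_TAGS))
```

The patch additionally reverses the list and slices off the last three
matches, which appears intended to start the report at v19.11.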



* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-18 10:41  0%                 ` Ananyev, Konstantin
@ 2021-06-18 10:49  0%                   ` Ferruh Yigit
  2021-06-21 11:06  0%                   ` Ananyev, Konstantin
  1 sibling, 0 replies; 200+ results
From: Ferruh Yigit @ 2021-06-18 10:49 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil

On 6/18/2021 11:41 AM, Ananyev, Konstantin wrote:
> 
>>>>>>>
>>>>>>> 14/06/2021 15:15, Bruce Richardson:
>>>>>>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
>>>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
>>>>>>>>>> Sent: Monday, 14 June 2021 12.59
>>>>>>>>>>
>>>>>>>>>> Performance of access in a fixed-size array is very good
>>>>>>>>>> because of cache locality
>>>>>>>>>> and because there is a single pointer to dereference.
>>>>>>>>>> The only drawback is the lack of flexibility:
>>>>>>>>>> the size of such an array cannot be increased at runtime.
>>>>>>>>>>
>>>>>>>>>> An approach to this problem is to allocate the array at runtime,
>>>>>>>>>> being as efficient as static arrays, but still limited to a maximum.
>>>>>>>>>>
>>>>>>>>>> That's why the API rte_parray is introduced,
>>>>>>>>>> allowing to declare an array of pointer which can be resized
>>>>>>>>>> dynamically
>>>>>>>>>> and automatically at runtime while keeping a good read performance.
>>>>>>>>>>
>>>>>>>>>> After resize, the previous array is kept until the next resize
>>>>>>>>>> to avoid crashes during a read without any lock.
>>>>>>>>>>
>>>>>>>>>> Each element is a pointer to a memory chunk dynamically allocated.
>>>>>>>>>> This is not good for cache locality but it allows to keep the same
>>>>>>>>>> memory per element, no matter how the array is resized.
>>>>>>>>>> Cache locality could be improved with mempools.
>>>>>>>>>> The other drawback is having to dereference one more pointer
>>>>>>>>>> to read an element.
>>>>>>>>>>
>>>>>>>>>> There are not many locks, so the API is for internal use only.
>>>>>>>>>> This API may be used to completely remove some compilation-time
>>>>>>>>>> maximums.
>>>>>>>>>
>>>>>>>>> I get the purpose and overall intention of this library.
>>>>>>>>>
>>>>>>>>> I probably already mentioned that I prefer "embedded style
>>>>>>>>> programming" with fixed size arrays, rather than runtime
>>>>>>>>> configurability. It's my personal opinion, and the DPDK Tech Board
>>>>>>>>> clearly prefers reducing the amount of compile time configurability,
>>>>>>>>> so there is no way for me to stop this progress, and I do not intend
>>>>>>>>> to oppose to this library. :-)
>>>>>>>>>
>>>>>>>>> This library is likely to become a core library of DPDK, so I think
>>>>>>>>> it is important getting it right. Could you please mention a few
>>>>>>>>> examples where you think this internal library should be used, and
>>>>>>>>> where it should not be used. Then it is easier to discuss if the
>>>>>>>>> border line between control path and data plane is correct. E.g. this
>>>>>>>>> library is not intended to be used for dynamically sized packet
>>>>>>>>> queues that grow and shrink in the fast path.
>>>>>>>>>
>>>>>>>>> If the library becomes a core DPDK library, it should probably be
>>>>>>>>> public instead of internal. E.g. if the library is used to make
>>>>>>>>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some
>>>>>>>>> applications might also need dynamically sized arrays for their
>>>>>>>>> application specific per-port runtime data, and this library could
>>>>>>>>> serve that purpose too.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks Thomas for starting this discussion and Morten for follow-up.
>>>>>>>>
>>>>>>>> My thinking is as follows, and I'm particularly keeping in mind the cases
>>>>>>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
>>>>>>>>
>>>>>>>> While I dislike the hard-coded limits in DPDK, I'm also not convinced that
>>>>>>>> we should switch away from the flat arrays or that we need fully dynamic
>>>>>>>> arrays that grow/shrink at runtime for ethdevs. I would suggest a half-way
>>>>>>>> house here, where we keep the ethdevs as an array, but one allocated/sized
>>>>>>>> at runtime rather than statically. This would allow us to have a
>>>>>>>> compile-time default value, but, for use cases that need it, allow use of a
>>>>>>>> flag e.g.  "max-ethdevs" to change the size of the parameter given to the
>>>>>>>> malloc call for the array.  This max limit could then be provided to apps
>>>>>>>> too if they want to match any array sizes. [Alternatively those apps could
>>>>>>>> check the provided size and error out if the size has been increased beyond
>>>>>>>> what the app is designed to use?]. There would be no extra dereferences per
>>>>>>>> rx/tx burst call in this scenario so performance should be the same as
>>>>>>>> before (potentially better if array is in hugepage memory, I suppose).
>>>>>>>
>>>>>>> I think we need some benchmarks to decide what is the best tradeoff.
>>>>>>> I spent time on this implementation, but sorry I won't have time for benchmarks.
>>>>>>> Volunteers?
>>>>>>
>>>>>> I had only a quick look at your approach so far.
>>>>>> But from what I can read, in MT environment your suggestion will require
>>>>>> extra synchronization for each read-write access to such parray element (lock, rcu, ...).
>>>>>> I think what Bruce suggests will be much lighter, easier to implement and less error prone.
>>>>>> At least for rte_ethdevs[] and friends.
>>>>>> Konstantin
>>>>>
>>>>> One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
>>>>> 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
>>>>> We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
>>>>> any regressions.
>>>>> That could still be flat array with max_size specified at application startup.
>>>>> 2. Hide rest of rte_ethdev struct in .c.
>>>>> That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
>>>>> (flat array, vector, hash, linked list) without ABI/API breakages.
>>>>>
>>>>> Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
>>>>> (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
>>>>> Probably some macro can be provided to simplify it.
>>>>>
>>>>
>>>> We are already planning some tasks for ABI stability for v21.11, I think
>>>> splitting 'struct rte_eth_dev' can be part of that task, it enables hiding more
>>>> internal data.
>>>
>>> Ok, sounds good.
>>>
>>>>
>>>>> The only significant complication I can foresee with implementing that approach -
>>>>> we'll need an array of 'fast' function pointers per queue, not per device as we have now
>>>>> (to avoid extra indirection for callback implementation).
>>>>> Though as a bonus we'll have the ability to use different RX/TX functions per queue.
>>>>>
>>>>
>>>> What do you think about splitting the Rx/Tx callbacks into their own struct too?
>>>>
>>>> Overall 'rte_eth_dev' can be split into three as:
>>>> 1. rte_eth_dev
>>>> 2. rte_eth_dev_burst
>>>> 3. rte_eth_dev_cb
>>>>
>>>> And we can hide 1 from applications even with the inline functions.
>>>
>>> As discussed off-line, I think:
>>> it is possible.
>>> My absolute preference would be to have just 1/2 (with CB hidden).
>>
>> How can we hide the callbacks since they are used by inline burst functions.
> 
> I probably owe a better explanation of what I meant in the first mail.
> Otherwise it sounds confusing.
> I'll try to write a more detailed one in the next few days.
> 
>>> But even with 1/2/3 in place I think it would be  a good step forward.
>>> Probably worth to start with 1/2/3 first and then see how difficult it
>>> would be to switch to 1/2.
>>
>> What do you mean by switch to 1/2?
> 
> When we'll have just:
> 1. rte_eth_dev (hidden in .c)
> 2. rte_eth_dev_burst (visible)
> 
> And no specific public struct/array for callbacks - they will be hidden in rte_eth_dev.
> 

If we can hide them, agree this is better.

>>
>> If we keep the inline functions and split the struct into the three structs above, we
>> can only hide 1, and 2/3 will still be visible to apps because of the inline
>> functions. This way we will be able to hide more while still having the same performance.
> 
> I understand that, and as I said above - I think it is a good step forward.
> Though even better would be to hide rte_eth_dev_cb too.
> 
>>
>>> Do you plan to start working on it?
>>>
>>
>> We are gathering the list of the tasks for the ABI stability, most probably they
>> will be worked on during v21.11. I can take this one.
> 
> Cool, please keep me in the loop.
> I'll try to free some cycles for 21.11 to get involved and help (if needed, of course).

That would be great, thanks.

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-17 15:44  3%               ` Ferruh Yigit
@ 2021-06-18 10:41  0%                 ` Ananyev, Konstantin
  2021-06-18 10:49  0%                   ` Ferruh Yigit
  2021-06-21 11:06  0%                   ` Ananyev, Konstantin
  0 siblings, 2 replies; 200+ results
From: Ananyev, Konstantin @ 2021-06-18 10:41 UTC (permalink / raw)
  To: Yigit, Ferruh, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil


> >>>>>
> >>>>> 14/06/2021 15:15, Bruce Richardson:
> >>>>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
> >>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> >>>>>>>> Sent: Monday, 14 June 2021 12.59
> >>>>>>>>
> >>>>>>>> Performance of access in a fixed-size array is very good
> >>>>>>>> because of cache locality
> >>>>>>>> and because there is a single pointer to dereference.
> >>>>>>>> The only drawback is the lack of flexibility:
> >>>>>>>> the size of such an array cannot be increased at runtime.
> >>>>>>>>
> >>>>>>>> An approach to this problem is to allocate the array at runtime,
> >>>>>>>> being as efficient as static arrays, but still limited to a maximum.
> >>>>>>>>
> >>>>>>>> That's why the API rte_parray is introduced,
> >>>>>>>> allowing to declare an array of pointer which can be resized
> >>>>>>>> dynamically
> >>>>>>>> and automatically at runtime while keeping a good read performance.
> >>>>>>>>
> >>>>>>>> After resize, the previous array is kept until the next resize
> >>>>>>>> to avoid crashes during a read without any lock.
> >>>>>>>>
> >>>>>>>> Each element is a pointer to a memory chunk dynamically allocated.
> >>>>>>>> This is not good for cache locality but it allows to keep the same
> >>>>>>>> memory per element, no matter how the array is resized.
> >>>>>>>> Cache locality could be improved with mempools.
> >>>>>>>> The other drawback is having to dereference one more pointer
> >>>>>>>> to read an element.
> >>>>>>>>
> >>>>>>>> There are not many locks, so the API is for internal use only.
> >>>>>>>> This API may be used to completely remove some compilation-time
> >>>>>>>> maximums.
> >>>>>>>
> >>>>>>> I get the purpose and overall intention of this library.
> >>>>>>>
> >>>>>>> I probably already mentioned that I prefer "embedded style programming"
> >>>>>>> with fixed size arrays, rather than runtime configurability. It's my
> >>>>>>> personal opinion, and the DPDK Tech Board clearly prefers reducing the
> >>>>>>> amount of compile time configurability, so there is no way for me to
> >>>>>>> stop this progress, and I do not intend to oppose to this library. :-)
> >>>>>>>
> >>>>>>> This library is likely to become a core library of DPDK, so I think it
> >>>>>>> is important getting it right. Could you please mention a few examples
> >>>>>>> where you think this internal library should be used, and where it
> >>>>>>> should not be used. Then it is easier to discuss if the border line
> >>>>>>> between control path and data plane is correct. E.g. this library is
> >>>>>>> not intended to be used for dynamically sized packet queues that grow
> >>>>>>> and shrink in the fast path.
> >>>>>>>
> >>>>>>> If the library becomes a core DPDK library, it should probably be
> >>>>>>> public instead of internal. E.g. if the library is used to make
> >>>>>>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some
> >>>>>>> applications might also need dynamically sized arrays for their
> >>>>>>> application specific per-port runtime data, and this library could
> >>>>>>> serve that purpose too.
> >>>>>>>
> >>>>>>
> >>>>>> Thanks Thomas for starting this discussion and Morten for follow-up.
> >>>>>>
> >>>>>> My thinking is as follows, and I'm particularly keeping in mind the cases
> >>>>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
> >>>>>>
> >>>>>> While I dislike the hard-coded limits in DPDK, I'm also not convinced that
> >>>>>> we should switch away from the flat arrays or that we need fully dynamic
> >>>>>> arrays that grow/shrink at runtime for ethdevs. I would suggest a half-way
> >>>>>> house here, where we keep the ethdevs as an array, but one allocated/sized
> >>>>>> at runtime rather than statically. This would allow us to have a
> >>>>>> compile-time default value, but, for use cases that need it, allow use of a
> >>>>>> flag e.g.  "max-ethdevs" to change the size of the parameter given to the
> >>>>>> malloc call for the array.  This max limit could then be provided to apps
> >>>>>> too if they want to match any array sizes. [Alternatively those apps could
> >>>>>> check the provided size and error out if the size has been increased beyond
> >>>>>> what the app is designed to use?]. There would be no extra dereferences per
> >>>>>> rx/tx burst call in this scenario so performance should be the same as
> >>>>>> before (potentially better if array is in hugepage memory, I suppose).
> >>>>>
> >>>>> I think we need some benchmarks to decide what is the best tradeoff.
> >>>>> I spent time on this implementation, but sorry I won't have time for benchmarks.
> >>>>> Volunteers?
> >>>>
> >>>> I had only a quick look at your approach so far.
> >>>> But from what I can read, in MT environment your suggestion will require
> >>>> extra synchronization for each read-write access to such parray element (lock, rcu, ...).
> >>>> I think what Bruce suggests will be much lighter, easier to implement and less error prone.
> >>>> At least for rte_ethdevs[] and friends.
> >>>> Konstantin
> >>>
> >>> One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
> >>> 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
> >>> We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
> >>> any regressions.
> >>> That could still be flat array with max_size specified at application startup.
> >>> 2. Hide rest of rte_ethdev struct in .c.
> >>> That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
> >>> (flat array, vector, hash, linked list) without ABI/API breakages.
> >>>
> >>> Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
> >>> (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
> >>> Probably some macro can be provided to simplify it.
> >>>
> >>
> >> We are already planning some tasks for ABI stability for v21.11, I think
> >> splitting 'struct rte_eth_dev' can be part of that task, it enables hiding more
> >> internal data.
> >
> > Ok, sounds good.
> >
> >>
> >>> The only significant complication I can foresee with implementing that approach -
> >>> we'll need an array of 'fast' function pointers per queue, not per device as we have now
> >>> (to avoid extra indirection for callback implementation).
> >>> Though as a bonus we'll have the ability to use different RX/TX functions per queue.
> >>>
> >>
> >> What do you think about splitting the Rx/Tx callbacks into their own struct too?
> >>
> >> Overall 'rte_eth_dev' can be split into three as:
> >> 1. rte_eth_dev
> >> 2. rte_eth_dev_burst
> >> 3. rte_eth_dev_cb
> >>
> >> And we can hide 1 from applications even with the inline functions.
> >
> > As discussed off-line, I think:
> > it is possible.
> > My absolute preference would be to have just 1/2 (with CB hidden).
> 
> How can we hide the callbacks since they are used by inline burst functions.

I probably owe a better explanation of what I meant in the first mail.
Otherwise it sounds confusing.
I'll try to write a more detailed one in the next few days.

> > But even with 1/2/3 in place I think it would be  a good step forward.
> > Probably worth to start with 1/2/3 first and then see how difficult it
> > would be to switch to 1/2.
> 
> What do you mean by switch to 1/2?

When we'll have just:
1. rte_eth_dev (hidden in .c)
2. rte_eth_dev_burst (visible)

And no specific public struct/array for callbacks - they will be hidden in rte_eth_dev.
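
A minimal sketch of that 1/2 layout, for illustration only. Every sketch_* name
below is hypothetical and is not part of the ethdev API; it assumes the visible
table is a flat array sized once at startup and that callbacks are left out of
the public part entirely:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* All sketch_* names are hypothetical; this is not the real ethdev API. */
struct sketch_mbuf;      /* stand-in for struct rte_mbuf */
struct sketch_eth_dev;   /* 1. hidden: only declared here, defined in a .c file */

typedef uint16_t (*sketch_rx_burst_t)(void *rxq, struct sketch_mbuf **pkts,
                                      uint16_t nb_pkts);

/* 2. visible: just what the inline fast path dereferences. */
struct sketch_eth_dev_burst {
    sketch_rx_burst_t rx_pkt_burst;
    void **rx_queues;    /* per-queue private data owned by the driver */
};

/* Flat array whose size is chosen once at startup, not at compile time. */
static struct sketch_eth_dev_burst *sketch_burst;
static uint16_t sketch_max_ports;

static int
sketch_burst_init(uint16_t max_ports)
{
    sketch_burst = calloc(max_ports, sizeof(*sketch_burst));
    if (sketch_burst == NULL)
        return -1;
    sketch_max_ports = max_ports;
    return 0;
}

/* Inline fast path: touches only the visible table, never sketch_eth_dev. */
static inline uint16_t
sketch_rx_burst(uint16_t port_id, uint16_t queue_id,
                struct sketch_mbuf **pkts, uint16_t nb_pkts)
{
    const struct sketch_eth_dev_burst *b;

    assert(port_id < sketch_max_ports);
    b = &sketch_burst[port_id];
    return b->rx_pkt_burst(b->rx_queues[queue_id], pkts, nb_pkts);
}

/* Dummy driver callback, only so the sketch can be exercised. */
static uint16_t
sketch_dummy_rx(void *rxq, struct sketch_mbuf **pkts, uint16_t nb_pkts)
{
    (void)rxq;
    (void)pkts;
    return nb_pkts;      /* pretend every requested slot was filled */
}
```

With this shape the application-visible header needs only the burst table and
the inline wrapper, so the layout of the hidden struct could change without
touching anything an application was compiled against.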

> 
> If we keep the inline functions and split the struct into the three structs above, we
> can only hide 1, and 2/3 will still be visible to apps because of the inline
> functions. This way we will be able to hide more while still having the same performance.

I understand that, and as I said above - I think it is a good step forward.
Though even better would be to hide rte_eth_dev_cb too. 

> 
> > Do you plan to start working on it?
> >
> 
> We are gathering the list of the tasks for the ABI stability, most probably they
> will be worked on during v21.11. I can take this one.

Cool, please keep me in the loop.
I'll try to free some cycles for 21.11 to get involved and help (if needed, of course).
Konstantin



^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-17 17:05  0%                   ` Ananyev, Konstantin
@ 2021-06-18 10:28  0%                     ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2021-06-18 10:28 UTC (permalink / raw)
  To: Ananyev, Konstantin, Morten Brørup, Thomas Monjalon,
	Richardson, Bruce
  Cc: dev, olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
	jerinj, gakhil

On 6/17/2021 6:05 PM, Ananyev, Konstantin wrote:
> 
> 
>> On 6/17/2021 4:17 PM, Morten Brørup wrote:
>>>> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
>>>> Sent: Thursday, 17 June 2021 16.59
>>>>
>>>>>>>>
>>>>>>>> 14/06/2021 15:15, Bruce Richardson:
>>>>>>>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
>>>>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
>>>> Monjalon
>>>>>>>>>>> Sent: Monday, 14 June 2021 12.59
>>>>>>>>>>>
>>>>>>>>>>> Performance of access in a fixed-size array is very good
>>>>>>>>>>> because of cache locality
>>>>>>>>>>> and because there is a single pointer to dereference.
>>>>>>>>>>> The only drawback is the lack of flexibility:
>>>>>>>>>>> the size of such an array cannot be increased at runtime.
>>>>>>>>>>>
>>>>>>>>>>> An approach to this problem is to allocate the array at
>>>> runtime,
>>>>>>>>>>> being as efficient as static arrays, but still limited to a
>>>> maximum.
>>>>>>>>>>>
>>>>>>>>>>> That's why the API rte_parray is introduced,
>>>>>>>>>>> allowing to declare an array of pointer which can be resized
>>>>>>>>>>> dynamically
>>>>>>>>>>> and automatically at runtime while keeping a good read
>>>> performance.
>>>>>>>>>>>
>>>>>>>>>>> After resize, the previous array is kept until the next resize
>>>>>>>>>>> to avoid crashes during a read without any lock.
>>>>>>>>>>>
>>>>>>>>>>> Each element is a pointer to a memory chunk dynamically
>>>> allocated.
>>>>>>>>>>> This is not good for cache locality but it allows to keep the
>>>> same
>>>>>>>>>>> memory per element, no matter how the array is resized.
>>>>>>>>>>> Cache locality could be improved with mempools.
>>>>>>>>>>> The other drawback is having to dereference one more pointer
>>>>>>>>>>> to read an element.
>>>>>>>>>>>
>>>>>>>>>>> There are not many locks, so the API is for internal use only.
>>>>>>>>>>> This API may be used to completely remove some compilation-
>>>> time
>>>>>>>>>>> maximums.
>>>>>>>>>>
>>>>>>>>>> I get the purpose and overall intention of this library.
>>>>>>>>>>
>>>>>>>>>> I probably already mentioned that I prefer "embedded style programming"
>>>>>>>>>> with fixed size arrays, rather than runtime configurability. It's my
>>>>>>>>>> personal opinion, and the DPDK Tech Board clearly prefers reducing the
>>>>>>>>>> amount of compile time configurability, so there is no way for me to
>>>>>>>>>> stop this progress, and I do not intend to oppose to this library. :-)
>>>>>>>>>>
>>>>>>>>>> This library is likely to become a core library of DPDK, so I think it
>>>>>>>>>> is important getting it right. Could you please mention a few examples
>>>>>>>>>> where you think this internal library should be used, and where it
>>>>>>>>>> should not be used. Then it is easier to discuss if the border line
>>>>>>>>>> between control path and data plane is correct. E.g. this library is
>>>>>>>>>> not intended to be used for dynamically sized packet queues that grow
>>>>>>>>>> and shrink in the fast path.
>>>>>>>>>>
>>>>>>>>>> If the library becomes a core DPDK library, it should probably be
>>>>>>>>>> public instead of internal. E.g. if the library is used to make
>>>>>>>>>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some
>>>>>>>>>> applications might also need dynamically sized arrays for their
>>>>>>>>>> application specific per-port runtime data, and this library could
>>>>>>>>>> serve that purpose too.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks Thomas for starting this discussion and Morten for
>>>> follow-up.
>>>>>>>>>
>>>>>>>>> My thinking is as follows, and I'm particularly keeping in mind
>>>> the cases
>>>>>>>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
>>>>>>>>>
>>>>>>>>> While I dislike the hard-coded limits in DPDK, I'm also not
>>>> convinced that
>>>>>>>>> we should switch away from the flat arrays or that we need fully
>>>> dynamic
>>>>>>>>> arrays that grow/shrink at runtime for ethdevs. I would suggest
>>>> a half-way
>>>>>>>>> house here, where we keep the ethdevs as an array, but one
>>>> allocated/sized
>>>>>>>>> at runtime rather than statically. This would allow us to have a
>>>>>>>>> compile-time default value, but, for use cases that need it,
>>>> allow use of a
>>>>>>>>> flag e.g.  "max-ethdevs" to change the size of the parameter
>>>> given to the
>>>>>>>>> malloc call for the array.  This max limit could then be
>>>> provided to apps
>>>>>>>>> too if they want to match any array sizes. [Alternatively those
>>>> apps could
>>>>>>>>> check the provided size and error out if the size has been
>>>> increased beyond
>>>>>>>>> what the app is designed to use?]. There would be no extra
>>>> dereferences per
>>>>>>>>> rx/tx burst call in this scenario so performance should be the
>>>> same as
>>>>>>>>> before (potentially better if array is in hugepage memory, I
>>>> suppose).
>>>>>>>>
>>>>>>>> I think we need some benchmarks to decide what is the best
>>>> tradeoff.
>>>>>>>> I spent time on this implementation, but sorry I won't have time
>>>> for benchmarks.
>>>>>>>> Volunteers?
>>>>>>>
>>>>>>> I had only a quick look at your approach so far.
>>>>>>> But from what I can read, in MT environment your suggestion will
>>>> require
>>>>>>> extra synchronization for each read-write access to such parray
>>>> element (lock, rcu, ...).
>>>>>>> I think what Bruce suggests will be much lighter, easier to
>>>>>>> implement and less error prone.
>>>>>>> At least for rte_ethdevs[] and friends.
>>>>>>> Konstantin
>>>>>>
>>>>>> One more thought here - if we are talking about rte_ethdev[] in
>>>> particular, I think  we can:
>>>>>> 1. move public function pointers (rx_pkt_burst(), etc.) from
>>>> rte_ethdev into a separate flat array.
>>>>>> We can keep it public to still use inline functions for 'fast'
>>>> calls rte_eth_rx_burst(), etc. to avoid
>>>>>> any regressions.
>>>>>> That could still be flat array with max_size specified at
>>>> application startup.
>>>>>> 2. Hide rest of rte_ethdev struct in .c.
>>>>>> That will allow us to change the struct itself and the whole
>>>> rte_ethdev[] table in a way we like
>>>>>> (flat array, vector, hash, linked list) without ABI/API breakages.
>>>>>>
>>>>>> Yes, it would require all PMDs to change prototype for
>>>> pkt_rx_burst() function
>>>>>> (to accept port_id, queue_id instead of queue pointer), but the
>>>> change is mechanical one.
>>>>>> Probably some macro can be provided to simplify it.
>>>>>>
>>>>>
>>>>> We are already planning some tasks for ABI stability for v21.11, I
>>>> think
>>>>> splitting 'struct rte_eth_dev' can be part of that task, it enables
>>>> hiding more
>>>>> internal data.
>>>>
>>>> Ok, sounds good.
>>>>
>>>>>
>>>>>> The only significant complication I can foresee with implementing
>>>>>> that approach -
>>>>>> we'll need an array of 'fast' function pointers per queue, not
>>>>>> per device as we have now
>>>>>> (to avoid extra indirection for callback implementation).
>>>>>> Though as a bonus we'll have the ability to use different RX/TX
>>>>>> functions per queue.
>>>>>>
>>>>>
>>>>> What do you think split Rx/Tx callback into its own struct too?
>>>>>
>>>>> Overall 'rte_eth_dev' can be split into three as:
>>>>> 1. rte_eth_dev
>>>>> 2. rte_eth_dev_burst
>>>>> 3. rte_eth_dev_cb
>>>>>
>>>>> And we can hide 1 from applications even with the inline functions.
>>>>
>>>> As discussed off-line, I think:
>>>> it is possible.
>>>> My absolute preference would be to have just 1/2 (with CB hidden).
>>>> But even with 1/2/3 in place I think it would be  a good step forward.
>>>> Probably worth to start with 1/2/3 first and then see how difficult it
>>>> would be to switch to 1/2.
>>>> Do you plan to start working on it?
>>>>
>>>> Konstantin
>>>
>>> If you do proceed with this, be very careful. E.g. the inlined rx/tx burst
>>> functions should not touch more cache lines than they do today - especially
>>> if there are many active ports. The inlined rx/tx burst functions are very
>>> simple, so thorough code review (and possibly also of the resulting
>>> assembly) is appropriate. Simple performance testing might not detect if
>>> more cache lines are accessed than before the modifications.
>>>
>>> Don't get me wrong... I do consider this an improvement of the ethdev library; I'm only asking you to take extra care!
>>>
>>
>> ack
>>
>> If we split as above, I think device specific data 'struct rte_eth_dev_data'
>> should be part of 1 (rte_eth_dev). Which means Rx/Tx inline functions access
>> additional cache line.
>>
>> To prevent this, what about duplicating 'data' in 2 (rte_eth_dev_burst)?
> 
> I think it would be better to change rx_pkt_burst() to accept port_id and queue_id,
> instead of void *.
> I.E:
> typedef uint16_t (*eth_rx_burst_t)(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts,  uint16_t nb_pkts);
> 

We may not need to add 'port_id', since in the callback you are already in the
driver scope and all required device-specific variables are already accessible
via the queue struct.

> And we can do actual de-referencing of private rxq data inside the actual rx function.
> 

Yes, we can replace the queue struct with 'queue_id' and do the dereferencing in
the Rx function instead of the burst API, but what is the benefit of it?
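
To make the comparison concrete, the two prototype shapes under discussion can
be put side by side. This is only a sketch under stated assumptions: the
sketch_* names, the table sizes and the dummy queue struct are all hypothetical
and do not come from any PMD.

```c
#include <assert.h>
#include <stdint.h>

struct sketch_mbuf;   /* stand-in for struct rte_mbuf */

#define SKETCH_MAX_PORTS  2   /* illustrative sizes only */
#define SKETCH_MAX_QUEUES 4

struct sketch_rxq {
    uint16_t avail;   /* packets the dummy queue claims to hold */
};

/* Driver-private table; in a real PMD this would live inside the driver. */
static struct sketch_rxq sketch_rxqs[SKETCH_MAX_PORTS][SKETCH_MAX_QUEUES];

/* (a) Current shape: the burst API resolves the queue and passes a pointer. */
static uint16_t
sketch_rx_by_ptr(void *rxq, struct sketch_mbuf **pkts, uint16_t nb_pkts)
{
    struct sketch_rxq *q = rxq;

    (void)pkts;
    return q->avail < nb_pkts ? q->avail : nb_pkts;
}

/* (b) Proposed shape: the driver receives ids and does the lookup itself. */
static uint16_t
sketch_rx_by_id(uint16_t port_id, uint16_t queue_id,
                struct sketch_mbuf **pkts, uint16_t nb_pkts)
{
    struct sketch_rxq *q = &sketch_rxqs[port_id][queue_id];

    return sketch_rx_by_ptr(q, pkts, nb_pkts);
}
```

In (a) the caller must know where the queue pointers are stored; in (b) that
knowledge stays inside the driver, at the cost of the lookup moving into the
Rx function.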

>> We have
>> enough space for it to fit into single cache line, currently it is:
>> struct rte_eth_dev {
>>         eth_rx_burst_t             rx_pkt_burst;         /*     0     8 */
>>         eth_tx_burst_t             tx_pkt_burst;         /*     8     8 */
>>         eth_tx_prep_t              tx_pkt_prepare;       /*    16     8 */
>>         eth_rx_queue_count_t       rx_queue_count;       /*    24     8 */
>>         eth_rx_descriptor_done_t   rx_descriptor_done;   /*    32     8 */
>>         eth_rx_descriptor_status_t rx_descriptor_status; /*    40     8 */
>>         eth_tx_descriptor_status_t tx_descriptor_status; /*    48     8 */
>>         struct rte_eth_dev_data *  data;                 /*    56     8 */
>>         /* --- cacheline 1 boundary (64 bytes) --- */
>>
>> 'rx_descriptor_done' is deprecated and will be removed;


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-17 16:55  0%                   ` Morten Brørup
@ 2021-06-18 10:21  0%                     ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2021-06-18 10:21 UTC (permalink / raw)
  To: Morten Brørup, Ananyev, Konstantin, Thomas Monjalon,
	Richardson, Bruce
  Cc: dev, olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
	jerinj, gakhil

On 6/17/2021 5:55 PM, Morten Brørup wrote:
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
>> Sent: Thursday, 17 June 2021 18.13
>>
>> On 6/17/2021 4:17 PM, Morten Brørup wrote:
>>>> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
>>>> Sent: Thursday, 17 June 2021 16.59
>>>>
>>>>>>>>
>>>>>>>> 14/06/2021 15:15, Bruce Richardson:
>>>>>>>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
>>>>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
>>>> Monjalon
>>>>>>>>>>> Sent: Monday, 14 June 2021 12.59
>>>>>>>>>>>
>>>>>>>>>>> Performance of access in a fixed-size array is very good
>>>>>>>>>>> because of cache locality
>>>>>>>>>>> and because there is a single pointer to dereference.
>>>>>>>>>>> The only drawback is the lack of flexibility:
>>>>>>>>>>> the size of such an array cannot be increased at runtime.
>>>>>>>>>>>
>>>>>>>>>>> An approach to this problem is to allocate the array at
>>>> runtime,
>>>>>>>>>>> being as efficient as static arrays, but still limited to a
>>>> maximum.
>>>>>>>>>>>
>>>>>>>>>>> That's why the API rte_parray is introduced,
>>>>>>>>>>> allowing to declare an array of pointer which can be resized
>>>>>>>>>>> dynamically
>>>>>>>>>>> and automatically at runtime while keeping a good read
>>>> performance.
>>>>>>>>>>>
>>>>>>>>>>> After resize, the previous array is kept until the next resize
>>>>>>>>>>> to avoid crashes during a read without any lock.
>>>>>>>>>>>
>>>>>>>>>>> Each element is a pointer to a memory chunk dynamically
>>>> allocated.
>>>>>>>>>>> This is not good for cache locality but it allows to keep the
>>>> same
>>>>>>>>>>> memory per element, no matter how the array is resized.
>>>>>>>>>>> Cache locality could be improved with mempools.
>>>>>>>>>>> The other drawback is having to dereference one more pointer
>>>>>>>>>>> to read an element.
>>>>>>>>>>>
>>>>>>>>>>> There are not many locks, so the API is for internal use only.
>>>>>>>>>>> This API may be used to completely remove some compilation-
>>>> time
>>>>>>>>>>> maximums.
>>>>>>>>>>
>>>>>>>>>> I get the purpose and overall intention of this library.
>>>>>>>>>>
>>>>>>>>>> I probably already mentioned that I prefer "embedded style programming"
>>>>>>>>>> with fixed size arrays, rather than runtime configurability. It's my
>>>>>>>>>> personal opinion, and the DPDK Tech Board clearly prefers reducing the
>>>>>>>>>> amount of compile time configurability, so there is no way for me to
>>>>>>>>>> stop this progress, and I do not intend to oppose to this library. :-)
>>>>>>>>>>
>>>>>>>>>> This library is likely to become a core library of DPDK, so I think it
>>>>>>>>>> is important getting it right. Could you please mention a few examples
>>>>>>>>>> where you think this internal library should be used, and where it
>>>>>>>>>> should not be used. Then it is easier to discuss if the border line
>>>>>>>>>> between control path and data plane is correct. E.g. this library is
>>>>>>>>>> not intended to be used for dynamically sized packet queues that grow
>>>>>>>>>> and shrink in the fast path.
>>>>>>>>>>
>>>>>>>>>> If the library becomes a core DPDK library, it should probably be
>>>>>>>>>> public instead of internal. E.g. if the library is used to make
>>>>>>>>>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some
>>>>>>>>>> applications might also need dynamically sized arrays for their
>>>>>>>>>> application specific per-port runtime data, and this library could
>>>>>>>>>> serve that purpose too.
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Thanks Thomas for starting this discussion and Morten for
>>>> follow-up.
>>>>>>>>>
>>>>>>>>> My thinking is as follows, and I'm particularly keeping in mind
>>>> the cases
>>>>>>>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
>>>>>>>>>
>>>>>>>>> While I dislike the hard-coded limits in DPDK, I'm also not
>>>> convinced that
>>>>>>>>> we should switch away from the flat arrays or that we need
>> fully
>>>> dynamic
>>>>>>>>> arrays that grow/shrink at runtime for ethdevs. I would suggest
>>>> a half-way
>>>>>>>>> house here, where we keep the ethdevs as an array, but one
>>>> allocated/sized
>>>>>>>>> at runtime rather than statically. This would allow us to have
>> a
>>>>>>>>> compile-time default value, but, for use cases that need it,
>>>> allow use of a
>>>>>>>>> flag e.g.  "max-ethdevs" to change the size of the parameter
>>>> given to the
>>>>>>>>> malloc call for the array.  This max limit could then be
>>>> provided to apps
>>>>>>>>> too if they want to match any array sizes. [Alternatively those
>>>> apps could
>>>>>>>>> check the provided size and error out if the size has been
>>>> increased beyond
>>>>>>>>> what the app is designed to use?]. There would be no extra
>>>> dereferences per
>>>>>>>>> rx/tx burst call in this scenario so performance should be the
>>>> same as
>>>>>>>>> before (potentially better if array is in hugepage memory, I
>>>> suppose).
>>>>>>>>
>>>>>>>> I think we need some benchmarks to decide what is the best
>>>> tradeoff.
>>>>>>>> I spent time on this implementation, but sorry I won't have time
>>>> for benchmarks.
>>>>>>>> Volunteers?
>>>>>>>
>>>>>>> I had only a quick look at your approach so far.
>>>>>>> But from what I can read, in MT environment your suggestion will
>>>> require
>>>>>>> extra synchronization for each read-write access to such parray
>>>> element (lock, rcu, ...).
>>>>>>> I think what Bruce suggests will be much lighter, easier to
>>>>>>> implement and less error prone.
>>>>>>> At least for rte_ethdevs[] and friends.
>>>>>>> Konstantin
>>>>>>
>>>>>> One more thought here - if we are talking about rte_ethdev[] in
>>>> particular, I think  we can:
>>>>>> 1. move public function pointers (rx_pkt_burst(), etc.) from
>>>> rte_ethdev into a separate flat array.
>>>>>> We can keep it public to still use inline functions for 'fast'
>>>> calls rte_eth_rx_burst(), etc. to avoid
>>>>>> any regressions.
>>>>>> That could still be flat array with max_size specified at
>>>> application startup.
>>>>>> 2. Hide rest of rte_ethdev struct in .c.
>>>>>> That will allow us to change the struct itself and the whole
>>>> rte_ethdev[] table in a way we like
>>>>>> (flat array, vector, hash, linked list) without ABI/API breakages.
>>>>>>
>>>>>> Yes, it would require all PMDs to change prototype for
>>>> pkt_rx_burst() function
>>>>>> (to accept port_id, queue_id instead of queue pointer), but the
>>>> change is mechanical one.
>>>>>> Probably some macro can be provided to simplify it.
>>>>>>
>>>>>
>>>>> We are already planning some tasks for ABI stability for v21.11, I
>>>> think
>>>>> splitting 'struct rte_eth_dev' can be part of that task, it enables
>>>> hiding more
>>>>> internal data.
>>>>
>>>> Ok, sounds good.
>>>>
>>>>>
>>>>>> The only significant complication I can foresee with implementing
>>>>>> that approach -
>>>>>> we'll need an array of 'fast' function pointers per queue, not
>>>>>> per device as we have now
>>>>>> (to avoid extra indirection for callback implementation).
>>>>>> Though as a bonus we'll have the ability to use different RX/TX
>>>>>> functions per queue.
>>>>>>
>>>>>
>>>>> What do you think split Rx/Tx callback into its own struct too?
>>>>>
>>>>> Overall 'rte_eth_dev' can be split into three as:
>>>>> 1. rte_eth_dev
>>>>> 2. rte_eth_dev_burst
>>>>> 3. rte_eth_dev_cb
>>>>>
>>>>> And we can hide 1 from applications even with the inline functions.
>>>>
>>>> As discussed off-line, I think:
>>>> it is possible.
>>>> My absolute preference would be to have just 1/2 (with CB hidden).
>>>> But even with 1/2/3 in place I think it would be  a good step
>> forward.
>>>> Probably worth to start with 1/2/3 first and then see how difficult
>> it
>>>> would be to switch to 1/2.
>>>> Do you plan to start working on it?
>>>>
>>>> Konstantin
>>>
>>> If you do proceed with this, be very careful. E.g. the inlined rx/tx
>>> burst functions should not touch more cache lines than they do today -
>>> especially if there are many active ports. The inlined rx/tx burst
>>> functions are very simple, so thorough code review (and possibly also
>>> of the resulting assembly) is appropriate. Simple performance testing
>>> might not detect if more cache lines are accessed than before the
>>> modifications.
>>>
>>> Don't get me wrong... I do consider this an improvement of the ethdev
>>> library; I'm only asking you to take extra care!
>>>
>>
>> ack
>>
>> If we split as above, I think device specific data 'struct rte_eth_dev_data'
>> should be part of 1 (rte_eth_dev). Which means Rx/Tx inline functions access
>> additional cache line.
>>
>> To prevent this, what about duplicating 'data' in 2 (rte_eth_dev_burst)? We have
>> enough space for it to fit into single cache line, currently it is:
>> struct rte_eth_dev {
>>         eth_rx_burst_t             rx_pkt_burst;         /*     0     8 */
>>         eth_tx_burst_t             tx_pkt_burst;         /*     8     8 */
>>         eth_tx_prep_t              tx_pkt_prepare;       /*    16     8 */
>>         eth_rx_queue_count_t       rx_queue_count;       /*    24     8 */
>>         eth_rx_descriptor_done_t   rx_descriptor_done;   /*    32     8 */
>>         eth_rx_descriptor_status_t rx_descriptor_status; /*    40     8 */
>>         eth_tx_descriptor_status_t tx_descriptor_status; /*    48     8 */
>>         struct rte_eth_dev_data *  data;                 /*    56     8 */
>>         /* --- cacheline 1 boundary (64 bytes) --- */
>>
>> 'rx_descriptor_done' is deprecated and will be removed;
> 
> Makes sense.
> 
> Also consider moving 'data' to the top of the new struct, so there is room to add future functions below. (Without growing to more than the one cache line size, one new function can be added when 'rx_descriptor_done' has been removed.)
> 

+1


* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-17 16:12  0%                 ` Ferruh Yigit
  2021-06-17 16:55  0%                   ` Morten Brørup
@ 2021-06-17 17:05  0%                   ` Ananyev, Konstantin
  2021-06-18 10:28  0%                     ` Ferruh Yigit
  1 sibling, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2021-06-17 17:05 UTC (permalink / raw)
  To: Yigit, Ferruh, Morten Brørup, Thomas Monjalon, Richardson, Bruce
  Cc: dev, olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
	jerinj, gakhil


 
> On 6/17/2021 4:17 PM, Morten Brørup wrote:
> >> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> >> Sent: Thursday, 17 June 2021 16.59
> >>
> >>>>>>
> >>>>>> 14/06/2021 15:15, Bruce Richardson:
> >>>>>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
> >>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
> >> Monjalon
> >>>>>>>>> Sent: Monday, 14 June 2021 12.59
> >>>>>>>>>
> >>>>>>>>> Performance of access in a fixed-size array is very good
> >>>>>>>>> because of cache locality
> >>>>>>>>> and because there is a single pointer to dereference.
> >>>>>>>>> The only drawback is the lack of flexibility:
> >>>>>>>>> the size of such an array cannot be increase at runtime.
> >>>>>>>>>
> >>>>>>>>> An approach to this problem is to allocate the array at
> >> runtime,
> >>>>>>>>> being as efficient as static arrays, but still limited to a
> >> maximum.
> >>>>>>>>>
> >>>>>>>>> That's why the API rte_parray is introduced,
> >>>>>>>>> allowing to declare an array of pointer which can be resized
> >>>>>>>>> dynamically
> >>>>>>>>> and automatically at runtime while keeping a good read
> >> performance.
> >>>>>>>>>
> >>>>>>>>> After resize, the previous array is kept until the next resize
> >>>>>>>>> to avoid crashs during a read without any lock.
> >>>>>>>>>
> >>>>>>>>> Each element is a pointer to a memory chunk dynamically
> >> allocated.
> >>>>>>>>> This is not good for cache locality but it allows to keep the
> >> same
> >>>>>>>>> memory per element, no matter how the array is resized.
> >>>>>>>>> Cache locality could be improved with mempools.
> >>>>>>>>> The other drawback is having to dereference one more pointer
> >>>>>>>>> to read an element.
> >>>>>>>>>
> >>>>>>>>> There is not much locks, so the API is for internal use only.
> >>>>>>>>> This API may be used to completely remove some compilation-
> >> time
> >>>>>>>>> maximums.
> >>>>>>>>
> >>>>>>>> I get the purpose and overall intention of this library.
> >>>>>>>>
> >>>>>>>> I probably already mentioned that I prefer "embedded style
> >> programming" with fixed size arrays, rather than runtime
> >> configurability.
> >>>>> It's
> >>>>>> my personal opinion, and the DPDK Tech Board clearly prefers
> >> reducing the amount of compile time configurability, so there is no way
> >>> for
> >>>>>> me to stop this progress, and I do not intend to oppose to this
> >> library. :-)
> >>>>>>>>
> >>>>>>>> This library is likely to become a core library of DPDK, so I
> >> think it is important getting it right. Could you please mention a few
> >>>>> examples
> >>>>>> where you think this internal library should be used, and where
> >> it should not be used. Then it is easier to discuss if the border line
> >>> between
> >>>>>> control path and data plane is correct. E.g. this library is not
> >> intended to be used for dynamically sized packet queues that grow and
> >>> shrink
> >>>>> in
> >>>>>> the fast path.
> >>>>>>>>
> >>>>>>>> If the library becomes a core DPDK library, it should probably
> >> be public instead of internal. E.g. if the library is used to make
> >>>>>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some
> >> applications might also need dynamically sized arrays for their
> >>>>>> application specific per-port runtime data, and this library
> >> could serve that purpose too.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Thanks Thomas for starting this discussion and Morten for
> >> follow-up.
> >>>>>>>
> >>>>>>> My thinking is as follows, and I'm particularly keeping in mind
> >> the cases
> >>>>>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
> >>>>>>>
> >>>>>>> While I dislike the hard-coded limits in DPDK, I'm also not
> >> convinced that
> >>>>>>> we should switch away from the flat arrays or that we need fully
> >> dynamic
> >>>>>>> arrays that grow/shrink at runtime for ethdevs. I would suggest
> >> a half-way
> >>>>>>> house here, where we keep the ethdevs as an array, but one
> >> allocated/sized
> >>>>>>> at runtime rather than statically. This would allow us to have a
> >>>>>>> compile-time default value, but, for use cases that need it,
> >> allow use of a
> >>>>>>> flag e.g.  "max-ethdevs" to change the size of the parameter
> >> given to the
> >>>>>>> malloc call for the array.  This max limit could then be
> >> provided to apps
> >>>>>>> too if they want to match any array sizes. [Alternatively those
> >> apps could
> >>>>>>> check the provided size and error out if the size has been
> >> increased beyond
> >>>>>>> what the app is designed to use?]. There would be no extra
> >> dereferences per
> >>>>>>> rx/tx burst call in this scenario so performance should be the
> >> same as
> >>>>>>> before (potentially better if array is in hugepage memory, I
> >> suppose).
> >>>>>>
> >>>>>> I think we need some benchmarks to decide what is the best
> >> tradeoff.
> >>>>>> I spent time on this implementation, but sorry I won't have time
> >> for benchmarks.
> >>>>>> Volunteers?
> >>>>>
> >>>>> I had only a quick look at your approach so far.
> >>>>> But from what I can read, in MT environment your suggestion will
> >> require
> >>>>> extra synchronization for each read-write access to such parray
> >> element (lock, rcu, ...).
> >>>>> I think what Bruce suggests will be much ligther, easier to
> >> implement and less error prone.
> >>>>> At least for rte_ethdevs[] and friends.
> >>>>> Konstantin
> >>>>
> >>>> One more thought here - if we are talking about rte_ethdev[] in
> >> particular, I think  we can:
> >>>> 1. move public function pointers (rx_pkt_burst(), etc.) from
> >> rte_ethdev into a separate flat array.
> >>>> We can keep it public to still use inline functions for 'fast'
> >> calls rte_eth_rx_burst(), etc. to avoid
> >>>> any regressions.
> >>>> That could still be flat array with max_size specified at
> >> application startup.
> >>>> 2. Hide rest of rte_ethdev struct in .c.
> >>>> That will allow us to change the struct itself and the whole
> >> rte_ethdev[] table in a way we like
> >>>> (flat array, vector, hash, linked list) without ABI/API breakages.
> >>>>
> >>>> Yes, it would require all PMDs to change prototype for
> >> pkt_rx_burst() function
> >>>> (to accept port_id, queue_id instead of queue pointer), but the
> >> change is mechanical one.
> >>>> Probably some macro can be provided to simplify it.
> >>>>
> >>>
> >>> We are already planning some tasks for ABI stability for v21.11, I
> >> think
> >>> splitting 'struct rte_eth_dev' can be part of that task, it enables
> >> hiding more
> >>> internal data.
> >>
> >> Ok, sounds good.
> >>
> >>>
> >>>> The only significant complication I can foresee with implementing
> >> that approach -
> >>>> we'll need a an array of 'fast' function pointers per queue, not
> >> per device as we have now
> >>>> (to avoid extra indirection for callback implementation).
> >>>> Though as a bonus we'll have ability to use different RX/TX
> >> funcions per queue.
> >>>>
> >>>
> >>> What do you think split Rx/Tx callback into its own struct too?
> >>>
> >>> Overall 'rte_eth_dev' can be split into three as:
> >>> 1. rte_eth_dev
> >>> 2. rte_eth_dev_burst
> >>> 3. rte_eth_dev_cb
> >>>
> >>> And we can hide 1 from applications even with the inline functions.
> >>
> >> As discussed off-line, I think:
> >> it is possible.
> >> My absolute preference would be to have just 1/2 (with CB hidden).
> >> But even with 1/2/3 in place I think it would be  a good step forward.
> >> Probably worth to start with 1/2/3 first and then see how difficult it
> >> would be to switch to 1/2.
> >> Do you plan to start working on it?
> >>
> >> Konstantin
> >
> > If you do proceed with this, be very careful. E.g. the inlined rx/tx burst
> > functions should not touch more cache lines than they do today - especially
> > if there are many active ports. The inlined rx/tx burst functions are very
> > simple, so thorough code review (and possibly also of the resulting assembly)
> > is appropriate. Simple performance testing might not detect if more cache
> > lines are accessed than before the modifications.
> >
> > Don't get me wrong... I do consider this an improvement of the ethdev library; I'm only asking you to take extra care!
> >
> 
> ack
> 
> If we split as above, I think device specific data 'struct rte_eth_dev_data'
> should be part of 1 (rte_eth_dev). Which means Rx/Tx inline functions access
> additional cache line.
> 
> To prevent this, what about duplicating 'data' in 2 (rte_eth_dev_burst)? 

I think it would be better to change rx_pkt_burst() to accept port_id and queue_id
instead of void *, i.e.:
typedef uint16_t (*eth_rx_burst_t)(uint16_t port_id, uint16_t queue_id, struct rte_mbuf **rx_pkts, uint16_t nb_pkts);

The private rxq data would then be dereferenced inside the driver's rx function itself.
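
To make the idea concrete, here is a minimal sketch of what a driver-side rx
function could look like with that prototype. Only the typedef comes from the
proposal above; the demo_* names, the per-port/per-queue table and the queue
layout are hypothetical and exist only for illustration, not taken from any PMD.

```c
#include <stddef.h>
#include <stdint.h>

struct rte_mbuf;                 /* opaque here; the real type is in rte_mbuf.h */

/* Prototype as proposed in this thread. */
typedef uint16_t (*eth_rx_burst_t)(uint16_t port_id, uint16_t queue_id,
                                   struct rte_mbuf **rx_pkts, uint16_t nb_pkts);

/* Everything below is hypothetical driver-private state. */
#define DEMO_MAX_PORTS  4
#define DEMO_MAX_QUEUES 2

struct demo_rxq {
    struct rte_mbuf **pending;   /* packets waiting to be handed out */
    uint16_t avail;              /* how many are still waiting */
    uint16_t next;               /* index of the next one to hand out */
};

static struct demo_rxq demo_rxqs[DEMO_MAX_PORTS][DEMO_MAX_QUEUES];

/* The driver resolves (port_id, queue_id) to its private queue itself,
 * so the caller never needs a pointer into the hidden device struct. */
static uint16_t
demo_rx_burst(uint16_t port_id, uint16_t queue_id,
              struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
{
    struct demo_rxq *q;
    uint16_t n, i;

    if (port_id >= DEMO_MAX_PORTS || queue_id >= DEMO_MAX_QUEUES)
        return 0;

    q = &demo_rxqs[port_id][queue_id];
    n = q->avail < nb_pkts ? q->avail : nb_pkts;
    for (i = 0; i < n; i++)
        rx_pkts[i] = q->pending[q->next++];
    q->avail = (uint16_t)(q->avail - n);
    return n;
}
```

The cost of this shape is that the lookup from ids to queue state moves into
every PMD; whether that is cheaper than an extra cache line in the inline
wrapper is exactly what would need to be measured.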

> We have
> enough space for it to fit into single cache line, currently it is:
> struct rte_eth_dev {
>         eth_rx_burst_t             rx_pkt_burst;         /*     0     8 */
>         eth_tx_burst_t             tx_pkt_burst;         /*     8     8 */
>         eth_tx_prep_t              tx_pkt_prepare;       /*    16     8 */
>         eth_rx_queue_count_t       rx_queue_count;       /*    24     8 */
>         eth_rx_descriptor_done_t   rx_descriptor_done;   /*    32     8 */
>         eth_rx_descriptor_status_t rx_descriptor_status; /*    40     8 */
>         eth_tx_descriptor_status_t tx_descriptor_status; /*    48     8 */
>         struct rte_eth_dev_data *  data;                 /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
> 
> 'rx_descriptor_done' is deprecated and will be removed;


* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-17 16:12  0%                 ` Ferruh Yigit
@ 2021-06-17 16:55  0%                   ` Morten Brørup
  2021-06-18 10:21  0%                     ` Ferruh Yigit
  2021-06-17 17:05  0%                   ` Ananyev, Konstantin
  1 sibling, 1 reply; 200+ results
From: Morten Brørup @ 2021-06-17 16:55 UTC (permalink / raw)
  To: Ferruh Yigit, Ananyev, Konstantin, Thomas Monjalon, Richardson, Bruce
  Cc: dev, olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
	jerinj, gakhil

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ferruh Yigit
> Sent: Thursday, 17 June 2021 18.13
> 
> On 6/17/2021 4:17 PM, Morten Brørup wrote:
> >> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> >> Sent: Thursday, 17 June 2021 16.59
> >>
> >>>>>>
> >>>>>> 14/06/2021 15:15, Bruce Richardson:
> >>>>>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
> >>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
> >> Monjalon
> >>>>>>>>> Sent: Monday, 14 June 2021 12.59
> >>>>>>>>>
> >>>>>>>>> Performance of access in a fixed-size array is very good
> >>>>>>>>> because of cache locality
> >>>>>>>>> and because there is a single pointer to dereference.
> >>>>>>>>> The only drawback is the lack of flexibility:
> >>>>>>>>> the size of such an array cannot be increase at runtime.
> >>>>>>>>>
> >>>>>>>>> An approach to this problem is to allocate the array at
> >> runtime,
> >>>>>>>>> being as efficient as static arrays, but still limited to a
> >> maximum.
> >>>>>>>>>
> >>>>>>>>> That's why the API rte_parray is introduced,
> >>>>>>>>> allowing to declare an array of pointer which can be resized
> >>>>>>>>> dynamically
> >>>>>>>>> and automatically at runtime while keeping a good read
> >> performance.
> >>>>>>>>>
> >>>>>>>>> After resize, the previous array is kept until the next
> resize
> >>>>>>>>> to avoid crashs during a read without any lock.
> >>>>>>>>>
> >>>>>>>>> Each element is a pointer to a memory chunk dynamically
> >> allocated.
> >>>>>>>>> This is not good for cache locality but it allows to keep the
> >> same
> >>>>>>>>> memory per element, no matter how the array is resized.
> >>>>>>>>> Cache locality could be improved with mempools.
> >>>>>>>>> The other drawback is having to dereference one more pointer
> >>>>>>>>> to read an element.
> >>>>>>>>>
> >>>>>>>>> There is not much locks, so the API is for internal use only.
> >>>>>>>>> This API may be used to completely remove some compilation-
> >> time
> >>>>>>>>> maximums.
> >>>>>>>>
> >>>>>>>> I get the purpose and overall intention of this library.
> >>>>>>>>
> >>>>>>>> I probably already mentioned that I prefer "embedded style
> >> programming" with fixed size arrays, rather than runtime
> >> configurability.
> >>>>> It's
> >>>>>> my personal opinion, and the DPDK Tech Board clearly prefers
> >> reducing the amount of compile time configurability, so there is no
> way
> >>> for
> >>>>>> me to stop this progress, and I do not intend to oppose to this
> >> library. :-)
> >>>>>>>>
> >>>>>>>> This library is likely to become a core library of DPDK, so I
> >> think it is important getting it right. Could you please mention a
> few
> >>>>> examples
> >>>>>> where you think this internal library should be used, and where
> >> it should not be used. Then it is easier to discuss if the border
> line
> >>> between
> >>>>>> control path and data plane is correct. E.g. this library is not
> >> intended to be used for dynamically sized packet queues that grow
> and
> >>> shrink
> >>>>> in
> >>>>>> the fast path.
> >>>>>>>>
> >>>>>>>> If the library becomes a core DPDK library, it should probably
> >> be public instead of internal. E.g. if the library is used to make
> >>>>>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then
> some
> >> applications might also need dynamically sized arrays for their
> >>>>>> application specific per-port runtime data, and this library
> >> could serve that purpose too.
> >>>>>>>>
> >>>>>>>
> >>>>>>> Thanks Thomas for starting this discussion and Morten for
> >> follow-up.
> >>>>>>>
> >>>>>>> My thinking is as follows, and I'm particularly keeping in mind
> >> the cases
> >>>>>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
> >>>>>>>
> >>>>>>> While I dislike the hard-coded limits in DPDK, I'm also not
> >> convinced that
> >>>>>>> we should switch away from the flat arrays or that we need
> fully
> >> dynamic
> >>>>>>> arrays that grow/shrink at runtime for ethdevs. I would suggest
> >> a half-way
> >>>>>>> house here, where we keep the ethdevs as an array, but one
> >> allocated/sized
> >>>>>>> at runtime rather than statically. This would allow us to have
> a
> >>>>>>> compile-time default value, but, for use cases that need it,
> >> allow use of a
> >>>>>>> flag e.g.  "max-ethdevs" to change the size of the parameter
> >> given to the
> >>>>>>> malloc call for the array.  This max limit could then be
> >> provided to apps
> >>>>>>> too if they want to match any array sizes. [Alternatively those
> >> apps could
> >>>>>>> check the provided size and error out if the size has been
> >> increased beyond
> >>>>>>> what the app is designed to use?]. There would be no extra
> >> dereferences per
> >>>>>>> rx/tx burst call in this scenario so performance should be the
> >> same as
> >>>>>>> before (potentially better if array is in hugepage memory, I
> >> suppose).
> >>>>>>
> >>>>>> I think we need some benchmarks to decide what is the best
> >> tradeoff.
> >>>>>> I spent time on this implementation, but sorry I won't have time
> >> for benchmarks.
> >>>>>> Volunteers?
> >>>>>
> >>>>> I had only a quick look at your approach so far.
> >>>>> But from what I can read, in MT environment your suggestion will
> >> require
> >>>>> extra synchronization for each read-write access to such parray
> >> element (lock, rcu, ...).
> >>>>> I think what Bruce suggests will be much ligther, easier to
> >> implement and less error prone.
> >>>>> At least for rte_ethdevs[] and friends.
> >>>>> Konstantin
> >>>>
> >>>> One more thought here - if we are talking about rte_ethdev[] in
> >> particular, I think  we can:
> >>>> 1. move public function pointers (rx_pkt_burst(), etc.) from
> >> rte_ethdev into a separate flat array.
> >>>> We can keep it public to still use inline functions for 'fast'
> >> calls rte_eth_rx_burst(), etc. to avoid
> >>>> any regressions.
> >>>> That could still be flat array with max_size specified at
> >> application startup.
> >>>> 2. Hide rest of rte_ethdev struct in .c.
> >>>> That will allow us to change the struct itself and the whole
> >> rte_ethdev[] table in a way we like
> >>>> (flat array, vector, hash, linked list) without ABI/API breakages.
> >>>>
> >>>> Yes, it would require all PMDs to change prototype for
> >> pkt_rx_burst() function
> >>>> (to accept port_id, queue_id instead of queue pointer), but the
> >> change is mechanical one.
> >>>> Probably some macro can be provided to simplify it.
> >>>>
> >>>
> >>> We are already planning some tasks for ABI stability for v21.11, I
> >> think
> >>> splitting 'struct rte_eth_dev' can be part of that task, it enables
> >> hiding more
> >>> internal data.
> >>
> >> Ok, sounds good.
> >>
> >>>
> >>>> The only significant complication I can foresee with implementing
> >> that approach -
> >>>> we'll need a an array of 'fast' function pointers per queue, not
> >> per device as we have now
> >>>> (to avoid extra indirection for callback implementation).
> >>>> Though as a bonus we'll have ability to use different RX/TX
> >> funcions per queue.
> >>>>
> >>>
> >>> What do you think split Rx/Tx callback into its own struct too?
> >>>
> >>> Overall 'rte_eth_dev' can be split into three as:
> >>> 1. rte_eth_dev
> >>> 2. rte_eth_dev_burst
> >>> 3. rte_eth_dev_cb
> >>>
> >>> And we can hide 1 from applications even with the inline functions.
> >>
> >> As discussed off-line, I think:
> >> it is possible.
> >> My absolute preference would be to have just 1/2 (with CB hidden).
> >> But even with 1/2/3 in place I think it would be a good step forward.
> >> Probably worth to start with 1/2/3 first and then see how difficult it
> >> would be to switch to 1/2.
> >> Do you plan to start working on it?
> >>
> >> Konstantin
> >
> > If you do proceed with this, be very careful. E.g. the inlined rx/tx
> > burst functions should not touch more cache lines than they do today -
> > especially if there are many active ports. The inlined rx/tx burst
> > functions are very simple, so thorough code review (and possibly also
> > of the resulting assembly) is appropriate. Simple performance testing
> > might not detect if more cache lines are accessed than before the
> > modifications.
> >
> > Don't get me wrong... I do consider this an improvement of the ethdev
> > library; I'm only asking you to take extra care!
> >
> 
> ack
> 
> If we split as above, I think device specific data 'struct rte_eth_dev_data'
> should be part of 1 (rte_eth_dev). Which means Rx/Tx inline functions access
> additional cache line.
>
> To prevent this, what about duplicating 'data' in 2 (rte_eth_dev_burst)? We have
> enough space for it to fit into single cache line, currently it is:
> struct rte_eth_dev {
>         eth_rx_burst_t             rx_pkt_burst;         /*     0     8 */
>         eth_tx_burst_t             tx_pkt_burst;         /*     8     8 */
>         eth_tx_prep_t              tx_pkt_prepare;       /*    16     8 */
>         eth_rx_queue_count_t       rx_queue_count;       /*    24     8 */
>         eth_rx_descriptor_done_t   rx_descriptor_done;   /*    32     8 */
>         eth_rx_descriptor_status_t rx_descriptor_status; /*    40     8 */
>         eth_tx_descriptor_status_t tx_descriptor_status; /*    48     8 */
>         struct rte_eth_dev_data *  data;                 /*    56     8 */
>         /* --- cacheline 1 boundary (64 bytes) --- */
> 
> 'rx_descriptor_done' is deprecated and will be removed;

Makes sense.

Also consider moving 'data' to the top of the new struct, so there is room to add future functions below it. (Once 'rx_descriptor_done' has been removed, one new function pointer can be added without growing the struct beyond a single cache line.)
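
A rough sketch of that layout, with compile-time checks so a later addition
cannot silently spill into a second cache line. The struct name, the generic
demo_fn_t stand-in for the real eth_*_t typedefs, the 'reserved' slot and the
64-byte cache line size are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_CACHE_LINE 64            /* assumption: 64-byte cache lines */

struct rte_eth_dev_data;              /* opaque for this sketch */
typedef void (*demo_fn_t)(void);      /* stand-in for the eth_*_t typedefs */

struct demo_eth_dev_burst {
    struct rte_eth_dev_data *data;    /* moved to the top, as suggested */
    demo_fn_t rx_pkt_burst;
    demo_fn_t tx_pkt_burst;
    demo_fn_t tx_pkt_prepare;
    demo_fn_t rx_queue_count;
    demo_fn_t rx_descriptor_status;
    demo_fn_t tx_descriptor_status;
    demo_fn_t reserved;               /* slot freed by removing rx_descriptor_done */
};

/* 'data' stays at offset 0 and the whole struct fits in one cache line. */
static_assert(offsetof(struct demo_eth_dev_burst, data) == 0,
              "data must be the first member");
static_assert(sizeof(struct demo_eth_dev_burst) <= DEMO_CACHE_LINE,
              "fast-path struct must fit in one cache line");
```

With 8-byte pointers this is eight slots, i.e. exactly 64 bytes, so any
further function pointer beyond the reserved one would trip the second check.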



* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-06-14 16:36  4%                   ` Andrew Rybchenko
@ 2021-06-17 16:29  0%                     ` Ferruh Yigit
  0 siblings, 0 replies; 200+ results
From: Ferruh Yigit @ 2021-06-17 16:29 UTC (permalink / raw)
  To: Andrew Rybchenko, Olivier Matz, Gregory Etelson
  Cc: Iremonger, Bernard, Morten Brørup, dev, Matan Azrad,
	Ori Kam, Raslan Darawsheh, Asaf Penso, Thomas Monjalon

On 6/14/2021 5:36 PM, Andrew Rybchenko wrote:
> On 6/10/21 12:22 PM, Olivier Matz wrote:
>> Hi Gregory,
>>
>> On Thu, Jun 10, 2021 at 04:10:25AM +0000, Gregory Etelson wrote:
>>> Hello,
>>>
>>> There was no activity that patch for a long time.
>>> The patch is marked as failed, but we verified failed tests and concluded
>>> that the failures can be ignored.
>>> https://patchwork.dpdk.org/project/dpdk/patch/20210527152858.13312-1-getelson@nvidia.com/
>>>
>>> How should I proceed with this case ?
>>> Please advise.
>>>
>>
>> I like the idea of this patch: to me it is more convenient to access to
>> these fields with a bitfield. I don't see a problem about using
>> bitfields here, glibc or FreeBSD netinet/ip.h are doing the same.
>>
>> However, as stated previously, this patch breaks the initialization API.
> 
> Very good point. I guess we overlooked it in a number of patches
> with fix RTE flow API items to start from corresponding network
> headers. We used unions there to avoid ABI breakage, but it looks
> like we have broken initialization API anyway.
> 

Hi Andrew,

What is broken by the flow API item updates? Can you please give an example?

> We should decide if initialization ABI breakage is a show-stopper
> for RTE flow API items switching to use network protocol headers.
> 
>> The DPDK ABI/API policy is described here:
>> http://doc.dpdk.org/guides/contributing/abi_policy.html#the-dpdk-abi-policy
>>
>>> From this document:
>>
>>    The API should only be changed for significant reasons, such as
>>    performance enhancements. API breakages due to changes such as
>>    reorganizing public structure fields for aesthetic or readability
>>    purposes should be avoided.
>>
>> So to follow the project policy, I think we should reject this path.
>>
>> Regards,
>> Olivier
>>
> 



* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-17 15:17  0%               ` Morten Brørup
@ 2021-06-17 16:12  0%                 ` Ferruh Yigit
  2021-06-17 16:55  0%                   ` Morten Brørup
  2021-06-17 17:05  0%                   ` Ananyev, Konstantin
  0 siblings, 2 replies; 200+ results
From: Ferruh Yigit @ 2021-06-17 16:12 UTC (permalink / raw)
  To: Morten Brørup, Ananyev, Konstantin, Thomas Monjalon,
	Richardson, Bruce
  Cc: dev, olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
	jerinj, gakhil

On 6/17/2021 4:17 PM, Morten Brørup wrote:
>> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
>> Sent: Thursday, 17 June 2021 16.59
>>
>>>>>>
>>>>>> 14/06/2021 15:15, Bruce Richardson:
>>>>>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
>>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
>> Monjalon
>>>>>>>>> Sent: Monday, 14 June 2021 12.59
>>>>>>>>>
>>>>>>>>> Performance of access in a fixed-size array is very good
>>>>>>>>> because of cache locality
>>>>>>>>> and because there is a single pointer to dereference.
>>>>>>>>> The only drawback is the lack of flexibility:
>>>>>>>>> the size of such an array cannot be increase at runtime.
>>>>>>>>>
>>>>>>>>> An approach to this problem is to allocate the array at
>> runtime,
>>>>>>>>> being as efficient as static arrays, but still limited to a
>> maximum.
>>>>>>>>>
>>>>>>>>> That's why the API rte_parray is introduced,
>>>>>>>>> allowing to declare an array of pointer which can be resized
>>>>>>>>> dynamically
>>>>>>>>> and automatically at runtime while keeping a good read
>> performance.
>>>>>>>>>
>>>>>>>>> After resize, the previous array is kept until the next resize
>>>>>>>>> to avoid crashs during a read without any lock.
>>>>>>>>>
>>>>>>>>> Each element is a pointer to a memory chunk dynamically
>> allocated.
>>>>>>>>> This is not good for cache locality but it allows to keep the
>> same
>>>>>>>>> memory per element, no matter how the array is resized.
>>>>>>>>> Cache locality could be improved with mempools.
>>>>>>>>> The other drawback is having to dereference one more pointer
>>>>>>>>> to read an element.
>>>>>>>>>
>>>>>>>>> There is not much locks, so the API is for internal use only.
>>>>>>>>> This API may be used to completely remove some compilation-
>> time
>>>>>>>>> maximums.
>>>>>>>>
>>>>>>>> I get the purpose and overall intention of this library.
>>>>>>>>
>>>>>>>> I probably already mentioned that I prefer "embedded style
>> programming" with fixed size arrays, rather than runtime
>> configurability.
>>>>> It's
>>>>>> my personal opinion, and the DPDK Tech Board clearly prefers
>> reducing the amount of compile time configurability, so there is no way
>>> for
>>>>>> me to stop this progress, and I do not intend to oppose to this
>> library. :-)
>>>>>>>>
>>>>>>>> This library is likely to become a core library of DPDK, so I
>> think it is important getting it right. Could you please mention a few
>>>>> examples
>>>>>> where you think this internal library should be used, and where
>> it should not be used. Then it is easier to discuss if the border line
>>> between
>>>>>> control path and data plane is correct. E.g. this library is not
>> intended to be used for dynamically sized packet queues that grow and
>>> shrink
>>>>> in
>>>>>> the fast path.
>>>>>>>>
>>>>>>>> If the library becomes a core DPDK library, it should probably
>> be public instead of internal. E.g. if the library is used to make
>>>>>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some
>> applications might also need dynamically sized arrays for their
>>>>>> application specific per-port runtime data, and this library
>> could serve that purpose too.
>>>>>>>>
>>>>>>>
>>>>>>> Thanks Thomas for starting this discussion and Morten for
>> follow-up.
>>>>>>>
>>>>>>> My thinking is as follows, and I'm particularly keeping in mind
>> the cases
>>>>>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
>>>>>>>
>>>>>>> While I dislike the hard-coded limits in DPDK, I'm also not
>> convinced that
>>>>>>> we should switch away from the flat arrays or that we need fully
>> dynamic
>>>>>>> arrays that grow/shrink at runtime for ethdevs. I would suggest
>> a half-way
>>>>>>> house here, where we keep the ethdevs as an array, but one
>> allocated/sized
>>>>>>> at runtime rather than statically. This would allow us to have a
>>>>>>> compile-time default value, but, for use cases that need it,
>> allow use of a
>>>>>>> flag e.g.  "max-ethdevs" to change the size of the parameter
>> given to the
>>>>>>> malloc call for the array.  This max limit could then be
>> provided to apps
>>>>>>> too if they want to match any array sizes. [Alternatively those
>> apps could
>>>>>>> check the provided size and error out if the size has been
>> increased beyond
>>>>>>> what the app is designed to use?]. There would be no extra
>> dereferences per
>>>>>>> rx/tx burst call in this scenario so performance should be the
>> same as
>>>>>>> before (potentially better if array is in hugepage memory, I
>> suppose).
>>>>>>
>>>>>> I think we need some benchmarks to decide what is the best
>> tradeoff.
>>>>>> I spent time on this implementation, but sorry I won't have time
>> for benchmarks.
>>>>>> Volunteers?
>>>>>
>>>>> I had only a quick look at your approach so far.
>>>>> But from what I can read, in MT environment your suggestion will
>> require
>>>>> extra synchronization for each read-write access to such parray
>> element (lock, rcu, ...).
>>>>> I think what Bruce suggests will be much ligther, easier to
>> implement and less error prone.
>>>>> At least for rte_ethdevs[] and friends.
>>>>> Konstantin
>>>>
>>>> One more thought here - if we are talking about rte_ethdev[] in
>> particular, I think  we can:
>>>> 1. move public function pointers (rx_pkt_burst(), etc.) from
>> rte_ethdev into a separate flat array.
>>>> We can keep it public to still use inline functions for 'fast'
>> calls rte_eth_rx_burst(), etc. to avoid
>>>> any regressions.
>>>> That could still be flat array with max_size specified at
>> application startup.
>>>> 2. Hide rest of rte_ethdev struct in .c.
>>>> That will allow us to change the struct itself and the whole
>> rte_ethdev[] table in a way we like
>>>> (flat array, vector, hash, linked list) without ABI/API breakages.
>>>>
>>>> Yes, it would require all PMDs to change prototype for
>> pkt_rx_burst() function
>>>> (to accept port_id, queue_id instead of queue pointer), but the
>> change is mechanical one.
>>>> Probably some macro can be provided to simplify it.
>>>>
>>>
>>> We are already planning some tasks for ABI stability for v21.11, I
>> think
>>> splitting 'struct rte_eth_dev' can be part of that task, it enables
>> hiding more
>>> internal data.
>>
>> Ok, sounds good.
>>
>>>
>>>> The only significant complication I can foresee with implementing
>> that approach -
>>>> we'll need a an array of 'fast' function pointers per queue, not
>> per device as we have now
>>>> (to avoid extra indirection for callback implementation).
>>>> Though as a bonus we'll have ability to use different RX/TX
>> funcions per queue.
>>>>
>>>
>>> What do you think split Rx/Tx callback into its own struct too?
>>>
>>> Overall 'rte_eth_dev' can be split into three as:
>>> 1. rte_eth_dev
>>> 2. rte_eth_dev_burst
>>> 3. rte_eth_dev_cb
>>>
>>> And we can hide 1 from applications even with the inline functions.
>>
>> As discussed off-line, I think:
>> it is possible.
>> My absolute preference would be to have just 1/2 (with CB hidden).
>> But even with 1/2/3 in place I think it would be  a good step forward.
>> Probably worth to start with 1/2/3 first and then see how difficult it
>> would be to switch to 1/2.
>> Do you plan to start working on it?
>>
>> Konstantin
> 
> If you do proceed with this, be very careful. E.g. the inlined rx/tx burst functions should not touch more cache lines than they do today - especially if there are many active ports. The inlined rx/tx burst functions are very simple, so thorough code review (and possibly also of the resulting assembly) is appropriate. Simple performance testing might not detect if more cache lines are accessed than before the modifications.
> 
> Don't get me wrong... I do consider this an improvement of the ethdev library; I'm only asking you to take extra care!
> 

ack

If we split as above, I think the device-specific data, 'struct rte_eth_dev_data',
should be part of 1 (rte_eth_dev), which means the Rx/Tx inline functions would
access an additional cache line.

To prevent this, what about duplicating the 'data' pointer in 2 (rte_eth_dev_burst)?
There is enough space for it to fit into a single cache line; the current layout is:
struct rte_eth_dev {
        eth_rx_burst_t             rx_pkt_burst;         /*     0     8 */
        eth_tx_burst_t             tx_pkt_burst;         /*     8     8 */
        eth_tx_prep_t              tx_pkt_prepare;       /*    16     8 */
        eth_rx_queue_count_t       rx_queue_count;       /*    24     8 */
        eth_rx_descriptor_done_t   rx_descriptor_done;   /*    32     8 */
        eth_rx_descriptor_status_t rx_descriptor_status; /*    40     8 */
        eth_tx_descriptor_status_t tx_descriptor_status; /*    48     8 */
        struct rte_eth_dev_data *  data;                 /*    56     8 */
        /* --- cacheline 1 boundary (64 bytes) --- */

'rx_descriptor_done' is deprecated and will be removed;
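
A minimal sketch of the duplication idea, under stated assumptions: the
field names follow the layout above, but the two function pointer typedefs
are simplified stand-ins rather than the real ethdev prototypes, and the
64-byte cache line size is assumed. The compile-time check is one way to
catch the struct growing past a single cache line.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumed cache line size */

/* Simplified stand-ins for the real ethdev typedefs (hypothetical). */
typedef uint16_t (*burst_fn_t)(void *queue, void **pkts, uint16_t nb_pkts);
typedef int (*queue_fn_t)(void *queue, uint16_t offset);

struct rte_eth_dev_data; /* opaque here; only the pointer is stored */

/* Hypothetical layout of 2 (rte_eth_dev_burst) with 'data' duplicated. */
struct rte_eth_dev_burst {
	burst_fn_t rx_pkt_burst;
	burst_fn_t tx_pkt_burst;
	burst_fn_t tx_pkt_prepare;
	queue_fn_t rx_queue_count;
	queue_fn_t rx_descriptor_done;   /* deprecated in the thread */
	queue_fn_t rx_descriptor_status;
	queue_fn_t tx_descriptor_status;
	struct rte_eth_dev_data *data;   /* copy of the pointer kept in 1 */
};

/* Fails to compile if the fast-path struct outgrows one cache line. */
_Static_assert(sizeof(struct rte_eth_dev_burst) <= CACHE_LINE_SIZE,
	       "rte_eth_dev_burst must fit in a single cache line");

/* Offset of the last byte of 'data' inside the struct. */
static inline size_t
burst_data_last_byte(void)
{
	return offsetof(struct rte_eth_dev_burst, data) +
	       sizeof(struct rte_eth_dev_data *) - 1;
}
```

With 8-byte pointers the eight members fill the 64 bytes exactly, so any
new member would trip the assertion unless a deprecated slot is reused.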

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-17 14:58  0%             ` Ananyev, Konstantin
  2021-06-17 15:17  0%               ` Morten Brørup
@ 2021-06-17 15:44  3%               ` Ferruh Yigit
  2021-06-18 10:41  0%                 ` Ananyev, Konstantin
  1 sibling, 1 reply; 200+ results
From: Ferruh Yigit @ 2021-06-17 15:44 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil

On 6/17/2021 3:58 PM, Ananyev, Konstantin wrote:
> 
> 
>>>>>
>>>>> 14/06/2021 15:15, Bruce Richardson:
>>>>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
>>>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
>>>>>>>> Sent: Monday, 14 June 2021 12.59
>>>>>>>>
>>>>>>>> Performance of access in a fixed-size array is very good
>>>>>>>> because of cache locality
>>>>>>>> and because there is a single pointer to dereference.
>>>>>>>> The only drawback is the lack of flexibility:
>>>>>>>> the size of such an array cannot be increase at runtime.
>>>>>>>>
>>>>>>>> An approach to this problem is to allocate the array at runtime,
>>>>>>>> being as efficient as static arrays, but still limited to a maximum.
>>>>>>>>
>>>>>>>> That's why the API rte_parray is introduced,
>>>>>>>> allowing to declare an array of pointer which can be resized
>>>>>>>> dynamically
>>>>>>>> and automatically at runtime while keeping a good read performance.
>>>>>>>>
>>>>>>>> After resize, the previous array is kept until the next resize
>>>>>>>> to avoid crashs during a read without any lock.
>>>>>>>>
>>>>>>>> Each element is a pointer to a memory chunk dynamically allocated.
>>>>>>>> This is not good for cache locality but it allows to keep the same
>>>>>>>> memory per element, no matter how the array is resized.
>>>>>>>> Cache locality could be improved with mempools.
>>>>>>>> The other drawback is having to dereference one more pointer
>>>>>>>> to read an element.
>>>>>>>>
>>>>>>>> There is not much locks, so the API is for internal use only.
>>>>>>>> This API may be used to completely remove some compilation-time
>>>>>>>> maximums.
>>>>>>>
>>>>>>> I get the purpose and overall intention of this library.
>>>>>>>
>>>>>>> I probably already mentioned that I prefer "embedded style programming" with fixed size arrays, rather than runtime configurability.
>>>> It's
>>>>> my personal opinion, and the DPDK Tech Board clearly prefers reducing the amount of compile time configurability, so there is no way
>> for
>>>>> me to stop this progress, and I do not intend to oppose to this library. :-)
>>>>>>>
>>>>>>> This library is likely to become a core library of DPDK, so I think it is important getting it right. Could you please mention a few
>>>> examples
>>>>> where you think this internal library should be used, and where it should not be used. Then it is easier to discuss if the border line
>> between
>>>>> control path and data plane is correct. E.g. this library is not intended to be used for dynamically sized packet queues that grow and
>> shrink
>>>> in
>>>>> the fast path.
>>>>>>>
>>>>>>> If the library becomes a core DPDK library, it should probably be public instead of internal. E.g. if the library is used to make
>>>>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some applications might also need dynamically sized arrays for their
>>>>> application specific per-port runtime data, and this library could serve that purpose too.
>>>>>>>
>>>>>>
>>>>>> Thanks Thomas for starting this discussion and Morten for follow-up.
>>>>>>
>>>>>> My thinking is as follows, and I'm particularly keeping in mind the cases
>>>>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
>>>>>>
>>>>>> While I dislike the hard-coded limits in DPDK, I'm also not convinced that
>>>>>> we should switch away from the flat arrays or that we need fully dynamic
>>>>>> arrays that grow/shrink at runtime for ethdevs. I would suggest a half-way
>>>>>> house here, where we keep the ethdevs as an array, but one allocated/sized
>>>>>> at runtime rather than statically. This would allow us to have a
>>>>>> compile-time default value, but, for use cases that need it, allow use of a
>>>>>> flag e.g.  "max-ethdevs" to change the size of the parameter given to the
>>>>>> malloc call for the array.  This max limit could then be provided to apps
>>>>>> too if they want to match any array sizes. [Alternatively those apps could
>>>>>> check the provided size and error out if the size has been increased beyond
>>>>>> what the app is designed to use?]. There would be no extra dereferences per
>>>>>> rx/tx burst call in this scenario so performance should be the same as
>>>>>> before (potentially better if array is in hugepage memory, I suppose).
>>>>>
>>>>> I think we need some benchmarks to decide what is the best tradeoff.
>>>>> I spent time on this implementation, but sorry I won't have time for benchmarks.
>>>>> Volunteers?
>>>>
>>>> I had only a quick look at your approach so far.
>>>> But from what I can read, in MT environment your suggestion will require
>>>> extra synchronization for each read-write access to such parray element (lock, rcu, ...).
>>>> I think what Bruce suggests will be much ligther, easier to implement and less error prone.
>>>> At least for rte_ethdevs[] and friends.
>>>> Konstantin
>>>
>>> One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
>>> 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
>>> We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
>>> any regressions.
>>> That could still be flat array with max_size specified at application startup.
>>> 2. Hide rest of rte_ethdev struct in .c.
>>> That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
>>> (flat array, vector, hash, linked list) without ABI/API breakages.
>>>
>>> Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
>>> (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
>>> Probably some macro can be provided to simplify it.
>>>
>>
>> We are already planning some tasks for ABI stability for v21.11, I think
>> splitting 'struct rte_eth_dev' can be part of that task, it enables hiding more
>> internal data.
> 
> Ok, sounds good.
> 
>>
>>> The only significant complication I can foresee with implementing that approach -
>>> we'll need a an array of 'fast' function pointers per queue, not per device as we have now
>>> (to avoid extra indirection for callback implementation).
>>> Though as a bonus we'll have ability to use different RX/TX funcions per queue.
>>>
>>
>> What do you think split Rx/Tx callback into its own struct too?
>>
>> Overall 'rte_eth_dev' can be split into three as:
>> 1. rte_eth_dev
>> 2. rte_eth_dev_burst
>> 3. rte_eth_dev_cb
>>
>> And we can hide 1 from applications even with the inline functions.
> 
> As discussed off-line, I think:
> it is possible.
> My absolute preference would be to have just 1/2 (with CB hidden).

How can we hide the callbacks, since they are used by the inline burst functions?

> But even with 1/2/3 in place I think it would be  a good step forward.
> Probably worth to start with 1/2/3 first and then see how difficult it
> would be to switch to 1/2.

What do you mean by switch to 1/2?

If we keep the inline functions and split the struct into the three structs
above, we can only hide 1; 2 and 3 will still be visible to apps because of the
inline functions. This way we can hide more while keeping the same performance.
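
A rough sketch of why only 1 can be hidden, under assumptions: MAX_PORTS,
eth_dev_burst[], the rx_queues member and the sketch_* names are
hypothetical, and the table is static here only to keep the example
self-contained. The inline wrapper needs the full definition of
rte_eth_dev_burst, while rte_eth_dev is only forward-declared.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_PORTS 32 /* hypothetical stand-in for RTE_MAX_ETHPORTS */

struct rte_eth_dev; /* 1: declared only; the definition stays in a .c file */
struct rte_mbuf;    /* packet buffer, opaque in this sketch */

typedef uint16_t (*eth_rx_burst_t)(void *rxq, struct rte_mbuf **pkts,
				   uint16_t nb_pkts);

/* 2: must stay visible because the inline wrapper dereferences it. */
struct rte_eth_dev_burst {
	eth_rx_burst_t rx_pkt_burst;
	void **rx_queues; /* hypothetical: per-queue driver contexts */
};

/* Hypothetical public table; a library would export it instead. */
static struct rte_eth_dev_burst eth_dev_burst[MAX_PORTS];

static inline uint16_t
sketch_rx_burst(uint16_t port_id, uint16_t queue_id,
		struct rte_mbuf **pkts, uint16_t nb_pkts)
{
	const struct rte_eth_dev_burst *b = &eth_dev_burst[port_id];

	/* Only the public struct is touched; struct rte_eth_dev is not. */
	return b->rx_pkt_burst(b->rx_queues[queue_id], pkts, nb_pkts);
}

/* Dummy driver callback that pretends every requested packet arrived. */
static uint16_t
dummy_rx(void *rxq, struct rte_mbuf **pkts, uint16_t nb_pkts)
{
	(void)rxq;
	(void)pkts;
	return nb_pkts;
}

/* Usage: register the dummy callback on port 0 and poll queue 0. */
static inline uint16_t
sketch_selftest(void)
{
	static void *queues[1]; /* one queue, no driver context needed */

	eth_dev_burst[0].rx_pkt_burst = dummy_rx;
	eth_dev_burst[0].rx_queues = queues;
	return sketch_rx_burst(0, 0, NULL, 8);
}
```

Because applications compile sketch_rx_burst() into their own binaries,
the layout of struct rte_eth_dev_burst becomes part of the ABI, whereas
struct rte_eth_dev can change freely.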

> Do you plan to start working on it?
> 

We are gathering the list of tasks for ABI stability; most probably they
will be worked on during v21.11. I can take this one.

> Konstantin
> 
> 
> 
> 


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-17 14:58  0%             ` Ananyev, Konstantin
@ 2021-06-17 15:17  0%               ` Morten Brørup
  2021-06-17 16:12  0%                 ` Ferruh Yigit
  2021-06-17 15:44  3%               ` Ferruh Yigit
  1 sibling, 1 reply; 200+ results
From: Morten Brørup @ 2021-06-17 15:17 UTC (permalink / raw)
  To: Ananyev, Konstantin, Yigit, Ferruh, Thomas Monjalon, Richardson, Bruce
  Cc: dev, olivier.matz, andrew.rybchenko, honnappa.nagarahalli,
	jerinj, gakhil

> From: Ananyev, Konstantin [mailto:konstantin.ananyev@intel.com]
> Sent: Thursday, 17 June 2021 16.59
> 
> > >>>
> > >>> 14/06/2021 15:15, Bruce Richardson:
> > >>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
> > >>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
> Monjalon
> > >>>>>> Sent: Monday, 14 June 2021 12.59
> > >>>>>>
> > >>>>>> Performance of access in a fixed-size array is very good
> > >>>>>> because of cache locality
> > >>>>>> and because there is a single pointer to dereference.
> > >>>>>> The only drawback is the lack of flexibility:
> > >>>>>> the size of such an array cannot be increase at runtime.
> > >>>>>>
> > >>>>>> An approach to this problem is to allocate the array at
> runtime,
> > >>>>>> being as efficient as static arrays, but still limited to a
> maximum.
> > >>>>>>
> > >>>>>> That's why the API rte_parray is introduced,
> > >>>>>> allowing to declare an array of pointer which can be resized
> > >>>>>> dynamically
> > >>>>>> and automatically at runtime while keeping a good read
> performance.
> > >>>>>>
> > >>>>>> After resize, the previous array is kept until the next resize
> > >>>>>> to avoid crashs during a read without any lock.
> > >>>>>>
> > >>>>>> Each element is a pointer to a memory chunk dynamically
> allocated.
> > >>>>>> This is not good for cache locality but it allows to keep the
> same
> > >>>>>> memory per element, no matter how the array is resized.
> > >>>>>> Cache locality could be improved with mempools.
> > >>>>>> The other drawback is having to dereference one more pointer
> > >>>>>> to read an element.
> > >>>>>>
> > >>>>>> There is not much locks, so the API is for internal use only.
> > >>>>>> This API may be used to completely remove some compilation-
> time
> > >>>>>> maximums.
> > >>>>>
> > >>>>> I get the purpose and overall intention of this library.
> > >>>>>
> > >>>>> I probably already mentioned that I prefer "embedded style
> programming" with fixed size arrays, rather than runtime
> configurability.
> > >> It's
> > >>> my personal opinion, and the DPDK Tech Board clearly prefers
> reducing the amount of compile time configurability, so there is no way
> > for
> > >>> me to stop this progress, and I do not intend to oppose to this
> library. :-)
> > >>>>>
> > >>>>> This library is likely to become a core library of DPDK, so I
> think it is important getting it right. Could you please mention a few
> > >> examples
> > >>> where you think this internal library should be used, and where
> it should not be used. Then it is easier to discuss if the border line
> > between
> > >>> control path and data plane is correct. E.g. this library is not
> intended to be used for dynamically sized packet queues that grow and
> > shrink
> > >> in
> > >>> the fast path.
> > >>>>>
> > >>>>> If the library becomes a core DPDK library, it should probably
> be public instead of internal. E.g. if the library is used to make
> > >>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some
> applications might also need dynamically sized arrays for their
> > >>> application specific per-port runtime data, and this library
> could serve that purpose too.
> > >>>>>
> > >>>>
> > >>>> Thanks Thomas for starting this discussion and Morten for
> follow-up.
> > >>>>
> > >>>> My thinking is as follows, and I'm particularly keeping in mind
> the cases
> > >>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
> > >>>>
> > >>>> While I dislike the hard-coded limits in DPDK, I'm also not
> convinced that
> > >>>> we should switch away from the flat arrays or that we need fully
> dynamic
> > >>>> arrays that grow/shrink at runtime for ethdevs. I would suggest
> a half-way
> > >>>> house here, where we keep the ethdevs as an array, but one
> allocated/sized
> > >>>> at runtime rather than statically. This would allow us to have a
> > >>>> compile-time default value, but, for use cases that need it,
> allow use of a
> > >>>> flag e.g.  "max-ethdevs" to change the size of the parameter
> given to the
> > >>>> malloc call for the array.  This max limit could then be
> provided to apps
> > >>>> too if they want to match any array sizes. [Alternatively those
> apps could
> > >>>> check the provided size and error out if the size has been
> increased beyond
> > >>>> what the app is designed to use?]. There would be no extra
> dereferences per
> > >>>> rx/tx burst call in this scenario so performance should be the
> same as
> > >>>> before (potentially better if array is in hugepage memory, I
> suppose).
> > >>>
> > >>> I think we need some benchmarks to decide what is the best
> tradeoff.
> > >>> I spent time on this implementation, but sorry I won't have time
> for benchmarks.
> > >>> Volunteers?
> > >>
> > >> I had only a quick look at your approach so far.
> > >> But from what I can read, in MT environment your suggestion will
> require
> > >> extra synchronization for each read-write access to such parray
> element (lock, rcu, ...).
> > >> I think what Bruce suggests will be much ligther, easier to
> implement and less error prone.
> > >> At least for rte_ethdevs[] and friends.
> > >> Konstantin
> > >
> > > One more thought here - if we are talking about rte_ethdev[] in
> particular, I think  we can:
> > > 1. move public function pointers (rx_pkt_burst(), etc.) from
> rte_ethdev into a separate flat array.
> > > We can keep it public to still use inline functions for 'fast'
> calls rte_eth_rx_burst(), etc. to avoid
> > > any regressions.
> > > That could still be flat array with max_size specified at
> application startup.
> > > 2. Hide rest of rte_ethdev struct in .c.
> > > That will allow us to change the struct itself and the whole
> rte_ethdev[] table in a way we like
> > > (flat array, vector, hash, linked list) without ABI/API breakages.
> > >
> > > Yes, it would require all PMDs to change prototype for
> pkt_rx_burst() function
> > > (to accept port_id, queue_id instead of queue pointer), but the
> change is mechanical one.
> > > Probably some macro can be provided to simplify it.
> > >
> >
> > We are already planning some tasks for ABI stability for v21.11, I
> think
> > splitting 'struct rte_eth_dev' can be part of that task, it enables
> hiding more
> > internal data.
> 
> Ok, sounds good.
> 
> >
> > > The only significant complication I can foresee with implementing
> that approach -
> > > we'll need a an array of 'fast' function pointers per queue, not
> per device as we have now
> > > (to avoid extra indirection for callback implementation).
> > > Though as a bonus we'll have ability to use different RX/TX
> funcions per queue.
> > >
> >
> > What do you think split Rx/Tx callback into its own struct too?
> >
> > Overall 'rte_eth_dev' can be split into three as:
> > 1. rte_eth_dev
> > 2. rte_eth_dev_burst
> > 3. rte_eth_dev_cb
> >
> > And we can hide 1 from applications even with the inline functions.
> 
> As discussed off-line, I think:
> it is possible.
> My absolute preference would be to have just 1/2 (with CB hidden).
> But even with 1/2/3 in place I think it would be  a good step forward.
> Probably worth to start with 1/2/3 first and then see how difficult it
> would be to switch to 1/2.
> Do you plan to start working on it?
> 
> Konstantin

If you do proceed with this, be very careful. E.g. the inlined rx/tx burst functions should not touch more cache lines than they do today - especially if there are many active ports. The inlined rx/tx burst functions are very simple, so thorough code review (and possibly also of the resulting assembly) is appropriate. Simple performance testing might not detect if more cache lines are accessed than before the modifications.

Don't get me wrong... I do consider this an improvement of the ethdev library; I'm only asking you to take extra care!

-Morten
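
A small sketch of how such a regression could be checked at review time
rather than by benchmarking. Everything here is hypothetical: the two
layouts are made up for illustration, the structs are assumed to start on
a cache line boundary, and a 64-byte line is assumed.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64 /* assumed; real targets may differ */

/* Index of the cache line that holds the first byte of a field. */
#define CACHE_LINE_OF(type, field) \
	(offsetof(type, field) / CACHE_LINE_SIZE)

/* Hypothetical "before" layout: the fast-path fields share one line. */
struct dev_before {
	void *rx_pkt_burst;
	void *tx_pkt_burst;
	void *data;
	char slow_path[200];
};

/* Hypothetical "after" layout: 'data' drifted behind slow-path state. */
struct dev_after {
	void *rx_pkt_burst;
	void *tx_pkt_burst;
	char slow_path[200];
	void *data;
};

/* Distinct cache lines read when using rx_pkt_burst together with data. */
static inline unsigned int
rx_lines_before(void)
{
	return CACHE_LINE_OF(struct dev_before, rx_pkt_burst) ==
	       CACHE_LINE_OF(struct dev_before, data) ? 1 : 2;
}

static inline unsigned int
rx_lines_after(void)
{
	return CACHE_LINE_OF(struct dev_after, rx_pkt_burst) ==
	       CACHE_LINE_OF(struct dev_after, data) ? 1 : 2;
}
```

Both layouts hold the same members, so a functional test passes either
way; only the offset comparison shows that the second one costs an extra
cache line per burst call.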


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
    2021-05-27 15:56  3% ` Morten Brørup
@ 2021-06-17 15:02  3% ` Tyler Retzlaff
  1 sibling, 0 replies; 200+ results
From: Tyler Retzlaff @ 2021-06-17 15:02 UTC (permalink / raw)
  To: Gregory Etelson
  Cc: dev, matan, orika, rasland, Bernard Iremonger, Olivier Matz

On Thu, May 27, 2021 at 06:28:58PM +0300, Gregory Etelson wrote:
> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> index 4b728969c1..684bb028b2 100644
> --- a/lib/net/rte_ip.h
> +++ b/lib/net/rte_ip.h
> @@ -38,7 +38,21 @@ extern "C" {
>   * IPv4 Header
>   */
>  struct rte_ipv4_hdr {
> -	uint8_t  version_ihl;		/**< version and header length */
> +	__extension__

this patch reduces compiler portability, though not strictly objecting
so long as the community accepts that it may lead to conditional
compilation having to be introduced in a future change.

please also be mindful of the impact of __attribute__ ((__packed__)) in
the presence of bitfields on gcc when evaluating abi compatibility.

> +	union {
> +		uint8_t version_ihl;    /**< version and header length */
> +		struct {
> +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> +			uint8_t ihl:4;
> +			uint8_t version:4;
> +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> +			uint8_t version:4;
> +			uint8_t ihl:4;
> +#else
> +#error "setup endian definition"
> +#endif
> +		};
> +	};
>  	uint8_t  type_of_service;	/**< type of service */
>  	rte_be16_t total_length;	/**< length of packet */
>  	rte_be16_t packet_id;		/**< packet ID */
> -- 
> 2.31.1
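
A minimal sketch for checking the layout concern, under assumptions: it
targets a GCC/Clang-style compiler that predefines __BYTE_ORDER__ and
accepts uint8_t bit-fields and the packed attribute, and the struct and
function names are hypothetical. It models only the first byte of the
header and verifies that the bit-fields alias version_ihl as intended.

```c
#include <stdint.h>

/* First byte of the IPv4 header only; the other fields are omitted. */
struct sketch_ipv4_first_byte {
	__extension__
	union {
		uint8_t version_ihl; /* version and header length */
		struct {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			uint8_t ihl:4;
			uint8_t version:4;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
			uint8_t version:4;
			uint8_t ihl:4;
#else
#error "unknown byte order"
#endif
		};
	};
} __attribute__((__packed__));

/* The union must not grow the header: still exactly one byte. */
_Static_assert(sizeof(struct sketch_ipv4_first_byte) == 1,
	       "bit-field union changed the size of the first byte");

/* Returns 1 when the bit-fields agree with the shift-and-mask view. */
static inline int
sketch_fields_match(uint8_t raw)
{
	struct sketch_ipv4_first_byte b;

	b.version_ihl = raw;
	return b.version == (raw >> 4) && b.ihl == (raw & 0x0f);
}
```

Bit-field allocation order is implementation-defined, so a check like
this only says something about the compiler and target it was built with.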

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-17 13:08  3%           ` Ferruh Yigit
@ 2021-06-17 14:58  0%             ` Ananyev, Konstantin
  2021-06-17 15:17  0%               ` Morten Brørup
  2021-06-17 15:44  3%               ` Ferruh Yigit
  0 siblings, 2 replies; 200+ results
From: Ananyev, Konstantin @ 2021-06-17 14:58 UTC (permalink / raw)
  To: Yigit, Ferruh, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil



> >>>
> >>> 14/06/2021 15:15, Bruce Richardson:
> >>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
> >>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> >>>>>> Sent: Monday, 14 June 2021 12.59
> >>>>>>
> >>>>>> Performance of access in a fixed-size array is very good
> >>>>>> because of cache locality
> >>>>>> and because there is a single pointer to dereference.
> >>>>>> The only drawback is the lack of flexibility:
> >>>>>> the size of such an array cannot be increase at runtime.
> >>>>>>
> >>>>>> An approach to this problem is to allocate the array at runtime,
> >>>>>> being as efficient as static arrays, but still limited to a maximum.
> >>>>>>
> >>>>>> That's why the API rte_parray is introduced,
> >>>>>> allowing to declare an array of pointer which can be resized
> >>>>>> dynamically
> >>>>>> and automatically at runtime while keeping a good read performance.
> >>>>>>
> >>>>>> After resize, the previous array is kept until the next resize
> >>>>>> to avoid crashs during a read without any lock.
> >>>>>>
> >>>>>> Each element is a pointer to a memory chunk dynamically allocated.
> >>>>>> This is not good for cache locality but it allows to keep the same
> >>>>>> memory per element, no matter how the array is resized.
> >>>>>> Cache locality could be improved with mempools.
> >>>>>> The other drawback is having to dereference one more pointer
> >>>>>> to read an element.
> >>>>>>
> >>>>>> There is not much locks, so the API is for internal use only.
> >>>>>> This API may be used to completely remove some compilation-time
> >>>>>> maximums.
> >>>>>
> >>>>> I get the purpose and overall intention of this library.
> >>>>>
> >>>>> I probably already mentioned that I prefer "embedded style programming" with fixed size arrays, rather than runtime configurability.
> >> It's
> >>> my personal opinion, and the DPDK Tech Board clearly prefers reducing the amount of compile time configurability, so there is no way
> for
> >>> me to stop this progress, and I do not intend to oppose to this library. :-)
> >>>>>
> >>>>> This library is likely to become a core library of DPDK, so I think it is important getting it right. Could you please mention a few
> >> examples
> >>> where you think this internal library should be used, and where it should not be used. Then it is easier to discuss if the border line
> between
> >>> control path and data plane is correct. E.g. this library is not intended to be used for dynamically sized packet queues that grow and
> shrink
> >> in
> >>> the fast path.
> >>>>>
> >>>>> If the library becomes a core DPDK library, it should probably be public instead of internal. E.g. if the library is used to make
> >>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some applications might also need dynamically sized arrays for their
> >>> application specific per-port runtime data, and this library could serve that purpose too.
> >>>>>
> >>>>
> >>>> Thanks Thomas for starting this discussion and Morten for follow-up.
> >>>>
> >>>> My thinking is as follows, and I'm particularly keeping in mind the cases
> >>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
> >>>>
> >>>> While I dislike the hard-coded limits in DPDK, I'm also not convinced that
> >>>> we should switch away from the flat arrays or that we need fully dynamic
> >>>> arrays that grow/shrink at runtime for ethdevs. I would suggest a half-way
> >>>> house here, where we keep the ethdevs as an array, but one allocated/sized
> >>>> at runtime rather than statically. This would allow us to have a
> >>>> compile-time default value, but, for use cases that need it, allow use of a
> >>>> flag e.g.  "max-ethdevs" to change the size of the parameter given to the
> >>>> malloc call for the array.  This max limit could then be provided to apps
> >>>> too if they want to match any array sizes. [Alternatively those apps could
> >>>> check the provided size and error out if the size has been increased beyond
> >>>> what the app is designed to use?]. There would be no extra dereferences per
> >>>> rx/tx burst call in this scenario so performance should be the same as
> >>>> before (potentially better if array is in hugepage memory, I suppose).
> >>>
> >>> I think we need some benchmarks to decide what is the best tradeoff.
> >>> I spent time on this implementation, but sorry I won't have time for benchmarks.
> >>> Volunteers?
> >>
> >> I had only a quick look at your approach so far.
> >> But from what I can read, in MT environment your suggestion will require
> >> extra synchronization for each read-write access to such parray element (lock, rcu, ...).
> >> I think what Bruce suggests will be much ligther, easier to implement and less error prone.
> >> At least for rte_ethdevs[] and friends.
> >> Konstantin
> >
> > One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
> > 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
> > We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
> > any regressions.
> > That could still be flat array with max_size specified at application startup.
> > 2. Hide rest of rte_ethdev struct in .c.
> > That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
> > (flat array, vector, hash, linked list) without ABI/API breakages.
> >
> > Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
> > (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
> > Probably some macro can be provided to simplify it.
> >
> 
> We are already planning some tasks for ABI stability for v21.11, I think
> splitting 'struct rte_eth_dev' can be part of that task, it enables hiding more
> internal data.

Ok, sounds good.

> 
> > The only significant complication I can foresee with implementing that approach -
> > we'll need a an array of 'fast' function pointers per queue, not per device as we have now
> > (to avoid extra indirection for callback implementation).
> > Though as a bonus we'll have ability to use different RX/TX funcions per queue.
> >
> 
> What do you think split Rx/Tx callback into its own struct too?
> 
> Overall 'rte_eth_dev' can be split into three as:
> 1. rte_eth_dev
> 2. rte_eth_dev_burst
> 3. rte_eth_dev_cb
> 
> And we can hide 1 from applications even with the inline functions.

As discussed off-line, I think it is possible.
My absolute preference would be to have just 1/2 (with CB hidden).
But even with 1/2/3 in place, I think it would be a good step forward.
It is probably worth starting with 1/2/3 first and then seeing how difficult
it would be to switch to 1/2.
Do you plan to start working on it?
 
Konstantin





^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-14 15:54  3%         ` Ananyev, Konstantin
@ 2021-06-17 13:08  3%           ` Ferruh Yigit
  2021-06-17 14:58  0%             ` Ananyev, Konstantin
  0 siblings, 1 reply; 200+ results
From: Ferruh Yigit @ 2021-06-17 13:08 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, jerinj, gakhil

On 6/14/2021 4:54 PM, Ananyev, Konstantin wrote:
> 
> 
>>>
>>> 14/06/2021 15:15, Bruce Richardson:
>>>> On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
>>>>>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
>>>>>> Sent: Monday, 14 June 2021 12.59
>>>>>>
>>>>>> Performance of access in a fixed-size array is very good
>>>>>> because of cache locality
>>>>>> and because there is a single pointer to dereference.
>>>>>> The only drawback is the lack of flexibility:
>>>>>> the size of such an array cannot be increase at runtime.
>>>>>>
>>>>>> An approach to this problem is to allocate the array at runtime,
>>>>>> being as efficient as static arrays, but still limited to a maximum.
>>>>>>
>>>>>> That's why the API rte_parray is introduced,
>>>>>> allowing to declare an array of pointer which can be resized
>>>>>> dynamically
>>>>>> and automatically at runtime while keeping a good read performance.
>>>>>>
>>>>>> After resize, the previous array is kept until the next resize
>>>>>> to avoid crashs during a read without any lock.
>>>>>>
>>>>>> Each element is a pointer to a memory chunk dynamically allocated.
>>>>>> This is not good for cache locality but it allows to keep the same
>>>>>> memory per element, no matter how the array is resized.
>>>>>> Cache locality could be improved with mempools.
>>>>>> The other drawback is having to dereference one more pointer
>>>>>> to read an element.
>>>>>>
>>>>>> There is not much locks, so the API is for internal use only.
>>>>>> This API may be used to completely remove some compilation-time
>>>>>> maximums.
>>>>>
>>>>> I get the purpose and overall intention of this library.
>>>>>
>>>>> I probably already mentioned that I prefer "embedded style programming" with fixed size arrays, rather than runtime configurability.
>> It's
>>> my personal opinion, and the DPDK Tech Board clearly prefers reducing the amount of compile time configurability, so there is no way for
>>> me to stop this progress, and I do not intend to oppose to this library. :-)
>>>>>
>>>>> This library is likely to become a core library of DPDK, so I think it is important getting it right. Could you please mention a few
>> examples
>>> where you think this internal library should be used, and where it should not be used. Then it is easier to discuss if the border line between
>>> control path and data plane is correct. E.g. this library is not intended to be used for dynamically sized packet queues that grow and shrink
>> in
>>> the fast path.
>>>>>
>>>>> If the library becomes a core DPDK library, it should probably be public instead of internal. E.g. if the library is used to make
>>> RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some applications might also need dynamically sized arrays for their
>>> application specific per-port runtime data, and this library could serve that purpose too.
>>>>>
>>>>
>>>> Thanks Thomas for starting this discussion and Morten for follow-up.
>>>>
>>>> My thinking is as follows, and I'm particularly keeping in mind the cases
>>>> of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
>>>>
>>>> While I dislike the hard-coded limits in DPDK, I'm also not convinced that
>>>> we should switch away from the flat arrays or that we need fully dynamic
>>>> arrays that grow/shrink at runtime for ethdevs. I would suggest a half-way
>>>> house here, where we keep the ethdevs as an array, but one allocated/sized
>>>> at runtime rather than statically. This would allow us to have a
>>>> compile-time default value, but, for use cases that need it, allow use of a
>>>> flag e.g.  "max-ethdevs" to change the size of the parameter given to the
>>>> malloc call for the array.  This max limit could then be provided to apps
>>>> too if they want to match any array sizes. [Alternatively those apps could
>>>> check the provided size and error out if the size has been increased beyond
>>>> what the app is designed to use?]. There would be no extra dereferences per
>>>> rx/tx burst call in this scenario so performance should be the same as
>>>> before (potentially better if array is in hugepage memory, I suppose).
>>>
>>> I think we need some benchmarks to decide what is the best tradeoff.
>>> I spent time on this implementation, but sorry I won't have time for benchmarks.
>>> Volunteers?
>>
>> I had only a quick look at your approach so far.
>> But from what I can read, in MT environment your suggestion will require
>> extra synchronization for each read-write access to such parray element (lock, rcu, ...).
>> I think what Bruce suggests will be much ligther, easier to implement and less error prone.
>> At least for rte_ethdevs[] and friends.
>> Konstantin
> 
> One more thought here - if we are talking about rte_ethdev[] in particular, I think  we can:
> 1. move public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
> We can keep it public to still use inline functions for 'fast' calls rte_eth_rx_burst(), etc. to avoid
> any regressions.
> That could still be flat array with max_size specified at application startup.
> 2. Hide rest of rte_ethdev struct in .c.
> That will allow us to change the struct itself and the whole rte_ethdev[] table in a way we like
> (flat array, vector, hash, linked list) without ABI/API breakages.
> 
> Yes, it would require all PMDs to change prototype for pkt_rx_burst() function
> (to accept port_id, queue_id instead of queue pointer), but the change is mechanical one.
> Probably some macro can be provided to simplify it.
> 

We are already planning some tasks for ABI stability for v21.11, and I think
splitting 'struct rte_eth_dev' can be part of that work, since it enables hiding
more internal data.

> The only significant complication I can foresee with implementing that approach -
> we'll need an array of 'fast' function pointers per queue, not per device as we have now
> (to avoid extra indirection for callback implementation).
> Though as a bonus we'll have the ability to use different RX/TX functions per queue.
> 

What do you think about splitting the Rx/Tx callbacks into their own struct too?

Overall 'rte_eth_dev' can be split into three as:
1. rte_eth_dev
2. rte_eth_dev_burst
3. rte_eth_dev_cb

And we can hide 1 from applications even with the inline functions.
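
A minimal sketch of the three-way split listed above, in C. All names
(sketch_*) are hypothetical and are not the real DPDK definitions; the
port_id/queue_id burst prototype follows Konstantin's suggestion quoted
earlier. It only illustrates how the inline wrappers could keep working
while struct 1 stays opaque to applications:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct sketch_mbuf;     /* stands in for struct rte_mbuf */
struct sketch_eth_dev;  /* 1. opaque: the full definition would live in a .c file */

typedef uint16_t (*sketch_rx_burst_t)(uint16_t port_id, uint16_t queue_id,
				      struct sketch_mbuf **pkts, uint16_t nb_pkts);

/* 2. Public, flat fast-path table: only what the inline wrappers need. */
struct sketch_eth_dev_burst {
	sketch_rx_burst_t rx_pkt_burst;
};

/* 3. Rx/Tx callbacks kept apart from the burst pointers. */
struct sketch_eth_dev_cb {
	void *post_rx_burst_cbs;  /* placeholder for a per-queue callback list */
	void *pre_tx_burst_cbs;   /* placeholder for a per-queue callback list */
};

#define SKETCH_MAX_PORTS 4  /* arbitrary size for the example */
static struct sketch_eth_dev_burst sketch_burst[SKETCH_MAX_PORTS];

/* Inline wrapper: touches only the public burst table, never struct 1. */
static inline uint16_t
sketch_eth_rx_burst(uint16_t port_id, uint16_t queue_id,
		    struct sketch_mbuf **pkts, uint16_t nb_pkts)
{
	assert(port_id < SKETCH_MAX_PORTS);
	return sketch_burst[port_id].rx_pkt_burst(port_id, queue_id, pkts, nb_pkts);
}

/* Dummy PMD entry point: pretends every requested packet was received. */
static uint16_t
sketch_dummy_rx(uint16_t port_id, uint16_t queue_id,
		struct sketch_mbuf **pkts, uint16_t nb_pkts)
{
	(void)port_id;
	(void)queue_id;
	(void)pkts;
	return nb_pkts;
}
```

A driver would fill its slot in sketch_burst[] at probe time; applications
would only ever see structs 2 and 3.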



^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-16 13:02  0%               ` Bruce Richardson
@ 2021-06-16 15:01  0%                 ` Morten Brørup
  0 siblings, 0 replies; 200+ results
From: Morten Brørup @ 2021-06-16 15:01 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Jerin Jacob, Thomas Monjalon, dpdk-dev, Olivier Matz,
	Andrew Rybchenko, Honnappa Nagarahalli, Ananyev, Konstantin,
	Ferruh Yigit, Jerin Jacob, Akhil Goyal

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> Sent: Wednesday, 16 June 2021 15.03
> 
> On Wed, Jun 16, 2021 at 01:27:17PM +0200, Morten Brørup wrote:
> > > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > > Sent: Wednesday, 16 June 2021 11.42
> > >
> > > On Tue, Jun 15, 2021 at 12:18 PM Thomas Monjalon
> <thomas@monjalon.net>
> > > wrote:
> > > >
> > > > 14/06/2021 17:48, Morten Brørup:
> > > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
> > > Monjalon
> > > > > It would be much simpler to just increase RTE_MAX_ETHPORTS to
> > > something big enough to hold a sufficiently large array. And
> possibly
> > > add an rte_max_ethports variable to indicate the number of
> populated
> > > entries in the array, for use when iterating over the array.
> > > > >
> > > > > Can we come up with another example than RTE_MAX_ETHPORTS where
> > > this library provides a better benefit?
> > > >
> > > > What is big enough?
> > > > Is 640KB enough for RAM? ;)
> > >
> > > If I understand it correctly, Linux process allocates 640KB due to
> > > that fact currently
> > > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] is global and
> it
> > > is from BSS.
> >
> > Correct.
> >
> > > If we make this from heap i.e use malloc() to allocate this memory
> > > then in my understanding Linux
> > > really won't allocate the real page for backend memory until
> unless,
> > > someone write/read to this memory.
> >
> > If the array is allocated from the heap, its members will be accessed
> through a pointer to the array, e.g. in rte_eth_rx/tx_burst(). This
> might affect performance, which is probably why the array is allocated
> the way it is.
> >
> 
> It depends on whether the array contains pointers to malloced elements
> or
> the array itself is just a single malloced array of all the structures.
> While I think the parray proposal referred to the former - which would
> have
> an extra level of indirection - the switch we are discussing here is
> the
> latter which should have no performance difference, since the method of
> accessing the elements will be the same, only with the base address
> pointing to a different area of memory.

I was not talking about an array of pointers. And it is not the same:

int arr[27];
int * parr = arr;

// direct access
int dir(int i) { return arr[i]; }

// indirect access
int indir(int i) { return parr[i]; }

The direct access knows the address of arr, so it will compile to:
        movsx   rdi, edi
        mov     eax, DWORD PTR arr[0+rdi*4]
        ret

The indirect access needs to first read the memory location holding the pointer to the array, and then it can read the array member, so it will compile to:
        mov     rax, QWORD PTR parr[rip]
        movsx   rdi, edi
        mov     eax, DWORD PTR [rax+rdi*4]
        ret

> 
> > Although it might be worth investigating how much it actually affects
> the performance.
> >
> > So we need to do something else if we want to conserve memory and
> still allow a large rte_eth_devices[] array.
> >
> > Looking at struct rte_eth_dev, we could reduce its size as follows:
> >
> > 1. Change the two callback arrays
> post_rx/pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT] to pointers to
> callback arrays, which are allocated from the heap.
> > With the default RTE_MAX_QUEUES_PER_PORT of 1024, these two arrays
> are the sinners that make the struct rte_eth_dev use so much memory.
> This modification would save 16 KB (minus 16 bytes for the pointers to
> the two arrays) per port.
> > Furthermore, these callback arrays would only need to be allocated if
> the application is compiled with callbacks enabled (#define
> RTE_ETHDEV_RXTX_CALLBACKS). And they would only need to be sized to the
> actual number of queues for the port.
> >
> > The disadvantage is that this would add another level of indirection,
> although only for applications compiled with callbacks enabled.
> >
> This seems reasonable to at least investigate.
> 
> > 2. Remove reserved_64s[4] and reserved_ptrs[4]. This would save 64
> bytes per port. Not much, but worth considering if we are changing the
> API/ABI anyway.
> >
> I strongly dislike reserved fields, so I would tend to favour removing these.
> However, it does possibly reduce future compatibility if we do need to
> add
> something to ethdev.

There should be an official policy about adding reserved fields for future compatibility. I'm against adding them, unless it can be argued that they are likely to match what is needed in the future; in the real world there is no way to know if they match future requirements.

> 
> Another option is to split ethdev into fast-path and non-fastpath parts
> -
> similar to Konstantin's suggestion of just having an array of the ops.
> We
> can have an array of minimal structures with fastpath ops and queue
> pointers, for example, with an ethdev-private pointer to the rest of
> the
> struct elsewhere in memory. Since that second struct would be allocated
> on-demand, the size of the ethdev array can be scaled with far smaller
> footprint.
> 
> /Bruce

The rte_eth_dev structures are really well organized now. E.g. the rx/tx function pointers and the pointer to the shared memory data of the driver are in the same cache line. We must be very careful if we change them.

Also, rte_ethdev.h and rte_ethdev_core.h are easy to read and understand.

-Morten

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-16 11:27  3%             ` Morten Brørup
  2021-06-16 12:00  0%               ` Jerin Jacob
@ 2021-06-16 13:02  0%               ` Bruce Richardson
  2021-06-16 15:01  0%                 ` Morten Brørup
  1 sibling, 1 reply; 200+ results
From: Bruce Richardson @ 2021-06-16 13:02 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Jerin Jacob, Thomas Monjalon, dpdk-dev, Olivier Matz,
	Andrew Rybchenko, Honnappa Nagarahalli, Ananyev, Konstantin,
	Ferruh Yigit, Jerin Jacob, Akhil Goyal

On Wed, Jun 16, 2021 at 01:27:17PM +0200, Morten Brørup wrote:
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Wednesday, 16 June 2021 11.42
> > 
> > On Tue, Jun 15, 2021 at 12:18 PM Thomas Monjalon <thomas@monjalon.net>
> > wrote:
> > >
> > > 14/06/2021 17:48, Morten Brørup:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
> > Monjalon
> > > > It would be much simpler to just increase RTE_MAX_ETHPORTS to
> > something big enough to hold a sufficiently large array. And possibly
> > add an rte_max_ethports variable to indicate the number of populated
> > entries in the array, for use when iterating over the array.
> > > >
> > > > Can we come up with another example than RTE_MAX_ETHPORTS where
> > this library provides a better benefit?
> > >
> > > What is big enough?
> > > Is 640KB enough for RAM? ;)
> > 
> > If I understand it correctly, Linux process allocates 640KB due to
> > that fact currently
> > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] is global and it
> > is from BSS.
> 
> Correct.
> 
> > If we make this from heap i.e use malloc() to allocate this memory
> > then in my understanding Linux
> > really won't allocate the real page for backend memory until unless,
> > someone write/read to this memory.
> 
> If the array is allocated from the heap, its members will be accessed through a pointer to the array, e.g. in rte_eth_rx/tx_burst(). This might affect performance, which is probably why the array is allocated the way it is.
>

It depends on whether the array contains pointers to malloced elements or
the array itself is just a single malloced array of all the structures.
While I think the parray proposal referred to the former - which would have
an extra level of indirection - the switch we are discussing here is the
latter which should have no performance difference, since the method of
accessing the elements will be the same, only with the base address
pointing to a different area of memory.
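
A minimal sketch of the "single malloced array" variant, assuming
hypothetical names (sketch_*) and a hypothetical startup parameter for the
maximum number of ethdevs; it is not existing DPDK code. The elements stay
contiguous and are indexed exactly as before, but the base address is now
read from a pointer variable instead of being a link-time constant:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Stand-in for struct rte_eth_dev; the fields are placeholders. */
struct sketch_eth_dev {
	void *rx_pkt_burst;
	void *data;
};

#define SKETCH_DEFAULT_MAX_ETHDEVS 32  /* compile-time default */

static struct sketch_eth_dev *sketch_eth_devices;  /* sized at startup */
static unsigned int sketch_max_ethdevs;

/* Allocate the flat table once; 0 selects the compile-time default. */
static int
sketch_ethdev_table_init(unsigned int max_ethdevs)
{
	if (max_ethdevs == 0)
		max_ethdevs = SKETCH_DEFAULT_MAX_ETHDEVS;
	sketch_eth_devices = calloc(max_ethdevs, sizeof(*sketch_eth_devices));
	if (sketch_eth_devices == NULL)
		return -1;
	sketch_max_ethdevs = max_ethdevs;
	return 0;
}

/* Bounds-checked lookup; apps can compare against sketch_max_ethdevs. */
static struct sketch_eth_dev *
sketch_ethdev_get(uint16_t port_id)
{
	assert(sketch_eth_devices != NULL);
	if (port_id >= sketch_max_ethdevs)
		return NULL;
	return &sketch_eth_devices[port_id];
}
```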
 
> Although it might be worth investigating how much it actually affects the performance.
> 
> So we need to do something else if we want to conserve memory and still allow a large rte_eth_devices[] array.
> 
> Looking at struct rte_eth_dev, we could reduce its size as follows:
> 
> 1. Change the two callback arrays post_rx/pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT] to pointers to callback arrays, which are allocated from the heap.
> With the default RTE_MAX_QUEUES_PER_PORT of 1024, these two arrays are the sinners that make the struct rte_eth_dev use so much memory. This modification would save 16 KB (minus 16 bytes for the pointers to the two arrays) per port.
> Furthermore, these callback arrays would only need to be allocated if the application is compiled with callbacks enabled (#define RTE_ETHDEV_RXTX_CALLBACKS). And they would only need to be sized to the actual number of queues for the port.
> 
> The disadvantage is that this would add another level of indirection, although only for applications compiled with callbacks enabled.
> 
This seems reasonable to at least investigate.

> 2. Remove reserved_64s[4] and reserved_ptrs[4]. This would save 64 bytes per port. Not much, but worth considering if we are changing the API/ABI anyway.
> 
I strongly dislike reserved fields, so I would tend to favour removing these.
However, it does possibly reduce future compatibility if we do need to add
something to ethdev.

Another option is to split ethdev into fast-path and non-fastpath parts -
similar to Konstantin's suggestion of just having an array of the ops. We
can have an array of minimal structures with fastpath ops and queue
pointers, for example, with an ethdev-private pointer to the rest of the
struct elsewhere in memory. Since that second struct would be allocated
on-demand, the size of the ethdev array can be scaled with far smaller
footprint.
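
A rough sketch of that fast-path/non-fastpath split, with hypothetical
names (sketch_*) and placeholder fields rather than the real ethdev layout.
Only the small per-port struct lives in the flat array; the larger part is
allocated when a port is actually attached:

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Non-fastpath part: allocated on demand, one per attached port. */
struct sketch_eth_dev_slow {
	char name[64];
	unsigned int nb_rx_queues;
	unsigned int nb_tx_queues;
};

/* Minimal fast-path part: ops, queue pointers, and a private pointer. */
struct sketch_eth_dev_fast {
	uint16_t (*rx_pkt_burst)(void *rxq, void **pkts, uint16_t nb_pkts);
	uint16_t (*tx_pkt_burst)(void *txq, void **pkts, uint16_t nb_pkts);
	void **rx_queues;
	void **tx_queues;
	struct sketch_eth_dev_slow *slow;  /* NULL until the port is attached */
};

#define SKETCH_MAX_PORTS 1024  /* the small structs make a large table cheap */
static struct sketch_eth_dev_fast sketch_fast[SKETCH_MAX_PORTS];

/* Attach a port: allocate the slow part only on first use. */
static struct sketch_eth_dev_slow *
sketch_port_attach(uint16_t port_id)
{
	struct sketch_eth_dev_fast *fast;

	if (port_id >= SKETCH_MAX_PORTS)
		return NULL;
	fast = &sketch_fast[port_id];
	if (fast->slow == NULL)
		fast->slow = calloc(1, sizeof(*fast->slow));
	return fast->slow;
}

/* Detach a port: release the slow part and mark the slot unused. */
static void
sketch_port_detach(uint16_t port_id)
{
	assert(port_id < SKETCH_MAX_PORTS);
	free(sketch_fast[port_id].slow);
	sketch_fast[port_id].slow = NULL;
}
```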

/Bruce

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  2021-06-16 11:27  3%             ` Morten Brørup
@ 2021-06-16 12:00  0%               ` Jerin Jacob
  2021-06-16 13:02  0%               ` Bruce Richardson
  1 sibling, 0 replies; 200+ results
From: Jerin Jacob @ 2021-06-16 12:00 UTC (permalink / raw)
  To: Morten Brørup
  Cc: Thomas Monjalon, Bruce Richardson, dpdk-dev, Olivier Matz,
	Andrew Rybchenko, Honnappa Nagarahalli, Ananyev, Konstantin,
	Ferruh Yigit, Jerin Jacob, Akhil Goyal

On Wed, Jun 16, 2021 at 4:57 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> > Sent: Wednesday, 16 June 2021 11.42
> >
> > On Tue, Jun 15, 2021 at 12:18 PM Thomas Monjalon <thomas@monjalon.net>
> > wrote:
> > >
> > > 14/06/2021 17:48, Morten Brørup:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
> > Monjalon
> > > > It would be much simpler to just increase RTE_MAX_ETHPORTS to
> > something big enough to hold a sufficiently large array. And possibly
> > add an rte_max_ethports variable to indicate the number of populated
> > entries in the array, for use when iterating over the array.
> > > >
> > > > Can we come up with another example than RTE_MAX_ETHPORTS where
> > this library provides a better benefit?
> > >
> > > What is big enough?
> > > Is 640KB enough for RAM? ;)
> >
> > If I understand it correctly, Linux process allocates 640KB due to
> > that fact currently
> > struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] is global and it
> > is from BSS.
>
> Correct.
>
> > If we make this from heap i.e use malloc() to allocate this memory
> > then in my understanding Linux
> > really won't allocate the real page for backend memory until unless,
> > someone write/read to this memory.
>
> If the array is allocated from the heap, its members will be accessed through a pointer to the array, e.g. in rte_eth_rx/tx_burst(). This might affect performance, which is probably why the array is allocated the way it is.
>
> Although it might be worth investigating how much it actually affects the performance.

It should not. From the CPU and compiler point of view it is the same.
If you look at cryptodev, it uses the following:

static struct rte_cryptodev rte_crypto_devices[RTE_CRYPTO_MAX_DEVS];
struct rte_cryptodev *rte_cryptodevs = rte_crypto_devices;

And it accesses rte_cryptodevs[].

Also, this structure is not cache aligned. We probably need to fix that.


> So we need to do something else if we want to conserve memory and still allow a large rte_eth_devices[] array.
>
> Looking at struct rte_eth_dev, we could reduce its size as follows:
>
> 1. Change the two callback arrays post_rx/pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT] to pointers to callback arrays, which are allocated from the heap.
> With the default RTE_MAX_QUEUES_PER_PORT of 1024, these two arrays are the sinners that make the struct rte_eth_dev use so much memory. This modification would save 16 KB (minus 16 bytes for the pointers to the two arrays) per port.
> Furthermore, these callback arrays would only need to be allocated if the application is compiled with callbacks enabled (#define RTE_ETHDEV_RXTX_CALLBACKS). And they would only need to be sized to the actual number of queues for the port.
>
> The disadvantage is that this would add another level of indirection, although only for applications compiled with callbacks enabled.

I think we don't need one more indirection if everything is allocated from
the heap, as memory is not wasted if it is not touched by the CPU.

>
> 2. Remove reserved_64s[4] and reserved_ptrs[4]. This would save 64 bytes per port. Not much, but worth considering if we are changing the API/ABI anyway.
>
>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  @ 2021-06-16 11:27  3%             ` Morten Brørup
  2021-06-16 12:00  0%               ` Jerin Jacob
  2021-06-16 13:02  0%               ` Bruce Richardson
  0 siblings, 2 replies; 200+ results
From: Morten Brørup @ 2021-06-16 11:27 UTC (permalink / raw)
  To: Jerin Jacob, Thomas Monjalon
  Cc: Bruce Richardson, dpdk-dev, Olivier Matz, Andrew Rybchenko,
	Honnappa Nagarahalli, Ananyev, Konstantin, Ferruh Yigit,
	Jerin Jacob, Akhil Goyal

> From: Jerin Jacob [mailto:jerinjacobk@gmail.com]
> Sent: Wednesday, 16 June 2021 11.42
> 
> On Tue, Jun 15, 2021 at 12:18 PM Thomas Monjalon <thomas@monjalon.net>
> wrote:
> >
> > 14/06/2021 17:48, Morten Brørup:
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas
> Monjalon
> > > It would be much simpler to just increase RTE_MAX_ETHPORTS to
> something big enough to hold a sufficiently large array. And possibly
> add an rte_max_ethports variable to indicate the number of populated
> entries in the array, for use when iterating over the array.
> > >
> > > Can we come up with another example than RTE_MAX_ETHPORTS where
> this library provides a better benefit?
> >
> > What is big enough?
> > Is 640KB enough for RAM? ;)
> 
> If I understand it correctly, Linux process allocates 640KB due to
> that fact currently
> struct rte_eth_dev rte_eth_devices[RTE_MAX_ETHPORTS] is global and it
> is from BSS.

Correct.

> If we make this from heap i.e use malloc() to allocate this memory
> then in my understanding Linux
> really won't allocate the real page for backend memory until unless,
> someone write/read to this memory.

If the array is allocated from the heap, its members will be accessed through a pointer to the array, e.g. in rte_eth_rx/tx_burst(). This might affect performance, which is probably why the array is allocated the way it is.

Although it might be worth investigating how much it actually affects the performance.

So we need to do something else if we want to conserve memory and still allow a large rte_eth_devices[] array.

Looking at struct rte_eth_dev, we could reduce its size as follows:

1. Change the two callback arrays post_rx/pre_tx_burst_cbs[RTE_MAX_QUEUES_PER_PORT] to pointers to callback arrays, which are allocated from the heap.
With the default RTE_MAX_QUEUES_PER_PORT of 1024, these two arrays are the sinners that make the struct rte_eth_dev use so much memory. This modification would save 16 KB (minus 16 bytes for the pointers to the two arrays) per port.
Furthermore, these callback arrays would only need to be allocated if the application is compiled with callbacks enabled (#define RTE_ETHDEV_RXTX_CALLBACKS). And they would only need to be sized to the actual number of queues for the port.

The disadvantage is that this would add another level of indirection, although only for applications compiled with callbacks enabled.

2. Remove reserved_64s[4] and reserved_ptrs[4]. This would save 64 bytes per port. Not much, but worth considering if we are changing the API/ABI anyway.
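
To make the arithmetic in items 1 and 2 concrete, here is a small sketch
using stand-in structs (hypothetical sketch_* names, not the real
struct rte_eth_dev). With 8-byte pointers, two arrays of 1024 pointers take
2 * 1024 * 8 = 16384 bytes, and replacing them with two pointers leaves 16
bytes, i.e. 16368 bytes saved per port; the reserved fields add up to
4 * 8 + 4 * 8 = 64 bytes:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_QUEUES_PER_PORT 1024

struct sketch_cb;  /* stands in for the per-queue callback type */

/* Callback arrays embedded in the device struct, one slot per queue. */
struct sketch_dev_inline_cbs {
	struct sketch_cb *post_rx_burst_cbs[SKETCH_MAX_QUEUES_PER_PORT];
	struct sketch_cb *pre_tx_burst_cbs[SKETCH_MAX_QUEUES_PER_PORT];
};

/* Same information behind two pointers to heap-allocated arrays. */
struct sketch_dev_ptr_cbs {
	struct sketch_cb **post_rx_burst_cbs;
	struct sketch_cb **pre_tx_burst_cbs;
};

/* The reserved fields from item 2. */
struct sketch_reserved {
	uint64_t reserved_64s[4];
	void *reserved_ptrs[4];
};

/* Bytes saved per port by item 1 (the two pointers still cost something). */
static size_t
sketch_cb_bytes_saved(void)
{
	size_t before = sizeof(struct sketch_dev_inline_cbs);
	size_t after = sizeof(struct sketch_dev_ptr_cbs);

	assert(before > after);
	return before - after;
}
```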



^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in DPDK
  2021-06-15  7:48  0%       ` Thomas Monjalon
@ 2021-06-15 10:44  0%         ` Xia, Chenbo
  0 siblings, 0 replies; 200+ results
From: Xia, Chenbo @ 2021-06-15 10:44 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Liang, Cunming, Wu, Jingjing, Burakov, Anatoly, Yigit,
	Ferruh, mdr, nhorman, Richardson, Bruce, david.marchand, stephen,
	Ananyev, Konstantin, jgg, parav, xuemingl

Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Tuesday, June 15, 2021 3:48 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; Liang, Cunming <cunming.liang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; mdr@ashroe.eu; nhorman@tuxdriver.com;
> Richardson, Bruce <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> jgg@nvidia.com; parav@nvidia.com; xuemingl@nvidia.com
> Subject: Re: [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in
> DPDK
> 
> 15/06/2021 04:49, Xia, Chenbo:
> > From: Thomas Monjalon <thomas@monjalon.net>
> > > 01/06/2021 05:06, Chenbo Xia:
> > > > Hi everyone,
> > > >
> > > > This is a draft implementation of the mdev (Mediated device [1])
> > > > support in DPDK PCI bus driver. Mdev is a way to virtualize devices
> > > > in Linux kernel. Based on the device-api (mdev_type/device_api),
> > > > there could be different types of mdev devices (e.g. vfio-pci).
> > >
> > > Please could you illustrate with an usage of mdev in DPDK?
> > > What does it enable which is not possible today?
> >
> > The main purpose is for DPDK to drive mdev-based devices, which is not
> > possible today.
> >
> > I'd take PCI devices for an example. Currently DPDK can only drive devices
> > of physical pci bus under /sys/bus/pci and kernel exposes the pci devices
> > to APP in that way.
> >
> > But there are PCI devices using vfio-mdev as a software framework to expose
> > Mdev to APP under /sys/bus/mdev. Devices could choose this way of
> virtualizing
> > itself to let multiple APPs share one physical device. For example, Intel
> > Scalable IOV technology is known to use vfio-mdev as SW framework for
> Scalable
> > IOV enabled devices (and Intel net/crypto/raw devices support this tech).
> For
> > those mdev-based devices, DPDK needs support on the bus layer to
> scan/plug/probe/..
> > them, which is the main effort this patchset does. There are also other
> devices
> > using the vfio-mdev framework, AFAIK, Nvidia's GPU is the first one using
> mdev
> > and Intel's GPU virtualization also uses it.
> 
> Yes mdev was designed for virtualization I think.
> The use of mdev for Scalable IOV without virtualization
> may be seen as an abuse by Linux maintainers,
> as they currently seem to prefer the auxiliary bus (which is a real bus).
> 
> Mellanox got a push back when trying to use mdev for the same purpose
> (Scalable Function, also called Sub-Function) in the kernel.
> The Linux community decided to use the auxiliary bus.
> 
> Any other feedback on the choice mdev vs aux?

OK. Thanks for the info. Much appreciated.

I could investigate a bit about the choice and later come back to you.

> Is there any kernel code supporting this mdev model for Intel devices?

For now there is only the Intel GPU. But I think you care more about devices that DPDK
could drive: a DMA device (named ioat in DPDK, under raw/ioat) is on its way upstream
(https://www.spinics.net/lists/kvm/msg244417.html)

Thanks,
Chenbo

> 
> > > > In this patchset, the PCI bus driver is extended to support scanning
> > > > and probing the mdev devices whose device-api is "vfio-pci".
> > > >
> > > >                      +---------+
> > > >                      | PCI bus |
> > > >                      +----+----+
> > > >                           |
> > > >          +--------+-------+-------+--------+
> > > >          |        |               |        |
> > > >   Physical PCI devices ...   Mediated PCI devices ...
> > > >
> > > > The first four patches in this patchset are mainly preparation of mdev
> > > > bus support. The left two patches are the key implementation of mdev bus.
> > > >
> > > > The implementation of mdev bus in DPDK has several options:
> > > >
> > > > 1: Embed mdev bus in current pci bus
> > > >
> > > >    This patchset takes this option for an example. Mdev has several
> > > >    device types: pci/platform/amba/ccw/ap. DPDK currently only cares
> > > >    pci devices in all mdev device types so we could embed the mdev bus
> > > >    into current pci bus. Then pci bus with mdev support will scan/plug/
> > > >    unplug/.. not only normal pci devices but also mediated pci devices.
> > >
> > > I think it is a different bus.
> > > It would be cleaner to not touch the PCI bus.
> > > Having a separate bus will allow an easy way to identify a device
> > > with the new generic devargs syntax, example:
> > > 	bus=mdev,uuid=XXX
> > > or more complex:
> > > 	bus=mdev,uuid=XXX/class=crypto/driver=qat,foo=bar
> >
> > OK. Agree on cleaner to not touch PCI bus. And there may also be a
> 'type=pci'
> > as mdev has several types in its definition (pci/ap/platform/ccw/...).
> >
> > > > 2: A new mdev bus that scans mediated pci devices and probes mdev driver
> to
> > > >    plug-in pci devices to pci bus
> > > >
> > > >    If we took this option, a new mdev bus will be implemented to scan
> > > >    mediated pci devices and a new mdev driver for pci devices will be
> > > >    implemented in pci bus to plug-in mediated pci devices to pci bus.
> > > >
> > > >    Our RFC v1 takes this option:
> > > >    http://patchwork.dpdk.org/project/dpdk/cover/20190403071844.21126-1-
> > > tiwei.bie@intel.com/
> > > >
> > > >    Note that: for either option 1 or 2, device drivers do not know the
> > > >    implementation difference but only use structs/functions exposed by
> > > >    pci bus. Mediated pci devices are different from normal pci devices
> > > >    on: 1. Mediated pci devices use UUID as address but normal ones use
> BDF.
> > > >    2. Mediated pci devices may have some capabilities that normal pci
> > > >    devices do not have. For example, mediated pci devices could have
> > > >    regions that have sparse mmap capability, which allows a region to
> have
> > > >    multiple mmap areas. Another example is mediated pci devices may have
> > > >    regions/part of regions not mmaped but need to access them. Above
> > > >    difference will change the current ABI (i.e., struct rte_pci_device).
> > > >    Please check 5th and 6th patch for details.
> > > >
> > > > 3. A brand new mdev bus that does everything
> > > >
> > > >    This option will implement a new and standalone mdev bus. This option
> > > >    does not need any changes in current pci bus but only needs some
> shared
> > > >    code (linux vfio part) in pci bus. Drivers of devices that support
> mdev
> > > >    will register itself as a mdev driver and do not rely on pci bus
> anymore.
> > > >    This option, IMHO, will make the code clean. The only potential
> problem
> > > >    may be code duplication, which could be solved by making code of
> linux
> > > >    vfio part of pci bus common and shared.
> > >
> > > Yes I prefer this third option.
> > > We can find an elegant way of sharing some VFIO code between buses.
> >
> > Yes, I have not thought about the details of the code sharing but will try
> to make
> > it elegant.
> 
> Great, thanks.
> 


^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [RFC PATCH v2 0/3] Add PIE support for HQoS library
  2021-06-09 10:53  3% ` [dpdk-dev] [RFC PATCH v1 " Liguzinski, WojciechX
@ 2021-06-15  9:01  3%   ` Liguzinski, WojciechX
  2021-06-21  7:35  3%     ` [dpdk-dev] [RFC PATCH v3 " Liguzinski, WojciechX
  0 siblings, 1 reply; 200+ results
From: Liguzinski, WojciechX @ 2021-06-15  9:01 UTC (permalink / raw)
  To: dev, jasvinder.singh, cristian.dumitrescu; +Cc: savinay.dharmappa, megha.ajmera

The DPDK sched library is equipped with a mechanism that protects it from the bufferbloat
problem, a situation in which excess buffers in the network cause high latency and latency
variation. Currently, it supports RED for active queue management (which is designed
to control the queue length, but does not control latency directly and is now being
obsoleted). However, more advanced queue management is required to address this problem
and provide the desired quality of service to users.

This solution (RFC) proposes the use of a new algorithm called "PIE" (Proportional Integral
controller Enhanced) that can effectively and directly control queuing latency to address
the bufferbloat problem.

The implementation of this functionality includes modifying existing data structures and
adding a new set of data structures to the library, as well as adding PIE-related APIs.
This affects structures in the public API/ABI, which is why a deprecation notice is going
to be prepared and sent.

Liguzinski, WojciechX (3):
  sched: add PIE based congestion management
  example/qos_sched: add PIE support
  example/ip_pipeline: add PIE support

 config/rte_config.h                      |   1 -
 drivers/net/softnic/rte_eth_softnic_tm.c |   6 +-
 examples/ip_pipeline/tmgr.c              |   6 +-
 examples/qos_sched/app_thread.c          |   1 -
 examples/qos_sched/cfg_file.c            |  82 ++++-
 examples/qos_sched/init.c                |   7 +-
 examples/qos_sched/profile.cfg           | 196 ++++++++----
 lib/sched/meson.build                    |  10 +-
 lib/sched/rte_pie.c                      |  78 +++++
 lib/sched/rte_pie.h                      | 389 +++++++++++++++++++++++
 lib/sched/rte_sched.c                    | 229 +++++++++----
 lib/sched/rte_sched.h                    |  53 ++-
 12 files changed, 877 insertions(+), 181 deletions(-)
 create mode 100644 lib/sched/rte_pie.c
 create mode 100644 lib/sched/rte_pie.h

-- 
2.17.1


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in DPDK
  2021-06-15  2:49  0%     ` Xia, Chenbo
@ 2021-06-15  7:48  0%       ` Thomas Monjalon
  2021-06-15 10:44  0%         ` Xia, Chenbo
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2021-06-15  7:48 UTC (permalink / raw)
  To: Xia, Chenbo
  Cc: dev, Liang, Cunming, Wu, Jingjing, Burakov, Anatoly, Yigit,
	Ferruh, mdr, nhorman, Richardson, Bruce, david.marchand, stephen,
	Ananyev, Konstantin, jgg, parav, xuemingl

15/06/2021 04:49, Xia, Chenbo:
> From: Thomas Monjalon <thomas@monjalon.net>
> > 01/06/2021 05:06, Chenbo Xia:
> > > Hi everyone,
> > >
> > > This is a draft implementation of the mdev (Mediated device [1])
> > > support in DPDK PCI bus driver. Mdev is a way to virtualize devices
> > > in Linux kernel. Based on the device-api (mdev_type/device_api),
> > > there could be different types of mdev devices (e.g. vfio-pci).
> > 
> > Please could you illustrate with an usage of mdev in DPDK?
> > What does it enable which is not possible today?
> 
> The main purpose is for DPDK to drive mdev-based devices, which is not
> possible today.
> 
> I'd take PCI devices for an example. Currently DPDK can only drive devices
> of physical pci bus under /sys/bus/pci and kernel exposes the pci devices
> to APP in that way.
> 
> But there are PCI devices using vfio-mdev as a software framework to expose
> Mdev to APP under /sys/bus/mdev. Devices could choose this way of virtualizing
> itself to let multiple APPs share one physical device. For example, Intel
> Scalable IOV technology is known to use vfio-mdev as SW framework for Scalable
> IOV enabled devices (and Intel net/crypto/raw devices support this tech). For
> those mdev-based devices, DPDK needs support on the bus layer to scan/plug/probe/..
> them, which is the main effort this patchset does. There are also other devices
> using the vfio-mdev framework, AFAIK, Nvidia's GPU is the first one using mdev
> and Intel's GPU virtualization also uses it.

Yes mdev was designed for virtualization I think.
The use of mdev for Scalable IOV without virtualization
may be seen as an abuse by Linux maintainers,
as they currently seem to prefer the auxiliary bus (which is a real bus).

Mellanox got a push back when trying to use mdev for the same purpose
(Scalable Function, also called Sub-Function) in the kernel.
The Linux community decided to use the auxiliary bus.

Any other feedback on the choice mdev vs aux?
Is there any kernel code supporting this mdev model for Intel devices?

> > > In this patchset, the PCI bus driver is extended to support scanning
> > > and probing the mdev devices whose device-api is "vfio-pci".
> > >
> > >                      +---------+
> > >                      | PCI bus |
> > >                      +----+----+
> > >                           |
> > >          +--------+-------+-------+--------+
> > >          |        |               |        |
> > >   Physical PCI devices ...   Mediated PCI devices ...
> > >
> > > The first four patches in this patchset are mainly preparation of mdev
> > > bus support. The left two patches are the key implementation of mdev bus.
> > >
> > > The implementation of mdev bus in DPDK has several options:
> > >
> > > 1: Embed mdev bus in current pci bus
> > >
> > >    This patchset takes this option for an example. Mdev has several
> > >    device types: pci/platform/amba/ccw/ap. DPDK currently only cares about
> > >    pci devices among all mdev device types, so we could embed the mdev bus
> > >    into current pci bus. Then pci bus with mdev support will scan/plug/
> > >    unplug/.. not only normal pci devices but also mediated pci devices.
> > 
> > I think it is a different bus.
> > It would be cleaner to not touch the PCI bus.
> > Having a separate bus will allow an easy way to identify a device
> > with the new generic devargs syntax, example:
> > 	bus=mdev,uuid=XXX
> > or more complex:
> > 	bus=mdev,uuid=XXX/class=crypto/driver=qat,foo=bar
> 
> OK. Agree on cleaner to not touch PCI bus. And there may also be a 'type=pci'
> as mdev has several types in its definition (pci/ap/platform/ccw/...).
> 
> > > 2: A new mdev bus that scans mediated pci devices and probes mdev driver to
> > >    plug-in pci devices to pci bus
> > >
> > >    If we took this option, a new mdev bus will be implemented to scan
> > >    mediated pci devices and a new mdev driver for pci devices will be
> > >    implemented in pci bus to plug-in mediated pci devices to pci bus.
> > >
> > >    Our RFC v1 takes this option:
> > >    http://patchwork.dpdk.org/project/dpdk/cover/20190403071844.21126-1-tiwei.bie@intel.com/
> > >
> > >    Note that: for either option 1 or 2, device drivers do not know the
> > >    implementation difference but only use structs/functions exposed by
> > >    pci bus. Mediated pci devices are different from normal pci devices
> > >    on: 1. Mediated pci devices use UUID as address but normal ones use BDF.
> > >    2. Mediated pci devices may have some capabilities that normal pci
> > >    devices do not have. For example, mediated pci devices could have
> > >    regions that have sparse mmap capability, which allows a region to have
> > >    multiple mmap areas. Another example is mediated pci devices may have
> > >    regions/part of regions not mmaped but need to access them. Above
> > >    difference will change the current ABI (i.e., struct rte_pci_device).
> > >    Please check 5th and 6th patch for details.
> > >
> > > 3. A brand new mdev bus that does everything
> > >
> > >    This option will implement a new and standalone mdev bus. This option
> > >    does not need any changes in current pci bus but only needs some shared
> > >    code (linux vfio part) in pci bus. Drivers of devices that support mdev
> > >    will register itself as a mdev driver and do not rely on pci bus anymore.
> > >    This option, IMHO, will make the code clean. The only potential problem
> > >    may be code duplication, which could be solved by making code of linux
> > >    vfio part of pci bus common and shared.
> > 
> > Yes I prefer this third option.
> > We can find an elegant way of sharing some VFIO code between buses.
> 
> Yes, I have not thought about the details of the code sharing but will try to make
> it elegant.

Great, thanks.



^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in DPDK
  2021-06-11  7:15  0%   ` Thomas Monjalon
@ 2021-06-15  2:49  0%     ` Xia, Chenbo
  2021-06-15  7:48  0%       ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Xia, Chenbo @ 2021-06-15  2:49 UTC (permalink / raw)
  To: Thomas Monjalon
  Cc: dev, Liang, Cunming, Wu, Jingjing, Burakov, Anatoly, Yigit,
	Ferruh, mdr, nhorman, Richardson, Bruce, david.marchand, stephen,
	Ananyev, Konstantin

Hi Thomas,

> -----Original Message-----
> From: Thomas Monjalon <thomas@monjalon.net>
> Sent: Friday, June 11, 2021 3:16 PM
> To: Xia, Chenbo <chenbo.xia@intel.com>
> Cc: dev@dpdk.org; Liang, Cunming <cunming.liang@intel.com>; Wu, Jingjing
> <jingjing.wu@intel.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Yigit,
> Ferruh <ferruh.yigit@intel.com>; mdr@ashroe.eu; nhorman@tuxdriver.com;
> Richardson, Bruce <bruce.richardson@intel.com>; david.marchand@redhat.com;
> stephen@networkplumber.org; Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Subject: Re: [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in
> DPDK
> 
> 01/06/2021 05:06, Chenbo Xia:
> > Hi everyone,
> >
> > This is a draft implementation of the mdev (Mediated device [1])
> > support in DPDK PCI bus driver. Mdev is a way to virtualize devices
> > in Linux kernel. Based on the device-api (mdev_type/device_api),
> > there could be different types of mdev devices (e.g. vfio-pci).
> 
> Could you please illustrate with a usage of mdev in DPDK?
> What does it enable which is not possible today?

The main purpose is for DPDK to drive mdev-based devices, which is not
possible today.

Take PCI devices as an example. Currently DPDK can only drive devices
on the physical PCI bus under /sys/bus/pci, and the kernel exposes the PCI
devices to the APP in that way.

But there are PCI devices using vfio-mdev as a software framework to expose
mdev devices to the APP under /sys/bus/mdev. Devices can choose this way of
virtualizing themselves to let multiple APPs share one physical device. For
example, Intel Scalable IOV technology is known to use vfio-mdev as the SW
framework for Scalable IOV enabled devices (and Intel net/crypto/raw devices
support this tech). For those mdev-based devices, DPDK needs support in the bus
layer to scan/plug/probe/.. them, which is the main effort of this patchset.
Other devices also use the vfio-mdev framework: AFAIK, Nvidia's GPU was the
first one to use mdev, and Intel's GPU virtualization also uses it.

> 
> > In this patchset, the PCI bus driver is extended to support scanning
> > and probing the mdev devices whose device-api is "vfio-pci".
> >
> >                      +---------+
> >                      | PCI bus |
> >                      +----+----+
> >                           |
> >          +--------+-------+-------+--------+
> >          |        |               |        |
> >   Physical PCI devices ...   Mediated PCI devices ...
> >
> > The first four patches in this patchset are mainly preparation of mdev
> > bus support. The remaining two patches are the key implementation of mdev bus.
> >
> > The implementation of mdev bus in DPDK has several options:
> >
> > 1: Embed mdev bus in current pci bus
> >
> >    This patchset takes this option for an example. Mdev has several
> >    device types: pci/platform/amba/ccw/ap. DPDK currently only cares about
> >    pci devices among all mdev device types, so we could embed the mdev bus
> >    into current pci bus. Then pci bus with mdev support will scan/plug/
> >    unplug/.. not only normal pci devices but also mediated pci devices.
> 
> I think it is a different bus.
> It would be cleaner to not touch the PCI bus.
> Having a separate bus will allow an easy way to identify a device
> with the new generic devargs syntax, example:
> 	bus=mdev,uuid=XXX
> or more complex:
> 	bus=mdev,uuid=XXX/class=crypto/driver=qat,foo=bar

OK. Agreed, it is cleaner not to touch the PCI bus. There may also be a
'type=pci' key, as mdev has several types in its definition (pci/ap/platform/ccw/...).
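
For illustration only, a small sketch of how such a string splits into
layers and key/value pairs. This is not DPDK's rte_devargs parser; the
struct kv and devargs_split() names are made up, the 'type' key is only the
suggestion from this thread, and the sketch assumes values contain neither
',' nor '/'.

```c
#include <stdio.h>
#include <string.h>

#define MAX_PAIRS 16

struct kv {
	int layer;          /* 0 = bus, 1 = class, 2 = driver (by position) */
	const char *key;
	const char *value;  /* NULL when the token has no '=' */
};

/*
 * Split a devargs-like string in place: '/' separates layers and ','
 * separates key=value pairs inside a layer.
 * Returns the number of pairs stored, or -1 if there are more than `max`.
 */
static int
devargs_split(char *str, struct kv *out, int max)
{
	int n = 0, layer = 0;
	char *p = str;

	while (*p != '\0') {
		char *end = p + strcspn(p, ",/");
		char sep = *end;
		char *eq;

		*end = '\0';            /* terminate the current token */
		if (n == max)
			return -1;
		eq = strchr(p, '=');
		if (eq != NULL)
			*eq++ = '\0';   /* split key from value */
		out[n].layer = layer;
		out[n].key = p;
		out[n].value = eq;
		n++;

		if (sep == '\0')
			break;
		if (sep == '/')
			layer++;        /* next token belongs to the next layer */
		p = end + 1;
	}
	return n;
}
```

With the input "bus=mdev,uuid=XXX,type=pci/class=crypto/driver=qat,foo=bar"
this yields six pairs: the first three in layer 0 (bus), class=crypto in
layer 1 and driver=qat,foo=bar in layer 2.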

> 
> > 2: A new mdev bus that scans mediated pci devices and probes mdev driver to
> >    plug-in pci devices to pci bus
> >
> >    If we took this option, a new mdev bus will be implemented to scan
> >    mediated pci devices and a new mdev driver for pci devices will be
> >    implemented in pci bus to plug-in mediated pci devices to pci bus.
> >
> >    Our RFC v1 takes this option:
> >    http://patchwork.dpdk.org/project/dpdk/cover/20190403071844.21126-1-tiwei.bie@intel.com/
> >
> >    Note that: for either option 1 or 2, device drivers do not know the
> >    implementation difference but only use structs/functions exposed by
> >    pci bus. Mediated pci devices are different from normal pci devices
> >    on: 1. Mediated pci devices use UUID as address but normal ones use BDF.
> >    2. Mediated pci devices may have some capabilities that normal pci
> >    devices do not have. For example, mediated pci devices could have
> >    regions that have sparse mmap capability, which allows a region to have
> >    multiple mmap areas. Another example is mediated pci devices may have
> >    regions/part of regions not mmaped but need to access them. Above
> >    difference will change the current ABI (i.e., struct rte_pci_device).
> >    Please check 5th and 6th patch for details.
> >
> > 3. A brand new mdev bus that does everything
> >
> >    This option will implement a new and standalone mdev bus. This option
> >    does not need any changes in current pci bus but only needs some shared
> >    code (linux vfio part) in pci bus. Drivers of devices that support mdev
> >    will register itself as a mdev driver and do not rely on pci bus anymore.
> >    This option, IMHO, will make the code clean. The only potential problem
> >    may be code duplication, which could be solved by making code of linux
> >    vfio part of pci bus common and shared.
> 
> Yes I prefer this third option.
> We can find an elegant way of sharing some VFIO code between buses.

Yes, I have not thought about the details of the code sharing but will try to make
it elegant.

Thanks,
Chenbo

> 


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-06-10  9:22  4%                 ` Olivier Matz
@ 2021-06-14 16:36  4%                   ` Andrew Rybchenko
  2021-06-17 16:29  0%                     ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Andrew Rybchenko @ 2021-06-14 16:36 UTC (permalink / raw)
  To: Olivier Matz, Gregory Etelson
  Cc: Iremonger, Bernard, Morten Brørup, dev, Matan Azrad,
	Ori Kam, Raslan Darawsheh, Asaf Penso, Thomas Monjalon,
	Ferruh Yigit

On 6/10/21 12:22 PM, Olivier Matz wrote:
> Hi Gregory,
> 
> On Thu, Jun 10, 2021 at 04:10:25AM +0000, Gregory Etelson wrote:
>> Hello,
>>
>> There was no activity on that patch for a long time.
>> The patch is marked as failed, but we verified failed tests and concluded that the failures can be ignored.
>> https://patchwork.dpdk.org/project/dpdk/patch/20210527152858.13312-1-getelson@nvidia.com/
>> How should I proceed with this case ?
>> Please advise.
>>
> 
> I like the idea of this patch: to me it is more convenient to access
> these fields with a bitfield. I don't see a problem with using
> bitfields here; glibc and FreeBSD netinet/ip.h do the same.
> 
> However, as stated previously, this patch breaks the initialization API.

Very good point. I guess we overlooked it in a number of patches
that changed RTE flow API items to start with the corresponding network
headers. We used unions there to avoid ABI breakage, but it looks
like we have broken the initialization API anyway.

We should decide whether initialization API breakage is a show-stopper
for switching RTE flow API items to network protocol headers.
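
To make the ABI vs. initialization API distinction concrete, here is a
minimal sketch. hdr_old and hdr_new are hypothetical stand-ins, not the real
rte_ipv4_hdr or any rte_flow item. Wrapping the byte in an anonymous union
keeps size and offsets identical, so the binary layout (ABI) is preserved;
what changes is the source-level shape of the struct, and whether
initializers written against the old shape still compile cleanly depends on
the compiler and language mode. The bitfield order shown assumes GCC/clang
conventions and is implementation-defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Layout before the change: one byte holding both version and IHL. */
struct hdr_old {
	uint8_t version_ihl;
	uint8_t type_of_service;
};

/* Layout after the change: same byte, also reachable as two bitfields. */
struct hdr_new {
	union {
		uint8_t version_ihl;
		struct {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			uint8_t ihl:4;      /* low nibble */
			uint8_t version:4;  /* high nibble */
#else
			uint8_t version:4;
			uint8_t ihl:4;
#endif
		};
	};
	uint8_t type_of_service;
};

/* The binary layout is unchanged, so compiled users are not affected. */
_Static_assert(sizeof(struct hdr_old) == sizeof(struct hdr_new),
	       "size changed");
_Static_assert(offsetof(struct hdr_old, type_of_service) ==
	       offsetof(struct hdr_new, type_of_service),
	       "offset changed");

/* Read the IHL nibble through the bitfield view of the same byte. */
static uint8_t
hdr_new_ihl(uint8_t version_ihl)
{
	struct hdr_new h;

	h.version_ihl = version_ihl;
	h.type_of_service = 0;
	return h.ihl;
}
```

The sketch deliberately uses plain assignment instead of an initializer,
since the initializer form is exactly the part under discussion.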

> The DPDK ABI/API policy is described here:
> http://doc.dpdk.org/guides/contributing/abi_policy.html#the-dpdk-abi-policy
> 
> From this document:
> 
>    The API should only be changed for significant reasons, such as
>    performance enhancements. API breakages due to changes such as
>    reorganizing public structure fields for aesthetic or readability
>    purposes should be avoided.
> 
> So to follow the project policy, I think we should reject this patch.
> 
> Regards,
> Olivier
> 


^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
  @ 2021-06-14 15:54  3%         ` Ananyev, Konstantin
  2021-06-17 13:08  3%           ` Ferruh Yigit
  0 siblings, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2021-06-14 15:54 UTC (permalink / raw)
  To: Ananyev, Konstantin, Thomas Monjalon, Richardson, Bruce
  Cc: Morten Brørup, dev, olivier.matz, andrew.rybchenko,
	honnappa.nagarahalli, Yigit, Ferruh, jerinj, gakhil



> >
> > 14/06/2021 15:15, Bruce Richardson:
> > > On Mon, Jun 14, 2021 at 02:22:42PM +0200, Morten Brørup wrote:
> > > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Thomas Monjalon
> > > > > Sent: Monday, 14 June 2021 12.59
> > > > >
> > > > > Performance of access in a fixed-size array is very good
> > > > > because of cache locality
> > > > > and because there is a single pointer to dereference.
> > > > > The only drawback is the lack of flexibility:
> > > > > the size of such an array cannot be increased at runtime.
> > > > >
> > > > > An approach to this problem is to allocate the array at runtime,
> > > > > being as efficient as static arrays, but still limited to a maximum.
> > > > >
> > > > > That's why the API rte_parray is introduced,
> > > > > allowing to declare an array of pointers which can be resized
> > > > > dynamically
> > > > > and automatically at runtime while keeping a good read performance.
> > > > >
> > > > > After resize, the previous array is kept until the next resize
> > > > > to avoid crashes during a read without any lock.
> > > > >
> > > > > Each element is a pointer to a memory chunk dynamically allocated.
> > > > > This is not good for cache locality but it allows to keep the same
> > > > > memory per element, no matter how the array is resized.
> > > > > Cache locality could be improved with mempools.
> > > > > The other drawback is having to dereference one more pointer
> > > > > to read an element.
> > > > >
> > > > > There are not many locks, so the API is for internal use only.
> > > > > This API may be used to completely remove some compilation-time
> > > > > maximums.
> > > >
> > > > I get the purpose and overall intention of this library.
> > > >
> > > > I probably already mentioned that I prefer "embedded style programming"
> > > > with fixed size arrays, rather than runtime configurability. It's my
> > > > personal opinion, and the DPDK Tech Board clearly prefers reducing the
> > > > amount of compile time configurability, so there is no way for me to
> > > > stop this progress, and I do not intend to oppose to this library. :-)
> > > >
> > > > This library is likely to become a core library of DPDK, so I think it
> > > > is important getting it right. Could you please mention a few examples
> > > > where you think this internal library should be used, and where it
> > > > should not be used. Then it is easier to discuss if the border line
> > > > between control path and data plane is correct. E.g. this library is
> > > > not intended to be used for dynamically sized packet queues that grow
> > > > and shrink in the fast path.
> > > >
> > > > If the library becomes a core DPDK library, it should probably be
> > > > public instead of internal. E.g. if the library is used to make
> > > > RTE_MAX_ETHPORTS dynamic instead of compile time fixed, then some
> > > > applications might also need dynamically sized arrays for their
> > > > application specific per-port runtime data, and this library could
> > > > serve that purpose too.
> > > >
> > >
> > > Thanks Thomas for starting this discussion and Morten for follow-up.
> > >
> > > My thinking is as follows, and I'm particularly keeping in mind the cases
> > > of e.g. RTE_MAX_ETHPORTS, as a leading candidate here.
> > >
> > > While I dislike the hard-coded limits in DPDK, I'm also not convinced that
> > > we should switch away from the flat arrays or that we need fully dynamic
> > > arrays that grow/shrink at runtime for ethdevs. I would suggest a half-way
> > > house here, where we keep the ethdevs as an array, but one allocated/sized
> > > at runtime rather than statically. This would allow us to have a
> > > compile-time default value, but, for use cases that need it, allow use of a
> > > flag e.g.  "max-ethdevs" to change the size of the parameter given to the
> > > malloc call for the array.  This max limit could then be provided to apps
> > > too if they want to match any array sizes. [Alternatively those apps could
> > > check the provided size and error out if the size has been increased beyond
> > > what the app is designed to use?]. There would be no extra dereferences per
> > > rx/tx burst call in this scenario so performance should be the same as
> > > before (potentially better if array is in hugepage memory, I suppose).
> >
> > I think we need some benchmarks to decide what is the best tradeoff.
> > I spent time on this implementation, but sorry I won't have time for benchmarks.
> > Volunteers?
> 
> I had only a quick look at your approach so far.
> But from what I can read, in MT environment your suggestion will require
> extra synchronization for each read-write access to such parray element (lock, rcu, ...).
> I think what Bruce suggests will be much lighter, easier to implement and less error prone.
> At least for rte_ethdevs[] and friends.
> Konstantin

One more thought here - if we are talking about rte_ethdev[] in particular, I think we can:
1. Move the public function pointers (rx_pkt_burst(), etc.) from rte_ethdev into a separate flat array.
We can keep it public so that the 'fast' calls rte_eth_rx_burst(), etc. stay inline functions and
we avoid any regressions.
That could still be a flat array with max_size specified at application startup.
2. Hide the rest of the rte_ethdev struct in a .c file.
That will allow us to change the struct itself and the whole rte_ethdev[] table in any way we like
(flat array, vector, hash, linked list) without ABI/API breakages.

Yes, it would require all PMDs to change the prototype of their rx_pkt_burst() function
(to accept port_id, queue_id instead of a queue pointer), but the change is a mechanical one.
Probably some macro can be provided to simplify it.

The only significant complication I can foresee with implementing that approach:
we'll need an array of 'fast' function pointers per queue, not per device as we have now
(to avoid extra indirection for callback implementation).
Though as a bonus we'll have the ability to use different RX/TX functions per queue.
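
A rough sketch of what I mean, under stated assumptions: all names below
(struct fp_ops, fp_ops_init(), fp_ops_set_rx(), eth_rx_burst(), dummy_rx())
are made up for illustration and are not existing ethdev API, and struct pkt
stands in for struct rte_mbuf. The table is flat, indexed per queue, and
sized once at startup.

```c
#include <stdint.h>
#include <stdlib.h>

struct pkt;   /* stand-in for struct rte_mbuf */

/* New-style burst prototype: ids instead of an opaque queue pointer. */
typedef uint16_t (*rx_burst_t)(uint16_t port_id, uint16_t queue_id,
			       struct pkt **pkts, uint16_t nb_pkts);

/* Public, flat, per-queue table of fast-path entry points. */
struct fp_ops {
	rx_burst_t *rx;        /* max_ports * max_queues entries */
	uint16_t max_ports;
	uint16_t max_queues;
};

static struct fp_ops fp;

/* Called once at startup with limits chosen by the application. */
static int
fp_ops_init(uint16_t max_ports, uint16_t max_queues)
{
	size_t i, n = (size_t)max_ports * max_queues;

	fp.rx = malloc(n * sizeof(*fp.rx));
	if (fp.rx == NULL)
		return -1;
	for (i = 0; i < n; i++)
		fp.rx[i] = NULL;   /* no driver installed yet */
	fp.max_ports = max_ports;
	fp.max_queues = max_queues;
	return 0;
}

/* Driver side: install the Rx function for one queue. */
static int
fp_ops_set_rx(uint16_t port_id, uint16_t queue_id, rx_burst_t fn)
{
	if (port_id >= fp.max_ports || queue_id >= fp.max_queues)
		return -1;
	fp.rx[(size_t)port_id * fp.max_queues + queue_id] = fn;
	return 0;
}

/* Application side: inline wrapper, one table load and one indirect call.
 * As in a fast path, ids are assumed to be valid and are not checked. */
static inline uint16_t
eth_rx_burst(uint16_t port_id, uint16_t queue_id,
	     struct pkt **pkts, uint16_t nb_pkts)
{
	rx_burst_t fn = fp.rx[(size_t)port_id * fp.max_queues + queue_id];

	return fn != NULL ? fn(port_id, queue_id, pkts, nb_pkts) : 0;
}

/* Dummy driver function for illustration: pretends the burst was full. */
static uint16_t
dummy_rx(uint16_t port_id, uint16_t queue_id,
	 struct pkt **pkts, uint16_t nb_pkts)
{
	(void)port_id;
	(void)queue_id;
	(void)pkts;
	return nb_pkts;
}
```

Everything else about the device would stay behind the .c boundary, so its
container could change later without touching this table.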




 


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-08 13:10  0%   ` Xueming(Steven) Li
@ 2021-06-14 12:39  0%     ` Xueming(Steven) Li
  0 siblings, 0 replies; 200+ results
From: Xueming(Steven) Li @ 2021-06-14 12:39 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: dev, John McNamara, Luca Boccassi, NBU-Contact-Thomas Monjalon,
	Christian Ehrhardt, Ferruh Yigit, David Marchand



> -----Original Message-----
> From: Xueming(Steven) Li
> Sent: Tuesday, June 8, 2021 9:10 PM
> To: Kevin Traynor <ktraynor@redhat.com>
> Cc: dev@dpdk.org; John McNamara <john.mcnamara@intel.com>; Luca Boccassi <bluca@debian.org>; NBU-Contact-Thomas
> Monjalon <thomas@monjalon.net>; Christian Ehrhardt <christian.ehrhardt@canonical.com>; Ferruh Yigit <ferruh.yigit@intel.com>;
> David Marchand <david.marchand@redhat.com>
> Subject: RE: 20.11.2 patches review and test
> 
> 
> 
> > -----Original Message-----
> > From: Kevin Traynor <ktraynor@redhat.com>
> > Sent: Tuesday, June 8, 2021 7:31 PM
> > To: Xueming(Steven) Li <xuemingl@nvidia.com>
> > Cc: dev@dpdk.org; John McNamara <john.mcnamara@intel.com>; Luca
> > Boccassi <bluca@debian.org>; NBU-Contact-Thomas Monjalon
> > <thomas@monjalon.net>; Christian Ehrhardt
> > <christian.ehrhardt@canonical.com>; Ferruh Yigit
> > <ferruh.yigit@intel.com>; David Marchand <david.marchand@redhat.com>
> > Subject: Re: 20.11.2 patches review and test
> >
> > (reduced Cc)
> >
> > Hi Steven,
> >
> > On 01/06/2021 08:54, Xueming(Steven) Li wrote:
> > > Hi all,
> > >
> > > Here is a list of patches targeted for stable release 20.11.2.
> > >
> > > The planned date for the final release is 15th June.
> > >
> > > Please help with testing and validation of your use cases and report
> > > any issues/results with reply-all to this mail. For the final
> > > release the fixes and reported validations will be added to the release notes.
> > >
> > > A release candidate tarball can be found at:
> > >
> > >     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc1
> > >
> > > These patches are located at branch 20.11 of dpdk-stable repo:
> > >     https://dpdk.org/browse/dpdk-stable/
> > >
> >
> > Is the list of patches up to 21.05? Did you drop the fixes for
> > GCC11/clang12? I didn't see them here or in the failed list. I think
> > there are a couple that didn't get the right tags, but the ones that did
> > seem to be missing too.
> 
> You are correct, some fixes from v21.05rc1 - v21.05 are missing.
> This seems to be an issue in ./devtools/git-log-fixes.sh: if the script is run with another branch checked out, some patches are hidden.

Fixed the scripts with 2 patches:
1. look for stable version tag with name pattern.
2. auto resolve branch used to lookup fixes.
http://patchwork.dpdk.org/project/dpdk/list/?series=17303


> I will make another scan soon.
> 
> >
> > It would mean that 20.11.2 would not compile on the latest Fedora (34) with the distro packaged compiler versions.
> >
> > Kevin.
> >
> > >
> > > Thanks.
> > >
> > > Xueming Li <xuemingl@nvidia.com>
> > >
> > > ---
> > > Ajit Khaparde (3):
> > >       net/bnxt: fix RSS context cleanup
> > >       net/bnxt: check kvargs parsing
> > >       net/bnxt: fix resource cleanup
> > >
> > > Alvin Zhang (7):
> > >       net/ice: fix VLAN filter with PF
> > >       net/i40e: fix input set field mask
> > >       net/igc: fix Rx RSS hash offload capability
> > >       net/igc: fix Rx error counter for bad length
> > >       net/e1000: fix Rx error counter for bad length
> > >       net/e1000: fix max Rx packet size
> > >       net/igc: fix Rx packet size
> > >
> > > Anatoly Burakov (2):
> > >       fbarray: fix log message on truncation error
> > >       power: do not skip saving original P-state governor
> > >
> > > Andrew Boyer (1):
> > >       net/ionic: fix completion type in lif init
> > >
> > > Andrew Rybchenko (3):
> > >       net/failsafe: fix RSS hash offload reporting
> > >       net/failsafe: report minimum and maximum MTU
> > >       common/sfc_efx: remove GENEVE from supported tunnels
> > >
> > > Ankur Dwivedi (1):
> > >       crypto/octeontx: fix session-less mode
> > >
> > > Apeksha Gupta (1):
> > >       examples/l2fwd-crypto: skip masked devices
> > >
> > > Arek Kusztal (1):
> > >       crypto/qat: fix offset for out-of-place scatter-gather
> > >
> > > Beilei Xing (1):
> > >       net/i40evf: fix packet loss for X722
> > >
> > > Bruce Richardson (1):
> > >       build: exclude meson files from examples installation
> > >
> > > Chenbo Xia (1):
> > >       examples/vhost: check memory table query
> > >
> > > Chengchang Tang (15):
> > >       net/hns3: fix HW buffer size on MTU update
> > >       net/hns3: fix processing Tx offload flags
> > >       net/hns3: fix Tx checksum for UDP packets with special port
> > >       net/hns3: fix long task queue pairs reset time
> > >       ethdev: validate input in module EEPROM dump
> > >       ethdev: validate input in register info
> > >       ethdev: validate input in EEPROM info
> > >       net/hns3: fix rollback after setting PVID failure
> > >       net/hns3: fix timing in resetting queues
> > >       net/hns3: fix queue state when concurrent with reset
> > >       net/hns3: fix configure FEC when concurrent with reset
> > >       net/hns3: fix use of command status enumeration
> > >       examples: add eal cleanup to examples
> > >       net/bonding: fix adding itself as its slave
> > >       net/hns3: fix timing in mailbox
> > >
> > > Chengwen Feng (15):
> > >       net/hns3: fix flow counter value
> > >       net/hns3: fix VF mailbox head field
> > >       net/hns3: support get device version when dump register
> > >       net/hns3: fix some packet types
> > >       net/hns3: fix missing outer L4 UDP flag for VXLAN
> > >       net/hns3: remove VLAN/QinQ ptypes from support list
> > >       test: check thread creation
> > >       common/dpaax: fix possible null pointer access
> > >       examples/ethtool: remove unused parsing
> > >       net/hns3: fix flow director lock
> > >       net/e1000/base: fix timeout for shadow RAM write
> > >       net/hns3: fix setting default MAC address in bonding of VF
> > >       net/hns3: fix possible mismatched response of mailbox
> > >       net/hns3: fix VF handling LSC event in secondary process
> > >       net/hns3: fix verification of NEON support
> > >
> > > Ciara Loftus (1):
> > >       net/af_xdp: fix error handling during Rx queue setup
> > >
> > > Conor Walsh (1):
> > >       examples/l3fwd: fix LPM IPv6 subnets
> > >
> > > Cristian Dumitrescu (3):
> > >       table: fix actions with different data size
> > >       pipeline: fix instruction translation
> > >       pipeline: fix endianness conversions
> > >
> > > Dapeng Yu (3):
> > >       net/igc: remove MTU setting limitation
> > >       net/e1000: remove MTU setting limitation
> > >       examples/packet_ordering: fix port configuration
> > >
> > > David Harton (1):
> > >       net/ena: fix releasing Tx ring mbufs
> > >
> > > David Marchand (8):
> > >       doc: fix sphinx rtd theme import in GHA
> > >       service: clean references to removed symbol
> > >       eal: fix evaluation of log level option
> > >       ci: hook to GitHub Actions
> > >       ci: enable v21 ABI checks
> > >       ci: fix package installation in GitHub Actions
> > >       ci: ignore APT update failure in GitHub Actions
> > >       ci: catch coredumps
> > >
> > > Dekel Peled (1):
> > >       common/mlx5: fix DevX read output buffer size
> > >
> > > Dmitry Kozlyuk (3):
> > >       net/pcap: fix format string
> > >       eal/windows: add missing SPDX license tag
> > >       buildtools: fix all drivers disabled on Windows
> > >
> > > Ed Czeck (2):
> > >       net/ark: update packet director initial state
> > >       net/ark: refactor Rx buffer recovery
> > >
> > > Elad Nachman (2):
> > >       kni: support async user request
> > >       kni: fix kernel deadlock with bifurcated device
> > >
> > > Feifei Wang (2):
> > >       net/i40e: fix parsing packet type for NEON
> > >       test/trace: fix race on collected perf data
> > >
> > > Ferruh Yigit (3):
> > >       power: remove duplicated symbols from map file
> > >       log/linux: make default output stderr
> > >       license: fix typos
> > >
> > > Guoyang Zhou (1):
> > > >       net/hinic: fix crash in secondary process
> > >
> > > Haiyue Wang (1):
> > >       net/ixgbe: fix Rx errors statistics for UDP checksum
> > >
> > > Harman Kalra (1):
> > >       event/octeontx2: fix device reconfigure for single slot
> > >
> > > Hongbo Zheng (3):
> > >       app/testpmd: fix Tx/Rx descriptor query error log
> > >       net/hns3: fix FLR miss detection
> > >       net/hns3: delete redundant blank line
> > >
> > > Huisong Li (11):
> > >       net/hns3: fix device capabilities for copper media type
> > >       net/hns3: remove unused parameter markers
> > >       net/hns3: fix reporting undefined speed
> > >       net/hns3: fix link update when failed to get link info
> > >       net/hns3: fix flow control exception
> > >       app/testpmd: fix bitmap of link speeds when force speed
> > >       net/hns3: fix flow control mode
> > >       net/hns3: remove redundant mailbox response
> > >       net/hns3: fix DCB mode check
> > >       net/hns3: fix VMDq mode check
> > >       net/hns3: fix mbuf leakage
> > >
> > > Ibtisam Tariq (1):
> > >       examples/vhost_crypto: remove unused short option
> > >
> > > Igor Russkikh (2):
> > >       net/qede: reduce log verbosity
> > >       net/qede: accept bigger RSS table
> > >
> > > Ilya Maximets (1):
> > >       net/virtio: fix interrupt unregistering for listening socket
> > >
> > > Ivan Malov (5):
> > >       net/sfc: fix buffer size for flow parse
> > >       net: fix comment in IPv6 header
> > >       net/sfc: fix error path inconsistency
> > >       common/sfc_efx/base: fix indication of MAE encap support
> > >       net/sfc: fix outer rule rollback on error
> > >
> > > Jiawei Wang (2):
> > >       app/testpmd: fix NVGRE encap configuration
> > >       net/mlx5: fix resource release for mirror flow
> > >
> > > Jiawei Zhu (1):
> > >       net/mlx5: fix Rx segmented packets on mbuf starvation
> > >
> > > Jiawen Wu (3):
> > >       net/txgbe: remove unused functions
> > >       net/txgbe: fix Rx missed packet counter
> > >       net/txgbe: update packet type
> > >
> > > John Daley (1):
> > >       net/enic: fix flow initialization error handling
> > >
> > > Kalesh AP (18):
> > >       net/bnxt: remove unused macro
> > >       net/bnxt: fix VNIC configuration
> > >       net/bnxt: fix firmware fatal error handling
> > >       net/bnxt: fix FW readiness check during recovery
> > >       net/bnxt: fix device readiness check
> > >       net/bnxt: fix VF info allocation
> > >       net/bnxt: fix HWRM and FW incompatibility handling
> > >       net/bnxt: mute some failure logs
> > >       app/testpmd: check MAC address query
> > >       net/bnxt: fix PCI write check
> > >       net/bnxt: fix link state operations
> > >       net/bnxt: fix timesync when PTP is not supported
> > >       net/bnxt: fix memory allocation for command response
> > >       net/bnxt: fix double free in port start failure
> > >       net/bnxt: fix configuring LRO
> > >       net/bnxt: fix health check alarm cancellation
> > >       net/bnxt: fix PTP support for Thor
> > >       net/bnxt: fix ring count calculation for Thor
> > >
> > > Kevin Traynor (1):
> > >       test/cmdline: fix inputs array
> > >
> > > Lance Richardson (6):
> > >       net/bnxt: fix Rx buffer posting
> > >       net/bnxt: fix Tx length hint threshold
> > >       net/bnxt: fix handling of null flow mask
> > >       test: fix TCP header initialization
> > >       net/bnxt: fix Rx descriptor status
> > >       net/bnxt: fix Rx queue count
> > >
> > > Leyi Rong (1):
> > >       net/iavf: fix packet length parsing in AVX512
> > >
> > > Li Zhang (1):
> > >       net/mlx5: fix flow actions index in cache
> > >
> > > Luc Pelletier (2):
> > >       eal: fix race in control thread creation
> > >       eal: fix hang in control thread creation
> > >
> > > Marvin Liu (5):
> > >       vhost: fix split ring potential buffer overflow
> > >       vhost: fix packed ring potential buffer overflow
> > >       vhost: fix batch dequeue potential buffer overflow
> > >       vhost: fix initialization of temporary header
> > >       vhost: fix initialization of async temporary header
> > >
> > > Matan Azrad (4):
> > >       common/mlx5/linux: add glue function to query WQ
> > >       common/mlx5: add DevX command to query WQ
> > >       common/mlx5: add DevX commands for queue counters
> > >       vdpa/mlx5: fix virtq cleaning
> > >
> > > Min Hu (Connor) (8):
> > >       net/hns3: fix MTU config complexity
> > >       net/hns3: update HiSilicon copyright syntax
> > >       net/hns3: fix copyright date
> > >       examples/ptpclient: remove wrong comment
> > >       test/bpf: fix error message
> > >       doc: fix HiSilicon copyright syntax
> > >       net/hns3: remove unused macros
> > >       net/hns3: remove unused macro
> > >
> > > Murphy Yang (3):
> > >       net/ixgbe: fix RSS RETA being reset after port start
> > >       net/i40e: fix flow director config after flow validate
> > >       net/i40e: fix flow director for common pctypes
> > >
> > > Natanael Copa (5):
> > >       common/dpaax/caamflib: fix build with musl
> > >       bus/dpaa: fix 64-bit arch detection
> > >       bus/dpaa: fix build with musl
> > >       net/cxgbe: remove use of uint type
> > >       app/testpmd: fix build with musl
> > >
> > > Nipun Gupta (1):
> > >       bus/dpaa: fix statistics reading
> > >
> > > Nithin Dabilpuram (3):
> > >       vfio: do not merge contiguous areas
> > >       vfio: fix DMA mapping granularity for IOVA as VA
> > >       test/mem: fix page size for external memory
> > >
> > > Pallavi Kadam (1):
> > >       bus/pci: skip probing some Windows NDIS devices
> > >
> > > Pavan Nikhilesh (2):
> > >       test/event: fix timeout accuracy
> > >       app/eventdev: fix timeout accuracy
> > >
> > > Pu Xu (1):
> > >       ip_frag: fix fragmenting IPv4 packet with header option
> > >
> > > Qi Zhang (7):
> > >       net/ice/base: fix payload indicator on ptype
> > >       net/ice/base: fix uninitialized struct
> > >       net/ice/base: cleanup filter list on error
> > >       net/ice/base: fix memory allocation for MAC addresses
> > >       net/iavf: fix TSO max segment size
> > >       doc: fix matching versions in ice guide
> > >       net/iavf: fix wrong Tx context descriptor
> > >
> > > Radha Mohan Chintakuntla (1):
> > >       raw/octeontx2_dma: assign PCI device in DPI VF
> > >
> > > Raslan Darawsheh (1):
> > >       ethdev: update flow item GTP QFI definition
> > >
> > > Richael Zhuang (2):
> > >       test/power: add delay before checking CPU frequency
> > >       test/power: round CPU frequency to check
> > >
> > > Robin Zhang (4):
> > >       net/i40e: announce request queue capability in PF
> > >       doc: update recommended versions for i40e
> > >       net/i40e: fix lack of MAC type when set MAC address
> > >       net/iavf: fix lack of MAC type when set MAC address
> > >
> > > Rohit Raj (3):
> > >       net/dpaa2: fix getting link status
> > >       net/dpaa: fix getting link status
> > >       examples/l2fwd-crypto: fix packet length while decryption
> > >
> > > Roy Shterman (1):
> > >       mem: fix freeing segments in --huge-unlink mode
> > >
> > > Satheesh Paul (1):
> > >       net/octeontx2: fix VLAN filter
> > >
> > > Savinay Dharmappa (1):
> > >       sched: fix traffic class oversubscription parameter
> > >
> > > Shijith Thotton (1):
> > >       eventdev: fix case to initiate crypto adapter service
> > >
> > > Siwar Zitouni (1):
> > >       net/ice: fix disabling promiscuous mode
> > >
> > > Somnath Kotur (3):
> > >       net/bnxt: fix xstats get
> > >       net/bnxt: fix Rx and Tx timestamps
> > >       net/bnxt: fix Tx timestamp init
> > >
> > > Stanislaw Kardach (1):
> > >       test: proceed if timer subsystem already initialized
> > >
> > > Stephen Hemminger (1):
> > >       kni: refactor user request processing
> > >
> > > Tal Shnaiderman (2):
> > >       eal/windows: fix default thread priority
> > >       eal/windows: fix return codes of pthread shim layer
> > >
> > > Tengfei Zhang (1):
> > >       net/pcap: fix file descriptor leak on close
> > >
> > > Thinh Tran (1):
> > >       test: fix autotest handling of skipped tests
> > >
> > > Thomas Monjalon (16):
> > >       bus/pci: fix Windows kernel driver categories
> > >       eal: fix comment of OS-specific header files
> > >       buildtools: fix build with busybox
> > >       build: detect execinfo library on Linux
> > >       build: remove redundant _GNU_SOURCE definitions
> > >       eal: fix build with musl
> > >       net/igc: remove use of uint type
> > >       event/dlb: fix header includes for musl
> > >       examples/bbdev: fix header include for musl
> > >       drivers: fix log level after loading
> > >       app/regex: fix usage text
> > >       app/testpmd: fix usage text
> > >       doc: fix names of UIO drivers
> > >       doc: fix build with Sphinx 4
> > >       bus/pci: support I/O port operations with musl
> > >       app: fix exit messages
> > >
> > > Tyler Retzlaff (1):
> > >       eal: add C++ include guard for reciprocal header
> > >
> > > Vadim Podovinnikov (1):
> > >       net/bonding: fix LACP system address check
> > >
> > > Venkat Duvvuru (1):
> > >       net/bnxt: fix queues per VNIC
> > >
> > > Viacheslav Ovsiienko (11):
> > >       net/mlx5: fix external buffer pool registration for Rx queue
> > >       net/mlx5: fix metadata item validation for ingress flows
> > >       net/mlx5: fix hashed list size for tunnel flow groups
> > >       net/mlx5: fix UAR allocation diagnostics messages
> > >       common/mlx5: add timestamp format support to DevX
> > >       vdpa/mlx5: support timestamp format
> > >       net/mlx5: fix Rx metadata leftovers
> > >       net/mlx5: fix drop action for Direct Rules/Verbs
> > >       net/mlx4: fix RSS action with null hash key
> > >       net/mlx5: support timestamp format
> > >       regex/mlx5: support timestamp format
> > >
> > > Wenjun Wu (2):
> > >       net/ice: check some functions return
> > >       net/ice: fix RSS hash update
> > >
> > > Wenwu Ma (1):
> > >       net/ice: fix illegal access when removing MAC filter
> > >
> > > Wenzhuo Lu (2):
> > >       net/iavf: fix crash in AVX512
> > >       net/ice: fix crash in AVX512
> > >
> > > Wisam Jaddo (1):
> > >       app/flow-perf: fix encap/decap actions
> > >
> > > Xiao Wang (1):
> > >       vdpa/ifc: check PCI config read
> > >
> > > Xiaoyu Min (4):
> > >       net/mlx5: support RSS expansion for IPv6 GRE
> > >       net/mlx5: fix shared inner RSS
> > >       net/mlx5: fix missing shared RSS hash types
> > >       net/mlx5: fix redundant flow after RSS expansion
> > >
> > > Xiaoyun Li (2):
> > >       app/testpmd: remove unnecessary UDP tunnel check
> > >       net/i40e: fix IPv4 fragment offload
> > >
> > > Youri Querry (1):
> > >       bus/fslmc: fix random portal hangs with qbman 5.0
> > >
> > > Yunjian Wang (3):
> > >       vfio: fix API description
> > >       net/mlx5: fix using flow tunnel before null check
> > >       vfio: fix duplicated user mem map
> > >


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-10  8:53  0%   ` Christian Ehrhardt
@ 2021-06-14 12:35  0%     ` Xueming(Steven) Li
  0 siblings, 0 replies; 200+ results
From: Xueming(Steven) Li @ 2021-06-14 12:35 UTC (permalink / raw)
  To: Christian Ehrhardt
  Cc: dpdk stable, dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani,
	benjamin.walker, David Christensen, hariprasad.govindharajan,
	Hemant Agrawal, Ian Stokes, Jerin Jacob, John McNamara,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, pingx.yu,
	qian.q.xu, Raslan Darawsheh, NBU-Contact-Thomas Monjalon,
	yuan.peng, zhaoyan.chen



> -----Original Message-----
> From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> Sent: Thursday, June 10, 2021 4:54 PM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: dpdk stable <stable@dpdk.org>; dev@dpdk.org; Abhishek Marathe <Abhishek.Marathe@microsoft.com>; Akhil Goyal
> <akhil.goyal@nxp.com>; Ali Alnubani <alialnu@nvidia.com>; benjamin.walker@intel.com; David Christensen
> <drc@linux.vnet.ibm.com>; hariprasad.govindharajan@intel.com; Hemant Agrawal <hemant.agrawal@nxp.com>; Ian Stokes
> <ian.stokes@intel.com>; Jerin Jacob <jerinj@marvell.com>; John McNamara <john.mcnamara@intel.com>; Ju-Hyoung Lee
> <juhlee@microsoft.com>; Kevin Traynor <ktraynor@redhat.com>; Luca Boccassi <bluca@debian.org>; Pei Zhang
> <pezhang@redhat.com>; pingx.yu@intel.com; qian.q.xu@intel.com; Raslan Darawsheh <rasland@nvidia.com>; NBU-Contact-Thomas
> Monjalon <thomas@monjalon.net>; yuan.peng@intel.com; zhaoyan.chen@intel.com
> Subject: Re: [dpdk-dev] 20.11.2 patches review and test
> 
> On Wed, Jun 9, 2021 at 1:56 PM Xueming(Steven) Li <xuemingl@nvidia.com> wrote:
> >
> > Hi all,
> >
> > Thanks to Kevin's feedback, we found that some patches between v21.05-rc1..v21.05 are missing.
> > I will roll out rc2 to include them all; please hold testing and verification until then.
> 
> Hi,
> chances are quite high that nowadays SLES15-SP3 will be broken for
> 20.11.2 as well.
> The fix isn't final yet (I had an early one applied and it failed for other SLES releases).
> I'd recommend watching the thread "[PATCH] kni: fix compilation on SLES15-SP3"
> and once concluded pull that into your next RC as well.

Sure, thanks for the reminder!

> 
> > Best Regards,
> > Xueming
> >
> >
> > > -----Original Message-----
> > > From: Xueming(Steven) Li <xuemingl@nvidia.com>
> > > Sent: Tuesday, June 1, 2021 3:55 PM
> > > Cc: dev@dpdk.org; Abhishek Marathe <Abhishek.Marathe@microsoft.com>;
> > > Akhil Goyal <akhil.goyal@nxp.com>; Ali Alnubani
> > > <alialnu@nvidia.com>; benjamin.walker@intel.com; David Christensen
> > > <drc@linux.vnet.ibm.com>; hariprasad.govindharajan@intel.com; Hemant
> > > Agrawal <hemant.agrawal@nxp.com>; Ian Stokes <ian.stokes@intel.com>;
> > > Jerin Jacob <jerinj@marvell.com>; John McNamara
> > > <john.mcnamara@intel.com>; Ju-Hyoung Lee <juhlee@microsoft.com>;
> > > Kevin Traynor <ktraynor@redhat.com>; Luca Boccassi
> > > <bluca@debian.org>; Pei Zhang <pezhang@redhat.com>;
> > > pingx.yu@intel.com; qian.q.xu@intel.com; Raslan Darawsheh
> > > <rasland@nvidia.com>; NBU-Contact-Thomas Monjalon
> > > <thomas@monjalon.net>; yuan.peng@intel.com; zhaoyan.chen@intel.com;
> > > Xueming(Steven) Li <xuemingl@nvidia.com>
> > > Subject: 20.11.2 patches review and test
> > >
> > > Hi all,
> > >
> > > Here is a list of patches targeted for stable release 20.11.2.
> > >
> > > The planned date for the final release is 15th June.
> > >
> > > Please help with testing and validation of your use cases and report
> > > > any issues/results with reply-all to this mail. For the final release the fixes and reported validations will be added to the release notes.
> > >
> > > A release candidate tarball can be found at:
> > >
> > >     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc1
> > >
> > > These patches are located at branch 20.11 of dpdk-stable repo:
> > >     https://dpdk.org/browse/dpdk-stable/
> > >
> > >
> > > Thanks.
> > >
> > > Xueming Li <xuemingl@nvidia.com>
> > >
> > > ---
> > > Ajit Khaparde (3):
> > >       net/bnxt: fix RSS context cleanup
> > >       net/bnxt: check kvargs parsing
> > >       net/bnxt: fix resource cleanup
> > >
> > > Alvin Zhang (7):
> > >       net/ice: fix VLAN filter with PF
> > >       net/i40e: fix input set field mask
> > >       net/igc: fix Rx RSS hash offload capability
> > >       net/igc: fix Rx error counter for bad length
> > >       net/e1000: fix Rx error counter for bad length
> > >       net/e1000: fix max Rx packet size
> > >       net/igc: fix Rx packet size
> > >
> > > Anatoly Burakov (2):
> > >       fbarray: fix log message on truncation error
> > >       power: do not skip saving original P-state governor
> > >
> > > Andrew Boyer (1):
> > >       net/ionic: fix completion type in lif init
> > >
> > > Andrew Rybchenko (3):
> > >       net/failsafe: fix RSS hash offload reporting
> > >       net/failsafe: report minimum and maximum MTU
> > >       common/sfc_efx: remove GENEVE from supported tunnels
> > >
> > > Ankur Dwivedi (1):
> > >       crypto/octeontx: fix session-less mode
> > >
> > > Apeksha Gupta (1):
> > >       examples/l2fwd-crypto: skip masked devices
> > >
> > > Arek Kusztal (1):
> > >       crypto/qat: fix offset for out-of-place scatter-gather
> > >
> > > Beilei Xing (1):
> > >       net/i40evf: fix packet loss for X722
> > >
> > > Bruce Richardson (1):
> > >       build: exclude meson files from examples installation
> > >
> > > Chenbo Xia (1):
> > >       examples/vhost: check memory table query
> > >
> > > Chengchang Tang (15):
> > >       net/hns3: fix HW buffer size on MTU update
> > >       net/hns3: fix processing Tx offload flags
> > >       net/hns3: fix Tx checksum for UDP packets with special port
> > >       net/hns3: fix long task queue pairs reset time
> > >       ethdev: validate input in module EEPROM dump
> > >       ethdev: validate input in register info
> > >       ethdev: validate input in EEPROM info
> > >       net/hns3: fix rollback after setting PVID failure
> > >       net/hns3: fix timing in resetting queues
> > >       net/hns3: fix queue state when concurrent with reset
> > >       net/hns3: fix configure FEC when concurrent with reset
> > >       net/hns3: fix use of command status enumeration
> > >       examples: add eal cleanup to examples
> > >       net/bonding: fix adding itself as its slave
> > >       net/hns3: fix timing in mailbox
> > >
> > > Chengwen Feng (15):
> > >       net/hns3: fix flow counter value
> > >       net/hns3: fix VF mailbox head field
> > >       net/hns3: support get device version when dump register
> > >       net/hns3: fix some packet types
> > >       net/hns3: fix missing outer L4 UDP flag for VXLAN
> > >       net/hns3: remove VLAN/QinQ ptypes from support list
> > >       test: check thread creation
> > >       common/dpaax: fix possible null pointer access
> > >       examples/ethtool: remove unused parsing
> > >       net/hns3: fix flow director lock
> > >       net/e1000/base: fix timeout for shadow RAM write
> > >       net/hns3: fix setting default MAC address in bonding of VF
> > >       net/hns3: fix possible mismatched response of mailbox
> > >       net/hns3: fix VF handling LSC event in secondary process
> > >       net/hns3: fix verification of NEON support
> > >
> > > Ciara Loftus (1):
> > >       net/af_xdp: fix error handling during Rx queue setup
> > >
> > > Conor Walsh (1):
> > >       examples/l3fwd: fix LPM IPv6 subnets
> > >
> > > Cristian Dumitrescu (3):
> > >       table: fix actions with different data size
> > >       pipeline: fix instruction translation
> > >       pipeline: fix endianness conversions
> > >
> > > Dapeng Yu (3):
> > >       net/igc: remove MTU setting limitation
> > >       net/e1000: remove MTU setting limitation
> > >       examples/packet_ordering: fix port configuration
> > >
> > > David Harton (1):
> > >       net/ena: fix releasing Tx ring mbufs
> > >
> > > David Marchand (8):
> > >       doc: fix sphinx rtd theme import in GHA
> > >       service: clean references to removed symbol
> > >       eal: fix evaluation of log level option
> > >       ci: hook to GitHub Actions
> > >       ci: enable v21 ABI checks
> > >       ci: fix package installation in GitHub Actions
> > >       ci: ignore APT update failure in GitHub Actions
> > >       ci: catch coredumps
> > >
> > > Dekel Peled (1):
> > >       common/mlx5: fix DevX read output buffer size
> > >
> > > Dmitry Kozlyuk (3):
> > >       net/pcap: fix format string
> > >       eal/windows: add missing SPDX license tag
> > >       buildtools: fix all drivers disabled on Windows
> > >
> > > Ed Czeck (2):
> > >       net/ark: update packet director initial state
> > >       net/ark: refactor Rx buffer recovery
> > >
> > > Elad Nachman (2):
> > >       kni: support async user request
> > >       kni: fix kernel deadlock with bifurcated device
> > >
> > > Feifei Wang (2):
> > >       net/i40e: fix parsing packet type for NEON
> > >       test/trace: fix race on collected perf data
> > >
> > > Ferruh Yigit (3):
> > >       power: remove duplicated symbols from map file
> > >       log/linux: make default output stderr
> > >       license: fix typos
> > >
> > > Guoyang Zhou (1):
> > > >       net/hinic: fix crash in secondary process
> > >
> > > Haiyue Wang (1):
> > >       net/ixgbe: fix Rx errors statistics for UDP checksum
> > >
> > > Harman Kalra (1):
> > >       event/octeontx2: fix device reconfigure for single slot
> > >
> > > Hongbo Zheng (3):
> > >       app/testpmd: fix Tx/Rx descriptor query error log
> > >       net/hns3: fix FLR miss detection
> > >       net/hns3: delete redundant blank line
> > >
> > > Huisong Li (11):
> > >       net/hns3: fix device capabilities for copper media type
> > >       net/hns3: remove unused parameter markers
> > >       net/hns3: fix reporting undefined speed
> > >       net/hns3: fix link update when failed to get link info
> > >       net/hns3: fix flow control exception
> > >       app/testpmd: fix bitmap of link speeds when force speed
> > >       net/hns3: fix flow control mode
> > >       net/hns3: remove redundant mailbox response
> > >       net/hns3: fix DCB mode check
> > >       net/hns3: fix VMDq mode check
> > >       net/hns3: fix mbuf leakage
> > >
> > > Ibtisam Tariq (1):
> > >       examples/vhost_crypto: remove unused short option
> > >
> > > Igor Russkikh (2):
> > >       net/qede: reduce log verbosity
> > >       net/qede: accept bigger RSS table
> > >
> > > Ilya Maximets (1):
> > >       net/virtio: fix interrupt unregistering for listening socket
> > >
> > > Ivan Malov (5):
> > >       net/sfc: fix buffer size for flow parse
> > >       net: fix comment in IPv6 header
> > >       net/sfc: fix error path inconsistency
> > >       common/sfc_efx/base: fix indication of MAE encap support
> > >       net/sfc: fix outer rule rollback on error
> > >
> > > Jiawei Wang (2):
> > >       app/testpmd: fix NVGRE encap configuration
> > >       net/mlx5: fix resource release for mirror flow
> > >
> > > Jiawei Zhu (1):
> > >       net/mlx5: fix Rx segmented packets on mbuf starvation
> > >
> > > Jiawen Wu (3):
> > >       net/txgbe: remove unused functions
> > >       net/txgbe: fix Rx missed packet counter
> > >       net/txgbe: update packet type
> > >
> > > John Daley (1):
> > >       net/enic: fix flow initialization error handling
> > >
> > > Kalesh AP (18):
> > >       net/bnxt: remove unused macro
> > >       net/bnxt: fix VNIC configuration
> > >       net/bnxt: fix firmware fatal error handling
> > >       net/bnxt: fix FW readiness check during recovery
> > >       net/bnxt: fix device readiness check
> > >       net/bnxt: fix VF info allocation
> > >       net/bnxt: fix HWRM and FW incompatibility handling
> > >       net/bnxt: mute some failure logs
> > >       app/testpmd: check MAC address query
> > >       net/bnxt: fix PCI write check
> > >       net/bnxt: fix link state operations
> > >       net/bnxt: fix timesync when PTP is not supported
> > >       net/bnxt: fix memory allocation for command response
> > >       net/bnxt: fix double free in port start failure
> > >       net/bnxt: fix configuring LRO
> > >       net/bnxt: fix health check alarm cancellation
> > >       net/bnxt: fix PTP support for Thor
> > >       net/bnxt: fix ring count calculation for Thor
> > >
> > > Kevin Traynor (1):
> > >       test/cmdline: fix inputs array
> > >
> > > Lance Richardson (6):
> > >       net/bnxt: fix Rx buffer posting
> > >       net/bnxt: fix Tx length hint threshold
> > >       net/bnxt: fix handling of null flow mask
> > >       test: fix TCP header initialization
> > >       net/bnxt: fix Rx descriptor status
> > >       net/bnxt: fix Rx queue count
> > >
> > > Leyi Rong (1):
> > >       net/iavf: fix packet length parsing in AVX512
> > >
> > > Li Zhang (1):
> > >       net/mlx5: fix flow actions index in cache
> > >
> > > Luc Pelletier (2):
> > >       eal: fix race in control thread creation
> > >       eal: fix hang in control thread creation
> > >
> > > Marvin Liu (5):
> > >       vhost: fix split ring potential buffer overflow
> > >       vhost: fix packed ring potential buffer overflow
> > >       vhost: fix batch dequeue potential buffer overflow
> > >       vhost: fix initialization of temporary header
> > >       vhost: fix initialization of async temporary header
> > >
> > > Matan Azrad (4):
> > >       common/mlx5/linux: add glue function to query WQ
> > >       common/mlx5: add DevX command to query WQ
> > >       common/mlx5: add DevX commands for queue counters
> > >       vdpa/mlx5: fix virtq cleaning
> > >
> > > Min Hu (Connor) (8):
> > >       net/hns3: fix MTU config complexity
> > >       net/hns3: update HiSilicon copyright syntax
> > >       net/hns3: fix copyright date
> > >       examples/ptpclient: remove wrong comment
> > >       test/bpf: fix error message
> > >       doc: fix HiSilicon copyright syntax
> > >       net/hns3: remove unused macros
> > >       net/hns3: remove unused macro
> > >
> > > Murphy Yang (3):
> > >       net/ixgbe: fix RSS RETA being reset after port start
> > >       net/i40e: fix flow director config after flow validate
> > >       net/i40e: fix flow director for common pctypes
> > >
> > > Natanael Copa (5):
> > >       common/dpaax/caamflib: fix build with musl
> > >       bus/dpaa: fix 64-bit arch detection
> > >       bus/dpaa: fix build with musl
> > >       net/cxgbe: remove use of uint type
> > >       app/testpmd: fix build with musl
> > >
> > > Nipun Gupta (1):
> > >       bus/dpaa: fix statistics reading
> > >
> > > Nithin Dabilpuram (3):
> > >       vfio: do not merge contiguous areas
> > >       vfio: fix DMA mapping granularity for IOVA as VA
> > >       test/mem: fix page size for external memory
> > >
> > > Pallavi Kadam (1):
> > >       bus/pci: skip probing some Windows NDIS devices
> > >
> > > Pavan Nikhilesh (2):
> > >       test/event: fix timeout accuracy
> > >       app/eventdev: fix timeout accuracy
> > >
> > > Pu Xu (1):
> > >       ip_frag: fix fragmenting IPv4 packet with header option
> > >
> > > Qi Zhang (7):
> > >       net/ice/base: fix payload indicator on ptype
> > >       net/ice/base: fix uninitialized struct
> > >       net/ice/base: cleanup filter list on error
> > >       net/ice/base: fix memory allocation for MAC addresses
> > >       net/iavf: fix TSO max segment size
> > >       doc: fix matching versions in ice guide
> > >       net/iavf: fix wrong Tx context descriptor
> > >
> > > Radha Mohan Chintakuntla (1):
> > >       raw/octeontx2_dma: assign PCI device in DPI VF
> > >
> > > Raslan Darawsheh (1):
> > >       ethdev: update flow item GTP QFI definition
> > >
> > > Richael Zhuang (2):
> > >       test/power: add delay before checking CPU frequency
> > >       test/power: round CPU frequency to check
> > >
> > > Robin Zhang (4):
> > >       net/i40e: announce request queue capability in PF
> > >       doc: update recommended versions for i40e
> > >       net/i40e: fix lack of MAC type when set MAC address
> > >       net/iavf: fix lack of MAC type when set MAC address
> > >
> > > Rohit Raj (3):
> > >       net/dpaa2: fix getting link status
> > >       net/dpaa: fix getting link status
> > >       examples/l2fwd-crypto: fix packet length while decryption
> > >
> > > Roy Shterman (1):
> > >       mem: fix freeing segments in --huge-unlink mode
> > >
> > > Satheesh Paul (1):
> > >       net/octeontx2: fix VLAN filter
> > >
> > > Savinay Dharmappa (1):
> > >       sched: fix traffic class oversubscription parameter
> > >
> > > Shijith Thotton (1):
> > >       eventdev: fix case to initiate crypto adapter service
> > >
> > > Siwar Zitouni (1):
> > > >       net/ice: fix disabling promiscuous mode
> > > >
> > > > Somnath Kotur (3):
> > >       net/bnxt: fix xstats get
> > >       net/bnxt: fix Rx and Tx timestamps
> > >       net/bnxt: fix Tx timestamp init
> > >
> > > Stanislaw Kardach (1):
> > >       test: proceed if timer subsystem already initialized
> > >
> > > Stephen Hemminger (1):
> > >       kni: refactor user request processing
> > >
> > > Tal Shnaiderman (2):
> > >       eal/windows: fix default thread priority
> > >       eal/windows: fix return codes of pthread shim layer
> > >
> > > Tengfei Zhang (1):
> > >       net/pcap: fix file descriptor leak on close
> > >
> > > Thinh Tran (1):
> > >       test: fix autotest handling of skipped tests
> > >
> > > Thomas Monjalon (16):
> > >       bus/pci: fix Windows kernel driver categories
> > >       eal: fix comment of OS-specific header files
> > >       buildtools: fix build with busybox
> > >       build: detect execinfo library on Linux
> > >       build: remove redundant _GNU_SOURCE definitions
> > >       eal: fix build with musl
> > >       net/igc: remove use of uint type
> > >       event/dlb: fix header includes for musl
> > >       examples/bbdev: fix header include for musl
> > >       drivers: fix log level after loading
> > >       app/regex: fix usage text
> > >       app/testpmd: fix usage text
> > >       doc: fix names of UIO drivers
> > >       doc: fix build with Sphinx 4
> > >       bus/pci: support I/O port operations with musl
> > >       app: fix exit messages
> > >
> > > Tyler Retzlaff (1):
> > >       eal: add C++ include guard for reciprocal header
> > >
> > > Vadim Podovinnikov (1):
> > >       net/bonding: fix LACP system address check
> > >
> > > Venkat Duvvuru (1):
> > >       net/bnxt: fix queues per VNIC
> > >
> > > Viacheslav Ovsiienko (11):
> > >       net/mlx5: fix external buffer pool registration for Rx queue
> > >       net/mlx5: fix metadata item validation for ingress flows
> > >       net/mlx5: fix hashed list size for tunnel flow groups
> > >       net/mlx5: fix UAR allocation diagnostics messages
> > >       common/mlx5: add timestamp format support to DevX
> > >       vdpa/mlx5: support timestamp format
> > >       net/mlx5: fix Rx metadata leftovers
> > >       net/mlx5: fix drop action for Direct Rules/Verbs
> > >       net/mlx4: fix RSS action with null hash key
> > >       net/mlx5: support timestamp format
> > >       regex/mlx5: support timestamp format
> > >
> > > Wenjun Wu (2):
> > >       net/ice: check some functions return
> > >       net/ice: fix RSS hash update
> > >
> > > Wenwu Ma (1):
> > >       net/ice: fix illegal access when removing MAC filter
> > >
> > > Wenzhuo Lu (2):
> > >       net/iavf: fix crash in AVX512
> > >       net/ice: fix crash in AVX512
> > >
> > > Wisam Jaddo (1):
> > >       app/flow-perf: fix encap/decap actions
> > >
> > > Xiao Wang (1):
> > >       vdpa/ifc: check PCI config read
> > >
> > > Xiaoyu Min (4):
> > >       net/mlx5: support RSS expansion for IPv6 GRE
> > >       net/mlx5: fix shared inner RSS
> > >       net/mlx5: fix missing shared RSS hash types
> > >       net/mlx5: fix redundant flow after RSS expansion
> > >
> > > Xiaoyun Li (2):
> > >       app/testpmd: remove unnecessary UDP tunnel check
> > >       net/i40e: fix IPv4 fragment offload
> > >
> > > Youri Querry (1):
> > >       bus/fslmc: fix random portal hangs with qbman 5.0
> > >
> > > Yunjian Wang (3):
> > >       vfio: fix API description
> > >       net/mlx5: fix using flow tunnel before null check
> > >       vfio: fix duplicated user mem map
> 
> 
> 
> --
> Christian Ehrhardt
> Staff Engineer, Ubuntu Server
> Canonical Ltd

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v9 07/10] eal: implement functions for mutex management
  2021-06-09 22:37  0%           ` Dmitry Kozlyuk
@ 2021-06-12  2:39  0%             ` Narcisa Ana Maria Vasile
  0 siblings, 0 replies; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-12  2:39 UTC (permalink / raw)
  To: Dmitry Kozlyuk
  Cc: dev, thomas, khot, navasile, dmitrym, roretzla, talshn, ocardona,
	bruce.richardson, david.marchand, pallavi.kadam

On Thu, Jun 10, 2021 at 01:37:17AM +0300, Dmitry Kozlyuk wrote:
> 2021-06-09 02:04 (UTC+0300), Dmitry Kozlyuk:
> > 2021-06-04 16:44 (UTC-0700), Narcisa Ana Maria Vasile:
> > [...]
> > > diff --git a/lib/eal/include/rte_thread_types.h b/lib/eal/include/rte_thread_types.h
> > > index d67b24a563..7bb0d2948c 100644
> > > --- a/lib/eal/include/rte_thread_types.h
> > > +++ b/lib/eal/include/rte_thread_types.h
> > > @@ -7,4 +7,8 @@
> > >  
> > >  #include <pthread.h>
> > >  
> > > +#define RTE_THREAD_MUTEX_INITIALIZER     PTHREAD_MUTEX_INITIALIZER
> > > +
> > > +typedef pthread_mutex_t                 rte_thread_mutex_t;
> > > +
> > >  #endif /* _RTE_THREAD_TYPES_H_ */
> > > diff --git a/lib/eal/windows/include/rte_windows_thread_types.h b/lib/eal/windows/include/rte_windows_thread_types.h
> > > index 60e6d94553..c6c8502bfb 100644
> > > --- a/lib/eal/windows/include/rte_windows_thread_types.h
> > > +++ b/lib/eal/windows/include/rte_windows_thread_types.h
> > > @@ -7,4 +7,13 @@
> > >  
> > >  #include <rte_windows.h>
> > >  
> > > +#define WINDOWS_MUTEX_INITIALIZER               (void*)-1
> > > +#define RTE_THREAD_MUTEX_INITIALIZER            {WINDOWS_MUTEX_INITIALIZER}
> > > +
> > > +struct thread_mutex_t {
> > > +	void* mutex_id;
> > > +};
> > > +
> > > +typedef struct thread_mutex_t rte_thread_mutex_t;
> > > +
> > >  #endif /* _RTE_THREAD_TYPES_H_ */  
> > 
> > In previous patches rte_thread content was made opaque and of equal size
> > for pthread (most implementations) and non-pthread variant.
> > AFAIU, we agree on the requirement of compatible ABI between variants,
> > that is, a compiled app can work with any threading variant of DPDK.
> > Above definition of `rte_thread_mutex_t` does not satisfy it.
> > Or do we only promise API compatibility?
> > This is the most important question now.
> 
> From Windows community call 2021-06-10, for everyone's information.
> 
> 1. Yes, binary compatibility is a requirement.
> 
> 2. Static mutex initializer for Windows is tricky (an old topic).
> This patch proposes `rte_mutex` to hold a pointer to actual mutex
> and use NULL as static initializer, then allocate on first use.
> At the same time we want to use the same initializer for pthread variant.
> This means it would also need indirection, allocation, and tricky logic.
> 
> My opinion:
> 
> The new threading API can simply do without a static initializer.
> All its usages in DPDK could be converted to dynamic initialization,
> either in an appropriate function or using `RTE_INIT`.
> Maybe create a convenience macro to declare a static mutex and its
> initialization code, what do others think?
> 
> 	RTE_STATIC_MUTEX(private_lock)
> 
> Expanding to:
> 
> 	static RTE_DECLARE_MUTEX(private_lock)
> 	RTE_DEFINE_MUTEX(private_lock)
> 
> 
> Expanding to:
> 
> 	static rte_mutex private_lock;
> 
> 	RTE_INIT(__rte_private_lock_init)
> 	{
> 		RTE_VERIFY(rte_thread_mutex_init(&private_lock) == 0);
> 	}
> 
> As a bonus, it removes the need for `rte_*_thread_types.h`.

Thank you Dmitry, I think this is the best and most elegant solution.
I will use a pointer to represent the mutex:
typedef struct rte_thread_mutex_tag {
	void* mutex_id;
} rte_thread_mutex;

..and use the macro for static initializations as you described.
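
For illustration only, here is a minimal sketch of how the pointer-based mutex and
the static-declaration macro could fit together. It is not the code from the patch:
every `demo_*` / `DEMO_*` name is hypothetical, pthread stands in for the backend,
and a GCC/Clang constructor attribute stands in for `RTE_INIT`. The init function
is assumed to return 0 on success.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical opaque mutex: a single pointer, so its size and layout
 * are the same whichever threading backend sits behind it. */
typedef struct demo_mutex {
	void *mutex_id;
} demo_mutex;

/* Allocate and initialize the backend mutex; returns 0 on success. */
static int
demo_mutex_init(demo_mutex *m)
{
	pthread_mutex_t *impl = malloc(sizeof(*impl));
	int ret;

	if (impl == NULL)
		return -1;
	ret = pthread_mutex_init(impl, NULL);
	if (ret != 0) {
		free(impl);
		return ret;
	}
	m->mutex_id = impl;
	return 0;
}

static int
demo_mutex_lock(demo_mutex *m)
{
	assert(m->mutex_id != NULL); /* must have been initialized */
	return pthread_mutex_lock(m->mutex_id);
}

static int
demo_mutex_unlock(demo_mutex *m)
{
	assert(m->mutex_id != NULL);
	return pthread_mutex_unlock(m->mutex_id);
}

/* Declare a file-local mutex plus a constructor that initializes it
 * before main() runs (stand-in for RTE_INIT). */
#define DEMO_STATIC_MUTEX(name) \
	static demo_mutex name; \
	static void __attribute__((constructor, used)) \
	demo_##name##_init(void) \
	{ \
		if (demo_mutex_init(&name) != 0) \
			abort(); \
	}

DEMO_STATIC_MUTEX(private_lock)

static int counter;

/* Example user: increment a counter under the static mutex. */
static int
demo_increment(void)
{
	int value;

	demo_mutex_lock(&private_lock);
	value = ++counter;
	demo_mutex_unlock(&private_lock);
	return value;
}
```

With this shape no static initializer value is needed, and the application only
ever sees a pointer-sized struct; the cost is one allocation per mutex at startup.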


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in DPDK
  2021-06-01  3:06  2% ` [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in DPDK Chenbo Xia
@ 2021-06-11  7:15  0%   ` Thomas Monjalon
  2021-06-15  2:49  0%     ` Xia, Chenbo
  0 siblings, 1 reply; 200+ results
From: Thomas Monjalon @ 2021-06-11  7:15 UTC (permalink / raw)
  To: Chenbo Xia
  Cc: dev, cunming.liang, jingjing.wu, anatoly.burakov, ferruh.yigit,
	mdr, nhorman, bruce.richardson, david.marchand, stephen,
	konstantin.ananyev

01/06/2021 05:06, Chenbo Xia:
> Hi everyone,
> 
> This is a draft implementation of the mdev (Mediated device [1])
> support in DPDK PCI bus driver. Mdev is a way to virtualize devices
> in Linux kernel. Based on the device-api (mdev_type/device_api),
> there could be different types of mdev devices (e.g. vfio-pci).

Could you please illustrate this with a use case of mdev in DPDK?
What does it enable which is not possible today?

> In this patchset, the PCI bus driver is extended to support scanning
> and probing the mdev devices whose device-api is "vfio-pci".
> 
>                      +---------+
>                      | PCI bus |
>                      +----+----+
>                           |
>          +--------+-------+-------+--------+
>          |        |               |        |
>   Physical PCI devices ...   Mediated PCI devices ...
> 
> The first four patches in this patchset mainly prepare for mdev
> bus support. The remaining two patches are the key implementation of mdev bus.
> 
> The implementation of mdev bus in DPDK has several options:
> 
> 1: Embed mdev bus in current pci bus
> 
>    This patchset takes this option for an example. Mdev has several
>    device types: pci/platform/amba/ccw/ap. DPDK currently only cares
>    about pci devices among all mdev device types, so we could embed the mdev bus
>    into current pci bus. Then pci bus with mdev support will scan/plug/
>    unplug/.. not only normal pci devices but also mediated pci devices.

I think it is a different bus.
It would be cleaner to not touch the PCI bus.
Having a separate bus will allow an easy way to identify a device
with the new generic devargs syntax, example:
	bus=mdev,uuid=XXX
or more complex:
	bus=mdev,uuid=XXX/class=crypto/driver=qat,foo=bar
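
To make the identification idea concrete, here is a rough sketch of how the bus
layer of such a string could be read. It is not the DPDK devargs parser:
`demo_devargs_get` is a hypothetical helper, and it assumes layers are separated
by '/' and key=value pairs by ',', as in the examples above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in the first layer (the text before the first '/') of a
 * devargs string. On success, copy the value into `out` and return 0.
 * Return -1 if the key is absent from that layer or `out` is too small.
 */
static int
demo_devargs_get(const char *devargs, const char *key,
		 char *out, size_t outlen)
{
	const char *p = devargs;
	const char *end;
	size_t key_len;

	assert(devargs != NULL && key != NULL && out != NULL);
	end = devargs + strcspn(devargs, "/");
	key_len = strlen(key);

	while (p < end) {
		/* One pair ends at ',', at the layer separator, or at NUL. */
		size_t pair_len = strcspn(p, ",/");
		const char *eq = memchr(p, '=', pair_len);

		if (eq != NULL && (size_t)(eq - p) == key_len &&
		    strncmp(p, key, key_len) == 0) {
			size_t val_len = pair_len - key_len - 1;

			if (val_len >= outlen)
				return -1;
			memcpy(out, eq + 1, val_len);
			out[val_len] = '\0';
			return 0;
		}
		p += pair_len;
		if (p < end)
			p++; /* skip the ',' between pairs */
	}
	return -1;
}
```

A dedicated bus would then match on `bus=mdev` and use the `uuid` value as the
device address, leaving the class and driver layers to their own parsers.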

> 2: A new mdev bus that scans mediated pci devices and probes mdev driver to
>    plug-in pci devices to pci bus
> 
>    If we took this option, a new mdev bus will be implemented to scan
>    mediated pci devices and a new mdev driver for pci devices will be
>    implemented in pci bus to plug-in mediated pci devices to pci bus.
> 
>    Our RFC v1 takes this option:
>    http://patchwork.dpdk.org/project/dpdk/cover/20190403071844.21126-1-tiwei.bie@intel.com/
> 
>    Note that: for either option 1 or 2, device drivers do not know the
>    implementation difference but only use structs/functions exposed by
>    pci bus. Mediated pci devices are different from normal pci devices
>    on: 1. Mediated pci devices use UUID as address but normal ones use BDF.
>    2. Mediated pci devices may have some capabilities that normal pci
>    devices do not have. For example, mediated pci devices could have
>    regions that have sparse mmap capability, which allows a region to have
>    multiple mmap areas. Another example is mediated pci devices may have
>    regions/part of regions not mmaped but need to access them. Above
>    difference will change the current ABI (i.e., struct rte_pci_device).
>    Please check 5th and 6th patch for details.
> 
> 3. A brand new mdev bus that does everything
> 
>    This option will implement a new and standalone mdev bus. This option
>    does not need any changes in current pci bus but only needs some shared
>    code (linux vfio part) in pci bus. Drivers of devices that support mdev
>    will register itself as a mdev driver and do not rely on pci bus anymore.
>    This option, IMHO, will make the code clean. The only potential problem
>    may be code duplication, which could be solved by making code of linux
>    vfio part of pci bus common and shared.

Yes I prefer this third option.
We can find an elegant way of sharing some VFIO code between buses.



^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-06-10  4:10  0%               ` Gregory Etelson
@ 2021-06-10  9:22  4%                 ` Olivier Matz
  2021-06-14 16:36  4%                   ` Andrew Rybchenko
  0 siblings, 1 reply; 200+ results
From: Olivier Matz @ 2021-06-10  9:22 UTC (permalink / raw)
  To: Gregory Etelson
  Cc: Iremonger, Bernard, Morten Brørup, dev, Matan Azrad,
	Ori Kam, Raslan Darawsheh, Asaf Penso

Hi Gregory,

On Thu, Jun 10, 2021 at 04:10:25AM +0000, Gregory Etelson wrote:
> Hello,
>
> There has been no activity on that patch for a long time.
> The patch is marked as failed, but we checked the failed tests and concluded that the failures can be ignored.
> https://patchwork.dpdk.org/project/dpdk/patch/20210527152858.13312-1-getelson@nvidia.com/
> How should I proceed with this case?
> Please advise.
>

I like the idea of this patch: to me it is more convenient to access
these fields with a bitfield. I don't see a problem with using
bitfields here; glibc and FreeBSD netinet/ip.h do the same.

However, as stated previously, this patch breaks the initialization API.
The DPDK ABI/API policy is described here:
http://doc.dpdk.org/guides/contributing/abi_policy.html#the-dpdk-abi-policy

From this document:

  The API should only be changed for significant reasons, such as
  performance enhancements. API breakages due to changes such as
  reorganizing public structure fields for aesthetic or readability
  purposes should be avoided.

So to follow the project policy, I think we should reject this patch.
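To illustrate the trade-off, here is a minimal sketch using a trimmed,
hypothetical stand-in for the first two bytes of the header (struct hdr_new
is not the real rte_ipv4_hdr). It assumes GCC or Clang: the __BYTE_ORDER__
macros, uint8_t bitfields and the bitfield layout are compiler-specific.
The initializer in make_hdr() uses the braced form that the patch had to
introduce in test_flow_classify.c; existing positional initializers without
the inner braces may trigger warnings such as GCC's -Wmissing-braces,
depending on build flags.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the start of the IPv4 header after the patch. */
struct hdr_new {
	union {
		uint8_t version_ihl;    /* combined byte, kept for compatibility */
		struct {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			uint8_t ihl:4;
			uint8_t version:4;
#else
			uint8_t version:4;
			uint8_t ihl:4;
#endif
		};
	};
	uint8_t type_of_service;
};

/* The union must not change the size of the structure. */
static_assert(sizeof(struct hdr_new) == 2, "unexpected padding");

/* Bitfield access: set both nibbles through the new members. */
static struct hdr_new make_hdr(uint8_t version, uint8_t ihl)
{
	/* Braced initializer, as in the patched test code. */
	struct hdr_new h = { { .version_ihl = 0 }, 0 };

	h.version = version & 0x0f;
	h.ihl = ihl & 0x0f;
	return h;
}

/* Classic access: build the combined byte by hand. */
static uint8_t pack_byte(uint8_t version, uint8_t ihl)
{
	return (uint8_t)(((version & 0x0f) << 4) | (ihl & 0x0f));
}
```

Both paths yield the same byte, so the convenience is real; the cost is on
the initialization side, where callers may have to change their code.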

Regards,
Olivier

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-09 11:56  0% ` Xueming(Steven) Li
@ 2021-06-10  8:53  0%   ` Christian Ehrhardt
  2021-06-14 12:35  0%     ` Xueming(Steven) Li
  0 siblings, 1 reply; 200+ results
From: Christian Ehrhardt @ 2021-06-10  8:53 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dpdk stable, dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani,
	benjamin.walker, David Christensen, hariprasad.govindharajan,
	Hemant Agrawal, Ian Stokes, Jerin Jacob, John McNamara,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, pingx.yu,
	qian.q.xu, Raslan Darawsheh, NBU-Contact-Thomas Monjalon,
	yuan.peng, zhaoyan.chen

On Wed, Jun 9, 2021 at 1:56 PM Xueming(Steven) Li <xuemingl@nvidia.com> wrote:
>
> Hi all,
>
> Thanks for Kevin's feedback; some patches between v21.05-rc1..v21.05 are missing.
> I will roll out rc2 to include them all, so please hold testing and verification.

Hi,
chances are quite high that nowadays SLES15-SP3 will be broken for
20.11.2 as well.
The fix isn't final yet (I had an early one applied and it failed for
other SLES releases).
I'd recommend watching the thread "[PATCH] kni: fix compilation on SLES15-SP3"
and once concluded pull that into your next RC as well.

> Best Regards,
> Xueming
>
>
> > -----Original Message-----
> > From: Xueming(Steven) Li <xuemingl@nvidia.com>
> > Sent: Tuesday, June 1, 2021 3:55 PM
> > Cc: dev@dpdk.org; Abhishek Marathe <Abhishek.Marathe@microsoft.com>; Akhil Goyal <akhil.goyal@nxp.com>; Ali Alnubani
> > <alialnu@nvidia.com>; benjamin.walker@intel.com; David Christensen <drc@linux.vnet.ibm.com>;
> > hariprasad.govindharajan@intel.com; Hemant Agrawal <hemant.agrawal@nxp.com>; Ian Stokes <ian.stokes@intel.com>; Jerin Jacob
> > <jerinj@marvell.com>; John McNamara <john.mcnamara@intel.com>; Ju-Hyoung Lee <juhlee@microsoft.com>; Kevin Traynor
> > <ktraynor@redhat.com>; Luca Boccassi <bluca@debian.org>; Pei Zhang <pezhang@redhat.com>; pingx.yu@intel.com;
> > qian.q.xu@intel.com; Raslan Darawsheh <rasland@nvidia.com>; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>;
> > yuan.peng@intel.com; zhaoyan.chen@intel.com; Xueming(Steven) Li <xuemingl@nvidia.com>
> > Subject: 20.11.2 patches review and test
> >
> > Hi all,
> >
> > Here is a list of patches targeted for stable release 20.11.2.
> >
> > The planned date for the final release is 15th June.
> >
> > Please help with testing and validation of your use cases and report any issues/results with reply-all to this mail. For the final release
> > the fixes and reported validations will be added to the release notes.
> >
> > A release candidate tarball can be found at:
> >
> >     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc1
> >
> > These patches are located at branch 20.11 of dpdk-stable repo:
> >     https://dpdk.org/browse/dpdk-stable/
> >
> >
> > Thanks.
> >
> > Xueming Li <xuemingl@nvidia.com>
> >
> > ---
> > Ajit Khaparde (3):
> >       net/bnxt: fix RSS context cleanup
> >       net/bnxt: check kvargs parsing
> >       net/bnxt: fix resource cleanup
> >
> > Alvin Zhang (7):
> >       net/ice: fix VLAN filter with PF
> >       net/i40e: fix input set field mask
> >       net/igc: fix Rx RSS hash offload capability
> >       net/igc: fix Rx error counter for bad length
> >       net/e1000: fix Rx error counter for bad length
> >       net/e1000: fix max Rx packet size
> >       net/igc: fix Rx packet size
> >
> > Anatoly Burakov (2):
> >       fbarray: fix log message on truncation error
> >       power: do not skip saving original P-state governor
> >
> > Andrew Boyer (1):
> >       net/ionic: fix completion type in lif init
> >
> > Andrew Rybchenko (3):
> >       net/failsafe: fix RSS hash offload reporting
> >       net/failsafe: report minimum and maximum MTU
> >       common/sfc_efx: remove GENEVE from supported tunnels
> >
> > Ankur Dwivedi (1):
> >       crypto/octeontx: fix session-less mode
> >
> > Apeksha Gupta (1):
> >       examples/l2fwd-crypto: skip masked devices
> >
> > Arek Kusztal (1):
> >       crypto/qat: fix offset for out-of-place scatter-gather
> >
> > Beilei Xing (1):
> >       net/i40evf: fix packet loss for X722
> >
> > Bruce Richardson (1):
> >       build: exclude meson files from examples installation
> >
> > Chenbo Xia (1):
> >       examples/vhost: check memory table query
> >
> > Chengchang Tang (15):
> >       net/hns3: fix HW buffer size on MTU update
> >       net/hns3: fix processing Tx offload flags
> >       net/hns3: fix Tx checksum for UDP packets with special port
> >       net/hns3: fix long task queue pairs reset time
> >       ethdev: validate input in module EEPROM dump
> >       ethdev: validate input in register info
> >       ethdev: validate input in EEPROM info
> >       net/hns3: fix rollback after setting PVID failure
> >       net/hns3: fix timing in resetting queues
> >       net/hns3: fix queue state when concurrent with reset
> >       net/hns3: fix configure FEC when concurrent with reset
> >       net/hns3: fix use of command status enumeration
> >       examples: add eal cleanup to examples
> >       net/bonding: fix adding itself as its slave
> >       net/hns3: fix timing in mailbox
> >
> > Chengwen Feng (15):
> >       net/hns3: fix flow counter value
> >       net/hns3: fix VF mailbox head field
> >       net/hns3: support get device version when dump register
> >       net/hns3: fix some packet types
> >       net/hns3: fix missing outer L4 UDP flag for VXLAN
> >       net/hns3: remove VLAN/QinQ ptypes from support list
> >       test: check thread creation
> >       common/dpaax: fix possible null pointer access
> >       examples/ethtool: remove unused parsing
> >       net/hns3: fix flow director lock
> >       net/e1000/base: fix timeout for shadow RAM write
> >       net/hns3: fix setting default MAC address in bonding of VF
> >       net/hns3: fix possible mismatched response of mailbox
> >       net/hns3: fix VF handling LSC event in secondary process
> >       net/hns3: fix verification of NEON support
> >
> > Ciara Loftus (1):
> >       net/af_xdp: fix error handling during Rx queue setup
> >
> > Conor Walsh (1):
> >       examples/l3fwd: fix LPM IPv6 subnets
> >
> > Cristian Dumitrescu (3):
> >       table: fix actions with different data size
> >       pipeline: fix instruction translation
> >       pipeline: fix endianness conversions
> >
> > Dapeng Yu (3):
> >       net/igc: remove MTU setting limitation
> >       net/e1000: remove MTU setting limitation
> >       examples/packet_ordering: fix port configuration
> >
> > David Harton (1):
> >       net/ena: fix releasing Tx ring mbufs
> >
> > David Marchand (8):
> >       doc: fix sphinx rtd theme import in GHA
> >       service: clean references to removed symbol
> >       eal: fix evaluation of log level option
> >       ci: hook to GitHub Actions
> >       ci: enable v21 ABI checks
> >       ci: fix package installation in GitHub Actions
> >       ci: ignore APT update failure in GitHub Actions
> >       ci: catch coredumps
> >
> > Dekel Peled (1):
> >       common/mlx5: fix DevX read output buffer size
> >
> > Dmitry Kozlyuk (3):
> >       net/pcap: fix format string
> >       eal/windows: add missing SPDX license tag
> >       buildtools: fix all drivers disabled on Windows
> >
> > Ed Czeck (2):
> >       net/ark: update packet director initial state
> >       net/ark: refactor Rx buffer recovery
> >
> > Elad Nachman (2):
> >       kni: support async user request
> >       kni: fix kernel deadlock with bifurcated device
> >
> > Feifei Wang (2):
> >       net/i40e: fix parsing packet type for NEON
> >       test/trace: fix race on collected perf data
> >
> > Ferruh Yigit (3):
> >       power: remove duplicated symbols from map file
> >       log/linux: make default output stderr
> >       license: fix typos
> >
> > Guoyang Zhou (1):
> >       net/hinic: fix crash in secondary process
> >
> > Haiyue Wang (1):
> >       net/ixgbe: fix Rx errors statistics for UDP checksum
> >
> > Harman Kalra (1):
> >       event/octeontx2: fix device reconfigure for single slot
> >
> > Hongbo Zheng (3):
> >       app/testpmd: fix Tx/Rx descriptor query error log
> >       net/hns3: fix FLR miss detection
> >       net/hns3: delete redundant blank line
> >
> > Huisong Li (11):
> >       net/hns3: fix device capabilities for copper media type
> >       net/hns3: remove unused parameter markers
> >       net/hns3: fix reporting undefined speed
> >       net/hns3: fix link update when failed to get link info
> >       net/hns3: fix flow control exception
> >       app/testpmd: fix bitmap of link speeds when force speed
> >       net/hns3: fix flow control mode
> >       net/hns3: remove redundant mailbox response
> >       net/hns3: fix DCB mode check
> >       net/hns3: fix VMDq mode check
> >       net/hns3: fix mbuf leakage
> >
> > Ibtisam Tariq (1):
> >       examples/vhost_crypto: remove unused short option
> >
> > Igor Russkikh (2):
> >       net/qede: reduce log verbosity
> >       net/qede: accept bigger RSS table
> >
> > Ilya Maximets (1):
> >       net/virtio: fix interrupt unregistering for listening socket
> >
> > Ivan Malov (5):
> >       net/sfc: fix buffer size for flow parse
> >       net: fix comment in IPv6 header
> >       net/sfc: fix error path inconsistency
> >       common/sfc_efx/base: fix indication of MAE encap support
> >       net/sfc: fix outer rule rollback on error
> >
> > Jiawei Wang (2):
> >       app/testpmd: fix NVGRE encap configuration
> >       net/mlx5: fix resource release for mirror flow
> >
> > Jiawei Zhu (1):
> >       net/mlx5: fix Rx segmented packets on mbuf starvation
> >
> > Jiawen Wu (3):
> >       net/txgbe: remove unused functions
> >       net/txgbe: fix Rx missed packet counter
> >       net/txgbe: update packet type
> >
> > John Daley (1):
> >       net/enic: fix flow initialization error handling
> >
> > Kalesh AP (18):
> >       net/bnxt: remove unused macro
> >       net/bnxt: fix VNIC configuration
> >       net/bnxt: fix firmware fatal error handling
> >       net/bnxt: fix FW readiness check during recovery
> >       net/bnxt: fix device readiness check
> >       net/bnxt: fix VF info allocation
> >       net/bnxt: fix HWRM and FW incompatibility handling
> >       net/bnxt: mute some failure logs
> >       app/testpmd: check MAC address query
> >       net/bnxt: fix PCI write check
> >       net/bnxt: fix link state operations
> >       net/bnxt: fix timesync when PTP is not supported
> >       net/bnxt: fix memory allocation for command response
> >       net/bnxt: fix double free in port start failure
> >       net/bnxt: fix configuring LRO
> >       net/bnxt: fix health check alarm cancellation
> >       net/bnxt: fix PTP support for Thor
> >       net/bnxt: fix ring count calculation for Thor
> >
> > Kevin Traynor (1):
> >       test/cmdline: fix inputs array
> >
> > Lance Richardson (6):
> >       net/bnxt: fix Rx buffer posting
> >       net/bnxt: fix Tx length hint threshold
> >       net/bnxt: fix handling of null flow mask
> >       test: fix TCP header initialization
> >       net/bnxt: fix Rx descriptor status
> >       net/bnxt: fix Rx queue count
> >
> > Leyi Rong (1):
> >       net/iavf: fix packet length parsing in AVX512
> >
> > Li Zhang (1):
> >       net/mlx5: fix flow actions index in cache
> >
> > Luc Pelletier (2):
> >       eal: fix race in control thread creation
> >       eal: fix hang in control thread creation
> >
> > Marvin Liu (5):
> >       vhost: fix split ring potential buffer overflow
> >       vhost: fix packed ring potential buffer overflow
> >       vhost: fix batch dequeue potential buffer overflow
> >       vhost: fix initialization of temporary header
> >       vhost: fix initialization of async temporary header
> >
> > Matan Azrad (4):
> >       common/mlx5/linux: add glue function to query WQ
> >       common/mlx5: add DevX command to query WQ
> >       common/mlx5: add DevX commands for queue counters
> >       vdpa/mlx5: fix virtq cleaning
> >
> > Min Hu (Connor) (8):
> >       net/hns3: fix MTU config complexity
> >       net/hns3: update HiSilicon copyright syntax
> >       net/hns3: fix copyright date
> >       examples/ptpclient: remove wrong comment
> >       test/bpf: fix error message
> >       doc: fix HiSilicon copyright syntax
> >       net/hns3: remove unused macros
> >       net/hns3: remove unused macro
> >
> > Murphy Yang (3):
> >       net/ixgbe: fix RSS RETA being reset after port start
> >       net/i40e: fix flow director config after flow validate
> >       net/i40e: fix flow director for common pctypes
> >
> > Natanael Copa (5):
> >       common/dpaax/caamflib: fix build with musl
> >       bus/dpaa: fix 64-bit arch detection
> >       bus/dpaa: fix build with musl
> >       net/cxgbe: remove use of uint type
> >       app/testpmd: fix build with musl
> >
> > Nipun Gupta (1):
> >       bus/dpaa: fix statistics reading
> >
> > Nithin Dabilpuram (3):
> >       vfio: do not merge contiguous areas
> >       vfio: fix DMA mapping granularity for IOVA as VA
> >       test/mem: fix page size for external memory
> >
> > Pallavi Kadam (1):
> >       bus/pci: skip probing some Windows NDIS devices
> >
> > Pavan Nikhilesh (2):
> >       test/event: fix timeout accuracy
> >       app/eventdev: fix timeout accuracy
> >
> > Pu Xu (1):
> >       ip_frag: fix fragmenting IPv4 packet with header option
> >
> > Qi Zhang (7):
> >       net/ice/base: fix payload indicator on ptype
> >       net/ice/base: fix uninitialized struct
> >       net/ice/base: cleanup filter list on error
> >       net/ice/base: fix memory allocation for MAC addresses
> >       net/iavf: fix TSO max segment size
> >       doc: fix matching versions in ice guide
> >       net/iavf: fix wrong Tx context descriptor
> >
> > Radha Mohan Chintakuntla (1):
> >       raw/octeontx2_dma: assign PCI device in DPI VF
> >
> > Raslan Darawsheh (1):
> >       ethdev: update flow item GTP QFI definition
> >
> > Richael Zhuang (2):
> >       test/power: add delay before checking CPU frequency
> >       test/power: round CPU frequency to check
> >
> > Robin Zhang (4):
> >       net/i40e: announce request queue capability in PF
> >       doc: update recommended versions for i40e
> >       net/i40e: fix lack of MAC type when set MAC address
> >       net/iavf: fix lack of MAC type when set MAC address
> >
> > Rohit Raj (3):
> >       net/dpaa2: fix getting link status
> >       net/dpaa: fix getting link status
> >       examples/l2fwd-crypto: fix packet length while decryption
> >
> > Roy Shterman (1):
> >       mem: fix freeing segments in --huge-unlink mode
> >
> > Satheesh Paul (1):
> >       net/octeontx2: fix VLAN filter
> >
> > Savinay Dharmappa (1):
> >       sched: fix traffic class oversubscription parameter
> >
> > Shijith Thotton (1):
> >       eventdev: fix case to initiate crypto adapter service
> >
> > Siwar Zitouni (1):
> >       net/ice: fix disabling promiscuous mode
> >
> > Somnath Kotur (3):
> >       net/bnxt: fix xstats get
> >       net/bnxt: fix Rx and Tx timestamps
> >       net/bnxt: fix Tx timestamp init
> >
> > Stanislaw Kardach (1):
> >       test: proceed if timer subsystem already initialized
> >
> > Stephen Hemminger (1):
> >       kni: refactor user request processing
> >
> > Tal Shnaiderman (2):
> >       eal/windows: fix default thread priority
> >       eal/windows: fix return codes of pthread shim layer
> >
> > Tengfei Zhang (1):
> >       net/pcap: fix file descriptor leak on close
> >
> > Thinh Tran (1):
> >       test: fix autotest handling of skipped tests
> >
> > Thomas Monjalon (16):
> >       bus/pci: fix Windows kernel driver categories
> >       eal: fix comment of OS-specific header files
> >       buildtools: fix build with busybox
> >       build: detect execinfo library on Linux
> >       build: remove redundant _GNU_SOURCE definitions
> >       eal: fix build with musl
> >       net/igc: remove use of uint type
> >       event/dlb: fix header includes for musl
> >       examples/bbdev: fix header include for musl
> >       drivers: fix log level after loading
> >       app/regex: fix usage text
> >       app/testpmd: fix usage text
> >       doc: fix names of UIO drivers
> >       doc: fix build with Sphinx 4
> >       bus/pci: support I/O port operations with musl
> >       app: fix exit messages
> >
> > Tyler Retzlaff (1):
> >       eal: add C++ include guard for reciprocal header
> >
> > Vadim Podovinnikov (1):
> >       net/bonding: fix LACP system address check
> >
> > Venkat Duvvuru (1):
> >       net/bnxt: fix queues per VNIC
> >
> > Viacheslav Ovsiienko (11):
> >       net/mlx5: fix external buffer pool registration for Rx queue
> >       net/mlx5: fix metadata item validation for ingress flows
> >       net/mlx5: fix hashed list size for tunnel flow groups
> >       net/mlx5: fix UAR allocation diagnostics messages
> >       common/mlx5: add timestamp format support to DevX
> >       vdpa/mlx5: support timestamp format
> >       net/mlx5: fix Rx metadata leftovers
> >       net/mlx5: fix drop action for Direct Rules/Verbs
> >       net/mlx4: fix RSS action with null hash key
> >       net/mlx5: support timestamp format
> >       regex/mlx5: support timestamp format
> >
> > Wenjun Wu (2):
> >       net/ice: check some functions return
> >       net/ice: fix RSS hash update
> >
> > Wenwu Ma (1):
> >       net/ice: fix illegal access when removing MAC filter
> >
> > Wenzhuo Lu (2):
> >       net/iavf: fix crash in AVX512
> >       net/ice: fix crash in AVX512
> >
> > Wisam Jaddo (1):
> >       app/flow-perf: fix encap/decap actions
> >
> > Xiao Wang (1):
> >       vdpa/ifc: check PCI config read
> >
> > Xiaoyu Min (4):
> >       net/mlx5: support RSS expansion for IPv6 GRE
> >       net/mlx5: fix shared inner RSS
> >       net/mlx5: fix missing shared RSS hash types
> >       net/mlx5: fix redundant flow after RSS expansion
> >
> > Xiaoyun Li (2):
> >       app/testpmd: remove unnecessary UDP tunnel check
> >       net/i40e: fix IPv4 fragment offload
> >
> > Youri Querry (1):
> >       bus/fslmc: fix random portal hangs with qbman 5.0
> >
> > Yunjian Wang (3):
> >       vfio: fix API description
> >       net/mlx5: fix using flow tunnel before null check
> >       vfio: fix duplicated user mem map



-- 
Christian Ehrhardt
Staff Engineer, Ubuntu Server
Canonical Ltd

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-06-02  9:51  0%             ` Gregory Etelson
@ 2021-06-10  4:10  0%               ` Gregory Etelson
  2021-06-10  9:22  4%                 ` Olivier Matz
  0 siblings, 1 reply; 200+ results
From: Gregory Etelson @ 2021-06-10  4:10 UTC (permalink / raw)
  To: Iremonger, Bernard, Olivier Matz, Morten Brørup, dev
  Cc: Matan Azrad, Ori Kam, Raslan Darawsheh, Asaf Penso

Hello,

There has been no activity on that patch for a long time.
The patch is marked as failed, but we checked the failed tests and concluded that the failures can be ignored.
https://patchwork.dpdk.org/project/dpdk/patch/20210527152858.13312-1-getelson@nvidia.com/
How should I proceed with this case?
Please advise.

Thank you.

Regards,
Gregory

> -----Original Message-----
> From: Gregory Etelson
> Sent: Wednesday, June 2, 2021 12:52
> To: Morten Brørup <mb@smartsharesystems.com>; Iremonger, Bernard
> <bernard.iremonger@intel.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Ori Kam <orika@nvidia.com>;
> Raslan Darawsheh <rasland@nvidia.com>; Olivier Matz
> <olivier.matz@6wind.com>; Thomas Monjalon <tmonjalon@nvidia.com>
> Subject: RE: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
> 
> Hello,
> 
> Is there another concern about that patch ?
> Please comment.
> 
> Regards,
> Gregory
> 
> > -----Original Message-----
> > From: Gregory Etelson
> > Sent: Monday, May 31, 2021 14:10
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Morten Brørup
> > <mb@smartsharesystems.com>; dev@dpdk.org
> > Cc: Matan Azrad <matan@nvidia.com>; Ori Kam <orika@nvidia.com>; Raslan
> > Darawsheh <rasland@nvidia.com>; Iremonger, Bernard
> > <bernard.iremonger@intel.com>; Olivier Matz <olivier.matz@6wind.com>
> > Subject: RE: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version
> > fields
> >
> > > > > > > > > RTE IPv4 header definition combines the `version' and `ihl'
> > > > > > > > > fields into a single structure member.
> > > > > > > > > This patch introduces dedicated structure members for both
> > > > > > > > > `version' and `ihl' IPv4 fields. Separated header fields
> > > > > > > > > definitions allow to create simplified code to match on the
> > > > > > > > > IHL value in a flow rule.
> > > > > > > > > The original `version_ihl' structure member is kept for
> > > > > > > > > backward compatibility.
> > > > > > > >
> > > > > > > > Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> > > > > > > > ---
> > > > > > > >  app/test/test_flow_classify.c |  8 ++++----
> > > > > > > >  lib/net/rte_ip.h              | 16 +++++++++++++++-
> > > > > > > >  2 files changed, 19 insertions(+), 5 deletions(-)
> > > > > > > >
> > > > > > > > > diff --git a/app/test/test_flow_classify.c b/app/test/test_flow_classify.c
> > > > > > > > > index 951606f248..4f64be5357 100644
> > > > > > > > > --- a/app/test/test_flow_classify.c
> > > > > > > > > +++ b/app/test/test_flow_classify.c
> > > > > > > > > @@ -95,7 +95,7 @@ static struct rte_acl_field_def ipv4_defs[NUM_FIELDS_IPV4] = {
> > > > > > > > >   *  dst mask 255.255.255.00 / udp src is 32 dst is 33 / end"
> > > > > > > > >   */
> > > > > > > > >  static struct rte_flow_item_ipv4 ipv4_udp_spec_1 = {
> > > > > > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > > > > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > > > > > > >     RTE_IPV4(2, 2, 2, 3), RTE_IPV4(2, 2, 2, 7)}
> > > > > > > > >  };
> > > > > > > > >  static const struct rte_flow_item_ipv4 ipv4_mask_24 = {
> > > > > > > > > @@ -131,7 +131,7 @@ static struct rte_flow_item  end_item = { RTE_FLOW_ITEM_TYPE_END,
> > > > > > > > >   *  dst mask 255.255.255.00 / tcp src is 16 dst is 17 / end"
> > > > > > > > >   */
> > > > > > > > >  static struct rte_flow_item_ipv4 ipv4_tcp_spec_1 = {
> > > > > > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > > > > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > > > > > > >     RTE_IPV4(1, 2, 3, 4), RTE_IPV4(5, 6, 7, 8)}
> > > > > > > > >  };
> > > > > > > > >
> > > > > > > > > @@ -150,8 +150,8 @@ static struct rte_flow_item tcp_item_1 = { RTE_FLOW_ITEM_TYPE_TCP,
> > > > > > > > >   *  dst mask 255.255.255.00 / sctp src is 16 dst is 17/ end"
> > > > > > > > >   */
> > > > > > > > >  static struct rte_flow_item_ipv4 ipv4_sctp_spec_1 = {
> > > > > > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0, RTE_IPV4(11, 12, 13, 14),
> > > > > > > > > - RTE_IPV4(15, 16, 17, 18)}
> > > > > > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0,
> > > > > > > > > + RTE_IPV4(11, 12, 13, 14), RTE_IPV4(15, 16, 17, 18)}
> > > > > > > > >  };
> > > > > > > > >
> > > > > > > > >  static struct rte_flow_item_sctp sctp_spec_1 = {
> > > > > > > > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > > > > > > > > index 4b728969c1..684bb028b2 100644
> > > > > > > > > --- a/lib/net/rte_ip.h
> > > > > > > > > +++ b/lib/net/rte_ip.h
> > > > > > > > > @@ -38,7 +38,21 @@ extern "C" {
> > > > > > > > >   * IPv4 Header
> > > > > > > > >   */
> > > > > > > > >  struct rte_ipv4_hdr {
> > > > > > > > > - uint8_t  version_ihl;           /**< version and header length */
> > > > > > > > > + __extension__
> > > > > > > > > + union {
> > > > > > > > > +         uint8_t version_ihl;    /**< version and header length */
> > > > > > > > > +         struct {
> > > > > > > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > > > > > > +                 uint8_t ihl:4;
> > > > > > > > > +                 uint8_t version:4;
> > > > > > > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > > > > > > +                 uint8_t version:4;
> > > > > > > > > +                 uint8_t ihl:4;
> > > > > > > > > +#else
> > > > > > > > > +#error "setup endian definition"
> > > > > > > > > +#endif
> > > > > > > > +         };
> > > > > > > > + };
> > > > > > > >   uint8_t  type_of_service;       /**< type of service */
> > > > > > > >   rte_be16_t total_length;        /**< length of packet */
> > > > > > > >   rte_be16_t packet_id;           /**< packet ID */
> > > > > > > > --
> > > > > > > > 2.31.1
> > > > > > > >
> > > > > > >
> > > > > > > > This does not break the ABI, but it could be discussed if it
> > > > > > > > breaks the API due to the required structure initialization
> > > > > > > > changes shown in test_flow_classify.c.
> > > > > >
> > > > > > Yep, I guess it might be classified as API change.
> > > > > > Another thing that concerns me - it is not the only place in
> > > > > > IPv4 header when we unite multiple bit-fields into one field:
> > > > > > type_of_service, fragment_offset.
> > > > > > If we start splitting ipv4 fields into actual bitfields, I
> > > > > > suppose we'll end-up splitting these ones too.
> > > > > > But I am not sure it will pay off - as compiler not always
> > > > > > generates optimal code for reading/updating bitfields.
> > > > > > Did you consider just adding extra macros to simplify access
> > > > > > to these fields (like RTE_IPV4_HDR_(GET_SET)_*), instead?
> > > > > >
> > > > >
> > > > > Let's please not introduce accessor macros for bitfields. If we
> > > > > don't introduce bitfields like these, I would rather stick with
> > > > > the current _MASK, _SHIFT and _FLAG defines.
> > > > >
> > > > > > Yes, this change will lead to the introduction of more
> > > > > > bitfields, both here and in other places. We already accepted it
> > > > > > in the eCPRI structure (/lib/net/rte_ecpri.h), so why not just
> > > > > > generally accept it.
> > > > >
> > > > > Are modern compilers really worse at handling a bitfield defined
> > > > > like this, compared to handling a single uint8_t with hand coding?
> > > > > I consider your concern very important, so I'm only asking if it
> > > > > is still relevant, to avoid making decisions based on past
> > > > > experience that might be outdated. (I admit to falling into that
> > > > > trap myself, once in a while.)
> > > > >
> > > >
> > > > > I compared x86 code generated with gcc-9, gcc-10 and clang-10 for
> > > > > these 2 functions:
> > > > >
> > > > > void test_ipv4_hdr_byte(struct rte_ipv4_hdr *h, uint8_t version,
> > > > >                         uint8_t ihl)
> > > > > {
> > > > >       h->version_ihl = ((version & 0x0f) << 4) | (ihl & 0x0f);
> > > > > }
> > > > >
> > > > > void test_ipv4_hdr_bits(struct rte_ipv4_hdr *h, uint8_t version,
> > > > >                         uint8_t ihl)
> > > > > {
> > > > >       h->version = version & 0x0f;
> > > > >       h->ihl = ihl & 0x0f;
> > > > > }
> > > > >
> > > > > meson configuration flags: --default-library=static --buildtype=release
> > > > > Each compiler produced identical code for both functions.
> > >
> > > > For that particular case (2 bit-fields packed tightly into one byte)
> > > > compilers usually perform quite well. At least I never saw issues
> > > > for such case.
> > > > Bit-fields that do cross byte boundaries - that might be a trouble.
> > >
> >
> > Can we keep both implementations, the combined byte and the bit-field,
> > grouped into a union ? In that case application or PMD can select
> > access method that fits.
> >
> > > >
> > > >
> > > > > > > > I think this patch is an improvement, and that such structure
> > > > > > > > modifications should be generally accepted, so:
> > > > > > >
> > > > > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > > > >


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH v9 07/10] eal: implement functions for mutex management
  2021-06-08 23:04  3%         ` Dmitry Kozlyuk
@ 2021-06-09 22:37  0%           ` Dmitry Kozlyuk
  2021-06-12  2:39  0%             ` Narcisa Ana Maria Vasile
  0 siblings, 1 reply; 200+ results
From: Dmitry Kozlyuk @ 2021-06-09 22:37 UTC (permalink / raw)
  To: Narcisa Ana Maria Vasile
  Cc: dev, thomas, khot, navasile, dmitrym, roretzla, talshn, ocardona,
	bruce.richardson, david.marchand, pallavi.kadam

2021-06-09 02:04 (UTC+0300), Dmitry Kozlyuk:
> 2021-06-04 16:44 (UTC-0700), Narcisa Ana Maria Vasile:
> [...]
> > diff --git a/lib/eal/include/rte_thread_types.h b/lib/eal/include/rte_thread_types.h
> > index d67b24a563..7bb0d2948c 100644
> > --- a/lib/eal/include/rte_thread_types.h
> > +++ b/lib/eal/include/rte_thread_types.h
> > @@ -7,4 +7,8 @@
> >  
> >  #include <pthread.h>
> >  
> > +#define RTE_THREAD_MUTEX_INITIALIZER     PTHREAD_MUTEX_INITIALIZER
> > +
> > +typedef pthread_mutex_t                 rte_thread_mutex_t;
> > +
> >  #endif /* _RTE_THREAD_TYPES_H_ */
> > diff --git a/lib/eal/windows/include/rte_windows_thread_types.h b/lib/eal/windows/include/rte_windows_thread_types.h
> > index 60e6d94553..c6c8502bfb 100644
> > --- a/lib/eal/windows/include/rte_windows_thread_types.h
> > +++ b/lib/eal/windows/include/rte_windows_thread_types.h
> > @@ -7,4 +7,13 @@
> >  
> >  #include <rte_windows.h>
> >  
> > +#define WINDOWS_MUTEX_INITIALIZER               (void*)-1
> > +#define RTE_THREAD_MUTEX_INITIALIZER            {WINDOWS_MUTEX_INITIALIZER}
> > +
> > +struct thread_mutex_t {
> > +	void* mutex_id;
> > +};
> > +
> > +typedef struct thread_mutex_t rte_thread_mutex_t;
> > +
> >  #endif /* _RTE_THREAD_TYPES_H_ */  
> 
> In previous patches rte_thread content was made opaque and of equal size
> for pthread (most implementations) and non-pthread variant.
> AFAIU, we agree on the requirement of compatible ABI between variants,
> that is, a compiled app can work with any threading variant of DPDK.
> Above definition of `rte_thread_mutex_t` does not satisfy it.
> Or do we only promise API compatibility?
> This is the most important question now.

From Windows community call 2021-06-10, for everyone's information.

1. Yes, binary compatibility is a requirement.

2. Static mutex initializer for Windows is tricky (an old topic).
This patch proposes that `rte_mutex` hold a pointer to the actual mutex
and use NULL as the static initializer, then allocate on first use.
At the same time, we want to use the same initializer for the pthread variant.
This means it would also need indirection, allocation, and tricky logic.
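
To make the cost of that indirection concrete, here is a minimal sketch
(not the code from the patch) of a pointer-sized handle with NULL as the
static initializer and allocation on first lock. All `demo_*` names are
hypothetical, and the sketch assumes pthread and C11 atomics are available.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical handle: one pointer wide, NULL means "not yet allocated". */
struct demo_mutex {
	_Atomic(pthread_mutex_t *) impl;
};

#define DEMO_MUTEX_INITIALIZER { NULL }

/* Return the underlying mutex, allocating it on first use. */
static pthread_mutex_t *
demo_mutex_get(struct demo_mutex *m)
{
	pthread_mutex_t *cur = atomic_load(&m->impl);
	pthread_mutex_t *fresh;

	if (cur != NULL)
		return cur;

	fresh = malloc(sizeof(*fresh));
	if (fresh == NULL || pthread_mutex_init(fresh, NULL) != 0) {
		free(fresh);
		return NULL;
	}

	/* Only one thread wins the race; losers discard their copy. */
	if (atomic_compare_exchange_strong(&m->impl, &cur, fresh))
		return fresh;

	pthread_mutex_destroy(fresh);
	free(fresh);
	/* The failed compare-exchange stored the winner's pointer in cur. */
	assert(cur != NULL);
	return cur;
}

static int
demo_mutex_lock(struct demo_mutex *m)
{
	pthread_mutex_t *impl = demo_mutex_get(m);

	return impl != NULL ? pthread_mutex_lock(impl) : -1;
}

static int
demo_mutex_unlock(struct demo_mutex *m)
{
	pthread_mutex_t *impl = atomic_load(&m->impl);

	return impl != NULL ? pthread_mutex_unlock(impl) : -1;
}
```

Even in this reduced form, every lock pays for an atomic load, the first
lock can fail on allocation, and nothing frees the mutex unless a destroy
call is added.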

My opinion:

The new threading API can simply omit the static initializer.
All its usages in DPDK could be converted to dynamic initialization
either in appropriate function or using `RTE_INIT`.
Maybe create a convenient macro to declare a static mutex and its
initialization code, what do others think?

	RTE_STATIC_MUTEX(private_lock)

Expanding to:

	static RTE_DECLARE_MUTEX(private_lock)
	RTE_DEFINE_MUTEX(private_lock)


Which in turn expands to:

	static rte_mutex private_lock;

	RTE_INIT(__rte_private_lock_init)
	{
		RTE_VERIFY(rte_thread_mutex_init(&private_lock) == 0);
	}

As a bonus, it removes the need for `rte_*_thread_types.h`.
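
For illustration only, the macro idea can be sketched outside DPDK as
follows. This is a rough approximation under stated assumptions: the
`DEMO_*` names are hypothetical stand-ins for the proposed `RTE_*` macros,
the mutex is assumed to be pthread-backed, and the constructor attribute
(GCC/Clang) plays the role of `RTE_INIT`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Stand-in for the proposed rte_mutex type (assumption: pthread-backed). */
typedef pthread_mutex_t demo_mutex;

/* Stand-in for RTE_VERIFY(): abort if the condition does not hold. */
#define DEMO_VERIFY(cond) do { if (!(cond)) abort(); } while (0)

/* Stand-in for RTE_INIT(): run the function before main() (GCC/Clang). */
#define DEMO_INIT(func) \
	static void __attribute__((constructor)) func(void)

/* Declare a file-local mutex and the constructor that initializes it. */
#define DEMO_STATIC_MUTEX(name) \
	static demo_mutex name; \
	DEMO_INIT(demo_##name##_init) \
	{ \
		DEMO_VERIFY(pthread_mutex_init(&name, NULL) == 0); \
	}

DEMO_STATIC_MUTEX(private_lock)

static int counter;

/* Example user: the lock is already initialized when this runs. */
static int
demo_increment(void)
{
	int value;

	DEMO_VERIFY(pthread_mutex_lock(&private_lock) == 0);
	value = ++counter;
	DEMO_VERIFY(pthread_mutex_unlock(&private_lock) == 0);

	assert(value > 0); /* the counter only ever increases */
	return value;
}
```

The caller never sees an initializer value, so the public type can have any
layout; the trade-off is that the lock must not be used from another
constructor that may run earlier.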

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-01  7:54  1% [dpdk-dev] 20.11.2 patches review and test Xueming(Steven) Li
                   ` (2 preceding siblings ...)
  2021-06-08 11:31  0% ` Kevin Traynor
@ 2021-06-09 11:56  0% ` Xueming(Steven) Li
  2021-06-10  8:53  0%   ` Christian Ehrhardt
  3 siblings, 1 reply; 200+ results
From: Xueming(Steven) Li @ 2021-06-09 11:56 UTC (permalink / raw)
  To: dpdk stable
  Cc: dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani,
	benjamin.walker, David Christensen, hariprasad.govindharajan,
	Hemant Agrawal, Ian Stokes, Jerin Jacob, John McNamara,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, pingx.yu,
	qian.q.xu, Raslan Darawsheh, NBU-Contact-Thomas Monjalon,
	yuan.peng, zhaoyan.chen

Hi all,

Thanks for Kevin's feedback; some patches between v21.05-rc1 and v21.05 are missing.
I will roll out rc2 to include them all, so please hold off on testing and verification.

Best Regards,
Xueming


> -----Original Message-----
> From: Xueming(Steven) Li <xuemingl@nvidia.com>
> Sent: Tuesday, June 1, 2021 3:55 PM
> Cc: dev@dpdk.org; Abhishek Marathe <Abhishek.Marathe@microsoft.com>; Akhil Goyal <akhil.goyal@nxp.com>; Ali Alnubani
> <alialnu@nvidia.com>; benjamin.walker@intel.com; David Christensen <drc@linux.vnet.ibm.com>;
> hariprasad.govindharajan@intel.com; Hemant Agrawal <hemant.agrawal@nxp.com>; Ian Stokes <ian.stokes@intel.com>; Jerin Jacob
> <jerinj@marvell.com>; John McNamara <john.mcnamara@intel.com>; Ju-Hyoung Lee <juhlee@microsoft.com>; Kevin Traynor
> <ktraynor@redhat.com>; Luca Boccassi <bluca@debian.org>; Pei Zhang <pezhang@redhat.com>; pingx.yu@intel.com;
> qian.q.xu@intel.com; Raslan Darawsheh <rasland@nvidia.com>; NBU-Contact-Thomas Monjalon <thomas@monjalon.net>;
> yuan.peng@intel.com; zhaoyan.chen@intel.com; Xueming(Steven) Li <xuemingl@nvidia.com>
> Subject: 20.11.2 patches review and test
> 
> Hi all,
> 
> Here is a list of patches targeted for stable release 20.11.2.
> 
> The planned date for the final release is 15th June.
> 
> Please help with testing and validation of your use cases and report any issues/results with reply-all to this mail. For the final release
> the fixes and reported validations will be added to the release notes.
> 
> A release candidate tarball can be found at:
> 
>     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc1
> 
> These patches are located at branch 20.11 of dpdk-stable repo:
>     https://dpdk.org/browse/dpdk-stable/
> 
> 
> Thanks.
> 
> Xueming Li <xuemingl@nvidia.com>
> 
> ---
> Ajit Khaparde (3):
>       net/bnxt: fix RSS context cleanup
>       net/bnxt: check kvargs parsing
>       net/bnxt: fix resource cleanup
> 
> Alvin Zhang (7):
>       net/ice: fix VLAN filter with PF
>       net/i40e: fix input set field mask
>       net/igc: fix Rx RSS hash offload capability
>       net/igc: fix Rx error counter for bad length
>       net/e1000: fix Rx error counter for bad length
>       net/e1000: fix max Rx packet size
>       net/igc: fix Rx packet size
> 
> Anatoly Burakov (2):
>       fbarray: fix log message on truncation error
>       power: do not skip saving original P-state governor
> 
> Andrew Boyer (1):
>       net/ionic: fix completion type in lif init
> 
> Andrew Rybchenko (3):
>       net/failsafe: fix RSS hash offload reporting
>       net/failsafe: report minimum and maximum MTU
>       common/sfc_efx: remove GENEVE from supported tunnels
> 
> Ankur Dwivedi (1):
>       crypto/octeontx: fix session-less mode
> 
> Apeksha Gupta (1):
>       examples/l2fwd-crypto: skip masked devices
> 
> Arek Kusztal (1):
>       crypto/qat: fix offset for out-of-place scatter-gather
> 
> Beilei Xing (1):
>       net/i40evf: fix packet loss for X722
> 
> Bruce Richardson (1):
>       build: exclude meson files from examples installation
> 
> Chenbo Xia (1):
>       examples/vhost: check memory table query
> 
> Chengchang Tang (15):
>       net/hns3: fix HW buffer size on MTU update
>       net/hns3: fix processing Tx offload flags
>       net/hns3: fix Tx checksum for UDP packets with special port
>       net/hns3: fix long task queue pairs reset time
>       ethdev: validate input in module EEPROM dump
>       ethdev: validate input in register info
>       ethdev: validate input in EEPROM info
>       net/hns3: fix rollback after setting PVID failure
>       net/hns3: fix timing in resetting queues
>       net/hns3: fix queue state when concurrent with reset
>       net/hns3: fix configure FEC when concurrent with reset
>       net/hns3: fix use of command status enumeration
>       examples: add eal cleanup to examples
>       net/bonding: fix adding itself as its slave
>       net/hns3: fix timing in mailbox
> 
> Chengwen Feng (15):
>       net/hns3: fix flow counter value
>       net/hns3: fix VF mailbox head field
>       net/hns3: support get device version when dump register
>       net/hns3: fix some packet types
>       net/hns3: fix missing outer L4 UDP flag for VXLAN
>       net/hns3: remove VLAN/QinQ ptypes from support list
>       test: check thread creation
>       common/dpaax: fix possible null pointer access
>       examples/ethtool: remove unused parsing
>       net/hns3: fix flow director lock
>       net/e1000/base: fix timeout for shadow RAM write
>       net/hns3: fix setting default MAC address in bonding of VF
>       net/hns3: fix possible mismatched response of mailbox
>       net/hns3: fix VF handling LSC event in secondary process
>       net/hns3: fix verification of NEON support
> 
> Ciara Loftus (1):
>       net/af_xdp: fix error handling during Rx queue setup
> 
> Conor Walsh (1):
>       examples/l3fwd: fix LPM IPv6 subnets
> 
> Cristian Dumitrescu (3):
>       table: fix actions with different data size
>       pipeline: fix instruction translation
>       pipeline: fix endianness conversions
> 
> Dapeng Yu (3):
>       net/igc: remove MTU setting limitation
>       net/e1000: remove MTU setting limitation
>       examples/packet_ordering: fix port configuration
> 
> David Harton (1):
>       net/ena: fix releasing Tx ring mbufs
> 
> David Marchand (8):
>       doc: fix sphinx rtd theme import in GHA
>       service: clean references to removed symbol
>       eal: fix evaluation of log level option
>       ci: hook to GitHub Actions
>       ci: enable v21 ABI checks
>       ci: fix package installation in GitHub Actions
>       ci: ignore APT update failure in GitHub Actions
>       ci: catch coredumps
> 
> Dekel Peled (1):
>       common/mlx5: fix DevX read output buffer size
> 
> Dmitry Kozlyuk (3):
>       net/pcap: fix format string
>       eal/windows: add missing SPDX license tag
>       buildtools: fix all drivers disabled on Windows
> 
> Ed Czeck (2):
>       net/ark: update packet director initial state
>       net/ark: refactor Rx buffer recovery
> 
> Elad Nachman (2):
>       kni: support async user request
>       kni: fix kernel deadlock with bifurcated device
> 
> Feifei Wang (2):
>       net/i40e: fix parsing packet type for NEON
>       test/trace: fix race on collected perf data
> 
> Ferruh Yigit (3):
>       power: remove duplicated symbols from map file
>       log/linux: make default output stderr
>       license: fix typos
> 
> Guoyang Zhou (1):
>       net/hinic: fix crash in secondary process
> 
> Haiyue Wang (1):
>       net/ixgbe: fix Rx errors statistics for UDP checksum
> 
> Harman Kalra (1):
>       event/octeontx2: fix device reconfigure for single slot
> 
> Hongbo Zheng (3):
>       app/testpmd: fix Tx/Rx descriptor query error log
>       net/hns3: fix FLR miss detection
>       net/hns3: delete redundant blank line
> 
> Huisong Li (11):
>       net/hns3: fix device capabilities for copper media type
>       net/hns3: remove unused parameter markers
>       net/hns3: fix reporting undefined speed
>       net/hns3: fix link update when failed to get link info
>       net/hns3: fix flow control exception
>       app/testpmd: fix bitmap of link speeds when force speed
>       net/hns3: fix flow control mode
>       net/hns3: remove redundant mailbox response
>       net/hns3: fix DCB mode check
>       net/hns3: fix VMDq mode check
>       net/hns3: fix mbuf leakage
> 
> Ibtisam Tariq (1):
>       examples/vhost_crypto: remove unused short option
> 
> Igor Russkikh (2):
>       net/qede: reduce log verbosity
>       net/qede: accept bigger RSS table
> 
> Ilya Maximets (1):
>       net/virtio: fix interrupt unregistering for listening socket
> 
> Ivan Malov (5):
>       net/sfc: fix buffer size for flow parse
>       net: fix comment in IPv6 header
>       net/sfc: fix error path inconsistency
>       common/sfc_efx/base: fix indication of MAE encap support
>       net/sfc: fix outer rule rollback on error
> 
> Jiawei Wang (2):
>       app/testpmd: fix NVGRE encap configuration
>       net/mlx5: fix resource release for mirror flow
> 
> Jiawei Zhu (1):
>       net/mlx5: fix Rx segmented packets on mbuf starvation
> 
> Jiawen Wu (3):
>       net/txgbe: remove unused functions
>       net/txgbe: fix Rx missed packet counter
>       net/txgbe: update packet type
> 
> John Daley (1):
>       net/enic: fix flow initialization error handling
> 
> Kalesh AP (18):
>       net/bnxt: remove unused macro
>       net/bnxt: fix VNIC configuration
>       net/bnxt: fix firmware fatal error handling
>       net/bnxt: fix FW readiness check during recovery
>       net/bnxt: fix device readiness check
>       net/bnxt: fix VF info allocation
>       net/bnxt: fix HWRM and FW incompatibility handling
>       net/bnxt: mute some failure logs
>       app/testpmd: check MAC address query
>       net/bnxt: fix PCI write check
>       net/bnxt: fix link state operations
>       net/bnxt: fix timesync when PTP is not supported
>       net/bnxt: fix memory allocation for command response
>       net/bnxt: fix double free in port start failure
>       net/bnxt: fix configuring LRO
>       net/bnxt: fix health check alarm cancellation
>       net/bnxt: fix PTP support for Thor
>       net/bnxt: fix ring count calculation for Thor
> 
> Kevin Traynor (1):
>       test/cmdline: fix inputs array
> 
> Lance Richardson (6):
>       net/bnxt: fix Rx buffer posting
>       net/bnxt: fix Tx length hint threshold
>       net/bnxt: fix handling of null flow mask
>       test: fix TCP header initialization
>       net/bnxt: fix Rx descriptor status
>       net/bnxt: fix Rx queue count
> 
> Leyi Rong (1):
>       net/iavf: fix packet length parsing in AVX512
> 
> Li Zhang (1):
>       net/mlx5: fix flow actions index in cache
> 
> Luc Pelletier (2):
>       eal: fix race in control thread creation
>       eal: fix hang in control thread creation
> 
> Marvin Liu (5):
>       vhost: fix split ring potential buffer overflow
>       vhost: fix packed ring potential buffer overflow
>       vhost: fix batch dequeue potential buffer overflow
>       vhost: fix initialization of temporary header
>       vhost: fix initialization of async temporary header
> 
> Matan Azrad (4):
>       common/mlx5/linux: add glue function to query WQ
>       common/mlx5: add DevX command to query WQ
>       common/mlx5: add DevX commands for queue counters
>       vdpa/mlx5: fix virtq cleaning
> 
> Min Hu (Connor) (8):
>       net/hns3: fix MTU config complexity
>       net/hns3: update HiSilicon copyright syntax
>       net/hns3: fix copyright date
>       examples/ptpclient: remove wrong comment
>       test/bpf: fix error message
>       doc: fix HiSilicon copyright syntax
>       net/hns3: remove unused macros
>       net/hns3: remove unused macro
> 
> Murphy Yang (3):
>       net/ixgbe: fix RSS RETA being reset after port start
>       net/i40e: fix flow director config after flow validate
>       net/i40e: fix flow director for common pctypes
> 
> Natanael Copa (5):
>       common/dpaax/caamflib: fix build with musl
>       bus/dpaa: fix 64-bit arch detection
>       bus/dpaa: fix build with musl
>       net/cxgbe: remove use of uint type
>       app/testpmd: fix build with musl
> 
> Nipun Gupta (1):
>       bus/dpaa: fix statistics reading
> 
> Nithin Dabilpuram (3):
>       vfio: do not merge contiguous areas
>       vfio: fix DMA mapping granularity for IOVA as VA
>       test/mem: fix page size for external memory
> 
> Pallavi Kadam (1):
>       bus/pci: skip probing some Windows NDIS devices
> 
> Pavan Nikhilesh (2):
>       test/event: fix timeout accuracy
>       app/eventdev: fix timeout accuracy
> 
> Pu Xu (1):
>       ip_frag: fix fragmenting IPv4 packet with header option
> 
> Qi Zhang (7):
>       net/ice/base: fix payload indicator on ptype
>       net/ice/base: fix uninitialized struct
>       net/ice/base: cleanup filter list on error
>       net/ice/base: fix memory allocation for MAC addresses
>       net/iavf: fix TSO max segment size
>       doc: fix matching versions in ice guide
>       net/iavf: fix wrong Tx context descriptor
> 
> Radha Mohan Chintakuntla (1):
>       raw/octeontx2_dma: assign PCI device in DPI VF
> 
> Raslan Darawsheh (1):
>       ethdev: update flow item GTP QFI definition
> 
> Richael Zhuang (2):
>       test/power: add delay before checking CPU frequency
>       test/power: round CPU frequency to check
> 
> Robin Zhang (4):
>       net/i40e: announce request queue capability in PF
>       doc: update recommended versions for i40e
>       net/i40e: fix lack of MAC type when set MAC address
>       net/iavf: fix lack of MAC type when set MAC address
> 
> Rohit Raj (3):
>       net/dpaa2: fix getting link status
>       net/dpaa: fix getting link status
>       examples/l2fwd-crypto: fix packet length while decryption
> 
> Roy Shterman (1):
>       mem: fix freeing segments in --huge-unlink mode
> 
> Satheesh Paul (1):
>       net/octeontx2: fix VLAN filter
> 
> Savinay Dharmappa (1):
>       sched: fix traffic class oversubscription parameter
> 
> Shijith Thotton (1):
>       eventdev: fix case to initiate crypto adapter service
> 
> Siwar Zitouni (1):
>       net/ice: fix disabling promiscuous mode
> 
> Somnath Kotur (3):
>       net/bnxt: fix xstats get
>       net/bnxt: fix Rx and Tx timestamps
>       net/bnxt: fix Tx timestamp init
> 
> Stanislaw Kardach (1):
>       test: proceed if timer subsystem already initialized
> 
> Stephen Hemminger (1):
>       kni: refactor user request processing
> 
> Tal Shnaiderman (2):
>       eal/windows: fix default thread priority
>       eal/windows: fix return codes of pthread shim layer
> 
> Tengfei Zhang (1):
>       net/pcap: fix file descriptor leak on close
> 
> Thinh Tran (1):
>       test: fix autotest handling of skipped tests
> 
> Thomas Monjalon (16):
>       bus/pci: fix Windows kernel driver categories
>       eal: fix comment of OS-specific header files
>       buildtools: fix build with busybox
>       build: detect execinfo library on Linux
>       build: remove redundant _GNU_SOURCE definitions
>       eal: fix build with musl
>       net/igc: remove use of uint type
>       event/dlb: fix header includes for musl
>       examples/bbdev: fix header include for musl
>       drivers: fix log level after loading
>       app/regex: fix usage text
>       app/testpmd: fix usage text
>       doc: fix names of UIO drivers
>       doc: fix build with Sphinx 4
>       bus/pci: support I/O port operations with musl
>       app: fix exit messages
> 
> Tyler Retzlaff (1):
>       eal: add C++ include guard for reciprocal header
> 
> Vadim Podovinnikov (1):
>       net/bonding: fix LACP system address check
> 
> Venkat Duvvuru (1):
>       net/bnxt: fix queues per VNIC
> 
> Viacheslav Ovsiienko (11):
>       net/mlx5: fix external buffer pool registration for Rx queue
>       net/mlx5: fix metadata item validation for ingress flows
>       net/mlx5: fix hashed list size for tunnel flow groups
>       net/mlx5: fix UAR allocation diagnostics messages
>       common/mlx5: add timestamp format support to DevX
>       vdpa/mlx5: support timestamp format
>       net/mlx5: fix Rx metadata leftovers
>       net/mlx5: fix drop action for Direct Rules/Verbs
>       net/mlx4: fix RSS action with null hash key
>       net/mlx5: support timestamp format
>       regex/mlx5: support timestamp format
> 
> Wenjun Wu (2):
>       net/ice: check some functions return
>       net/ice: fix RSS hash update
> 
> Wenwu Ma (1):
>       net/ice: fix illegal access when removing MAC filter
> 
> Wenzhuo Lu (2):
>       net/iavf: fix crash in AVX512
>       net/ice: fix crash in AVX512
> 
> Wisam Jaddo (1):
>       app/flow-perf: fix encap/decap actions
> 
> Xiao Wang (1):
>       vdpa/ifc: check PCI config read
> 
> Xiaoyu Min (4):
>       net/mlx5: support RSS expansion for IPv6 GRE
>       net/mlx5: fix shared inner RSS
>       net/mlx5: fix missing shared RSS hash types
>       net/mlx5: fix redundant flow after RSS expansion
> 
> Xiaoyun Li (2):
>       app/testpmd: remove unnecessary UDP tunnel check
>       net/i40e: fix IPv4 fragment offload
> 
> Youri Querry (1):
>       bus/fslmc: fix random portal hangs with qbman 5.0
> 
> Yunjian Wang (3):
>       vfio: fix API description
>       net/mlx5: fix using flow tunnel before null check
>       vfio: fix duplicated user mem map

^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [RFC PATCH v1 0/3] Add PIE support for HQoS library
    @ 2021-06-09 10:53  3% ` Liguzinski, WojciechX
  2021-06-15  9:01  3%   ` [dpdk-dev] [RFC PATCH v2 " Liguzinski, WojciechX
  1 sibling, 1 reply; 200+ results
From: Liguzinski, WojciechX @ 2021-06-09 10:53 UTC (permalink / raw)
  To: dev, jasvinder.singh, cristian.dumitrescu; +Cc: savinay.dharmappa, megha.ajmera

The DPDK sched library has a mechanism to protect against bufferbloat,
a situation in which excess buffering in the network causes high latency and
latency variation. Currently, it supports RED for active queue management.
RED is designed to control queue length, does not control latency directly,
and is now being obsoleted. More advanced queue management is required to
address this problem and provide the desired quality of service to users.

This RFC proposes using the PIE (Proportional Integral controller Enhanced)
algorithm, which can effectively and directly control queuing latency to
address the bufferbloat problem.
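
As background, the basic PIE control law published in RFC 8033 can be
sketched as below. This is not the code from this series: the `demo_*`
names and the parameter values are illustrative assumptions, and the sketch
leaves out auto-tuning, burst allowance, and the other refinements in the
RFC.

```c
#include <assert.h>

/* Hypothetical per-queue PIE state; names are not taken from the series. */
struct demo_pie {
	double drop_prob;  /* current drop probability, 0.0 .. 1.0 */
	double qdelay_old; /* queuing delay at the previous update, seconds */
};

/* Illustrative parameters (assumptions, not values from the patches). */
#define DEMO_PIE_TARGET 0.015 /* target queuing delay, seconds */
#define DEMO_PIE_ALPHA  0.125 /* weight of the deviation from target */
#define DEMO_PIE_BETA   1.25  /* weight of the delay trend */

/* Periodic update: adjust the drop probability from the measured delay. */
static double
demo_pie_update(struct demo_pie *pie, double qdelay)
{
	double p = pie->drop_prob;

	/* Proportional-integral step on the queuing delay. */
	p += DEMO_PIE_ALPHA * (qdelay - DEMO_PIE_TARGET);
	p += DEMO_PIE_BETA * (qdelay - pie->qdelay_old);

	/* Keep the result a valid probability. */
	if (p < 0.0)
		p = 0.0;
	if (p > 1.0)
		p = 1.0;

	pie->drop_prob = p;
	pie->qdelay_old = qdelay;
	return p;
}

/* Enqueue decision: rnd is a uniform random number in [0, 1). */
static int
demo_pie_should_drop(const struct demo_pie *pie, double rnd)
{
	return rnd < pie->drop_prob;
}
```

The point of the sketch is that the input is the measured queuing delay
rather than the queue length, which is what lets PIE control latency
directly.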

The implementation modifies existing data structures in the library, adds
new ones, and adds PIE-related APIs. This affects structures in the public
API/ABI, so a deprecation notice will be prepared and sent.

Liguzinski, WojciechX (3):
  sched: add PIE based congestion management
  example/qos_sched: add PIE support
  example/ip_pipeline: add PIE support

 config/rte_config.h                      |   1 -
 drivers/net/softnic/rte_eth_softnic_tm.c |   6 +-
 examples/ip_pipeline/tmgr.c              |   6 +-
 examples/qos_sched/app_thread.c          |   1 -
 examples/qos_sched/cfg_file.c            |  82 ++++-
 examples/qos_sched/init.c                |   7 +-
 examples/qos_sched/profile.cfg           | 196 ++++++++----
 lib/sched/meson.build                    |  10 +-
 lib/sched/rte_pie.c                      |  79 +++++
 lib/sched/rte_pie.h                      | 387 +++++++++++++++++++++++
 lib/sched/rte_sched.c                    | 229 ++++++++++----
 lib/sched/rte_sched.h                    |  53 +++-
 12 files changed, 876 insertions(+), 181 deletions(-)
 create mode 100644 lib/sched/rte_pie.c
 create mode 100644 lib/sched/rte_pie.h

-- 
2.17.1


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH v9 07/10] eal: implement functions for mutex management
  @ 2021-06-08 23:04  3%         ` Dmitry Kozlyuk
  2021-06-09 22:37  0%           ` Dmitry Kozlyuk
  0 siblings, 1 reply; 200+ results
From: Dmitry Kozlyuk @ 2021-06-08 23:04 UTC (permalink / raw)
  To: Narcisa Ana Maria Vasile
  Cc: dev, thomas, khot, navasile, dmitrym, roretzla, talshn, ocardona,
	bruce.richardson, david.marchand, pallavi.kadam

2021-06-04 16:44 (UTC-0700), Narcisa Ana Maria Vasile:
[...]
> diff --git a/lib/eal/include/rte_thread_types.h b/lib/eal/include/rte_thread_types.h
> index d67b24a563..7bb0d2948c 100644
> --- a/lib/eal/include/rte_thread_types.h
> +++ b/lib/eal/include/rte_thread_types.h
> @@ -7,4 +7,8 @@
>  
>  #include <pthread.h>
>  
> +#define RTE_THREAD_MUTEX_INITIALIZER     PTHREAD_MUTEX_INITIALIZER
> +
> +typedef pthread_mutex_t                 rte_thread_mutex_t;
> +
>  #endif /* _RTE_THREAD_TYPES_H_ */
> diff --git a/lib/eal/windows/include/rte_windows_thread_types.h b/lib/eal/windows/include/rte_windows_thread_types.h
> index 60e6d94553..c6c8502bfb 100644
> --- a/lib/eal/windows/include/rte_windows_thread_types.h
> +++ b/lib/eal/windows/include/rte_windows_thread_types.h
> @@ -7,4 +7,13 @@
>  
>  #include <rte_windows.h>
>  
> +#define WINDOWS_MUTEX_INITIALIZER               (void*)-1
> +#define RTE_THREAD_MUTEX_INITIALIZER            {WINDOWS_MUTEX_INITIALIZER}
> +
> +struct thread_mutex_t {
> +	void* mutex_id;
> +};
> +
> +typedef struct thread_mutex_t rte_thread_mutex_t;
> +
>  #endif /* _RTE_THREAD_TYPES_H_ */

In previous patches, the rte_thread content was made opaque and of equal size
for the pthread (most implementations) and non-pthread variants.
AFAIU, we agree on the requirement of a compatible ABI between variants,
that is, a compiled app can work with any threading variant of DPDK.
The above definition of `rte_thread_mutex_t` does not satisfy it.
Or do we only promise API compatibility?
This is the most important question now.

Also: DPDK should not export names without the `rte_` prefix,
i.e. `WINDOWS_MUTEX_INITIALIZER` and `thread_mutex_t`.
Besides, why `_t`?
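
To illustrate the ABI point, one way to keep the public type identical
across variants is to expose only a pointer-sized handle and check its
layout at compile time. This is a minimal sketch with hypothetical `demo_*`
names, not a proposal for the actual definition.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/*
 * Hypothetical public handle: the same size and alignment in every
 * threading variant, so a compiled app does not depend on whether the
 * library was built on top of pthread or not.
 */
typedef struct demo_thread_mutex {
	void *impl; /* points to the variant-specific mutex object */
} demo_thread_mutex;

/* Compile-time checks that the public layout stays pointer-sized. */
static_assert(sizeof(demo_thread_mutex) == sizeof(void *),
	"public mutex handle must stay pointer-sized");
static_assert(alignof(demo_thread_mutex) == alignof(void *),
	"public mutex handle must stay pointer-aligned");
```

By contrast, a typedef to `pthread_mutex_t` in one variant and a
single-pointer struct in the other gives the two builds different sizes
for the same public type.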

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-08 11:31  0% ` Kevin Traynor
@ 2021-06-08 13:10  0%   ` Xueming(Steven) Li
  2021-06-14 12:39  0%     ` Xueming(Steven) Li
  0 siblings, 1 reply; 200+ results
From: Xueming(Steven) Li @ 2021-06-08 13:10 UTC (permalink / raw)
  To: Kevin Traynor
  Cc: dev, John McNamara, Luca Boccassi, NBU-Contact-Thomas Monjalon,
	Christian Ehrhardt, Ferruh Yigit, David Marchand



> -----Original Message-----
> From: Kevin Traynor <ktraynor@redhat.com>
> Sent: Tuesday, June 8, 2021 7:31 PM
> To: Xueming(Steven) Li <xuemingl@nvidia.com>
> Cc: dev@dpdk.org; John McNamara <john.mcnamara@intel.com>; Luca Boccassi <bluca@debian.org>; NBU-Contact-Thomas
> Monjalon <thomas@monjalon.net>; Christian Ehrhardt <christian.ehrhardt@canonical.com>; Ferruh Yigit <ferruh.yigit@intel.com>;
> David Marchand <david.marchand@redhat.com>
> Subject: Re: 20.11.2 patches review and test
> 
> (reduced Cc)
> 
> Hi Steven,
> 
> On 01/06/2021 08:54, Xueming(Steven) Li wrote:
> > Hi all,
> >
> > Here is a list of patches targeted for stable release 20.11.2.
> >
> > The planned date for the final release is 15th June.
> >
> > Please help with testing and validation of your use cases and report
> > any issues/results with reply-all to this mail. For the final release
> > the fixes and reported validations will be added to the release notes.
> >
> > A release candidate tarball can be found at:
> >
> >     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc1
> >
> > These patches are located at branch 20.11 of dpdk-stable repo:
> >     https://dpdk.org/browse/dpdk-stable/
> >
> 
> Is the list of patches up to 21.05? Did you drop the fixes for GCC11/clang12? I didn't see them here or in the failed list. I think there are a
> couple that didn't get the right tags, but the ones that did seem to be missing too.

You are correct, some fixes between v21.05-rc1 and v21.05 are missing.
This seems to be an issue with ./devtools/git-log-fixes.sh: if the script is run with another branch checked out, some patches are hidden.
I will make another scan soon.

> 
> It would mean that 20.11.2 would not compile on the latest Fedora (34) with the distro packaged compiler versions.
> 
> Kevin.
> 
> >
> > Thanks.
> >
> > Xueming Li <xuemingl@nvidia.com>
> >
> > ---
> > Ajit Khaparde (3):
> >       net/bnxt: fix RSS context cleanup
> >       net/bnxt: check kvargs parsing
> >       net/bnxt: fix resource cleanup
> >
> > Alvin Zhang (7):
> >       net/ice: fix VLAN filter with PF
> >       net/i40e: fix input set field mask
> >       net/igc: fix Rx RSS hash offload capability
> >       net/igc: fix Rx error counter for bad length
> >       net/e1000: fix Rx error counter for bad length
> >       net/e1000: fix max Rx packet size
> >       net/igc: fix Rx packet size
> >
> > Anatoly Burakov (2):
> >       fbarray: fix log message on truncation error
> >       power: do not skip saving original P-state governor
> >
> > Andrew Boyer (1):
> >       net/ionic: fix completion type in lif init
> >
> > Andrew Rybchenko (3):
> >       net/failsafe: fix RSS hash offload reporting
> >       net/failsafe: report minimum and maximum MTU
> >       common/sfc_efx: remove GENEVE from supported tunnels
> >
> > Ankur Dwivedi (1):
> >       crypto/octeontx: fix session-less mode
> >
> > Apeksha Gupta (1):
> >       examples/l2fwd-crypto: skip masked devices
> >
> > Arek Kusztal (1):
> >       crypto/qat: fix offset for out-of-place scatter-gather
> >
> > Beilei Xing (1):
> >       net/i40evf: fix packet loss for X722
> >
> > Bruce Richardson (1):
> >       build: exclude meson files from examples installation
> >
> > Chenbo Xia (1):
> >       examples/vhost: check memory table query
> >
> > Chengchang Tang (15):
> >       net/hns3: fix HW buffer size on MTU update
> >       net/hns3: fix processing Tx offload flags
> >       net/hns3: fix Tx checksum for UDP packets with special port
> >       net/hns3: fix long task queue pairs reset time
> >       ethdev: validate input in module EEPROM dump
> >       ethdev: validate input in register info
> >       ethdev: validate input in EEPROM info
> >       net/hns3: fix rollback after setting PVID failure
> >       net/hns3: fix timing in resetting queues
> >       net/hns3: fix queue state when concurrent with reset
> >       net/hns3: fix configure FEC when concurrent with reset
> >       net/hns3: fix use of command status enumeration
> >       examples: add eal cleanup to examples
> >       net/bonding: fix adding itself as its slave
> >       net/hns3: fix timing in mailbox
> >
> > Chengwen Feng (15):
> >       net/hns3: fix flow counter value
> >       net/hns3: fix VF mailbox head field
> >       net/hns3: support get device version when dump register
> >       net/hns3: fix some packet types
> >       net/hns3: fix missing outer L4 UDP flag for VXLAN
> >       net/hns3: remove VLAN/QinQ ptypes from support list
> >       test: check thread creation
> >       common/dpaax: fix possible null pointer access
> >       examples/ethtool: remove unused parsing
> >       net/hns3: fix flow director lock
> >       net/e1000/base: fix timeout for shadow RAM write
> >       net/hns3: fix setting default MAC address in bonding of VF
> >       net/hns3: fix possible mismatched response of mailbox
> >       net/hns3: fix VF handling LSC event in secondary process
> >       net/hns3: fix verification of NEON support
> >
> > Ciara Loftus (1):
> >       net/af_xdp: fix error handling during Rx queue setup
> >
> > Conor Walsh (1):
> >       examples/l3fwd: fix LPM IPv6 subnets
> >
> > Cristian Dumitrescu (3):
> >       table: fix actions with different data size
> >       pipeline: fix instruction translation
> >       pipeline: fix endianness conversions
> >
> > Dapeng Yu (3):
> >       net/igc: remove MTU setting limitation
> >       net/e1000: remove MTU setting limitation
> >       examples/packet_ordering: fix port configuration
> >
> > David Harton (1):
> >       net/ena: fix releasing Tx ring mbufs
> >
> > David Marchand (8):
> >       doc: fix sphinx rtd theme import in GHA
> >       service: clean references to removed symbol
> >       eal: fix evaluation of log level option
> >       ci: hook to GitHub Actions
> >       ci: enable v21 ABI checks
> >       ci: fix package installation in GitHub Actions
> >       ci: ignore APT update failure in GitHub Actions
> >       ci: catch coredumps
> >
> > Dekel Peled (1):
> >       common/mlx5: fix DevX read output buffer size
> >
> > Dmitry Kozlyuk (3):
> >       net/pcap: fix format string
> >       eal/windows: add missing SPDX license tag
> >       buildtools: fix all drivers disabled on Windows
> >
> > Ed Czeck (2):
> >       net/ark: update packet director initial state
> >       net/ark: refactor Rx buffer recovery
> >
> > Elad Nachman (2):
> >       kni: support async user request
> >       kni: fix kernel deadlock with bifurcated device
> >
> > Feifei Wang (2):
> >       net/i40e: fix parsing packet type for NEON
> >       test/trace: fix race on collected perf data
> >
> > Ferruh Yigit (3):
> >       power: remove duplicated symbols from map file
> >       log/linux: make default output stderr
> >       license: fix typos
> >
> > Guoyang Zhou (1):
> >       net/hinic: fix crash in secondary process
> >
> > Haiyue Wang (1):
> >       net/ixgbe: fix Rx errors statistics for UDP checksum
> >
> > Harman Kalra (1):
> >       event/octeontx2: fix device reconfigure for single slot
> >
> > Hongbo Zheng (3):
> >       app/testpmd: fix Tx/Rx descriptor query error log
> >       net/hns3: fix FLR miss detection
> >       net/hns3: delete redundant blank line
> >
> > Huisong Li (11):
> >       net/hns3: fix device capabilities for copper media type
> >       net/hns3: remove unused parameter markers
> >       net/hns3: fix reporting undefined speed
> >       net/hns3: fix link update when failed to get link info
> >       net/hns3: fix flow control exception
> >       app/testpmd: fix bitmap of link speeds when force speed
> >       net/hns3: fix flow control mode
> >       net/hns3: remove redundant mailbox response
> >       net/hns3: fix DCB mode check
> >       net/hns3: fix VMDq mode check
> >       net/hns3: fix mbuf leakage
> >
> > Ibtisam Tariq (1):
> >       examples/vhost_crypto: remove unused short option
> >
> > Igor Russkikh (2):
> >       net/qede: reduce log verbosity
> >       net/qede: accept bigger RSS table
> >
> > Ilya Maximets (1):
> >       net/virtio: fix interrupt unregistering for listening socket
> >
> > Ivan Malov (5):
> >       net/sfc: fix buffer size for flow parse
> >       net: fix comment in IPv6 header
> >       net/sfc: fix error path inconsistency
> >       common/sfc_efx/base: fix indication of MAE encap support
> >       net/sfc: fix outer rule rollback on error
> >
> > Jiawei Wang (2):
> >       app/testpmd: fix NVGRE encap configuration
> >       net/mlx5: fix resource release for mirror flow
> >
> > Jiawei Zhu (1):
> >       net/mlx5: fix Rx segmented packets on mbuf starvation
> >
> > Jiawen Wu (3):
> >       net/txgbe: remove unused functions
> >       net/txgbe: fix Rx missed packet counter
> >       net/txgbe: update packet type
> >
> > John Daley (1):
> >       net/enic: fix flow initialization error handling
> >
> > Kalesh AP (18):
> >       net/bnxt: remove unused macro
> >       net/bnxt: fix VNIC configuration
> >       net/bnxt: fix firmware fatal error handling
> >       net/bnxt: fix FW readiness check during recovery
> >       net/bnxt: fix device readiness check
> >       net/bnxt: fix VF info allocation
> >       net/bnxt: fix HWRM and FW incompatibility handling
> >       net/bnxt: mute some failure logs
> >       app/testpmd: check MAC address query
> >       net/bnxt: fix PCI write check
> >       net/bnxt: fix link state operations
> >       net/bnxt: fix timesync when PTP is not supported
> >       net/bnxt: fix memory allocation for command response
> >       net/bnxt: fix double free in port start failure
> >       net/bnxt: fix configuring LRO
> >       net/bnxt: fix health check alarm cancellation
> >       net/bnxt: fix PTP support for Thor
> >       net/bnxt: fix ring count calculation for Thor
> >
> > Kevin Traynor (1):
> >       test/cmdline: fix inputs array
> >
> > Lance Richardson (6):
> >       net/bnxt: fix Rx buffer posting
> >       net/bnxt: fix Tx length hint threshold
> >       net/bnxt: fix handling of null flow mask
> >       test: fix TCP header initialization
> >       net/bnxt: fix Rx descriptor status
> >       net/bnxt: fix Rx queue count
> >
> > Leyi Rong (1):
> >       net/iavf: fix packet length parsing in AVX512
> >
> > Li Zhang (1):
> >       net/mlx5: fix flow actions index in cache
> >
> > Luc Pelletier (2):
> >       eal: fix race in control thread creation
> >       eal: fix hang in control thread creation
> >
> > Marvin Liu (5):
> >       vhost: fix split ring potential buffer overflow
> >       vhost: fix packed ring potential buffer overflow
> >       vhost: fix batch dequeue potential buffer overflow
> >       vhost: fix initialization of temporary header
> >       vhost: fix initialization of async temporary header
> >
> > Matan Azrad (4):
> >       common/mlx5/linux: add glue function to query WQ
> >       common/mlx5: add DevX command to query WQ
> >       common/mlx5: add DevX commands for queue counters
> >       vdpa/mlx5: fix virtq cleaning
> >
> > Min Hu (Connor) (8):
> >       net/hns3: fix MTU config complexity
> >       net/hns3: update HiSilicon copyright syntax
> >       net/hns3: fix copyright date
> >       examples/ptpclient: remove wrong comment
> >       test/bpf: fix error message
> >       doc: fix HiSilicon copyright syntax
> >       net/hns3: remove unused macros
> >       net/hns3: remove unused macro
> >
> > Murphy Yang (3):
> >       net/ixgbe: fix RSS RETA being reset after port start
> >       net/i40e: fix flow director config after flow validate
> >       net/i40e: fix flow director for common pctypes
> >
> > Natanael Copa (5):
> >       common/dpaax/caamflib: fix build with musl
> >       bus/dpaa: fix 64-bit arch detection
> >       bus/dpaa: fix build with musl
> >       net/cxgbe: remove use of uint type
> >       app/testpmd: fix build with musl
> >
> > Nipun Gupta (1):
> >       bus/dpaa: fix statistics reading
> >
> > Nithin Dabilpuram (3):
> >       vfio: do not merge contiguous areas
> >       vfio: fix DMA mapping granularity for IOVA as VA
> >       test/mem: fix page size for external memory
> >
> > Pallavi Kadam (1):
> >       bus/pci: skip probing some Windows NDIS devices
> >
> > Pavan Nikhilesh (2):
> >       test/event: fix timeout accuracy
> >       app/eventdev: fix timeout accuracy
> >
> > Pu Xu (1):
> >       ip_frag: fix fragmenting IPv4 packet with header option
> >
> > Qi Zhang (7):
> >       net/ice/base: fix payload indicator on ptype
> >       net/ice/base: fix uninitialized struct
> >       net/ice/base: cleanup filter list on error
> >       net/ice/base: fix memory allocation for MAC addresses
> >       net/iavf: fix TSO max segment size
> >       doc: fix matching versions in ice guide
> >       net/iavf: fix wrong Tx context descriptor
> >
> > Radha Mohan Chintakuntla (1):
> >       raw/octeontx2_dma: assign PCI device in DPI VF
> >
> > Raslan Darawsheh (1):
> >       ethdev: update flow item GTP QFI definition
> >
> > Richael Zhuang (2):
> >       test/power: add delay before checking CPU frequency
> >       test/power: round CPU frequency to check
> >
> > Robin Zhang (4):
> >       net/i40e: announce request queue capability in PF
> >       doc: update recommended versions for i40e
> >       net/i40e: fix lack of MAC type when set MAC address
> >       net/iavf: fix lack of MAC type when set MAC address
> >
> > Rohit Raj (3):
> >       net/dpaa2: fix getting link status
> >       net/dpaa: fix getting link status
> >       examples/l2fwd-crypto: fix packet length while decryption
> >
> > Roy Shterman (1):
> >       mem: fix freeing segments in --huge-unlink mode
> >
> > Satheesh Paul (1):
> >       net/octeontx2: fix VLAN filter
> >
> > Savinay Dharmappa (1):
> >       sched: fix traffic class oversubscription parameter
> >
> > Shijith Thotton (1):
> >       eventdev: fix case to initiate crypto adapter service
> >
> > Siwar Zitouni (1):
> >       net/ice: fix disabling promiscuous mode
> >
> > Somnath Kotur (3):
> >       net/bnxt: fix xstats get
> >       net/bnxt: fix Rx and Tx timestamps
> >       net/bnxt: fix Tx timestamp init
> >
> > Stanislaw Kardach (1):
> >       test: proceed if timer subsystem already initialized
> >
> > Stephen Hemminger (1):
> >       kni: refactor user request processing
> >
> > Tal Shnaiderman (2):
> >       eal/windows: fix default thread priority
> >       eal/windows: fix return codes of pthread shim layer
> >
> > Tengfei Zhang (1):
> >       net/pcap: fix file descriptor leak on close
> >
> > Thinh Tran (1):
> >       test: fix autotest handling of skipped tests
> >
> > Thomas Monjalon (16):
> >       bus/pci: fix Windows kernel driver categories
> >       eal: fix comment of OS-specific header files
> >       buildtools: fix build with busybox
> >       build: detect execinfo library on Linux
> >       build: remove redundant _GNU_SOURCE definitions
> >       eal: fix build with musl
> >       net/igc: remove use of uint type
> >       event/dlb: fix header includes for musl
> >       examples/bbdev: fix header include for musl
> >       drivers: fix log level after loading
> >       app/regex: fix usage text
> >       app/testpmd: fix usage text
> >       doc: fix names of UIO drivers
> >       doc: fix build with Sphinx 4
> >       bus/pci: support I/O port operations with musl
> >       app: fix exit messages
> >
> > Tyler Retzlaff (1):
> >       eal: add C++ include guard for reciprocal header
> >
> > Vadim Podovinnikov (1):
> >       net/bonding: fix LACP system address check
> >
> > Venkat Duvvuru (1):
> >       net/bnxt: fix queues per VNIC
> >
> > Viacheslav Ovsiienko (11):
> >       net/mlx5: fix external buffer pool registration for Rx queue
> >       net/mlx5: fix metadata item validation for ingress flows
> >       net/mlx5: fix hashed list size for tunnel flow groups
> >       net/mlx5: fix UAR allocation diagnostics messages
> >       common/mlx5: add timestamp format support to DevX
> >       vdpa/mlx5: support timestamp format
> >       net/mlx5: fix Rx metadata leftovers
> >       net/mlx5: fix drop action for Direct Rules/Verbs
> >       net/mlx4: fix RSS action with null hash key
> >       net/mlx5: support timestamp format
> >       regex/mlx5: support timestamp format
> >
> > Wenjun Wu (2):
> >       net/ice: check some functions return
> >       net/ice: fix RSS hash update
> >
> > Wenwu Ma (1):
> >       net/ice: fix illegal access when removing MAC filter
> >
> > Wenzhuo Lu (2):
> >       net/iavf: fix crash in AVX512
> >       net/ice: fix crash in AVX512
> >
> > Wisam Jaddo (1):
> >       app/flow-perf: fix encap/decap actions
> >
> > Xiao Wang (1):
> >       vdpa/ifc: check PCI config read
> >
> > Xiaoyu Min (4):
> >       net/mlx5: support RSS expansion for IPv6 GRE
> >       net/mlx5: fix shared inner RSS
> >       net/mlx5: fix missing shared RSS hash types
> >       net/mlx5: fix redundant flow after RSS expansion
> >
> > Xiaoyun Li (2):
> >       app/testpmd: remove unnecessary UDP tunnel check
> >       net/i40e: fix IPv4 fragment offload
> >
> > Youri Querry (1):
> >       bus/fslmc: fix random portal hangs with qbman 5.0
> >
> > Yunjian Wang (3):
> >       vfio: fix API description
> >       net/mlx5: fix using flow tunnel before null check
> >       vfio: fix duplicated user mem map
> >



* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-01  7:54  1% [dpdk-dev] 20.11.2 patches review and test Xueming(Steven) Li
  2021-06-08  8:52  0% ` Jiang, YuX
  2021-06-08 10:28  0% ` Pei Zhang
@ 2021-06-08 11:31  0% ` Kevin Traynor
  2021-06-08 13:10  0%   ` Xueming(Steven) Li
  2021-06-09 11:56  0% ` Xueming(Steven) Li
  3 siblings, 1 reply; 200+ results
From: Kevin Traynor @ 2021-06-08 11:31 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, John McNamara, Luca Boccassi, NBU-Contact-Thomas Monjalon,
	Christian Ehrhardt, Ferruh Yigit, David Marchand

(reduced Cc)

Hi Steven,

On 01/06/2021 08:54, Xueming(Steven) Li wrote:
> Hi all,
> 
> Here is a list of patches targeted for stable release 20.11.2.
> 
> The planned date for the final release is 15th June.
> 
> Please help with testing and validation of your use cases and report
> any issues/results with reply-all to this mail. For the final release
> the fixes and reported validations will be added to the release notes.
> 
> A release candidate tarball can be found at:
> 
>     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc1
> 
> These patches are located at branch 20.11 of dpdk-stable repo:
>     https://dpdk.org/browse/dpdk-stable/
> 

Is the list of patches up to 21.05? Did you drop the fixes for
GCC11/clang12? I didn't see them here or in the failed list. I think
there are a couple that didn't get the right tags, but the ones that did
seem to be missing too.

It would mean that 20.11.2 would not compile on the latest Fedora (34)
with the distro packaged compiler versions.

Kevin.

> 
> Thanks.
> 
> Xueming Li <xuemingl@nvidia.com>
> 
> ---
> Ajit Khaparde (3):
>       net/bnxt: fix RSS context cleanup
>       net/bnxt: check kvargs parsing
>       net/bnxt: fix resource cleanup
> 
> Alvin Zhang (7):
>       net/ice: fix VLAN filter with PF
>       net/i40e: fix input set field mask
>       net/igc: fix Rx RSS hash offload capability
>       net/igc: fix Rx error counter for bad length
>       net/e1000: fix Rx error counter for bad length
>       net/e1000: fix max Rx packet size
>       net/igc: fix Rx packet size
> 
> Anatoly Burakov (2):
>       fbarray: fix log message on truncation error
>       power: do not skip saving original P-state governor
> 
> Andrew Boyer (1):
>       net/ionic: fix completion type in lif init
> 
> Andrew Rybchenko (3):
>       net/failsafe: fix RSS hash offload reporting
>       net/failsafe: report minimum and maximum MTU
>       common/sfc_efx: remove GENEVE from supported tunnels
> 
> Ankur Dwivedi (1):
>       crypto/octeontx: fix session-less mode
> 
> Apeksha Gupta (1):
>       examples/l2fwd-crypto: skip masked devices
> 
> Arek Kusztal (1):
>       crypto/qat: fix offset for out-of-place scatter-gather
> 
> Beilei Xing (1):
>       net/i40evf: fix packet loss for X722
> 
> Bruce Richardson (1):
>       build: exclude meson files from examples installation
> 
> Chenbo Xia (1):
>       examples/vhost: check memory table query
> 
> Chengchang Tang (15):
>       net/hns3: fix HW buffer size on MTU update
>       net/hns3: fix processing Tx offload flags
>       net/hns3: fix Tx checksum for UDP packets with special port
>       net/hns3: fix long task queue pairs reset time
>       ethdev: validate input in module EEPROM dump
>       ethdev: validate input in register info
>       ethdev: validate input in EEPROM info
>       net/hns3: fix rollback after setting PVID failure
>       net/hns3: fix timing in resetting queues
>       net/hns3: fix queue state when concurrent with reset
>       net/hns3: fix configure FEC when concurrent with reset
>       net/hns3: fix use of command status enumeration
>       examples: add eal cleanup to examples
>       net/bonding: fix adding itself as its slave
>       net/hns3: fix timing in mailbox
> 
> Chengwen Feng (15):
>       net/hns3: fix flow counter value
>       net/hns3: fix VF mailbox head field
>       net/hns3: support get device version when dump register
>       net/hns3: fix some packet types
>       net/hns3: fix missing outer L4 UDP flag for VXLAN
>       net/hns3: remove VLAN/QinQ ptypes from support list
>       test: check thread creation
>       common/dpaax: fix possible null pointer access
>       examples/ethtool: remove unused parsing
>       net/hns3: fix flow director lock
>       net/e1000/base: fix timeout for shadow RAM write
>       net/hns3: fix setting default MAC address in bonding of VF
>       net/hns3: fix possible mismatched response of mailbox
>       net/hns3: fix VF handling LSC event in secondary process
>       net/hns3: fix verification of NEON support
> 
> Ciara Loftus (1):
>       net/af_xdp: fix error handling during Rx queue setup
> 
> Conor Walsh (1):
>       examples/l3fwd: fix LPM IPv6 subnets
> 
> Cristian Dumitrescu (3):
>       table: fix actions with different data size
>       pipeline: fix instruction translation
>       pipeline: fix endianness conversions
> 
> Dapeng Yu (3):
>       net/igc: remove MTU setting limitation
>       net/e1000: remove MTU setting limitation
>       examples/packet_ordering: fix port configuration
> 
> David Harton (1):
>       net/ena: fix releasing Tx ring mbufs
> 
> David Marchand (8):
>       doc: fix sphinx rtd theme import in GHA
>       service: clean references to removed symbol
>       eal: fix evaluation of log level option
>       ci: hook to GitHub Actions
>       ci: enable v21 ABI checks
>       ci: fix package installation in GitHub Actions
>       ci: ignore APT update failure in GitHub Actions
>       ci: catch coredumps
> 
> Dekel Peled (1):
>       common/mlx5: fix DevX read output buffer size
> 
> Dmitry Kozlyuk (3):
>       net/pcap: fix format string
>       eal/windows: add missing SPDX license tag
>       buildtools: fix all drivers disabled on Windows
> 
> Ed Czeck (2):
>       net/ark: update packet director initial state
>       net/ark: refactor Rx buffer recovery
> 
> Elad Nachman (2):
>       kni: support async user request
>       kni: fix kernel deadlock with bifurcated device
> 
> Feifei Wang (2):
>       net/i40e: fix parsing packet type for NEON
>       test/trace: fix race on collected perf data
> 
> Ferruh Yigit (3):
>       power: remove duplicated symbols from map file
>       log/linux: make default output stderr
>       license: fix typos
> 
> Guoyang Zhou (1):
>       net/hinic: fix crash in secondary process
> 
> Haiyue Wang (1):
>       net/ixgbe: fix Rx errors statistics for UDP checksum
> 
> Harman Kalra (1):
>       event/octeontx2: fix device reconfigure for single slot
> 
> Hongbo Zheng (3):
>       app/testpmd: fix Tx/Rx descriptor query error log
>       net/hns3: fix FLR miss detection
>       net/hns3: delete redundant blank line
> 
> Huisong Li (11):
>       net/hns3: fix device capabilities for copper media type
>       net/hns3: remove unused parameter markers
>       net/hns3: fix reporting undefined speed
>       net/hns3: fix link update when failed to get link info
>       net/hns3: fix flow control exception
>       app/testpmd: fix bitmap of link speeds when force speed
>       net/hns3: fix flow control mode
>       net/hns3: remove redundant mailbox response
>       net/hns3: fix DCB mode check
>       net/hns3: fix VMDq mode check
>       net/hns3: fix mbuf leakage
> 
> Ibtisam Tariq (1):
>       examples/vhost_crypto: remove unused short option
> 
> Igor Russkikh (2):
>       net/qede: reduce log verbosity
>       net/qede: accept bigger RSS table
> 
> Ilya Maximets (1):
>       net/virtio: fix interrupt unregistering for listening socket
> 
> Ivan Malov (5):
>       net/sfc: fix buffer size for flow parse
>       net: fix comment in IPv6 header
>       net/sfc: fix error path inconsistency
>       common/sfc_efx/base: fix indication of MAE encap support
>       net/sfc: fix outer rule rollback on error
> 
> Jiawei Wang (2):
>       app/testpmd: fix NVGRE encap configuration
>       net/mlx5: fix resource release for mirror flow
> 
> Jiawei Zhu (1):
>       net/mlx5: fix Rx segmented packets on mbuf starvation
> 
> Jiawen Wu (3):
>       net/txgbe: remove unused functions
>       net/txgbe: fix Rx missed packet counter
>       net/txgbe: update packet type
> 
> John Daley (1):
>       net/enic: fix flow initialization error handling
> 
> Kalesh AP (18):
>       net/bnxt: remove unused macro
>       net/bnxt: fix VNIC configuration
>       net/bnxt: fix firmware fatal error handling
>       net/bnxt: fix FW readiness check during recovery
>       net/bnxt: fix device readiness check
>       net/bnxt: fix VF info allocation
>       net/bnxt: fix HWRM and FW incompatibility handling
>       net/bnxt: mute some failure logs
>       app/testpmd: check MAC address query
>       net/bnxt: fix PCI write check
>       net/bnxt: fix link state operations
>       net/bnxt: fix timesync when PTP is not supported
>       net/bnxt: fix memory allocation for command response
>       net/bnxt: fix double free in port start failure
>       net/bnxt: fix configuring LRO
>       net/bnxt: fix health check alarm cancellation
>       net/bnxt: fix PTP support for Thor
>       net/bnxt: fix ring count calculation for Thor
> 
> Kevin Traynor (1):
>       test/cmdline: fix inputs array
> 
> Lance Richardson (6):
>       net/bnxt: fix Rx buffer posting
>       net/bnxt: fix Tx length hint threshold
>       net/bnxt: fix handling of null flow mask
>       test: fix TCP header initialization
>       net/bnxt: fix Rx descriptor status
>       net/bnxt: fix Rx queue count
> 
> Leyi Rong (1):
>       net/iavf: fix packet length parsing in AVX512
> 
> Li Zhang (1):
>       net/mlx5: fix flow actions index in cache
> 
> Luc Pelletier (2):
>       eal: fix race in control thread creation
>       eal: fix hang in control thread creation
> 
> Marvin Liu (5):
>       vhost: fix split ring potential buffer overflow
>       vhost: fix packed ring potential buffer overflow
>       vhost: fix batch dequeue potential buffer overflow
>       vhost: fix initialization of temporary header
>       vhost: fix initialization of async temporary header
> 
> Matan Azrad (4):
>       common/mlx5/linux: add glue function to query WQ
>       common/mlx5: add DevX command to query WQ
>       common/mlx5: add DevX commands for queue counters
>       vdpa/mlx5: fix virtq cleaning
> 
> Min Hu (Connor) (8):
>       net/hns3: fix MTU config complexity
>       net/hns3: update HiSilicon copyright syntax
>       net/hns3: fix copyright date
>       examples/ptpclient: remove wrong comment
>       test/bpf: fix error message
>       doc: fix HiSilicon copyright syntax
>       net/hns3: remove unused macros
>       net/hns3: remove unused macro
> 
> Murphy Yang (3):
>       net/ixgbe: fix RSS RETA being reset after port start
>       net/i40e: fix flow director config after flow validate
>       net/i40e: fix flow director for common pctypes
> 
> Natanael Copa (5):
>       common/dpaax/caamflib: fix build with musl
>       bus/dpaa: fix 64-bit arch detection
>       bus/dpaa: fix build with musl
>       net/cxgbe: remove use of uint type
>       app/testpmd: fix build with musl
> 
> Nipun Gupta (1):
>       bus/dpaa: fix statistics reading
> 
> Nithin Dabilpuram (3):
>       vfio: do not merge contiguous areas
>       vfio: fix DMA mapping granularity for IOVA as VA
>       test/mem: fix page size for external memory
> 
> Pallavi Kadam (1):
>       bus/pci: skip probing some Windows NDIS devices
> 
> Pavan Nikhilesh (2):
>       test/event: fix timeout accuracy
>       app/eventdev: fix timeout accuracy
> 
> Pu Xu (1):
>       ip_frag: fix fragmenting IPv4 packet with header option
> 
> Qi Zhang (7):
>       net/ice/base: fix payload indicator on ptype
>       net/ice/base: fix uninitialized struct
>       net/ice/base: cleanup filter list on error
>       net/ice/base: fix memory allocation for MAC addresses
>       net/iavf: fix TSO max segment size
>       doc: fix matching versions in ice guide
>       net/iavf: fix wrong Tx context descriptor
> 
> Radha Mohan Chintakuntla (1):
>       raw/octeontx2_dma: assign PCI device in DPI VF
> 
> Raslan Darawsheh (1):
>       ethdev: update flow item GTP QFI definition
> 
> Richael Zhuang (2):
>       test/power: add delay before checking CPU frequency
>       test/power: round CPU frequency to check
> 
> Robin Zhang (4):
>       net/i40e: announce request queue capability in PF
>       doc: update recommended versions for i40e
>       net/i40e: fix lack of MAC type when set MAC address
>       net/iavf: fix lack of MAC type when set MAC address
> 
> Rohit Raj (3):
>       net/dpaa2: fix getting link status
>       net/dpaa: fix getting link status
>       examples/l2fwd-crypto: fix packet length while decryption
> 
> Roy Shterman (1):
>       mem: fix freeing segments in --huge-unlink mode
> 
> Satheesh Paul (1):
>       net/octeontx2: fix VLAN filter
> 
> Savinay Dharmappa (1):
>       sched: fix traffic class oversubscription parameter
> 
> Shijith Thotton (1):
>       eventdev: fix case to initiate crypto adapter service
> 
> Siwar Zitouni (1):
>       net/ice: fix disabling promiscuous mode
> 
> Somnath Kotur (3):
>       net/bnxt: fix xstats get
>       net/bnxt: fix Rx and Tx timestamps
>       net/bnxt: fix Tx timestamp init
> 
> Stanislaw Kardach (1):
>       test: proceed if timer subsystem already initialized
> 
> Stephen Hemminger (1):
>       kni: refactor user request processing
> 
> Tal Shnaiderman (2):
>       eal/windows: fix default thread priority
>       eal/windows: fix return codes of pthread shim layer
> 
> Tengfei Zhang (1):
>       net/pcap: fix file descriptor leak on close
> 
> Thinh Tran (1):
>       test: fix autotest handling of skipped tests
> 
> Thomas Monjalon (16):
>       bus/pci: fix Windows kernel driver categories
>       eal: fix comment of OS-specific header files
>       buildtools: fix build with busybox
>       build: detect execinfo library on Linux
>       build: remove redundant _GNU_SOURCE definitions
>       eal: fix build with musl
>       net/igc: remove use of uint type
>       event/dlb: fix header includes for musl
>       examples/bbdev: fix header include for musl
>       drivers: fix log level after loading
>       app/regex: fix usage text
>       app/testpmd: fix usage text
>       doc: fix names of UIO drivers
>       doc: fix build with Sphinx 4
>       bus/pci: support I/O port operations with musl
>       app: fix exit messages
> 
> Tyler Retzlaff (1):
>       eal: add C++ include guard for reciprocal header
> 
> Vadim Podovinnikov (1):
>       net/bonding: fix LACP system address check
> 
> Venkat Duvvuru (1):
>       net/bnxt: fix queues per VNIC
> 
> Viacheslav Ovsiienko (11):
>       net/mlx5: fix external buffer pool registration for Rx queue
>       net/mlx5: fix metadata item validation for ingress flows
>       net/mlx5: fix hashed list size for tunnel flow groups
>       net/mlx5: fix UAR allocation diagnostics messages
>       common/mlx5: add timestamp format support to DevX
>       vdpa/mlx5: support timestamp format
>       net/mlx5: fix Rx metadata leftovers
>       net/mlx5: fix drop action for Direct Rules/Verbs
>       net/mlx4: fix RSS action with null hash key
>       net/mlx5: support timestamp format
>       regex/mlx5: support timestamp format
> 
> Wenjun Wu (2):
>       net/ice: check some functions return
>       net/ice: fix RSS hash update
> 
> Wenwu Ma (1):
>       net/ice: fix illegal access when removing MAC filter
> 
> Wenzhuo Lu (2):
>       net/iavf: fix crash in AVX512
>       net/ice: fix crash in AVX512
> 
> Wisam Jaddo (1):
>       app/flow-perf: fix encap/decap actions
> 
> Xiao Wang (1):
>       vdpa/ifc: check PCI config read
> 
> Xiaoyu Min (4):
>       net/mlx5: support RSS expansion for IPv6 GRE
>       net/mlx5: fix shared inner RSS
>       net/mlx5: fix missing shared RSS hash types
>       net/mlx5: fix redundant flow after RSS expansion
> 
> Xiaoyun Li (2):
>       app/testpmd: remove unnecessary UDP tunnel check
>       net/i40e: fix IPv4 fragment offload
> 
> Youri Querry (1):
>       bus/fslmc: fix random portal hangs with qbman 5.0
> 
> Yunjian Wang (3):
>       vfio: fix API description
>       net/mlx5: fix using flow tunnel before null check
>       vfio: fix duplicated user mem map
> 



* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-01  7:54  1% [dpdk-dev] 20.11.2 patches review and test Xueming(Steven) Li
  2021-06-08  8:52  0% ` Jiang, YuX
@ 2021-06-08 10:28  0% ` Pei Zhang
  2021-06-08 11:31  0% ` Kevin Traynor
  2021-06-09 11:56  0% ` Xueming(Steven) Li
  3 siblings, 0 replies; 200+ results
From: Pei Zhang @ 2021-06-08 10:28 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani,
	benjamin.walker, David Christensen, hariprasad.govindharajan,
	Hemant Agrawal, Ian Stokes, Jerin Jacob, John McNamara,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, pingx.yu, qian.q.xu,
	Raslan Darawsheh, NBU-Contact-Thomas Monjalon, yuan.peng,
	zhaoyan.chen

Hello Xueming,

The testing with dpdk 20.11.2-rc1 from Red Hat looks good. We tested the
16 scenarios below and all passed on RHEL8:

(1)Guest with device assignment(PF) throughput testing(1G hugepage size): PASS
(2)Guest with device assignment(PF) throughput testing(2M hugepage size): PASS
(3)Guest with device assignment(VF) throughput testing: PASS
(4)PVP (host dpdk testpmd as vswitch) 1Q: throughput testing: PASS
(5)PVP vhost-user 2Q throughput testing: PASS
(6)PVP vhost-user 1Q - cross numa node throughput testing: PASS
(7)Guest with vhost-user 2 queues throughput testing: PASS
(8)vhost-user reconnect with dpdk-client, qemu-server: qemu reconnect: PASS
(9)vhost-user reconnect with dpdk-client, qemu-server: ovs reconnect: PASS
(10)PVP 1Q live migration testing: PASS
(11)PVP 1Q cross numa node live migration testing: PASS
(12)Guest with ovs+dpdk+vhost-user 1Q live migration testing: PASS
(13)Guest with ovs+dpdk+vhost-user 1Q live migration testing (2M): PASS
(14)Guest with ovs+dpdk+vhost-user 2Q live migration testing: PASS
(15)Host PF + DPDK testing: PASS
(16)Host VF + DPDK testing: PASS

Versions:

kernel 4.18
qemu 6.0

dpdk: git://dpdk.org/dpdk-stable
# git log -1
commit ad11991368c46d818f5bdfe014106173d88179be (HEAD, tag: v20.11.2-rc1, origin/20.11)
Author: Xueming Li <xuemingl@nvidia.com>
Date:   Tue Jun 1 14:12:07 2021 +0800

    version: 20.11.2-rc1

    Signed-off-by: Xueming Li <xuemingl@nvidia.com>

NICs: X540-AT2 NIC(ixgbe, 10G)

Best regards,

Pei

On Tue, Jun 1, 2021 at 3:55 PM Xueming(Steven) Li <xuemingl@nvidia.com>
wrote:

> Hi all,
>
> Here is a list of patches targeted for stable release 20.11.2.
>
> The planned date for the final release is 15th June.
>
> Please help with testing and validation of your use cases and report
> any issues/results with reply-all to this mail. For the final release
> the fixes and reported validations will be added to the release notes.
>
> A release candidate tarball can be found at:
>
>     https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc1
>
> These patches are located at branch 20.11 of dpdk-stable repo:
>     https://dpdk.org/browse/dpdk-stable/
>
>
> Thanks.
>
> Xueming Li <xuemingl@nvidia.com>
>
> ---
> Ajit Khaparde (3):
>       net/bnxt: fix RSS context cleanup
>       net/bnxt: check kvargs parsing
>       net/bnxt: fix resource cleanup
>
> Alvin Zhang (7):
>       net/ice: fix VLAN filter with PF
>       net/i40e: fix input set field mask
>       net/igc: fix Rx RSS hash offload capability
>       net/igc: fix Rx error counter for bad length
>       net/e1000: fix Rx error counter for bad length
>       net/e1000: fix max Rx packet size
>       net/igc: fix Rx packet size
>
> Anatoly Burakov (2):
>       fbarray: fix log message on truncation error
>       power: do not skip saving original P-state governor
>
> Andrew Boyer (1):
>       net/ionic: fix completion type in lif init
>
> Andrew Rybchenko (3):
>       net/failsafe: fix RSS hash offload reporting
>       net/failsafe: report minimum and maximum MTU
>       common/sfc_efx: remove GENEVE from supported tunnels
>
> Ankur Dwivedi (1):
>       crypto/octeontx: fix session-less mode
>
> Apeksha Gupta (1):
>       examples/l2fwd-crypto: skip masked devices
>
> Arek Kusztal (1):
>       crypto/qat: fix offset for out-of-place scatter-gather
>
> Beilei Xing (1):
>       net/i40evf: fix packet loss for X722
>
> Bruce Richardson (1):
>       build: exclude meson files from examples installation
>
> Chenbo Xia (1):
>       examples/vhost: check memory table query
>
> Chengchang Tang (15):
>       net/hns3: fix HW buffer size on MTU update
>       net/hns3: fix processing Tx offload flags
>       net/hns3: fix Tx checksum for UDP packets with special port
>       net/hns3: fix long task queue pairs reset time
>       ethdev: validate input in module EEPROM dump
>       ethdev: validate input in register info
>       ethdev: validate input in EEPROM info
>       net/hns3: fix rollback after setting PVID failure
>       net/hns3: fix timing in resetting queues
>       net/hns3: fix queue state when concurrent with reset
>       net/hns3: fix configure FEC when concurrent with reset
>       net/hns3: fix use of command status enumeration
>       examples: add eal cleanup to examples
>       net/bonding: fix adding itself as its slave
>       net/hns3: fix timing in mailbox
>
> Chengwen Feng (15):
>       net/hns3: fix flow counter value
>       net/hns3: fix VF mailbox head field
>       net/hns3: support get device version when dump register
>       net/hns3: fix some packet types
>       net/hns3: fix missing outer L4 UDP flag for VXLAN
>       net/hns3: remove VLAN/QinQ ptypes from support list
>       test: check thread creation
>       common/dpaax: fix possible null pointer access
>       examples/ethtool: remove unused parsing
>       net/hns3: fix flow director lock
>       net/e1000/base: fix timeout for shadow RAM write
>       net/hns3: fix setting default MAC address in bonding of VF
>       net/hns3: fix possible mismatched response of mailbox
>       net/hns3: fix VF handling LSC event in secondary process
>       net/hns3: fix verification of NEON support
>
> Ciara Loftus (1):
>       net/af_xdp: fix error handling during Rx queue setup
>
> Conor Walsh (1):
>       examples/l3fwd: fix LPM IPv6 subnets
>
> Cristian Dumitrescu (3):
>       table: fix actions with different data size
>       pipeline: fix instruction translation
>       pipeline: fix endianness conversions
>
> Dapeng Yu (3):
>       net/igc: remove MTU setting limitation
>       net/e1000: remove MTU setting limitation
>       examples/packet_ordering: fix port configuration
>
> David Harton (1):
>       net/ena: fix releasing Tx ring mbufs
>
> David Marchand (8):
>       doc: fix sphinx rtd theme import in GHA
>       service: clean references to removed symbol
>       eal: fix evaluation of log level option
>       ci: hook to GitHub Actions
>       ci: enable v21 ABI checks
>       ci: fix package installation in GitHub Actions
>       ci: ignore APT update failure in GitHub Actions
>       ci: catch coredumps
>
> Dekel Peled (1):
>       common/mlx5: fix DevX read output buffer size
>
> Dmitry Kozlyuk (3):
>       net/pcap: fix format string
>       eal/windows: add missing SPDX license tag
>       buildtools: fix all drivers disabled on Windows
>
> Ed Czeck (2):
>       net/ark: update packet director initial state
>       net/ark: refactor Rx buffer recovery
>
> Elad Nachman (2):
>       kni: support async user request
>       kni: fix kernel deadlock with bifurcated device
>
> Feifei Wang (2):
>       net/i40e: fix parsing packet type for NEON
>       test/trace: fix race on collected perf data
>
> Ferruh Yigit (3):
>       power: remove duplicated symbols from map file
>       log/linux: make default output stderr
>       license: fix typos
>
> Guoyang Zhou (1):
>       net/hinic: fix crash in secondary process
>
> Haiyue Wang (1):
>       net/ixgbe: fix Rx errors statistics for UDP checksum
>
> Harman Kalra (1):
>       event/octeontx2: fix device reconfigure for single slot
>
> Hongbo Zheng (3):
>       app/testpmd: fix Tx/Rx descriptor query error log
>       net/hns3: fix FLR miss detection
>       net/hns3: delete redundant blank line
>
> Huisong Li (11):
>       net/hns3: fix device capabilities for copper media type
>       net/hns3: remove unused parameter markers
>       net/hns3: fix reporting undefined speed
>       net/hns3: fix link update when failed to get link info
>       net/hns3: fix flow control exception
>       app/testpmd: fix bitmap of link speeds when force speed
>       net/hns3: fix flow control mode
>       net/hns3: remove redundant mailbox response
>       net/hns3: fix DCB mode check
>       net/hns3: fix VMDq mode check
>       net/hns3: fix mbuf leakage
>
> Ibtisam Tariq (1):
>       examples/vhost_crypto: remove unused short option
>
> Igor Russkikh (2):
>       net/qede: reduce log verbosity
>       net/qede: accept bigger RSS table
>
> Ilya Maximets (1):
>       net/virtio: fix interrupt unregistering for listening socket
>
> Ivan Malov (5):
>       net/sfc: fix buffer size for flow parse
>       net: fix comment in IPv6 header
>       net/sfc: fix error path inconsistency
>       common/sfc_efx/base: fix indication of MAE encap support
>       net/sfc: fix outer rule rollback on error
>
> Jiawei Wang (2):
>       app/testpmd: fix NVGRE encap configuration
>       net/mlx5: fix resource release for mirror flow
>
> Jiawei Zhu (1):
>       net/mlx5: fix Rx segmented packets on mbuf starvation
>
> Jiawen Wu (3):
>       net/txgbe: remove unused functions
>       net/txgbe: fix Rx missed packet counter
>       net/txgbe: update packet type
>
> John Daley (1):
>       net/enic: fix flow initialization error handling
>
> Kalesh AP (18):
>       net/bnxt: remove unused macro
>       net/bnxt: fix VNIC configuration
>       net/bnxt: fix firmware fatal error handling
>       net/bnxt: fix FW readiness check during recovery
>       net/bnxt: fix device readiness check
>       net/bnxt: fix VF info allocation
>       net/bnxt: fix HWRM and FW incompatibility handling
>       net/bnxt: mute some failure logs
>       app/testpmd: check MAC address query
>       net/bnxt: fix PCI write check
>       net/bnxt: fix link state operations
>       net/bnxt: fix timesync when PTP is not supported
>       net/bnxt: fix memory allocation for command response
>       net/bnxt: fix double free in port start failure
>       net/bnxt: fix configuring LRO
>       net/bnxt: fix health check alarm cancellation
>       net/bnxt: fix PTP support for Thor
>       net/bnxt: fix ring count calculation for Thor
>
> Kevin Traynor (1):
>       test/cmdline: fix inputs array
>
> Lance Richardson (6):
>       net/bnxt: fix Rx buffer posting
>       net/bnxt: fix Tx length hint threshold
>       net/bnxt: fix handling of null flow mask
>       test: fix TCP header initialization
>       net/bnxt: fix Rx descriptor status
>       net/bnxt: fix Rx queue count
>
> Leyi Rong (1):
>       net/iavf: fix packet length parsing in AVX512
>
> Li Zhang (1):
>       net/mlx5: fix flow actions index in cache
>
> Luc Pelletier (2):
>       eal: fix race in control thread creation
>       eal: fix hang in control thread creation
>
> Marvin Liu (5):
>       vhost: fix split ring potential buffer overflow
>       vhost: fix packed ring potential buffer overflow
>       vhost: fix batch dequeue potential buffer overflow
>       vhost: fix initialization of temporary header
>       vhost: fix initialization of async temporary header
>
> Matan Azrad (4):
>       common/mlx5/linux: add glue function to query WQ
>       common/mlx5: add DevX command to query WQ
>       common/mlx5: add DevX commands for queue counters
>       vdpa/mlx5: fix virtq cleaning
>
> Min Hu (Connor) (8):
>       net/hns3: fix MTU config complexity
>       net/hns3: update HiSilicon copyright syntax
>       net/hns3: fix copyright date
>       examples/ptpclient: remove wrong comment
>       test/bpf: fix error message
>       doc: fix HiSilicon copyright syntax
>       net/hns3: remove unused macros
>       net/hns3: remove unused macro
>
> Murphy Yang (3):
>       net/ixgbe: fix RSS RETA being reset after port start
>       net/i40e: fix flow director config after flow validate
>       net/i40e: fix flow director for common pctypes
>
> Natanael Copa (5):
>       common/dpaax/caamflib: fix build with musl
>       bus/dpaa: fix 64-bit arch detection
>       bus/dpaa: fix build with musl
>       net/cxgbe: remove use of uint type
>       app/testpmd: fix build with musl
>
> Nipun Gupta (1):
>       bus/dpaa: fix statistics reading
>
> Nithin Dabilpuram (3):
>       vfio: do not merge contiguous areas
>       vfio: fix DMA mapping granularity for IOVA as VA
>       test/mem: fix page size for external memory
>
> Pallavi Kadam (1):
>       bus/pci: skip probing some Windows NDIS devices
>
> Pavan Nikhilesh (2):
>       test/event: fix timeout accuracy
>       app/eventdev: fix timeout accuracy
>
> Pu Xu (1):
>       ip_frag: fix fragmenting IPv4 packet with header option
>
> Qi Zhang (7):
>       net/ice/base: fix payload indicator on ptype
>       net/ice/base: fix uninitialized struct
>       net/ice/base: cleanup filter list on error
>       net/ice/base: fix memory allocation for MAC addresses
>       net/iavf: fix TSO max segment size
>       doc: fix matching versions in ice guide
>       net/iavf: fix wrong Tx context descriptor
>
> Radha Mohan Chintakuntla (1):
>       raw/octeontx2_dma: assign PCI device in DPI VF
>
> Raslan Darawsheh (1):
>       ethdev: update flow item GTP QFI definition
>
> Richael Zhuang (2):
>       test/power: add delay before checking CPU frequency
>       test/power: round CPU frequency to check
>
> Robin Zhang (4):
>       net/i40e: announce request queue capability in PF
>       doc: update recommended versions for i40e
>       net/i40e: fix lack of MAC type when set MAC address
>       net/iavf: fix lack of MAC type when set MAC address
>
> Rohit Raj (3):
>       net/dpaa2: fix getting link status
>       net/dpaa: fix getting link status
>       examples/l2fwd-crypto: fix packet length while decryption
>
> Roy Shterman (1):
>       mem: fix freeing segments in --huge-unlink mode
>
> Satheesh Paul (1):
>       net/octeontx2: fix VLAN filter
>
> Savinay Dharmappa (1):
>       sched: fix traffic class oversubscription parameter
>
> Shijith Thotton (1):
>       eventdev: fix case to initiate crypto adapter service
>
> Siwar Zitouni (1):
>       net/ice: fix disabling promiscuous mode
>
> Somnath Kotur (3):
>       net/bnxt: fix xstats get
>       net/bnxt: fix Rx and Tx timestamps
>       net/bnxt: fix Tx timestamp init
>
> Stanislaw Kardach (1):
>       test: proceed if timer subsystem already initialized
>
> Stephen Hemminger (1):
>       kni: refactor user request processing
>
> Tal Shnaiderman (2):
>       eal/windows: fix default thread priority
>       eal/windows: fix return codes of pthread shim layer
>
> Tengfei Zhang (1):
>       net/pcap: fix file descriptor leak on close
>
> Thinh Tran (1):
>       test: fix autotest handling of skipped tests
>
> Thomas Monjalon (16):
>       bus/pci: fix Windows kernel driver categories
>       eal: fix comment of OS-specific header files
>       buildtools: fix build with busybox
>       build: detect execinfo library on Linux
>       build: remove redundant _GNU_SOURCE definitions
>       eal: fix build with musl
>       net/igc: remove use of uint type
>       event/dlb: fix header includes for musl
>       examples/bbdev: fix header include for musl
>       drivers: fix log level after loading
>       app/regex: fix usage text
>       app/testpmd: fix usage text
>       doc: fix names of UIO drivers
>       doc: fix build with Sphinx 4
>       bus/pci: support I/O port operations with musl
>       app: fix exit messages
>
> Tyler Retzlaff (1):
>       eal: add C++ include guard for reciprocal header
>
> Vadim Podovinnikov (1):
>       net/bonding: fix LACP system address check
>
> Venkat Duvvuru (1):
>       net/bnxt: fix queues per VNIC
>
> Viacheslav Ovsiienko (11):
>       net/mlx5: fix external buffer pool registration for Rx queue
>       net/mlx5: fix metadata item validation for ingress flows
>       net/mlx5: fix hashed list size for tunnel flow groups
>       net/mlx5: fix UAR allocation diagnostics messages
>       common/mlx5: add timestamp format support to DevX
>       vdpa/mlx5: support timestamp format
>       net/mlx5: fix Rx metadata leftovers
>       net/mlx5: fix drop action for Direct Rules/Verbs
>       net/mlx4: fix RSS action with null hash key
>       net/mlx5: support timestamp format
>       regex/mlx5: support timestamp format
>
> Wenjun Wu (2):
>       net/ice: check some functions return
>       net/ice: fix RSS hash update
>
> Wenwu Ma (1):
>       net/ice: fix illegal access when removing MAC filter
>
> Wenzhuo Lu (2):
>       net/iavf: fix crash in AVX512
>       net/ice: fix crash in AVX512
>
> Wisam Jaddo (1):
>       app/flow-perf: fix encap/decap actions
>
> Xiao Wang (1):
>       vdpa/ifc: check PCI config read
>
> Xiaoyu Min (4):
>       net/mlx5: support RSS expansion for IPv6 GRE
>       net/mlx5: fix shared inner RSS
>       net/mlx5: fix missing shared RSS hash types
>       net/mlx5: fix redundant flow after RSS expansion
>
> Xiaoyun Li (2):
>       app/testpmd: remove unnecessary UDP tunnel check
>       net/i40e: fix IPv4 fragment offload
>
> Youri Querry (1):
>       bus/fslmc: fix random portal hangs with qbman 5.0
>
> Yunjian Wang (3):
>       vfio: fix API description
>       net/mlx5: fix using flow tunnel before null check
>       vfio: fix duplicated user mem map
>

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] 20.11.2 patches review and test
  2021-06-01  7:54  1% [dpdk-dev] 20.11.2 patches review and test Xueming(Steven) Li
@ 2021-06-08  8:52  0% ` Jiang, YuX
  2021-06-08 10:28  0% ` Pei Zhang
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 200+ results
From: Jiang, YuX @ 2021-06-08  8:52 UTC (permalink / raw)
  To: Xueming(Steven) Li
  Cc: dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani, Walker,
	Benjamin, David Christensen, Govindharajan, Hariprasad,
	Hemant Agrawal, Stokes, Ian, Jerin Jacob, Mcnamara, John,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, Yu,
	PingX, Xu, Qian Q, Raslan Darawsheh, NBU-Contact-Thomas Monjalon,
	Peng, Yuan, Chen, Zhaoyan

Hi Steven,

Testing of DPDK v20.11.2-rc1 at Intel looks good: no critical issues were found, and all issues seen are known issues. Details below:

# Basic Intel(R) NIC testing
	* PF (i40e, ixgbe): test scenarios include rte_flow/TSO/Jumboframe/checksum offload/Tunnel, etc. (not an exhaustive list).
		- The two known issues below were found.
			> (1) https://bugs.dpdk.org/show_bug.cgi?id=687 : unit_tests_power/power_cpufreq: unit test failed. This issue was found in 21.05; patches to fix it exist but are not yet merged into main.
			> (2) ddp_gtp_qregion/fd_gtpu_ipv4_dstip: flow director does not work. This issue was found in 21.05; patches to fix it exist but are not yet merged into main.

	* VF (i40e, ixgbe): test scenarios include vf-rte_flow/TSO/Jumboframe/checksum offload/Tunnel (not an exhaustive list).
		- No new issues were found.

	* PF/VF (ice): test scenarios include switch features/Flow Director/Advanced RSS/ACL/DCF/Flexible Descriptor and so on (not an exhaustive list).
		- The 4 known DPDK issues below were found.
			> (1) rxtx_offload/rxoffload_port: Pkt1 can't be distributed to the same queue. This issue was found in 21.05; patches to fix it exist but are not yet merged into main.
			> (2) dcf_lifecycle/handle_acl_filter_05: after resetting the port, the MAC changed. This issue was found and fixed in 21.05; with the patch series below applied, the test passes on LTS 20.11.2.
				Patches link:
					https://patchwork.dpdk.org/project/dpdk/list/?series=16712&state=%2A&archive=both
			> (3) cvl_advanced_iavf_rss: after changing the SCTP port value, the hash value remains unchanged. This issue was found in 20.11-rc3 and is not fixed yet.
			> (4) Can't create 512 ACL rules after creating a full mask switch rule. This issue also occurs in DPDK 20.11 and is not fixed yet.
                             
	* Build or compile:
		* Build: covers build test combinations of the latest GCC/Clang/ICC versions with popular OS releases such as Ubuntu 20.04 and CentOS 8.3 (not an exhaustive list).
			- All passed, except that the build failed on Fedora 34 with GCC 11 and Clang 12.
				> GCC 11 issue: https://bugs.dpdk.org/show_bug.cgi?id=692 : bnx2x build fail on Fedora 34 with gcc 11.
					This issue was found in 21.05 on Fedora 34 with GCC 11. Fix patches were merged into main for 21.05, but they fail to apply on LTS 20.11.2.
					Patches link:
						http://patchwork.dpdk.org/project/dpdk/list/?series=16927&state=%2A&archive=both (apply failed)
						http://patchwork.dpdk.org/project/dpdk/patch/20210505085314.54750-1-ktraynor@redhat.com/
						http://patchwork.dpdk.org/project/dpdk/patch/20210514150834.227474-1-ktraynor@redhat.com/
						http://patchwork.dpdk.org/project/dpdk/patch/20210517155739.800371-1-ferruh.yigit@intel.com/
				> Clang 12 issue: app/test/dpdk-test.p/test_cmdline_num.c.o fails to build on Fedora 34 with Clang 12, but builds on Fedora 33 with Clang 11. Should we create a Bugzilla entry to track this issue?

		* Compile: covers CFLAGS (O0/O1/O2/O3) on popular OSes such as Ubuntu 20.04 and CentOS 8.3.
			- Not tested.

	* Intel NIC single core/NIC performance: test scenarios include PF/VF single core performance tests (AVX2+AVX512) and so on (not an exhaustive list).
		- All passed. No significant performance drop.

# Basic cryptodev and virtio testing
	* Virtio: both functional and performance tests are covered, such as PVP/Virtio_loopback/virtio-user loopback/virtio-net VM2VM perf testing, etc. (not an exhaustive list).
		- One known issue, as below:
			> (1) The UDP fragmentation offload feature of the virtio-net device can't be turned on in the VM. This is a kernel issue; a Bugzilla report has been submitted: https://bugzilla.kernel.org/show_bug.cgi?id=207075 (not fixed yet).
	* Cryptodev:
		- Function test: test scenarios include Cryptodev API testing/CompressDev ISA-L/QAT/ZLIB PMD Testing/FIPS, etc. (not an exhaustive list).
			- All passed.
		- Performance test: test scenarios include Throughput Performance/Cryptodev Latency, etc. (not an exhaustive list).
			- No significant performance drop.

dpdk: https://dpdk.org/browse/dpdk-stable/
commit ad11991368c46d818f5bdfe014106173d88179be (HEAD, tag: v20.11.2-rc1, origin/20.11)
Author: Xueming Li <xuemingl@nvidia.com>
Date:   Tue Jun 1 14:12:07 2021 +0800

    version: 20.11.2-rc1

    Signed-off-by: Xueming Li <xuemingl@nvidia.com>

Best regards,
Yu Jiang

>-----Original Message-----
>From: dev <dev-bounces@dpdk.org> On Behalf Of Xueming(Steven) Li
>Sent: Tuesday, June 1, 2021 3:55 PM
>Cc: dev@dpdk.org; Abhishek Marathe <Abhishek.Marathe@microsoft.com>;
>Akhil Goyal <akhil.goyal@nxp.com>; Ali Alnubani <alialnu@nvidia.com>;
>Walker, Benjamin <benjamin.walker@intel.com>; David Christensen
><drc@linux.vnet.ibm.com>; Govindharajan, Hariprasad
><hariprasad.govindharajan@intel.com>; Hemant Agrawal
><hemant.agrawal@nxp.com>; Stokes, Ian <ian.stokes@intel.com>; Jerin
>Jacob <jerinj@marvell.com>; Mcnamara, John <john.mcnamara@intel.com>;
>Ju-Hyoung Lee <juhlee@microsoft.com>; Kevin Traynor
><ktraynor@redhat.com>; Luca Boccassi <bluca@debian.org>; Pei Zhang
><pezhang@redhat.com>; Yu, PingX <pingx.yu@intel.com>; Xu, Qian Q
><qian.q.xu@intel.com>; Raslan Darawsheh <rasland@nvidia.com>; NBU-
>Contact-Thomas Monjalon <thomas@monjalon.net>; Peng, Yuan
><yuan.peng@intel.com>; Chen, Zhaoyan <zhaoyan.chen@intel.com>;
>Xueming(Steven) Li <xuemingl@nvidia.com>
>Subject: [dpdk-dev] 20.11.2 patches review and test
>
>Hi all,
>
>Here is a list of patches targeted for stable release 20.11.2.
>
>The planned date for the final release is 15th June.
>
>Please help with testing and validation of your use cases and report any
>issues/results with reply-all to this mail. For the final release the fixes and
>reported validations will be added to the release notes.
>
>A release candidate tarball can be found at:
>
>    https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc1
>
>These patches are located at branch 20.11 of dpdk-stable repo:
>    https://dpdk.org/browse/dpdk-stable/
>
>
>Thanks.
>
>Xueming Li <xuemingl@nvidia.com>
>
>---
>Ajit Khaparde (3):
>      net/bnxt: fix RSS context cleanup
>      net/bnxt: check kvargs parsing
>      net/bnxt: fix resource cleanup
>
>Alvin Zhang (7):
>      net/ice: fix VLAN filter with PF
>      net/i40e: fix input set field mask
>      net/igc: fix Rx RSS hash offload capability
>      net/igc: fix Rx error counter for bad length
>      net/e1000: fix Rx error counter for bad length
>      net/e1000: fix max Rx packet size
>      net/igc: fix Rx packet size
>
>Anatoly Burakov (2):
>      fbarray: fix log message on truncation error
>      power: do not skip saving original P-state governor
>
>Andrew Boyer (1):
>      net/ionic: fix completion type in lif init
>
>Andrew Rybchenko (3):
>      net/failsafe: fix RSS hash offload reporting
>      net/failsafe: report minimum and maximum MTU
>      common/sfc_efx: remove GENEVE from supported tunnels
>
>Ankur Dwivedi (1):
>      crypto/octeontx: fix session-less mode
>
>Apeksha Gupta (1):
>      examples/l2fwd-crypto: skip masked devices
>
>Arek Kusztal (1):
>      crypto/qat: fix offset for out-of-place scatter-gather
>
>Beilei Xing (1):
>      net/i40evf: fix packet loss for X722
>
>Bruce Richardson (1):
>      build: exclude meson files from examples installation
>
>Chenbo Xia (1):
>      examples/vhost: check memory table query
>
>Chengchang Tang (15):
>      net/hns3: fix HW buffer size on MTU update
>      net/hns3: fix processing Tx offload flags
>      net/hns3: fix Tx checksum for UDP packets with special port
>      net/hns3: fix long task queue pairs reset time
>      ethdev: validate input in module EEPROM dump
>      ethdev: validate input in register info
>      ethdev: validate input in EEPROM info
>      net/hns3: fix rollback after setting PVID failure
>      net/hns3: fix timing in resetting queues
>      net/hns3: fix queue state when concurrent with reset
>      net/hns3: fix configure FEC when concurrent with reset
>      net/hns3: fix use of command status enumeration
>      examples: add eal cleanup to examples
>      net/bonding: fix adding itself as its slave
>      net/hns3: fix timing in mailbox
>
>Chengwen Feng (15):
>      net/hns3: fix flow counter value
>      net/hns3: fix VF mailbox head field
>      net/hns3: support get device version when dump register
>      net/hns3: fix some packet types
>      net/hns3: fix missing outer L4 UDP flag for VXLAN
>      net/hns3: remove VLAN/QinQ ptypes from support list
>      test: check thread creation
>      common/dpaax: fix possible null pointer access
>      examples/ethtool: remove unused parsing
>      net/hns3: fix flow director lock
>      net/e1000/base: fix timeout for shadow RAM write
>      net/hns3: fix setting default MAC address in bonding of VF
>      net/hns3: fix possible mismatched response of mailbox
>      net/hns3: fix VF handling LSC event in secondary process
>      net/hns3: fix verification of NEON support
>
>Ciara Loftus (1):
>      net/af_xdp: fix error handling during Rx queue setup
>
>Conor Walsh (1):
>      examples/l3fwd: fix LPM IPv6 subnets
>
>Cristian Dumitrescu (3):
>      table: fix actions with different data size
>      pipeline: fix instruction translation
>      pipeline: fix endianness conversions
>
>Dapeng Yu (3):
>      net/igc: remove MTU setting limitation
>      net/e1000: remove MTU setting limitation
>      examples/packet_ordering: fix port configuration
>
>David Harton (1):
>      net/ena: fix releasing Tx ring mbufs
>
>David Marchand (8):
>      doc: fix sphinx rtd theme import in GHA
>      service: clean references to removed symbol
>      eal: fix evaluation of log level option
>      ci: hook to GitHub Actions
>      ci: enable v21 ABI checks
>      ci: fix package installation in GitHub Actions
>      ci: ignore APT update failure in GitHub Actions
>      ci: catch coredumps
>
>Dekel Peled (1):
>      common/mlx5: fix DevX read output buffer size
>
>Dmitry Kozlyuk (3):
>      net/pcap: fix format string
>      eal/windows: add missing SPDX license tag
>      buildtools: fix all drivers disabled on Windows
>
>Ed Czeck (2):
>      net/ark: update packet director initial state
>      net/ark: refactor Rx buffer recovery
>
>Elad Nachman (2):
>      kni: support async user request
>      kni: fix kernel deadlock with bifurcated device
>
>Feifei Wang (2):
>      net/i40e: fix parsing packet type for NEON
>      test/trace: fix race on collected perf data
>
>Ferruh Yigit (3):
>      power: remove duplicated symbols from map file
>      log/linux: make default output stderr
>      license: fix typos
>
>Guoyang Zhou (1):
>      net/hinic: fix crash in secondary process
>
>Haiyue Wang (1):
>      net/ixgbe: fix Rx errors statistics for UDP checksum
>
>Harman Kalra (1):
>      event/octeontx2: fix device reconfigure for single slot
>
>Hongbo Zheng (3):
>      app/testpmd: fix Tx/Rx descriptor query error log
>      net/hns3: fix FLR miss detection
>      net/hns3: delete redundant blank line
>
>Huisong Li (11):
>      net/hns3: fix device capabilities for copper media type
>      net/hns3: remove unused parameter markers
>      net/hns3: fix reporting undefined speed
>      net/hns3: fix link update when failed to get link info
>      net/hns3: fix flow control exception
>      app/testpmd: fix bitmap of link speeds when force speed
>      net/hns3: fix flow control mode
>      net/hns3: remove redundant mailbox response
>      net/hns3: fix DCB mode check
>      net/hns3: fix VMDq mode check
>      net/hns3: fix mbuf leakage
>
>Ibtisam Tariq (1):
>      examples/vhost_crypto: remove unused short option
>
>Igor Russkikh (2):
>      net/qede: reduce log verbosity
>      net/qede: accept bigger RSS table
>
>Ilya Maximets (1):
>      net/virtio: fix interrupt unregistering for listening socket
>
>Ivan Malov (5):
>      net/sfc: fix buffer size for flow parse
>      net: fix comment in IPv6 header
>      net/sfc: fix error path inconsistency
>      common/sfc_efx/base: fix indication of MAE encap support
>      net/sfc: fix outer rule rollback on error
>
>Jiawei Wang (2):
>      app/testpmd: fix NVGRE encap configuration
>      net/mlx5: fix resource release for mirror flow
>
>Jiawei Zhu (1):
>      net/mlx5: fix Rx segmented packets on mbuf starvation
>
>Jiawen Wu (3):
>      net/txgbe: remove unused functions
>      net/txgbe: fix Rx missed packet counter
>      net/txgbe: update packet type
>
>John Daley (1):
>      net/enic: fix flow initialization error handling
>
>Kalesh AP (18):
>      net/bnxt: remove unused macro
>      net/bnxt: fix VNIC configuration
>      net/bnxt: fix firmware fatal error handling
>      net/bnxt: fix FW readiness check during recovery
>      net/bnxt: fix device readiness check
>      net/bnxt: fix VF info allocation
>      net/bnxt: fix HWRM and FW incompatibility handling
>      net/bnxt: mute some failure logs
>      app/testpmd: check MAC address query
>      net/bnxt: fix PCI write check
>      net/bnxt: fix link state operations
>      net/bnxt: fix timesync when PTP is not supported
>      net/bnxt: fix memory allocation for command response
>      net/bnxt: fix double free in port start failure
>      net/bnxt: fix configuring LRO
>      net/bnxt: fix health check alarm cancellation
>      net/bnxt: fix PTP support for Thor
>      net/bnxt: fix ring count calculation for Thor
>
>Kevin Traynor (1):
>      test/cmdline: fix inputs array
>
>Lance Richardson (6):
>      net/bnxt: fix Rx buffer posting
>      net/bnxt: fix Tx length hint threshold
>      net/bnxt: fix handling of null flow mask
>      test: fix TCP header initialization
>      net/bnxt: fix Rx descriptor status
>      net/bnxt: fix Rx queue count
>
>Leyi Rong (1):
>      net/iavf: fix packet length parsing in AVX512
>
>Li Zhang (1):
>      net/mlx5: fix flow actions index in cache
>
>Luc Pelletier (2):
>      eal: fix race in control thread creation
>      eal: fix hang in control thread creation
>
>Marvin Liu (5):
>      vhost: fix split ring potential buffer overflow
>      vhost: fix packed ring potential buffer overflow
>      vhost: fix batch dequeue potential buffer overflow
>      vhost: fix initialization of temporary header
>      vhost: fix initialization of async temporary header
>
>Matan Azrad (4):
>      common/mlx5/linux: add glue function to query WQ
>      common/mlx5: add DevX command to query WQ
>      common/mlx5: add DevX commands for queue counters
>      vdpa/mlx5: fix virtq cleaning
>
>Min Hu (Connor) (8):
>      net/hns3: fix MTU config complexity
>      net/hns3: update HiSilicon copyright syntax
>      net/hns3: fix copyright date
>      examples/ptpclient: remove wrong comment
>      test/bpf: fix error message
>      doc: fix HiSilicon copyright syntax
>      net/hns3: remove unused macros
>      net/hns3: remove unused macro
>
>Murphy Yang (3):
>      net/ixgbe: fix RSS RETA being reset after port start
>      net/i40e: fix flow director config after flow validate
>      net/i40e: fix flow director for common pctypes
>
>Natanael Copa (5):
>      common/dpaax/caamflib: fix build with musl
>      bus/dpaa: fix 64-bit arch detection
>      bus/dpaa: fix build with musl
>      net/cxgbe: remove use of uint type
>      app/testpmd: fix build with musl
>
>Nipun Gupta (1):
>      bus/dpaa: fix statistics reading
>
>Nithin Dabilpuram (3):
>      vfio: do not merge contiguous areas
>      vfio: fix DMA mapping granularity for IOVA as VA
>      test/mem: fix page size for external memory
>
>Pallavi Kadam (1):
>      bus/pci: skip probing some Windows NDIS devices
>
>Pavan Nikhilesh (2):
>      test/event: fix timeout accuracy
>      app/eventdev: fix timeout accuracy
>
>Pu Xu (1):
>      ip_frag: fix fragmenting IPv4 packet with header option
>
>Qi Zhang (7):
>      net/ice/base: fix payload indicator on ptype
>      net/ice/base: fix uninitialized struct
>      net/ice/base: cleanup filter list on error
>      net/ice/base: fix memory allocation for MAC addresses
>      net/iavf: fix TSO max segment size
>      doc: fix matching versions in ice guide
>      net/iavf: fix wrong Tx context descriptor
>
>Radha Mohan Chintakuntla (1):
>      raw/octeontx2_dma: assign PCI device in DPI VF
>
>Raslan Darawsheh (1):
>      ethdev: update flow item GTP QFI definition
>
>Richael Zhuang (2):
>      test/power: add delay before checking CPU frequency
>      test/power: round CPU frequency to check
>
>Robin Zhang (4):
>      net/i40e: announce request queue capability in PF
>      doc: update recommended versions for i40e
>      net/i40e: fix lack of MAC type when set MAC address
>      net/iavf: fix lack of MAC type when set MAC address
>
>Rohit Raj (3):
>      net/dpaa2: fix getting link status
>      net/dpaa: fix getting link status
>      examples/l2fwd-crypto: fix packet length while decryption
>
>Roy Shterman (1):
>      mem: fix freeing segments in --huge-unlink mode
>
>Satheesh Paul (1):
>      net/octeontx2: fix VLAN filter
>
>Savinay Dharmappa (1):
>      sched: fix traffic class oversubscription parameter
>
>Shijith Thotton (1):
>      eventdev: fix case to initiate crypto adapter service
>
>Siwar Zitouni (1):
>      net/ice: fix disabling promiscuous mode
>
>Somnath Kotur (3):
>      net/bnxt: fix xstats get
>      net/bnxt: fix Rx and Tx timestamps
>      net/bnxt: fix Tx timestamp init
>
>Stanislaw Kardach (1):
>      test: proceed if timer subsystem already initialized
>
>Stephen Hemminger (1):
>      kni: refactor user request processing
>
>Tal Shnaiderman (2):
>      eal/windows: fix default thread priority
>      eal/windows: fix return codes of pthread shim layer
>
>Tengfei Zhang (1):
>      net/pcap: fix file descriptor leak on close
>
>Thinh Tran (1):
>      test: fix autotest handling of skipped tests
>
>Thomas Monjalon (16):
>      bus/pci: fix Windows kernel driver categories
>      eal: fix comment of OS-specific header files
>      buildtools: fix build with busybox
>      build: detect execinfo library on Linux
>      build: remove redundant _GNU_SOURCE definitions
>      eal: fix build with musl
>      net/igc: remove use of uint type
>      event/dlb: fix header includes for musl
>      examples/bbdev: fix header include for musl
>      drivers: fix log level after loading
>      app/regex: fix usage text
>      app/testpmd: fix usage text
>      doc: fix names of UIO drivers
>      doc: fix build with Sphinx 4
>      bus/pci: support I/O port operations with musl
>      app: fix exit messages
>
>Tyler Retzlaff (1):
>      eal: add C++ include guard for reciprocal header
>
>Vadim Podovinnikov (1):
>      net/bonding: fix LACP system address check
>
>Venkat Duvvuru (1):
>      net/bnxt: fix queues per VNIC
>
>Viacheslav Ovsiienko (11):
>      net/mlx5: fix external buffer pool registration for Rx queue
>      net/mlx5: fix metadata item validation for ingress flows
>      net/mlx5: fix hashed list size for tunnel flow groups
>      net/mlx5: fix UAR allocation diagnostics messages
>      common/mlx5: add timestamp format support to DevX
>      vdpa/mlx5: support timestamp format
>      net/mlx5: fix Rx metadata leftovers
>      net/mlx5: fix drop action for Direct Rules/Verbs
>      net/mlx4: fix RSS action with null hash key
>      net/mlx5: support timestamp format
>      regex/mlx5: support timestamp format
>
>Wenjun Wu (2):
>      net/ice: check some functions return
>      net/ice: fix RSS hash update
>
>Wenwu Ma (1):
>      net/ice: fix illegal access when removing MAC filter
>
>Wenzhuo Lu (2):
>      net/iavf: fix crash in AVX512
>      net/ice: fix crash in AVX512
>
>Wisam Jaddo (1):
>      app/flow-perf: fix encap/decap actions
>
>Xiao Wang (1):
>      vdpa/ifc: check PCI config read
>
>Xiaoyu Min (4):
>      net/mlx5: support RSS expansion for IPv6 GRE
>      net/mlx5: fix shared inner RSS
>      net/mlx5: fix missing shared RSS hash types
>      net/mlx5: fix redundant flow after RSS expansion
>
>Xiaoyun Li (2):
>      app/testpmd: remove unnecessary UDP tunnel check
>      net/i40e: fix IPv4 fragment offload
>
>Youri Querry (1):
>      bus/fslmc: fix random portal hangs with qbman 5.0
>
>Yunjian Wang (3):
>      vfio: fix API description
>      net/mlx5: fix using flow tunnel before null check
>      vfio: fix duplicated user mem map


* Re: [dpdk-dev] [PATCH v9 10/10] Enable the new EAL thread API
  2021-06-08  5:50  5%         ` Narcisa Ana Maria Vasile
@ 2021-06-08  7:45  5%           ` David Marchand
  2021-06-18 21:53  0%             ` Narcisa Ana Maria Vasile
  0 siblings, 1 reply; 200+ results
From: David Marchand @ 2021-06-08  7:45 UTC (permalink / raw)
  To: Narcisa Ana Maria Vasile
  Cc: dev, Thomas Monjalon, Dmitry Kozlyuk, Khoa To, navasile,
	Dmitry Malloy (MESHCHANINOV),
	roretzla, Tal Shnaiderman, Omar Cardona, Bruce Richardson,
	Pallavi Kadam

On Tue, Jun 8, 2021 at 7:50 AM Narcisa Ana Maria Vasile
<navasile@linux.microsoft.com> wrote:
>
> On Fri, Jun 04, 2021 at 04:44:34PM -0700, Narcisa Ana Maria Vasile wrote:
> > From: Narcisa Vasile <navasile@microsoft.com>
> >
> > Rename pthread_* occurrences with the new rte_thread_* API.
> > Enable the new API in the build system.
> >
> > Signed-off-by: Narcisa Vasile <navasile@microsoft.com>
> > ---
>
> I'll send v10.
> Can someone please help with an example on how to check for ABI breaks? Thank you!
>
> I've run:
> DPDK_ABI_REF_VERSION=v21.05 DPDK_ABI_REF_DIR=~/ref ./devtools/test-meson-builds.sh
> which doesn't give any warnings about the ABI break.

This should work the way you tried if you have working toolchains and
libabigail installed.
Something is off in your env.

Side note: ovsrobot is out these days (we have some trouble in one of
the RH labs, and ovsrobot happens to be hosted there), but you could try
with a github repo of yours + GHA, and the ABI failure should be
caught too.


I just tried on my rhel7 (gcc 4.8.5 + libabigail 1.8.2) with your
series applied.
$ DPDK_ABI_REF_VERSION=v21.05 DPDK_ABI_REF_DIR=~/git/pub/dpdk.org/reference ./devtools/test-meson-builds.sh
...
Error: ABI issue reported for 'abidiff --suppr /home/dmarchan/git/pub/dpdk.org/devtools/../devtools/libabigail.abignore --no-added-syms --headers-dir1 /home/dmarchan/git/pub/dpdk.org/reference/v21.05/build-gcc-shared/usr/local/include --headers-dir2 /home/dmarchan/git/pub/dpdk.org/build-gcc-shared/install/usr/local/include /home/dmarchan/git/pub/dpdk.org/reference/v21.05/build-gcc-shared/dump/librte_eal.dump /home/dmarchan/git/pub/dpdk.org/build-gcc-shared/install/dump/librte_eal.dump'
ABIDIFF_ABI_CHANGE, this change requires a review (abidiff flagged this as a potential issue).


$ abidiff --suppr /home/dmarchan/git/pub/dpdk.org/devtools/../devtools/libabigail.abignore \
    --no-added-syms \
    --headers-dir1 /home/dmarchan/git/pub/dpdk.org/reference/v21.05/build-gcc-shared/usr/local/include \
    --headers-dir2 /home/dmarchan/git/pub/dpdk.org/build-gcc-shared/install/usr/local/include \
    /home/dmarchan/git/pub/dpdk.org/reference/v21.05/build-gcc-shared/dump/librte_eal.dump \
    /home/dmarchan/git/pub/dpdk.org/build-gcc-shared/install/dump/librte_eal.dump
Functions changes summary: 0 Removed, 2 Changed (1 filtered out), 0 Added (20 filtered out) functions
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable

2 functions with some indirect sub-type change:

  [C] 'function int rte_ctrl_thread_create(pthread_t*, const char*, const pthread_attr_t*, void* (void*)*, void*)' at rte_lcore.h:443:1 has some indirect sub-type changes:
    parameter 1 of type 'pthread_t*' changed:
      in pointed to type 'typedef pthread_t' at rte_thread.h:42:1:
        typedef name changed from pthread_t to rte_thread_t at rte_thread.h:42:1
        underlying type 'unsigned long int' changed:
          entity changed from 'unsigned long int' to 'struct rte_thread_tag' at rte_thread.h:40:1
          type size hasn't changed
    parameter 3 of type 'const pthread_attr_t*' changed:
      in pointed to type 'const pthread_attr_t':
        'const pthread_attr_t' changed to 'const rte_thread_attr_t'

  [C] 'function int rte_thread_setname(pthread_t, const char*)' at rte_lcore.h:377:1 has some indirect sub-type changes:
    parameter 1 of type 'typedef pthread_t' changed:
      typedef name changed from pthread_t to rte_thread_t at rte_thread.h:42:1
      underlying type 'unsigned long int' changed:
        entity changed from 'unsigned long int' to 'struct rte_thread_tag' at rte_thread.h:40:1
        type size hasn't changed
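
To make the "type size hasn't changed" remark concrete, here is a minimal
sketch of why a handle that moves from an integer typedef to a struct is
still reported. The ex_* names are hypothetical stand-ins, not the DPDK or
libc definitions, and the struct layout is an assumption for illustration
only. Even where the two types occupy the same number of bytes, code written
against the old header treats the handle as an integer, while the new type
can only be used through its member:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical "before" type: the handle is a plain integer. */
typedef unsigned long ex_thread_old_t;

/* Hypothetical "after" type: the handle is wrapped in a struct. */
typedef struct ex_thread_tag {
	uintptr_t opaque_id;
} ex_thread_new_t;

/* Old-style callers could compare handles directly as integers. */
int ex_old_equal(ex_thread_old_t a, ex_thread_old_t b)
{
	return a == b;
}

/* With the struct, "a == b" no longer compiles; the member is needed. */
int ex_new_equal(ex_thread_new_t a, ex_thread_new_t b)
{
	return a.opaque_id == b.opaque_id;
}
```

Whether such a change is harmless at the binary level depends on the target
ABI, which is why abidiff asks for a review instead of deciding by itself.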



Can you check that in your env build-gcc-shared/ and the build
directory for references are configured with debug symbols?
You should see:
$ meson configure build-gcc-shared | awk '$1=="buildtype" {print $2}'
debugoptimized
$ meson configure reference/v21.05/build | awk '$1=="buildtype" {print $2}'
debugoptimized




>
> I've cloned the dpdk repo in "~/ref" and checkout v21.05 tag.
> "~/dpdk" is on a local branch that contains my changes:
>
> "./devtools/check-abi.sh ~/ref ~/dpdk" - didn't work.
>
> I've then used gen-abi.sh (with a small change to skip the *.symbols,
> since abidw can't handle them) to generate the *.dump files. Reruning check-abi.sh

gen-abi.sh is an internal script that works for an installed dpdk, not
a build directory.
$ ./devtools/gen-abi.sh
Usage: ./devtools/gen-abi.sh installdir


> worked this time, but didn't show the ABI break. This is the entire output:
>
> ------
> WARNING: could not identify an include directory for /home/administrator/ref, expect false positives...
> WARNING: could not identify an include directory for /home/administrator/dpdk, expect false positives...

check-abi.sh tries to find an include/ directory to filter changes on
private structures.
But there are multiple include/ dirs in a build directory, so the
script gives up on trying to filter and logs a warning.
This is not clearly written but, like gen-abi.sh, check-abi.sh works
on an installed directory.


> Functions changes summary: 0 Removed, 0 Changed, 0 Added function
> Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
> Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info
> ------
>
> I've also tried to compare each file:
> abidiff --suppr ./devtools/libabigail.abignore --no-added-syms  ~/ref/dump/librte_eal.dump ~/dpdk/dump/librte_eal.dump
>

Without debug info, libabigail won't catch or report much beyond symbol
removals or basic changes in function signatures.


-- 
David Marchand



* Re: [dpdk-dev] [PATCH v9 10/10] Enable the new EAL thread API
  @ 2021-06-08  5:50  5%         ` Narcisa Ana Maria Vasile
  2021-06-08  7:45  5%           ` David Marchand
  0 siblings, 1 reply; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-08  5:50 UTC (permalink / raw)
  To: dev, thomas, dmitry.kozliuk, khot, navasile, dmitrym, roretzla,
	talshn, ocardona
  Cc: bruce.richardson, david.marchand, pallavi.kadam

On Fri, Jun 04, 2021 at 04:44:34PM -0700, Narcisa Ana Maria Vasile wrote:
> From: Narcisa Vasile <navasile@microsoft.com>
> 
> Rename pthread_* occurrences with the new rte_thread_* API.
> Enable the new API in the build system.
> 
> Signed-off-by: Narcisa Vasile <navasile@microsoft.com>
> ---

I'll send v10. 
Can someone please help with an example on how to check for ABI breaks? Thank you!

I've run:
DPDK_ABI_REF_VERSION=v21.05 DPDK_ABI_REF_DIR=~/ref ./devtools/test-meson-builds.sh
which doesn't give any warnings about the ABI break.

I've cloned the dpdk repo in "~/ref" and checked out the v21.05 tag.
"~/dpdk" is on a local branch that contains my changes:

"./devtools/check-abi.sh ~/ref ~/dpdk" - didn't work.

I've then used gen-abi.sh (with a small change to skip the *.symbols,
since abidw can't handle them) to generate the *.dump files. Rerunning check-abi.sh
worked this time, but didn't show the ABI break. This is the entire output:

------
WARNING: could not identify an include directory for /home/administrator/ref, expect false positives...
WARNING: could not identify an include directory for /home/administrator/dpdk, expect false positives...
Functions changes summary: 0 Removed, 0 Changed, 0 Added function
Variables changes summary: 0 Removed, 0 Changed, 0 Added variable
Variable symbols changes summary: 0 Removed, 0 Added variable symbol not referenced by debug info
------

I've also tried to compare each file:
abidiff --suppr ./devtools/libabigail.abignore --no-added-syms  ~/ref/dump/librte_eal.dump ~/dpdk/dump/librte_eal.dump



* Re: [dpdk-dev] [RFC PATCH 0/3] Add PIE support for HQoS library
  @ 2021-06-07 13:01  0%   ` Liguzinski, WojciechX
  0 siblings, 0 replies; 200+ results
From: Liguzinski, WojciechX @ 2021-06-07 13:01 UTC (permalink / raw)
  To: Morten Brørup, Singh, Jasvinder, Dumitrescu, Cristian
  Cc: Dharmappa, Savinay, dev


> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com> 
> Sent: Tuesday, May 25, 2021 10:57 AM
> To: Liguzinski, WojciechX <wojciechx.liguzinski@intel.com>; dev@dpdk.org; Singh, Jasvinder <jasvinder.singh@intel.com>; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Cc: Dharmappa, Savinay <savinay.dharmappa@intel.com>
> Subject: RE: [dpdk-dev] [RFC PATCH 0/3] Add PIE support for HQoS library
>
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Liguzinski, 
> > WojciechX
> > Sent: Monday, 24 May 2021 12.58
> > 
> > DPDK sched library is equipped with mechanism that secures it from the 
> > bufferbloat problem which is a situation when excess buffers in the 
> > network cause high latency and latency variation. Currently, it 
> > supports RED for queue congestion control
>
> The correct term is "active queue management", not "queue congestion control".

Good point. I will correct the naming.

>
> > (which is designed
> > to control the queue length but it does not control latency directly 
> > and is now being obsoleted ).
>
> Some might prefer other algorithms, such as PIE, CoDel, CAKE, etc., but RED is not obsolete!

I didn't write that it is obsolete, I just shortened what was written in the RFC (8033) on page 4:
"(...) AQM schemes, such as Random Early Detection
(RED) [RED] as suggested in [RFC2309] (which is now obsoleted by
[RFC7567]), have been around for well over a decade. RED is
implemented in a wide variety of network devices, both in hardware
and software. Unfortunately, due to the fact that RED needs careful
tuning of its parameters for various network conditions, most network
operators don't turn RED on. (...)"

Apologies if I wasn't precise when thinking about such a summary. :-)

>
> > However, more advanced queue management is required to address this 
> > problem and provide desirable quality of service to users.
> > 
> > This solution (RFC) proposes usage of new algorithm called "PIE"
> > (Proportional Integral
> > controller Enhanced) that can effectively and directly control queuing 
> > latency to address the bufferbloat problem.
> > 
> > The implementation of mentioned functionality includes modification of 
> > existing and adding a new set of data structures to the library, 
> > adding PIE related APIs.
> > This affects structures in public API/ABI. That is why deprecation 
> > notice is going to be prepared and sent.
> > 
> > 
> > Liguzinski, WojciechX (3):
> >   sched: add pie based congestion management
> >   example/qos_sched: add pie support
> >   example/ip_pipeline: add pie support
>
> It's "PIE", not "pie". :-)

Sure, I will make the proper naming corrections ;-)

>
> Nonetheless, the RFC looks good!
>
> -Morten

Thanks,
Wojciech


* [dpdk-dev] [PATCH v9 00/10] eal: Add EAL API for threading
  2021-06-04 23:38  2%   ` [dpdk-dev] [PATCH v8 " Narcisa Ana Maria Vasile
@ 2021-06-04 23:44  2%     ` Narcisa Ana Maria Vasile
                           ` (2 more replies)
  0 siblings, 3 replies; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-04 23:44 UTC (permalink / raw)
  To: dev, thomas, dmitry.kozliuk, khot, navasile, dmitrym, roretzla,
	talshn, ocardona
  Cc: bruce.richardson, david.marchand, pallavi.kadam

From: Narcisa Vasile <navasile@microsoft.com>

EAL thread API

**Problem Statement**
DPDK currently uses the pthread interface to create and manage threads.
Windows does not support the POSIX thread programming model, so it currently
relies on a header file that hides the Windows calls behind
pthread-compatible interfaces. Given that EAL should isolate the environment
specifics from the applications and libraries and mediate
all the communication with the operating systems, a new EAL interface
is needed for thread management.

**Goals**
* Introduce a generic EAL API for threading support that will remove
  the current Windows pthread.h shim.
* Replace references to pthread_* across the DPDK codebase with the new
  rte_thread_* API.
* Allow users to choose between using the rte_thread_* API or a
  3rd party thread library through a configuration option.

**Design plan**
New API main files:
* rte_thread.h (librte_eal/include)
* rte_thread_types.h (librte_eal/include)
* rte_windows_thread_types.h (librte_eal/windows/include)
* rte_thread.c (librte_eal/windows)
* rte_thread.c (librte_eal/common)

For flexibility, the user is offered the option of either using the rte_thread_* API or
a 3rd party thread library, through a meson flag “use_external_thread_lib”.
By default, this flag is set to FALSE, which means Windows libraries and applications
will use the rte_thread_* API for managing threads.

If compiling on Windows and “use_external_thread_lib” is *not* set,
the following files are used:
* include/rte_thread.h
* windows/include/rte_windows_thread_types.h
* windows/rte_thread.c
In all other cases, the compilation/parsing includes the following files:
* include/rte_thread.h 
* include/rte_thread_types.h
* common/rte_thread.c

**A schematic example of the design**
--------------------------------------------------
lib/librte_eal/include/rte_thread.h
int rte_thread_create();

lib/librte_eal/common/rte_thread.c
int rte_thread_create() 
{
	return pthread_create();
}

lib/librte_eal/windows/rte_thread.c
int rte_thread_create() 
{
	return CreateThread();
}

lib/librte_eal/windows/meson.build
if get_option('use_external_thread_lib')
	sources += 'librte_eal/common/rte_thread.c'
else
	sources += 'librte_eal/windows/rte_thread.c'
endif
-----------------------------------------------------
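
The POSIX-backed half of the schematic above can be sketched as follows.
This is only an illustration under stated assumptions: the ex_* names, the
handle struct and the two-function surface are hypothetical and are not the
signatures introduced by this series. A Windows-backed variant would keep
the same prototypes and call CreateThread() internally.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical opaque handle wrapping the OS thread identifier. */
typedef struct ex_thread {
	pthread_t os_handle;
} ex_thread_t;

typedef void *(*ex_thread_func)(void *);

/* Forward to pthread_create(); returns 0 or an errno-style value. */
int ex_thread_create(ex_thread_t *thread, ex_thread_func func, void *arg)
{
	if (thread == NULL || func == NULL)
		return EINVAL;
	return pthread_create(&thread->os_handle, NULL, func, arg);
}

/* Forward to pthread_join(); returns 0 or an errno-style value. */
int ex_thread_join(ex_thread_t thread, void **retval)
{
	return pthread_join(thread.os_handle, retval);
}

/* Small worker used only to exercise the wrappers. */
void *ex_increment(void *arg)
{
	int *value = arg;

	(*value)++;
	return arg;
}
```

Callers only see ex_thread_t and the two wrappers, so the backend can be
swapped at build time without touching them.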

**Thread attributes**

When or after a thread is created, specific characteristics of the thread
can be adjusted. Given that the thread characteristics that are of interest
for DPDK applications are affinity and priority, the following structure
that represents thread attributes has been defined:

typedef struct
{
	enum rte_thread_priority priority;
	rte_cpuset_t cpuset;
} rte_thread_attr_t;

The *rte_thread_create()* function can optionally receive an rte_thread_attr_t
object that will cause the thread to be created with the affinity and priority
described by the attributes object. If no rte_thread_attr_t is passed
(parameter is NULL), the default affinity and priority are used.
An rte_thread_attr_t object can also be set to the default values
by calling *rte_thread_attr_init()*.

*Priority* is represented through an enum that currently advertises
two values for priority:
	- RTE_THREAD_PRIORITY_NORMAL
	- RTE_THREAD_PRIORITY_REALTIME_CRITICAL
The enum can be extended to allow for multiple priority levels.
rte_thread_set_priority      - sets the priority of a thread
rte_thread_attr_set_priority - updates an rte_thread_attr_t object
                               with a new value for priority
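
As a rough sketch of how such an attributes object might be initialized and
updated (the ex_* names are hypothetical, a 64-bit mask stands in for
rte_cpuset_t, and returning errno-style codes is an assumption, not taken
from the patches):

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

enum ex_thread_priority {
	EX_THREAD_PRIORITY_NORMAL = 0,
	EX_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
};

/* Hypothetical attributes object mirroring the struct described above. */
typedef struct {
	enum ex_thread_priority priority;
	uint64_t cpuset; /* stand-in for rte_cpuset_t */
} ex_thread_attr_t;

/* Reset the attributes to their default values. */
int ex_thread_attr_init(ex_thread_attr_t *attr)
{
	if (attr == NULL)
		return EINVAL;
	attr->priority = EX_THREAD_PRIORITY_NORMAL;
	attr->cpuset = 0; /* empty mask: no explicit affinity requested */
	return 0;
}

/* Update only the attributes object; no running thread is touched. */
int ex_thread_attr_set_priority(ex_thread_attr_t *attr,
		enum ex_thread_priority priority)
{
	if (attr == NULL)
		return EINVAL;
	if (priority != EX_THREAD_PRIORITY_NORMAL &&
	    priority != EX_THREAD_PRIORITY_REALTIME_CRITICAL)
		return EINVAL;
	attr->priority = priority;
	return 0;
}
```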

The user can choose the thread priority through an EAL parameter
when starting an application. If the EAL parameter is not used,
the per-platform default value for thread priority is used.
Otherwise, the administrator can select one of the available values:
 --thread-prio normal
 --thread-prio realtime

Example:
./dpdk-l2fwd -l 0-3 -n 4 --thread-prio normal -- -q 8 -p ffff
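
One possible way to map the option value to the priority enum is sketched
below. The function name and the EINVAL return for unknown values are
assumptions made for illustration; only the two accepted strings come from
the text above.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

enum ex_thread_priority {
	EX_THREAD_PRIORITY_NORMAL = 0,
	EX_THREAD_PRIORITY_REALTIME_CRITICAL = 1,
};

/*
 * Translate the value given to --thread-prio into a priority.
 * Returns 0 on success and EINVAL for a missing or unknown value,
 * in which case *out is left untouched.
 */
int ex_parse_thread_prio(const char *value, enum ex_thread_priority *out)
{
	if (value == NULL || out == NULL)
		return EINVAL;
	if (strcmp(value, "normal") == 0) {
		*out = EX_THREAD_PRIORITY_NORMAL;
		return 0;
	}
	if (strcmp(value, "realtime") == 0) {
		*out = EX_THREAD_PRIORITY_REALTIME_CRITICAL;
		return 0;
	}
	return EINVAL;
}
```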

*Affinity* is described by the already known “rte_cpuset_t” type.
rte_thread_attr_set/get_affinity - sets/gets the affinity field in a
                                   rte_thread_attr_t object
rte_thread_set/get_affinity      - sets/gets the affinity of a thread
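
The attribute-level affinity accessors could look roughly like this. It is
a sketch with hypothetical ex_* names: a 64-bit mask replaces rte_cpuset_t
so the example builds on any platform, and the limit of 64 CPUs is an
artifact of that simplification.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define EX_CPUSET_MAX 64

/* Hypothetical CPU set: one bit per CPU index. */
typedef struct {
	uint64_t bits;
} ex_cpuset_t;

typedef struct {
	ex_cpuset_t cpuset;
} ex_thread_attr_t;

/* Add one CPU index to the set. */
int ex_cpuset_set(ex_cpuset_t *set, unsigned int cpu)
{
	if (set == NULL || cpu >= EX_CPUSET_MAX)
		return EINVAL;
	set->bits |= UINT64_C(1) << cpu;
	return 0;
}

/* Return non-zero if the CPU index is part of the set. */
int ex_cpuset_isset(const ex_cpuset_t *set, unsigned int cpu)
{
	return set != NULL && cpu < EX_CPUSET_MAX &&
		((set->bits >> cpu) & 1) != 0;
}

/* Copy the requested affinity into the attributes object. */
int ex_thread_attr_set_affinity(ex_thread_attr_t *attr,
		const ex_cpuset_t *set)
{
	if (attr == NULL || set == NULL)
		return EINVAL;
	attr->cpuset = *set;
	return 0;
}

/* Read the affinity back from the attributes object. */
int ex_thread_attr_get_affinity(const ex_thread_attr_t *attr,
		ex_cpuset_t *set)
{
	if (attr == NULL || set == NULL)
		return EINVAL;
	*set = attr->cpuset;
	return 0;
}
```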

**Errors**
A translation function that maps Windows error codes to errno-style
error codes is provided. 
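
A translation of this kind might be sketched as below. The numeric values
are those of a few well-known Win32 error codes, redefined locally under
hypothetical EX_* names so the sketch builds without windows.h. Which errno
value each code maps to, and the EINVAL fallback for unlisted codes, are
assumptions for illustration and not the mapping used in the patches.

```c
#include <assert.h>
#include <errno.h>

/* Local copies of a few Win32 error code values. */
enum {
	EX_ERROR_SUCCESS = 0,
	EX_ERROR_ACCESS_DENIED = 5,
	EX_ERROR_NOT_ENOUGH_MEMORY = 8,
	EX_ERROR_INVALID_PARAMETER = 87,
};

/* Map a Windows error code to an errno-style value. */
int ex_win_err_to_errno(unsigned long err)
{
	switch (err) {
	case EX_ERROR_SUCCESS:
		return 0;
	case EX_ERROR_ACCESS_DENIED:
		return EACCES;
	case EX_ERROR_NOT_ENOUGH_MEMORY:
		return ENOMEM;
	case EX_ERROR_INVALID_PARAMETER:
		return EINVAL;
	default:
		/* Assumed fallback for codes without a specific mapping. */
		return EINVAL;
	}
}
```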

**Future work**
Note that this patchset focuses on introducing a new API that will
remove the Windows pthread.h shim. In DPDK, there are still a few references
to pthread_* that were not implemented in the shim.
The long term plan is for EAL to provide full threading support:
* Adding support for condition variables
* Additional functionality offered by pthread_* (such as pthread_setname_np, etc.)
* Static mutex initializers are not used on Windows. If we must continue
  using them, they need to be platform dependent and an implementation will
  need to be provided for Windows.

v9:
- Sign patches

v8:
- Rebase
- Add rte_thread_detach() API
- Set default priority, when user did not specify a value

v7:
Based on DmitryK's review:
- Change thread id representation
- Change mutex id representation
- Implement static mutex initializer for Windows
- Change barrier identifier representation
- Improve commit messages
- Add missing doxygen comments
- Split error translation function
- Improve name for affinity function
- Remove cpuset_size parameter
- Fix eal_create_cpu_map function
- Map EAL priority values to OS specific values
- Add thread wrapper for start routine
- Do not export rte_thread_cancel() on Windows
- Cleanup, fix comments, fix typos.

v6:
- improve error-translation function
- call the error translation function in rte_thread_value_get()

v5:
- update cover letter with more details on the priority argument

v4:
- fix function description
- rebase

v3:
- rebase

v2:
- revert changes that break ABI 
- break up changes into smaller patches
- fix coding style issues
- fix issues with errors
- fix parameter type in examples/kni.c

Narcisa Vasile (10):
  eal: add thread id and simple thread functions
  eal: add thread attributes
  eal/windows: translate Windows errors to errno-style errors
  eal: implement functions for thread affinity management
  eal: implement thread priority management functions
  eal: add thread lifetime management
  eal: implement functions for mutex management
  eal: implement functions for thread barrier management
  eal: add EAL argument for setting thread priority
  Enable the new EAL thread API

 app/test/process.h                            |   8 +-
 app/test/test_lcores.c                        |  16 +-
 app/test/test_link_bonding.c                  |  10 +-
 app/test/test_lpm_perf.c                      |  12 +-
 config/meson.build                            |   4 +
 drivers/bus/dpaa/base/qbman/bman_driver.c     |   5 +-
 drivers/bus/dpaa/base/qbman/dpaa_sys.c        |  14 +-
 drivers/bus/dpaa/base/qbman/process.c         |   6 +-
 drivers/bus/dpaa/dpaa_bus.c                   |  14 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c      |  19 +-
 drivers/compress/mlx5/mlx5_compress.c         |  10 +-
 drivers/event/dlb2/pf/base/dlb2_osdep.h       |   4 +-
 drivers/net/af_xdp/rte_eth_af_xdp.c           |  18 +-
 drivers/net/ark/ark_ethdev.c                  |   2 +-
 drivers/net/ark/ark_pktgen.c                  |   4 +-
 drivers/net/atlantic/atl_ethdev.c             |   4 +-
 drivers/net/atlantic/atl_types.h              |   5 +-
 .../net/atlantic/hw_atl/hw_atl_utils_fw2x.c   |  26 +-
 drivers/net/axgbe/axgbe_common.h              |   2 +-
 drivers/net/axgbe/axgbe_dev.c                 |   8 +-
 drivers/net/axgbe/axgbe_ethdev.c              |   8 +-
 drivers/net/axgbe/axgbe_ethdev.h              |   8 +-
 drivers/net/axgbe/axgbe_i2c.c                 |   4 +-
 drivers/net/axgbe/axgbe_mdio.c                |   8 +-
 drivers/net/axgbe/axgbe_phy_impl.c            |   6 +-
 drivers/net/bnxt/bnxt.h                       |  16 +-
 drivers/net/bnxt/bnxt_cpr.c                   |   4 +-
 drivers/net/bnxt/bnxt_ethdev.c                |  52 +-
 drivers/net/bnxt/bnxt_irq.c                   |   8 +-
 drivers/net/bnxt/bnxt_reps.c                  |  10 +-
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c            |  34 +-
 drivers/net/bnxt/tf_ulp/bnxt_ulp.h            |   4 +-
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.c          |  24 +-
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.h          |   2 +-
 drivers/net/ena/base/ena_plat_dpdk.h          |  12 +-
 drivers/net/enic/enic.h                       |   2 +-
 drivers/net/ice/ice_dcf_parent.c              |   6 +-
 drivers/net/ipn3ke/ipn3ke_representor.c       |   6 +-
 drivers/net/ixgbe/ixgbe_ethdev.c              |   2 +-
 drivers/net/ixgbe/ixgbe_ethdev.h              |   2 +-
 drivers/net/kni/rte_eth_kni.c                 |   8 +-
 drivers/net/mlx5/linux/mlx5_os.c              |   2 +-
 drivers/net/mlx5/mlx5.c                       |  20 +-
 drivers/net/mlx5/mlx5.h                       |   2 +-
 drivers/net/mlx5/mlx5_txpp.c                  |   8 +-
 drivers/net/mlx5/windows/mlx5_flow_os.c       |  10 +-
 drivers/net/mlx5/windows/mlx5_os.c            |   2 +-
 drivers/net/qede/base/bcm_osal.h              |   8 +-
 drivers/net/vhost/rte_eth_vhost.c             |  24 +-
 .../net/virtio/virtio_user/virtio_user_dev.c  |  30 +-
 .../net/virtio/virtio_user/virtio_user_dev.h  |   2 +-
 drivers/raw/ifpga/ifpga_rawdev.c              |   6 +-
 drivers/vdpa/ifc/ifcvf_vdpa.c                 |  46 +-
 drivers/vdpa/mlx5/mlx5_vdpa.c                 |  24 +-
 drivers/vdpa/mlx5/mlx5_vdpa.h                 |   4 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c           |  46 +-
 examples/kni/main.c                           |   6 +-
 .../performance-thread/pthread_shim/main.c    |   2 +-
 examples/vhost/main.c                         |   2 +-
 examples/vhost_blk/vhost_blk.c                |  12 +-
 lib/eal/common/eal_common_options.c           |  34 +-
 lib/eal/common/eal_common_proc.c              |  48 +-
 lib/eal/common/eal_common_thread.c            |  42 +-
 lib/eal/common/eal_common_trace.c             |   2 +-
 lib/eal/common/eal_internal_cfg.h             |   2 +
 lib/eal/common/eal_options.h                  |   2 +
 lib/eal/common/eal_private.h                  |   2 +-
 lib/eal/common/eal_thread.h                   |   6 +
 lib/eal/common/malloc_mp.c                    |  32 +-
 lib/eal/common/meson.build                    |   1 +
 lib/eal/common/rte_thread.c                   | 422 +++++++++++
 lib/eal/freebsd/eal.c                         |  42 +-
 lib/eal/freebsd/eal_alarm.c                   |  12 +-
 lib/eal/freebsd/eal_interrupts.c              |   4 +-
 lib/eal/freebsd/eal_thread.c                  |  14 +-
 lib/eal/include/meson.build                   |   1 +
 lib/eal/include/rte_lcore.h                   |   8 +-
 lib/eal/include/rte_per_lcore.h               |   2 -
 lib/eal/include/rte_thread.h                  | 378 +++++++++-
 lib/eal/include/rte_thread_types.h            |  14 +
 lib/eal/linux/eal.c                           |  46 +-
 lib/eal/linux/eal_alarm.c                     |  10 +-
 lib/eal/linux/eal_interrupts.c                |   4 +-
 lib/eal/linux/eal_thread.c                    |  18 +-
 lib/eal/linux/eal_timer.c                     |   2 +-
 lib/eal/unix/meson.build                      |   1 -
 lib/eal/unix/rte_thread.c                     |  92 ---
 lib/eal/version.map                           |  22 +
 lib/eal/windows/eal.c                         |  43 +-
 lib/eal/windows/eal_interrupts.c              |  10 +-
 lib/eal/windows/eal_lcore.c                   | 169 +++--
 lib/eal/windows/eal_thread.c                  |  28 +-
 lib/eal/windows/eal_windows.h                 |  20 +-
 lib/eal/windows/include/meson.build           |   1 +
 lib/eal/windows/include/pthread.h             | 192 -----
 .../include/rte_windows_thread_types.h        |  19 +
 lib/eal/windows/include/sched.h               |   2 +-
 lib/eal/windows/meson.build                   |   7 +-
 lib/eal/windows/rte_thread.c                  | 678 +++++++++++++++++-
 lib/ethdev/rte_ethdev.c                       |   4 +-
 lib/ethdev/rte_ethdev_core.h                  |   5 +-
 lib/ethdev/rte_flow.c                         |   4 +-
 lib/eventdev/rte_event_eth_rx_adapter.c       |   6 +-
 lib/vhost/fd_man.c                            |  40 +-
 lib/vhost/fd_man.h                            |   6 +-
 lib/vhost/socket.c                            | 130 ++--
 lib/vhost/vhost.c                             |  10 +-
 meson_options.txt                             |   2 +
 108 files changed, 2345 insertions(+), 967 deletions(-)
 create mode 100644 lib/eal/common/rte_thread.c
 create mode 100644 lib/eal/include/rte_thread_types.h
 delete mode 100644 lib/eal/unix/rte_thread.c
 delete mode 100644 lib/eal/windows/include/pthread.h
 create mode 100644 lib/eal/windows/include/rte_windows_thread_types.h

-- 
2.31.0.vfs.0.1



* [dpdk-dev] [PATCH v8 00/10] eal: Add EAL API for threading
  2021-06-01 20:55  2% ` [dpdk-dev] [PATCH v7 00/10] eal: Add EAL " Narcisa Ana Maria Vasile
@ 2021-06-04 23:38  2%   ` Narcisa Ana Maria Vasile
  2021-06-04 23:44  2%     ` [dpdk-dev] [PATCH v9 " Narcisa Ana Maria Vasile
  0 siblings, 1 reply; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-04 23:38 UTC (permalink / raw)
  To: dev, thomas, dmitry.kozliuk, khot, navasile, dmitrym, roretzla,
	talshn, ocardona
  Cc: bruce.richardson, david.marchand, pallavi.kadam

From: Narcisa Vasile <navasile@microsoft.com>

EAL thread API

**Problem Statement**
DPDK currently uses the pthread interface to create and manage threads.
Windows does not support the POSIX thread programming model, so it currently
relies on a header file that hides the Windows calls behind
pthread-compatible interfaces. Given that EAL should isolate the environment
specifics from the applications and libraries and mediate
all the communication with the operating systems, a new EAL interface
is needed for thread management.

**Goals**
* Introduce a generic EAL API for threading support that will remove
  the current Windows pthread.h shim.
* Replace references to pthread_* across the DPDK codebase with the new
  rte_thread_* API.
* Allow users to choose between using the rte_thread_* API or a
  3rd party thread library through a configuration option.

**Design plan**
New API main files:
* rte_thread.h (librte_eal/include)
* rte_thread_types.h (librte_eal/include)
* rte_windows_thread_types.h (librte_eal/windows/include)
* rte_thread.c (librte_eal/windows)
* rte_thread.c (librte_eal/common)

For flexibility, the user is offered the option of either using the rte_thread_* API or
a 3rd party thread library, through a meson flag “use_external_thread_lib”.
By default, this flag is set to FALSE, which means Windows libraries and applications
will use the rte_thread_* API for managing threads.

If compiling on Windows and “use_external_thread_lib” is *not* set,
the following files are used:
* include/rte_thread.h
* windows/include/rte_windows_thread_types.h
* windows/rte_thread.c
In all other cases, the compilation/parsing includes the following files:
* include/rte_thread.h 
* include/rte_thread_types.h
* common/rte_thread.c

**A schematic example of the design**
--------------------------------------------------
lib/librte_eal/include/rte_thread.h
int rte_thread_create();

lib/librte_eal/common/rte_thread.c
int rte_thread_create() 
{
	return pthread_create();
}

lib/librte_eal/windows/rte_thread.c
int rte_thread_create() 
{
	return CreateThread();
}

lib/librte_eal/windows/meson.build
if get_option('use_external_thread_lib')
	sources += 'librte_eal/common/rte_thread.c'
else
	sources += 'librte_eal/windows/rte_thread.c'
endif
-----------------------------------------------------

**Thread attributes**

When or after a thread is created, specific characteristics of the thread
can be adjusted. Given that the thread characteristics that are of interest
for DPDK applications are affinity and priority, the following structure
that represents thread attributes has been defined:

typedef struct
{
	enum rte_thread_priority priority;
	rte_cpuset_t cpuset;
} rte_thread_attr_t;

The *rte_thread_create()* function can optionally receive an rte_thread_attr_t
object that will cause the thread to be created with the affinity and priority
described by the attributes object. If no rte_thread_attr_t is passed
(parameter is NULL), the default affinity and priority are used.
An rte_thread_attr_t object can also be set to the default values
by calling *rte_thread_attr_init()*.

*Priority* is represented through an enum that currently advertises
two values for priority:
	- RTE_THREAD_PRIORITY_NORMAL
	- RTE_THREAD_PRIORITY_REALTIME_CRITICAL
The enum can be extended to allow for multiple priority levels.
rte_thread_set_priority      - sets the priority of a thread
rte_thread_attr_set_priority - updates an rte_thread_attr_t object
                               with a new value for priority

The user can choose the thread priority through an EAL parameter
when starting an application. If the EAL parameter is not used,
the per-platform default value for thread priority is used.
Otherwise, the administrator can select one of the available values:
 --thread-prio normal
 --thread-prio realtime

Example:
./dpdk-l2fwd -l 0-3 -n 4 --thread-prio normal -- -q 8 -p ffff

*Affinity* is described by the already known “rte_cpuset_t” type.
rte_thread_attr_set/get_affinity - sets/gets the affinity field in a
                                   rte_thread_attr_t object
rte_thread_set/get_affinity      - sets/gets the affinity of a thread

**Errors**
A translation function that maps Windows error codes to errno-style
error codes is provided. 

**Future work**
Note that this patchset focuses on introducing a new API that will
remove the Windows pthread.h shim. In DPDK, there are still a few references
to pthread_* that were not implemented in the shim.
The long term plan is for EAL to provide full threading support:
* Adding support for condition variables
* Additional functionality offered by pthread_* (such as pthread_setname_np, etc.)
* Static mutex initializers are not used on Windows. If we must continue
  using them, they need to be platform dependent and an implementation will
  need to be provided for Windows.

v8:
- Rebase
- Add rte_thread_detach() API
- Set default priority, when user did not specify a value

v7:
Based on DmitryK's review:
- Change thread id representation
- Change mutex id representation
- Implement static mutex initializer for Windows
- Change barrier identifier representation
- Improve commit messages
- Add missing doxygen comments
- Split error translation function
- Improve name for affinity function
- Remove cpuset_size parameter
- Fix eal_create_cpu_map function
- Map EAL priority values to OS specific values
- Add thread wrapper for start routine
- Do not export rte_thread_cancel() on Windows
- Cleanup, fix comments, fix typos.

v6:
- improve error-translation function
- call the error translation function in rte_thread_value_get()

v5:
- update cover letter with more details on the priority argument

v4:
- fix function description
- rebase

v3:
- rebase

v2:
- revert changes that break ABI 
- break up changes into smaller patches
- fix coding style issues
- fix issues with errors
- fix parameter type in examples/kni.c

Narcisa Vasile (10):
  eal: add thread id and simple thread functions
  eal: add thread attributes
  eal/windows: translate Windows errors to errno-style errors
  eal: implement functions for thread affinity management
  eal: implement thread priority management functions
  eal: add thread lifetime management
  eal: implement functions for mutex management
  eal: implement functions for thread barrier management
  eal: add EAL argument for setting thread priority
  Enable the new EAL thread API

 app/test/process.h                            |   8 +-
 app/test/test_lcores.c                        |  16 +-
 app/test/test_link_bonding.c                  |  10 +-
 app/test/test_lpm_perf.c                      |  12 +-
 config/meson.build                            |   4 +
 drivers/bus/dpaa/base/qbman/bman_driver.c     |   5 +-
 drivers/bus/dpaa/base/qbman/dpaa_sys.c        |  14 +-
 drivers/bus/dpaa/base/qbman/process.c         |   6 +-
 drivers/bus/dpaa/dpaa_bus.c                   |  14 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c      |  19 +-
 drivers/compress/mlx5/mlx5_compress.c         |  10 +-
 drivers/event/dlb2/pf/base/dlb2_osdep.h       |   4 +-
 drivers/net/af_xdp/rte_eth_af_xdp.c           |  18 +-
 drivers/net/ark/ark_ethdev.c                  |   2 +-
 drivers/net/ark/ark_pktgen.c                  |   4 +-
 drivers/net/atlantic/atl_ethdev.c             |   4 +-
 drivers/net/atlantic/atl_types.h              |   5 +-
 .../net/atlantic/hw_atl/hw_atl_utils_fw2x.c   |  26 +-
 drivers/net/axgbe/axgbe_common.h              |   2 +-
 drivers/net/axgbe/axgbe_dev.c                 |   8 +-
 drivers/net/axgbe/axgbe_ethdev.c              |   8 +-
 drivers/net/axgbe/axgbe_ethdev.h              |   8 +-
 drivers/net/axgbe/axgbe_i2c.c                 |   4 +-
 drivers/net/axgbe/axgbe_mdio.c                |   8 +-
 drivers/net/axgbe/axgbe_phy_impl.c            |   6 +-
 drivers/net/bnxt/bnxt.h                       |  16 +-
 drivers/net/bnxt/bnxt_cpr.c                   |   4 +-
 drivers/net/bnxt/bnxt_ethdev.c                |  52 +-
 drivers/net/bnxt/bnxt_irq.c                   |   8 +-
 drivers/net/bnxt/bnxt_reps.c                  |  10 +-
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c            |  34 +-
 drivers/net/bnxt/tf_ulp/bnxt_ulp.h            |   4 +-
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.c          |  24 +-
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.h          |   2 +-
 drivers/net/ena/base/ena_plat_dpdk.h          |  12 +-
 drivers/net/enic/enic.h                       |   2 +-
 drivers/net/ice/ice_dcf_parent.c              |   6 +-
 drivers/net/ipn3ke/ipn3ke_representor.c       |   6 +-
 drivers/net/ixgbe/ixgbe_ethdev.c              |   2 +-
 drivers/net/ixgbe/ixgbe_ethdev.h              |   2 +-
 drivers/net/kni/rte_eth_kni.c                 |   8 +-
 drivers/net/mlx5/linux/mlx5_os.c              |   2 +-
 drivers/net/mlx5/mlx5.c                       |  20 +-
 drivers/net/mlx5/mlx5.h                       |   2 +-
 drivers/net/mlx5/mlx5_txpp.c                  |   8 +-
 drivers/net/mlx5/windows/mlx5_flow_os.c       |  10 +-
 drivers/net/mlx5/windows/mlx5_os.c            |   2 +-
 drivers/net/qede/base/bcm_osal.h              |   8 +-
 drivers/net/vhost/rte_eth_vhost.c             |  24 +-
 .../net/virtio/virtio_user/virtio_user_dev.c  |  30 +-
 .../net/virtio/virtio_user/virtio_user_dev.h  |   2 +-
 drivers/raw/ifpga/ifpga_rawdev.c              |   6 +-
 drivers/vdpa/ifc/ifcvf_vdpa.c                 |  46 +-
 drivers/vdpa/mlx5/mlx5_vdpa.c                 |  24 +-
 drivers/vdpa/mlx5/mlx5_vdpa.h                 |   4 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c           |  46 +-
 examples/kni/main.c                           |   6 +-
 .../performance-thread/pthread_shim/main.c    |   2 +-
 examples/vhost/main.c                         |   2 +-
 examples/vhost_blk/vhost_blk.c                |  12 +-
 lib/eal/common/eal_common_options.c           |  34 +-
 lib/eal/common/eal_common_proc.c              |  48 +-
 lib/eal/common/eal_common_thread.c            |  42 +-
 lib/eal/common/eal_common_trace.c             |   2 +-
 lib/eal/common/eal_internal_cfg.h             |   2 +
 lib/eal/common/eal_options.h                  |   2 +
 lib/eal/common/eal_private.h                  |   2 +-
 lib/eal/common/eal_thread.h                   |   6 +
 lib/eal/common/malloc_mp.c                    |  32 +-
 lib/eal/common/meson.build                    |   1 +
 lib/eal/common/rte_thread.c                   | 422 +++++++++++
 lib/eal/freebsd/eal.c                         |  42 +-
 lib/eal/freebsd/eal_alarm.c                   |  12 +-
 lib/eal/freebsd/eal_interrupts.c              |   4 +-
 lib/eal/freebsd/eal_thread.c                  |  14 +-
 lib/eal/include/meson.build                   |   1 +
 lib/eal/include/rte_lcore.h                   |   8 +-
 lib/eal/include/rte_per_lcore.h               |   2 -
 lib/eal/include/rte_thread.h                  | 378 +++++++++-
 lib/eal/include/rte_thread_types.h            |  14 +
 lib/eal/linux/eal.c                           |  46 +-
 lib/eal/linux/eal_alarm.c                     |  10 +-
 lib/eal/linux/eal_interrupts.c                |   4 +-
 lib/eal/linux/eal_thread.c                    |  18 +-
 lib/eal/linux/eal_timer.c                     |   2 +-
 lib/eal/unix/meson.build                      |   1 -
 lib/eal/unix/rte_thread.c                     |  92 ---
 lib/eal/version.map                           |  22 +
 lib/eal/windows/eal.c                         |  43 +-
 lib/eal/windows/eal_interrupts.c              |  10 +-
 lib/eal/windows/eal_lcore.c                   | 169 +++--
 lib/eal/windows/eal_thread.c                  |  28 +-
 lib/eal/windows/eal_windows.h                 |  20 +-
 lib/eal/windows/include/meson.build           |   1 +
 lib/eal/windows/include/pthread.h             | 192 -----
 .../include/rte_windows_thread_types.h        |  19 +
 lib/eal/windows/include/sched.h               |   2 +-
 lib/eal/windows/meson.build                   |   7 +-
 lib/eal/windows/rte_thread.c                  | 678 +++++++++++++++++-
 lib/ethdev/rte_ethdev.c                       |   4 +-
 lib/ethdev/rte_ethdev_core.h                  |   5 +-
 lib/ethdev/rte_flow.c                         |   4 +-
 lib/eventdev/rte_event_eth_rx_adapter.c       |   6 +-
 lib/vhost/fd_man.c                            |  40 +-
 lib/vhost/fd_man.h                            |   6 +-
 lib/vhost/socket.c                            | 130 ++--
 lib/vhost/vhost.c                             |  10 +-
 meson_options.txt                             |   2 +
 108 files changed, 2345 insertions(+), 967 deletions(-)
 create mode 100644 lib/eal/common/rte_thread.c
 create mode 100644 lib/eal/include/rte_thread_types.h
 delete mode 100644 lib/eal/unix/rte_thread.c
 delete mode 100644 lib/eal/windows/include/pthread.h
 create mode 100644 lib/eal/windows/include/rte_windows_thread_types.h

-- 
2.31.0.vfs.0.1



* Re: [dpdk-dev] [RFC PATCH] ethdev: clarify flow action PORT ID semantics
  2021-06-03 10:33  3%               ` Andrew Rybchenko
@ 2021-06-03 11:05  0%                 ` Ilya Maximets
  0 siblings, 0 replies; 200+ results
From: Ilya Maximets @ 2021-06-03 11:05 UTC (permalink / raw)
  To: Andrew Rybchenko, Ilya Maximets, Ivan Malov, dev
  Cc: Eli Britstein, Smadar Fuks, Hyong Youb Kim, Ori Kam, Jerin Jacob,
	John Daley, Thomas Monjalon, Ferruh Yigit

On 6/3/21 12:33 PM, Andrew Rybchenko wrote:
> On 6/3/21 12:29 PM, Ilya Maximets wrote:
>> On 6/2/21 9:35 PM, Ivan Malov wrote:
>>> On 02/06/2021 20:35, Ilya Maximets wrote:
>>>> (Dropped Broadcom folks from CC.  Mail server refuses to accept their
>>>> emails for some reason: "Recipient address rejected: Domain not found."
>>>> Please, try to add them back on reply.)
>>>>
>>>> On 6/2/21 6:26 PM, Andrew Rybchenko wrote:
>>>>> On 6/2/21 3:46 PM, Ilya Maximets wrote:
>>>>>> On 6/1/21 4:28 PM, Ivan Malov wrote:
>>>>>>> Hi Ilya,
>>>>>>>
>>>>>>> Thank you for reviewing the proposal at such short notice. I'm afraid that prior discussions overlook the simple fact that the whole problem is not limited to just VF representors. Action PORT_ID is also used with respect to the admin PF's ethdev, which "represents itself" (and by no means it represents the underlying physical/network port). In this case, one cannot state that the application treats it as a physical port, just like one states that the application perceives representors as VFs themselves.
>>>>>>
>>>>>>
>>>>>> I don't think that it was overlooked.  If the device is in switchdev mode, then
>>>>>> there is a PF representor and VF representors.  The application typically works
>>>>>> only with representors in this case, as it doesn't make much sense to have
>>>>>> representor and the upstream port attached to the same application at the
>>>>>> same time.  Configuration that is applied by application to the representor
>>>>>> (PF or VF, it doesn't matter) applies to the corresponding upstream port
>>>>>> (actual PF or VF) by default.
>>>>>
>>>>> PF is not necessarily associated with a network port. There
>>>>> could be many PFs and just one network port on the NIC.
>>>>> Extra PFs are like VFs in this case. These PFs may be
>>>>> passed to a VM in a similar way. So, we can have PF
>>>>> representors similar to VF representors. I.e. it is
>>>>> incorrect to say that PF in the case of switchdev is
>>>>> a representor of a network port.
>>>>>
>>>>> If we prefer to talk in representors terminology, we
>>>>> need 4 types of representors:
>>>>>   - PF representor for PCIe physical function
>>>>>   - VF representor for PCIe virtual function
>>>>>   - SF representor for PCIe sub-function (PASID)
>>>>>   - network port representor
>>>>> In fact above is PCIe oriented, but there are
>>>>> other buses and ways to deliver traffic to applications.
>>>>> Basically representor for any virtual port in virtual
>>>>> switch which DPDK app can control using transfer rules.
>>>>>
>>>>>> Exactly same thing here with PORT_ID action.  You have a packet and action
>>>>>> to send it to the port, but it's not specified if HW needs to send it to
>>>>>> the representor or the upstream port (again, VF or PF, it doesn't matter).
>>>>>> Since there is no extra information, HW should send it to the upstream
>>>>>> port by default.  The same as configuration applies by default to the
>>>>>> upstream port.
>>>>>>
>>>>>> Let's look at some workflow examples:
>>>>>>
>>>>>>        DPDK Application
>>>>>>          |         |
>>>>>>          |         |
>>>>>>     +--PF-rep------VF-rep---+
>>>>>>     |                       |
>>>>>>     |    NIC (switchdev)    |
>>>>>>     |                       |
>>>>>>     +---PF---------VF-------+
>>>>>>         |          |
>>>>>>         |          |
>>>>>>     External       VM or whatever
>>>>>>     Network
>>>>>
>>>>> See above. PF <-> External Network is incorrect above
>>>>> since it is not always the case. It should be
>>>>> "NP <-> External network" and "NP-rep" above (NP -
>>>>> network port). Sometimes PF is an NP-rep, but sometimes
>>>>> it is not. It is just a question of default rules in
>>>>> switchdev on what to do with traffic incoming from
>>>>> network port.
>>>>>
>>>>> A bit more complicated picture is:
>>>>>
>>>>>      +----------------------------------------+
>>>>>      |            DPDK Application            |
>>>>>      +----+---------+---------+---------+-----+
>>>>>           |PF0      |PF1      |         |
>>>>>           |         |         |         |
>>>>>      +--NP1-rep---NP2-rep---PF2-rep---VF-rep--+
>>>>>      |                                        |
>>>>>      |             NIC (switchdev)            |
>>>>>      |                                        |
>>>>>      +---NP1-------NP2-------PF2--------VF----+
>>>>>           |         |         |         |
>>>>>           |         |         |         |
>>>>>       External   External    VM or     VM or
>>>>>      Network 1  Network 2  whatever   whatever
>>>>>
>>>>> So, sometimes PF plays network port representor role (PF0,
>>>>> PF1), sometimes it requires representor itself (PF2).
>>>>> What to do if PF2 itself is attached to application?
>>>>> Can we route traffic to it using PORT_ID action?
>>>>> It has a DPDK ethdev port. It is one of the arguments why
>>>>> plain PORT_ID should route to the DPDK application.
>>>>
>>>> OK.  This is not very different from my understanding.  The key
>>>> is that there is a pair of interfaces, one is more visible than
>>>> the other one.
>>>>
>>>>>
>>>>> Of course, some applications would like to see it as
>>>>> (simpler is better):
>>>>>
>>>>>      +----------------------------------------+
>>>>>      |            DPDK Application            |
>>>>>      |                                        |
>>>>>      +---PF0-------PF1------PF2-rep---VF-rep--+
>>>>>           |         |         |         |
>>>>>           |         |         |         |
>>>>>       External   External    VM or     VM or
>>>>>      Network 1  Network 2  whatever   whatever
>>>>>
>>>>> but some, I believe, require the full picture. For example,
>>>>> I'd really like to know how much traffic goes via all 8
>>>>> switchdev ports and running rte_eth_stats_get(0, ...)
>>>>> (i.e. DPDK port 0 attached to PF0) I'd like to get
>>>>> NP1-rep stats (not NP1 stats). It will match exactly
>>>>> what I see in DPDK application. It is an argument why
>>>>> plain PORT_ID should be treated as a DPDK ethdev port,
>>>>> not a represented (upstream) entity.
>>>>
>>>> The point is that if application doesn't require full picture,
>>>> it should not care.  If application requires the full picture,
>>>> it could take extra steps by setting extra bits.  I don't
>>>> understand why we need to force all applications to care about
>>>> the full picture if we can avoid that?
>>>>
>>>>>
>>>>>> a. Workflow for "DPDK Application" to set MAC to VF:
>>>>>>
>>>>>> 1. "DPDK Application" calls rte_set_etheraddr("VF-rep", new_mac);
>>>>>> 2.  DPDK sets MAC for "VF".
>>>>>>
>>>>>> b. Workflow for "DPDK Application" to set MAC to PF:
>>>>>>
>>>>>> 1. "DPDK Application" calls rte_set_etheraddr("PF-rep", new_mac);
>>>>>> 2.  DPDK sets MAC for "PF".
>>>>>>
>>>>>> c. Workflow for "DPDK Application" to send packet to the external network:
>>>>>>
>>>>>> 1. "DPDK Application" calls rte_eth_tx_burst("PF-rep", packet);
>>>>>> 2. NIC receives the packet from "PF-rep" and sends it to "PF".
>>>>>> 3. packet egresses to the external network from "PF".
>>>>>>
>>>>>> d. Workflow for "DPDK Application" to send packet to the "VM or whatever":
>>>>>>
>>>>>> 1. "DPDK Application" calls rte_eth_tx_burst("VF-rep", packet);
>>>>>> 2. NIC receives the packet from "VF-rep" and sends it to "VF".
>>>>>> 3. "VM or whatever" receives the packet from "VF".
>>>>>>
>>>>>> In two workflows above there is no rte_flow processing on step 2, i.e.,
>>>>>> NIC does not perform any lookups/matches/actions, because it's not possible
>>>>>> to configure actions for packets received from "PF-rep" or
>>>>>> "VF-rep" as these ports don't own a port id and all the configuration
>>>>>> and rte_flow actions are translated and applied for the devices that these
>>>>>> ports represent ("PF" and "VF") and not representors themselves ("PF-rep"
>>>>>> or "VF-rep").
>>>>>>
>>>>>> e. Workflow for the packet received on PF and PORT_ID action:
>>>>>>
>>>>>> 1. "DPDK Application" configures rte_flow for all packets from "PF-rep"
>>>>>>     to execute PORT_ID "VF-rep".
>>>>>> 2. NIC receives packet on "PF".
>>>>>> 3. NIC executes 'PORT_ID "VF-rep"' action by sending packet to "VF".
>>>>>> 4. "VM or whatever" receives the packet from "VF".
>>>>>>
>>>>>> f. Workflow for the packet received on VF and PORT_ID action:
>>>>>>
>>>>>> 1. "DPDK Application" configures rte_flow for all packets from "VF-rep"
>>>>>>     to execute 'PORT_ID "PF-rep"'.
>>>>>> 2. NIC receives packet on "VF".
>>>>>> 3. NIC executes 'PORT_ID "PF-rep"' action by sending packet to "PF".
>>>>>> 4. Packet egresses from the "PF" to the external network.
>>>>>>
>>>>>> Above is what, IMHO, the logic should look like and this matches with
>>>>>> the overall switchdev design in kernel.
>>>>>>
>>>>>> I understand that this logic could seem flipped-over from the HW point
>>>>>> of view, but it's perfectly logical from the user's perspective, because
>>>>>> user should not care if the application works with representors or
>>>>>> some real devices.  If application configures that all packets from port
>>>>>> A should be sent to port B, user will expect that these packets will
>>>>>> egress from port B once received from port A.  That will be highly
>>>>>> inconvenient if the packet will ingress from port B back to the
>>>>>> application instead.
>>>>>>
>>>>>>        DPDK Application
>>>>>>          |          |
>>>>>>          |          |
>>>>>>       port A     port B
>>>>>>          |          |
>>>>>>        *****MAGIC*****
>>>>>>          |          |
>>>>>>     External      Another Network
>>>>>>     Network       or VM or whatever
>>>>>>
>>>>>> It should not matter if there is an extra layer between ports A and B
>>>>>> and the external network and VM.  Everything should work in exactly the
>>>>>> same way, transparently for the application.
>>>>>>
>>>>>> The point of hardware offloading, and therefore rte_flow API, is to take
>>>>>> what user does in software and make this "magically" work in hardware in
>>>>>> the exactly same way.  And this will be broken if user will have to
>>>>>> use different logic based on the mode the hardware works in, i.e. based on
>>>>>> the fact if the application works with ports or their representors.
>>>>>>
>>>>>> If some specific use case requires application to know if it's an
>>>>>> upstream port or the representor and demystify the internals of the switchdev
>>>>>> NIC, there should be a different port id for the representor itself that
>>>>>> could be used in all DPDK APIs including rte_flow API or a special bit for
>>>>>> that matter.  IIRC, there was an idea to add a bit directly to the port_id
>>>>>> for that purpose that will flip over behavior in all the workflow scenarios
>>>>>> that I described above.
>>>>>
>>>>> As I understand we're basically on the same page, but just
>>>>> fighting for defaults in DPDK.
>>>>
>>>> Yep.
>>>>
>>>>>
>>>>>>>
>>>>>>> Given these facts, it would not be quite right to just align the documentation with the de-facto action meaning assumed by OvS.
>>>>>>
>>>>>> It's not a "meaning assumed by OvS", it's the original design and the
>>>>>> main idea of a switchdev based on a common sense.
>>>>>
>>>>> If so, common sense is not that common :)
>>>>> My "common sense" tells me that the PORT_ID action
>>>>> should route traffic to DPDK ethdev port to be
>>>>> received by the DPDK application.
>>>>
>>>> By this logic rte_eth_tx_burst("VF-rep", packet) should send a packet
>>>> to "VF-rep", i.e. this packet will be received back by the application
>>>> on this same interface.  But that is counter-intuitive and this is not
>>>> how it works in linux kernel if you're opening socket and sending a
>>>> packet to the "VF-rep" network interface.
>>>>
>>>> And if rte_eth_tx_burst("VF-rep", packet) sends packet to "VF" and not
>>>> to "VF-rep", then I don't understand why PORT_ID action should work in
>>>> the opposite way.
>>>
>>> There's no contradiction here.
>>>
>>> In rte_eth_tx_burst(X, packet) example, "X" is the port which the application sits on and from where it sends the packet. In other words, it's the point where the packet originates from, and not where it goes to.
>>>
>>> At the same time, flow *action* PORT_ID (ID = "X") is clearly the opposite: it specifies where the packet will go. Port ID is the characteristic of a DPDK ethdev. So the packet goes *to* an ethdev with the given ID ("X").
>>>
>>> Perhaps consider action PHY_PORT: the index is the characteristic of the network port. The packet goes *to* network through this NP. And not the opposite way. Hopefully, nobody is going to claim that action PHY_PORT should mean re-injecting the packet back to the HW flow engine "as if it just came from the network port". Then why does one try to skew the PORT_ID meaning this way? PORT_ID points to an ethdev - the packet goes *to* the ethdev. Isn't that simple?
>>
>> It's not simple.  And PHY_PORT action would be hard to use from the
>> application that doesn't really need to know how the underlying hardware
>> is structured.
> 
> Yes, I agree. Basically, the above paragraph just tries to highlight
> existing consistent semantics in various actions which set
> traffic direction and highlight inconsistency if we interpret
> PORT_ID default as egress in accordance with terminology
> suggested below. PORT_ID is a DPDK port and default direction
> should be to DPDK port. I'll continue on the topic below.
> 
>>>
>>>>
>>>> Application receives a packet from port A and puts it to the port B.
>>>> TC rule to forward packets from port A to port B will provide same result.
>>>> So, why the similar rte_flow should do the opposite and send the packet
>>>> back to the application?
>>>
>>> Please see above. Action VF sends the packet *to* VF and *not* to the upstream entity which this VF is connected to. Action PHY_PORT sends the packet *to* network and does *not* make it appear as if it entered the NIC from the network side. Action QUEUE sends the packet *to* the Rx queue and does *not* make it appear as if it just egressed from the Tx queue with the same index. Action PORT_ID sends the packet *to* an ethdev with the given ID and *not* to the upstream entity which this ethdev is connected to. It's just that transparent. It's just "do what the name suggests".
>>>
>>> Yes, an application (say, OvS) might have a high level design which perceives the "high-level" ports plugged to it as a "patch-panel" of sorts. Yes, when a high-level mechanism/logic of such application invokes a *datapath-unaware* wrapper to offload a rule and request that the packet be delivered to the given "high-level" port, it therefore requests that the packet be delivered to the opposite end of the wire. But then the lower-level datapath-specific (DPDK) handler kicks in. Since it's DPDK-specific, it knows *everything* about the underlying flow library it works with. In particular it knows that action PORT_ID delivers the packet to an *ethdev*, at the same time, it knows that the upper caller (high-level logic) for sure wants the opposite, so it (the lower-level DPDK component) sets the "upstream" bit when translating the higher-level port action to an RTE action "PORT_ID".
>>
>> I don't understand that.  DPDK user is the application and DPDK
>> doesn't translate anything, application creates PORT_ID action
>> directly and passes it to DPDK.  So, you're forcing the *end user*
>> (a.k.a. application) to know *everything* about the hardware the
>> application runs on.  Of course, it gets this information about
>> the hardware from the DPDK library (otherwise this would be
>> completely ridiculous), but this doesn't change the fact that it's
>> the application that needs to think about the structure of the
>> underlying hardware while it's absolutely not necessary in vast
>> majority of cases.
> 
> Yes, that's all true, but I think that specification of the
> direction is *not* diving too deep into hardware details.
> 
> For DPDK I think it is important to have consistent semantics
> and interpretation of input parameters. That will make the
> library easier to use and make it less error-prone.
> 
>>> Then the resulting action is correct, and the packet indeed doesn't end up in the ethdev but goes
>>> to the opposite end of the wire. That's it.
>>>
>>> I have an impression that for some reason people are tempted to ignore the two nominal "layers" in such applications (generic, or high-level one and DPDK-specific one) thus trying to align DPDK logic with high-level logic of the applications. That's simply not right. What I'm trying to point out is that it *is* the true job of DPDK-specific data path handler in such application - to properly translate generic flow actions to DPDK-specific ones. It's the duty of DPDK component in such applications to be aware of the genuine meaning of action PORT_ID.
>>
>> The reason is very simple: if an application doesn't need to know the
>> full picture (how the hardware is structured inside), it shouldn't
>> care and it's a duty of DPDK to abstract the hardware and provide
>> programming interfaces that could be easily used by application
>> developers who are not experts in the architecture of a hardware
>> that they want to use (basically, application developer should not
>> care at all in most cases on which hardware application will work).
>> It's basically in almost every single DPDK API, EAL means environment
>> *abstraction* layer, not an environment *proxy/passthrough* layer.
>> We can't assume that DPDK-specific layers in applications are always
>> written by hardware experts and, IMHO, DPDK should not force users
>> to learn underlying structures of switchdev devices.  They might not
>> even have such devices for testing, so the application that works
>> on simple NICs should be able to run correctly on switchdev-capable
>> NICs too.
>>
>> I think that "***MAGIC***" abstraction (see one of my previous ascii
>> graphics) is very important here.
> 
> I've answered it above. Specification of the direction is *not*
> diving too deep into HW details.

Yes, I agree that specification of direction doesn't require any
knowledge of any HW details and this option is perfectly fine for
me as described in 'ingress/egress' suggestion below.  My argument
is about 'upstream' flag specifically.  Wording is important, because
I think that 'upstream' implies some knowledge that there are two
different ports.

> 
>>>
>>> This way, mixing up the two meanings is ruled out.
>>
>> Looking closer at how tc flower rules are configured, I noticed that
>> 'mirred' action requires user to specify the direction in which
>> the packet will appear on the destination port.  And I suppose
>> this will solve your issues with PORT_ID action without exposing
>> the "full picture" of the architecture of an underlying hardware.
>>
>> It looks something like this:
>>
>>   tc filter add dev A ... action mirred egress redirect dev B
>>                                         ^^^^^^
>>
>> Direction could be 'ingress' or 'egress', so the packet will
>> ingress from the port B back to application/kernel or it will
>> egress from this port to the external network.  Same thing
>> could be implemented in rte_flow like this:
>>
>>   flow create A ingress transfer pattern eth / end
>>       action port_id id B egress / end
>>
>> So, application that needs to receive the packet from the port B
>> will specify 'ingress', others that just want to send packet from
>> the port B will specify 'egress'.  Will that work for you?
>>
>> (BTW, 'ingress' seems to be not implemented in TC and that kind
>> of suggests that it's not very useful at least for kernel use cases)
>>
>> One might say that it's actually the same what is proposed in
>> this RFC, but I will argue that 'ingress/egress' schema doesn't
>> break the "***MAGIC***" abstraction because user is not obligated
>> to know the structure of the underlying hardware, while 'upstream'
>> flag is something very unclear from that perspective and makes
>> no sense for plain ports (non-representors).
> 
> I think it is really an excellent idea and suggested
> terminology looks very good to me. However, we should
> agree on technical details on API level (not testpmd
> commands). I think we have 4 options:
> 
> A. Add "ingress" bit with "egress" as unset meaning.
>    Yes, that's what is current behaviour assumed and
>    used by OvS and implemented in some PMDs.
>    My problem with it is that it is, IMHO, an inconsistent
>    default value (as explained above).
> 
> B. Add "egress" bit with "ingress" as unset meaning.
>    Basically it is what is suggested in the RFC, but
>    the problem of the suggestion is the silent breakage
>    of existing users (let's put aside whether it is
>    correct usage or misuse). It is still a fact.
> 
> C. Encode above in ethdev port ID MSB.
>    The problem of the solution is that encoding
>    makes sense for representors, but the problem
>    exists for non-representor ports as well.
>    I have no good ideas on terminology in the case
>    if we try to solve it for non-representors.
> 
> D. Break API and ABI and add enum with unset(default)/
>    ingress/egress members to enforce application to
>    specify direction.
> 
> It is unclear what we'll do in the case of A, B and D
> if we encode representor in port ID MSB in any case.

My opinion:

 - Option D is the best choice for rte_flow.  No defaults, users forced
   to explicitly choose the direction in HW-independent way.

 - I agree that option C somewhat conflicts with the 'ingress/egress'
   flag idea and it is more hardware-specific.  Therefore if option C
   is going to be implemented it should be implemented in concept of
   option A, i.e. 'egress' is default option if port ID MSB is not set.
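
To make option D a bit more concrete, here is a rough sketch of what an
explicit direction could look like on the API level. All names below
(sketch_port_dir, sketch_action_port_id, sketch_port_id_validate) are
hypothetical and are not existing rte_flow definitions. The only point is
that a zero-initialised configuration carries no direction and gets
rejected, so there is no silent default.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical direction selector for a PORT_ID-like action (option D). */
enum sketch_port_dir {
	SKETCH_PORT_DIR_UNSET = 0, /* zero-initialised config: no direction given */
	SKETCH_PORT_DIR_INGRESS,   /* packet is received by the application on the port */
	SKETCH_PORT_DIR_EGRESS,    /* packet leaves through the port to the other end */
};

/* Hypothetical action configuration carrying an explicit direction. */
struct sketch_action_port_id {
	uint32_t id;              /* DPDK port ID */
	enum sketch_port_dir dir; /* must be set explicitly by the application */
};

/*
 * Return 0 if the configuration names a direction, -EINVAL otherwise.
 * A real implementation would also check that the port ID is valid.
 */
static int
sketch_port_id_validate(const struct sketch_action_port_id *conf)
{
	if (conf == NULL)
		return -EINVAL;

	switch (conf->dir) {
	case SKETCH_PORT_DIR_INGRESS:
	case SKETCH_PORT_DIR_EGRESS:
		return 0;
	default:
		/* Unset or unknown direction: refuse instead of guessing. */
		return -EINVAL;
	}
}
```

With something like this, an application that just forwards between ports
would always pass the egress value, and it would not need to know whether
the port is a representor or not.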

> 
>>>>>>> On 01/06/2021 15:10, Ilya Maximets wrote:
>>>>>>>> On 6/1/21 1:14 PM, Ivan Malov wrote:
>>>>>>>>> By its very name, action PORT_ID means that packets hit an ethdev with the
>>>>>>>>> given DPDK port ID. At least the current comments don't state the opposite.
>>>>>>>>> That said, since port representors had been adopted, applications like OvS
>>>>>>>>> have been misusing the action. They misread its purpose as sending packets
>>>>>>>>> to the opposite end of the "wire" plugged to the given ethdev, for example,
>>>>>>>>> redirecting packets to the VF itself rather than to its representor ethdev.
>>>>>>>>> Another example: OvS relies on this action with the admin PF's ethdev port
>>>>>>>>> ID specified in it in order to send offloaded packets to the physical port.
>>>>>>>>>
>>>>>>>>> Since there might be applications which use this action in its valid sense,
>>>>>>>>> one can't just change the documentation to greenlight the opposite meaning.
>>>>>>>>> This patch adds an explicit bit to the action configuration which will let
>>>>>>>>> applications, depending on their needs, leverage the two meanings properly.
>>>>>>>>> Applications like OvS, as well as PMDs, will have to be corrected when the
>>>>>>>>> patch has been applied. But the improved clarity of the action is worth it.
>>>>>>>>>
>>>>>>>>> The proposed change is not the only option. One could avoid changes in OvS
>>>>>>>>> and PMDs if the new configuration field had the opposite meaning, with the
>>>>>>>>> action itself meaning delivery to the represented port and not to DPDK one.
>>>>>>>>> Alternatively, one could define a brand new action with the said behaviour.
>>>>>>>>
>>>>>>>> We had already very similar discussions regarding the understanding of what
>>>>>>>> the representor really is from the DPDK API's point of view, and the last
>>>>>>>> time, IIUC, it was concluded by a tech. board that representor should be
>>>>>>>> a "ghost of a VF", i.e. DPDK APIs should apply configuration by default to
>>>>>>>> VF and not to the representor device:
>>>>>>>>     https://patches.dpdk.org/project/dpdk/cover/20191029185051.32203-1-thomas@monjalon.net/#104376
>>>>>>>> This wasn't enforced though, IIUC, for existing code and semantics is still mixed.
>>>>>>>>
>>>>>>>> I still think that configuration should be applied to VF, and the same applies
>>>>>>>> to rte_flow API.  IMHO, average application should not care if device is
>>>>>>>> a VF itself or its representor.  Everything should work exactly the same.
>>>>>>>> I think this matches with the original idea/design of the switchdev functionality
>>>>>>>> in the linux kernel and also matches with how the average user thinks about
>>>>>>>> representor devices.
>>>>>>>>
>>>>>>>> If some specific use-case requires to distinguish VF from the representor,
>>>>>>>> there should probably be a separate special API/flag for that.
>>>>>>>>
>>>>>>>> Best regards, Ilya Maximets.
>>>>>>>>
>>>>>>>
>>>>>
>>>
> 



* Re: [dpdk-dev] [RFC PATCH] ethdev: clarify flow action PORT ID semantics
  @ 2021-06-03 10:33  3%               ` Andrew Rybchenko
  2021-06-03 11:05  0%                 ` Ilya Maximets
  0 siblings, 1 reply; 200+ results
From: Andrew Rybchenko @ 2021-06-03 10:33 UTC (permalink / raw)
  To: Ilya Maximets, Ivan Malov, dev
  Cc: Eli Britstein, Smadar Fuks, Hyong Youb Kim, Ori Kam, Jerin Jacob,
	John Daley, Thomas Monjalon, Ferruh Yigit

On 6/3/21 12:29 PM, Ilya Maximets wrote:
> On 6/2/21 9:35 PM, Ivan Malov wrote:
>> On 02/06/2021 20:35, Ilya Maximets wrote:
>>> (Dropped Broadcom folks from CC.  Mail server refuses to accept their
>>> emails for some reason: "Recipient address rejected: Domain not found."
>>> Please try to add them back on reply.)
>>>
>>> On 6/2/21 6:26 PM, Andrew Rybchenko wrote:
>>>> On 6/2/21 3:46 PM, Ilya Maximets wrote:
>>>>> On 6/1/21 4:28 PM, Ivan Malov wrote:
>>>>>> Hi Ilya,
>>>>>>
>>>>>> Thank you for reviewing the proposal at such short notice. I'm afraid that prior discussions overlook the simple fact that the whole problem is not limited to just VF representors. Action PORT_ID is also used with respect to the admin PF's ethdev, which "represents itself" (and by no means it represents the underlying physical/network port). In this case, one cannot state that the application treats it as a physical port, just like one states that the application perceives representors as VFs themselves.
>>>>>
>>>>>
>>>>> I don't think that it was overlooked.  If a device is in switchdev mode, then
>>>>> there is a PF representor and VF representors.  An application typically works
>>>>> only with representors in this case, as it doesn't make much sense to have
>>>>> a representor and its upstream port attached to the same application at the
>>>>> same time.  Configuration that the application applies to the representor
>>>>> (PF or VF, it doesn't matter) applies to the corresponding upstream port
>>>>> (actual PF or VF) by default.
>>>>
>>>> A PF is not necessarily associated with a network port. There
>>>> could be many PFs and just one network port on a NIC.
>>>> Extra PFs are like VFs in this case. These PFs may be
>>>> passed to a VM in a similar way. So, we can have PF
>>>> representors similar to VF representors. I.e. it is
>>>> incorrect to say that PF in the case of switchdev is
>>>> a representor of a network port.
>>>>
>>>> If we prefer to talk in representors terminology, we
>>>> need 4 types of representors:
>>>>   - PF representor for PCIe physical function
>>>>   - VF representor for PCIe virtual function
>>>>   - SF representor for PCIe sub-function (PASID)
>>>>   - network port representor
>>>> In fact, the above is PCIe-oriented, but there are
>>>> other buses and ways to deliver traffic to applications.
>>>> Basically, we need a representor for any virtual port in the
>>>> virtual switch which a DPDK app can control using transfer rules.
>>>>
>>>>> Exactly same thing here with PORT_ID action.  You have a packet and action
>>>>> to send it to the port, but it's not specified if HW needs to send it to
>>>>> the representor or the upstream port (again, VF or PF, it doesn't matter).
>>>>> Since there is no extra information, HW should send it to the upstream
>>>>> port by default.  The same as configuration applies by default to the
>>>>> upstream port.
>>>>>
>>>>> Let's look at some workflow examples:
>>>>>
>>>>>        DPDK Application
>>>>>          |         |
>>>>>          |         |
>>>>>     +--PF-rep------VF-rep---+
>>>>>     |                       |
>>>>>     |    NIC (switchdev)    |
>>>>>     |                       |
>>>>>     +---PF---------VF-------+
>>>>>         |          |
>>>>>         |          |
>>>>>     External       VM or whatever
>>>>>     Network
>>>>
>>>> See above. "PF <-> External Network" in the diagram above is
>>>> incorrect, since it is not always the case. It should be
>>>> "NP <-> External network" and "NP-rep" above (NP -
>>>> network port). Sometimes PF is an NP-rep, but sometimes
>>>> it is not. It is just a question of default rules in
>>>> switchdev on what to do with traffic incoming from
>>>> network port.
>>>>
>>>> A bit more complicated picture is:
>>>>
>>>>      +----------------------------------------+
>>>>      |            DPDK Application            |
>>>>      +----+---------+---------+---------+-----+
>>>>           |PF0      |PF1      |         |
>>>>           |         |         |         |
>>>>      +--NP1-rep---NP2-rep---PF2-rep---VF-rep--+
>>>>      |                                        |
>>>>      |             NIC (switchdev)            |
>>>>      |                                        |
>>>>      +---NP1-------NP2-------PF2--------VF----+
>>>>           |         |         |         |
>>>>           |         |         |         |
>>>>       External   External    VM or     VM or
>>>>      Network 1  Network 2  whatever   whatever
>>>>
>>>> So, sometimes PF plays network port representor role (PF0,
>>>> PF1), sometimes it requires representor itself (PF2).
>>>> What to do if PF2 itself is attached to application?
>>>> Can we route traffic to it using PORT_ID action?
>>>> It has a DPDK ethdev port. It is one of the arguments why
>>>> plain PORT_ID should route to the DPDK application.
>>>
>>> OK.  This is not very different from my understanding.  The key
>>> is that there is a pair of interfaces, one is more visible than
>>> the other one.
>>>
>>>>
>>>> Of course, some applications would like to see it as
>>>> (simpler is better):
>>>>
>>>>      +----------------------------------------+
>>>>      |            DPDK Application            |
>>>>      |                                        |
>>>>      +---PF0-------PF1------PF2-rep---VF-rep--+
>>>>           |         |         |         |
>>>>           |         |         |         |
>>>>       External   External    VM or     VM or
>>>>      Network 1  Network 2  whatever   whatever
>>>>
>>>> but some, I believe, require the full picture. For example,
>>>> I'd really like to know how much traffic goes via all 8
>>>> switchdev ports and running rte_eth_stats_get(0, ...)
>>>> (i.e. DPDK port 0 attached to PF0) I'd like to get
>>>> NP1-rep stats (not NP1 stats). It will match exactly
>>>> what I see in DPDK application. It is an argument why
>>>> plain PORT_ID should be treated as a DPDK ethdev port,
>>>> not a represented (upstream) entity.
>>>
>>> The point is that if application doesn't require full picture,
>>> it should not care.  If application requires the full picture,
>>> it could take extra steps by setting extra bits.  I don't
>>> understand why we need to force all applications to care about
>>> the full picture if we can avoid that?
>>>
>>>>
>>>>> a. Workflow for "DPDK Application" to set MAC to VF:
>>>>>
>>>>> 1. "DPDK Application" calls rte_set_etheraddr("VF-rep", new_mac);
>>>>> 2.  DPDK sets MAC for "VF".
>>>>>
>>>>> b. Workflow for "DPDK Application" to set MAC to PF:
>>>>>
>>>>> 1. "DPDK Application" calls rte_set_etheraddr("PF-rep", new_mac);
>>>>> 2.  DPDK sets MAC for "PF".
>>>>>
>>>>> c. Workflow for "DPDK Application" to send packet to the external network:
>>>>>
>>>>> 1. "DPDK Application" calls rte_eth_tx_burst("PF-rep", packet);
>>>>> 2. NIC receives the packet from "PF-rep" and sends it to "PF".
>>>>> 3. packet egresses to the external network from "PF".
>>>>>
>>>>> d. Workflow for "DPDK Application" to send packet to the "VM or whatever":
>>>>>
>>>>> 1. "DPDK Application" calls rte_eth_tx_burst("VF-rep", packet);
>>>>> 2. NIC receives the packet from "VF-rep" and sends it to "VF".
>>>>> 3. "VM or whatever" receives the packet from "VF".
>>>>>
>>>>> In the two workflows above there is no rte_flow processing on step 2, i.e.,
>>>>> NIC does not perform any lookups/matches/actions, because it's not possible
>>>>> to configure actions for packets received from "PF-rep" or
>>>>> "VF-rep", as these ports don't own a port id and all the configuration
>>>>> and rte_flow actions are translated and applied to the devices that these
>>>>> ports represent ("PF" and "VF") and not to the representors themselves
>>>>> ("PF-rep" or "VF-rep").
>>>>>
>>>>> e. Workflow for the packet received on PF and PORT_ID action:
>>>>>
>>>>> 1. "DPDK Application" configures rte_flow for all packets from "PF-rep"
>>>>>     to execute PORT_ID "VF-rep".
>>>>> 2. NIC receives packet on "PF".
>>>>> 3. NIC executes 'PORT_ID "VF-rep"' action by sending packet to "VF".
>>>>> 4. "VM or whatever" receives the packet from "VF".
>>>>>
>>>>> f. Workflow for the packet received on VF and PORT_ID action:
>>>>>
>>>>> 1. "DPDK Application" configures rte_flow for all packets from "VF-rep"
>>>>>     to execute 'PORT_ID "PF-rep"'.
>>>>> 2. NIC receives packet on "VF".
>>>>> 3. NIC executes 'PORT_ID "PF-rep"' action by sending packet to "PF".
>>>>> 4. Packet egresses from the "PF" to the external network.
>>>>>
>>>>> Above is what, IMHO, the logic should look like and this matches with
>>>>> the overall switchdev design in kernel.
>>>>>
>>>>> I understand that this logic could seem flipped-over from the HW point
>>>>> of view, but it's perfectly logical from the user's perspective, because
>>>>> user should not care if the application works with representors or
>>>>> some real devices.  If application configures that all packets from port
>>>>> A should be sent to port B, user will expect that these packets will
>>>>> egress from port B once received from port A.  It would be highly
>>>>> inconvenient if the packet instead ingressed from port B back to the
>>>>> application.
>>>>>
>>>>>        DPDK Application
>>>>>          |          |
>>>>>          |          |
>>>>>       port A     port B
>>>>>          |          |
>>>>>        *****MAGIC*****
>>>>>          |          |
>>>>>     External      Another Network
>>>>>     Network       or VM or whatever
>>>>>
>>>>> It should not matter if there is an extra layer between ports A and B
>>>>> and the external network and VM.  Everything should work in exactly the
>>>>> same way, transparently for the application.
>>>>>
>>>>> The point of hardware offloading, and therefore rte_flow API, is to take
>>>>> what user does in software and make this "magically" work in hardware in
>>>>> the exactly same way.  And this will be broken if user will have to
>>>>> use different logic based on the mode the hardware works in, i.e. based on
>>>>> the fact if the application works with ports or their representors.
>>>>>
>>>>> If some specific use case requires application to know if it's an
>>>>> upstream port or the representor and demystify the internals of the switchdev
>>>>> NIC, there should be a different port id for the representor itself that
>>>>> could be used in all DPDK APIs including rte_flow API or a special bit for
>>>>> that matter.  IIRC, there was an idea to add a bit directly to the port_id
>>>>> for that purpose that will flip over behavior in all the workflow scenarios
>>>>> that I described above.
>>>>
>>>> As I understand we're basically on the same page, but just
>>>> fighting for defaults in DPDK.
>>>
>>> Yep.
>>>
>>>>
>>>>>>
>>>>>> Given these facts, it would not be quite right to just align the documentation with the de-facto action meaning assumed by OvS.
>>>>>
>>>>> It's not a "meaning assumed by OvS", it's the original design and the
>>>>> main idea of a switchdev, based on common sense.
>>>>
>>>> If so, common sense is not that common :)
>>>> My "common sense" tells me that the PORT_ID action
>>>> should route traffic to DPDK ethdev port to be
>>>> received by the DPDK application.
>>>
>>> By this logic rte_eth_tx_burst("VF-rep", packet) should send a packet
>>> to "VF-rep", i.e. this packet will be received back by the application
>>> on this same interface.  But that is counter-intuitive and this is not
>>> how it works in linux kernel if you're opening socket and sending a
>>> packet to the "VF-rep" network interface.
>>>
>>> And if rte_eth_tx_burst("VF-rep", packet) sends packet to "VF" and not
>>> to "VF-rep", then I don't understand why PORT_ID action should work in
>>> the opposite way.
>>
>> There's no contradiction here.
>>
>> In rte_eth_tx_burst(X, packet) example, "X" is the port which the application sits on and from where it sends the packet. In other words, it's the point where the packet originates from, and not where it goes to.
>>
>> At the same time, flow *action* PORT_ID (ID = "X") is clearly the opposite: it specifies where the packet will go. Port ID is the characteristic of a DPDK ethdev. So the packet goes *to* an ethdev with the given ID ("X").
>>
>> Perhaps consider action PHY_PORT: the index is the characteristic of the network port. The packet goes *to* network through this NP. And not the opposite way. Hopefully, nobody is going to claim that action PHY_PORT should mean re-injecting the packet back to the HW flow engine "as if it just came from the network port". Then why does one try to skew the PORT_ID meaning this way? PORT_ID points to an ethdev - the packet goes *to* the ethdev. Isn't that simple?
> 
> It's not simple.  And the PHY_PORT action would be hard to use from an
> application that doesn't really need to know how the underlying hardware
> is structured.

Yes, I agree. Basically, the above paragraph just tries to highlight
the existing consistent semantics of the various actions which set
traffic direction, and the inconsistency that arises if we interpret
the PORT_ID default as egress, in the terminology
suggested below. PORT_ID is a DPDK port and the default direction
should be to the DPDK port. I'll continue on the topic below.
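
To make the two readings of PORT_ID concrete, here is a minimal sketch
in plain C. It is a toy model, not DPDK code: the names toy_reading,
toy_port and toy_port_id_target are hypothetical and exist only for
illustration. It shows nothing more than which end of the "wire"
receives a redirected packet under each interpretation.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in DPDK. */
enum toy_reading {
	TOY_READING_ETHDEV,   /* PORT_ID delivers to the ethdev itself */
	TOY_READING_UPSTREAM  /* PORT_ID delivers to the represented entity */
};

/* One "wire": a DPDK ethdev on one end, the entity it represents on the other. */
struct toy_port {
	const char *ethdev_name;   /* e.g. "VF-rep" */
	const char *upstream_name; /* e.g. "VF" */
};

/* Name of the endpoint that receives a packet redirected by PORT_ID. */
static const char *
toy_port_id_target(const struct toy_port *port, enum toy_reading reading)
{
	assert(port != NULL);
	return reading == TOY_READING_ETHDEV ?
		port->ethdev_name : port->upstream_name;
}
```

For a port described as { "VF-rep", "VF" }, the ethdev reading selects
"VF-rep" (the application receives the packet), while the upstream
reading selects "VF" (the VM or whatever sits behind it receives it).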

>>
>>>
>>> Application receives a packet from port A and puts it to the port B.
>>> TC rule to forward packets from port A to port B will provide same result.
>>> So, why the similar rte_flow should do the opposite and send the packet
>>> back to the application?
>>
>> Please see above. Action VF sends the packet *to* VF and *not* to the upstream entity which this VF is connected to. Action PHY_PORT sends the packet *to* network and does *not* make it appear as if it entered the NIC from the network side. Action QUEUE sends the packet *to* the Rx queue and does *not* make it appear as if it just egressed from the Tx queue with the same index. Action PORT_ID sends the packet *to* an ethdev with the given ID and *not* to the upstream entity which this ethdev is connected to. It's just that transparent. It's just "do what the name suggests".
>>
>> Yes, an application (say, OvS) might have a high level design which perceives the "high-level" ports plugged to it as a "patch-panel" of sorts. Yes, when a high-level mechanism/logic of such application invokes a *datapath-unaware* wrapper to offload a rule and request that the packet be delivered to the given "high-level" port, it therefore requests that the packet be delivered to the opposite end of the wire. But then the lower-level datapath-specific (DPDK) handler kicks in. Since it's DPDK-specific, it knows *everything* about the underlying flow library it works with. In particular it knows that action PORT_ID delivers the packet to an *ethdev*, at the same time, it knows that the upper caller (high-level logic) for sure wants the opposite, so it (the lower-level DPDK component) sets the "upstream" bit when translating the higher-level port action to an RTE action "PORT_ID".
> 
> I don't understand that.  DPDK user is the application and DPDK
> doesn't translate anything, application creates PORT_ID action
> directly and passes it to DPDK.  So, you're forcing the *end user*
> (a.k.a. application) to know *everything* about the hardware the
> application runs on.  Of course, it gets this information about
> the hardware from the DPDK library (otherwise this would be
> completely ridiculous), but this doesn't change the fact that it's
> the application that needs to think about the structure of the
> underlying hardware while it's absolutely not necessary in vast
> majority of cases.

Yes, that's all true, but I think that specification of the
direction is *not* diving too deep into hardware details.

For DPDK I think it is important to have consistent semantics
and interpretation of input parameters. That will make the
library easier to use and make it less error-prone.

>> Then the resulting action is correct, and the packet indeed doesn't end up in the ethdev but goes
>> to the opposite end of the wire. That's it.
>>
>> I have an impression that for some reason people are tempted to ignore the two nominal "layers" in such applications (generic, or high-level one and DPDK-specific one) thus trying to align DPDK logic with high-level logic of the applications. That's simply not right. What I'm trying to point out is that it *is* the true job of DPDK-specific data path handler in such application - to properly translate generic flow actions to DPDK-specific ones. It's the duty of DPDK component in such applications to be aware of the genuine meaning of action PORT_ID.
> 
> The reason is very simple: if an application doesn't need to know the
> full picture (how the hardware is structured inside) it shouldn't
> care and it's a duty of DPDK to abstract the hardware and provide
> programming interfaces that could be easily used by application
> developers who are not experts in the architecture of a hardware
> that they want to use (basically, application developer should not
> care at all in most cases on which hardware application will work).
> It's basically in almost every single DPDK API, EAL means environment
> *abstraction* layer, not an environment *proxy/passthrough* layer.
> We can't assume that DPDK-specific layers in applications are always
> written by hardware experts and, IMHO, DPDK should not force users
> to learn underlying structures of switchdev devices.  They might not
> even have such devices for testing, so the application that works
> on simple NICs should be able to run correctly on switchdev-capable
> NICs too.
> 
> I think that "***MAGIC***" abstraction (see one of my previous ascii
> graphics) is very important here.

I've answered it above. Specification of the direction is *not*
diving too deep into HW details.

>>
>> This way, mixing up the two meanings is ruled out.
> 
> Looking closer at how tc flower rules are configured, I noticed that
> 'mirred' action requires user to specify the direction in which
> the packet will appear on the destination port.  And I suppose
> this will solve your issues with PORT_ID action without exposing
> the "full picture" of the architecture of an underlying hardware.
> 
> It looks something like this:
> 
>   tc filter add dev A ... action mirred egress redirect dev B
>                                         ^^^^^^
> 
> Direction could be 'ingress' or 'egress', so the packet will
> ingress from the port B back to application/kernel or it will
> egress from this port to the external network.  Same thing
> could be implemented in rte_flow like this:
> 
>   flow create A ingress transfer pattern eth / end
>       action port_id id B egress / end
> 
> So, application that needs to receive the packet from the port B
> will specify 'ingress', others that just want to send packet from
> the port B will specify 'egress'.  Will that work for you?
> 
> (BTW, 'ingress' seems to be not implemented in TC and that kind
> of suggests that it's not very useful at least for kernel use cases)
> 
> One might say that it's actually the same what is proposed in
> this RFC, but I will argue that 'ingress/egress' schema doesn't
> break the "***MAGIC***" abstraction because user is not obligated
> to know the structure of the underlying hardware, while 'upstream'
> flag is something very unclear from that perspective and makes
> no sense for plain ports (non-representors).

I think it is really an excellent idea and suggested
terminology looks very good to me. However, we should
agree on technical details on API level (not testpmd
commands). I think we have 4 options:

A. Add "ingress" bit with "egress" as unset meaning.
   Yes, that's the current behaviour assumed and
   used by OvS and implemented in some PMDs.
   My problem with it is that it is, IMHO, an inconsistent
   default value (as explained above).

B. Add "egress" bit with "ingress" as unset meaning.
   Basically it is what is suggested in the RFC, but
   the problem with the suggestion is the silent breakage
   of existing users (let's put aside whether it is
   correct usage or misuse). It is still a fact.

C. Encode above in ethdev port ID MSB.
   The problem with this solution is that the encoding
   makes sense for representors, but the problem
   exists for non-representor ports as well.
   I have no good ideas on terminology in the case
   if we try to solve it for non-representors.

D. Break API and ABI and add enum with unset(default)/
   ingress/egress members to force the application to
   specify the direction.

It is unclear what we'll do in the case of A, B and D
if we encode representor in port ID MSB in any case.
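
As an illustration of what option D could look like at the API level,
here is a minimal sketch in C. All names in it (sketch_port_dir,
sketch_action_port_id, sketch_action_port_id_validate) are hypothetical
and are not part of rte_flow; the direction names follow the
ingress/egress terminology suggested above. The only point is that the
zero value means "unset", so a caller who forgets to choose a direction
gets an error instead of a silent default.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for illustration; this is not the rte_flow API. */
enum sketch_port_dir {
	SKETCH_PORT_DIR_UNSET = 0, /* default: rejected, caller must choose */
	SKETCH_PORT_DIR_INGRESS,   /* packet ingresses from the port to the app */
	SKETCH_PORT_DIR_EGRESS     /* packet egresses from the port, away from the app */
};

struct sketch_action_port_id {
	uint32_t id;              /* DPDK ethdev port ID */
	enum sketch_port_dir dir; /* must be set explicitly under option D */
};

/* Return 0 if a direction was chosen, -EINVAL otherwise. */
static int
sketch_action_port_id_validate(const struct sketch_action_port_id *conf)
{
	assert(conf != NULL);
	switch (conf->dir) {
	case SKETCH_PORT_DIR_INGRESS:
	case SKETCH_PORT_DIR_EGRESS:
		return 0;
	default:
		return -EINVAL;
	}
}
```

A zero-initialized configuration is rejected here, which is exactly the
"enforce" part of option D; options A and B differ only in which of the
two directions the zero value would silently select.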

>>>>>> On 01/06/2021 15:10, Ilya Maximets wrote:
>>>>>>> On 6/1/21 1:14 PM, Ivan Malov wrote:
>>>>>>>> By its very name, action PORT_ID means that packets hit an ethdev with the
>>>>>>>> given DPDK port ID. At least the current comments don't state the opposite.
>>>>>>>> That said, since port representors had been adopted, applications like OvS
>>>>>>>> have been misusing the action. They misread its purpose as sending packets
>>>>>>>> to the opposite end of the "wire" plugged to the given ethdev, for example,
>>>>>>>> redirecting packets to the VF itself rather than to its representor ethdev.
>>>>>>>> Another example: OvS relies on this action with the admin PF's ethdev port
>>>>>>>> ID specified in it in order to send offloaded packets to the physical port.
>>>>>>>>
>>>>>>>> Since there might be applications which use this action in its valid sense,
>>>>>>>> one can't just change the documentation to greenlight the opposite meaning.
>>>>>>>> This patch adds an explicit bit to the action configuration which will let
>>>>>>>> applications, depending on their needs, leverage the two meanings properly.
>>>>>>>> Applications like OvS, as well as PMDs, will have to be corrected when the
>>>>>>>> patch has been applied. But the improved clarity of the action is worth it.
>>>>>>>>
>>>>>>>> The proposed change is not the only option. One could avoid changes in OvS
>>>>>>>> and PMDs if the new configuration field had the opposite meaning, with the
>>>>>>>> action itself meaning delivery to the represented port and not to DPDK one.
>>>>>>>> Alternatively, one could define a brand new action with the said behaviour.
>>>>>>>
>>>>>>> We had already very similar discussions regarding the understanding of what
>>>>>>> the representor really is from the DPDK API's point of view, and the last
>>>>>>> time, IIUC, it was concluded by a tech. board that representor should be
>>>>>>> a "ghost of a VF", i.e. DPDK APIs should apply configuration by default to
>>>>>>> VF and not to the representor device:
>>>>>>>     https://patches.dpdk.org/project/dpdk/cover/20191029185051.32203-1-thomas@monjalon.net/#104376
>>>>>>> This wasn't enforced though, IIUC, for existing code and semantics is still mixed.
>>>>>>>
>>>>>>> I still think that configuration should be applied to VF, and the same applies
>>>>>>> to rte_flow API.  IMHO, average application should not care if device is
>>>>>>> a VF itself or its representor.  Everything should work exactly the same.
>>>>>>> I think this matches with the original idea/design of the switchdev functionality
>>>>>>> in the linux kernel and also matches with how the average user thinks about
>>>>>>> representor devices.
>>>>>>>
>>>>>>> If some specific use-case requires to distinguish VF from the representor,
>>>>>>> there should probably be a separate special API/flag for that.
>>>>>>>
>>>>>>> Best regards, Ilya Maximets.
>>>>>>>
>>>>>>
>>>>
>>


^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-06-03  2:03  3%     ` Stephen Hemminger
@ 2021-06-03  4:59  0%       ` Gregory Etelson
  0 siblings, 0 replies; 200+ results
From: Gregory Etelson @ 2021-06-03  4:59 UTC (permalink / raw)
  To: Stephen Hemminger, Min Hu (Connor)
  Cc: Morten Brørup, dev, Matan Azrad, Ori Kam, Raslan Darawsheh,
	Bernard Iremonger, Olivier Matz

> On Thu, 3 Jun 2021 08:58:42 +0800
> "Min Hu (Connor)" <humin29@huawei.com> wrote:
> 
> > Hi, Morten and all,
> >       I have a question which has been bothering me for a long time.
> >       What's the difference between API and ABI?
> >       Why does this patch not break the ABI, but (maybe) break the API?
> >
> >       Hope for your reply, thanks.
> 
> The API being fixed means that a user can confidently recompile their source
> code and it will compile without any new errors.
> 
> The ABI guarantee means that an application dynamically linked to DPDK
> shared libraries will work without problems if the DPDK libraries are
> updated.

Hello Stephen,

Thank you for the clarification.

According to the above statements, the patch introduces an alternative
access method to the IPv4 version & ihl fields without breaking the existing API.

Regards,
Gregory

^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-06-03  0:58  4%   ` Min Hu (Connor)
@ 2021-06-03  2:03  3%     ` Stephen Hemminger
  2021-06-03  4:59  0%       ` Gregory Etelson
  0 siblings, 1 reply; 200+ results
From: Stephen Hemminger @ 2021-06-03  2:03 UTC (permalink / raw)
  To: Min Hu (Connor)
  Cc: Morten Brørup, Gregory Etelson, dev, matan, orika, rasland,
	Bernard Iremonger, Olivier Matz

On Thu, 3 Jun 2021 08:58:42 +0800
"Min Hu (Connor)" <humin29@huawei.com> wrote:

> Hi, Morten and all,
> 	I have a question which has been bothering me for a long time.
> 	What's the difference between API and ABI?
> 	Why does this patch not break the ABI, but (maybe) break the API?
> 	
> 	Hope for your reply, thanks.

The API being fixed means that a user can confidently recompile their source
code and it will compile without any new errors.

The ABI guarantee means that an application dynamically linked to DPDK
shared libraries will work without problems if the DPDK libraries are updated.
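
A small sketch in C may help separate the two notions. The structs
below are simplified, made-up stand-ins (old_hdr, new_hdr), not the real
rte_ipv4_hdr. Wrapping the first byte in an anonymous union keeps the
size and member offsets identical, which is the ABI side. On the API
side, old positional initializers may need extra braces to compile
without new diagnostics (with GCC, typically a missing-braces warning,
which turns into an error under -Werror).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Made-up stand-ins for the first fields of a header, before and after. */
struct old_hdr {
	uint8_t version_ihl;
	uint8_t type_of_service;
	uint16_t total_length;
};

struct new_hdr {
	union {
		uint8_t version_ihl;   /* old name keeps working */
		struct {
			uint8_t lo:4;  /* nibble order is compiler-specific */
			uint8_t hi:4;
		};
	};
	uint8_t type_of_service;
	uint16_t total_length;
};

/* ABI view: binary layout is unchanged, so old binaries keep working. */
static_assert(sizeof(struct old_hdr) == sizeof(struct new_hdr),
	      "size changed");
static_assert(offsetof(struct old_hdr, type_of_service) ==
	      offsetof(struct new_hdr, type_of_service), "offset changed");
static_assert(offsetof(struct old_hdr, total_length) ==
	      offsetof(struct new_hdr, total_length), "offset changed");

/* API view: the old positional initializer ... */
static uint8_t
old_first_byte(void)
{
	struct old_hdr h = { 0x45, 0, 0 };
	return h.version_ihl;
}

/* ... gains a brace level (and here a designator) for the new union. */
static uint8_t
new_first_byte(void)
{
	struct new_hdr h = { { .version_ihl = 0x45 }, 0, 0 };
	return h.version_ihl;
}
```

Code that only reads or writes h.version_ihl compiles unchanged against
either struct; only aggregate initializers are affected.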

^ permalink raw reply	[relevance 3%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-05-27 15:56  3% ` Morten Brørup
  2021-05-28 10:20  0%   ` Ananyev, Konstantin
@ 2021-06-03  0:58  4%   ` Min Hu (Connor)
  2021-06-03  2:03  3%     ` Stephen Hemminger
  1 sibling, 1 reply; 200+ results
From: Min Hu (Connor) @ 2021-06-03  0:58 UTC (permalink / raw)
  To: Morten Brørup, Gregory Etelson, dev
  Cc: matan, orika, rasland, Bernard Iremonger, Olivier Matz

Hi, Morten and all,
	I have a question which has been bothering me for a long time.
	What's the difference between API and ABI?
	Why does this patch not break the ABI, but (maybe) break the API?
	
	Hope for your reply, thanks.

On 2021/5/27 23:56, Morten Brørup wrote:
>> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gregory Etelson
>> Sent: Thursday, 27 May 2021 17.29
> and version fields
>>
>> RTE IPv4 header definition combines the `version' and `ihl'  fields
>> into a single structure member.
>> This patch introduces dedicated structure members for both `version'
>> and `ihl' IPv4 fields. Separated header fields definitions allow to
>> create simplified code to match on the IHL value in a flow rule.
>> The original `version_ihl' structure member is kept for backward
>> compatibility.
>>
>> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
>> ---
>>   app/test/test_flow_classify.c |  8 ++++----
>>   lib/net/rte_ip.h              | 16 +++++++++++++++-
>>   2 files changed, 19 insertions(+), 5 deletions(-)
>>
>> diff --git a/app/test/test_flow_classify.c
>> b/app/test/test_flow_classify.c
>> index 951606f248..4f64be5357 100644
>> --- a/app/test/test_flow_classify.c
>> +++ b/app/test/test_flow_classify.c
>> @@ -95,7 +95,7 @@ static struct rte_acl_field_def
>> ipv4_defs[NUM_FIELDS_IPV4] = {
>>    *  dst mask 255.255.255.00 / udp src is 32 dst is 33 / end"
>>    */
>>   static struct rte_flow_item_ipv4 ipv4_udp_spec_1 = {
>> -	{ 0, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
>> +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
>>   	  RTE_IPV4(2, 2, 2, 3), RTE_IPV4(2, 2, 2, 7)}
>>   };
>>   static const struct rte_flow_item_ipv4 ipv4_mask_24 = {
>> @@ -131,7 +131,7 @@ static struct rte_flow_item  end_item = {
>> RTE_FLOW_ITEM_TYPE_END,
>>    *  dst mask 255.255.255.00 / tcp src is 16 dst is 17 / end"
>>    */
>>   static struct rte_flow_item_ipv4 ipv4_tcp_spec_1 = {
>> -	{ 0, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
>> +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
>>   	  RTE_IPV4(1, 2, 3, 4), RTE_IPV4(5, 6, 7, 8)}
>>   };
>>
>> @@ -150,8 +150,8 @@ static struct rte_flow_item  tcp_item_1 = {
>> RTE_FLOW_ITEM_TYPE_TCP,
>>    *  dst mask 255.255.255.00 / sctp src is 16 dst is 17/ end"
>>    */
>>   static struct rte_flow_item_ipv4 ipv4_sctp_spec_1 = {
>> -	{ 0, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0, RTE_IPV4(11, 12, 13, 14),
>> -	RTE_IPV4(15, 16, 17, 18)}
>> +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0,
>> +	RTE_IPV4(11, 12, 13, 14), RTE_IPV4(15, 16, 17, 18)}
>>   };
>>
>>   static struct rte_flow_item_sctp sctp_spec_1 = {
>> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
>> index 4b728969c1..684bb028b2 100644
>> --- a/lib/net/rte_ip.h
>> +++ b/lib/net/rte_ip.h
>> @@ -38,7 +38,21 @@ extern "C" {
>>    * IPv4 Header
>>    */
>>   struct rte_ipv4_hdr {
>> -	uint8_t  version_ihl;		/**< version and header length */
>> +	__extension__
>> +	union {
>> +		uint8_t version_ihl;    /**< version and header length */
>> +		struct {
>> +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
>> +			uint8_t ihl:4;
>> +			uint8_t version:4;
>> +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
>> +			uint8_t version:4;
>> +			uint8_t ihl:4;
>> +#else
>> +#error "setup endian definition"
>> +#endif
>> +		};
>> +	};
>>   	uint8_t  type_of_service;	/**< type of service */
>>   	rte_be16_t total_length;	/**< length of packet */
>>   	rte_be16_t packet_id;		/**< packet ID */
>> --
>> 2.31.1
>>
> 
> This does not break the ABI, but it is debatable whether it breaks the API, given the structure initialization changes required in test_flow_classify.c. I think this patch is an improvement, and that such structure modifications should be generally accepted, so:
> 
> Acked-by: Morten Brørup <mb@smartsharesystems.com>
> 
> .
> 
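
For readers unfamiliar with the combined byte, here is a minimal sketch
in C of how the new bit-fields relate to version_ihl. The struct and
helper names (sketch_ipv4_first_byte, sketch_ihl, sketch_version,
sketch_hdr_len) are made up for this example, and the endianness test
uses the GCC/Clang __BYTE_ORDER__ macro as a stand-in for the
RTE_BYTE_ORDER check in the patch. Bit-field layout is
implementation-defined, so the nibble order shown is what GCC and Clang
are assumed to produce.

```c
#include <assert.h>
#include <stdint.h>

/* Made-up stand-in for the first byte of the IPv4 header. */
struct sketch_ipv4_first_byte {
	union {
		uint8_t version_ihl; /* combined: version in high nibble */
		struct {
#if defined(__BYTE_ORDER__) && __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
			uint8_t version:4;
			uint8_t ihl:4;
#else
			uint8_t ihl:4;
			uint8_t version:4;
#endif
		};
	};
};

/* The union must still occupy exactly one byte of the header. */
static_assert(sizeof(struct sketch_ipv4_first_byte) == 1, "unexpected size");

/* Mask-and-shift equivalents that work on the combined byte. */
static uint8_t
sketch_ihl(uint8_t version_ihl)
{
	return version_ihl & 0x0f;
}

static uint8_t
sketch_version(uint8_t version_ihl)
{
	return version_ihl >> 4;
}

/* IHL counts 32-bit words, so the header length in bytes is ihl * 4. */
static unsigned int
sketch_hdr_len(uint8_t version_ihl)
{
	return sketch_ihl(version_ihl) * 4u;
}
```

For the common first byte 0x45, both access paths give version 4 and
IHL 5, i.e. a 20-byte header without options.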

^ permalink raw reply	[relevance 4%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-05-31 11:10  0%           ` Gregory Etelson
@ 2021-06-02  9:51  0%             ` Gregory Etelson
  2021-06-10  4:10  0%               ` Gregory Etelson
  0 siblings, 1 reply; 200+ results
From: Gregory Etelson @ 2021-06-02  9:51 UTC (permalink / raw)
  To: Morten Brørup, Iremonger, Bernard, dev
  Cc: Matan Azrad, Ori Kam, Raslan Darawsheh, Olivier Matz, Thomas Monjalon

Hello,

Are there any other concerns about this patch?
Please comment.

Regards,
Gregory

> -----Original Message-----
> From: Gregory Etelson
> Sent: Monday, May 31, 2021 14:10
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Morten Brørup
> <mb@smartsharesystems.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@nvidia.com>; Ori Kam <orika@nvidia.com>;
> Raslan Darawsheh <rasland@nvidia.com>; Iremonger, Bernard
> <bernard.iremonger@intel.com>; Olivier Matz <olivier.matz@6wind.com>
> Subject: RE: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
> 
> > > > > > > RTE IPv4 header definition combines the `version' and `ihl'
> > > > > > > fields into a single structure member.
> > > > > > > This patch introduces dedicated structure members for both
> > > > > `version'
> > > > > > > and `ihl' IPv4 fields. Separated header fields definitions
> > > > > > > allow to create simplified code to match on the IHL value in
> > > > > > > a flow
> > rule.
> > > > > > > The original `version_ihl' structure member is kept for
> > > > > > > backward compatibility.
> > > > > > >
> > > > > > > Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> > > > > > > ---
> > > > > > >  app/test/test_flow_classify.c |  8 ++++----
> > > > > > >  lib/net/rte_ip.h              | 16 +++++++++++++++-
> > > > > > >  2 files changed, 19 insertions(+), 5 deletions(-)
> > > > > > >
> > > > > > > > diff --git a/app/test/test_flow_classify.c b/app/test/test_flow_classify.c
> > > > > > > > index 951606f248..4f64be5357 100644
> > > > > > > > --- a/app/test/test_flow_classify.c
> > > > > > > > +++ b/app/test/test_flow_classify.c
> > > > > > > > @@ -95,7 +95,7 @@ static struct rte_acl_field_def ipv4_defs[NUM_FIELDS_IPV4] = {
> > > > > > >   *  dst mask 255.255.255.00 / udp src is 32 dst is 33 / end"
> > > > > > >   */
> > > > > > >  static struct rte_flow_item_ipv4 ipv4_udp_spec_1 = {
> > > > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > > > > > >     RTE_IPV4(2, 2, 2, 3), RTE_IPV4(2, 2, 2, 7)}
> > > > > > > >  };
> > > > > > > >  static const struct rte_flow_item_ipv4 ipv4_mask_24 = {
> > > > > > > > @@ -131,7 +131,7 @@ static struct rte_flow_item  end_item = { RTE_FLOW_ITEM_TYPE_END,
> > > > > > >   *  dst mask 255.255.255.00 / tcp src is 16 dst is 17 / end"
> > > > > > >   */
> > > > > > >  static struct rte_flow_item_ipv4 ipv4_tcp_spec_1 = {
> > > > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > > > > > >     RTE_IPV4(1, 2, 3, 4), RTE_IPV4(5, 6, 7, 8)}
> > > > > > > >  };
> > > > > > > >
> > > > > > > > @@ -150,8 +150,8 @@ static struct rte_flow_item  tcp_item_1 = { RTE_FLOW_ITEM_TYPE_TCP,
> > > > > > >   *  dst mask 255.255.255.00 / sctp src is 16 dst is 17/ end"
> > > > > > >   */
> > > > > > >  static struct rte_flow_item_ipv4 ipv4_sctp_spec_1 = {
> > > > > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0, RTE_IPV4(11, 12, 13, 14),
> > > > > > > - RTE_IPV4(15, 16, 17, 18)}
> > > > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0,
> > > > > > > + RTE_IPV4(11, 12, 13, 14), RTE_IPV4(15, 16, 17, 18)}
> > > > > > >  };
> > > > > > >
> > > > > > > >  static struct rte_flow_item_sctp sctp_spec_1 = {
> > > > > > > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > > > > > > > index 4b728969c1..684bb028b2 100644
> > > > > > > --- a/lib/net/rte_ip.h
> > > > > > > +++ b/lib/net/rte_ip.h
> > > > > > > @@ -38,7 +38,21 @@ extern "C" {
> > > > > > >   * IPv4 Header
> > > > > > >   */
> > > > > > >  struct rte_ipv4_hdr {
> > > > > > > - uint8_t  version_ihl;           /**< version and header length */
> > > > > > > + __extension__
> > > > > > > + union {
> > > > > > > +         uint8_t version_ihl;    /**< version and header length */
> > > > > > > +         struct {
> > > > > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > > > > +                 uint8_t ihl:4;
> > > > > > > +                 uint8_t version:4; #elif RTE_BYTE_ORDER ==
> > > > > > > +RTE_BIG_ENDIAN
> > > > > > > +                 uint8_t version:4;
> > > > > > > +                 uint8_t ihl:4; #else #error "setup endian
> > > > > > > +definition"
> > > > > > > +#endif
> > > > > > > +         };
> > > > > > > + };
> > > > > > >   uint8_t  type_of_service;       /**< type of service */
> > > > > > >   rte_be16_t total_length;        /**< length of packet */
> > > > > > >   rte_be16_t packet_id;           /**< packet ID */
> > > > > > > --
> > > > > > > 2.31.1
> > > > > > >
> > > > > >
> > > > > > > This does not break the ABI, but it could be discussed if it
> > > > > > > breaks the API due to the required structure initialization
> > > > > > > changes shown in test_flow_classify.c.
> > > > >
> > > > > Yep, I guess it might be classified as API change.
> > > > > Another thing that concerns me - it is not the only place in
> > > > > IPv4 header when we unite multiple bit-fields into one field:
> > > > > type_of_service, fragment_offset.
> > > > > If we start splitting ipv4 fields into actual bitfields, I
> > > > > suppose we'll end-up splitting these ones too.
> > > > > But I am not sure it will pay off - as compiler not always
> > > > > generates optimal code for reading/updating bitfields.
> > > > > Did you consider just adding extra macros to simplify access to
> > > > > these fields (like RTE_IPV4_HDR_(GET_SET)_*), instead?
> > > > >
> > > >
> > > > Let's please not introduce accessor macros for bitfields. If we
> > > > don't introduce bitfields like these, I would rather stick with
> > > > the current _MASK, _SHIFT and _FLAG defines.
> > > >
> > > > Yes, this change will lead to the introduction of more bitfields,
> > > > both here and in other places. We already accepted it in the eCPRI
> > > > structure (/lib/net/rte_ecpri.h), so why not just generally accept it.
> > > >
> > > > Are modern compilers really worse at handling a bitfield defined
> > > > like this, compared to handling a single uint8_t with hand coding?
> > > > I consider your concern very important, so I'm only asking if it
> > > > is still relevant, to avoid making decisions based on past
> > > > experience that might be outdated. (I admit to falling into that
> > > > trap myself, once in a while.)
> > > >
> > >
> > > > I compared x86 code generated with gcc-9, gcc-10 and clang-10 for
> > > > these 2 functions:
> > > > void test_ipv4_hdr_byte(struct rte_ipv4_hdr *h, uint8_t version,
> > > >                         uint8_t ihl) {
> > > >       h->version_ihl = ((version & 0x0f) << 4) | (ihl & 0x0f);
> > > > }
> > > > void test_ipv4_hdr_bits(struct rte_ipv4_hdr *h, uint8_t version,
> > > >                         uint8_t ihl) {
> > > >       h->version = version & 0x0f;
> > > >       h->ihl = ihl & 0x0f;
> > > > }
> > > > meson configuration flags: --default-library=static
> > > > --buildtype=release
> > > > Each compiler produced identical code for both functions.
> >
> > > For that particular case (2 bit-fields packed tightly into one byte)
> > > compilers usually perform quite well. At least I never saw issues for
> > > such case.
> > > Bit-fields that do cross byte boundaries - that might be a trouble.
> >
> 
> Can we keep both implementations, the combined byte and the bit-field,
> grouped into a union ? In that case application or PMD can select access
> method that fits.
> 
> > >
> > >
> > > > > > > I think this patch is an improvement, and that such structure
> > > > > > > modifications should be generally accepted, so:
> > > > > >
> > > > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > > >


^ permalink raw reply	[relevance 0%]

* [dpdk-dev] [PATCH v7 00/10] eal: Add EAL API for threading
  @ 2021-06-01 20:55  2% ` Narcisa Ana Maria Vasile
  2021-06-04 23:38  2%   ` [dpdk-dev] [PATCH v8 " Narcisa Ana Maria Vasile
  0 siblings, 1 reply; 200+ results
From: Narcisa Ana Maria Vasile @ 2021-06-01 20:55 UTC (permalink / raw)
  To: dev, thomas, dmitry.kozliuk, khot, navasile, dmitrym, roretzla,
	talshn, ocardona
  Cc: bruce.richardson, david.marchand, pallavi.kadam

From: Narcisa Vasile <navasile@microsoft.com>

EAL thread API

**Problem Statement**
DPDK currently uses the pthread interface to create and manage threads.
Windows does not support the POSIX thread programming model, so it currently
relies on a header file that hides the Windows calls under
pthread matched interfaces. Given that EAL should isolate the environment
specifics from the applications and libraries and mediate
all the communication with the operating systems, a new EAL interface
is needed for thread management.

**Goals**
* Introduce a generic EAL API for threading support that will remove
  the current Windows pthread.h shim.
* Replace references to pthread_* across the DPDK codebase with the new
  RTE_THREAD_* API.
* Allow users to choose between using the RTE_THREAD_* API or a
  3rd party thread library through a configuration option.

**Design plan**
New API main files:
* rte_thread.h (librte_eal/include)
* rte_thread_types.h (librte_eal/include)
* rte_thread_windows_types.h (librte_eal/windows/include)
* rte_thread.c (librte_eal/windows)
* rte_thread.c (librte_eal/common)

For flexibility, the user is offered the option of either using the RTE_THREAD_* API or
a 3rd party thread library, through a meson flag “use_external_thread_lib”.
By default, this flag is set to FALSE, which means Windows libraries and applications
will use the RTE_THREAD_* API for managing threads.

If compiling on Windows and the “use_external_thread_lib” is *not* set,
the following files will be parsed: 
* include/rte_thread.h
* windows/include/rte_thread_windows_types.h
* windows/rte_thread.c
In all other cases, the compilation/parsing includes the following files:
* include/rte_thread.h 
* include/rte_thread_types.h
* common/rte_thread.c

**A schematic example of the design**
--------------------------------------------------
lib/librte_eal/include/rte_thread.h
int rte_thread_create();

lib/librte_eal/common/rte_thread.c
int rte_thread_create() 
{
	return pthread_create();
}

lib/librte_eal/windows/rte_thread.c
int rte_thread_create() 
{
	return CreateThread();
}

lib/librte_eal/windows/meson.build
if get_option('use_external_thread_lib')
	sources += 'librte_eal/common/rte_thread.c'
else
	sources += 'librte_eal/windows/rte_thread.c'
endif
-----------------------------------------------------

**Thread attributes**

When or after a thread is created, specific characteristics of the thread
can be adjusted. Given that the thread characteristics that are of interest
for DPDK applications are affinity and priority, the following structure
that represents thread attributes has been defined:

typedef struct
{
	enum rte_thread_priority priority;
	rte_cpuset_t cpuset;
} rte_thread_attr_t;

The *rte_thread_create()* function can optionally receive an rte_thread_attr_t
object that will cause the thread to be created with the affinity and priority
described by the attributes object. If no rte_thread_attr_t is passed
(parameter is NULL), the default affinity and priority are used.
An rte_thread_attr_t object can also be set to the default values
by calling *rte_thread_attr_init()*.
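
As a rough illustration of the attribute handling described above, here is a
minimal standalone sketch. It does not use the types from this series: every
sketch_* name, the fixed-size CPU set and the choice of "empty set means no
explicit affinity" are assumptions made only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the priority enum and rte_cpuset_t. */
enum sketch_thread_priority {
	SKETCH_THREAD_PRIORITY_NORMAL,
	SKETCH_THREAD_PRIORITY_REALTIME_CRITICAL,
};

#define SKETCH_MAX_CPUS 128

typedef struct {
	unsigned char bits[SKETCH_MAX_CPUS / 8];
} sketch_cpuset_t;

typedef struct {
	enum sketch_thread_priority priority;
	sketch_cpuset_t cpuset;
} sketch_thread_attr_t;

/* Reset to defaults: normal priority, empty CPU set (assumed default). */
static int
sketch_thread_attr_init(sketch_thread_attr_t *attr)
{
	if (attr == NULL)
		return EINVAL;
	memset(attr, 0, sizeof(*attr));
	attr->priority = SKETCH_THREAD_PRIORITY_NORMAL;
	return 0;
}

/* Update only the priority field of the attributes object. */
static int
sketch_thread_attr_set_priority(sketch_thread_attr_t *attr,
				enum sketch_thread_priority prio)
{
	if (attr == NULL)
		return EINVAL;
	if (prio != SKETCH_THREAD_PRIORITY_NORMAL &&
	    prio != SKETCH_THREAD_PRIORITY_REALTIME_CRITICAL)
		return EINVAL;
	attr->priority = prio;
	return 0;
}

/* Add one CPU to the affinity set stored in the attributes object. */
static int
sketch_thread_attr_set_cpu(sketch_thread_attr_t *attr, unsigned int cpu)
{
	if (attr == NULL || cpu >= SKETCH_MAX_CPUS)
		return EINVAL;
	attr->cpuset.bits[cpu / 8] |= (unsigned char)(1u << (cpu % 8));
	return 0;
}

/* Return non-zero if the given CPU is present in the set. */
static int
sketch_cpuset_isset(const sketch_cpuset_t *set, unsigned int cpu)
{
	return (cpu < SKETCH_MAX_CPUS) &&
	       (((set->bits[cpu / 8] >> (cpu % 8)) & 1) != 0);
}

/* Usage: build attributes for a realtime thread pinned to CPU 3. */
static void
sketch_attr_example(void)
{
	sketch_thread_attr_t attr;
	int rc;

	rc = sketch_thread_attr_init(&attr);
	rc |= sketch_thread_attr_set_priority(&attr,
			SKETCH_THREAD_PRIORITY_REALTIME_CRITICAL);
	rc |= sketch_thread_attr_set_cpu(&attr, 3);
	assert(rc == 0);
	assert(sketch_cpuset_isset(&attr.cpuset, 3));
	assert(!sketch_cpuset_isset(&attr.cpuset, 4));
	(void)rc;
}
```

In this sketch a caller that does not care about affinity or priority would
simply pass NULL instead of an attributes object, matching the behaviour
described for rte_thread_create().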

*Priority* is represented through an enum that currently advertises
two values for priority:
	- RTE_THREAD_PRIORITY_NORMAL
	- RTE_THREAD_PRIORITY_REALTIME_CRITICAL
The enum can be extended to allow for multiple priority levels.
rte_thread_set_priority      - sets the priority of a thread
rte_thread_attr_set_priority - updates an rte_thread_attr_t object
                               with a new value for priority

The user can choose the thread priority through an EAL parameter
when starting an application. If the EAL parameter is not used,
the per-platform default value for thread priority is used.
Otherwise, the administrator can select one of the available options:
 --thread-prio normal
 --thread-prio realtime

Example:
./dpdk-l2fwd -l 0-3 -n 4 --thread-prio normal -- -q 8 -p ffff
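
For illustration only, the value given to --thread-prio could be turned into
a priority roughly as in the sketch below. The function and enum names are
hypothetical and are not taken from the patches; only the two option strings
("normal" and "realtime") come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the two advertised priority levels. */
enum sketch_prio {
	SKETCH_PRIO_NORMAL,
	SKETCH_PRIO_REALTIME_CRITICAL,
};

/*
 * Parse the argument of --thread-prio.
 * Returns 0 and stores the result in *out, or -1 for an unknown value.
 */
static int
sketch_parse_thread_prio(const char *arg, enum sketch_prio *out)
{
	if (arg == NULL || out == NULL)
		return -1;
	if (strcmp(arg, "normal") == 0) {
		*out = SKETCH_PRIO_NORMAL;
		return 0;
	}
	if (strcmp(arg, "realtime") == 0) {
		*out = SKETCH_PRIO_REALTIME_CRITICAL;
		return 0;
	}
	return -1;
}

/* Usage: a known value is accepted, an unknown one is rejected. */
static void
sketch_parse_example(void)
{
	enum sketch_prio p = SKETCH_PRIO_NORMAL;
	int rc;

	rc = sketch_parse_thread_prio("realtime", &p);
	assert(rc == 0 && p == SKETCH_PRIO_REALTIME_CRITICAL);
	rc = sketch_parse_thread_prio("high", &p);
	assert(rc == -1);
	(void)rc;
}
```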

*Affinity* is described by the already known “rte_cpuset_t” type.
rte_thread_attr_set/get_affinity - sets/gets the affinity field in a
                                   rte_thread_attr_t object
rte_thread_set/get_affinity      - sets/gets the affinity of a thread

**Errors**
A translation function that maps Windows error codes to errno-style
error codes is provided. 
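
To give an idea of what such a translation looks like, here is a small
standalone sketch. The SKETCH_ERROR_* constants carry the numeric values of
the corresponding Win32 error codes so that the example builds without
<windows.h>; the particular mapping, the function name and the EINVAL
fallback are assumptions for this example, not the table used in the series.

```c
#include <assert.h>
#include <errno.h>

/* Numeric values of a few Win32 error codes (see winerror.h). */
#define SKETCH_ERROR_SUCCESS            0UL
#define SKETCH_ERROR_ACCESS_DENIED      5UL
#define SKETCH_ERROR_INVALID_HANDLE     6UL
#define SKETCH_ERROR_NOT_ENOUGH_MEMORY  8UL
#define SKETCH_ERROR_OUTOFMEMORY        14UL
#define SKETCH_ERROR_INVALID_PARAMETER  87UL

/* Translate a Windows error code into an errno-style value. */
static int
sketch_win_error_to_errno(unsigned long err)
{
	switch (err) {
	case SKETCH_ERROR_SUCCESS:
		return 0;
	case SKETCH_ERROR_ACCESS_DENIED:
		return EACCES;
	case SKETCH_ERROR_INVALID_HANDLE:
		return EBADF;
	case SKETCH_ERROR_NOT_ENOUGH_MEMORY:
	case SKETCH_ERROR_OUTOFMEMORY:
		return ENOMEM;
	case SKETCH_ERROR_INVALID_PARAMETER:
		return EINVAL;
	default:
		/* Unknown codes collapse to EINVAL in this sketch. */
		return EINVAL;
	}
}

/* Usage: a caller converts the OS error once and returns errno-style. */
static void
sketch_error_example(void)
{
	int rc = sketch_win_error_to_errno(SKETCH_ERROR_OUTOFMEMORY);

	assert(rc == ENOMEM);
	(void)rc;
}
```

Keeping the translation in one function means callers on all platforms can
compare return values against the same errno constants.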

**Future work**
Note that this patchset was focused on introducing new API that will
remove the Windows pthread.h shim. In DPDK, there are still a few references
to pthread_* that were not implemented in the shim.
The long term plan is for EAL to provide full threading support:
* Adding support for condition variables
* Additional functionality offered by pthread_* (such as pthread_setname_np, etc.)
* Static mutex initializers are not used on Windows. If we must continue
  using them, they need to be platform dependent and an implementation will
  need to be provided for Windows.

v7:
Based on DmitryK's review:
- Change thread id representation
- Change mutex id representation
- Implement static mutex initializer for Windows
- Change barrier identifier representation
- Improve commit messages
- Add missing doxygen comments
- Split error translation function
- Improve name for affinity function
- Remove cpuset_size parameter
- Fix eal_create_cpu_map function
- Map EAL priority values to OS specific values
- Add thread wrapper for start routine
- Do not export rte_thread_cancel() on Windows
- Cleanup, fix comments, fix typos.

v6:
- improve error-translation function
- call the error translation function in rte_thread_value_get()

v5:
- update cover letter with more details on the priority argument

v4:
- fix function description
- rebase

v3:
- rebase

v2:
- revert changes that break ABI 
- break up changes into smaller patches
- fix coding style issues
- fix issues with errors
- fix parameter type in examples/kni.c

Narcisa Vasile (10):
  eal: add thread id and simple thread functions
  eal: add thread attributes
  eal/windows: translate Windows errors to errno-style errors
  eal: implement functions for thread affinity management
  eal: implement thread priority management functions
  eal: add thread lifetime management
  eal: implement functions for mutex management
  eal: implement functions for thread barrier management
  eal: add EAL argument for setting thread priority
  Enable the new EAL thread API

 app/test/process.h                            |   8 +-
 app/test/test_lcores.c                        |  16 +-
 app/test/test_link_bonding.c                  |  10 +-
 app/test/test_lpm_perf.c                      |  12 +-
 config/meson.build                            |   4 +
 drivers/bus/dpaa/base/qbman/bman_driver.c     |   5 +-
 drivers/bus/dpaa/base/qbman/dpaa_sys.c        |  14 +-
 drivers/bus/dpaa/base/qbman/process.c         |   6 +-
 drivers/bus/dpaa/dpaa_bus.c                   |  14 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpio.c      |  19 +-
 drivers/compress/mlx5/mlx5_compress.c         |  10 +-
 drivers/event/dlb2/pf/base/dlb2_osdep.h       |   4 +-
 drivers/net/af_xdp/rte_eth_af_xdp.c           |  18 +-
 drivers/net/ark/ark_ethdev.c                  |   2 +-
 drivers/net/atlantic/atl_ethdev.c             |   4 +-
 drivers/net/atlantic/atl_types.h              |   5 +-
 .../net/atlantic/hw_atl/hw_atl_utils_fw2x.c   |  26 +-
 drivers/net/axgbe/axgbe_common.h              |   2 +-
 drivers/net/axgbe/axgbe_dev.c                 |   8 +-
 drivers/net/axgbe/axgbe_ethdev.c              |   8 +-
 drivers/net/axgbe/axgbe_ethdev.h              |   8 +-
 drivers/net/axgbe/axgbe_i2c.c                 |   4 +-
 drivers/net/axgbe/axgbe_mdio.c                |   8 +-
 drivers/net/axgbe/axgbe_phy_impl.c            |   6 +-
 drivers/net/bnxt/bnxt.h                       |  16 +-
 drivers/net/bnxt/bnxt_cpr.c                   |   4 +-
 drivers/net/bnxt/bnxt_ethdev.c                |  52 +-
 drivers/net/bnxt/bnxt_irq.c                   |   8 +-
 drivers/net/bnxt/bnxt_reps.c                  |  10 +-
 drivers/net/bnxt/tf_ulp/bnxt_ulp.c            |  34 +-
 drivers/net/bnxt/tf_ulp/bnxt_ulp.h            |   4 +-
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.c          |  24 +-
 drivers/net/bnxt/tf_ulp/ulp_fc_mgr.h          |   2 +-
 drivers/net/ena/base/ena_plat_dpdk.h          |   8 +-
 drivers/net/enic/enic.h                       |   2 +-
 drivers/net/ice/ice_dcf_parent.c              |   4 +-
 drivers/net/ipn3ke/ipn3ke_representor.c       |   6 +-
 drivers/net/ixgbe/ixgbe_ethdev.h              |   2 +-
 drivers/net/kni/rte_eth_kni.c                 |   8 +-
 drivers/net/mlx5/linux/mlx5_os.c              |   2 +-
 drivers/net/mlx5/mlx5.c                       |  20 +-
 drivers/net/mlx5/mlx5.h                       |   2 +-
 drivers/net/mlx5/mlx5_txpp.c                  |   8 +-
 drivers/net/mlx5/windows/mlx5_flow_os.c       |  10 +-
 drivers/net/mlx5/windows/mlx5_os.c            |   2 +-
 drivers/net/qede/base/bcm_osal.h              |   8 +-
 drivers/net/vhost/rte_eth_vhost.c             |  24 +-
 .../net/virtio/virtio_user/virtio_user_dev.c  |  30 +-
 .../net/virtio/virtio_user/virtio_user_dev.h  |   2 +-
 drivers/raw/ifpga/ifpga_rawdev.c              |   6 +-
 drivers/vdpa/ifc/ifcvf_vdpa.c                 |  46 +-
 drivers/vdpa/mlx5/mlx5_vdpa.c                 |  24 +-
 drivers/vdpa/mlx5/mlx5_vdpa.h                 |   6 +-
 drivers/vdpa/mlx5/mlx5_vdpa_event.c           |  73 +-
 examples/kni/main.c                           |   6 +-
 .../performance-thread/pthread_shim/main.c    |   2 +-
 examples/vhost/main.c                         |   2 +-
 examples/vhost_blk/vhost_blk.c                |  12 +-
 lib/eal/common/eal_common_options.c           |  34 +-
 lib/eal/common/eal_common_proc.c              |  48 +-
 lib/eal/common/eal_common_thread.c            |  31 +-
 lib/eal/common/eal_common_trace.c             |   2 +-
 lib/eal/common/eal_internal_cfg.h             |   2 +
 lib/eal/common/eal_options.h                  |   2 +
 lib/eal/common/eal_private.h                  |   2 +-
 lib/eal/common/malloc_mp.c                    |  32 +-
 lib/eal/common/meson.build                    |   1 +
 lib/eal/common/rte_thread.c                   | 416 +++++++++++
 lib/eal/freebsd/eal.c                         |  40 +-
 lib/eal/freebsd/eal_alarm.c                   |  12 +-
 lib/eal/freebsd/eal_interrupts.c              |   4 +-
 lib/eal/freebsd/eal_thread.c                  |  14 +-
 lib/eal/include/meson.build                   |   1 +
 lib/eal/include/rte_lcore.h                   |   8 +-
 lib/eal/include/rte_per_lcore.h               |   2 -
 lib/eal/include/rte_thread.h                  | 364 +++++++++-
 lib/eal/include/rte_thread_types.h            |  14 +
 lib/eal/linux/eal.c                           |  43 +-
 lib/eal/linux/eal_alarm.c                     |  10 +-
 lib/eal/linux/eal_interrupts.c                |   4 +-
 lib/eal/linux/eal_thread.c                    |  18 +-
 lib/eal/linux/eal_timer.c                     |   2 +-
 lib/eal/unix/meson.build                      |   1 -
 lib/eal/unix/rte_thread.c                     |  92 ---
 lib/eal/version.map                           |  21 +
 lib/eal/windows/eal.c                         |  40 +-
 lib/eal/windows/eal_interrupts.c              |  10 +-
 lib/eal/windows/eal_lcore.c                   | 169 +++--
 lib/eal/windows/eal_thread.c                  |  28 +-
 lib/eal/windows/eal_windows.h                 |  20 +-
 lib/eal/windows/include/meson.build           |   1 +
 lib/eal/windows/include/pthread.h             | 186 -----
 .../include/rte_windows_thread_types.h        |  19 +
 lib/eal/windows/include/sched.h               |   2 +-
 lib/eal/windows/meson.build                   |   7 +-
 lib/eal/windows/rte_thread.c                  | 671 +++++++++++++++++-
 lib/ethdev/rte_ethdev.c                       |   4 +-
 lib/ethdev/rte_ethdev_core.h                  |   5 +-
 lib/ethdev/rte_flow.c                         |   4 +-
 lib/eventdev/rte_event_eth_rx_adapter.c       |   6 +-
 lib/vhost/fd_man.c                            |  40 +-
 lib/vhost/fd_man.h                            |   6 +-
 lib/vhost/socket.c                            | 130 ++--
 lib/vhost/vhost.c                             |  10 +-
 meson_options.txt                             |   2 +
 105 files changed, 2298 insertions(+), 972 deletions(-)
 create mode 100644 lib/eal/common/rte_thread.c
 create mode 100644 lib/eal/include/rte_thread_types.h
 delete mode 100644 lib/eal/unix/rte_thread.c
 delete mode 100644 lib/eal/windows/include/pthread.h
 create mode 100644 lib/eal/windows/include/rte_windows_thread_types.h

-- 
2.31.0.vfs.0.1


^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH] doc: announce removal of ABIs in PCI bus driver
@ 2021-06-01  8:41  5% Chenbo Xia
  0 siblings, 0 replies; 200+ results
From: Chenbo Xia @ 2021-06-01  8:41 UTC (permalink / raw)
  To: dev, thomas; +Cc: mdr, nhorman

All ABIs in PCI bus driver, which are defined in rte_bus_pci.h,
will be removed and the header will be made internal.

Signed-off-by: Chenbo Xia <chenbo.xia@intel.com>
---
 doc/guides/rel_notes/deprecation.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
index 9584d6bfd7..b01f46c62e 100644
--- a/doc/guides/rel_notes/deprecation.rst
+++ b/doc/guides/rel_notes/deprecation.rst
@@ -147,3 +147,8 @@ Deprecation Notices
 * cmdline: ``cmdline`` structure will be made opaque to hide platform-specific
   content. On Linux and FreeBSD, supported prior to DPDK 20.11,
   original structure will be kept until DPDK 21.11.
+
+* pci: To reduce unnecessary ABIs exposed by DPDK bus driver, "rte_bus_pci.h"
+  will be made internal in 21.11 and macros/data structures/functions defined
+  in the header will not be considered as ABI anymore. This change is inspired
+  by the RFC https://patchwork.dpdk.org/project/dpdk/list/?series=17176.
-- 
2.17.1


^ permalink raw reply	[relevance 5%]

* [dpdk-dev] 20.11.2 patches review and test
@ 2021-06-01  7:54  1% Xueming(Steven) Li
  2021-06-08  8:52  0% ` Jiang, YuX
                   ` (3 more replies)
  0 siblings, 4 replies; 200+ results
From: Xueming(Steven) Li @ 2021-06-01  7:54 UTC (permalink / raw)
  Cc: dev, Abhishek Marathe, Akhil Goyal, Ali Alnubani,
	benjamin.walker, David Christensen, hariprasad.govindharajan,
	Hemant Agrawal, Ian Stokes, Jerin Jacob, John McNamara,
	Ju-Hyoung Lee, Kevin Traynor, Luca Boccassi, Pei Zhang, pingx.yu,
	qian.q.xu, Raslan Darawsheh, NBU-Contact-Thomas Monjalon,
	yuan.peng, zhaoyan.chen, Xueming(Steven) Li

Hi all,

Here is a list of patches targeted for stable release 20.11.2.

The planned date for the final release is 15th June.

Please help with testing and validation of your use cases and report
any issues/results with reply-all to this mail. For the final release
the fixes and reported validations will be added to the release notes.

A release candidate tarball can be found at:

    https://dpdk.org/browse/dpdk-stable/tag/?id=v20.11.2-rc1

These patches are located at branch 20.11 of dpdk-stable repo:
    https://dpdk.org/browse/dpdk-stable/


Thanks.

Xueming Li <xuemingl@nvidia.com>

---
Ajit Khaparde (3):
      net/bnxt: fix RSS context cleanup
      net/bnxt: check kvargs parsing
      net/bnxt: fix resource cleanup

Alvin Zhang (7):
      net/ice: fix VLAN filter with PF
      net/i40e: fix input set field mask
      net/igc: fix Rx RSS hash offload capability
      net/igc: fix Rx error counter for bad length
      net/e1000: fix Rx error counter for bad length
      net/e1000: fix max Rx packet size
      net/igc: fix Rx packet size

Anatoly Burakov (2):
      fbarray: fix log message on truncation error
      power: do not skip saving original P-state governor

Andrew Boyer (1):
      net/ionic: fix completion type in lif init

Andrew Rybchenko (3):
      net/failsafe: fix RSS hash offload reporting
      net/failsafe: report minimum and maximum MTU
      common/sfc_efx: remove GENEVE from supported tunnels

Ankur Dwivedi (1):
      crypto/octeontx: fix session-less mode

Apeksha Gupta (1):
      examples/l2fwd-crypto: skip masked devices

Arek Kusztal (1):
      crypto/qat: fix offset for out-of-place scatter-gather

Beilei Xing (1):
      net/i40evf: fix packet loss for X722

Bruce Richardson (1):
      build: exclude meson files from examples installation

Chenbo Xia (1):
      examples/vhost: check memory table query

Chengchang Tang (15):
      net/hns3: fix HW buffer size on MTU update
      net/hns3: fix processing Tx offload flags
      net/hns3: fix Tx checksum for UDP packets with special port
      net/hns3: fix long task queue pairs reset time
      ethdev: validate input in module EEPROM dump
      ethdev: validate input in register info
      ethdev: validate input in EEPROM info
      net/hns3: fix rollback after setting PVID failure
      net/hns3: fix timing in resetting queues
      net/hns3: fix queue state when concurrent with reset
      net/hns3: fix configure FEC when concurrent with reset
      net/hns3: fix use of command status enumeration
      examples: add eal cleanup to examples
      net/bonding: fix adding itself as its slave
      net/hns3: fix timing in mailbox

Chengwen Feng (15):
      net/hns3: fix flow counter value
      net/hns3: fix VF mailbox head field
      net/hns3: support get device version when dump register
      net/hns3: fix some packet types
      net/hns3: fix missing outer L4 UDP flag for VXLAN
      net/hns3: remove VLAN/QinQ ptypes from support list
      test: check thread creation
      common/dpaax: fix possible null pointer access
      examples/ethtool: remove unused parsing
      net/hns3: fix flow director lock
      net/e1000/base: fix timeout for shadow RAM write
      net/hns3: fix setting default MAC address in bonding of VF
      net/hns3: fix possible mismatched response of mailbox
      net/hns3: fix VF handling LSC event in secondary process
      net/hns3: fix verification of NEON support

Ciara Loftus (1):
      net/af_xdp: fix error handling during Rx queue setup

Conor Walsh (1):
      examples/l3fwd: fix LPM IPv6 subnets

Cristian Dumitrescu (3):
      table: fix actions with different data size
      pipeline: fix instruction translation
      pipeline: fix endianness conversions

Dapeng Yu (3):
      net/igc: remove MTU setting limitation
      net/e1000: remove MTU setting limitation
      examples/packet_ordering: fix port configuration

David Harton (1):
      net/ena: fix releasing Tx ring mbufs

David Marchand (8):
      doc: fix sphinx rtd theme import in GHA
      service: clean references to removed symbol
      eal: fix evaluation of log level option
      ci: hook to GitHub Actions
      ci: enable v21 ABI checks
      ci: fix package installation in GitHub Actions
      ci: ignore APT update failure in GitHub Actions
      ci: catch coredumps

Dekel Peled (1):
      common/mlx5: fix DevX read output buffer size

Dmitry Kozlyuk (3):
      net/pcap: fix format string
      eal/windows: add missing SPDX license tag
      buildtools: fix all drivers disabled on Windows

Ed Czeck (2):
      net/ark: update packet director initial state
      net/ark: refactor Rx buffer recovery

Elad Nachman (2):
      kni: support async user request
      kni: fix kernel deadlock with bifurcated device

Feifei Wang (2):
      net/i40e: fix parsing packet type for NEON
      test/trace: fix race on collected perf data

Ferruh Yigit (3):
      power: remove duplicated symbols from map file
      log/linux: make default output stderr
      license: fix typos

Guoyang Zhou (1):
      net/hinic: fix crash in secondary process

Haiyue Wang (1):
      net/ixgbe: fix Rx errors statistics for UDP checksum

Harman Kalra (1):
      event/octeontx2: fix device reconfigure for single slot

Hongbo Zheng (3):
      app/testpmd: fix Tx/Rx descriptor query error log
      net/hns3: fix FLR miss detection
      net/hns3: delete redundant blank line

Huisong Li (11):
      net/hns3: fix device capabilities for copper media type
      net/hns3: remove unused parameter markers
      net/hns3: fix reporting undefined speed
      net/hns3: fix link update when failed to get link info
      net/hns3: fix flow control exception
      app/testpmd: fix bitmap of link speeds when force speed
      net/hns3: fix flow control mode
      net/hns3: remove redundant mailbox response
      net/hns3: fix DCB mode check
      net/hns3: fix VMDq mode check
      net/hns3: fix mbuf leakage

Ibtisam Tariq (1):
      examples/vhost_crypto: remove unused short option

Igor Russkikh (2):
      net/qede: reduce log verbosity
      net/qede: accept bigger RSS table

Ilya Maximets (1):
      net/virtio: fix interrupt unregistering for listening socket

Ivan Malov (5):
      net/sfc: fix buffer size for flow parse
      net: fix comment in IPv6 header
      net/sfc: fix error path inconsistency
      common/sfc_efx/base: fix indication of MAE encap support
      net/sfc: fix outer rule rollback on error

Jiawei Wang (2):
      app/testpmd: fix NVGRE encap configuration
      net/mlx5: fix resource release for mirror flow

Jiawei Zhu (1):
      net/mlx5: fix Rx segmented packets on mbuf starvation

Jiawen Wu (3):
      net/txgbe: remove unused functions
      net/txgbe: fix Rx missed packet counter
      net/txgbe: update packet type

John Daley (1):
      net/enic: fix flow initialization error handling

Kalesh AP (18):
      net/bnxt: remove unused macro
      net/bnxt: fix VNIC configuration
      net/bnxt: fix firmware fatal error handling
      net/bnxt: fix FW readiness check during recovery
      net/bnxt: fix device readiness check
      net/bnxt: fix VF info allocation
      net/bnxt: fix HWRM and FW incompatibility handling
      net/bnxt: mute some failure logs
      app/testpmd: check MAC address query
      net/bnxt: fix PCI write check
      net/bnxt: fix link state operations
      net/bnxt: fix timesync when PTP is not supported
      net/bnxt: fix memory allocation for command response
      net/bnxt: fix double free in port start failure
      net/bnxt: fix configuring LRO
      net/bnxt: fix health check alarm cancellation
      net/bnxt: fix PTP support for Thor
      net/bnxt: fix ring count calculation for Thor

Kevin Traynor (1):
      test/cmdline: fix inputs array

Lance Richardson (6):
      net/bnxt: fix Rx buffer posting
      net/bnxt: fix Tx length hint threshold
      net/bnxt: fix handling of null flow mask
      test: fix TCP header initialization
      net/bnxt: fix Rx descriptor status
      net/bnxt: fix Rx queue count

Leyi Rong (1):
      net/iavf: fix packet length parsing in AVX512

Li Zhang (1):
      net/mlx5: fix flow actions index in cache

Luc Pelletier (2):
      eal: fix race in control thread creation
      eal: fix hang in control thread creation

Marvin Liu (5):
      vhost: fix split ring potential buffer overflow
      vhost: fix packed ring potential buffer overflow
      vhost: fix batch dequeue potential buffer overflow
      vhost: fix initialization of temporary header
      vhost: fix initialization of async temporary header

Matan Azrad (4):
      common/mlx5/linux: add glue function to query WQ
      common/mlx5: add DevX command to query WQ
      common/mlx5: add DevX commands for queue counters
      vdpa/mlx5: fix virtq cleaning

Min Hu (Connor) (8):
      net/hns3: fix MTU config complexity
      net/hns3: update HiSilicon copyright syntax
      net/hns3: fix copyright date
      examples/ptpclient: remove wrong comment
      test/bpf: fix error message
      doc: fix HiSilicon copyright syntax
      net/hns3: remove unused macros
      net/hns3: remove unused macro

Murphy Yang (3):
      net/ixgbe: fix RSS RETA being reset after port start
      net/i40e: fix flow director config after flow validate
      net/i40e: fix flow director for common pctypes

Natanael Copa (5):
      common/dpaax/caamflib: fix build with musl
      bus/dpaa: fix 64-bit arch detection
      bus/dpaa: fix build with musl
      net/cxgbe: remove use of uint type
      app/testpmd: fix build with musl

Nipun Gupta (1):
      bus/dpaa: fix statistics reading

Nithin Dabilpuram (3):
      vfio: do not merge contiguous areas
      vfio: fix DMA mapping granularity for IOVA as VA
      test/mem: fix page size for external memory

Pallavi Kadam (1):
      bus/pci: skip probing some Windows NDIS devices

Pavan Nikhilesh (2):
      test/event: fix timeout accuracy
      app/eventdev: fix timeout accuracy

Pu Xu (1):
      ip_frag: fix fragmenting IPv4 packet with header option

Qi Zhang (7):
      net/ice/base: fix payload indicator on ptype
      net/ice/base: fix uninitialized struct
      net/ice/base: cleanup filter list on error
      net/ice/base: fix memory allocation for MAC addresses
      net/iavf: fix TSO max segment size
      doc: fix matching versions in ice guide
      net/iavf: fix wrong Tx context descriptor

Radha Mohan Chintakuntla (1):
      raw/octeontx2_dma: assign PCI device in DPI VF

Raslan Darawsheh (1):
      ethdev: update flow item GTP QFI definition

Richael Zhuang (2):
      test/power: add delay before checking CPU frequency
      test/power: round CPU frequency to check

Robin Zhang (4):
      net/i40e: announce request queue capability in PF
      doc: update recommended versions for i40e
      net/i40e: fix lack of MAC type when set MAC address
      net/iavf: fix lack of MAC type when set MAC address

Rohit Raj (3):
      net/dpaa2: fix getting link status
      net/dpaa: fix getting link status
      examples/l2fwd-crypto: fix packet length while decryption

Roy Shterman (1):
      mem: fix freeing segments in --huge-unlink mode

Satheesh Paul (1):
      net/octeontx2: fix VLAN filter

Savinay Dharmappa (1):
      sched: fix traffic class oversubscription parameter

Shijith Thotton (1):
      eventdev: fix case to initiate crypto adapter service

Siwar Zitouni (1):
      net/ice: fix disabling promiscuous mode

Somnath Kotur (3):
      net/bnxt: fix xstats get
      net/bnxt: fix Rx and Tx timestamps
      net/bnxt: fix Tx timestamp init

Stanislaw Kardach (1):
      test: proceed if timer subsystem already initialized

Stephen Hemminger (1):
      kni: refactor user request processing

Tal Shnaiderman (2):
      eal/windows: fix default thread priority
      eal/windows: fix return codes of pthread shim layer

Tengfei Zhang (1):
      net/pcap: fix file descriptor leak on close

Thinh Tran (1):
      test: fix autotest handling of skipped tests

Thomas Monjalon (16):
      bus/pci: fix Windows kernel driver categories
      eal: fix comment of OS-specific header files
      buildtools: fix build with busybox
      build: detect execinfo library on Linux
      build: remove redundant _GNU_SOURCE definitions
      eal: fix build with musl
      net/igc: remove use of uint type
      event/dlb: fix header includes for musl
      examples/bbdev: fix header include for musl
      drivers: fix log level after loading
      app/regex: fix usage text
      app/testpmd: fix usage text
      doc: fix names of UIO drivers
      doc: fix build with Sphinx 4
      bus/pci: support I/O port operations with musl
      app: fix exit messages

Tyler Retzlaff (1):
      eal: add C++ include guard for reciprocal header

Vadim Podovinnikov (1):
      net/bonding: fix LACP system address check

Venkat Duvvuru (1):
      net/bnxt: fix queues per VNIC

Viacheslav Ovsiienko (11):
      net/mlx5: fix external buffer pool registration for Rx queue
      net/mlx5: fix metadata item validation for ingress flows
      net/mlx5: fix hashed list size for tunnel flow groups
      net/mlx5: fix UAR allocation diagnostics messages
      common/mlx5: add timestamp format support to DevX
      vdpa/mlx5: support timestamp format
      net/mlx5: fix Rx metadata leftovers
      net/mlx5: fix drop action for Direct Rules/Verbs
      net/mlx4: fix RSS action with null hash key
      net/mlx5: support timestamp format
      regex/mlx5: support timestamp format

Wenjun Wu (2):
      net/ice: check some functions return
      net/ice: fix RSS hash update

Wenwu Ma (1):
      net/ice: fix illegal access when removing MAC filter

Wenzhuo Lu (2):
      net/iavf: fix crash in AVX512
      net/ice: fix crash in AVX512

Wisam Jaddo (1):
      app/flow-perf: fix encap/decap actions

Xiao Wang (1):
      vdpa/ifc: check PCI config read

Xiaoyu Min (4):
      net/mlx5: support RSS expansion for IPv6 GRE
      net/mlx5: fix shared inner RSS
      net/mlx5: fix missing shared RSS hash types
      net/mlx5: fix redundant flow after RSS expansion

Xiaoyun Li (2):
      app/testpmd: remove unnecessary UDP tunnel check
      net/i40e: fix IPv4 fragment offload

Youri Querry (1):
      bus/fslmc: fix random portal hangs with qbman 5.0

Yunjian Wang (3):
      vfio: fix API description
      net/mlx5: fix using flow tunnel before null check
      vfio: fix duplicated user mem map

^ permalink raw reply	[relevance 1%]

* [dpdk-dev] [RFC v3 0/6] Add mdev (Mediated device) support in DPDK
  @ 2021-06-01  3:06  2% ` Chenbo Xia
  2021-06-11  7:15  0%   ` Thomas Monjalon
  0 siblings, 1 reply; 200+ results
From: Chenbo Xia @ 2021-06-01  3:06 UTC (permalink / raw)
  To: dev, thomas, cunming.liang, jingjing.wu
  Cc: anatoly.burakov, ferruh.yigit, mdr, nhorman, bruce.richardson,
	david.marchand, stephen, konstantin.ananyev

Hi everyone,

This is a draft implementation of mdev (mediated device [1]) support
in the DPDK PCI bus driver. Mdev is a way to virtualize devices in the
Linux kernel. Depending on the device-api (mdev_type/device_api),
there can be different types of mdev devices (e.g. vfio-pci).
In this patchset, the PCI bus driver is extended to support scanning
and probing mdev devices whose device-api is "vfio-pci".

                     +---------+
                     | PCI bus |
                     +----+----+
                          |
         +--------+-------+-------+--------+
         |        |               |        |
  Physical PCI devices ...   Mediated PCI devices ...

The first four patches in this patchset mainly prepare for mdev bus
support. The remaining two patches are the key implementation of the
mdev bus.

The implementation of mdev bus in DPDK has several options:

1: Embed mdev bus in current pci bus

   This patchset takes this option as an example. Mdev has several
   device types: pci/platform/amba/ccw/ap. Of these, DPDK currently only
   cares about pci devices, so we could embed the mdev bus into the
   current pci bus. The pci bus with mdev support will then scan/plug/
   unplug/.. not only normal pci devices but also mediated pci devices.

2: A new mdev bus that scans mediated pci devices and probes an mdev driver
   to plug pci devices into the pci bus

   If we take this option, a new mdev bus will be implemented to scan
   mediated pci devices, and a new mdev driver for pci devices will be
   implemented in the pci bus to plug mediated pci devices into the pci bus.

   Our RFC v1 takes this option:
   http://patchwork.dpdk.org/project/dpdk/cover/20190403071844.21126-1-tiwei.bie@intel.com/

   Note that, for either option 1 or 2, device drivers are not aware of
   the implementation difference and only use structs/functions exposed
   by the pci bus. Mediated pci devices differ from normal pci devices in
   two ways: 1. Mediated pci devices use a UUID as address, while normal
   ones use BDF. 2. Mediated pci devices may have capabilities that normal
   pci devices do not have. For example, mediated pci devices can have
   regions with the sparse mmap capability, which allows a region to have
   multiple mmap areas. Another example is that mediated pci devices may
   have regions, or parts of regions, that are not mmaped but still need
   to be accessed. These differences will change the current ABI (i.e.,
   struct rte_pci_device). Please check the 5th and 6th patches for details.

3: A brand new mdev bus that does everything

   This option implements a new, standalone mdev bus. It does not need
   any changes in the current pci bus, only some shared code (the linux
   vfio part) from the pci bus. Drivers of devices that support mdev
   will register themselves as mdev drivers and no longer rely on the
   pci bus. This option, IMHO, makes the code clean. The only potential
   problem may be code duplication, which could be solved by making the
   linux vfio code of the pci bus common and shared.
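
The addressing difference mentioned in the note under option 2 (UUID for
mediated pci devices, BDF for normal ones) can be illustrated with a small
sketch. This is only an illustration under stated assumptions, not code from
the patchset: the names demo_addr_kind, demo_match and demo_classify are
hypothetical, and the check is purely syntactic (it accepts any hex digit
for the BDF function number, which is looser than a real parser would be).

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical classification of a device address string. */
enum demo_addr_kind { DEMO_ADDR_INVALID, DEMO_ADDR_BDF, DEMO_ADDR_UUID };

/*
 * Return 1 if s has the same length as pattern and matches it, where an
 * 'x' in pattern stands for one hex digit and any other character must
 * match literally.
 */
static int demo_match(const char *s, const char *pattern)
{
	size_t i;

	if (strlen(s) != strlen(pattern))
		return 0;
	for (i = 0; pattern[i] != '\0'; i++) {
		if (pattern[i] == 'x') {
			if (!isxdigit((unsigned char)s[i]))
				return 0;
		} else if (s[i] != pattern[i]) {
			return 0;
		}
	}
	return 1;
}

/* Distinguish "domain:bus:device.function" from an 8-4-4-4-12 UUID. */
static enum demo_addr_kind demo_classify(const char *s)
{
	if (demo_match(s, "xxxx:xx:xx.x"))
		return DEMO_ADDR_BDF;
	if (demo_match(s, "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx"))
		return DEMO_ADDR_UUID;
	return DEMO_ADDR_INVALID;
}
```

With such a helper, a bus scan could decide per address string which kind
of device it is looking at before choosing the probe path.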

Your comments on the above three options are welcome and appreciated!

Thanks!
Chenbo

----------------------------------------------------------------------------
RFC v3:
- Add sparse mmap support
- Minor fixes and improvements

RFC v2:
- Let PCI bus scan mediated PCI devices directly
- Address Keith's comments
- Merge below patch into this series (David)
   http://patches.dpdk.org/patch/55927/
- Add internal representation of PCI device (David)
- Minor fixes and improvements

[1] https://github.com/torvalds/linux/blob/master/Documentation/driver-api/vfio-mediated-device.rst

Chenbo Xia (1):
  bus/pci: add sparse mmap support for mediated PCI devices

Tiwei Bie (5):
  bus/pci: introduce an internal representation of PCI device
  bus/pci: avoid depending on private value in kernel source
  bus/pci: introduce helper for MMIO read and write
  eal: add a helper for reading string from sysfs
  bus/pci: add mdev support

 drivers/bus/pci/bsd/pci.c             |  36 +-
 drivers/bus/pci/linux/pci.c           | 107 ++++-
 drivers/bus/pci/linux/pci_init.h      |  29 +-
 drivers/bus/pci/linux/pci_uio.c       |  22 +
 drivers/bus/pci/linux/pci_vfio.c      | 586 ++++++++++++++++++++++----
 drivers/bus/pci/linux/pci_vfio_mdev.c | 277 ++++++++++++
 drivers/bus/pci/meson.build           |   1 +
 drivers/bus/pci/pci_common.c          |  86 ++--
 drivers/bus/pci/pci_params.c          |  36 +-
 drivers/bus/pci/private.h             |  40 ++
 drivers/bus/pci/rte_bus_pci.h         |  83 +++-
 drivers/bus/pci/version.map           |   4 +
 lib/eal/common/eal_filesystem.h       |  10 +
 lib/eal/freebsd/eal.c                 |  22 +
 lib/eal/linux/eal.c                   |  39 +-
 lib/eal/version.map                   |   3 +
 16 files changed, 1224 insertions(+), 157 deletions(-)
 create mode 100644 drivers/bus/pci/linux/pci_vfio_mdev.c

-- 
2.17.1


^ permalink raw reply	[relevance 2%]

* [dpdk-dev] [PATCH v1 2/2] devtools: use absolute path for the build directory
  2021-06-01  1:56  8% [dpdk-dev] [PATCH v1 0/2] relative path support for ABI compatibility check Feifei Wang
  2021-06-01  1:56 17% ` [dpdk-dev] [PATCH v1 1/2] devtools: add " Feifei Wang
@ 2021-06-01  1:56 12% ` Feifei Wang
  1 sibling, 0 replies; 200+ results
From: Feifei Wang @ 2021-06-01  1:56 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: dev, nd, Phil Yang, Juraj Linkeš, Feifei Wang, Ruifeng Wang

From: Phil Yang <phil.yang@arm.com>

To make the code easier to maintain, use the absolute path for the
default builds_dir to avoid repeated calls to readlink.

Suggested-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 devtools/test-meson-builds.sh | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/devtools/test-meson-builds.sh b/devtools/test-meson-builds.sh
index 43b906598d..d6b0e7e059 100755
--- a/devtools/test-meson-builds.sh
+++ b/devtools/test-meson-builds.sh
@@ -16,7 +16,7 @@ srcdir=$(dirname $(readlink -f $0))/..
 
 MESON=${MESON:-meson}
 use_shared="--default-library=shared"
-builds_dir=${DPDK_BUILD_TEST_DIR:-.}
+builds_dir=$(readlink -f ${DPDK_BUILD_TEST_DIR:-.})
 
 if command -v gmake >/dev/null 2>&1 ; then
 	MAKE=gmake
@@ -193,16 +193,16 @@ build () # <directory> <target cc | cross file> <ABI check> [meson options]
 		fi
 
 		install_target $builds_dir/$targetdir \
-			$(readlink -f $builds_dir/$targetdir/install)
+			$builds_dir/$targetdir/install
 		echo "Checking ABI compatibility of $targetdir" >&$verbose
 		echo $srcdir/devtools/gen-abi.sh \
-			$(readlink -f $builds_dir/$targetdir/install) >&$veryverbose
+			$builds_dir/$targetdir/install >&$veryverbose
 		$srcdir/devtools/gen-abi.sh \
-			$(readlink -f $builds_dir/$targetdir/install) >&$veryverbose
+			$builds_dir/$targetdir/install >&$veryverbose
 		echo $srcdir/devtools/check-abi.sh $abirefdir/$targetdir \
-			$(readlink -f $builds_dir/$targetdir/install) >&$veryverbose
+			$builds_dir/$targetdir/install >&$veryverbose
 		$srcdir/devtools/check-abi.sh $abirefdir/$targetdir \
-			$(readlink -f $builds_dir/$targetdir/install) >&$verbose
+			$builds_dir/$targetdir/install >&$verbose
 	fi
 }
 
@@ -275,7 +275,7 @@ done
 # Test installation of the x86-generic target, to be used for checking
 # the sample apps build using the pkg-config file for cflags and libs
 load_env cc
-build_path=$(readlink -f $builds_dir/build-x86-generic)
+build_path=$builds_dir/build-x86-generic
 export DESTDIR=$build_path/install
 install_target $build_path $DESTDIR
 pc_file=$(find $DESTDIR -name libdpdk.pc)
-- 
2.25.1


^ permalink raw reply	[relevance 12%]

* [dpdk-dev] [PATCH v1 1/2] devtools: add relative path support for ABI compatibility check
  2021-06-01  1:56  8% [dpdk-dev] [PATCH v1 0/2] relative path support for ABI compatibility check Feifei Wang
@ 2021-06-01  1:56 17% ` Feifei Wang
  2021-06-22  2:08  4%   ` [dpdk-dev] 回复: " Feifei Wang
  2021-06-22  9:19  4%   ` [dpdk-dev] " Bruce Richardson
  2021-06-01  1:56 12% ` [dpdk-dev] [PATCH v1 2/2] devtools: use absolute path for the build directory Feifei Wang
  1 sibling, 2 replies; 200+ results
From: Feifei Wang @ 2021-06-01  1:56 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: dev, nd, Phil Yang, Feifei Wang, Juraj Linkeš, Ruifeng Wang

From: Phil Yang <phil.yang@arm.com>

Because the DPDK guide does not forbid a relative path for the ABI
compatibility check, users may set 'DPDK_ABI_REF_DIR' to a relative
path:

~/dpdk/devtools$ DPDK_ABI_REF_VERSION=v19.11 DPDK_ABI_REF_DIR=build-gcc-shared
./test-meson-builds.sh

And if the DESTDIR is not an absolute path, ninja complains:
+ install_target build-gcc-shared/v19.11/build build-gcc-shared/v19.11/build-gcc-shared
+ rm -rf build-gcc-shared/v19.11/build-gcc-shared
+ echo 'DESTDIR=build-gcc-shared/v19.11/build-gcc-shared ninja -C build-gcc-shared/v19.11/build install'
+ DESTDIR=build-gcc-shared/v19.11/build-gcc-shared
+ ninja -C build-gcc-shared/v19.11/build install
...
ValueError: dst_dir must be absolute, got build-gcc-shared/v19.11/build-gcc-shared/usr/local/share/dpdk/
examples/bbdev_app
...
Error: install directory 'build-gcc-shared/v19.11/build-gcc-shared' does not exist.

To fix this, add relative path support using 'readlink -f'.

Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Feifei Wang <feifei.wang2@arm.com>
Reviewed-by: Juraj Linkeš <juraj.linkes@pantheon.tech>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
---
 devtools/test-meson-builds.sh | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/devtools/test-meson-builds.sh b/devtools/test-meson-builds.sh
index daf817ac3e..43b906598d 100755
--- a/devtools/test-meson-builds.sh
+++ b/devtools/test-meson-builds.sh
@@ -168,7 +168,8 @@ build () # <directory> <target cc | cross file> <ABI check> [meson options]
 	config $srcdir $builds_dir/$targetdir $cross --werror $*
 	compile $builds_dir/$targetdir
 	if [ -n "$DPDK_ABI_REF_VERSION" -a "$abicheck" = ABI ] ; then
-		abirefdir=${DPDK_ABI_REF_DIR:-reference}/$DPDK_ABI_REF_VERSION
+		abirefdir=$(readlink -f \
+			${DPDK_ABI_REF_DIR:-reference}/$DPDK_ABI_REF_VERSION)
 		if [ ! -d $abirefdir/$targetdir ]; then
 			# clone current sources
 			if [ ! -d $abirefdir/src ]; then
-- 
2.25.1


^ permalink raw reply	[relevance 17%]

* [dpdk-dev] [PATCH v1 0/2] relative path support for ABI compatibility check
@ 2021-06-01  1:56  8% Feifei Wang
  2021-06-01  1:56 17% ` [dpdk-dev] [PATCH v1 1/2] devtools: add " Feifei Wang
  2021-06-01  1:56 12% ` [dpdk-dev] [PATCH v1 2/2] devtools: use absolute path for the build directory Feifei Wang
  0 siblings, 2 replies; 200+ results
From: Feifei Wang @ 2021-06-01  1:56 UTC (permalink / raw)
  Cc: dev, nd, Feifei Wang

Add relative path support for ABI compatibility check and do some code
simplification work.

Phil Yang (2):
  devtools: add relative path support for ABI compatibility check
  devtools: use absolute path for the build directory

 devtools/test-meson-builds.sh | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

-- 
2.25.1


^ permalink raw reply	[relevance 8%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-05-31  9:58  0%         ` Ananyev, Konstantin
@ 2021-05-31 11:10  0%           ` Gregory Etelson
  2021-06-02  9:51  0%             ` Gregory Etelson
  0 siblings, 1 reply; 200+ results
From: Gregory Etelson @ 2021-05-31 11:10 UTC (permalink / raw)
  To: Ananyev, Konstantin, Morten Brørup, dev
  Cc: Matan Azrad, Ori Kam, Raslan Darawsheh, Iremonger, Bernard, Olivier Matz

> > > > > > RTE IPv4 header definition combines the `version' and `ihl'
> > > > > > fields into a single structure member.
> > > > > > This patch introduces dedicated structure members for both
> > > > > > `version' and `ihl' IPv4 fields. Separated header fields
> > > > > > definitions allow to create simplified code to match on the
> > > > > > IHL value in a flow rule.
> > > > > > The original `version_ihl' structure member is kept for
> > > > > > backward compatibility.
> > > > > >
> > > > > > Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> > > > > > ---
> > > > > >  app/test/test_flow_classify.c |  8 ++++----
> > > > > >  lib/net/rte_ip.h              | 16 +++++++++++++++-
> > > > > >  2 files changed, 19 insertions(+), 5 deletions(-)
> > > > > >
> > > > > > diff --git a/app/test/test_flow_classify.c
> > > > > > b/app/test/test_flow_classify.c index 951606f248..4f64be5357
> > > > > > 100644
> > > > > > --- a/app/test/test_flow_classify.c
> > > > > > +++ b/app/test/test_flow_classify.c
> > > > > > @@ -95,7 +95,7 @@ static struct rte_acl_field_def
> > > > > > ipv4_defs[NUM_FIELDS_IPV4] = {
> > > > > >   *  dst mask 255.255.255.00 / udp src is 32 dst is 33 / end"
> > > > > >   */
> > > > > >  static struct rte_flow_item_ipv4 ipv4_udp_spec_1 = {
> > > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > > > >     RTE_IPV4(2, 2, 2, 3), RTE_IPV4(2, 2, 2, 7)}
> > > > > >  };
> > > > > >  static const struct rte_flow_item_ipv4 ipv4_mask_24 = {
> > > > > > @@ -131,7 +131,7 @@ static struct rte_flow_item  end_item = { RTE_FLOW_ITEM_TYPE_END,
> > > > > >   *  dst mask 255.255.255.00 / tcp src is 16 dst is 17 / end"
> > > > > >   */
> > > > > >  static struct rte_flow_item_ipv4 ipv4_tcp_spec_1 = {
> > > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > > > >     RTE_IPV4(1, 2, 3, 4), RTE_IPV4(5, 6, 7, 8)}
> > > > > >  };
> > > > > >
> > > > > > @@ -150,8 +150,8 @@ static struct rte_flow_item  tcp_item_1 =
> > > > > > { RTE_FLOW_ITEM_TYPE_TCP,
> > > > > >   *  dst mask 255.255.255.00 / sctp src is 16 dst is 17/ end"
> > > > > >   */
> > > > > >  static struct rte_flow_item_ipv4 ipv4_sctp_spec_1 = {
> > > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0, RTE_IPV4(11, 12, 13, 14),
> > > > > > - RTE_IPV4(15, 16, 17, 18)}
> > > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0,
> > > > > > + RTE_IPV4(11, 12, 13, 14), RTE_IPV4(15, 16, 17, 18)}
> > > > > >  };
> > > > > >
> > > > > >  static struct rte_flow_item_sctp sctp_spec_1 = {
> > > > > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > > > > > index 4b728969c1..684bb028b2 100644
> > > > > > --- a/lib/net/rte_ip.h
> > > > > > +++ b/lib/net/rte_ip.h
> > > > > > @@ -38,7 +38,21 @@ extern "C" {
> > > > > >   * IPv4 Header
> > > > > >   */
> > > > > >  struct rte_ipv4_hdr {
> > > > > > - uint8_t  version_ihl;           /**< version and header length */
> > > > > > + __extension__
> > > > > > + union {
> > > > > > +         uint8_t version_ihl;    /**< version and header length */
> > > > > > +         struct {
> > > > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > > > +                 uint8_t ihl:4;
> > > > > > +                 uint8_t version:4;
> > > > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > > > +                 uint8_t version:4;
> > > > > > +                 uint8_t ihl:4;
> > > > > > +#else
> > > > > > +#error "setup endian definition"
> > > > > > +#endif
> > > > > > +         };
> > > > > > + };
> > > > > >   uint8_t  type_of_service;       /**< type of service */
> > > > > >   rte_be16_t total_length;        /**< length of packet */
> > > > > >   rte_be16_t packet_id;           /**< packet ID */
> > > > > > --
> > > > > > 2.31.1
> > > > > >
> > > > >
> > > > > > This does not break the ABI, but it could be discussed if it
> > > > > > breaks the API due to the required structure initialization
> > > > > > changes shown in test_flow_classify.c.
> > > >
> > > > Yep, I guess it might be classified as API change.
> > > > Another thing that concerns me - it is not the only place in IPv4
> > > > header when we unite multiple bit-fields into one field:
> > > > type_of_service, fragment_offset.
> > > > If we start splitting ipv4 fields into actual bitfields, I suppose
> > > > we'll end-up splitting these ones too.
> > > > But I am not sure it will pay off - as compiler not always
> > > > generates optimal code for reading/updating bitfields.
> > > > Did you consider just adding extra macros to simplify access to
> > > > these fields (like RTE_IPV4_HDR_(GET_SET)_*), instead?
> > > >
> > >
> > > Let's please not introduce accessor macros for bitfields. If we
> > > don't introduce bitfields like these, I would rather stick with the
> > > current _MASK, _SHIFT and _FLAG defines.
> > >
> > > Yes, this change will lead to the introduction of more bitfields,
> > > both here and in other places. We already accepted it in the eCPRI
> > > structure (/lib/net/rte_ecpri.h), so why not just generally accept it.
> > >
> > > Are modern compilers really worse at handling a bitfield defined
> > > like this, compared to handling a single uint8_t with hand coding? I
> > > consider your concern very important, so I'm only asking if it is
> > > still relevant, to avoid making decisions based on past experience
> > > that might be outdated. (I admit to falling into that trap myself,
> > > once in a while.)
> > >
> >
> > I compared x86 code generated with gcc-9, gcc-10 and clang-10 for these
> > 2 functions:
> > void test_ipv4_hdr_byte(struct rte_ipv4_hdr *h, uint8_t version,
> > uint8_t ihl)
> > {
> >       h->version_ihl = ((version & 0x0f) << 4) | (ihl & 0x0f);
> > }
> > void test_ipv4_hdr_bits(struct rte_ipv4_hdr *h, uint8_t version,
> > uint8_t ihl)
> > {
> >       h->version = version & 0x0f;
> >       h->ihl = ihl & 0x0f;
> > }
> > meson configuration flags: --default-library=static --buildtype=release
> > Each compiler produced identical code for both functions.
> 
> For that particular case (2 bit-fields packed tightly into one byte) compilers
> usually perform quite well. At least I never saw issues for such case.
> Bit-fields that do cross byte boundaries - that might be a trouble.
> 

Can we keep both representations, the combined byte and the bit-fields,
grouped into a union? In that case the application or PMD can select the
access method that fits.
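
As a rough illustration of that idea, here is a minimal sketch of the first
header byte exposed both ways. It is not the patch itself: the type
demo_ipv4_byte0 and the two helpers are hypothetical names, the byte order
is selected with the GCC/Clang predefined __BYTE_ORDER__ macro instead of
RTE_BYTE_ORDER, and uint8_t bit-fields plus anonymous unions rely on
compiler behaviour that GCC and Clang provide.

```c
#include <stdint.h>

/* Hypothetical stand-in for the first byte of struct rte_ipv4_hdr. */
struct demo_ipv4_byte0 {
	__extension__
	union {
		uint8_t version_ihl;        /* combined byte */
		struct {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			uint8_t ihl:4;      /* low nibble */
			uint8_t version:4;  /* high nibble */
#else
			uint8_t version:4;
			uint8_t ihl:4;
#endif
		};
	};
};

/* Access through the combined byte, as existing code does. */
static inline uint8_t demo_ihl_from_byte(const struct demo_ipv4_byte0 *h)
{
	return h->version_ihl & 0x0f;
}

/* Access through the dedicated bit-field. */
static inline uint8_t demo_ihl_from_bits(const struct demo_ipv4_byte0 *h)
{
	return h->ihl;
}
```

Both helpers read the same storage, so code that masks version_ihl and code
that uses the named fields can coexist.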
 
> >
> >
> > > > > > I think this patch is an improvement, and that such structure
> > > > > > modifications should be generally accepted, so:
> > > > >
> > > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > > >


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-05-28 14:18  0%       ` Gregory Etelson
@ 2021-05-31  9:58  0%         ` Ananyev, Konstantin
  2021-05-31 11:10  0%           ` Gregory Etelson
  0 siblings, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2021-05-31  9:58 UTC (permalink / raw)
  To: Gregory Etelson, Morten Brørup, dev
  Cc: Matan Azrad, Ori Kam, Raslan Darawsheh, Iremonger, Bernard, Olivier Matz



> > > > > RTE IPv4 header definition combines the `version' and `ihl'
> > > > > fields into a single structure member.
> > > > > This patch introduces dedicated structure members for both
> > > > > `version' and `ihl' IPv4 fields. Separated header fields
> > > > > definitions allow to create simplified code to match on the
> > > > > IHL value in a flow rule.
> > > > > The original `version_ihl' structure member is kept for backward
> > > > > compatibility.
> > > > >
> > > > > Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> > > > > ---
> > > > >  app/test/test_flow_classify.c |  8 ++++----
> > > > >  lib/net/rte_ip.h              | 16 +++++++++++++++-
> > > > >  2 files changed, 19 insertions(+), 5 deletions(-)
> > > > >
> > > > > diff --git a/app/test/test_flow_classify.c
> > > > > b/app/test/test_flow_classify.c index 951606f248..4f64be5357
> > > > > 100644
> > > > > --- a/app/test/test_flow_classify.c
> > > > > +++ b/app/test/test_flow_classify.c
> > > > > @@ -95,7 +95,7 @@ static struct rte_acl_field_def
> > > > > ipv4_defs[NUM_FIELDS_IPV4] = {
> > > > >   *  dst mask 255.255.255.00 / udp src is 32 dst is 33 / end"
> > > > >   */
> > > > >  static struct rte_flow_item_ipv4 ipv4_udp_spec_1 = {
> > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > > >     RTE_IPV4(2, 2, 2, 3), RTE_IPV4(2, 2, 2, 7)}
> > > > >  };
> > > > >  static const struct rte_flow_item_ipv4 ipv4_mask_24 = {
> > > > > @@ -131,7 +131,7 @@ static struct rte_flow_item  end_item = { RTE_FLOW_ITEM_TYPE_END,
> > > > >   *  dst mask 255.255.255.00 / tcp src is 16 dst is 17 / end"
> > > > >   */
> > > > >  static struct rte_flow_item_ipv4 ipv4_tcp_spec_1 = {
> > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > > >     RTE_IPV4(1, 2, 3, 4), RTE_IPV4(5, 6, 7, 8)}
> > > > >  };
> > > > >
> > > > > @@ -150,8 +150,8 @@ static struct rte_flow_item  tcp_item_1 = {
> > > > > RTE_FLOW_ITEM_TYPE_TCP,
> > > > >   *  dst mask 255.255.255.00 / sctp src is 16 dst is 17/ end"
> > > > >   */
> > > > >  static struct rte_flow_item_ipv4 ipv4_sctp_spec_1 = {
> > > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0, RTE_IPV4(11, 12, 13, 14),
> > > > > - RTE_IPV4(15, 16, 17, 18)}
> > > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0,
> > > > > + RTE_IPV4(11, 12, 13, 14), RTE_IPV4(15, 16, 17, 18)}
> > > > >  };
> > > > >
> > > > >  static struct rte_flow_item_sctp sctp_spec_1 = {
> > > > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > > > > index 4b728969c1..684bb028b2 100644
> > > > > --- a/lib/net/rte_ip.h
> > > > > +++ b/lib/net/rte_ip.h
> > > > > @@ -38,7 +38,21 @@ extern "C" {
> > > > >   * IPv4 Header
> > > > >   */
> > > > >  struct rte_ipv4_hdr {
> > > > > - uint8_t  version_ihl;           /**< version and header length */
> > > > > + __extension__
> > > > > + union {
> > > > > +         uint8_t version_ihl;    /**< version and header length */
> > > > > +         struct {
> > > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > > +                 uint8_t ihl:4;
> > > > > +                 uint8_t version:4;
> > > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > > +                 uint8_t version:4;
> > > > > +                 uint8_t ihl:4;
> > > > > +#else
> > > > > +#error "setup endian definition"
> > > > > +#endif
> > > > > +         };
> > > > > + };
> > > > >   uint8_t  type_of_service;       /**< type of service */
> > > > >   rte_be16_t total_length;        /**< length of packet */
> > > > >   rte_be16_t packet_id;           /**< packet ID */
> > > > > --
> > > > > 2.31.1
> > > > >
> > > >
> > > > > This does not break the ABI, but it could be discussed if it breaks
> > > > > the API due to the required structure initialization changes shown in
> > > > > test_flow_classify.c.
> > >
> > > Yep, I guess it might be classified as API change.
> > > Another thing that concerns me - it is not the only place in IPv4
> > > header when we unite multiple bit-fields into one field:
> > > type_of_service, fragment_offset.
> > > If we start splitting ipv4 fields into actual bitfields, I suppose
> > > we'll end-up splitting these ones too.
> > > But I am not sure it will pay off - as compiler not always generates
> > > optimal code for reading/updating bitfields.
> > > Did you consider just adding extra macros to simplify access to these
> > > fields (like RTE_IPV4_HDR_(GET_SET)_*), instead?
> > >
> >
> > Let's please not introduce accessor macros for bitfields. If we don't
> > introduce bitfields like these, I would rather stick with the current _MASK,
> > _SHIFT and _FLAG defines.
> >
> > Yes, this change will lead to the introduction of more bitfields, both here
> > and in other places. We already accepted it in the eCPRI structure
> > (/lib/net/rte_ecpri.h), so why not just generally accept it.
> >
> > Are modern compilers really worse at handling a bitfield defined like this,
> > compared to handling a single uint8_t with hand coding? I consider your
> > concern very important, so I'm only asking if it is still relevant, to avoid
> > making decisions based on past experience that might be outdated. (I admit
> > to falling into that trap myself, once in a while.)
> >
> 
> I compared x86 code generated with gcc-9, gcc-10 and clang-10 for these 2 functions:
> void test_ipv4_hdr_byte(struct rte_ipv4_hdr *h, uint8_t version, uint8_t ihl)
> {
> 	h->version_ihl = ((version & 0x0f) << 4) | (ihl & 0x0f);
> }
> void test_ipv4_hdr_bits(struct rte_ipv4_hdr *h, uint8_t version, uint8_t ihl)
> {
> 	h->version = version & 0x0f;
> 	h->ihl = ihl & 0x0f;
> }
> meson configuration flags: --default-library=static --buildtype=release
> Each compiler produced identical code for both functions.

For that particular case (2 bit-fields packed tightly into one byte)
compilers usually perform quite well. At least I have never seen issues in
such a case. Bit-fields that cross byte boundaries might be trouble, though.
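
For a field that does cross a byte boundary, such as the 16-bit IPv4
flags/fragment-offset word (3 flag bits followed by a 13-bit offset in
8-byte units), the mask-and-shift style looks roughly like the sketch
below. The DEMO_* constants and demo_* helpers are hypothetical names
written for this example; they mirror the bit positions defined by the
IPv4 header format and are not taken from rte_ip.h.

```c
#include <stdint.h>

/* Bit positions within the 16-bit flags/fragment-offset word. */
#define DEMO_DF_FLAG     (1u << 14)          /* don't fragment */
#define DEMO_MF_FLAG     (1u << 13)          /* more fragments */
#define DEMO_OFFSET_MASK ((1u << 13) - 1)    /* 13-bit fragment offset */

/* Assemble the word from its two wire bytes (network byte order). */
static inline uint16_t demo_frag_field(const uint8_t wire[2])
{
	return (uint16_t)((wire[0] << 8) | wire[1]);
}

static inline int demo_dont_fragment(uint16_t field)
{
	return (field & DEMO_DF_FLAG) != 0;
}

static inline int demo_more_fragments(uint16_t field)
{
	return (field & DEMO_MF_FLAG) != 0;
}

/* The offset is stored in units of 8 bytes. */
static inline uint16_t demo_frag_offset_bytes(uint16_t field)
{
	return (uint16_t)((field & DEMO_OFFSET_MASK) * 8);
}
```

Because the word is assembled explicitly from the wire bytes, the result
does not depend on how a compiler lays out bit-fields across a byte
boundary.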

> 
> 
> > > > > I think this patch is an improvement, and that such structure
> > > > > modifications should be generally accepted, so:
> > > >
> > > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> > >


^ permalink raw reply	[relevance 0%]

* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-05-28 10:52  0%     ` Morten Brørup
@ 2021-05-28 14:18  0%       ` Gregory Etelson
  2021-05-31  9:58  0%         ` Ananyev, Konstantin
  0 siblings, 1 reply; 200+ results
From: Gregory Etelson @ 2021-05-28 14:18 UTC (permalink / raw)
  To: Morten Brørup, Ananyev, Konstantin, dev
  Cc: Matan Azrad, Ori Kam, Raslan Darawsheh, Iremonger, Bernard, Olivier Matz

> > > > RTE IPv4 header definition combines the `version' and `ihl'
> > > > fields into a single structure member.
> > > > This patch introduces dedicated structure members for both
> > > > `version' and `ihl' IPv4 fields. Separated header fields
> > > > definitions allow to create simplified code to match on the
> > > > IHL value in a flow rule.
> > > > The original `version_ihl' structure member is kept for backward
> > > > compatibility.
> > > >
> > > > Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> > > > ---
> > > >  app/test/test_flow_classify.c |  8 ++++----
> > > >  lib/net/rte_ip.h              | 16 +++++++++++++++-
> > > >  2 files changed, 19 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/app/test/test_flow_classify.c
> > > > b/app/test/test_flow_classify.c index 951606f248..4f64be5357
> > > > 100644
> > > > --- a/app/test/test_flow_classify.c
> > > > +++ b/app/test/test_flow_classify.c
> > > > @@ -95,7 +95,7 @@ static struct rte_acl_field_def
> > > > ipv4_defs[NUM_FIELDS_IPV4] = {
> > > >   *  dst mask 255.255.255.00 / udp src is 32 dst is 33 / end"
> > > >   */
> > > >  static struct rte_flow_item_ipv4 ipv4_udp_spec_1 = {
> > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > >     RTE_IPV4(2, 2, 2, 3), RTE_IPV4(2, 2, 2, 7)}
> > > >  };
> > > >  static const struct rte_flow_item_ipv4 ipv4_mask_24 = {
> > > > @@ -131,7 +131,7 @@ static struct rte_flow_item  end_item = { RTE_FLOW_ITEM_TYPE_END,
> > > >   *  dst mask 255.255.255.00 / tcp src is 16 dst is 17 / end"
> > > >   */
> > > >  static struct rte_flow_item_ipv4 ipv4_tcp_spec_1 = {
> > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > >     RTE_IPV4(1, 2, 3, 4), RTE_IPV4(5, 6, 7, 8)}
> > > >  };
> > > >
> > > > @@ -150,8 +150,8 @@ static struct rte_flow_item  tcp_item_1 = {
> > > > RTE_FLOW_ITEM_TYPE_TCP,
> > > >   *  dst mask 255.255.255.00 / sctp src is 16 dst is 17/ end"
> > > >   */
> > > >  static struct rte_flow_item_ipv4 ipv4_sctp_spec_1 = {
> > > > - { 0, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0, RTE_IPV4(11, 12, 13, 14),
> > > > - RTE_IPV4(15, 16, 17, 18)}
> > > > + { { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0,
> > > > + RTE_IPV4(11, 12, 13, 14), RTE_IPV4(15, 16, 17, 18)}
> > > >  };
> > > >
> > > >  static struct rte_flow_item_sctp sctp_spec_1 = {
> > > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > > > index 4b728969c1..684bb028b2 100644
> > > > --- a/lib/net/rte_ip.h
> > > > +++ b/lib/net/rte_ip.h
> > > > @@ -38,7 +38,21 @@ extern "C" {
> > > >   * IPv4 Header
> > > >   */
> > > >  struct rte_ipv4_hdr {
> > > > - uint8_t  version_ihl;           /**< version and header length */
> > > > + __extension__
> > > > + union {
> > > > +         uint8_t version_ihl;    /**< version and header length */
> > > > +         struct {
> > > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > > +                 uint8_t ihl:4;
> > > > +                 uint8_t version:4;
> > > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > > +                 uint8_t version:4;
> > > > +                 uint8_t ihl:4;
> > > > +#else
> > > > +#error "setup endian definition"
> > > > +#endif
> > > > +         };
> > > > + };
> > > >   uint8_t  type_of_service;       /**< type of service */
> > > >   rte_be16_t total_length;        /**< length of packet */
> > > >   rte_be16_t packet_id;           /**< packet ID */
> > > > --
> > > > 2.31.1
> > > >
> > >
> > > This does not break the ABI, but it could be discussed if it breaks
> > > the API due to the required structure initialization changes shown in
> > > test_flow_classify.c.
> >
> > Yep, I guess it might be classified as API change.
> > Another thing that concerns me - it is not the only place in IPv4
> > header when we unite multiple bit-fields into one field:
> > type_of_service, fragment_offset.
> > If we start splitting ipv4 fields into actual bitfields, I suppose
> > we'll end-up splitting these ones too.
> > But I am not sure it will pay off - as compiler not always generates
> > optimal code for reading/updating bitfields.
> > Did you consider just adding extra macros to simplify access to these
> > fields (like RTE_IPV4_HDR_(GET_SET)_*), instead?
> >
> 
> Let's please not introduce accessor macros for bitfields. If we don't
> introduce bitfields like these, I would rather stick with the current _MASK,
> _SHIFT and _FLAG defines.
> 
> Yes, this change will lead to the introduction of more bitfields, both here
> and in other places. We already accepted it in the eCPRI structure
> (/lib/net/rte_ecpri.h), so why not just generally accept it.
> 
> Are modern compilers really worse at handling a bitfield defined like this,
> compared to handling a single uint8_t with hand coding? I consider your
> concern very important, so I'm only asking if it is still relevant, to avoid
> making decisions based on past experience that might be outdated. (I admit
> to falling into that trap myself, once in a while.)
> 

I compared x86 code generated with gcc-9, gcc-10 and clang-10 for these 2 functions:
void test_ipv4_hdr_byte(struct rte_ipv4_hdr *h, uint8_t version, uint8_t ihl)
{
	h->version_ihl = ((version & 0x0f) << 4) | (ihl & 0x0f);
}
void test_ipv4_hdr_bits(struct rte_ipv4_hdr *h, uint8_t version, uint8_t ihl)
{
	h->version = version & 0x0f;
	h->ihl = ihl & 0x0f;
}
meson configuration flags: --default-library=static --buildtype=release
Each compiler produced identical code for both functions. 
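
For readers who want to reproduce the comparison outside DPDK, the following is a minimal standalone sketch. It assumes GCC or Clang (for `__extension__`, `__BYTE_ORDER__` and `uint8_t` bit-fields, all of which are implementation-specific). `struct demo_ipv4_byte0` and the `demo_*` functions are hypothetical stand-ins for `struct rte_ipv4_hdr` and the two test functions above, not DPDK code.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the first byte of struct rte_ipv4_hdr. */
struct demo_ipv4_byte0 {
	__extension__
	union {
		uint8_t version_ihl;    /* version and header length */
		struct {
#if __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__
			uint8_t ihl:4;
			uint8_t version:4;
#elif __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
			uint8_t version:4;
			uint8_t ihl:4;
#else
#error "unknown byte order"
#endif
		};
	};
};

/* Write both nibbles through the combined byte (shift and mask). */
static inline void
demo_set_byte(struct demo_ipv4_byte0 *h, uint8_t version, uint8_t ihl)
{
	h->version_ihl = (uint8_t)(((version & 0x0f) << 4) | (ihl & 0x0f));
}

/* Write both nibbles through the dedicated bit-fields. */
static inline void
demo_set_bits(struct demo_ipv4_byte0 *h, uint8_t version, uint8_t ihl)
{
	h->version = version & 0x0f;
	h->ihl = ihl & 0x0f;
}

/* Both writers are expected to leave the same byte in memory. */
static inline void
demo_bitfield_selfcheck(void)
{
	struct demo_ipv4_byte0 a = { .version_ihl = 0 };
	struct demo_ipv4_byte0 b = { .version_ihl = 0 };

	demo_set_byte(&a, 4, 5);
	demo_set_bits(&b, 4, 5);
	assert(a.version_ihl == b.version_ihl);
}
```

Calling `demo_bitfield_selfcheck()` from a test program checks only that the two writers agree on the stored byte. Whether they also compile to identical instructions depends on the compiler and flags, so the generated assembly has to be inspected separately.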
 

> > > I think this patch is an improvement, and that such structure
> > > modifications should be generally accepted, so:
> > >
> > > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> >



* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-05-28 10:20  0%   ` Ananyev, Konstantin
@ 2021-05-28 10:52  0%     ` Morten Brørup
  2021-05-28 14:18  0%       ` Gregory Etelson
  0 siblings, 1 reply; 200+ results
From: Morten Brørup @ 2021-05-28 10:52 UTC (permalink / raw)
  To: Ananyev, Konstantin, Gregory Etelson, dev
  Cc: matan, orika, rasland, Iremonger, Bernard, Olivier Matz

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev,
> Konstantin
> Sent: Friday, 28 May 2021 12.21
> 
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gregory Etelson
> > > Sent: Thursday, 27 May 2021 17.29
> > >
> > > RTE IPv4 header definition combines the `version' and `ihl'  fields
> > > into a single structure member.
> > > This patch introduces dedicated structure members for both `version'
> > > and `ihl' IPv4 fields. Separated header fields definitions allow to
> > > create simplified code to match on the IHL value in a flow rule.
> > > The original `version_ihl' structure member is kept for backward
> > > compatibility.
> > >
> > > Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> > > ---
> > >  app/test/test_flow_classify.c |  8 ++++----
> > >  lib/net/rte_ip.h              | 16 +++++++++++++++-
> > >  2 files changed, 19 insertions(+), 5 deletions(-)
> > >
> > > diff --git a/app/test/test_flow_classify.c
> > > b/app/test/test_flow_classify.c
> > > index 951606f248..4f64be5357 100644
> > > --- a/app/test/test_flow_classify.c
> > > +++ b/app/test/test_flow_classify.c
> > > @@ -95,7 +95,7 @@ static struct rte_acl_field_def
> > > ipv4_defs[NUM_FIELDS_IPV4] = {
> > >   *  dst mask 255.255.255.00 / udp src is 32 dst is 33 / end"
> > >   */
> > >  static struct rte_flow_item_ipv4 ipv4_udp_spec_1 = {
> > > -	{ 0, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > > +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > >  	  RTE_IPV4(2, 2, 2, 3), RTE_IPV4(2, 2, 2, 7)}
> > >  };
> > >  static const struct rte_flow_item_ipv4 ipv4_mask_24 = {
> > > @@ -131,7 +131,7 @@ static struct rte_flow_item  end_item = {
> > > RTE_FLOW_ITEM_TYPE_END,
> > >   *  dst mask 255.255.255.00 / tcp src is 16 dst is 17 / end"
> > >   */
> > >  static struct rte_flow_item_ipv4 ipv4_tcp_spec_1 = {
> > > -	{ 0, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > > +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > >  	  RTE_IPV4(1, 2, 3, 4), RTE_IPV4(5, 6, 7, 8)}
> > >  };
> > >
> > > @@ -150,8 +150,8 @@ static struct rte_flow_item  tcp_item_1 = {
> > > RTE_FLOW_ITEM_TYPE_TCP,
> > >   *  dst mask 255.255.255.00 / sctp src is 16 dst is 17/ end"
> > >   */
> > >  static struct rte_flow_item_ipv4 ipv4_sctp_spec_1 = {
> > > -	{ 0, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0, RTE_IPV4(11, 12, 13, 14),
> > > -	RTE_IPV4(15, 16, 17, 18)}
> > > +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0,
> > > +	RTE_IPV4(11, 12, 13, 14), RTE_IPV4(15, 16, 17, 18)}
> > >  };
> > >
> > >  static struct rte_flow_item_sctp sctp_spec_1 = {
> > > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > > index 4b728969c1..684bb028b2 100644
> > > --- a/lib/net/rte_ip.h
> > > +++ b/lib/net/rte_ip.h
> > > @@ -38,7 +38,21 @@ extern "C" {
> > >   * IPv4 Header
> > >   */
> > >  struct rte_ipv4_hdr {
> > > -	uint8_t  version_ihl;		/**< version and header length */
> > > +	__extension__
> > > +	union {
> > > +		uint8_t version_ihl;    /**< version and header length */
> > > +		struct {
> > > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > > +			uint8_t ihl:4;
> > > +			uint8_t version:4;
> > > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > > +			uint8_t version:4;
> > > +			uint8_t ihl:4;
> > > +#else
> > > +#error "setup endian definition"
> > > +#endif
> > > +		};
> > > +	};
> > >  	uint8_t  type_of_service;	/**< type of service */
> > >  	rte_be16_t total_length;	/**< length of packet */
> > >  	rte_be16_t packet_id;		/**< packet ID */
> > > --
> > > 2.31.1
> > >
> >
> > This does not break the ABI, but it could be discussed if it breaks
> > the API due to the required structure initialization changes shown in
> > test_flow_classify.c.
> 
> Yep, I guess it might be classified as API change.
> Another thing that concerns me - it is not the only place in IPv4
> header when we unite multiple bit-fields into one field:
> type_of_service, fragment_offset.
> If we start splitting ipv4 fields into actual bitfields, I suppose
> we'll end-up splitting these ones too.
> But I am not sure it will pay off - as compiler not always generates
> optimal code for reading/updating bitfields.
> Did you consider just adding extra macros to simplify access to these
> fields (like RTE_IPV4_HDR_(GET_SET)_*),
> instead?
> 

Let's please not introduce accessor macros for bitfields. If we don't introduce bitfields like these, I would rather stick with the current _MASK, _SHIFT and _FLAG defines.

Yes, this change will lead to the introduction of more bitfields, both here and in other places. We already accepted it in the eCPRI structure (/lib/net/rte_ecpri.h), so why not just generally accept it.

Are modern compilers really worse at handling a bitfield defined like this, compared to handling a single uint8_t with hand coding? I consider your concern very important, so I'm only asking if it is still relevant, to avoid making decisions based on past experience that might be outdated. (I admit to falling into that trap myself, once in a while.)


> > I think this patch is an improvement, and that such structure
> > modifications should be generally accepted, so:
> >
> > Acked-by: Morten Brørup <mb@smartsharesystems.com>
> 



* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  2021-05-27 15:56  3% ` Morten Brørup
@ 2021-05-28 10:20  0%   ` Ananyev, Konstantin
  2021-05-28 10:52  0%     ` Morten Brørup
  2021-06-03  0:58  4%   ` Min Hu (Connor)
  1 sibling, 1 reply; 200+ results
From: Ananyev, Konstantin @ 2021-05-28 10:20 UTC (permalink / raw)
  To: Morten Brørup, Gregory Etelson, dev
  Cc: matan, orika, rasland, Iremonger, Bernard, Olivier Matz


 
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gregory Etelson
> > Sent: Thursday, 27 May 2021 17.29
> >
> > RTE IPv4 header definition combines the `version' and `ihl'  fields
> > into a single structure member.
> > This patch introduces dedicated structure members for both `version'
> > and `ihl' IPv4 fields. Separated header fields definitions allow to
> > create simplified code to match on the IHL value in a flow rule.
> > The original `version_ihl' structure member is kept for backward
> > compatibility.
> >
> > Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> > ---
> >  app/test/test_flow_classify.c |  8 ++++----
> >  lib/net/rte_ip.h              | 16 +++++++++++++++-
> >  2 files changed, 19 insertions(+), 5 deletions(-)
> >
> > diff --git a/app/test/test_flow_classify.c
> > b/app/test/test_flow_classify.c
> > index 951606f248..4f64be5357 100644
> > --- a/app/test/test_flow_classify.c
> > +++ b/app/test/test_flow_classify.c
> > @@ -95,7 +95,7 @@ static struct rte_acl_field_def
> > ipv4_defs[NUM_FIELDS_IPV4] = {
> >   *  dst mask 255.255.255.00 / udp src is 32 dst is 33 / end"
> >   */
> >  static struct rte_flow_item_ipv4 ipv4_udp_spec_1 = {
> > -	{ 0, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> > +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> >  	  RTE_IPV4(2, 2, 2, 3), RTE_IPV4(2, 2, 2, 7)}
> >  };
> >  static const struct rte_flow_item_ipv4 ipv4_mask_24 = {
> > @@ -131,7 +131,7 @@ static struct rte_flow_item  end_item = {
> > RTE_FLOW_ITEM_TYPE_END,
> >   *  dst mask 255.255.255.00 / tcp src is 16 dst is 17 / end"
> >   */
> >  static struct rte_flow_item_ipv4 ipv4_tcp_spec_1 = {
> > -	{ 0, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> > +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> >  	  RTE_IPV4(1, 2, 3, 4), RTE_IPV4(5, 6, 7, 8)}
> >  };
> >
> > @@ -150,8 +150,8 @@ static struct rte_flow_item  tcp_item_1 = {
> > RTE_FLOW_ITEM_TYPE_TCP,
> >   *  dst mask 255.255.255.00 / sctp src is 16 dst is 17/ end"
> >   */
> >  static struct rte_flow_item_ipv4 ipv4_sctp_spec_1 = {
> > -	{ 0, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0, RTE_IPV4(11, 12, 13, 14),
> > -	RTE_IPV4(15, 16, 17, 18)}
> > +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0,
> > +	RTE_IPV4(11, 12, 13, 14), RTE_IPV4(15, 16, 17, 18)}
> >  };
> >
> >  static struct rte_flow_item_sctp sctp_spec_1 = {
> > diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> > index 4b728969c1..684bb028b2 100644
> > --- a/lib/net/rte_ip.h
> > +++ b/lib/net/rte_ip.h
> > @@ -38,7 +38,21 @@ extern "C" {
> >   * IPv4 Header
> >   */
> >  struct rte_ipv4_hdr {
> > -	uint8_t  version_ihl;		/**< version and header length */
> > +	__extension__
> > +	union {
> > +		uint8_t version_ihl;    /**< version and header length */
> > +		struct {
> > +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> > +			uint8_t ihl:4;
> > +			uint8_t version:4;
> > +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> > +			uint8_t version:4;
> > +			uint8_t ihl:4;
> > +#else
> > +#error "setup endian definition"
> > +#endif
> > +		};
> > +	};
> >  	uint8_t  type_of_service;	/**< type of service */
> >  	rte_be16_t total_length;	/**< length of packet */
> >  	rte_be16_t packet_id;		/**< packet ID */
> > --
> > 2.31.1
> >
> 
> This does not break the ABI, but it could be discussed if it breaks the API due to the required structure initialization changes shown in
> test_flow_classify.c.

Yep, I guess it might be classified as an API change.
Another thing that concerns me: this is not the only place in the IPv4 header where we unite multiple bit-fields into one field; type_of_service and fragment_offset do the same.
If we start splitting IPv4 fields into actual bitfields, I suppose we'll end up splitting these too.
But I am not sure it will pay off, as the compiler does not always generate optimal code for reading/updating bitfields.
Did you consider just adding extra macros to simplify access to these fields (like RTE_IPV4_HDR_(GET_SET)_*) instead?
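
To make the accessor alternative concrete, here is a rough sketch of what such helpers might look like. The `DEMO_*` and `demo_*` names are hypothetical and are not part of rte_ip.h. They are written as inline functions rather than macros purely for type safety, and they operate on a plain `uint8_t` so that the sketch stays independent of DPDK headers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constants for the combined version/IHL byte. */
#define DEMO_IPV4_IHL_MASK      0x0f
#define DEMO_IPV4_VERSION_SHIFT 4

/* Extract the version (high nibble). */
static inline uint8_t
demo_ipv4_get_version(uint8_t version_ihl)
{
	return (uint8_t)(version_ihl >> DEMO_IPV4_VERSION_SHIFT);
}

/* Extract the header length in 32-bit words (low nibble). */
static inline uint8_t
demo_ipv4_get_ihl(uint8_t version_ihl)
{
	return (uint8_t)(version_ihl & DEMO_IPV4_IHL_MASK);
}

/* Return the byte with the version replaced and the IHL preserved. */
static inline uint8_t
demo_ipv4_set_version(uint8_t version_ihl, uint8_t version)
{
	return (uint8_t)((version_ihl & DEMO_IPV4_IHL_MASK) |
		((version & 0x0f) << DEMO_IPV4_VERSION_SHIFT));
}

/* Return the byte with the IHL replaced and the version preserved. */
static inline uint8_t
demo_ipv4_set_ihl(uint8_t version_ihl, uint8_t ihl)
{
	return (uint8_t)((version_ihl & ~DEMO_IPV4_IHL_MASK) |
		(ihl & DEMO_IPV4_IHL_MASK));
}

/* Round trip on 0x45, the first byte of an option-less IPv4 header. */
static inline void
demo_accessor_selfcheck(void)
{
	uint8_t b = 0x45;

	assert(demo_ipv4_get_version(b) == 4);
	assert(demo_ipv4_get_ihl(b) == 5);
	assert(demo_ipv4_set_ihl(b, 6) == 0x46);
}
```

With helpers of this kind the structure layout stays untouched, at the cost of every reader and writer going through an explicit call instead of a member access.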

> I think this patch is an improvement, and that such structure modifications should be generally accepted, so:
> 
> Acked-by: Morten Brørup <mb@smartsharesystems.com>



* Re: [dpdk-dev] [PATCH] net: introduce IPv4 ihl and version fields
  @ 2021-05-27 15:56  3% ` Morten Brørup
  2021-05-28 10:20  0%   ` Ananyev, Konstantin
  2021-06-03  0:58  4%   ` Min Hu (Connor)
  2021-06-17 15:02  3% ` Tyler Retzlaff
  1 sibling, 2 replies; 200+ results
From: Morten Brørup @ 2021-05-27 15:56 UTC (permalink / raw)
  To: Gregory Etelson, dev
  Cc: matan, orika, rasland, Bernard Iremonger, Olivier Matz

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Gregory Etelson
> Sent: Thursday, 27 May 2021 17.29
> 
> RTE IPv4 header definition combines the `version' and `ihl'  fields
> into a single structure member.
> This patch introduces dedicated structure members for both `version'
> and `ihl' IPv4 fields. Separated header fields definitions allow to
> create simplified code to match on the IHL value in a flow rule.
> The original `version_ihl' structure member is kept for backward
> compatibility.
> 
> Signed-off-by: Gregory Etelson <getelson@nvidia.com>
> ---
>  app/test/test_flow_classify.c |  8 ++++----
>  lib/net/rte_ip.h              | 16 +++++++++++++++-
>  2 files changed, 19 insertions(+), 5 deletions(-)
> 
> diff --git a/app/test/test_flow_classify.c
> b/app/test/test_flow_classify.c
> index 951606f248..4f64be5357 100644
> --- a/app/test/test_flow_classify.c
> +++ b/app/test/test_flow_classify.c
> @@ -95,7 +95,7 @@ static struct rte_acl_field_def
> ipv4_defs[NUM_FIELDS_IPV4] = {
>   *  dst mask 255.255.255.00 / udp src is 32 dst is 33 / end"
>   */
>  static struct rte_flow_item_ipv4 ipv4_udp_spec_1 = {
> -	{ 0, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
> +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_UDP, 0,
>  	  RTE_IPV4(2, 2, 2, 3), RTE_IPV4(2, 2, 2, 7)}
>  };
>  static const struct rte_flow_item_ipv4 ipv4_mask_24 = {
> @@ -131,7 +131,7 @@ static struct rte_flow_item  end_item = {
> RTE_FLOW_ITEM_TYPE_END,
>   *  dst mask 255.255.255.00 / tcp src is 16 dst is 17 / end"
>   */
>  static struct rte_flow_item_ipv4 ipv4_tcp_spec_1 = {
> -	{ 0, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
> +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_TCP, 0,
>  	  RTE_IPV4(1, 2, 3, 4), RTE_IPV4(5, 6, 7, 8)}
>  };
> 
> @@ -150,8 +150,8 @@ static struct rte_flow_item  tcp_item_1 = {
> RTE_FLOW_ITEM_TYPE_TCP,
>   *  dst mask 255.255.255.00 / sctp src is 16 dst is 17/ end"
>   */
>  static struct rte_flow_item_ipv4 ipv4_sctp_spec_1 = {
> -	{ 0, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0, RTE_IPV4(11, 12, 13, 14),
> -	RTE_IPV4(15, 16, 17, 18)}
> +	{ { .version_ihl = 0}, 0, 0, 0, 0, 0, IPPROTO_SCTP, 0,
> +	RTE_IPV4(11, 12, 13, 14), RTE_IPV4(15, 16, 17, 18)}
>  };
> 
>  static struct rte_flow_item_sctp sctp_spec_1 = {
> diff --git a/lib/net/rte_ip.h b/lib/net/rte_ip.h
> index 4b728969c1..684bb028b2 100644
> --- a/lib/net/rte_ip.h
> +++ b/lib/net/rte_ip.h
> @@ -38,7 +38,21 @@ extern "C" {
>   * IPv4 Header
>   */
>  struct rte_ipv4_hdr {
> -	uint8_t  version_ihl;		/**< version and header length */
> +	__extension__
> +	union {
> +		uint8_t version_ihl;    /**< version and header length */
> +		struct {
> +#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
> +			uint8_t ihl:4;
> +			uint8_t version:4;
> +#elif RTE_BYTE_ORDER == RTE_BIG_ENDIAN
> +			uint8_t version:4;
> +			uint8_t ihl:4;
> +#else
> +#error "setup endian definition"
> +#endif
> +		};
> +	};
>  	uint8_t  type_of_service;	/**< type of service */
>  	rte_be16_t total_length;	/**< length of packet */
>  	rte_be16_t packet_id;		/**< packet ID */
> --
> 2.31.1
> 

This does not break the ABI, but it could be discussed if it breaks the API due to the required structure initialization changes shown in test_flow_classify.c. I think this patch is an improvement, and that such structure modifications should be generally accepted, so:
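
The initialization point can be illustrated with a small standalone sketch, assuming a C11 compiler or GCC/Clang with anonymous unions. `struct demo_hdr` is a hypothetical three-byte cut-down of the header, not the real `struct rte_ipv4_hdr`. It shows the braced positional form used in the patched test next to a designated form that names the members and therefore does not depend on how the first byte is wrapped.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical cut-down header: first byte wrapped in an anonymous union. */
struct demo_hdr {
	__extension__
	union {
		uint8_t version_ihl;
		struct {
			uint8_t nibble_a:4;   /* placeholder bit-fields */
			uint8_t nibble_b:4;
		};
	};
	uint8_t type_of_service;
	uint8_t next_proto_id;
};

/* Positional form in the style of the patched test: the union gets braces. */
static const struct demo_hdr demo_positional = {
	{ .version_ihl = 0x45 }, 0, 17
};

/* Designated form: members are named, so the nesting does not matter. */
static const struct demo_hdr demo_designated = {
	.version_ihl = 0x45,
	.next_proto_id = 17,
};

/* Both initializers are expected to describe the same header bytes. */
static inline void
demo_init_selfcheck(void)
{
	assert(demo_positional.version_ihl == demo_designated.version_ihl);
	assert(demo_positional.type_of_service ==
		demo_designated.type_of_service);
	assert(demo_positional.next_proto_id == demo_designated.next_proto_id);
}
```

Applications that already initialize the header by member name would presumably be unaffected by the change; the positional form is the one that needs the extra braces, as the test_flow_classify.c hunk shows.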

Acked-by: Morten Brørup <mb@smartsharesystems.com>



2021-07-02  8:00  8% [dpdk-dev] ABI/API stability towards drivers Morten Brørup
2021-07-02  9:45  7% ` [dpdk-dev] [dpdk-techboard] " Ferruh Yigit
2021-07-02 12:26  4% ` Thomas Monjalon
2021-07-07 18:46  8% ` [dpdk-dev] " Tyler Retzlaff
2021-07-02 13:18     [dpdk-dev] [PATCH] dmadev: introduce DMA device library Chengwen Feng
2021-07-04  9:30  3% ` Jerin Jacob
2021-07-05 10:52  0%   ` Bruce Richardson
2021-07-05 15:55  0%     ` Jerin Jacob
2021-07-05 17:16  0%       ` Bruce Richardson
2021-07-07  8:08  0%         ` Jerin Jacob
2021-07-02 15:23     [dpdk-dev] [PATCH 21.11] telemetry: remove experimental tags from APIs Bruce Richardson
2021-07-05 10:09     ` Power, Ciara
2021-07-05 10:58  3%   ` Bruce Richardson
2021-07-07 12:37  1% [dpdk-dev] [dpdk-announce] DPDK 20.11.2 released Xueming(Steven) Li
2021-07-07 19:30     [dpdk-dev] [pull-request] next-crypto 21.08 rc1 Akhil Goyal
2021-07-07 21:57  5% ` Thomas Monjalon
2021-07-08  7:39  0%   ` [dpdk-dev] [EXT] " Akhil Goyal
2021-07-08  7:41  0%   ` [dpdk-dev] " Thomas Monjalon
2021-07-08  7:47  3%     ` David Marchand
2021-07-08  7:48  0%     ` [dpdk-dev] [EXT] " Akhil Goyal
