DPDK patches and discussions
* [dpdk-dev] [PATCH v1 0/2] virtio: restrict pointer aliasing for loops vectorization
@ 2020-06-11  3:32 Joyce Kong
  2020-06-11  3:32 ` [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
                   ` (3 more replies)
  0 siblings, 4 replies; 40+ messages in thread
From: Joyce Kong @ 2020-06-11  3:32 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev

This series optimizes virtio performance by restricting pointer
aliasing, which allows the compiler to vectorize loops more
aggressively.

The patches were benchmarked by running the PVP case on a ThunderX2
platform and showed positive performance results.
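
As a rough illustration of why restricting aliasing helps, here is a
minimal sketch. The add_offset() helper is hypothetical and is not taken
from the virtio code; it only shows the kind of loop where the
qualifier lets the compiler drop its overlap assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of DPDK: adds a constant to every
 * element of src and stores the result in dst.
 *
 * __restrict promises the compiler that dst and src never overlap, so
 * it does not have to assume that a store to dst[i] could change a
 * later src[j]. That leaves it free to vectorize the loop.
 */
static void
add_offset(uint32_t *__restrict dst, const uint32_t *__restrict src,
	   uint32_t offset, size_t n)
{
	size_t i;

	/*
	 * Cheap sanity check only: it catches identical buffers, not
	 * partial overlap. Keeping the promise is the caller's job.
	 */
	assert(dst != src || n == 0);

	for (i = 0; i < n; i++)
		dst[i] = src[i] + offset;
}
```

Whether a given compiler actually vectorizes such a loop depends on the
target and optimization level; the qualifier only removes one obstacle.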

Joyce Kong (2):
  net/virtio: restrict pointer aliasing for NEON vpmd
  lib/vhost: restrict pointer aliasing for packed path

 drivers/net/virtio/virtio_rxtx_simple_neon.c |  4 ++--
 lib/librte_vhost/virtio_net.c                | 14 +++++++-------
 2 files changed, 9 insertions(+), 9 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd
  2020-06-11  3:32 [dpdk-dev] [PATCH v1 0/2] virtio: restrict pointer aliasing for loops vectorization Joyce Kong
@ 2020-06-11  3:32 ` Joyce Kong
  2020-06-23  8:47   ` Maxime Coquelin
  2020-06-11  3:32 ` [dpdk-dev] [PATCH v1 2/2] lib/vhost: restrict pointer aliasing for packed path Joyce Kong
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 40+ messages in thread
From: Joyce Kong @ 2020-06-11  3:32 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev

Restrict pointer aliasing to allow the compiler to vectorize loops
more aggressively.

With this patch, a 9.6% throughput improvement is observed for the
virtio-net PVP case, and a 2.4% throughput improvement for the
virtio-user PVP case. All performance data are measured under
0.001% acceptable packet loss with 2 cores on the vhost side.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 drivers/net/virtio/virtio_rxtx_simple_neon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c b/drivers/net/virtio/virtio_rxtx_simple_neon.c
index 363e2b330..c08dd51fb 100644
--- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
+++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
@@ -36,8 +36,8 @@
  * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
  */
 uint16_t
-virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
-	uint16_t nb_pkts)
+virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
+		**__restrict rx_pkts, uint16_t nb_pkts)
 {
 	struct virtnet_rx *rxvq = rx_queue;
 	struct virtqueue *vq = rxvq->vq;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v1 2/2] lib/vhost: restrict pointer aliasing for packed path
  2020-06-11  3:32 [dpdk-dev] [PATCH v1 0/2] virtio: restrict pointer aliasing for loops vectorization Joyce Kong
  2020-06-11  3:32 ` [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
@ 2020-06-11  3:32 ` Joyce Kong
  2020-07-07 16:25   ` Adrian Moreno
  2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
  2020-07-10  2:38 ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper Joyce Kong
  3 siblings, 1 reply; 40+ messages in thread
From: Joyce Kong @ 2020-06-11  3:32 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev

Restrict pointer aliasing to allow the compiler to vectorize
loops more aggressively.

With this patch, a 9.6% improvement is observed in throughput for the
packed virtio-net PVP case, and a 2.8% improvement in throughput for
the packed virtio-user PVP case. All performance data are measured
under 0.001% acceptable packet loss with 1 core on each of the vhost
and virtio sides.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 lib/librte_vhost/virtio_net.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 751c1f373..39c92e7e1 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1133,8 +1133,8 @@ virtio_dev_rx_single_packed(struct virtio_net *dev,
 
 static __rte_noinline uint32_t
 virtio_dev_rx_packed(struct virtio_net *dev,
-		     struct vhost_virtqueue *vq,
-		     struct rte_mbuf **pkts,
+		     struct vhost_virtqueue *__restrict vq,
+		     struct rte_mbuf **__restrict pkts,
 		     uint32_t count)
 {
 	uint32_t pkt_idx = 0;
@@ -1219,7 +1219,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 
 uint16_t
 rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
-	struct rte_mbuf **pkts, uint16_t count)
+	struct rte_mbuf **__restrict pkts, uint16_t count)
 {
 	struct virtio_net *dev = get_device(vid);
 
@@ -2124,9 +2124,9 @@ free_zmbuf(struct vhost_virtqueue *vq)
 
 static __rte_noinline uint16_t
 virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
-			   struct vhost_virtqueue *vq,
+			   struct vhost_virtqueue *__restrict vq,
 			   struct rte_mempool *mbuf_pool,
-			   struct rte_mbuf **pkts,
+			   struct rte_mbuf **__restrict pkts,
 			   uint32_t count)
 {
 	uint32_t pkt_idx = 0;
@@ -2160,9 +2160,9 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
 
 static __rte_noinline uint16_t
 virtio_dev_tx_packed(struct virtio_net *dev,
-		     struct vhost_virtqueue *vq,
+		     struct vhost_virtqueue *__restrict vq,
 		     struct rte_mempool *mbuf_pool,
-		     struct rte_mbuf **pkts,
+		     struct rte_mbuf **__restrict pkts,
 		     uint32_t count)
 {
 	uint32_t pkt_idx = 0;
-- 
2.27.0
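
Because this patch also qualifies a public entry point
(rte_vhost_enqueue_burst), it is worth noting that a restrict
qualifier is a promise made by the caller: passing overlapping buffers
is undefined behavior. The sketch below is an assumption-laden
illustration, not DPDK code. buffers_overlap() and copy_ptrs() are
hypothetical names showing how a debug build could check the promise.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical debug helper: returns true when the byte ranges
 * [a, a + a_len) and [b, b + b_len) share at least one byte.
 * Lengths are assumed to be non-zero.
 */
static bool
buffers_overlap(const void *a, size_t a_len, const void *b, size_t b_len)
{
	uintptr_t a_start = (uintptr_t)a;
	uintptr_t b_start = (uintptr_t)b;

	return a_start < b_start + b_len && b_start < a_start + a_len;
}

/*
 * Hypothetical burst copy of n pointers. The __restrict qualifiers
 * tell the compiler the two arrays are distinct; the assert checks
 * that the caller kept that promise (in builds without NDEBUG).
 */
static void
copy_ptrs(void **__restrict dst, void *const *__restrict src, size_t n)
{
	size_t i;

	assert(!buffers_overlap(dst, n * sizeof(*dst),
				src, n * sizeof(*src)));

	for (i = 0; i < n; i++)
		dst[i] = src[i];
}
```

The check compiles away under NDEBUG, so it would not affect the fast
path in release builds.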


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd
  2020-06-11  3:32 ` [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
@ 2020-06-23  8:47   ` Maxime Coquelin
  2020-06-23  9:05     ` Phil Yang
  0 siblings, 1 reply; 40+ messages in thread
From: Maxime Coquelin @ 2020-06-23  8:47 UTC (permalink / raw)
  To: Joyce Kong, jerinj, zhihong.wang, xiaolong.ye,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev



On 6/11/20 5:32 AM, Joyce Kong wrote:
> Restrict pointer aliasing to allow the compiler to vectorize loops
> more aggressively.
> 
> With this patch, a 9.6% improvement is observed in throughput for
> the virtio-net PVP case, and a 2.4% perf improvement in throughput
> for the virtio-user PVP case. All performance data are measured
> under the 0.001% acceptable packet loss with 2 cores on the vhost
> side.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
> ---
>  drivers/net/virtio/virtio_rxtx_simple_neon.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)

Very nice, we should consider doing the same on other platforms.

> diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> index 363e2b330..c08dd51fb 100644
> --- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
> +++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> @@ -36,8 +36,8 @@
>   * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
>   */
>  uint16_t
> -virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> -	uint16_t nb_pkts)
> +virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
> +		**__restrict rx_pkts, uint16_t nb_pkts)

Is __restrict supported by all the compilers?
Wouldn't it be better to introduce a wrapper?

>  {
>  	struct virtnet_rx *rxvq = rx_queue;
>  	struct virtqueue *vq = rxvq->vq;
> 


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd
  2020-06-23  8:47   ` Maxime Coquelin
@ 2020-06-23  9:05     ` Phil Yang
  2020-06-24  2:58       ` Joyce Kong
  0 siblings, 1 reply; 40+ messages in thread
From: Phil Yang @ 2020-06-23  9:05 UTC (permalink / raw)
  To: Maxime Coquelin, Joyce Kong, jerinj, zhihong.wang, xiaolong.ye,
	Honnappa Nagarahalli, Ruifeng Wang
  Cc: dev

> -----Original Message-----
> From: Maxime Coquelin <maxime.coquelin@redhat.com>
> Sent: Tuesday, June 23, 2020 4:48 PM
> To: Joyce Kong <Joyce.Kong@arm.com>; jerinj@marvell.com;
> zhihong.wang@intel.com; xiaolong.ye@intel.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org
> Subject: Re: [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON
> vpmd
>
>
>
> On 6/11/20 5:32 AM, Joyce Kong wrote:
> > Restrict pointer aliasing to allow the compiler to vectorize loops
> > more aggressively.
> >
> > With this patch, a 9.6% improvement is observed in throughput for
> > the virtio-net PVP case, and a 2.4% perf improvement in throughput
> > for the virtio-user PVP case. All performance data are measured
> > under the 0.001% acceptable packet loss with 2 cores on the vhost
> > side.
> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > Reviewed-by: Phil Yang <phil.yang@arm.com>
> > ---
> >  drivers/net/virtio/virtio_rxtx_simple_neon.c | 4 ++--
> >  1 file changed, 2 insertions(+), 2 deletions(-)
>
> Very nice, we should consider doing the same on other platforms.
>
> > diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c
> b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> > index 363e2b330..c08dd51fb 100644
> > --- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
> > +++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> > @@ -36,8 +36,8 @@
> >   * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
> >   */
> >  uint16_t
> > -virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> > -uint16_t nb_pkts)
> > +virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
> > +**__restrict rx_pkts, uint16_t nb_pkts)
>
> Is __restrict supported by all the compilers?
> Wouldn't it be better to introduce a wrapper?

+1 for this.
In my understanding, the __restrict keyword is recognized in C at all
language levels, whereas the restrict keyword is recognized only when
compiling as C99 or later.
DPDK uses the restrict qualifier a lot, which might cause issues with
some old compilers, so the wrapper will be useful.
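
A wrapper of the kind discussed here might look like the sketch below.
MY_RESTRICT and copy_bytes() are hypothetical names chosen for
illustration only. The defined() guard is an extra precaution of this
sketch, so that pre-C99 modes, where __STDC_VERSION__ may not be defined,
do not trip -Wundef.

```c
#include <assert.h>
#include <stddef.h>

/*
 * MY_RESTRICT is a hypothetical name used only for this sketch.
 * C99 and later define __STDC_VERSION__ >= 199901L and accept the
 * standard 'restrict' keyword. Otherwise fall back to '__restrict',
 * which GCC and Clang accept as an extension in every language mode.
 */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define MY_RESTRICT restrict
#else
#define MY_RESTRICT __restrict
#endif

/* Copies n bytes; the qualifier promises dst and src do not overlap. */
static void
copy_bytes(unsigned char *MY_RESTRICT dst,
	   const unsigned char *MY_RESTRICT src, size_t n)
{
	size_t i;

	assert(dst != NULL && src != NULL);

	for (i = 0; i < n; i++)
		dst[i] = src[i];
}
```

Callers then use a single spelling everywhere and the preprocessor picks
whichever keyword the current language mode accepts.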

Thanks,
Phil

>
> >  {
> >  struct virtnet_rx *rxvq = rx_queue;
> >  struct virtqueue *vq = rxvq->vq;
> >

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd
  2020-06-23  9:05     ` Phil Yang
@ 2020-06-24  2:58       ` Joyce Kong
  2020-06-24  4:16         ` Stephen Hemminger
  0 siblings, 1 reply; 40+ messages in thread
From: Joyce Kong @ 2020-06-24  2:58 UTC (permalink / raw)
  To: Phil Yang, Maxime Coquelin, jerinj, zhihong.wang, xiaolong.ye,
	Honnappa Nagarahalli, Ruifeng Wang
  Cc: dev

> -----Original Message-----
> From: Phil Yang <Phil.Yang@arm.com>
> Sent: Tuesday, June 23, 2020 5:06 PM
> To: Maxime Coquelin <maxime.coquelin@redhat.com>; Joyce Kong
> <Joyce.Kong@arm.com>; jerinj@marvell.com; zhihong.wang@intel.com;
> xiaolong.ye@intel.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Ruifeng Wang
> <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org
> Subject: RE: [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON
> vpmd
>
> > -----Original Message-----
> > From: Maxime Coquelin <maxime.coquelin@redhat.com>
> > Sent: Tuesday, June 23, 2020 4:48 PM
> > To: Joyce Kong <Joyce.Kong@arm.com>; jerinj@marvell.com;
> > zhihong.wang@intel.com; xiaolong.ye@intel.com; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng
> > Wang <Ruifeng.Wang@arm.com>
> > Cc: dev@dpdk.org
> > Subject: Re: [PATCH v1 1/2] net/virtio: restrict pointer aliasing for
> > NEON vpmd
> >
> >
> >
> > On 6/11/20 5:32 AM, Joyce Kong wrote:
> > > Restrict pointer aliasing to allow the compiler to vectorize loops
> > > more aggressively.
> > >
> > > With this patch, a 9.6% improvement is observed in throughput for
> > > the virtio-net PVP case, and a 2.4% perf improvement in throughput
> > > for the virtio-user PVP case. All performance data are measured
> > > under the 0.001% acceptable packet loss with 2 cores on the vhost
> > > side.
> > >
> > > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > > Reviewed-by: Phil Yang <phil.yang@arm.com>
> > > ---
> > >  drivers/net/virtio/virtio_rxtx_simple_neon.c | 4 ++--
> > >  1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > Very nice, we should consider doing the same on other platforms.
> >
> > > diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c
> > b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> > > index 363e2b330..c08dd51fb 100644
> > > --- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
> > > +++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> > > @@ -36,8 +36,8 @@
> > >   * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
> > >   */
> > >  uint16_t
> > > -virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
> > > -uint16_t nb_pkts)
> > > +virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **__restrict
> > > +rx_pkts, uint16_t nb_pkts)
> >
> > Is __restrict supported by all the compilers?
> > Wouldn't it be better to introduce a wrapper?
>
> +1 for this.
> In my understanding, the __restrict keyword is recognized in C at all language
> levels.
> However, the restrict keyword is recognized in C under compilation with c99.
> DPDK uses the restrict qualifier a lot, which might have some issues with
> some old compilers.
> So the wrapper will be useful.
>
> Thanks,
> Phil
>

Thanks for the suggestion. I will introduce a wrapper to support all the compilers
in the next version.

> >
> > >  {
> > >  struct virtnet_rx *rxvq = rx_queue;  struct virtqueue *vq =
> > > rxvq->vq;
> > >
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd
  2020-06-24  2:58       ` Joyce Kong
@ 2020-06-24  4:16         ` Stephen Hemminger
  0 siblings, 0 replies; 40+ messages in thread
From: Stephen Hemminger @ 2020-06-24  4:16 UTC (permalink / raw)
  To: Joyce Kong
  Cc: Phil Yang, Maxime Coquelin, jerinj, zhihong.wang, xiaolong.ye,
	Honnappa Nagarahalli, Ruifeng Wang, dev

On Wed, 24 Jun 2020 02:58:28 +0000
Joyce Kong <Joyce.Kong@arm.com> wrote:

> IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

Please fix your email system.
Technically, it is not legal to post to a public mailing list with
such a boilerplate footer.

^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper
  2020-06-11  3:32 [dpdk-dev] [PATCH v1 0/2] virtio: restrict pointer aliasing for loops vectorization Joyce Kong
  2020-06-11  3:32 ` [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
  2020-06-11  3:32 ` [dpdk-dev] [PATCH v1 2/2] lib/vhost: restrict pointer aliasing for packed path Joyce Kong
@ 2020-07-06  7:49 ` Joyce Kong
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers Joyce Kong
                     ` (6 more replies)
  2020-07-10  2:38 ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper Joyce Kong
  3 siblings, 7 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-06  7:49 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye, beilei.xing,
	jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

The 'restrict' keyword is recognized only from C99 onward, so this
patchset adds a wrapper, '__rte_restrict', which can be supported by
all compilers. It then replaces the existing 'restrict' and
'__restrict' uses in different vpmds, and optimizes vhost/virtio with
restricted pointer aliasing for more aggressive loop vectorization.

The vhost/virtio optimization patches were benchmarked by running the
PVP case on a ThunderX2 platform and showed positive performance results.

Joyce Kong (6):
  lib/eal: add a wrapper to define restricted pointers
  net/virtio: restrict pointer aliasing for NEON vpmd
  lib/vhost: restrict pointer aliasing for packed vpmd
  net/i40e: replace restrict with rte restrict
  examples/performance-thread: replace restrict with wrapper
  net/mlx5: replace restrict keyword with rte restrict

 drivers/net/i40e/i40e_rxtx_vec_neon.c         |  17 +-
 drivers/net/mlx5/mlx5_rxtx.c                  | 208 +++++++++---------
 drivers/net/virtio/virtio_rxtx_simple_neon.c  |   5 +-
 .../pthread_shim/pthread_shim.c               |  12 +-
 lib/librte_eal/include/rte_common.h           |  10 +
 lib/librte_vhost/virtio_net.c                 |  14 +-
 6 files changed, 139 insertions(+), 127 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers
  2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
@ 2020-07-06  7:49   ` Joyce Kong
  2020-07-07  2:15     ` Jerin Jacob
                       ` (3 more replies)
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 2/6] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
                     ` (5 subsequent siblings)
  6 siblings, 4 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-06  7:49 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye, beilei.xing,
	jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

The 'restrict' keyword is recognized only from C99 onward, while the
type qualifier '__restrict' compiles at all C language levels. This
patch adds a wrapper, '__rte_restrict', which expands to 'restrict'
or '__restrict' as appropriate so that it is supported by all compilers.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
---
 lib/librte_eal/include/rte_common.h | 10 ++++++++++
 1 file changed, 10 insertions(+)

diff --git a/lib/librte_eal/include/rte_common.h b/lib/librte_eal/include/rte_common.h
index 0843ce69e..cda32c056 100644
--- a/lib/librte_eal/include/rte_common.h
+++ b/lib/librte_eal/include/rte_common.h
@@ -103,6 +103,16 @@ typedef uint16_t unaligned_uint16_t;
  */
 #define __rte_unused __attribute__((__unused__))
 
+/**
+ * Define a wrapper for restricted pointers which can be supported
+ * by all compilers.
+ */
+#if __STDC_VERSION__ >= 199901
+#define __rte_restrict restrict
+#else
+#define __rte_restrict __restrict
+#endif
+
 /**
  * definition to mark a variable or function parameter as used so
  * as to avoid a compiler warning
-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v2 2/6] net/virtio: restrict pointer aliasing for NEON vpmd
  2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers Joyce Kong
@ 2020-07-06  7:49   ` Joyce Kong
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 3/6] lib/vhost: restrict pointer aliasing for packed vpmd Joyce Kong
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-06  7:49 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye, beilei.xing,
	jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

Restrict pointer aliasing to allow the compiler to vectorize loops
more aggressively.

With this patch, a 9.6% throughput improvement is observed for the
virtio-net PVP case, and a 2.4% throughput improvement for the
virtio-user PVP case. All performance data are measured under
0.001% acceptable packet loss with 2 cores on the vhost side.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 drivers/net/virtio/virtio_rxtx_simple_neon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c b/drivers/net/virtio/virtio_rxtx_simple_neon.c
index 363e2b330..5febfb0f5 100644
--- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
+++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
@@ -36,8 +36,8 @@
  * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
  */
 uint16_t
-virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
-	uint16_t nb_pkts)
+virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
+		**__rte_restrict rx_pkts, uint16_t nb_pkts)
 {
 	struct virtnet_rx *rxvq = rx_queue;
 	struct virtqueue *vq = rxvq->vq;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v2 3/6] lib/vhost: restrict pointer aliasing for packed vpmd
  2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers Joyce Kong
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 2/6] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
@ 2020-07-06  7:49   ` Joyce Kong
  2020-07-07 13:58     ` David Marchand
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict Joyce Kong
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 40+ messages in thread
From: Joyce Kong @ 2020-07-06  7:49 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye, beilei.xing,
	jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

Restrict pointer aliasing to allow the compiler to vectorize loops
more aggressively.

With this patch, a 9.6% improvement is observed in throughput for
the packed virtio-net PVP case, and a 2.8% improvement in throughput
for the packed virtio-user PVP case. All performance data are measured
under 0.001% acceptable packet loss with 1 core on each of the vhost
and virtio sides.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 drivers/net/virtio/virtio_rxtx_simple_neon.c |  5 +++--
 lib/librte_vhost/virtio_net.c                | 14 +++++++-------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c b/drivers/net/virtio/virtio_rxtx_simple_neon.c
index 5febfb0f5..31824a931 100644
--- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
+++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
@@ -36,8 +36,9 @@
  * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
  */
 uint16_t
-virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
-		**__rte_restrict rx_pkts, uint16_t nb_pkts)
+virtio_recv_pkts_vec(void *rx_queue,
+		struct rte_mbuf **__rte_restrict rx_pkts,
+		uint16_t nb_pkts)
 {
 	struct virtnet_rx *rxvq = rx_queue;
 	struct virtqueue *vq = rxvq->vq;
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 751c1f373..e60358251 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1133,8 +1133,8 @@ virtio_dev_rx_single_packed(struct virtio_net *dev,
 
 static __rte_noinline uint32_t
 virtio_dev_rx_packed(struct virtio_net *dev,
-		     struct vhost_virtqueue *vq,
-		     struct rte_mbuf **pkts,
+		     struct vhost_virtqueue *__rte_restrict vq,
+		     struct rte_mbuf **__rte_restrict pkts,
 		     uint32_t count)
 {
 	uint32_t pkt_idx = 0;
@@ -1219,7 +1219,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 
 uint16_t
 rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
-	struct rte_mbuf **pkts, uint16_t count)
+	struct rte_mbuf **__rte_restrict pkts, uint16_t count)
 {
 	struct virtio_net *dev = get_device(vid);
 
@@ -2124,9 +2124,9 @@ free_zmbuf(struct vhost_virtqueue *vq)
 
 static __rte_noinline uint16_t
 virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
-			   struct vhost_virtqueue *vq,
+			   struct vhost_virtqueue *__rte_restrict vq,
 			   struct rte_mempool *mbuf_pool,
-			   struct rte_mbuf **pkts,
+			   struct rte_mbuf **__rte_restrict pkts,
 			   uint32_t count)
 {
 	uint32_t pkt_idx = 0;
@@ -2160,9 +2160,9 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
 
 static __rte_noinline uint16_t
 virtio_dev_tx_packed(struct virtio_net *dev,
-		     struct vhost_virtqueue *vq,
+		     struct vhost_virtqueue *__rte_restrict vq,
 		     struct rte_mempool *mbuf_pool,
-		     struct rte_mbuf **pkts,
+		     struct rte_mbuf **__rte_restrict pkts,
 		     uint32_t count)
 {
 	uint32_t pkt_idx = 0;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict
  2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
                     ` (2 preceding siblings ...)
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 3/6] lib/vhost: restrict pointer aliasing for packed vpmd Joyce Kong
@ 2020-07-06  7:49   ` Joyce Kong
  2020-07-07  2:25     ` Phil Yang
                       ` (2 more replies)
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 5/6] examples/performance-thread: replace restrict with wrapper Joyce Kong
                     ` (2 subsequent siblings)
  6 siblings, 3 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-06  7:49 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye, beilei.xing,
	jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

'__rte_restrict' is a common wrapper for restricted pointers which
can be supported by all compilers. Use '__rte_restrict' instead of
'__restrict' for code consistency.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
---
 drivers/net/i40e/i40e_rxtx_vec_neon.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index 1dfd0478b..4574139d5 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -172,8 +172,8 @@ desc_to_olflags_v(struct i40e_rx_queue *rxq, uint64x2_t descs[4],
 #define I40E_UINT16_BIT (CHAR_BIT * sizeof(uint16_t))
 
 static inline void
-desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__restrict rx_pkts,
-		uint32_t *__restrict ptype_tbl)
+desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__rte_restrict rx_pkts,
+		uint32_t *__rte_restrict ptype_tbl)
 {
 	int i;
 	uint8_t ptype;
@@ -194,8 +194,9 @@ desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__restrict rx_pkts,
  *   numbers of DD bits
  */
 static inline uint16_t
-_recv_raw_pkts_vec(struct i40e_rx_queue *__restrict rxq, struct rte_mbuf
-	**__restrict rx_pkts, uint16_t nb_pkts, uint8_t *split_packet)
+_recv_raw_pkts_vec(struct i40e_rx_queue *__rte_restrict rxq,
+		   struct rte_mbuf **__rte_restrict rx_pkts,
+		   uint16_t nb_pkts, uint8_t *split_packet)
 {
 	volatile union i40e_rx_desc *rxdp;
 	struct i40e_rx_entry *sw_ring;
@@ -432,8 +433,8 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *__restrict rxq, struct rte_mbuf
  *   numbers of DD bits
  */
 uint16_t
-i40e_recv_pkts_vec(void *__restrict rx_queue,
-		struct rte_mbuf **__restrict rx_pkts, uint16_t nb_pkts)
+i40e_recv_pkts_vec(void *__rte_restrict rx_queue,
+		struct rte_mbuf **__rte_restrict rx_pkts, uint16_t nb_pkts)
 {
 	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
 }
@@ -504,8 +505,8 @@ vtx(volatile struct i40e_tx_desc *txdp, struct rte_mbuf **pkt,
 }
 
 uint16_t
-i40e_xmit_fixed_burst_vec(void *__restrict tx_queue,
-	struct rte_mbuf **__restrict tx_pkts, uint16_t nb_pkts)
+i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
+	struct rte_mbuf **__rte_restrict tx_pkts, uint16_t nb_pkts)
 {
 	struct i40e_tx_queue *txq = (struct i40e_tx_queue *)tx_queue;
 	volatile struct i40e_tx_desc *txdp;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v2 5/6] examples/performance-thread: replace restrict with wrapper
  2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
                     ` (3 preceding siblings ...)
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict Joyce Kong
@ 2020-07-06  7:49   ` Joyce Kong
  2020-07-07  2:27     ` Phil Yang
  2020-07-07  2:45     ` Ruifeng Wang
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict Joyce Kong
  2020-07-09 13:52   ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a commonwrapper Morten Brørup
  6 siblings, 2 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-06  7:49 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye, beilei.xing,
	jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

'__rte_restrict' is a common wrapper for restricted pointers which
can be supported by all compilers. Use '__rte_restrict' instead of
'__restrict' for code consistency.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
---
 .../performance-thread/pthread_shim/pthread_shim.c   | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/examples/performance-thread/pthread_shim/pthread_shim.c b/examples/performance-thread/pthread_shim/pthread_shim.c
index 93e8dca3f..bbc076584 100644
--- a/examples/performance-thread/pthread_shim/pthread_shim.c
+++ b/examples/performance-thread/pthread_shim/pthread_shim.c
@@ -341,9 +341,9 @@ int pthread_cond_signal(pthread_cond_t *cond)
 }
 
 int
-pthread_cond_timedwait(pthread_cond_t *__restrict cond,
-		       pthread_mutex_t *__restrict mutex,
-		       const struct timespec *__restrict time)
+pthread_cond_timedwait(pthread_cond_t *__rte_restrict cond,
+		       pthread_mutex_t *__rte_restrict mutex,
+		       const struct timespec *__rte_restrict time)
 {
 	NOT_IMPLEMENTED;
 	return _sys_pthread_funcs.f_pthread_cond_timedwait(cond, mutex, time);
@@ -362,10 +362,10 @@ int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex)
 }
 
 int
-pthread_create(pthread_t *__restrict tid,
-		const pthread_attr_t *__restrict attr,
+pthread_create(pthread_t *__rte_restrict tid,
+		const pthread_attr_t *__rte_restrict attr,
 		lthread_func_t func,
-	       void *__restrict arg)
+	       void *__rte_restrict arg)
 {
 	if (override) {
 		int lcore = -1;
-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict
  2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
                     ` (4 preceding siblings ...)
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 5/6] examples/performance-thread: replace restrict with wrapper Joyce Kong
@ 2020-07-06  7:49   ` Joyce Kong
  2020-07-07  2:28     ` Phil Yang
  2020-07-07  2:47     ` Ruifeng Wang
  2020-07-09 13:52   ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Morten Brørup
  6 siblings, 2 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-06  7:49 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye, beilei.xing,
	jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

The 'restrict' keyword was only introduced in C99, so older compilers
or pre-C99 language modes may reject it. Use the '__rte_restrict'
wrapper instead, which can be supported by all compilers for
restricted pointers.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
---
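Note (illustrative only, not part of this patch): a minimal sketch of
how a wrapper of this kind picks its spelling from the language level.
The helper restrict_spelling and the SKETCH_* macros are hypothetical.
The defined() test is an addition of this sketch: an undefined macro
evaluates to 0 inside #if, so it only avoids -Wundef warnings.

```c
#include <string.h>

/* Stand-in for the rte_common.h wrapper; assumed here, not copied. */
#ifndef __rte_restrict
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define __rte_restrict restrict   /* C99 and later: standard keyword */
#else
#define __rte_restrict __restrict /* pre-C99 modes: compiler extension */
#endif
#endif

/* Two-level stringification so the macro is expanded before '#'. */
#define SKETCH_STR(x) #x
#define SKETCH_XSTR(x) SKETCH_STR(x)

/* Hypothetical helper: report which spelling the wrapper expanded to. */
static const char *
restrict_spelling(void)
{
	return SKETCH_XSTR(__rte_restrict);
}
```

Building this with, for example, -std=gnu89 and then -std=c99 on GCC
or Clang is one way to see the two branches being taken.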
 drivers/net/mlx5/mlx5_rxtx.c | 208 +++++++++++++++++------------------
 1 file changed, 104 insertions(+), 104 deletions(-)

diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index e4106bf0a..894f441f3 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -113,13 +113,13 @@ mlx5_queue_state_modify(struct rte_eth_dev *dev,
 			struct mlx5_mp_arg_queue_state_modify *sm);
 
 static inline void
-mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *restrict tcp,
-			volatile struct mlx5_cqe *restrict cqe,
+mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *__rte_restrict tcp,
+			volatile struct mlx5_cqe *__rte_restrict cqe,
 			uint32_t phcsum);
 
 static inline void
-mlx5_lro_update_hdr(uint8_t *restrict padd,
-		    volatile struct mlx5_cqe *restrict cqe,
+mlx5_lro_update_hdr(uint8_t *__rte_restrict padd,
+		    volatile struct mlx5_cqe *__rte_restrict cqe,
 		    uint32_t len);
 
 uint32_t mlx5_ptype_table[] __rte_cache_aligned = {
@@ -374,7 +374,7 @@ mlx5_set_swp_types_table(void)
  *   Software Parser flags are set by pointer.
  */
 static __rte_always_inline uint32_t
-txq_mbuf_to_swp(struct mlx5_txq_local *restrict loc,
+txq_mbuf_to_swp(struct mlx5_txq_local *__rte_restrict loc,
 		uint8_t *swp_flags,
 		unsigned int olx)
 {
@@ -747,7 +747,7 @@ check_err_cqe_seen(volatile struct mlx5_err_cqe *err_cqe)
  *   the error completion entry is handled successfully.
  */
 static int
-mlx5_tx_error_cqe_handle(struct mlx5_txq_data *restrict txq,
+mlx5_tx_error_cqe_handle(struct mlx5_txq_data *__rte_restrict txq,
 			 volatile struct mlx5_err_cqe *err_cqe)
 {
 	if (err_cqe->syndrome != MLX5_CQE_SYNDROME_WR_FLUSH_ERR) {
@@ -1508,8 +1508,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
  *   The L3 pseudo-header checksum.
  */
 static inline void
-mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *restrict tcp,
-			volatile struct mlx5_cqe *restrict cqe,
+mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *__rte_restrict tcp,
+			volatile struct mlx5_cqe *__rte_restrict cqe,
 			uint32_t phcsum)
 {
 	uint8_t l4_type = (rte_be_to_cpu_16(cqe->hdr_type_etc) &
@@ -1550,8 +1550,8 @@ mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *restrict tcp,
  *   The packet length.
  */
 static inline void
-mlx5_lro_update_hdr(uint8_t *restrict padd,
-		    volatile struct mlx5_cqe *restrict cqe,
+mlx5_lro_update_hdr(uint8_t *__rte_restrict padd,
+		    volatile struct mlx5_cqe *__rte_restrict cqe,
 		    uint32_t len)
 {
 	union {
@@ -1965,7 +1965,7 @@ mlx5_check_vec_rx_support(struct rte_eth_dev *dev __rte_unused)
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_free_mbuf(struct rte_mbuf **restrict pkts,
+mlx5_tx_free_mbuf(struct rte_mbuf **__rte_restrict pkts,
 		  unsigned int pkts_n,
 		  unsigned int olx __rte_unused)
 {
@@ -2070,7 +2070,7 @@ mlx5_tx_free_mbuf(struct rte_mbuf **restrict pkts,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_free_elts(struct mlx5_txq_data *restrict txq,
+mlx5_tx_free_elts(struct mlx5_txq_data *__rte_restrict txq,
 		  uint16_t tail,
 		  unsigned int olx __rte_unused)
 {
@@ -2111,8 +2111,8 @@ mlx5_tx_free_elts(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_copy_elts(struct mlx5_txq_data *restrict txq,
-		  struct rte_mbuf **restrict pkts,
+mlx5_tx_copy_elts(struct mlx5_txq_data *__rte_restrict txq,
+		  struct rte_mbuf **__rte_restrict pkts,
 		  unsigned int pkts_n,
 		  unsigned int olx __rte_unused)
 {
@@ -2148,7 +2148,7 @@ mlx5_tx_copy_elts(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_comp_flush(struct mlx5_txq_data *restrict txq,
+mlx5_tx_comp_flush(struct mlx5_txq_data *__rte_restrict txq,
 		   volatile struct mlx5_cqe *last_cqe,
 		   unsigned int olx __rte_unused)
 {
@@ -2179,7 +2179,7 @@ mlx5_tx_comp_flush(struct mlx5_txq_data *restrict txq,
  * routine smaller, simple and faster - from experiments.
  */
 static void
-mlx5_tx_handle_completion(struct mlx5_txq_data *restrict txq,
+mlx5_tx_handle_completion(struct mlx5_txq_data *__rte_restrict txq,
 			  unsigned int olx __rte_unused)
 {
 	unsigned int count = MLX5_TX_COMP_MAX_CQE;
@@ -2268,8 +2268,8 @@ mlx5_tx_handle_completion(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_request_completion(struct mlx5_txq_data *restrict txq,
-			   struct mlx5_txq_local *restrict loc,
+mlx5_tx_request_completion(struct mlx5_txq_data *__rte_restrict txq,
+			   struct mlx5_txq_local *__rte_restrict loc,
 			   unsigned int olx)
 {
 	uint16_t head = txq->elts_head;
@@ -2316,7 +2316,7 @@ mlx5_tx_request_completion(struct mlx5_txq_data *restrict txq,
 int
 mlx5_tx_descriptor_status(void *tx_queue, uint16_t offset)
 {
-	struct mlx5_txq_data *restrict txq = tx_queue;
+	struct mlx5_txq_data *__rte_restrict txq = tx_queue;
 	uint16_t used;
 
 	mlx5_tx_handle_completion(txq, 0);
@@ -2347,14 +2347,14 @@ mlx5_tx_descriptor_status(void *tx_queue, uint16_t offset)
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_cseg_init(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc __rte_unused,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc __rte_unused,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int ds,
 		  unsigned int opcode,
 		  unsigned int olx __rte_unused)
 {
-	struct mlx5_wqe_cseg *restrict cs = &wqe->cseg;
+	struct mlx5_wqe_cseg *__rte_restrict cs = &wqe->cseg;
 
 	/* For legacy MPW replace the EMPW by TSO with modifier. */
 	if (MLX5_TXOFF_CONFIG(MPW) && opcode == MLX5_OPCODE_ENHANCED_MPSW)
@@ -2382,12 +2382,12 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_eseg_none(struct mlx5_txq_data *restrict txq __rte_unused,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_eseg_none(struct mlx5_txq_data *__rte_restrict txq __rte_unused,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int olx)
 {
-	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
+	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
 	uint32_t csum;
 
 	/*
@@ -2440,13 +2440,13 @@ mlx5_tx_eseg_none(struct mlx5_txq_data *restrict txq __rte_unused,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_eseg_dmin(struct mlx5_txq_data *restrict txq __rte_unused,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_eseg_dmin(struct mlx5_txq_data *__rte_restrict txq __rte_unused,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int vlan,
 		  unsigned int olx)
 {
-	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
+	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
 	uint32_t csum;
 	uint8_t *psrc, *pdst;
 
@@ -2524,15 +2524,15 @@ mlx5_tx_eseg_dmin(struct mlx5_txq_data *restrict txq __rte_unused,
  *   Pointer to the next Data Segment (aligned and wrapped around).
  */
 static __rte_always_inline struct mlx5_wqe_dseg *
-mlx5_tx_eseg_data(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_eseg_data(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int vlan,
 		  unsigned int inlen,
 		  unsigned int tso,
 		  unsigned int olx)
 {
-	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
+	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
 	uint32_t csum;
 	uint8_t *psrc, *pdst;
 	unsigned int part;
@@ -2650,7 +2650,7 @@ mlx5_tx_eseg_data(struct mlx5_txq_data *restrict txq,
  */
 static __rte_always_inline unsigned int
 mlx5_tx_mseg_memcpy(uint8_t *pdst,
-		    struct mlx5_txq_local *restrict loc,
+		    struct mlx5_txq_local *__rte_restrict loc,
 		    unsigned int len,
 		    unsigned int must,
 		    unsigned int olx __rte_unused)
@@ -2747,15 +2747,15 @@ mlx5_tx_mseg_memcpy(uint8_t *pdst,
  *   wrapping check on its own).
  */
 static __rte_always_inline struct mlx5_wqe_dseg *
-mlx5_tx_eseg_mdat(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_eseg_mdat(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int vlan,
 		  unsigned int inlen,
 		  unsigned int tso,
 		  unsigned int olx)
 {
-	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
+	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
 	uint32_t csum;
 	uint8_t *pdst;
 	unsigned int part, tlen = 0;
@@ -2851,9 +2851,9 @@ mlx5_tx_eseg_mdat(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_dseg_ptr(struct mlx5_txq_data *restrict txq,
-		 struct mlx5_txq_local *restrict loc,
-		 struct mlx5_wqe_dseg *restrict dseg,
+mlx5_tx_dseg_ptr(struct mlx5_txq_data *__rte_restrict txq,
+		 struct mlx5_txq_local *__rte_restrict loc,
+		 struct mlx5_wqe_dseg *__rte_restrict dseg,
 		 uint8_t *buf,
 		 unsigned int len,
 		 unsigned int olx __rte_unused)
@@ -2885,9 +2885,9 @@ mlx5_tx_dseg_ptr(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_dseg_iptr(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe_dseg *restrict dseg,
+mlx5_tx_dseg_iptr(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe_dseg *__rte_restrict dseg,
 		  uint8_t *buf,
 		  unsigned int len,
 		  unsigned int olx __rte_unused)
@@ -2961,9 +2961,9 @@ mlx5_tx_dseg_iptr(struct mlx5_txq_data *restrict txq,
  *   last packet in the eMPW session.
  */
 static __rte_always_inline struct mlx5_wqe_dseg *
-mlx5_tx_dseg_empw(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc __rte_unused,
-		  struct mlx5_wqe_dseg *restrict dseg,
+mlx5_tx_dseg_empw(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc __rte_unused,
+		  struct mlx5_wqe_dseg *__rte_restrict dseg,
 		  uint8_t *buf,
 		  unsigned int len,
 		  unsigned int olx __rte_unused)
@@ -3024,9 +3024,9 @@ mlx5_tx_dseg_empw(struct mlx5_txq_data *restrict txq,
  *   Ring buffer wraparound check is needed.
  */
 static __rte_always_inline struct mlx5_wqe_dseg *
-mlx5_tx_dseg_vlan(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc __rte_unused,
-		  struct mlx5_wqe_dseg *restrict dseg,
+mlx5_tx_dseg_vlan(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc __rte_unused,
+		  struct mlx5_wqe_dseg *__rte_restrict dseg,
 		  uint8_t *buf,
 		  unsigned int len,
 		  unsigned int olx __rte_unused)
@@ -3112,15 +3112,15 @@ mlx5_tx_dseg_vlan(struct mlx5_txq_data *restrict txq,
  *   Actual size of built WQE in segments.
  */
 static __rte_always_inline unsigned int
-mlx5_tx_mseg_build(struct mlx5_txq_data *restrict txq,
-		   struct mlx5_txq_local *restrict loc,
-		   struct mlx5_wqe *restrict wqe,
+mlx5_tx_mseg_build(struct mlx5_txq_data *__rte_restrict txq,
+		   struct mlx5_txq_local *__rte_restrict loc,
+		   struct mlx5_wqe *__rte_restrict wqe,
 		   unsigned int vlan,
 		   unsigned int inlen,
 		   unsigned int tso,
 		   unsigned int olx __rte_unused)
 {
-	struct mlx5_wqe_dseg *restrict dseg;
+	struct mlx5_wqe_dseg *__rte_restrict dseg;
 	unsigned int ds;
 
 	MLX5_ASSERT((rte_pktmbuf_pkt_len(loc->mbuf) + vlan) >= inlen);
@@ -3225,11 +3225,11 @@ mlx5_tx_mseg_build(struct mlx5_txq_data *restrict txq,
  * Local context variables partially updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_packet_multi_tso(struct mlx5_txq_data *restrict txq,
-			struct mlx5_txq_local *restrict loc,
+mlx5_tx_packet_multi_tso(struct mlx5_txq_data *__rte_restrict txq,
+			struct mlx5_txq_local *__rte_restrict loc,
 			unsigned int olx)
 {
-	struct mlx5_wqe *restrict wqe;
+	struct mlx5_wqe *__rte_restrict wqe;
 	unsigned int ds, dlen, inlen, ntcp, vlan = 0;
 
 	/*
@@ -3314,12 +3314,12 @@ mlx5_tx_packet_multi_tso(struct mlx5_txq_data *restrict txq,
  * Local context variables partially updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_packet_multi_send(struct mlx5_txq_data *restrict txq,
-			  struct mlx5_txq_local *restrict loc,
+mlx5_tx_packet_multi_send(struct mlx5_txq_data *__rte_restrict txq,
+			  struct mlx5_txq_local *__rte_restrict loc,
 			  unsigned int olx)
 {
-	struct mlx5_wqe_dseg *restrict dseg;
-	struct mlx5_wqe *restrict wqe;
+	struct mlx5_wqe_dseg *__rte_restrict dseg;
+	struct mlx5_wqe *__rte_restrict wqe;
 	unsigned int ds, nseg;
 
 	MLX5_ASSERT(NB_SEGS(loc->mbuf) > 1);
@@ -3422,11 +3422,11 @@ mlx5_tx_packet_multi_send(struct mlx5_txq_data *restrict txq,
  * Local context variables partially updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_packet_multi_inline(struct mlx5_txq_data *restrict txq,
-			    struct mlx5_txq_local *restrict loc,
+mlx5_tx_packet_multi_inline(struct mlx5_txq_data *__rte_restrict txq,
+			    struct mlx5_txq_local *__rte_restrict loc,
 			    unsigned int olx)
 {
-	struct mlx5_wqe *restrict wqe;
+	struct mlx5_wqe *__rte_restrict wqe;
 	unsigned int ds, inlen, dlen, vlan = 0;
 
 	MLX5_ASSERT(MLX5_TXOFF_CONFIG(INLINE));
@@ -3587,10 +3587,10 @@ mlx5_tx_packet_multi_inline(struct mlx5_txq_data *restrict txq,
  * Local context variables updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_mseg(struct mlx5_txq_data *restrict txq,
-		   struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_mseg(struct mlx5_txq_data *__rte_restrict txq,
+		   struct rte_mbuf **__rte_restrict pkts,
 		   unsigned int pkts_n,
-		   struct mlx5_txq_local *restrict loc,
+		   struct mlx5_txq_local *__rte_restrict loc,
 		   unsigned int olx)
 {
 	MLX5_ASSERT(loc->elts_free && loc->wqe_free);
@@ -3676,10 +3676,10 @@ mlx5_tx_burst_mseg(struct mlx5_txq_data *restrict txq,
  * Local context variables updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_tso(struct mlx5_txq_data *restrict txq,
-		  struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_tso(struct mlx5_txq_data *__rte_restrict txq,
+		  struct rte_mbuf **__rte_restrict pkts,
 		  unsigned int pkts_n,
-		  struct mlx5_txq_local *restrict loc,
+		  struct mlx5_txq_local *__rte_restrict loc,
 		  unsigned int olx)
 {
 	MLX5_ASSERT(loc->elts_free && loc->wqe_free);
@@ -3687,8 +3687,8 @@ mlx5_tx_burst_tso(struct mlx5_txq_data *restrict txq,
 	pkts += loc->pkts_sent + 1;
 	pkts_n -= loc->pkts_sent;
 	for (;;) {
-		struct mlx5_wqe_dseg *restrict dseg;
-		struct mlx5_wqe *restrict wqe;
+		struct mlx5_wqe_dseg *__rte_restrict dseg;
+		struct mlx5_wqe *__rte_restrict wqe;
 		unsigned int ds, dlen, hlen, ntcp, vlan = 0;
 		uint8_t *dptr;
 
@@ -3800,8 +3800,8 @@ mlx5_tx_burst_tso(struct mlx5_txq_data *restrict txq,
  *  MLX5_TXCMP_CODE_EMPW - single-segment packet, use MPW.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_able_to_empw(struct mlx5_txq_data *restrict txq,
-		     struct mlx5_txq_local *restrict loc,
+mlx5_tx_able_to_empw(struct mlx5_txq_data *__rte_restrict txq,
+		     struct mlx5_txq_local *__rte_restrict loc,
 		     unsigned int olx,
 		     bool newp)
 {
@@ -3855,9 +3855,9 @@ mlx5_tx_able_to_empw(struct mlx5_txq_data *restrict txq,
  *  false - no match, eMPW should be restarted.
  */
 static __rte_always_inline bool
-mlx5_tx_match_empw(struct mlx5_txq_data *restrict txq __rte_unused,
-		   struct mlx5_wqe_eseg *restrict es,
-		   struct mlx5_txq_local *restrict loc,
+mlx5_tx_match_empw(struct mlx5_txq_data *__rte_restrict txq __rte_unused,
+		   struct mlx5_wqe_eseg *__rte_restrict es,
+		   struct mlx5_txq_local *__rte_restrict loc,
 		   uint32_t dlen,
 		   unsigned int olx)
 {
@@ -3909,8 +3909,8 @@ mlx5_tx_match_empw(struct mlx5_txq_data *restrict txq __rte_unused,
  *  false - no match, eMPW should be restarted.
  */
 static __rte_always_inline void
-mlx5_tx_sdone_empw(struct mlx5_txq_data *restrict txq,
-		   struct mlx5_txq_local *restrict loc,
+mlx5_tx_sdone_empw(struct mlx5_txq_data *__rte_restrict txq,
+		   struct mlx5_txq_local *__rte_restrict loc,
 		   unsigned int ds,
 		   unsigned int slen,
 		   unsigned int olx __rte_unused)
@@ -3954,11 +3954,11 @@ mlx5_tx_sdone_empw(struct mlx5_txq_data *restrict txq,
  *  false - no match, eMPW should be restarted.
  */
 static __rte_always_inline void
-mlx5_tx_idone_empw(struct mlx5_txq_data *restrict txq,
-		   struct mlx5_txq_local *restrict loc,
+mlx5_tx_idone_empw(struct mlx5_txq_data *__rte_restrict txq,
+		   struct mlx5_txq_local *__rte_restrict loc,
 		   unsigned int len,
 		   unsigned int slen,
-		   struct mlx5_wqe *restrict wqem,
+		   struct mlx5_wqe *__rte_restrict wqem,
 		   unsigned int olx __rte_unused)
 {
 	struct mlx5_wqe_dseg *dseg = &wqem->dseg[0];
@@ -4042,10 +4042,10 @@ mlx5_tx_idone_empw(struct mlx5_txq_data *restrict txq,
  * No VLAN insertion is supported.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_empw_simple(struct mlx5_txq_data *restrict txq,
-			  struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_empw_simple(struct mlx5_txq_data *__rte_restrict txq,
+			  struct rte_mbuf **__rte_restrict pkts,
 			  unsigned int pkts_n,
-			  struct mlx5_txq_local *restrict loc,
+			  struct mlx5_txq_local *__rte_restrict loc,
 			  unsigned int olx)
 {
 	/*
@@ -4061,8 +4061,8 @@ mlx5_tx_burst_empw_simple(struct mlx5_txq_data *restrict txq,
 	pkts += loc->pkts_sent + 1;
 	pkts_n -= loc->pkts_sent;
 	for (;;) {
-		struct mlx5_wqe_dseg *restrict dseg;
-		struct mlx5_wqe_eseg *restrict eseg;
+		struct mlx5_wqe_dseg *__rte_restrict dseg;
+		struct mlx5_wqe_eseg *__rte_restrict eseg;
 		enum mlx5_txcmp_code ret;
 		unsigned int part, loop;
 		unsigned int slen = 0;
@@ -4208,10 +4208,10 @@ mlx5_tx_burst_empw_simple(struct mlx5_txq_data *restrict txq,
  * with inlining, optionally supports VLAN insertion.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_empw_inline(struct mlx5_txq_data *restrict txq,
-			  struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_empw_inline(struct mlx5_txq_data *__rte_restrict txq,
+			  struct rte_mbuf **__rte_restrict pkts,
 			  unsigned int pkts_n,
-			  struct mlx5_txq_local *restrict loc,
+			  struct mlx5_txq_local *__rte_restrict loc,
 			  unsigned int olx)
 {
 	/*
@@ -4227,8 +4227,8 @@ mlx5_tx_burst_empw_inline(struct mlx5_txq_data *restrict txq,
 	pkts += loc->pkts_sent + 1;
 	pkts_n -= loc->pkts_sent;
 	for (;;) {
-		struct mlx5_wqe_dseg *restrict dseg;
-		struct mlx5_wqe *restrict wqem;
+		struct mlx5_wqe_dseg *__rte_restrict dseg;
+		struct mlx5_wqe *__rte_restrict wqem;
 		enum mlx5_txcmp_code ret;
 		unsigned int room, part, nlim;
 		unsigned int slen = 0;
@@ -4489,10 +4489,10 @@ mlx5_tx_burst_empw_inline(struct mlx5_txq_data *restrict txq,
  * Data inlining and VLAN insertion are supported.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_single_send(struct mlx5_txq_data *restrict txq,
-			  struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_single_send(struct mlx5_txq_data *__rte_restrict txq,
+			  struct rte_mbuf **__rte_restrict pkts,
 			  unsigned int pkts_n,
-			  struct mlx5_txq_local *restrict loc,
+			  struct mlx5_txq_local *__rte_restrict loc,
 			  unsigned int olx)
 {
 	/*
@@ -4504,7 +4504,7 @@ mlx5_tx_burst_single_send(struct mlx5_txq_data *restrict txq,
 	pkts += loc->pkts_sent + 1;
 	pkts_n -= loc->pkts_sent;
 	for (;;) {
-		struct mlx5_wqe *restrict wqe;
+		struct mlx5_wqe *__rte_restrict wqe;
 		enum mlx5_txcmp_code ret;
 
 		MLX5_ASSERT(NB_SEGS(loc->mbuf) == 1);
@@ -4602,7 +4602,7 @@ mlx5_tx_burst_single_send(struct mlx5_txq_data *restrict txq,
 				 * not contain inlined data for eMPW due to
 				 * segment shared for all packets.
 				 */
-				struct mlx5_wqe_dseg *restrict dseg;
+				struct mlx5_wqe_dseg *__rte_restrict dseg;
 				unsigned int ds;
 				uint8_t *dptr;
 
@@ -4765,10 +4765,10 @@ mlx5_tx_burst_single_send(struct mlx5_txq_data *restrict txq,
 }
 
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_single(struct mlx5_txq_data *restrict txq,
-		     struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_single(struct mlx5_txq_data *__rte_restrict txq,
+		     struct rte_mbuf **__rte_restrict pkts,
 		     unsigned int pkts_n,
-		     struct mlx5_txq_local *restrict loc,
+		     struct mlx5_txq_local *__rte_restrict loc,
 		     unsigned int olx)
 {
 	enum mlx5_txcmp_code ret;
@@ -4819,8 +4819,8 @@ mlx5_tx_burst_single(struct mlx5_txq_data *restrict txq,
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static __rte_always_inline uint16_t
-mlx5_tx_burst_tmpl(struct mlx5_txq_data *restrict txq,
-		   struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_tmpl(struct mlx5_txq_data *__rte_restrict txq,
+		   struct rte_mbuf **__rte_restrict pkts,
 		   uint16_t pkts_n,
 		   unsigned int olx)
 {
-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers Joyce Kong
@ 2020-07-07  2:15     ` Jerin Jacob
  2020-07-07  2:24     ` Phil Yang
                       ` (2 subsequent siblings)
  3 siblings, 0 replies; 40+ messages in thread
From: Jerin Jacob @ 2020-07-07  2:15 UTC (permalink / raw)
  To: Joyce Kong
  Cc: Maxime Coquelin, Jerin Jacob, Zhihong Wang, Ye, Xiaolong,
	Beilei Xing, jia.guo, John McNamara, Matan Azrad, Shahaf Shuler,
	Slava Ovsiienko, Honnappa Nagarahalli, Phil Yang,
	Ruifeng Wang (Arm Technology China),
	dpdk-dev, nd

On Mon, Jul 6, 2020 at 1:19 PM Joyce Kong <joyce.kong@arm.com> wrote:
>
> The 'restrict' keyword is recognized in C99, while the type qualifier
> '__restrict' compiles OK in C at all language levels. This patch
> adds a wrapper, '__rte_restrict', defined as either 'restrict' or
> '__restrict', so that it is supported by all compilers.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>

Acked-by: Jerin Jacob <jerinj@marvell.com>


> ---
>  lib/librte_eal/include/rte_common.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/lib/librte_eal/include/rte_common.h b/lib/librte_eal/include/rte_common.h
> index 0843ce69e..cda32c056 100644
> --- a/lib/librte_eal/include/rte_common.h
> +++ b/lib/librte_eal/include/rte_common.h
> @@ -103,6 +103,16 @@ typedef uint16_t unaligned_uint16_t;
>   */
>  #define __rte_unused __attribute__((__unused__))
>
> +/**
> + * Define a wrapper for restricted pointers which can be supported
> + * by all compilers.
> + */
> +#if __STDC_VERSION__ >= 199901
> +#define __rte_restrict restrict
> +#else
> +#define __rte_restrict __restrict
> +#endif
> +
>  /**
>   * definition to mark a variable or function parameter as used so
>   * as to avoid a compiler warning
> --
> 2.27.0
>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers Joyce Kong
  2020-07-07  2:15     ` Jerin Jacob
@ 2020-07-07  2:24     ` Phil Yang
  2020-07-07  2:40     ` Ruifeng Wang
  2020-07-07 13:57     ` David Marchand
  3 siblings, 0 replies; 40+ messages in thread
From: Phil Yang @ 2020-07-07  2:24 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	Honnappa Nagarahalli, Ruifeng Wang
  Cc: dev, nd

> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Monday, July 6, 2020 3:49 PM
> To: maxime.coquelin@redhat.com; jerinj@marvell.com;
> zhihong.wang@intel.com; xiaolong.ye@intel.com; beilei.xing@intel.com;
> jia.guo@intel.com; john.mcnamara@intel.com; matan@mellanox.com;
> shahafs@mellanox.com; viacheslavo@mellanox.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: [PATCH v2 1/6] lib/eal: add a common wrapper for restricted
> pointers
> 
> The 'restrict' keyword is recognized in C99, while the type qualifier
> '__restrict' compiles OK in C at all language levels. This patch
> adds a wrapper, '__rte_restrict', defined as either 'restrict' or
> '__restrict', so that it is supported by all compilers.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> ---

Reviewed-by: Phil Yang <phil.yang@arm.com>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict Joyce Kong
@ 2020-07-07  2:25     ` Phil Yang
  2020-07-07  2:43     ` Ruifeng Wang
  2020-07-07 14:00     ` David Marchand
  2 siblings, 0 replies; 40+ messages in thread
From: Phil Yang @ 2020-07-07  2:25 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	Honnappa Nagarahalli, Ruifeng Wang
  Cc: dev, nd

> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Monday, July 6, 2020 3:49 PM
> To: maxime.coquelin@redhat.com; jerinj@marvell.com;
> zhihong.wang@intel.com; xiaolong.ye@intel.com; beilei.xing@intel.com;
> jia.guo@intel.com; john.mcnamara@intel.com; matan@mellanox.com;
> shahafs@mellanox.com; viacheslavo@mellanox.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: [PATCH v2 4/6] net/i40e: replace restrict with rte restrict
> 
> '__rte_restrict' is a common wrapper for restricted pointers which
> can be supported by all compilers. Use '__rte_restrict' instead of
> '__restrict' for code consistency.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> ---
Reviewed-by: Phil Yang <phil.yang@arm.com> 


>  drivers/net/i40e/i40e_rxtx_vec_neon.c | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> index 1dfd0478b..4574139d5 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> +++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> @@ -172,8 +172,8 @@ desc_to_olflags_v(struct i40e_rx_queue *rxq,
> uint64x2_t descs[4],
>  #define I40E_UINT16_BIT (CHAR_BIT * sizeof(uint16_t))
> 
>  static inline void
> -desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__restrict rx_pkts,
> -		uint32_t *__restrict ptype_tbl)
> +desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__rte_restrict
> rx_pkts,
> +		uint32_t *__rte_restrict ptype_tbl)
>  {
>  	int i;
>  	uint8_t ptype;
> @@ -194,8 +194,9 @@ desc_to_ptype_v(uint64x2_t descs[4], struct
> rte_mbuf **__restrict rx_pkts,
>   *   numbers of DD bits
>   */
>  static inline uint16_t
> -_recv_raw_pkts_vec(struct i40e_rx_queue *__restrict rxq, struct rte_mbuf
> -	**__restrict rx_pkts, uint16_t nb_pkts, uint8_t *split_packet)
> +_recv_raw_pkts_vec(struct i40e_rx_queue *__rte_restrict rxq,
> +		   struct rte_mbuf **__rte_restrict rx_pkts,
> +		   uint16_t nb_pkts, uint8_t *split_packet)
>  {
>  	volatile union i40e_rx_desc *rxdp;
>  	struct i40e_rx_entry *sw_ring;
> @@ -432,8 +433,8 @@ _recv_raw_pkts_vec(struct i40e_rx_queue
> *__restrict rxq, struct rte_mbuf
>   *   numbers of DD bits
>   */
>  uint16_t
> -i40e_recv_pkts_vec(void *__restrict rx_queue,
> -		struct rte_mbuf **__restrict rx_pkts, uint16_t nb_pkts)
> +i40e_recv_pkts_vec(void *__rte_restrict rx_queue,
> +		struct rte_mbuf **__rte_restrict rx_pkts, uint16_t nb_pkts)
>  {
>  	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
>  }
> @@ -504,8 +505,8 @@ vtx(volatile struct i40e_tx_desc *txdp, struct
> rte_mbuf **pkt,
>  }
> 
>  uint16_t
> -i40e_xmit_fixed_burst_vec(void *__restrict tx_queue,
> -	struct rte_mbuf **__restrict tx_pkts, uint16_t nb_pkts)
> +i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
> +	struct rte_mbuf **__rte_restrict tx_pkts, uint16_t nb_pkts)
>  {
>  	struct i40e_tx_queue *txq = (struct i40e_tx_queue *)tx_queue;
>  	volatile struct i40e_tx_desc *txdp;
> --
> 2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 5/6] examples/performance-thread: replace restrict with wrapper
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 5/6] examples/performance-thread: replace restrict with wrapper Joyce Kong
@ 2020-07-07  2:27     ` Phil Yang
  2020-07-07  2:45     ` Ruifeng Wang
  1 sibling, 0 replies; 40+ messages in thread
From: Phil Yang @ 2020-07-07  2:27 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	Honnappa Nagarahalli, Ruifeng Wang
  Cc: dev, nd

> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Monday, July 6, 2020 3:49 PM
> To: maxime.coquelin@redhat.com; jerinj@marvell.com;
> zhihong.wang@intel.com; xiaolong.ye@intel.com; beilei.xing@intel.com;
> jia.guo@intel.com; john.mcnamara@intel.com; matan@mellanox.com;
> shahafs@mellanox.com; viacheslavo@mellanox.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: [PATCH v2 5/6] examples/performance-thread: replace restrict with
> wrapper
> 
> '__rte_restrict' is a common wrapper for restricted pointers which
> can be supported by all compilers. Use '__rte_restrict' instead of
> '__restrict' for code consistency.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>


Reviewed-by: Phil Yang <phil.yang@arm.com>
 
> ---
>  .../performance-thread/pthread_shim/pthread_shim.c   | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/examples/performance-thread/pthread_shim/pthread_shim.c
> b/examples/performance-thread/pthread_shim/pthread_shim.c
> index 93e8dca3f..bbc076584 100644
> --- a/examples/performance-thread/pthread_shim/pthread_shim.c
> +++ b/examples/performance-thread/pthread_shim/pthread_shim.c
> @@ -341,9 +341,9 @@ int pthread_cond_signal(pthread_cond_t *cond)
>  }
> 
>  int
> -pthread_cond_timedwait(pthread_cond_t *__restrict cond,
> -		       pthread_mutex_t *__restrict mutex,
> -		       const struct timespec *__restrict time)
> +pthread_cond_timedwait(pthread_cond_t *__rte_restrict cond,
> +		       pthread_mutex_t *__rte_restrict mutex,
> +		       const struct timespec *__rte_restrict time)
>  {
>  	NOT_IMPLEMENTED;
>  	return _sys_pthread_funcs.f_pthread_cond_timedwait(cond,
> mutex, time);
> @@ -362,10 +362,10 @@ int pthread_cond_wait(pthread_cond_t *cond,
> pthread_mutex_t *mutex)
>  }
> 
>  int
> -pthread_create(pthread_t *__restrict tid,
> -		const pthread_attr_t *__restrict attr,
> +pthread_create(pthread_t *__rte_restrict tid,
> +		const pthread_attr_t *__rte_restrict attr,
>  		lthread_func_t func,
> -	       void *__restrict arg)
> +	       void *__rte_restrict arg)
>  {
>  	if (override) {
>  		int lcore = -1;
> --
> 2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict Joyce Kong
@ 2020-07-07  2:28     ` Phil Yang
  2020-07-07  2:47     ` Ruifeng Wang
  1 sibling, 0 replies; 40+ messages in thread
From: Phil Yang @ 2020-07-07  2:28 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	Honnappa Nagarahalli, Ruifeng Wang
  Cc: dev, nd

> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Monday, July 6, 2020 3:50 PM
> To: maxime.coquelin@redhat.com; jerinj@marvell.com;
> zhihong.wang@intel.com; xiaolong.ye@intel.com; beilei.xing@intel.com;
> jia.guo@intel.com; john.mcnamara@intel.com; matan@mellanox.com;
> shahafs@mellanox.com; viacheslavo@mellanox.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict
> 
> The 'restrict' keyword was introduced in C99, so older compilers
> and pre-C99 language modes may not accept it. It is better to use
> the wrapper '__rte_restrict', which can be supported by all
> compilers, for restricted pointers.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>


Reviewed-by: Phil Yang <phil.yang@arm.com>

> ---
>  drivers/net/mlx5/mlx5_rxtx.c | 208 +++++++++++++++++------------------
>  1 file changed, 104 insertions(+), 104 deletions(-)
> 
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index e4106bf0a..894f441f3 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -113,13 +113,13 @@ mlx5_queue_state_modify(struct rte_eth_dev
> *dev,
>  			struct mlx5_mp_arg_queue_state_modify *sm);
> 
>  static inline void
> -mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *restrict tcp,
> -			volatile struct mlx5_cqe *restrict cqe,
> +mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *__rte_restrict tcp,
> +			volatile struct mlx5_cqe *__rte_restrict cqe,
>  			uint32_t phcsum);
> 
>  static inline void
> -mlx5_lro_update_hdr(uint8_t *restrict padd,
> -		    volatile struct mlx5_cqe *restrict cqe,
> +mlx5_lro_update_hdr(uint8_t *__rte_restrict padd,
> +		    volatile struct mlx5_cqe *__rte_restrict cqe,
>  		    uint32_t len);
> 
>  uint32_t mlx5_ptype_table[] __rte_cache_aligned = {
> @@ -374,7 +374,7 @@ mlx5_set_swp_types_table(void)
>   *   Software Parser flags are set by pointer.
>   */
>  static __rte_always_inline uint32_t
> -txq_mbuf_to_swp(struct mlx5_txq_local *restrict loc,
> +txq_mbuf_to_swp(struct mlx5_txq_local *__rte_restrict loc,
>  		uint8_t *swp_flags,
>  		unsigned int olx)
>  {
> @@ -747,7 +747,7 @@ check_err_cqe_seen(volatile struct mlx5_err_cqe
> *err_cqe)
>   *   the error completion entry is handled successfully.
>   */
>  static int
> -mlx5_tx_error_cqe_handle(struct mlx5_txq_data *restrict txq,
> +mlx5_tx_error_cqe_handle(struct mlx5_txq_data *__rte_restrict txq,
>  			 volatile struct mlx5_err_cqe *err_cqe)
>  {
>  	if (err_cqe->syndrome != MLX5_CQE_SYNDROME_WR_FLUSH_ERR)
> {
> @@ -1508,8 +1508,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf
> **pkts, uint16_t pkts_n)
>   *   The L3 pseudo-header checksum.
>   */
>  static inline void
> -mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *restrict tcp,
> -			volatile struct mlx5_cqe *restrict cqe,
> +mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *__rte_restrict tcp,
> +			volatile struct mlx5_cqe *__rte_restrict cqe,
>  			uint32_t phcsum)
>  {
>  	uint8_t l4_type = (rte_be_to_cpu_16(cqe->hdr_type_etc) &
> @@ -1550,8 +1550,8 @@ mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr
> *restrict tcp,
>   *   The packet length.
>   */
>  static inline void
> -mlx5_lro_update_hdr(uint8_t *restrict padd,
> -		    volatile struct mlx5_cqe *restrict cqe,
> +mlx5_lro_update_hdr(uint8_t *__rte_restrict padd,
> +		    volatile struct mlx5_cqe *__rte_restrict cqe,
>  		    uint32_t len)
>  {
>  	union {
> @@ -1965,7 +1965,7 @@ mlx5_check_vec_rx_support(struct rte_eth_dev
> *dev __rte_unused)
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_free_mbuf(struct rte_mbuf **restrict pkts,
> +mlx5_tx_free_mbuf(struct rte_mbuf **__rte_restrict pkts,
>  		  unsigned int pkts_n,
>  		  unsigned int olx __rte_unused)
>  {
> @@ -2070,7 +2070,7 @@ mlx5_tx_free_mbuf(struct rte_mbuf **restrict
> pkts,
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_free_elts(struct mlx5_txq_data *restrict txq,
> +mlx5_tx_free_elts(struct mlx5_txq_data *__rte_restrict txq,
>  		  uint16_t tail,
>  		  unsigned int olx __rte_unused)
>  {
> @@ -2111,8 +2111,8 @@ mlx5_tx_free_elts(struct mlx5_txq_data *restrict
> txq,
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_copy_elts(struct mlx5_txq_data *restrict txq,
> -		  struct rte_mbuf **restrict pkts,
> +mlx5_tx_copy_elts(struct mlx5_txq_data *__rte_restrict txq,
> +		  struct rte_mbuf **__rte_restrict pkts,
>  		  unsigned int pkts_n,
>  		  unsigned int olx __rte_unused)
>  {
> @@ -2148,7 +2148,7 @@ mlx5_tx_copy_elts(struct mlx5_txq_data *restrict
> txq,
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_comp_flush(struct mlx5_txq_data *restrict txq,
> +mlx5_tx_comp_flush(struct mlx5_txq_data *__rte_restrict txq,
>  		   volatile struct mlx5_cqe *last_cqe,
>  		   unsigned int olx __rte_unused)
>  {
> @@ -2179,7 +2179,7 @@ mlx5_tx_comp_flush(struct mlx5_txq_data
> *restrict txq,
>   * routine smaller, simple and faster - from experiments.
>   */
>  static void
> -mlx5_tx_handle_completion(struct mlx5_txq_data *restrict txq,
> +mlx5_tx_handle_completion(struct mlx5_txq_data *__rte_restrict txq,
>  			  unsigned int olx __rte_unused)
>  {
>  	unsigned int count = MLX5_TX_COMP_MAX_CQE;
> @@ -2268,8 +2268,8 @@ mlx5_tx_handle_completion(struct mlx5_txq_data
> *restrict txq,
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_request_completion(struct mlx5_txq_data *restrict txq,
> -			   struct mlx5_txq_local *restrict loc,
> +mlx5_tx_request_completion(struct mlx5_txq_data *__rte_restrict txq,
> +			   struct mlx5_txq_local *__rte_restrict loc,
>  			   unsigned int olx)
>  {
>  	uint16_t head = txq->elts_head;
> @@ -2316,7 +2316,7 @@ mlx5_tx_request_completion(struct
> mlx5_txq_data *restrict txq,
>  int
>  mlx5_tx_descriptor_status(void *tx_queue, uint16_t offset)
>  {
> -	struct mlx5_txq_data *restrict txq = tx_queue;
> +	struct mlx5_txq_data *__rte_restrict txq = tx_queue;
>  	uint16_t used;
> 
>  	mlx5_tx_handle_completion(txq, 0);
> @@ -2347,14 +2347,14 @@ mlx5_tx_descriptor_status(void *tx_queue,
> uint16_t offset)
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_cseg_init(struct mlx5_txq_data *restrict txq,
> -		  struct mlx5_txq_local *restrict loc __rte_unused,
> -		  struct mlx5_wqe *restrict wqe,
> +mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq,
> +		  struct mlx5_txq_local *__rte_restrict loc __rte_unused,
> +		  struct mlx5_wqe *__rte_restrict wqe,
>  		  unsigned int ds,
>  		  unsigned int opcode,
>  		  unsigned int olx __rte_unused)
>  {
> -	struct mlx5_wqe_cseg *restrict cs = &wqe->cseg;
> +	struct mlx5_wqe_cseg *__rte_restrict cs = &wqe->cseg;
> 
>  	/* For legacy MPW replace the EMPW by TSO with modifier. */
>  	if (MLX5_TXOFF_CONFIG(MPW) && opcode ==
> MLX5_OPCODE_ENHANCED_MPSW)
> @@ -2382,12 +2382,12 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *restrict
> txq,
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_eseg_none(struct mlx5_txq_data *restrict txq __rte_unused,
> -		  struct mlx5_txq_local *restrict loc,
> -		  struct mlx5_wqe *restrict wqe,
> +mlx5_tx_eseg_none(struct mlx5_txq_data *__rte_restrict txq
> __rte_unused,
> +		  struct mlx5_txq_local *__rte_restrict loc,
> +		  struct mlx5_wqe *__rte_restrict wqe,
>  		  unsigned int olx)
>  {
> -	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
> +	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
>  	uint32_t csum;
> 
>  	/*
> @@ -2440,13 +2440,13 @@ mlx5_tx_eseg_none(struct mlx5_txq_data
> *restrict txq __rte_unused,
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_eseg_dmin(struct mlx5_txq_data *restrict txq __rte_unused,
> -		  struct mlx5_txq_local *restrict loc,
> -		  struct mlx5_wqe *restrict wqe,
> +mlx5_tx_eseg_dmin(struct mlx5_txq_data *__rte_restrict txq
> __rte_unused,
> +		  struct mlx5_txq_local *__rte_restrict loc,
> +		  struct mlx5_wqe *__rte_restrict wqe,
>  		  unsigned int vlan,
>  		  unsigned int olx)
>  {
> -	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
> +	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
>  	uint32_t csum;
>  	uint8_t *psrc, *pdst;
> 
> @@ -2524,15 +2524,15 @@ mlx5_tx_eseg_dmin(struct mlx5_txq_data
> *restrict txq __rte_unused,
>   *   Pointer to the next Data Segment (aligned and wrapped around).
>   */
>  static __rte_always_inline struct mlx5_wqe_dseg *
> -mlx5_tx_eseg_data(struct mlx5_txq_data *restrict txq,
> -		  struct mlx5_txq_local *restrict loc,
> -		  struct mlx5_wqe *restrict wqe,
> +mlx5_tx_eseg_data(struct mlx5_txq_data *__rte_restrict txq,
> +		  struct mlx5_txq_local *__rte_restrict loc,
> +		  struct mlx5_wqe *__rte_restrict wqe,
>  		  unsigned int vlan,
>  		  unsigned int inlen,
>  		  unsigned int tso,
>  		  unsigned int olx)
>  {
> -	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
> +	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
>  	uint32_t csum;
>  	uint8_t *psrc, *pdst;
>  	unsigned int part;
> @@ -2650,7 +2650,7 @@ mlx5_tx_eseg_data(struct mlx5_txq_data *restrict
> txq,
>   */
>  static __rte_always_inline unsigned int
>  mlx5_tx_mseg_memcpy(uint8_t *pdst,
> -		    struct mlx5_txq_local *restrict loc,
> +		    struct mlx5_txq_local *__rte_restrict loc,
>  		    unsigned int len,
>  		    unsigned int must,
>  		    unsigned int olx __rte_unused)
> @@ -2747,15 +2747,15 @@ mlx5_tx_mseg_memcpy(uint8_t *pdst,
>   *   wrapping check on its own).
>   */
>  static __rte_always_inline struct mlx5_wqe_dseg *
> -mlx5_tx_eseg_mdat(struct mlx5_txq_data *restrict txq,
> -		  struct mlx5_txq_local *restrict loc,
> -		  struct mlx5_wqe *restrict wqe,
> +mlx5_tx_eseg_mdat(struct mlx5_txq_data *__rte_restrict txq,
> +		  struct mlx5_txq_local *__rte_restrict loc,
> +		  struct mlx5_wqe *__rte_restrict wqe,
>  		  unsigned int vlan,
>  		  unsigned int inlen,
>  		  unsigned int tso,
>  		  unsigned int olx)
>  {
> -	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
> +	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
>  	uint32_t csum;
>  	uint8_t *pdst;
>  	unsigned int part, tlen = 0;
> @@ -2851,9 +2851,9 @@ mlx5_tx_eseg_mdat(struct mlx5_txq_data *restrict
> txq,
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_dseg_ptr(struct mlx5_txq_data *restrict txq,
> -		 struct mlx5_txq_local *restrict loc,
> -		 struct mlx5_wqe_dseg *restrict dseg,
> +mlx5_tx_dseg_ptr(struct mlx5_txq_data *__rte_restrict txq,
> +		 struct mlx5_txq_local *__rte_restrict loc,
> +		 struct mlx5_wqe_dseg *__rte_restrict dseg,
>  		 uint8_t *buf,
>  		 unsigned int len,
>  		 unsigned int olx __rte_unused)
> @@ -2885,9 +2885,9 @@ mlx5_tx_dseg_ptr(struct mlx5_txq_data *restrict
> txq,
>   *   compile time and may be used for optimization.
>   */
>  static __rte_always_inline void
> -mlx5_tx_dseg_iptr(struct mlx5_txq_data *restrict txq,
> -		  struct mlx5_txq_local *restrict loc,
> -		  struct mlx5_wqe_dseg *restrict dseg,
> +mlx5_tx_dseg_iptr(struct mlx5_txq_data *__rte_restrict txq,
> +		  struct mlx5_txq_local *__rte_restrict loc,
> +		  struct mlx5_wqe_dseg *__rte_restrict dseg,
>  		  uint8_t *buf,
>  		  unsigned int len,
>  		  unsigned int olx __rte_unused)
> @@ -2961,9 +2961,9 @@ mlx5_tx_dseg_iptr(struct mlx5_txq_data *restrict
> txq,
>   *   last packet in the eMPW session.
>   */
>  static __rte_always_inline struct mlx5_wqe_dseg *
> -mlx5_tx_dseg_empw(struct mlx5_txq_data *restrict txq,
> -		  struct mlx5_txq_local *restrict loc __rte_unused,
> -		  struct mlx5_wqe_dseg *restrict dseg,
> +mlx5_tx_dseg_empw(struct mlx5_txq_data *__rte_restrict txq,
> +		  struct mlx5_txq_local *__rte_restrict loc __rte_unused,
> +		  struct mlx5_wqe_dseg *__rte_restrict dseg,
>  		  uint8_t *buf,
>  		  unsigned int len,
>  		  unsigned int olx __rte_unused)
> @@ -3024,9 +3024,9 @@ mlx5_tx_dseg_empw(struct mlx5_txq_data
> *restrict txq,
>   *   Ring buffer wraparound check is needed.
>   */
>  static __rte_always_inline struct mlx5_wqe_dseg *
> -mlx5_tx_dseg_vlan(struct mlx5_txq_data *restrict txq,
> -		  struct mlx5_txq_local *restrict loc __rte_unused,
> -		  struct mlx5_wqe_dseg *restrict dseg,
> +mlx5_tx_dseg_vlan(struct mlx5_txq_data *__rte_restrict txq,
> +		  struct mlx5_txq_local *__rte_restrict loc __rte_unused,
> +		  struct mlx5_wqe_dseg *__rte_restrict dseg,
>  		  uint8_t *buf,
>  		  unsigned int len,
>  		  unsigned int olx __rte_unused)
> @@ -3112,15 +3112,15 @@ mlx5_tx_dseg_vlan(struct mlx5_txq_data
> *restrict txq,
>   *   Actual size of built WQE in segments.
>   */
>  static __rte_always_inline unsigned int
> -mlx5_tx_mseg_build(struct mlx5_txq_data *restrict txq,
> -		   struct mlx5_txq_local *restrict loc,
> -		   struct mlx5_wqe *restrict wqe,
> +mlx5_tx_mseg_build(struct mlx5_txq_data *__rte_restrict txq,
> +		   struct mlx5_txq_local *__rte_restrict loc,
> +		   struct mlx5_wqe *__rte_restrict wqe,
>  		   unsigned int vlan,
>  		   unsigned int inlen,
>  		   unsigned int tso,
>  		   unsigned int olx __rte_unused)
>  {
> -	struct mlx5_wqe_dseg *restrict dseg;
> +	struct mlx5_wqe_dseg *__rte_restrict dseg;
>  	unsigned int ds;
> 
>  	MLX5_ASSERT((rte_pktmbuf_pkt_len(loc->mbuf) + vlan) >= inlen);
> @@ -3225,11 +3225,11 @@ mlx5_tx_mseg_build(struct mlx5_txq_data
> *restrict txq,
>   * Local context variables partially updated.
>   */
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_packet_multi_tso(struct mlx5_txq_data *restrict txq,
> -			struct mlx5_txq_local *restrict loc,
> +mlx5_tx_packet_multi_tso(struct mlx5_txq_data *__rte_restrict txq,
> +			struct mlx5_txq_local *__rte_restrict loc,
>  			unsigned int olx)
>  {
> -	struct mlx5_wqe *restrict wqe;
> +	struct mlx5_wqe *__rte_restrict wqe;
>  	unsigned int ds, dlen, inlen, ntcp, vlan = 0;
> 
>  	/*
> @@ -3314,12 +3314,12 @@ mlx5_tx_packet_multi_tso(struct mlx5_txq_data
> *restrict txq,
>   * Local context variables partially updated.
>   */
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_packet_multi_send(struct mlx5_txq_data *restrict txq,
> -			  struct mlx5_txq_local *restrict loc,
> +mlx5_tx_packet_multi_send(struct mlx5_txq_data *__rte_restrict txq,
> +			  struct mlx5_txq_local *__rte_restrict loc,
>  			  unsigned int olx)
>  {
> -	struct mlx5_wqe_dseg *restrict dseg;
> -	struct mlx5_wqe *restrict wqe;
> +	struct mlx5_wqe_dseg *__rte_restrict dseg;
> +	struct mlx5_wqe *__rte_restrict wqe;
>  	unsigned int ds, nseg;
> 
>  	MLX5_ASSERT(NB_SEGS(loc->mbuf) > 1);
> @@ -3422,11 +3422,11 @@ mlx5_tx_packet_multi_send(struct
> mlx5_txq_data *restrict txq,
>   * Local context variables partially updated.
>   */
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_packet_multi_inline(struct mlx5_txq_data *restrict txq,
> -			    struct mlx5_txq_local *restrict loc,
> +mlx5_tx_packet_multi_inline(struct mlx5_txq_data *__rte_restrict txq,
> +			    struct mlx5_txq_local *__rte_restrict loc,
>  			    unsigned int olx)
>  {
> -	struct mlx5_wqe *restrict wqe;
> +	struct mlx5_wqe *__rte_restrict wqe;
>  	unsigned int ds, inlen, dlen, vlan = 0;
> 
>  	MLX5_ASSERT(MLX5_TXOFF_CONFIG(INLINE));
> @@ -3587,10 +3587,10 @@ mlx5_tx_packet_multi_inline(struct
> mlx5_txq_data *restrict txq,
>   * Local context variables updated.
>   */
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_burst_mseg(struct mlx5_txq_data *restrict txq,
> -		   struct rte_mbuf **restrict pkts,
> +mlx5_tx_burst_mseg(struct mlx5_txq_data *__rte_restrict txq,
> +		   struct rte_mbuf **__rte_restrict pkts,
>  		   unsigned int pkts_n,
> -		   struct mlx5_txq_local *restrict loc,
> +		   struct mlx5_txq_local *__rte_restrict loc,
>  		   unsigned int olx)
>  {
>  	MLX5_ASSERT(loc->elts_free && loc->wqe_free);
> @@ -3676,10 +3676,10 @@ mlx5_tx_burst_mseg(struct mlx5_txq_data
> *restrict txq,
>   * Local context variables updated.
>   */
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_burst_tso(struct mlx5_txq_data *restrict txq,
> -		  struct rte_mbuf **restrict pkts,
> +mlx5_tx_burst_tso(struct mlx5_txq_data *__rte_restrict txq,
> +		  struct rte_mbuf **__rte_restrict pkts,
>  		  unsigned int pkts_n,
> -		  struct mlx5_txq_local *restrict loc,
> +		  struct mlx5_txq_local *__rte_restrict loc,
>  		  unsigned int olx)
>  {
>  	MLX5_ASSERT(loc->elts_free && loc->wqe_free);
> @@ -3687,8 +3687,8 @@ mlx5_tx_burst_tso(struct mlx5_txq_data *restrict
> txq,
>  	pkts += loc->pkts_sent + 1;
>  	pkts_n -= loc->pkts_sent;
>  	for (;;) {
> -		struct mlx5_wqe_dseg *restrict dseg;
> -		struct mlx5_wqe *restrict wqe;
> +		struct mlx5_wqe_dseg *__rte_restrict dseg;
> +		struct mlx5_wqe *__rte_restrict wqe;
>  		unsigned int ds, dlen, hlen, ntcp, vlan = 0;
>  		uint8_t *dptr;
> 
> @@ -3800,8 +3800,8 @@ mlx5_tx_burst_tso(struct mlx5_txq_data *restrict
> txq,
>   *  MLX5_TXCMP_CODE_EMPW - single-segment packet, use MPW.
>   */
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_able_to_empw(struct mlx5_txq_data *restrict txq,
> -		     struct mlx5_txq_local *restrict loc,
> +mlx5_tx_able_to_empw(struct mlx5_txq_data *__rte_restrict txq,
> +		     struct mlx5_txq_local *__rte_restrict loc,
>  		     unsigned int olx,
>  		     bool newp)
>  {
> @@ -3855,9 +3855,9 @@ mlx5_tx_able_to_empw(struct mlx5_txq_data
> *restrict txq,
>   *  false - no match, eMPW should be restarted.
>   */
>  static __rte_always_inline bool
> -mlx5_tx_match_empw(struct mlx5_txq_data *restrict txq __rte_unused,
> -		   struct mlx5_wqe_eseg *restrict es,
> -		   struct mlx5_txq_local *restrict loc,
> +mlx5_tx_match_empw(struct mlx5_txq_data *__rte_restrict txq
> __rte_unused,
> +		   struct mlx5_wqe_eseg *__rte_restrict es,
> +		   struct mlx5_txq_local *__rte_restrict loc,
>  		   uint32_t dlen,
>  		   unsigned int olx)
>  {
> @@ -3909,8 +3909,8 @@ mlx5_tx_match_empw(struct mlx5_txq_data
> *restrict txq __rte_unused,
>   *  false - no match, eMPW should be restarted.
>   */
>  static __rte_always_inline void
> -mlx5_tx_sdone_empw(struct mlx5_txq_data *restrict txq,
> -		   struct mlx5_txq_local *restrict loc,
> +mlx5_tx_sdone_empw(struct mlx5_txq_data *__rte_restrict txq,
> +		   struct mlx5_txq_local *__rte_restrict loc,
>  		   unsigned int ds,
>  		   unsigned int slen,
>  		   unsigned int olx __rte_unused)
> @@ -3954,11 +3954,11 @@ mlx5_tx_sdone_empw(struct mlx5_txq_data
> *restrict txq,
>   *  false - no match, eMPW should be restarted.
>   */
>  static __rte_always_inline void
> -mlx5_tx_idone_empw(struct mlx5_txq_data *restrict txq,
> -		   struct mlx5_txq_local *restrict loc,
> +mlx5_tx_idone_empw(struct mlx5_txq_data *__rte_restrict txq,
> +		   struct mlx5_txq_local *__rte_restrict loc,
>  		   unsigned int len,
>  		   unsigned int slen,
> -		   struct mlx5_wqe *restrict wqem,
> +		   struct mlx5_wqe *__rte_restrict wqem,
>  		   unsigned int olx __rte_unused)
>  {
>  	struct mlx5_wqe_dseg *dseg = &wqem->dseg[0];
> @@ -4042,10 +4042,10 @@ mlx5_tx_idone_empw(struct mlx5_txq_data
> *restrict txq,
>   * No VLAN insertion is supported.
>   */
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_burst_empw_simple(struct mlx5_txq_data *restrict txq,
> -			  struct rte_mbuf **restrict pkts,
> +mlx5_tx_burst_empw_simple(struct mlx5_txq_data *__rte_restrict txq,
> +			  struct rte_mbuf **__rte_restrict pkts,
>  			  unsigned int pkts_n,
> -			  struct mlx5_txq_local *restrict loc,
> +			  struct mlx5_txq_local *__rte_restrict loc,
>  			  unsigned int olx)
>  {
>  	/*
> @@ -4061,8 +4061,8 @@ mlx5_tx_burst_empw_simple(struct
> mlx5_txq_data *restrict txq,
>  	pkts += loc->pkts_sent + 1;
>  	pkts_n -= loc->pkts_sent;
>  	for (;;) {
> -		struct mlx5_wqe_dseg *restrict dseg;
> -		struct mlx5_wqe_eseg *restrict eseg;
> +		struct mlx5_wqe_dseg *__rte_restrict dseg;
> +		struct mlx5_wqe_eseg *__rte_restrict eseg;
>  		enum mlx5_txcmp_code ret;
>  		unsigned int part, loop;
>  		unsigned int slen = 0;
> @@ -4208,10 +4208,10 @@ mlx5_tx_burst_empw_simple(struct
> mlx5_txq_data *restrict txq,
>   * with inlining, optionally supports VLAN insertion.
>   */
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_burst_empw_inline(struct mlx5_txq_data *restrict txq,
> -			  struct rte_mbuf **restrict pkts,
> +mlx5_tx_burst_empw_inline(struct mlx5_txq_data *__rte_restrict txq,
> +			  struct rte_mbuf **__rte_restrict pkts,
>  			  unsigned int pkts_n,
> -			  struct mlx5_txq_local *restrict loc,
> +			  struct mlx5_txq_local *__rte_restrict loc,
>  			  unsigned int olx)
>  {
>  	/*
> @@ -4227,8 +4227,8 @@ mlx5_tx_burst_empw_inline(struct mlx5_txq_data
> *restrict txq,
>  	pkts += loc->pkts_sent + 1;
>  	pkts_n -= loc->pkts_sent;
>  	for (;;) {
> -		struct mlx5_wqe_dseg *restrict dseg;
> -		struct mlx5_wqe *restrict wqem;
> +		struct mlx5_wqe_dseg *__rte_restrict dseg;
> +		struct mlx5_wqe *__rte_restrict wqem;
>  		enum mlx5_txcmp_code ret;
>  		unsigned int room, part, nlim;
>  		unsigned int slen = 0;
> @@ -4489,10 +4489,10 @@ mlx5_tx_burst_empw_inline(struct
> mlx5_txq_data *restrict txq,
>   * Data inlining and VLAN insertion are supported.
>   */
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_burst_single_send(struct mlx5_txq_data *restrict txq,
> -			  struct rte_mbuf **restrict pkts,
> +mlx5_tx_burst_single_send(struct mlx5_txq_data *__rte_restrict txq,
> +			  struct rte_mbuf **__rte_restrict pkts,
>  			  unsigned int pkts_n,
> -			  struct mlx5_txq_local *restrict loc,
> +			  struct mlx5_txq_local *__rte_restrict loc,
>  			  unsigned int olx)
>  {
>  	/*
> @@ -4504,7 +4504,7 @@ mlx5_tx_burst_single_send(struct mlx5_txq_data
> *restrict txq,
>  	pkts += loc->pkts_sent + 1;
>  	pkts_n -= loc->pkts_sent;
>  	for (;;) {
> -		struct mlx5_wqe *restrict wqe;
> +		struct mlx5_wqe *__rte_restrict wqe;
>  		enum mlx5_txcmp_code ret;
> 
>  		MLX5_ASSERT(NB_SEGS(loc->mbuf) == 1);
> @@ -4602,7 +4602,7 @@ mlx5_tx_burst_single_send(struct mlx5_txq_data
> *restrict txq,
>  				 * not contain inlined data for eMPW due to
>  				 * segment shared for all packets.
>  				 */
> -				struct mlx5_wqe_dseg *restrict dseg;
> +				struct mlx5_wqe_dseg *__rte_restrict dseg;
>  				unsigned int ds;
>  				uint8_t *dptr;
> 
> @@ -4765,10 +4765,10 @@ mlx5_tx_burst_single_send(struct
> mlx5_txq_data *restrict txq,
>  }
> 
>  static __rte_always_inline enum mlx5_txcmp_code
> -mlx5_tx_burst_single(struct mlx5_txq_data *restrict txq,
> -		     struct rte_mbuf **restrict pkts,
> +mlx5_tx_burst_single(struct mlx5_txq_data *__rte_restrict txq,
> +		     struct rte_mbuf **__rte_restrict pkts,
>  		     unsigned int pkts_n,
> -		     struct mlx5_txq_local *restrict loc,
> +		     struct mlx5_txq_local *__rte_restrict loc,
>  		     unsigned int olx)
>  {
>  	enum mlx5_txcmp_code ret;
> @@ -4819,8 +4819,8 @@ mlx5_tx_burst_single(struct mlx5_txq_data
> *restrict txq,
>   *   Number of packets successfully transmitted (<= pkts_n).
>   */
>  static __rte_always_inline uint16_t
> -mlx5_tx_burst_tmpl(struct mlx5_txq_data *restrict txq,
> -		   struct rte_mbuf **restrict pkts,
> +mlx5_tx_burst_tmpl(struct mlx5_txq_data *__rte_restrict txq,
> +		   struct rte_mbuf **__rte_restrict pkts,
>  		   uint16_t pkts_n,
>  		   unsigned int olx)
>  {
> --
> 2.27.0
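
As a reading aid for the rename above, here is a minimal sketch of what a restrict-qualified pair of parameters promises. The `demo_txq`, `demo_wqe` and `demo_fill_wqe` names are hypothetical stand-ins, not mlx5 code, and the macro is redefined locally (with an added `defined()` guard) only so the snippet compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the wrapper so the sketch is self-contained. */
#ifndef __rte_restrict
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define __rte_restrict restrict
#else
#define __rte_restrict __restrict
#endif
#endif

/* Hypothetical stand-ins for a Tx queue and a work-queue entry. */
struct demo_txq { uint32_t head; uint32_t mask; };
struct demo_wqe { uint32_t ctrl; uint32_t len; };

/*
 * Both pointers are restrict-qualified, so the caller promises that
 * *txq and *wqe do not overlap. The compiler may then assume that the
 * stores to *wqe cannot change txq->head and need not reload it.
 */
static uint32_t
demo_fill_wqe(struct demo_txq *__rte_restrict txq,
	      struct demo_wqe *__rte_restrict wqe,
	      uint32_t len)
{
	/* Debug-time check of the no-overlap contract. */
	assert((void *)txq != (void *)wqe);

	wqe->ctrl = txq->head;
	wqe->len = len;
	txq->head = (txq->head + 1) & txq->mask;
	return txq->head;
}
```

Calling `demo_fill_wqe(&q, &w, 64)` with distinct `q` and `w` is fine; passing overlapping objects would break the promise and is undefined behavior, whichever spelling of the qualifier is used.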


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers Joyce Kong
  2020-07-07  2:15     ` Jerin Jacob
  2020-07-07  2:24     ` Phil Yang
@ 2020-07-07  2:40     ` Ruifeng Wang
  2020-07-07 13:57     ` David Marchand
  3 siblings, 0 replies; 40+ messages in thread
From: Ruifeng Wang @ 2020-07-07  2:40 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	Honnappa Nagarahalli, Phil Yang
  Cc: dev, nd, nd


> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Monday, July 6, 2020 3:49 PM
> To: maxime.coquelin@redhat.com; jerinj@marvell.com;
> zhihong.wang@intel.com; xiaolong.ye@intel.com; beilei.xing@intel.com;
> jia.guo@intel.com; john.mcnamara@intel.com; matan@mellanox.com;
> shahafs@mellanox.com; viacheslavo@mellanox.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: [PATCH v2 1/6] lib/eal: add a common wrapper for restricted
> pointers
> 
> The 'restrict' keyword is recognized in C99, while the type qualifier
> '__restrict' compiles ok in C at all language levels. This patch adds a
> wrapper, '__rte_restrict', defined as 'restrict' or '__restrict', so that
> it is supported by all compilers.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> ---
>  lib/librte_eal/include/rte_common.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
> 
> diff --git a/lib/librte_eal/include/rte_common.h
> b/lib/librte_eal/include/rte_common.h
> index 0843ce69e..cda32c056 100644
> --- a/lib/librte_eal/include/rte_common.h
> +++ b/lib/librte_eal/include/rte_common.h
> @@ -103,6 +103,16 @@ typedef uint16_t unaligned_uint16_t;
>   */
>  #define __rte_unused __attribute__((__unused__))
> 
> +/**
> + * Define a wrapper for restricted pointers which can be supported
> + * by all compilers.
> + */
> +#if __STDC_VERSION__ >= 199901
> +#define __rte_restrict restrict
> +#else
> +#define __rte_restrict __restrict
> +#endif
> +
>  /**
>   * definition to mark a variable or function parameter as used so
>   * as to avoid a compiler warning
> --
> 2.27.0
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
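
For readers following the macro above, a minimal sketch of how the wrapper might be used outside DPDK. The `copy_u32` helper is a hypothetical example, and the `defined(__STDC_VERSION__)` guard is an addition made here, not part of the patch; it only spells out what `#if` already does with an undefined macro (treat it as 0).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Same idea as the patch: plain 'restrict' for C99 and later, the
 * '__restrict' spelling otherwise (GCC and Clang accept it at every
 * C language level).
 */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
#define __rte_restrict restrict
#else
#define __rte_restrict __restrict
#endif

/*
 * Hypothetical helper: the qualifiers tell the compiler that dst and
 * src never overlap, which leaves it free to vectorize the loop.
 */
static void
copy_u32(uint32_t *__rte_restrict dst, const uint32_t *__rte_restrict src,
	 size_t n)
{
	size_t i;

	assert(dst != NULL && src != NULL);
	for (i = 0; i < n; i++)
		dst[i] = src[i];
}
```

The same source should build with, for example, `-std=c89` and `-std=c11` on GCC or Clang, since only the macro expansion changes between the two modes.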

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict Joyce Kong
  2020-07-07  2:25     ` Phil Yang
@ 2020-07-07  2:43     ` Ruifeng Wang
  2020-07-07 14:00     ` David Marchand
  2 siblings, 0 replies; 40+ messages in thread
From: Ruifeng Wang @ 2020-07-07  2:43 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	Honnappa Nagarahalli, Phil Yang
  Cc: dev, nd, nd


> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Monday, July 6, 2020 3:49 PM
> To: maxime.coquelin@redhat.com; jerinj@marvell.com;
> zhihong.wang@intel.com; xiaolong.ye@intel.com; beilei.xing@intel.com;
> jia.guo@intel.com; john.mcnamara@intel.com; matan@mellanox.com;
> shahafs@mellanox.com; viacheslavo@mellanox.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: [PATCH v2 4/6] net/i40e: replace restrict with rte restrict
> 
> '__rte_restrict' is a common wrapper for restricted pointers which can be
> supported by all compilers. Use '__rte_restrict' instead of '__restrict' for
> code consistency.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> ---
>  drivers/net/i40e/i40e_rxtx_vec_neon.c | 17 +++++++++--------
>  1 file changed, 9 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> index 1dfd0478b..4574139d5 100644
> --- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
> +++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
> @@ -172,8 +172,8 @@ desc_to_olflags_v(struct i40e_rx_queue *rxq,
> uint64x2_t descs[4],
>  #define I40E_UINT16_BIT (CHAR_BIT * sizeof(uint16_t))
> 
>  static inline void
> -desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__restrict rx_pkts,
> -		uint32_t *__restrict ptype_tbl)
> +desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__rte_restrict
> rx_pkts,
> +		uint32_t *__rte_restrict ptype_tbl)
>  {
>  	int i;
>  	uint8_t ptype;
> @@ -194,8 +194,9 @@ desc_to_ptype_v(uint64x2_t descs[4], struct
> rte_mbuf **__restrict rx_pkts,
>   *   numbers of DD bits
>   */
>  static inline uint16_t
> -_recv_raw_pkts_vec(struct i40e_rx_queue *__restrict rxq, struct rte_mbuf
> -	**__restrict rx_pkts, uint16_t nb_pkts, uint8_t *split_packet)
> +_recv_raw_pkts_vec(struct i40e_rx_queue *__rte_restrict rxq,
> +		   struct rte_mbuf **__rte_restrict rx_pkts,
> +		   uint16_t nb_pkts, uint8_t *split_packet)
>  {
>  	volatile union i40e_rx_desc *rxdp;
>  	struct i40e_rx_entry *sw_ring;
> @@ -432,8 +433,8 @@ _recv_raw_pkts_vec(struct i40e_rx_queue
> *__restrict rxq, struct rte_mbuf
>   *   numbers of DD bits
>   */
>  uint16_t
> -i40e_recv_pkts_vec(void *__restrict rx_queue,
> -		struct rte_mbuf **__restrict rx_pkts, uint16_t nb_pkts)
> +i40e_recv_pkts_vec(void *__rte_restrict rx_queue,
> +		struct rte_mbuf **__rte_restrict rx_pkts, uint16_t nb_pkts)
>  {
>  	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
>  }
> @@ -504,8 +505,8 @@ vtx(volatile struct i40e_tx_desc *txdp, struct
> rte_mbuf **pkt,
>  }
> 
>  uint16_t
> -i40e_xmit_fixed_burst_vec(void *__restrict tx_queue,
> -	struct rte_mbuf **__restrict tx_pkts, uint16_t nb_pkts)
> +i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
> +	struct rte_mbuf **__rte_restrict tx_pkts, uint16_t nb_pkts)
>  {
>  	struct i40e_tx_queue *txq = (struct i40e_tx_queue *)tx_queue;
>  	volatile struct i40e_tx_desc *txdp;
> --
> 2.27.0
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 5/6] examples/performance-thread: replace restrict with wrapper
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 5/6] examples/performance-thread: replace restrict with wrapper Joyce Kong
  2020-07-07  2:27     ` Phil Yang
@ 2020-07-07  2:45     ` Ruifeng Wang
  1 sibling, 0 replies; 40+ messages in thread
From: Ruifeng Wang @ 2020-07-07  2:45 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	Honnappa Nagarahalli, Phil Yang
  Cc: dev, nd, nd


> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Monday, July 6, 2020 3:49 PM
> To: maxime.coquelin@redhat.com; jerinj@marvell.com;
> zhihong.wang@intel.com; xiaolong.ye@intel.com; beilei.xing@intel.com;
> jia.guo@intel.com; john.mcnamara@intel.com; matan@mellanox.com;
> shahafs@mellanox.com; viacheslavo@mellanox.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: [PATCH v2 5/6] examples/performance-thread: replace restrict with
> wrapper
> 
> '__rte_restrict' is a common wrapper for restricted pointers which can be
> supported by all compilers. Use '__rte_restrict' instead of '__restrict' for
> code consistency.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> ---
>  .../performance-thread/pthread_shim/pthread_shim.c   | 12 ++++++------
>  1 file changed, 6 insertions(+), 6 deletions(-)
> 
> diff --git a/examples/performance-thread/pthread_shim/pthread_shim.c b/examples/performance-thread/pthread_shim/pthread_shim.c
> index 93e8dca3f..bbc076584 100644
> --- a/examples/performance-thread/pthread_shim/pthread_shim.c
> +++ b/examples/performance-thread/pthread_shim/pthread_shim.c
> @@ -341,9 +341,9 @@ int pthread_cond_signal(pthread_cond_t *cond)
>  }
> 
>  int
> -pthread_cond_timedwait(pthread_cond_t *__restrict cond,
> -		       pthread_mutex_t *__restrict mutex,
> -		       const struct timespec *__restrict time)
> +pthread_cond_timedwait(pthread_cond_t *__rte_restrict cond,
> +		       pthread_mutex_t *__rte_restrict mutex,
> +		       const struct timespec *__rte_restrict time)
>  {
>  	NOT_IMPLEMENTED;
>  	return _sys_pthread_funcs.f_pthread_cond_timedwait(cond, mutex, time);
> @@ -362,10 +362,10 @@ int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex)
>  }
> 
>  int
> -pthread_create(pthread_t *__restrict tid,
> -		const pthread_attr_t *__restrict attr,
> +pthread_create(pthread_t *__rte_restrict tid,
> +		const pthread_attr_t *__rte_restrict attr,
>  		lthread_func_t func,
> -	       void *__restrict arg)
> +	       void *__rte_restrict arg)
>  {
>  	if (override) {
>  		int lcore = -1;
> --
> 2.27.0
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict Joyce Kong
  2020-07-07  2:28     ` Phil Yang
@ 2020-07-07  2:47     ` Ruifeng Wang
  1 sibling, 0 replies; 40+ messages in thread
From: Ruifeng Wang @ 2020-07-07  2:47 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	Honnappa Nagarahalli, Phil Yang
  Cc: dev, nd, nd


> -----Original Message-----
> From: Joyce Kong <joyce.kong@arm.com>
> Sent: Monday, July 6, 2020 3:50 PM
> To: maxime.coquelin@redhat.com; jerinj@marvell.com;
> zhihong.wang@intel.com; xiaolong.ye@intel.com; beilei.xing@intel.com;
> jia.guo@intel.com; john.mcnamara@intel.com; matan@mellanox.com;
> shahafs@mellanox.com; viacheslavo@mellanox.com; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict
> 
> The 'restrict' keyword is recognized in C99, which might have some issues
> with old compilers. It is better to use the wrapper '__rte_restrict' which can
> be supported by all compilers for restricted pointers.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
> ---
>  drivers/net/mlx5/mlx5_rxtx.c | 208 +++++++++++++++++------------------
>  1 file changed, 104 insertions(+), 104 deletions(-)
> 
<snip>

^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers Joyce Kong
                       ` (2 preceding siblings ...)
  2020-07-07  2:40     ` Ruifeng Wang
@ 2020-07-07 13:57     ` David Marchand
  2020-07-08  2:46       ` Joyce Kong
  3 siblings, 1 reply; 40+ messages in thread
From: David Marchand @ 2020-07-07 13:57 UTC (permalink / raw)
  To: Joyce Kong
  Cc: Maxime Coquelin, Jerin Jacob Kollanukkaran, Zhihong Wang,
	Xiaolong Ye, Beilei Xing, Jeff Guo, Mcnamara, John, Matan Azrad,
	Shahaf Shuler, Viacheslav Ovsiienko, Honnappa Nagarahalli,
	Phil Yang, Ruifeng Wang (Arm Technology China),
	dev, nd

On Mon, Jul 6, 2020 at 9:50 AM Joyce Kong <joyce.kong@arm.com> wrote:
>
> The 'restrict' keyword is recognized in C99, while type qualifier
> '__restrict' compiles ok in C with all language levels. This patch
> is to add a wrapper defining '__rte_restrict' with 'restrict' and
> '__restrict' to be supported by all compilers.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> ---
>  lib/librte_eal/include/rte_common.h | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/lib/librte_eal/include/rte_common.h b/lib/librte_eal/include/rte_common.h
> index 0843ce69e..cda32c056 100644
> --- a/lib/librte_eal/include/rte_common.h
> +++ b/lib/librte_eal/include/rte_common.h
> @@ -103,6 +103,16 @@ typedef uint16_t unaligned_uint16_t;
>   */
>  #define __rte_unused __attribute__((__unused__))
>
> +/**
> + * Define a wrapper for restricted pointers which can be supported
> + * by all compilers.
> + */
> +#if __STDC_VERSION__ >= 199901
> +#define __rte_restrict restrict
> +#else
> +#define __rte_restrict __restrict
> +#endif
> +
>  /**
>   * definition to mark a variable or function parameter as used so
>   * as to avoid a compiler warning
> --
> 2.27.0
>

This triggers a build error on CentOS 7 as reported by the CI.
I suppose the following would do the trick, though it is untested:

/**
 * Define a wrapper for restricted pointers which can be supported
 * by all compilers.
 */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
#define __rte_restrict __restrict
#else
#define __rte_restrict restrict
#endif
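
A minimal sketch of how the guarded wrapper could be exercised on its own.
It tests defined() first, so the version comparison is only relied upon
when the compiler actually provides __STDC_VERSION__. copy_u32() is a
hypothetical helper for illustration, not part of DPDK:

```c
#include <stddef.h>
#include <stdint.h>

/* Same guard as the suggestion above. */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
#define __rte_restrict __restrict
#else
#define __rte_restrict restrict
#endif

/*
 * Hypothetical helper (not a DPDK function): the qualifiers promise that
 * dst and src do not overlap for the duration of the call.
 */
void
copy_u32(uint32_t *__rte_restrict dst, const uint32_t *__rte_restrict src,
	 size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i];
}
```

Building this once with the compiler default and once with an explicit
-std=c99 should take the two different branches of the guard.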


-- 
David Marchand


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 3/6] lib/vhost: restrict pointer aliasing for packed vpmd
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 3/6] lib/vhost: restrict pointer aliasing for packed vpmd Joyce Kong
@ 2020-07-07 13:58     ` David Marchand
  0 siblings, 0 replies; 40+ messages in thread
From: David Marchand @ 2020-07-07 13:58 UTC (permalink / raw)
  To: Joyce Kong, Zhihong Wang, Adrian Moreno Zapata
  Cc: Maxime Coquelin, Jerin Jacob Kollanukkaran, Xiaolong Ye,
	Beilei Xing, Jeff Guo, Mcnamara, John, Matan Azrad,
	Shahaf Shuler, Viacheslav Ovsiienko, Honnappa Nagarahalli,
	Phil Yang, Ruifeng Wang (Arm Technology China),
	dev, nd

On Mon, Jul 6, 2020 at 9:50 AM Joyce Kong <joyce.kong@arm.com> wrote:
>
> Restrict pointer aliasing to allow the compiler to vectorize loops
> more aggressively.
>
> With this patch, a 9.6% improvement is observed in throughput for
> the packed virtio-net PVP case, and a 2.8% improvement in throughput
> for the packed virtio-user PVP case. All performance data are measured
> under 0.001% acceptable packet loss with 1 core on both vhost and
> virtio side.
>
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
> ---
>  drivers/net/virtio/virtio_rxtx_simple_neon.c |  5 +++--
>  lib/librte_vhost/virtio_net.c                | 14 +++++++-------
>  2 files changed, 10 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> index 5febfb0f5..31824a931 100644
> --- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
> +++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> @@ -36,8 +36,9 @@
>   * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
>   */
>  uint16_t
> -virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
> -               **__rte_restrict rx_pkts, uint16_t nb_pkts)
> +virtio_recv_pkts_vec(void *rx_queue,
> +               struct rte_mbuf **__rte_restrict rx_pkts,
> +               uint16_t nb_pkts)
>  {
>         struct virtnet_rx *rxvq = rx_queue;
>         struct virtqueue *vq = rxvq->vq;

For the neon bits, I trust you.


> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 751c1f373..e60358251 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -1133,8 +1133,8 @@ virtio_dev_rx_single_packed(struct virtio_net *dev,
>
>  static __rte_noinline uint32_t
>  virtio_dev_rx_packed(struct virtio_net *dev,
> -                    struct vhost_virtqueue *vq,
> -                    struct rte_mbuf **pkts,
> +                    struct vhost_virtqueue *__rte_restrict vq,
> +                    struct rte_mbuf **__rte_restrict pkts,
>                      uint32_t count)
>  {
>         uint32_t pkt_idx = 0;

But for the generic part, I'd like to get others' opinions.
Added Zhihong and Adrian.


> @@ -1219,7 +1219,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>
>  uint16_t
>  rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
> -       struct rte_mbuf **pkts, uint16_t count)
> +       struct rte_mbuf **__rte_restrict pkts, uint16_t count)
>  {
>         struct virtio_net *dev = get_device(vid);
>
> @@ -2124,9 +2124,9 @@ free_zmbuf(struct vhost_virtqueue *vq)
>
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
> -                          struct vhost_virtqueue *vq,
> +                          struct vhost_virtqueue *__rte_restrict vq,
>                            struct rte_mempool *mbuf_pool,
> -                          struct rte_mbuf **pkts,
> +                          struct rte_mbuf **__rte_restrict pkts,
>                            uint32_t count)
>  {
>         uint32_t pkt_idx = 0;
> @@ -2160,9 +2160,9 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
>
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed(struct virtio_net *dev,
> -                    struct vhost_virtqueue *vq,
> +                    struct vhost_virtqueue *__rte_restrict vq,
>                      struct rte_mempool *mbuf_pool,
> -                    struct rte_mbuf **pkts,
> +                    struct rte_mbuf **__rte_restrict pkts,
>                      uint32_t count)
>  {
>         uint32_t pkt_idx = 0;
> --
> 2.27.0
>


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict Joyce Kong
  2020-07-07  2:25     ` Phil Yang
  2020-07-07  2:43     ` Ruifeng Wang
@ 2020-07-07 14:00     ` David Marchand
  2020-07-08  3:21       ` Joyce Kong
  2 siblings, 1 reply; 40+ messages in thread
From: David Marchand @ 2020-07-07 14:00 UTC (permalink / raw)
  To: Joyce Kong
  Cc: Maxime Coquelin, Jerin Jacob Kollanukkaran, Zhihong Wang,
	Xiaolong Ye, Beilei Xing, Jeff Guo, Mcnamara, John, Matan Azrad,
	Shahaf Shuler, Viacheslav Ovsiienko, Honnappa Nagarahalli,
	Phil Yang, Ruifeng Wang (Arm Technology China),
	dev, nd

On Mon, Jul 6, 2020 at 9:50 AM Joyce Kong <joyce.kong@arm.com> wrote:
>
> '__rte_restrict' is a common wrapper for restricted pointers which
> can be supported by all compilers. Use '__rte_restrict' instead of
> '__restrict' for code consistency.

Patches 4, 5 and 6 are simple replacements and can be squashed into
the first patch.

Thanks.

-- 
David Marchand


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/2] lib/vhost: restrict pointer aliasing for packed path
  2020-06-11  3:32 ` [dpdk-dev] [PATCH v1 2/2] lib/vhost: restrict pointer aliasing for packed path Joyce Kong
@ 2020-07-07 16:25   ` Adrian Moreno
  2020-07-10  3:15     ` Joyce Kong
  0 siblings, 1 reply; 40+ messages in thread
From: Adrian Moreno @ 2020-07-07 16:25 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev


On 6/11/20 5:32 AM, Joyce Kong wrote:
> Restrict pointer aliasing to allow the compiler to vectorize
> loops more aggressively.
> 
> With this patch, a 9.6% improvement is observed in throughput for the
> packed virtio-net PVP case, and a 2.8% improvement in throughput for
> the packed virtio-user PVP case. All performance data are measured
> under 0.001% acceptable packet loss with 1 core on both vhost and
> virtio side.

Is the performance gain related solely to this patch or is it the result of the
combined series?

> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
> ---
>  lib/librte_vhost/virtio_net.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
> 
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 751c1f373..39c92e7e1 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -1133,8 +1133,8 @@ virtio_dev_rx_single_packed(struct virtio_net *dev,
>  
>  static __rte_noinline uint32_t
>  virtio_dev_rx_packed(struct virtio_net *dev,
> -		     struct vhost_virtqueue *vq,
> -		     struct rte_mbuf **pkts,
> +		     struct vhost_virtqueue *__restrict vq,
> +		     struct rte_mbuf **__restrict pkts,
>  		     uint32_t count)
>  {
>  	uint32_t pkt_idx = 0;

I wonder if we're extracting the full potential of "restrict" considering that
the heavy lifting is done by the inner functions:
(virtio_dev_rx_batch_packed and virtio_dev_rx_single_packed)

> @@ -1219,7 +1219,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  
>  uint16_t
>  rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
> -	struct rte_mbuf **pkts, uint16_t count)
> +	struct rte_mbuf **__restrict pkts, uint16_t count)
>  {
>  	struct virtio_net *dev = get_device(vid);
>  
Is this considered an API change?
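
For reference, a minimal sketch of the language-level side of this
question. In C, a qualifier applied directly to a parameter is ignored
when checking whether two function declarations are compatible, so a
prototype without the qualifier and a definition with it can coexist.
What does change is the contract with the caller, who must not pass
aliasing pointers. Whether the project counts that as an API change is a
policy matter this sketch does not settle. struct pkt and count_valid()
are hypothetical stand-ins, not DPDK names:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf. */
struct pkt {
	uint32_t len;
};

/* Prototype as a public header might declare it: no qualifier on pkts. */
uint16_t count_valid(struct pkt **pkts, uint16_t count);

/*
 * Definition with the qualifier added. The types remain compatible
 * because the qualifier applies to the parameter itself.
 */
uint16_t
count_valid(struct pkt **__restrict pkts, uint16_t count)
{
	uint16_t i, n = 0;

	for (i = 0; i < count; i++)
		if (pkts[i] != NULL)
			n++;
	return n;
}
```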

> @@ -2124,9 +2124,9 @@ free_zmbuf(struct vhost_virtqueue *vq)
>  
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
> -			   struct vhost_virtqueue *vq,
> +			   struct vhost_virtqueue *__restrict vq,
>  			   struct rte_mempool *mbuf_pool,
> -			   struct rte_mbuf **pkts,
> +			   struct rte_mbuf **__restrict pkts,
>  			   uint32_t count)
>  {
>  	uint32_t pkt_idx = 0;
> @@ -2160,9 +2160,9 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
>  
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed(struct virtio_net *dev,
> -		     struct vhost_virtqueue *vq,
> +		     struct vhost_virtqueue *__restrict vq,
>  		     struct rte_mempool *mbuf_pool,
> -		     struct rte_mbuf **pkts,
> +		     struct rte_mbuf **__restrict pkts,
>  		     uint32_t count)
>  {
>  	uint32_t pkt_idx = 0;
> 

-- 
Adrián Moreno


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers
  2020-07-07 13:57     ` David Marchand
@ 2020-07-08  2:46       ` Joyce Kong
  0 siblings, 0 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-08  2:46 UTC (permalink / raw)
  To: David Marchand
  Cc: Maxime Coquelin, jerinj, Zhihong Wang, Xiaolong Ye, Beilei Xing,
	Jeff Guo, Mcnamara, John, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Honnappa Nagarahalli, Phil Yang,
	Ruifeng Wang, dev, nd

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, July 7, 2020 9:57 PM
> To: Joyce Kong <Joyce.Kong@arm.com>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; jerinj@marvell.com;
> Zhihong Wang <zhihong.wang@intel.com>; Xiaolong Ye
> <xiaolong.ye@intel.com>; Beilei Xing <beilei.xing@intel.com>; Jeff Guo
> <jia.guo@intel.com>; Mcnamara, John <john.mcnamara@intel.com>; Matan
> Azrad <matan@mellanox.com>; Shahaf Shuler <shahafs@mellanox.com>;
> Viacheslav Ovsiienko <viacheslavo@mellanox.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; dev <dev@dpdk.org>; nd
> <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for
> restricted pointers
> 
> On Mon, Jul 6, 2020 at 9:50 AM Joyce Kong <joyce.kong@arm.com> wrote:
> >
> > The 'restrict' keyword is recognized in C99, while type qualifier
> > '__restrict' compiles ok in C with all language levels. This patch is
> > to add a wrapper defining '__rte_restrict' with 'restrict' and
> > '__restrict' to be supported by all compilers.
> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > ---
> >  lib/librte_eal/include/rte_common.h | 10 ++++++++++
> >  1 file changed, 10 insertions(+)
> >
> > diff --git a/lib/librte_eal/include/rte_common.h b/lib/librte_eal/include/rte_common.h
> > index 0843ce69e..cda32c056 100644
> > --- a/lib/librte_eal/include/rte_common.h
> > +++ b/lib/librte_eal/include/rte_common.h
> > @@ -103,6 +103,16 @@ typedef uint16_t unaligned_uint16_t;
> >   */
> >  #define __rte_unused __attribute__((__unused__))
> >
> > +/**
> > + * Define a wrapper for restricted pointers which can be supported
> > + * by all compilers.
> > + */
> > +#if __STDC_VERSION__ >= 199901
> > +#define __rte_restrict restrict
> > +#else
> > +#define __rte_restrict __restrict
> > +#endif
> > +
> >  /**
> >   * definition to mark a variable or function parameter as used so
> >   * as to avoid a compiler warning
> > --
> > 2.27.0
> >
> 
> This triggers a build error on Centos 7 as reported by the CI.
> I suppose the following would do the trick, though it is untested:
> 
> /**
>  * Define a wrapper for restricted pointers which can be supported
>  * by all compilers.
>  */
> #if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
> #define __rte_restrict __restrict
> #else
> #define __rte_restrict restrict
> #endif
> 

Will add this check in the next version.

> 
> --
> David Marchand


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict
  2020-07-07 14:00     ` David Marchand
@ 2020-07-08  3:21       ` Joyce Kong
  2020-07-09  9:57         ` David Marchand
  0 siblings, 1 reply; 40+ messages in thread
From: Joyce Kong @ 2020-07-08  3:21 UTC (permalink / raw)
  To: David Marchand
  Cc: Maxime Coquelin, jerinj, Zhihong Wang, Xiaolong Ye, Beilei Xing,
	Jeff Guo, Mcnamara, John, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Honnappa Nagarahalli, Phil Yang,
	Ruifeng Wang, dev, nd

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Tuesday, July 7, 2020 10:00 PM
> To: Joyce Kong <Joyce.Kong@arm.com>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; jerinj@marvell.com;
> Zhihong Wang <zhihong.wang@intel.com>; Xiaolong Ye
> <xiaolong.ye@intel.com>; Beilei Xing <beilei.xing@intel.com>; Jeff Guo
> <jia.guo@intel.com>; Mcnamara, John <john.mcnamara@intel.com>; Matan
> Azrad <matan@mellanox.com>; Shahaf Shuler <shahafs@mellanox.com>;
> Viacheslav Ovsiienko <viacheslavo@mellanox.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; dev <dev@dpdk.org>; nd
> <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte
> restrict
> 
> On Mon, Jul 6, 2020 at 9:50 AM Joyce Kong <joyce.kong@arm.com> wrote:
> >
> > '__rte_restrict' is a common wrapper for restricted pointers which can
> > be supported by all compilers. Use '__rte_restrict' instead of
> > '__restrict' for code consistency.
> 
> This patch 4, 5 and 6 are simple replacements and can be squashed into the
> first patch.
> 
> Thanks.
> 
The first patch adds a common definition for lib_eal. Could we squash patches 4, 5 and 6
into one replacement patch, while keeping them separate from the first one?
This might be more convenient for review.

> --
> David Marchand


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict
  2020-07-08  3:21       ` Joyce Kong
@ 2020-07-09  9:57         ` David Marchand
  2020-07-10  2:45           ` Joyce Kong
  0 siblings, 1 reply; 40+ messages in thread
From: David Marchand @ 2020-07-09  9:57 UTC (permalink / raw)
  To: Joyce Kong
  Cc: Maxime Coquelin, jerinj, Zhihong Wang, Xiaolong Ye, Beilei Xing,
	Jeff Guo, Mcnamara, John, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Honnappa Nagarahalli, Phil Yang,
	Ruifeng Wang, dev, nd

On Wed, Jul 8, 2020 at 5:21 AM Joyce Kong <Joyce.Kong@arm.com> wrote:
>
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Tuesday, July 7, 2020 10:00 PM
> > To: Joyce Kong <Joyce.Kong@arm.com>
> > Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; jerinj@marvell.com;
> > Zhihong Wang <zhihong.wang@intel.com>; Xiaolong Ye
> > <xiaolong.ye@intel.com>; Beilei Xing <beilei.xing@intel.com>; Jeff Guo
> > <jia.guo@intel.com>; Mcnamara, John <john.mcnamara@intel.com>; Matan
> > Azrad <matan@mellanox.com>; Shahaf Shuler <shahafs@mellanox.com>;
> > Viacheslav Ovsiienko <viacheslavo@mellanox.com>; Honnappa Nagarahalli
> > <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> > Ruifeng Wang <Ruifeng.Wang@arm.com>; dev <dev@dpdk.org>; nd
> > <nd@arm.com>
> > Subject: Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte
> > restrict
> >
> > On Mon, Jul 6, 2020 at 9:50 AM Joyce Kong <joyce.kong@arm.com> wrote:
> > >
> > > '__rte_restrict' is a common wrapper for restricted pointers which can
> > > be supported by all compilers. Use '__rte_restrict' instead of
> > > '__restrict' for code consistency.
> >
> > This patch 4, 5 and 6 are simple replacements and can be squashed into the
> > first patch.
> >
> > Thanks.
> >
> The first patch is to add a common definition for lib_eal, could we squash the 4,5 and 6
> into one replacement patch, while separate them from the first one?
> This might be more convenient for review?

A copy/pasted commit log hints that these did not deserve separate patches.
As for easing reviews, the changes are mechanical and there is no special case.
I am not convinced, but go as you like.

Can you provide a new revision addressing the patch 1 problem?
Thanks.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 40+ messages in thread

* Re: [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper
  2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
                     ` (5 preceding siblings ...)
  2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict Joyce Kong
@ 2020-07-09 13:52   ` Morten Brørup
  2020-07-10  3:17     ` Joyce Kong
  6 siblings, 1 reply; 40+ messages in thread
From: Morten Brørup @ 2020-07-09 13:52 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> Sent: Monday, July 6, 2020 9:49 AM
> 
> As the 'restrict' keyword is recognized in C99, this patchset is to
> add a wrapper defining '__rte_restrict' which can be supported by
> all compilers. Then replace the existing 'restrict' and '__restrict'
> in different vpmds, and optimize vhost/virtio with restricted pointer
> aliasing for more aggressive loops vectorization.
> 
> The vhost/virtio optimization patches were benchmarked by running PVP
> case on ThunderX2 platform and showed positive performance results.
> 
> Joyce Kong (6):
>   lib/eal: add a wrapper to define restricted pointers
>   net/virtio: restrict pointer aliasing for NEON vpmd
>   lib/vhost: restrict pointer aliasing for packed vpmd
>   net/i40e: replace restrict with rte restrict
>   examples/performance-thread: replace restrict with wrapper
>   net/mlx5: replace restrict keyword with rte restrict
> 
>  drivers/net/i40e/i40e_rxtx_vec_neon.c         |  17 +-
>  drivers/net/mlx5/mlx5_rxtx.c                  | 208 +++++++++---------
>  drivers/net/virtio/virtio_rxtx_simple_neon.c  |   5 +-
>  .../pthread_shim/pthread_shim.c               |  12 +-
>  lib/librte_eal/include/rte_common.h           |  10 +
>  lib/librte_vhost/virtio_net.c                 |  14 +-
>  6 files changed, 139 insertions(+), 127 deletions(-)
> 
> --
> 2.27.0
> 

If you are hunting for more places to add __rte_restrict, I would suggest rte_memcpy.h.



^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper
  2020-06-11  3:32 [dpdk-dev] [PATCH v1 0/2] virtio: restrict pointer aliasing for loops vectorization Joyce Kong
                   ` (2 preceding siblings ...)
  2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
@ 2020-07-10  2:38 ` Joyce Kong
  2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 1/3] lib/eal: add a common wrapper for restricted pointers Joyce Kong
                     ` (3 more replies)
  3 siblings, 4 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-10  2:38 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, amorenoz, mb, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

The 'restrict' keyword is only recognized from C99 onwards, so this
patchset adds a wrapper, '__rte_restrict', which can be supported by
all compilers. It then replaces the existing 'restrict' and '__restrict'
in different vpmds, and optimizes vhost/virtio with restricted pointer
aliasing for more aggressive loop vectorization.

The vhost/virtio optimization patches were benchmarked by running PVP
case on ThunderX2 platform and showed positive performance results.
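
As a rough illustration of why the qualifier helps (a sketch only, not
code from this series): in the loop below, without the qualifiers the
compiler has to assume that a store to dst[i] might modify *factor, and
so reload it on every iteration. With them it is free to keep the factor
in a register and vectorize the loop. scale_u32() is a hypothetical
function, and '__restrict' is spelled out directly so that the sketch
builds at any language level:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical example: dst and factor are promised not to alias, so
 * *factor can be treated as loop-invariant.
 */
void
scale_u32(uint32_t *__restrict dst, const uint32_t *__restrict factor,
	  size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] *= *factor;
}
```

Calling it with overlapping arguments, for example scale_u32(v, &v[0], n),
would break the promise and is undefined behaviour.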

v3:
  1. Correct the compilation issue on GCC 4.8.5.
  2. Squash the replacement patches and wrapper definition into one
     patch (suggested by David Marchand).

v2:
  Add a common wrapper for restricted pointer aliasing to be supported
  by all compilers (suggested by Maxime Coquelin).

Joyce Kong (3):
  lib/eal: add a common wrapper for restricted pointers
  net/virtio: restrict pointer aliasing for NEON vpmd
  lib/vhost: restrict pointer aliasing for packed vpmd

 drivers/net/i40e/i40e_rxtx_vec_neon.c         |  17 +-
 drivers/net/mlx5/mlx5_rxtx.c                  | 208 +++++++++---------
 drivers/net/virtio/virtio_rxtx_simple_neon.c  |   5 +-
 .../pthread_shim/pthread_shim.c               |  12 +-
 lib/librte_eal/include/rte_common.h           |  10 +
 lib/librte_vhost/virtio_net.c                 |  14 +-
 6 files changed, 139 insertions(+), 127 deletions(-)

-- 
2.27.0


^ permalink raw reply	[flat|nested] 40+ messages in thread

* [dpdk-dev] [PATCH v3 1/3] lib/eal: add a common wrapper for restricted pointers
  2020-07-10  2:38 ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper Joyce Kong
@ 2020-07-10  2:38   ` Joyce Kong
  2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 2/3] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
                     ` (2 subsequent siblings)
  3 siblings, 0 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-10  2:38 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, amorenoz, mb, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

The 'restrict' keyword is recognized in C99, while the type qualifier
'__restrict' compiles OK in C at all language levels. This patch
replaces the existing 'restrict' with '__rte_restrict', which
is a common wrapper supported by all compilers.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Ruifeng Wang <ruifeng.wang@arm.com>
Acked-by: Jerin Jacob <jerinj@marvell.com>
---
 drivers/net/i40e/i40e_rxtx_vec_neon.c         |  17 +-
 drivers/net/mlx5/mlx5_rxtx.c                  | 208 +++++++++---------
 .../pthread_shim/pthread_shim.c               |  12 +-
 lib/librte_eal/include/rte_common.h           |  10 +
 4 files changed, 129 insertions(+), 118 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx_vec_neon.c b/drivers/net/i40e/i40e_rxtx_vec_neon.c
index 67158f108..6f874e45b 100644
--- a/drivers/net/i40e/i40e_rxtx_vec_neon.c
+++ b/drivers/net/i40e/i40e_rxtx_vec_neon.c
@@ -172,8 +172,8 @@ desc_to_olflags_v(struct i40e_rx_queue *rxq, uint64x2_t descs[4],
 #define I40E_UINT16_BIT (CHAR_BIT * sizeof(uint16_t))
 
 static inline void
-desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__restrict rx_pkts,
-		uint32_t *__restrict ptype_tbl)
+desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__rte_restrict rx_pkts,
+		uint32_t *__rte_restrict ptype_tbl)
 {
 	int i;
 	uint8_t ptype;
@@ -194,8 +194,9 @@ desc_to_ptype_v(uint64x2_t descs[4], struct rte_mbuf **__restrict rx_pkts,
  *   numbers of DD bits
  */
 static inline uint16_t
-_recv_raw_pkts_vec(struct i40e_rx_queue *__restrict rxq, struct rte_mbuf
-	**__restrict rx_pkts, uint16_t nb_pkts, uint8_t *split_packet)
+_recv_raw_pkts_vec(struct i40e_rx_queue *__rte_restrict rxq,
+		   struct rte_mbuf **__rte_restrict rx_pkts,
+		   uint16_t nb_pkts, uint8_t *split_packet)
 {
 	volatile union i40e_rx_desc *rxdp;
 	struct i40e_rx_entry *sw_ring;
@@ -432,8 +433,8 @@ _recv_raw_pkts_vec(struct i40e_rx_queue *__restrict rxq, struct rte_mbuf
  *   numbers of DD bits
  */
 uint16_t
-i40e_recv_pkts_vec(void *__restrict rx_queue,
-		struct rte_mbuf **__restrict rx_pkts, uint16_t nb_pkts)
+i40e_recv_pkts_vec(void *__rte_restrict rx_queue,
+		struct rte_mbuf **__rte_restrict rx_pkts, uint16_t nb_pkts)
 {
 	return _recv_raw_pkts_vec(rx_queue, rx_pkts, nb_pkts, NULL);
 }
@@ -504,8 +505,8 @@ vtx(volatile struct i40e_tx_desc *txdp, struct rte_mbuf **pkt,
 }
 
 uint16_t
-i40e_xmit_fixed_burst_vec(void *__restrict tx_queue,
-	struct rte_mbuf **__restrict tx_pkts, uint16_t nb_pkts)
+i40e_xmit_fixed_burst_vec(void *__rte_restrict tx_queue,
+	struct rte_mbuf **__rte_restrict tx_pkts, uint16_t nb_pkts)
 {
 	struct i40e_tx_queue *txq = (struct i40e_tx_queue *)tx_queue;
 	volatile struct i40e_tx_desc *txdp;
diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
index e4106bf0a..894f441f3 100644
--- a/drivers/net/mlx5/mlx5_rxtx.c
+++ b/drivers/net/mlx5/mlx5_rxtx.c
@@ -113,13 +113,13 @@ mlx5_queue_state_modify(struct rte_eth_dev *dev,
 			struct mlx5_mp_arg_queue_state_modify *sm);
 
 static inline void
-mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *restrict tcp,
-			volatile struct mlx5_cqe *restrict cqe,
+mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *__rte_restrict tcp,
+			volatile struct mlx5_cqe *__rte_restrict cqe,
 			uint32_t phcsum);
 
 static inline void
-mlx5_lro_update_hdr(uint8_t *restrict padd,
-		    volatile struct mlx5_cqe *restrict cqe,
+mlx5_lro_update_hdr(uint8_t *__rte_restrict padd,
+		    volatile struct mlx5_cqe *__rte_restrict cqe,
 		    uint32_t len);
 
 uint32_t mlx5_ptype_table[] __rte_cache_aligned = {
@@ -374,7 +374,7 @@ mlx5_set_swp_types_table(void)
  *   Software Parser flags are set by pointer.
  */
 static __rte_always_inline uint32_t
-txq_mbuf_to_swp(struct mlx5_txq_local *restrict loc,
+txq_mbuf_to_swp(struct mlx5_txq_local *__rte_restrict loc,
 		uint8_t *swp_flags,
 		unsigned int olx)
 {
@@ -747,7 +747,7 @@ check_err_cqe_seen(volatile struct mlx5_err_cqe *err_cqe)
  *   the error completion entry is handled successfully.
  */
 static int
-mlx5_tx_error_cqe_handle(struct mlx5_txq_data *restrict txq,
+mlx5_tx_error_cqe_handle(struct mlx5_txq_data *__rte_restrict txq,
 			 volatile struct mlx5_err_cqe *err_cqe)
 {
 	if (err_cqe->syndrome != MLX5_CQE_SYNDROME_WR_FLUSH_ERR) {
@@ -1508,8 +1508,8 @@ mlx5_rx_burst(void *dpdk_rxq, struct rte_mbuf **pkts, uint16_t pkts_n)
  *   The L3 pseudo-header checksum.
  */
 static inline void
-mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *restrict tcp,
-			volatile struct mlx5_cqe *restrict cqe,
+mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *__rte_restrict tcp,
+			volatile struct mlx5_cqe *__rte_restrict cqe,
 			uint32_t phcsum)
 {
 	uint8_t l4_type = (rte_be_to_cpu_16(cqe->hdr_type_etc) &
@@ -1550,8 +1550,8 @@ mlx5_lro_update_tcp_hdr(struct rte_tcp_hdr *restrict tcp,
  *   The packet length.
  */
 static inline void
-mlx5_lro_update_hdr(uint8_t *restrict padd,
-		    volatile struct mlx5_cqe *restrict cqe,
+mlx5_lro_update_hdr(uint8_t *__rte_restrict padd,
+		    volatile struct mlx5_cqe *__rte_restrict cqe,
 		    uint32_t len)
 {
 	union {
@@ -1965,7 +1965,7 @@ mlx5_check_vec_rx_support(struct rte_eth_dev *dev __rte_unused)
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_free_mbuf(struct rte_mbuf **restrict pkts,
+mlx5_tx_free_mbuf(struct rte_mbuf **__rte_restrict pkts,
 		  unsigned int pkts_n,
 		  unsigned int olx __rte_unused)
 {
@@ -2070,7 +2070,7 @@ mlx5_tx_free_mbuf(struct rte_mbuf **restrict pkts,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_free_elts(struct mlx5_txq_data *restrict txq,
+mlx5_tx_free_elts(struct mlx5_txq_data *__rte_restrict txq,
 		  uint16_t tail,
 		  unsigned int olx __rte_unused)
 {
@@ -2111,8 +2111,8 @@ mlx5_tx_free_elts(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_copy_elts(struct mlx5_txq_data *restrict txq,
-		  struct rte_mbuf **restrict pkts,
+mlx5_tx_copy_elts(struct mlx5_txq_data *__rte_restrict txq,
+		  struct rte_mbuf **__rte_restrict pkts,
 		  unsigned int pkts_n,
 		  unsigned int olx __rte_unused)
 {
@@ -2148,7 +2148,7 @@ mlx5_tx_copy_elts(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_comp_flush(struct mlx5_txq_data *restrict txq,
+mlx5_tx_comp_flush(struct mlx5_txq_data *__rte_restrict txq,
 		   volatile struct mlx5_cqe *last_cqe,
 		   unsigned int olx __rte_unused)
 {
@@ -2179,7 +2179,7 @@ mlx5_tx_comp_flush(struct mlx5_txq_data *restrict txq,
  * routine smaller, simple and faster - from experiments.
  */
 static void
-mlx5_tx_handle_completion(struct mlx5_txq_data *restrict txq,
+mlx5_tx_handle_completion(struct mlx5_txq_data *__rte_restrict txq,
 			  unsigned int olx __rte_unused)
 {
 	unsigned int count = MLX5_TX_COMP_MAX_CQE;
@@ -2268,8 +2268,8 @@ mlx5_tx_handle_completion(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_request_completion(struct mlx5_txq_data *restrict txq,
-			   struct mlx5_txq_local *restrict loc,
+mlx5_tx_request_completion(struct mlx5_txq_data *__rte_restrict txq,
+			   struct mlx5_txq_local *__rte_restrict loc,
 			   unsigned int olx)
 {
 	uint16_t head = txq->elts_head;
@@ -2316,7 +2316,7 @@ mlx5_tx_request_completion(struct mlx5_txq_data *restrict txq,
 int
 mlx5_tx_descriptor_status(void *tx_queue, uint16_t offset)
 {
-	struct mlx5_txq_data *restrict txq = tx_queue;
+	struct mlx5_txq_data *__rte_restrict txq = tx_queue;
 	uint16_t used;
 
 	mlx5_tx_handle_completion(txq, 0);
@@ -2347,14 +2347,14 @@ mlx5_tx_descriptor_status(void *tx_queue, uint16_t offset)
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_cseg_init(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc __rte_unused,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_cseg_init(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc __rte_unused,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int ds,
 		  unsigned int opcode,
 		  unsigned int olx __rte_unused)
 {
-	struct mlx5_wqe_cseg *restrict cs = &wqe->cseg;
+	struct mlx5_wqe_cseg *__rte_restrict cs = &wqe->cseg;
 
 	/* For legacy MPW replace the EMPW by TSO with modifier. */
 	if (MLX5_TXOFF_CONFIG(MPW) && opcode == MLX5_OPCODE_ENHANCED_MPSW)
@@ -2382,12 +2382,12 @@ mlx5_tx_cseg_init(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_eseg_none(struct mlx5_txq_data *restrict txq __rte_unused,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_eseg_none(struct mlx5_txq_data *__rte_restrict txq __rte_unused,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int olx)
 {
-	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
+	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
 	uint32_t csum;
 
 	/*
@@ -2440,13 +2440,13 @@ mlx5_tx_eseg_none(struct mlx5_txq_data *restrict txq __rte_unused,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_eseg_dmin(struct mlx5_txq_data *restrict txq __rte_unused,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_eseg_dmin(struct mlx5_txq_data *__rte_restrict txq __rte_unused,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int vlan,
 		  unsigned int olx)
 {
-	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
+	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
 	uint32_t csum;
 	uint8_t *psrc, *pdst;
 
@@ -2524,15 +2524,15 @@ mlx5_tx_eseg_dmin(struct mlx5_txq_data *restrict txq __rte_unused,
  *   Pointer to the next Data Segment (aligned and wrapped around).
  */
 static __rte_always_inline struct mlx5_wqe_dseg *
-mlx5_tx_eseg_data(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_eseg_data(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int vlan,
 		  unsigned int inlen,
 		  unsigned int tso,
 		  unsigned int olx)
 {
-	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
+	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
 	uint32_t csum;
 	uint8_t *psrc, *pdst;
 	unsigned int part;
@@ -2650,7 +2650,7 @@ mlx5_tx_eseg_data(struct mlx5_txq_data *restrict txq,
  */
 static __rte_always_inline unsigned int
 mlx5_tx_mseg_memcpy(uint8_t *pdst,
-		    struct mlx5_txq_local *restrict loc,
+		    struct mlx5_txq_local *__rte_restrict loc,
 		    unsigned int len,
 		    unsigned int must,
 		    unsigned int olx __rte_unused)
@@ -2747,15 +2747,15 @@ mlx5_tx_mseg_memcpy(uint8_t *pdst,
  *   wrapping check on its own).
  */
 static __rte_always_inline struct mlx5_wqe_dseg *
-mlx5_tx_eseg_mdat(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe *restrict wqe,
+mlx5_tx_eseg_mdat(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe *__rte_restrict wqe,
 		  unsigned int vlan,
 		  unsigned int inlen,
 		  unsigned int tso,
 		  unsigned int olx)
 {
-	struct mlx5_wqe_eseg *restrict es = &wqe->eseg;
+	struct mlx5_wqe_eseg *__rte_restrict es = &wqe->eseg;
 	uint32_t csum;
 	uint8_t *pdst;
 	unsigned int part, tlen = 0;
@@ -2851,9 +2851,9 @@ mlx5_tx_eseg_mdat(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_dseg_ptr(struct mlx5_txq_data *restrict txq,
-		 struct mlx5_txq_local *restrict loc,
-		 struct mlx5_wqe_dseg *restrict dseg,
+mlx5_tx_dseg_ptr(struct mlx5_txq_data *__rte_restrict txq,
+		 struct mlx5_txq_local *__rte_restrict loc,
+		 struct mlx5_wqe_dseg *__rte_restrict dseg,
 		 uint8_t *buf,
 		 unsigned int len,
 		 unsigned int olx __rte_unused)
@@ -2885,9 +2885,9 @@ mlx5_tx_dseg_ptr(struct mlx5_txq_data *restrict txq,
  *   compile time and may be used for optimization.
  */
 static __rte_always_inline void
-mlx5_tx_dseg_iptr(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc,
-		  struct mlx5_wqe_dseg *restrict dseg,
+mlx5_tx_dseg_iptr(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc,
+		  struct mlx5_wqe_dseg *__rte_restrict dseg,
 		  uint8_t *buf,
 		  unsigned int len,
 		  unsigned int olx __rte_unused)
@@ -2961,9 +2961,9 @@ mlx5_tx_dseg_iptr(struct mlx5_txq_data *restrict txq,
  *   last packet in the eMPW session.
  */
 static __rte_always_inline struct mlx5_wqe_dseg *
-mlx5_tx_dseg_empw(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc __rte_unused,
-		  struct mlx5_wqe_dseg *restrict dseg,
+mlx5_tx_dseg_empw(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc __rte_unused,
+		  struct mlx5_wqe_dseg *__rte_restrict dseg,
 		  uint8_t *buf,
 		  unsigned int len,
 		  unsigned int olx __rte_unused)
@@ -3024,9 +3024,9 @@ mlx5_tx_dseg_empw(struct mlx5_txq_data *restrict txq,
  *   Ring buffer wraparound check is needed.
  */
 static __rte_always_inline struct mlx5_wqe_dseg *
-mlx5_tx_dseg_vlan(struct mlx5_txq_data *restrict txq,
-		  struct mlx5_txq_local *restrict loc __rte_unused,
-		  struct mlx5_wqe_dseg *restrict dseg,
+mlx5_tx_dseg_vlan(struct mlx5_txq_data *__rte_restrict txq,
+		  struct mlx5_txq_local *__rte_restrict loc __rte_unused,
+		  struct mlx5_wqe_dseg *__rte_restrict dseg,
 		  uint8_t *buf,
 		  unsigned int len,
 		  unsigned int olx __rte_unused)
@@ -3112,15 +3112,15 @@ mlx5_tx_dseg_vlan(struct mlx5_txq_data *restrict txq,
  *   Actual size of built WQE in segments.
  */
 static __rte_always_inline unsigned int
-mlx5_tx_mseg_build(struct mlx5_txq_data *restrict txq,
-		   struct mlx5_txq_local *restrict loc,
-		   struct mlx5_wqe *restrict wqe,
+mlx5_tx_mseg_build(struct mlx5_txq_data *__rte_restrict txq,
+		   struct mlx5_txq_local *__rte_restrict loc,
+		   struct mlx5_wqe *__rte_restrict wqe,
 		   unsigned int vlan,
 		   unsigned int inlen,
 		   unsigned int tso,
 		   unsigned int olx __rte_unused)
 {
-	struct mlx5_wqe_dseg *restrict dseg;
+	struct mlx5_wqe_dseg *__rte_restrict dseg;
 	unsigned int ds;
 
 	MLX5_ASSERT((rte_pktmbuf_pkt_len(loc->mbuf) + vlan) >= inlen);
@@ -3225,11 +3225,11 @@ mlx5_tx_mseg_build(struct mlx5_txq_data *restrict txq,
  * Local context variables partially updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_packet_multi_tso(struct mlx5_txq_data *restrict txq,
-			struct mlx5_txq_local *restrict loc,
+mlx5_tx_packet_multi_tso(struct mlx5_txq_data *__rte_restrict txq,
+			struct mlx5_txq_local *__rte_restrict loc,
 			unsigned int olx)
 {
-	struct mlx5_wqe *restrict wqe;
+	struct mlx5_wqe *__rte_restrict wqe;
 	unsigned int ds, dlen, inlen, ntcp, vlan = 0;
 
 	/*
@@ -3314,12 +3314,12 @@ mlx5_tx_packet_multi_tso(struct mlx5_txq_data *restrict txq,
  * Local context variables partially updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_packet_multi_send(struct mlx5_txq_data *restrict txq,
-			  struct mlx5_txq_local *restrict loc,
+mlx5_tx_packet_multi_send(struct mlx5_txq_data *__rte_restrict txq,
+			  struct mlx5_txq_local *__rte_restrict loc,
 			  unsigned int olx)
 {
-	struct mlx5_wqe_dseg *restrict dseg;
-	struct mlx5_wqe *restrict wqe;
+	struct mlx5_wqe_dseg *__rte_restrict dseg;
+	struct mlx5_wqe *__rte_restrict wqe;
 	unsigned int ds, nseg;
 
 	MLX5_ASSERT(NB_SEGS(loc->mbuf) > 1);
@@ -3422,11 +3422,11 @@ mlx5_tx_packet_multi_send(struct mlx5_txq_data *restrict txq,
  * Local context variables partially updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_packet_multi_inline(struct mlx5_txq_data *restrict txq,
-			    struct mlx5_txq_local *restrict loc,
+mlx5_tx_packet_multi_inline(struct mlx5_txq_data *__rte_restrict txq,
+			    struct mlx5_txq_local *__rte_restrict loc,
 			    unsigned int olx)
 {
-	struct mlx5_wqe *restrict wqe;
+	struct mlx5_wqe *__rte_restrict wqe;
 	unsigned int ds, inlen, dlen, vlan = 0;
 
 	MLX5_ASSERT(MLX5_TXOFF_CONFIG(INLINE));
@@ -3587,10 +3587,10 @@ mlx5_tx_packet_multi_inline(struct mlx5_txq_data *restrict txq,
  * Local context variables updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_mseg(struct mlx5_txq_data *restrict txq,
-		   struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_mseg(struct mlx5_txq_data *__rte_restrict txq,
+		   struct rte_mbuf **__rte_restrict pkts,
 		   unsigned int pkts_n,
-		   struct mlx5_txq_local *restrict loc,
+		   struct mlx5_txq_local *__rte_restrict loc,
 		   unsigned int olx)
 {
 	MLX5_ASSERT(loc->elts_free && loc->wqe_free);
@@ -3676,10 +3676,10 @@ mlx5_tx_burst_mseg(struct mlx5_txq_data *restrict txq,
  * Local context variables updated.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_tso(struct mlx5_txq_data *restrict txq,
-		  struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_tso(struct mlx5_txq_data *__rte_restrict txq,
+		  struct rte_mbuf **__rte_restrict pkts,
 		  unsigned int pkts_n,
-		  struct mlx5_txq_local *restrict loc,
+		  struct mlx5_txq_local *__rte_restrict loc,
 		  unsigned int olx)
 {
 	MLX5_ASSERT(loc->elts_free && loc->wqe_free);
@@ -3687,8 +3687,8 @@ mlx5_tx_burst_tso(struct mlx5_txq_data *restrict txq,
 	pkts += loc->pkts_sent + 1;
 	pkts_n -= loc->pkts_sent;
 	for (;;) {
-		struct mlx5_wqe_dseg *restrict dseg;
-		struct mlx5_wqe *restrict wqe;
+		struct mlx5_wqe_dseg *__rte_restrict dseg;
+		struct mlx5_wqe *__rte_restrict wqe;
 		unsigned int ds, dlen, hlen, ntcp, vlan = 0;
 		uint8_t *dptr;
 
@@ -3800,8 +3800,8 @@ mlx5_tx_burst_tso(struct mlx5_txq_data *restrict txq,
  *  MLX5_TXCMP_CODE_EMPW - single-segment packet, use MPW.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_able_to_empw(struct mlx5_txq_data *restrict txq,
-		     struct mlx5_txq_local *restrict loc,
+mlx5_tx_able_to_empw(struct mlx5_txq_data *__rte_restrict txq,
+		     struct mlx5_txq_local *__rte_restrict loc,
 		     unsigned int olx,
 		     bool newp)
 {
@@ -3855,9 +3855,9 @@ mlx5_tx_able_to_empw(struct mlx5_txq_data *restrict txq,
  *  false - no match, eMPW should be restarted.
  */
 static __rte_always_inline bool
-mlx5_tx_match_empw(struct mlx5_txq_data *restrict txq __rte_unused,
-		   struct mlx5_wqe_eseg *restrict es,
-		   struct mlx5_txq_local *restrict loc,
+mlx5_tx_match_empw(struct mlx5_txq_data *__rte_restrict txq __rte_unused,
+		   struct mlx5_wqe_eseg *__rte_restrict es,
+		   struct mlx5_txq_local *__rte_restrict loc,
 		   uint32_t dlen,
 		   unsigned int olx)
 {
@@ -3909,8 +3909,8 @@ mlx5_tx_match_empw(struct mlx5_txq_data *restrict txq __rte_unused,
  *  false - no match, eMPW should be restarted.
  */
 static __rte_always_inline void
-mlx5_tx_sdone_empw(struct mlx5_txq_data *restrict txq,
-		   struct mlx5_txq_local *restrict loc,
+mlx5_tx_sdone_empw(struct mlx5_txq_data *__rte_restrict txq,
+		   struct mlx5_txq_local *__rte_restrict loc,
 		   unsigned int ds,
 		   unsigned int slen,
 		   unsigned int olx __rte_unused)
@@ -3954,11 +3954,11 @@ mlx5_tx_sdone_empw(struct mlx5_txq_data *restrict txq,
  *  false - no match, eMPW should be restarted.
  */
 static __rte_always_inline void
-mlx5_tx_idone_empw(struct mlx5_txq_data *restrict txq,
-		   struct mlx5_txq_local *restrict loc,
+mlx5_tx_idone_empw(struct mlx5_txq_data *__rte_restrict txq,
+		   struct mlx5_txq_local *__rte_restrict loc,
 		   unsigned int len,
 		   unsigned int slen,
-		   struct mlx5_wqe *restrict wqem,
+		   struct mlx5_wqe *__rte_restrict wqem,
 		   unsigned int olx __rte_unused)
 {
 	struct mlx5_wqe_dseg *dseg = &wqem->dseg[0];
@@ -4042,10 +4042,10 @@ mlx5_tx_idone_empw(struct mlx5_txq_data *restrict txq,
  * No VLAN insertion is supported.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_empw_simple(struct mlx5_txq_data *restrict txq,
-			  struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_empw_simple(struct mlx5_txq_data *__rte_restrict txq,
+			  struct rte_mbuf **__rte_restrict pkts,
 			  unsigned int pkts_n,
-			  struct mlx5_txq_local *restrict loc,
+			  struct mlx5_txq_local *__rte_restrict loc,
 			  unsigned int olx)
 {
 	/*
@@ -4061,8 +4061,8 @@ mlx5_tx_burst_empw_simple(struct mlx5_txq_data *restrict txq,
 	pkts += loc->pkts_sent + 1;
 	pkts_n -= loc->pkts_sent;
 	for (;;) {
-		struct mlx5_wqe_dseg *restrict dseg;
-		struct mlx5_wqe_eseg *restrict eseg;
+		struct mlx5_wqe_dseg *__rte_restrict dseg;
+		struct mlx5_wqe_eseg *__rte_restrict eseg;
 		enum mlx5_txcmp_code ret;
 		unsigned int part, loop;
 		unsigned int slen = 0;
@@ -4208,10 +4208,10 @@ mlx5_tx_burst_empw_simple(struct mlx5_txq_data *restrict txq,
  * with inlining, optionally supports VLAN insertion.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_empw_inline(struct mlx5_txq_data *restrict txq,
-			  struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_empw_inline(struct mlx5_txq_data *__rte_restrict txq,
+			  struct rte_mbuf **__rte_restrict pkts,
 			  unsigned int pkts_n,
-			  struct mlx5_txq_local *restrict loc,
+			  struct mlx5_txq_local *__rte_restrict loc,
 			  unsigned int olx)
 {
 	/*
@@ -4227,8 +4227,8 @@ mlx5_tx_burst_empw_inline(struct mlx5_txq_data *restrict txq,
 	pkts += loc->pkts_sent + 1;
 	pkts_n -= loc->pkts_sent;
 	for (;;) {
-		struct mlx5_wqe_dseg *restrict dseg;
-		struct mlx5_wqe *restrict wqem;
+		struct mlx5_wqe_dseg *__rte_restrict dseg;
+		struct mlx5_wqe *__rte_restrict wqem;
 		enum mlx5_txcmp_code ret;
 		unsigned int room, part, nlim;
 		unsigned int slen = 0;
@@ -4489,10 +4489,10 @@ mlx5_tx_burst_empw_inline(struct mlx5_txq_data *restrict txq,
  * Data inlining and VLAN insertion are supported.
  */
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_single_send(struct mlx5_txq_data *restrict txq,
-			  struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_single_send(struct mlx5_txq_data *__rte_restrict txq,
+			  struct rte_mbuf **__rte_restrict pkts,
 			  unsigned int pkts_n,
-			  struct mlx5_txq_local *restrict loc,
+			  struct mlx5_txq_local *__rte_restrict loc,
 			  unsigned int olx)
 {
 	/*
@@ -4504,7 +4504,7 @@ mlx5_tx_burst_single_send(struct mlx5_txq_data *restrict txq,
 	pkts += loc->pkts_sent + 1;
 	pkts_n -= loc->pkts_sent;
 	for (;;) {
-		struct mlx5_wqe *restrict wqe;
+		struct mlx5_wqe *__rte_restrict wqe;
 		enum mlx5_txcmp_code ret;
 
 		MLX5_ASSERT(NB_SEGS(loc->mbuf) == 1);
@@ -4602,7 +4602,7 @@ mlx5_tx_burst_single_send(struct mlx5_txq_data *restrict txq,
 				 * not contain inlined data for eMPW due to
 				 * segment shared for all packets.
 				 */
-				struct mlx5_wqe_dseg *restrict dseg;
+				struct mlx5_wqe_dseg *__rte_restrict dseg;
 				unsigned int ds;
 				uint8_t *dptr;
 
@@ -4765,10 +4765,10 @@ mlx5_tx_burst_single_send(struct mlx5_txq_data *restrict txq,
 }
 
 static __rte_always_inline enum mlx5_txcmp_code
-mlx5_tx_burst_single(struct mlx5_txq_data *restrict txq,
-		     struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_single(struct mlx5_txq_data *__rte_restrict txq,
+		     struct rte_mbuf **__rte_restrict pkts,
 		     unsigned int pkts_n,
-		     struct mlx5_txq_local *restrict loc,
+		     struct mlx5_txq_local *__rte_restrict loc,
 		     unsigned int olx)
 {
 	enum mlx5_txcmp_code ret;
@@ -4819,8 +4819,8 @@ mlx5_tx_burst_single(struct mlx5_txq_data *restrict txq,
  *   Number of packets successfully transmitted (<= pkts_n).
  */
 static __rte_always_inline uint16_t
-mlx5_tx_burst_tmpl(struct mlx5_txq_data *restrict txq,
-		   struct rte_mbuf **restrict pkts,
+mlx5_tx_burst_tmpl(struct mlx5_txq_data *__rte_restrict txq,
+		   struct rte_mbuf **__rte_restrict pkts,
 		   uint16_t pkts_n,
 		   unsigned int olx)
 {
diff --git a/examples/performance-thread/pthread_shim/pthread_shim.c b/examples/performance-thread/pthread_shim/pthread_shim.c
index 93e8dca3f..bbc076584 100644
--- a/examples/performance-thread/pthread_shim/pthread_shim.c
+++ b/examples/performance-thread/pthread_shim/pthread_shim.c
@@ -341,9 +341,9 @@ int pthread_cond_signal(pthread_cond_t *cond)
 }
 
 int
-pthread_cond_timedwait(pthread_cond_t *__restrict cond,
-		       pthread_mutex_t *__restrict mutex,
-		       const struct timespec *__restrict time)
+pthread_cond_timedwait(pthread_cond_t *__rte_restrict cond,
+		       pthread_mutex_t *__rte_restrict mutex,
+		       const struct timespec *__rte_restrict time)
 {
 	NOT_IMPLEMENTED;
 	return _sys_pthread_funcs.f_pthread_cond_timedwait(cond, mutex, time);
@@ -362,10 +362,10 @@ int pthread_cond_wait(pthread_cond_t *cond, pthread_mutex_t *mutex)
 }
 
 int
-pthread_create(pthread_t *__restrict tid,
-		const pthread_attr_t *__restrict attr,
+pthread_create(pthread_t *__rte_restrict tid,
+		const pthread_attr_t *__rte_restrict attr,
 		lthread_func_t func,
-	       void *__restrict arg)
+	       void *__rte_restrict arg)
 {
 	if (override) {
 		int lcore = -1;
diff --git a/lib/librte_eal/include/rte_common.h b/lib/librte_eal/include/rte_common.h
index 0843ce69e..ce93e6596 100644
--- a/lib/librte_eal/include/rte_common.h
+++ b/lib/librte_eal/include/rte_common.h
@@ -103,6 +103,16 @@ typedef uint16_t unaligned_uint16_t;
  */
 #define __rte_unused __attribute__((__unused__))
 
+/**
+ * Define a wrapper for restricted pointers which can be supported
+ * by all compilers.
+ */
+#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
+#define __rte_restrict __restrict
+#else
+#define __rte_restrict restrict
+#endif
+
 /**
  * definition to mark a variable or function parameter as used so
  * as to avoid a compiler warning
-- 
2.27.0
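
For illustration only (not part of the patch): a minimal standalone
sketch of how the wrapper added to rte_common.h above resolves. The
macro is copied from the hunk; add_arrays() and its arguments are
hypothetical names used only for this example.

```c
#include <stddef.h>

/* Same selection logic as the rte_common.h hunk: builds that do not
 * report C99 or later fall back to the __restrict compiler extension. */
#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
#define __rte_restrict __restrict
#else
#define __rte_restrict restrict
#endif

/* Hypothetical helper: the caller promises dst and src do not overlap. */
static void
add_arrays(int *__rte_restrict dst, const int *__rte_restrict src, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] += src[i];
}
```

The qualifier is a promise made by the caller, not something the
compiler checks, so passing overlapping buffers to add_arrays() would
be undefined behaviour.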



* [dpdk-dev] [PATCH v3 2/3] net/virtio: restrict pointer aliasing for NEON vpmd
  2020-07-10  2:38 ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper Joyce Kong
  2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 1/3] lib/eal: add a common wrapper for restricted pointers Joyce Kong
@ 2020-07-10  2:38   ` Joyce Kong
  2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 3/3] lib/vhost: restrict pointer aliasing for packed vpmd Joyce Kong
  2020-07-10 14:05   ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper David Marchand
  3 siblings, 0 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-10  2:38 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, amorenoz, mb, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

Restrict pointer aliasing to allow the compiler to vectorize loops
more aggressively.

With this patch, a 9.6% improvement is observed in throughput for
the virtio-net PVP case, and a 2.4% perf improvement in throughput
for the virtio-user PVP case. All performance data were measured
on the ThunderX-2 platform under 0.001% acceptable packet loss, with
2 cores on the vhost side and 1 core on the virtio side.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 drivers/net/virtio/virtio_rxtx_simple_neon.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c b/drivers/net/virtio/virtio_rxtx_simple_neon.c
index 8e6fa1fd7..a9b649814 100644
--- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
+++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
@@ -36,8 +36,8 @@
  * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
  */
 uint16_t
-virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf **rx_pkts,
-	uint16_t nb_pkts)
+virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
+		**__rte_restrict rx_pkts, uint16_t nb_pkts)
 {
 	struct virtnet_rx *rxvq = rx_queue;
 	struct virtqueue *vq = rxvq->vq;
-- 
2.27.0
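
For illustration only (not part of the patch): a rough sketch of the
kind of loop the commit message has in mind. struct demo_queue and
demo_recv() are hypothetical stand-ins, not the virtio data structures,
and whether a given compiler actually vectorizes the copy depends on
the compiler and its flags.

```c
#include <stddef.h>
#include <stdint.h>

#if !defined(__STDC_VERSION__) || __STDC_VERSION__ < 199901L
#define __rte_restrict __restrict
#else
#define __rte_restrict restrict
#endif

/* Hypothetical stand-in for a receive queue; not the virtio layout. */
struct demo_queue {
	void **ring;    /* buffers ready to be handed to the caller */
	uint16_t avail; /* number of valid entries in ring */
};

/*
 * Hand up to nb_pkts buffer pointers to the caller. Because out is
 * restrict-qualified, the compiler may assume that stores to out[i]
 * do not overlap the ring entries it is reading, which removes one
 * obstacle to vectorizing the copy loop.
 */
static uint16_t
demo_recv(struct demo_queue *q, void **__rte_restrict out, uint16_t nb_pkts)
{
	uint16_t n = nb_pkts < q->avail ? nb_pkts : q->avail;
	uint16_t i;

	for (i = 0; i < n; i++)
		out[i] = q->ring[i];

	/* Consume the entries that were handed out. */
	q->ring += n;
	q->avail = (uint16_t)(q->avail - n);
	return n;
}
```

Without the qualifier the compiler has to allow for out pointing into
the ring itself and either emit a runtime overlap check or keep the
loop scalar.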



* [dpdk-dev] [PATCH v3 3/3] lib/vhost: restrict pointer aliasing for packed vpmd
  2020-07-10  2:38 ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper Joyce Kong
  2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 1/3] lib/eal: add a common wrapper for restricted pointers Joyce Kong
  2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 2/3] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
@ 2020-07-10  2:38   ` Joyce Kong
  2020-07-10 13:41     ` Adrian Moreno
  2020-07-10 14:05   ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper David Marchand
  3 siblings, 1 reply; 40+ messages in thread
From: Joyce Kong @ 2020-07-10  2:38 UTC (permalink / raw)
  To: maxime.coquelin, jerinj, zhihong.wang, amorenoz, mb, xiaolong.ye,
	beilei.xing, jia.guo, john.mcnamara, matan, shahafs, viacheslavo,
	honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd

Restrict pointer aliasing to allow the compiler to vectorize loops
more aggressively.

With this patch, a 9.6% improvement is observed in throughput for
the packed virtio-net PVP case, and a 2.8% improvement in throughput
for the packed virtio-user PVP case. All performance data were measured
on the ThunderX-2 platform under 0.001% acceptable packet loss, with
1 core on each of the vhost and virtio sides.

Signed-off-by: Joyce Kong <joyce.kong@arm.com>
Reviewed-by: Phil Yang <phil.yang@arm.com>
---
 drivers/net/virtio/virtio_rxtx_simple_neon.c |  5 +++--
 lib/librte_vhost/virtio_net.c                | 14 +++++++-------
 2 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c b/drivers/net/virtio/virtio_rxtx_simple_neon.c
index a9b649814..02520fda8 100644
--- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
+++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
@@ -36,8 +36,9 @@
  * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
  */
 uint16_t
-virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
-		**__rte_restrict rx_pkts, uint16_t nb_pkts)
+virtio_recv_pkts_vec(void *rx_queue,
+		struct rte_mbuf **__rte_restrict rx_pkts,
+		uint16_t nb_pkts)
 {
 	struct virtnet_rx *rxvq = rx_queue;
 	struct virtqueue *vq = rxvq->vq;
diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 236498f71..1d0be3dd4 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -1353,8 +1353,8 @@ virtio_dev_rx_single_packed(struct virtio_net *dev,
 
 static __rte_noinline uint32_t
 virtio_dev_rx_packed(struct virtio_net *dev,
-		     struct vhost_virtqueue *vq,
-		     struct rte_mbuf **pkts,
+		     struct vhost_virtqueue *__rte_restrict vq,
+		     struct rte_mbuf **__rte_restrict pkts,
 		     uint32_t count)
 {
 	uint32_t pkt_idx = 0;
@@ -1439,7 +1439,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
 
 uint16_t
 rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
-	struct rte_mbuf **pkts, uint16_t count)
+	struct rte_mbuf **__rte_restrict pkts, uint16_t count)
 {
 	struct virtio_net *dev = get_device(vid);
 
@@ -2671,9 +2671,9 @@ free_zmbuf(struct vhost_virtqueue *vq)
 
 static __rte_noinline uint16_t
 virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
-			   struct vhost_virtqueue *vq,
+			   struct vhost_virtqueue *__rte_restrict vq,
 			   struct rte_mempool *mbuf_pool,
-			   struct rte_mbuf **pkts,
+			   struct rte_mbuf **__rte_restrict pkts,
 			   uint32_t count)
 {
 	uint32_t pkt_idx = 0;
@@ -2707,9 +2707,9 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
 
 static __rte_noinline uint16_t
 virtio_dev_tx_packed(struct virtio_net *dev,
-		     struct vhost_virtqueue *vq,
+		     struct vhost_virtqueue *__rte_restrict vq,
 		     struct rte_mempool *mbuf_pool,
-		     struct rte_mbuf **pkts,
+		     struct rte_mbuf **__rte_restrict pkts,
 		     uint32_t count)
 {
 	uint32_t pkt_idx = 0;
-- 
2.27.0



* Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict
  2020-07-09  9:57         ` David Marchand
@ 2020-07-10  2:45           ` Joyce Kong
  0 siblings, 0 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-10  2:45 UTC (permalink / raw)
  To: David Marchand
  Cc: Maxime Coquelin, jerinj, Zhihong Wang, Xiaolong Ye, Beilei Xing,
	Jeff Guo, Mcnamara, John, Matan Azrad, Shahaf Shuler,
	Viacheslav Ovsiienko, Honnappa Nagarahalli, Phil Yang,
	Ruifeng Wang, dev, nd

> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Thursday, July 9, 2020 5:58 PM
> To: Joyce Kong <Joyce.Kong@arm.com>
> Cc: Maxime Coquelin <maxime.coquelin@redhat.com>; jerinj@marvell.com;
> Zhihong Wang <zhihong.wang@intel.com>; Xiaolong Ye
> <xiaolong.ye@intel.com>; Beilei Xing <beilei.xing@intel.com>; Jeff Guo
> <jia.guo@intel.com>; Mcnamara, John <john.mcnamara@intel.com>; Matan
> Azrad <matan@mellanox.com>; Shahaf Shuler <shahafs@mellanox.com>;
> Viacheslav Ovsiienko <viacheslavo@mellanox.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> Ruifeng Wang <Ruifeng.Wang@arm.com>; dev <dev@dpdk.org>; nd
> <nd@arm.com>
> Subject: Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte
> restrict
> 
> On Wed, Jul 8, 2020 at 5:21 AM Joyce Kong <Joyce.Kong@arm.com> wrote:
> >
> > > -----Original Message-----
> > > From: David Marchand <david.marchand@redhat.com>
> > > Sent: Tuesday, July 7, 2020 10:00 PM
> > > To: Joyce Kong <Joyce.Kong@arm.com>
> > > Cc: Maxime Coquelin <maxime.coquelin@redhat.com>;
> > > jerinj@marvell.com; Zhihong Wang <zhihong.wang@intel.com>; Xiaolong
> > > Ye <xiaolong.ye@intel.com>; Beilei Xing <beilei.xing@intel.com>;
> > > Jeff Guo <jia.guo@intel.com>; Mcnamara, John
> > > <john.mcnamara@intel.com>; Matan Azrad <matan@mellanox.com>;
> Shahaf
> > > Shuler <shahafs@mellanox.com>; Viacheslav Ovsiienko
> > > <viacheslavo@mellanox.com>; Honnappa Nagarahalli
> > > <Honnappa.Nagarahalli@arm.com>; Phil Yang <Phil.Yang@arm.com>;
> > > Ruifeng Wang <Ruifeng.Wang@arm.com>; dev <dev@dpdk.org>; nd
> > > <nd@arm.com>
> > > Subject: Re: [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict
> > > with rte restrict
> > >
> > > On Mon, Jul 6, 2020 at 9:50 AM Joyce Kong <joyce.kong@arm.com>
> wrote:
> > > >
> > > > '__rte_restrict' is a common wrapper for restricted pointers which
> > > > can be supported by all compilers. Use '__rte_restrict' instead of
> > > > '__restrict' for code consistency.
> > >
> > > This patch 4, 5 and 6 are simple replacements and can be squashed
> > > into the first patch.
> > >
> > > Thanks.
> > >
> > The first patch is to add a common definition for lib_eal, could we
> > squash the 4,5 and 6 into one replacement patch, while separate them
> from the first one?
> > This might be more convenient for review?
> 
> A copy/paste commitlog hints that it did not deserve separate patches.
> About easing reviews, the changes are mechanical, there is no special case.
> I am not convinced but, go as you like.
> 
> Can you provide a new revision wrt patch 1 problem?
> Thanks.
> 
> --
> David Marchand

I have just squashed patches 4, 5 and 6 into the first one
and fixed the patch 1 issue in a new version.
Thanks.



* Re: [dpdk-dev] [PATCH v1 2/2] lib/vhost: restrict pointer aliasing for packed path
  2020-07-07 16:25   ` Adrian Moreno
@ 2020-07-10  3:15     ` Joyce Kong
  0 siblings, 0 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-10  3:15 UTC (permalink / raw)
  To: Adrian Moreno
  Cc: maxime.coquelin, jerinj, David Marchand, zhihong.wang,
	xiaolong.ye, Honnappa Nagarahalli, Phil Yang, Ruifeng Wang, dev,
	nd

> -----Original Message-----
> From: Adrian Moreno <amorenoz@redhat.com>
> Sent: Wednesday, July 8, 2020 12:26 AM
> To: Joyce Kong <Joyce.Kong@arm.com>; maxime.coquelin@redhat.com;
> jerinj@marvell.com; zhihong.wang@intel.com; xiaolong.ye@intel.com;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Phil Yang
> <Phil.Yang@arm.com>; Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v1 2/2] lib/vhost: restrict pointer aliasing for
> packed path
> 
> 
> On 6/11/20 5:32 AM, Joyce Kong wrote:
> > Restrict pointer aliasing to allow the compiler to vectorize loops
> > more aggressively.
> >
> > With this patch, a 9.6% improvement is observed in throughput for the
> > packed virtio-net PVP case, and a 2.8% improvement in throughput for
> > the packed virtio-user PVP case. All performance data are measured
> > under 0.001% acceptable packet loss with 1 core on both vhost and
> > virtio side.
> 
> Is the performance gain related solely to this patch or is it the result of the
> combined series?
> 

The performance gain comes solely from this patch, measured on the ThunderX2 platform.
To test the vhost patch, I used 1 core on the vhost side and 1 core on the virtio side.
To test the virtio patch, I used 2 cores on the vhost side and 1 core on the virtio side
to saturate the vCPU cycles, so that the benefit of the patch becomes visible.

> >
> > Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> > Reviewed-by: Phil Yang <phil.yang@arm.com>
> > ---
> >  lib/librte_vhost/virtio_net.c | 14 +++++++-------
> >  1 file changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/librte_vhost/virtio_net.c
> > b/lib/librte_vhost/virtio_net.c index 751c1f373..39c92e7e1 100644
> > --- a/lib/librte_vhost/virtio_net.c
> > +++ b/lib/librte_vhost/virtio_net.c
> > @@ -1133,8 +1133,8 @@ virtio_dev_rx_single_packed(struct virtio_net
> > *dev,
> >
> >  static __rte_noinline uint32_t
> >  virtio_dev_rx_packed(struct virtio_net *dev,
> > -		     struct vhost_virtqueue *vq,
> > -		     struct rte_mbuf **pkts,
> > +		     struct vhost_virtqueue *__restrict vq,
> > +		     struct rte_mbuf **__restrict pkts,
> >  		     uint32_t count)
> >  {
> >  	uint32_t pkt_idx = 0;
> 
> I wonder if we're extracting the full potential of "restrict" considering that the
> heavy lifting is done by the inner functions:
> (virtio_dev_rx_batch_packed and virtio_dev_rx_single_packed)
>

Yes. Pointers declared 'restrict' in a noinline function keep the 'restrict'
property when they are passed into the inner functions.
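
As an illustration only (this is not code from the patch), the pattern described
above can be sketched as a noinline outer function with restrict-qualified
parameters that calls an inline helper. The names copy_lens and sum_after_copy
are hypothetical, and __attribute__((noinline)) assumes GCC or Clang. Whether
the helper's loop is actually vectorized depends on the compiler and flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from virtio_net.c): a candidate for inlining
 * into the caller below. Its own parameters carry no restrict qualifier. */
static inline void
copy_lens(uint32_t *dst, const uint32_t *src, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		dst[i] = src[i] + 1;
}

/* Outer function: the restrict qualifiers are a promise made by the caller
 * that dst and src do not overlap. Once copy_lens() is inlined here, the
 * compiler may use that promise when optimizing the helper's loop. */
__attribute__((noinline)) uint32_t
sum_after_copy(uint32_t *__restrict dst, const uint32_t *__restrict src,
	       size_t n)
{
	uint32_t sum = 0;
	size_t i;

	copy_lens(dst, src, n);
	for (i = 0; i < n; i++)
		sum += dst[i];
	return sum;
}
```

Passing overlapping buffers to sum_after_copy() would break the promise and is
undefined behavior, so the qualifier is also a constraint on every caller.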
 
> > @@ -1219,7 +1219,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t
> > queue_id,
> >
> >  uint16_t
> >  rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
> > -	struct rte_mbuf **pkts, uint16_t count)
> > +	struct rte_mbuf **__restrict pkts, uint16_t count)
> >  {
> >  	struct virtio_net *dev = get_device(vid);
> >
> Is this considered an api change?
> 
Yes.

> > @@ -2124,9 +2124,9 @@ free_zmbuf(struct vhost_virtqueue *vq)
> >
> >  static __rte_noinline uint16_t
> >  virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
> > -			   struct vhost_virtqueue *vq,
> > +			   struct vhost_virtqueue *__restrict vq,
> >  			   struct rte_mempool *mbuf_pool,
> > -			   struct rte_mbuf **pkts,
> > +			   struct rte_mbuf **__restrict pkts,
> >  			   uint32_t count)
> >  {
> >  	uint32_t pkt_idx = 0;
> > @@ -2160,9 +2160,9 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net
> > *dev,
> >
> >  static __rte_noinline uint16_t
> >  virtio_dev_tx_packed(struct virtio_net *dev,
> > -		     struct vhost_virtqueue *vq,
> > +		     struct vhost_virtqueue *__restrict vq,
> >  		     struct rte_mempool *mbuf_pool,
> > -		     struct rte_mbuf **pkts,
> > +		     struct rte_mbuf **__restrict pkts,
> >  		     uint32_t count)
> >  {
> >  	uint32_t pkt_idx = 0;
> >
> 
> --
> Adrián Moreno



* Re: [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper
  2020-07-09 13:52   ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Morten Brørup
@ 2020-07-10  3:17     ` Joyce Kong
  0 siblings, 0 replies; 40+ messages in thread
From: Joyce Kong @ 2020-07-10  3:17 UTC (permalink / raw)
  To: Morten Brørup, maxime.coquelin, jerinj, zhihong.wang,
	xiaolong.ye, beilei.xing, jia.guo, john.mcnamara, matan, shahafs,
	viacheslavo, Honnappa Nagarahalli, Phil Yang, Ruifeng Wang
  Cc: dev, nd

> -----Original Message-----
> From: Morten Brørup <mb@smartsharesystems.com>
> Sent: Thursday, July 9, 2020 9:53 PM
> To: Joyce Kong <Joyce.Kong@arm.com>; maxime.coquelin@redhat.com;
> jerinj@marvell.com; zhihong.wang@intel.com; xiaolong.ye@intel.com;
> beilei.xing@intel.com; jia.guo@intel.com; john.mcnamara@intel.com;
> matan@mellanox.com; shahafs@mellanox.com; viacheslavo@mellanox.com;
> Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Phil Yang
> <Phil.Yang@arm.com>; Ruifeng Wang <Ruifeng.Wang@arm.com>
> Cc: dev@dpdk.org; nd <nd@arm.com>
> Subject: RE: [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a
> common wrapper
> 
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Joyce Kong
> > Sent: Monday, July 6, 2020 9:49 AM
> >
> > As the 'restrict' keyword is recognized in C99, this patchset is to
> > add a wrapper defining '__rte_restrict' which can be supported by all
> > compilers. Then replace the existing 'restrict' and '__restrict'
> > in different vpmds, and optimize vhost/virtio with restricted pointer
> > aliasing for more aggressive loops vectorization.
> >
> > The vhost/virtio optimization patches were benchmarked by running PVP
> > case on ThunderX2 platform and showed positive performance results.
> >
> > Joyce Kong (6):
> >   lib/eal: add a wrapper to define restricted pointers
> >   net/virtio: restrict pointer aliasing for NEON vpmd
> >   lib/vhost: restrict pointer aliasing for packed vpmd
> >   net/i40e: replace restrict with rte restrict
> >   examples/performance-thread: replace restrict with wrapper
> >   net/mlx5: replace restrict keyword with rte restrict
> >
> >  drivers/net/i40e/i40e_rxtx_vec_neon.c         |  17 +-
> >  drivers/net/mlx5/mlx5_rxtx.c                  | 208 +++++++++---------
> >  drivers/net/virtio/virtio_rxtx_simple_neon.c  |   5 +-
> >  .../pthread_shim/pthread_shim.c               |  12 +-
> >  lib/librte_eal/include/rte_common.h           |  10 +
> >  lib/librte_vhost/virtio_net.c                 |  14 +-
> >  6 files changed, 139 insertions(+), 127 deletions(-)
> >
> > --
> > 2.27.0
> >
> 
> If you are hunting for more places to add __rte_restrict, I will suggest
> rte_memcpy.h.
> 
I will try this once the common wrapper patch has been accepted.
Thanks.
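
For readers following the thread, the idea behind the common wrapper in the
quoted cover letter can be sketched as below. This is a minimal sketch under
assumptions: EXAMPLE_RESTRICT and add_arrays are hypothetical names, and the
real '__rte_restrict' definition in rte_common.h may be written differently.
The '__restrict' spelling is assumed to be available as a compiler extension
(it is in GCC and Clang).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a restrict wrapper; not the DPDK definition. */
#if defined(__STDC_VERSION__) && (__STDC_VERSION__ >= 199901L)
#define EXAMPLE_RESTRICT restrict	/* C99 and later: standard keyword */
#else
#define EXAMPLE_RESTRICT __restrict	/* older modes: extension spelling */
#endif

/* The qualifiers state that out, a and b never overlap, which removes one
 * obstacle to vectorizing the loop. */
static void
add_arrays(int *EXAMPLE_RESTRICT out, const int *EXAMPLE_RESTRICT a,
	   const int *EXAMPLE_RESTRICT b, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = a[i] + b[i];
}
```

Code that uses the wrapper compiles the same way whichever branch is taken, so
individual drivers do not need to choose between the two spellings.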



* Re: [dpdk-dev] [PATCH v3 3/3] lib/vhost: restrict pointer aliasing for packed vpmd
  2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 3/3] lib/vhost: restrict pointer aliasing for packed vpmd Joyce Kong
@ 2020-07-10 13:41     ` Adrian Moreno
  0 siblings, 0 replies; 40+ messages in thread
From: Adrian Moreno @ 2020-07-10 13:41 UTC (permalink / raw)
  To: Joyce Kong, maxime.coquelin, jerinj, zhihong.wang, mb,
	xiaolong.ye, beilei.xing, jia.guo, john.mcnamara, matan, shahafs,
	viacheslavo, honnappa.nagarahalli, phil.yang, ruifeng.wang
  Cc: dev, nd



On 7/10/20 4:38 AM, Joyce Kong wrote:
> Restrict pointer aliasing to allow the compiler to vectorize loops
> more aggressively.
> 
> With this patch, a 9.6% improvement is observed in throughput for
> the packed virtio-net PVP case, and a 2.8% improvement in throughput
> for the packed virtio-user PVP case. All performance data are measured
> on ThunderX-2 platform under 0.001% acceptable packet loss with 1 core
> on both vhost and virtio side.
> 
> Signed-off-by: Joyce Kong <joyce.kong@arm.com>
> Reviewed-by: Phil Yang <phil.yang@arm.com>
> ---
>  drivers/net/virtio/virtio_rxtx_simple_neon.c |  5 +++--
>  lib/librte_vhost/virtio_net.c                | 14 +++++++-------
>  2 files changed, 10 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/virtio/virtio_rxtx_simple_neon.c b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> index a9b649814..02520fda8 100644
> --- a/drivers/net/virtio/virtio_rxtx_simple_neon.c
> +++ b/drivers/net/virtio/virtio_rxtx_simple_neon.c
> @@ -36,8 +36,9 @@
>   * - nb_pkts < RTE_VIRTIO_DESC_PER_LOOP, just return no packet
>   */
>  uint16_t
> -virtio_recv_pkts_vec(void *rx_queue, struct rte_mbuf
> -		**__rte_restrict rx_pkts, uint16_t nb_pkts)
> +virtio_recv_pkts_vec(void *rx_queue,
> +		struct rte_mbuf **__rte_restrict rx_pkts,
> +		uint16_t nb_pkts)
>  {
>  	struct virtnet_rx *rxvq = rx_queue;
>  	struct virtqueue *vq = rxvq->vq;
> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
> index 236498f71..1d0be3dd4 100644
> --- a/lib/librte_vhost/virtio_net.c
> +++ b/lib/librte_vhost/virtio_net.c
> @@ -1353,8 +1353,8 @@ virtio_dev_rx_single_packed(struct virtio_net *dev,
>  
>  static __rte_noinline uint32_t
>  virtio_dev_rx_packed(struct virtio_net *dev,
> -		     struct vhost_virtqueue *vq,
> -		     struct rte_mbuf **pkts,
> +		     struct vhost_virtqueue *__rte_restrict vq,
> +		     struct rte_mbuf **__rte_restrict pkts,
>  		     uint32_t count)
>  {
>  	uint32_t pkt_idx = 0;
> @@ -1439,7 +1439,7 @@ virtio_dev_rx(struct virtio_net *dev, uint16_t queue_id,
>  
>  uint16_t
>  rte_vhost_enqueue_burst(int vid, uint16_t queue_id,
> -	struct rte_mbuf **pkts, uint16_t count)
> +	struct rte_mbuf **__rte_restrict pkts, uint16_t count)
>  {
>  	struct virtio_net *dev = get_device(vid);
>  
> @@ -2671,9 +2671,9 @@ free_zmbuf(struct vhost_virtqueue *vq)
>  
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
> -			   struct vhost_virtqueue *vq,
> +			   struct vhost_virtqueue *__rte_restrict vq,
>  			   struct rte_mempool *mbuf_pool,
> -			   struct rte_mbuf **pkts,
> +			   struct rte_mbuf **__rte_restrict pkts,
>  			   uint32_t count)
>  {
>  	uint32_t pkt_idx = 0;
> @@ -2707,9 +2707,9 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
>  
>  static __rte_noinline uint16_t
>  virtio_dev_tx_packed(struct virtio_net *dev,
> -		     struct vhost_virtqueue *vq,
> +		     struct vhost_virtqueue *__rte_restrict vq,
>  		     struct rte_mempool *mbuf_pool,
> -		     struct rte_mbuf **pkts,
> +		     struct rte_mbuf **__rte_restrict pkts,
>  		     uint32_t count)
>  {
>  	uint32_t pkt_idx = 0;
> 

The vhost part looks good to me.

Acked-by: Adrián Moreno <amorenoz@redhat.com>

-- 
Adrián Moreno



* Re: [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper
  2020-07-10  2:38 ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper Joyce Kong
                     ` (2 preceding siblings ...)
  2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 3/3] lib/vhost: restrict pointer aliasing for packed vpmd Joyce Kong
@ 2020-07-10 14:05   ` David Marchand
  3 siblings, 0 replies; 40+ messages in thread
From: David Marchand @ 2020-07-10 14:05 UTC (permalink / raw)
  To: Joyce Kong
  Cc: Maxime Coquelin, Jerin Jacob Kollanukkaran, Zhihong Wang,
	Adrian Moreno Zapata, Morten Brørup, Xiaolong Ye,
	Beilei Xing, Jeff Guo, Mcnamara, John, Matan Azrad,
	Shahaf Shuler, Viacheslav Ovsiienko, Honnappa Nagarahalli,
	Phil Yang, Ruifeng Wang (Arm Technology China),
	dev, nd

On Fri, Jul 10, 2020 at 4:42 AM Joyce Kong <joyce.kong@arm.com> wrote:
>
> As the 'restrict' keyword is recognized in C99, this patchset is to
> add a wrapper defining '__rte_restrict' which can be supported by
> all compilers. Then replace the existing 'restrict' and '__restrict'
> in different vpmds, and optimize vhost/virtio with restricted pointer
> aliasing for more aggressive loops vectorization.
>
> The vhost/virtio optimization patches were benchmarked by running PVP
> case on ThunderX2 platform and showed positive performance results.
>
> v3:
>   1.Correct the compiling issue on GCC 4.8.5.
>   2.Squash the replacement patches and wrapper definition into one
>     patch.(suggested by David Marchand)
>
> v2:
>   Add a common wrapper for restricted pointer aliasing to be supported
>   by all compilers.(suggested by Maxime Coquelin)
>
> Joyce Kong (3):
>   lib/eal: add a common wrapper for restricted pointers
>   net/virtio: restrict pointer aliasing for NEON vpmd
>   lib/vhost: restrict pointer aliasing for packed vpmd
>
>  drivers/net/i40e/i40e_rxtx_vec_neon.c         |  17 +-
>  drivers/net/mlx5/mlx5_rxtx.c                  | 208 +++++++++---------
>  drivers/net/virtio/virtio_rxtx_simple_neon.c  |   5 +-
>  .../pthread_shim/pthread_shim.c               |  12 +-
>  lib/librte_eal/include/rte_common.h           |  10 +
>  lib/librte_vhost/virtio_net.c                 |  14 +-
>  6 files changed, 139 insertions(+), 127 deletions(-)
>

The changes are quite mechanical for the existing drivers.

On the vhost/virtio side, Maxime is off but Adrian had a look at the
generic bits.
The gains in vhost/virtio patches are interesting.

So I went and took those patches through the main branch.

Series applied, thanks Joyce.


-- 
David Marchand



end of thread, other threads:[~2020-07-10 14:05 UTC | newest]

Thread overview: 40+ messages
2020-06-11  3:32 [dpdk-dev] [PATCH v1 0/2] virtio: restrict pointer aliasing for loops vectorization Joyce Kong
2020-06-11  3:32 ` [dpdk-dev] [PATCH v1 1/2] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
2020-06-23  8:47   ` Maxime Coquelin
2020-06-23  9:05     ` Phil Yang
2020-06-24  2:58       ` Joyce Kong
2020-06-24  4:16         ` Stephen Hemminger
2020-06-11  3:32 ` [dpdk-dev] [PATCH v1 2/2] lib/vhost: restrict pointer aliasing for packed path Joyce Kong
2020-07-07 16:25   ` Adrian Moreno
2020-07-10  3:15     ` Joyce Kong
2020-07-06  7:49 ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Joyce Kong
2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 1/6] lib/eal: add a common wrapper for restricted pointers Joyce Kong
2020-07-07  2:15     ` Jerin Jacob
2020-07-07  2:24     ` Phil Yang
2020-07-07  2:40     ` Ruifeng Wang
2020-07-07 13:57     ` David Marchand
2020-07-08  2:46       ` Joyce Kong
2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 2/6] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 3/6] lib/vhost: restrict pointer aliasing for packed vpmd Joyce Kong
2020-07-07 13:58     ` David Marchand
2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 4/6] net/i40e: replace restrict with rte restrict Joyce Kong
2020-07-07  2:25     ` Phil Yang
2020-07-07  2:43     ` Ruifeng Wang
2020-07-07 14:00     ` David Marchand
2020-07-08  3:21       ` Joyce Kong
2020-07-09  9:57         ` David Marchand
2020-07-10  2:45           ` Joyce Kong
2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 5/6] examples/performance-thread: replace restrict with wrapper Joyce Kong
2020-07-07  2:27     ` Phil Yang
2020-07-07  2:45     ` Ruifeng Wang
2020-07-06  7:49   ` [dpdk-dev] [PATCH v2 6/6] net/mlx5: replace restrict keyword with rte restrict Joyce Kong
2020-07-07  2:28     ` Phil Yang
2020-07-07  2:47     ` Ruifeng Wang
2020-07-09 13:52   ` [dpdk-dev] [PATCH v2 0/6] Restrict pointer aliasing with a common wrapper Morten Brørup
2020-07-10  3:17     ` Joyce Kong
2020-07-10  2:38 ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper Joyce Kong
2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 1/3] lib/eal: add a common wrapper for restricted pointers Joyce Kong
2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 2/3] net/virtio: restrict pointer aliasing for NEON vpmd Joyce Kong
2020-07-10  2:38   ` [dpdk-dev] [PATCH v3 3/3] lib/vhost: restrict pointer aliasing for packed vpmd Joyce Kong
2020-07-10 13:41     ` Adrian Moreno
2020-07-10 14:05   ` [dpdk-dev] [PATCH v3 0/3] restrict pointer aliasing with a common wrapper David Marchand
