* [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-28 10:09 ` David Marchand
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 02/17] eal/arm: atomic operations for ARM Jan Viktorin
` (17 subsequent siblings)
18 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
From: Vlastimil Kosar <kosar@rehivetech.com>
Make DPDK run on ARMv7-A architecture. This patch assumes
ARM Cortex-A9. However, it is known to be working on Cortex-A7
and Cortex-A15.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v1 -> v2:
* the -mtune parameter of GCC is configurable now
* the -mfpu=neon can be turned off
v2 -> v3: XMM_SIZE is defined in rte_vect.h in a following patch
---
config/defconfig_arm-armv7-a-linuxapp-gcc | 75 +++++++++++++++++++++++++++++++
mk/arch/arm/rte.vars.mk | 39 ++++++++++++++++
mk/machine/armv7-a/rte.vars.mk | 67 +++++++++++++++++++++++++++
3 files changed, 181 insertions(+)
create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
create mode 100644 mk/arch/arm/rte.vars.mk
create mode 100644 mk/machine/armv7-a/rte.vars.mk
diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
new file mode 100644
index 0000000..5a778cf
--- /dev/null
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -0,0 +1,75 @@
+# BSD LICENSE
+#
+# Copyright (C) 2015 RehiveTech. All right reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of RehiveTech nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv7-a"
+
+CONFIG_RTE_ARCH="arm"
+CONFIG_RTE_ARCH_ARM=y
+CONFIG_RTE_ARCH_ARMv7=y
+CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9"
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+# ARM doesn't have support for vmware TSC map
+CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
+
+# avoids using i686/x86_64 SIMD instructions, nothing for ARM
+CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
+
+# KNI is not supported on 32-bit
+CONFIG_RTE_LIBRTE_KNI=n
+
+# PCI is usually not used on ARM
+CONFIG_RTE_EAL_IGB_UIO=n
+
+# fails to compile on ARM
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_LPM=n
+
+# cannot use those on ARM
+CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_EM_PMD=n
+CONFIG_RTE_LIBRTE_IGB_PMD=n
+CONFIG_RTE_LIBRTE_CXGBE_PMD=n
+CONFIG_RTE_LIBRTE_E1000_PMD=n
+CONFIG_RTE_LIBRTE_ENIC_PMD=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+CONFIG_RTE_LIBRTE_IXGBE_PMD=n
+CONFIG_RTE_LIBRTE_MLX4_PMD=n
+CONFIG_RTE_LIBRTE_MPIPE_PMD=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
+CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
+CONFIG_RTE_LIBRTE_PMD_BNX2X=n
diff --git a/mk/arch/arm/rte.vars.mk b/mk/arch/arm/rte.vars.mk
new file mode 100644
index 0000000..df0c043
--- /dev/null
+++ b/mk/arch/arm/rte.vars.mk
@@ -0,0 +1,39 @@
+# BSD LICENSE
+#
+# Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of RehiveTech nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+ARCH ?= arm
+CROSS ?=
+
+CPU_CFLAGS ?= -marm -DRTE_CACHE_LINE_SIZE=64 -munaligned-access
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv7-a/rte.vars.mk b/mk/machine/armv7-a/rte.vars.mk
new file mode 100644
index 0000000..48d3979
--- /dev/null
+++ b/mk/machine/armv7-a/rte.vars.mk
@@ -0,0 +1,67 @@
+# BSD LICENSE
+#
+# Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of RehiveTech nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# machine:
+#
+# - can define ARCH variable (overridden by cmdline value)
+# - can define CROSS variable (overridden by cmdline value)
+# - define MACHINE_CFLAGS variable (overridden by cmdline value)
+# - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+# - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+# - can define CPU_CFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+CPU_CFLAGS += -mfloat-abi=softfp
+
+MACHINE_CFLAGS += -march=armv7-a
+
+ifdef CONFIG_RTE_ARCH_ARM_TUNE
+MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
+endif
+
+ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
+MACHINE_CFLAGS += -mfpu=neon
+endif
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 01/17] mk: Introduce " Jan Viktorin
@ 2015-10-28 10:09 ` David Marchand
2015-10-28 10:56 ` Jan Viktorin
0 siblings, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 10:09 UTC (permalink / raw)
To: Jan Viktorin; +Cc: dev, Vlastimil Kosar
Hello Jan,
On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:
>
> diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc
> b/config/defconfig_arm-armv7-a-linuxapp-gcc
> new file mode 100644
> index 0000000..5a778cf
> --- /dev/null
> +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> +
> +# avoids using i686/x86_64 SIMD instructions, nothing for ARM
> +CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
>
(<unrelated>yet another build flag which has to disappear, and bitmap
header should be moved from librte_sched to eal with arch-specific
implementations when applicable</unrelated>)
Well, I am a bit confused by this comment.
For me, gcc provides ctzll builtins.
https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
And with your patchset applied, it builds fine with
RTE_BITMAP_OPTIMIZATIONS enabled using gcc 4.7.3 for arm on ubuntu 14.04.
Is there a dependency on gcc version ?
+# PCI is usually not used on ARM
> +CONFIG_RTE_EAL_IGB_UIO=n
>
Not sure "usually not used" is a good reason to disable something.
Is there a real issue on arm with igb_uio code (compilation, pci accesses) ?
Thanks.
--
David Marchand
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
2015-10-28 10:09 ` David Marchand
@ 2015-10-28 10:56 ` Jan Viktorin
2015-10-28 13:40 ` David Marchand
2015-10-28 13:44 ` Hunt, David
0 siblings, 2 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-28 10:56 UTC (permalink / raw)
To: David Marchand; +Cc: dev, Vlastimil Kosar
On Wed, 28 Oct 2015 11:09:21 +0100
David Marchand <david.marchand@6wind.com> wrote:
> Hello Jan,
>
> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
> wrote:
>
> >
> > diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc
> > b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > new file mode 100644
> > index 0000000..5a778cf
> > --- /dev/null
> > +++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
> > +
> > +# avoids using i686/x86_64 SIMD instructions, nothing for ARM
> > +CONFIG_RTE_BITMAP_OPTIMIZATIONS=0
> >
>
> (<unrelated>yet another build flag which has to disappear, and bitmap
> header should be moved from librte_sched to eal with arch-specific
> implementations when applicable</unrelated>)
>
> Well, I am a bit confused by this comment.
> For me, gcc provides ctzll builtins.
> https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html
>
> And with your patchset applied, it builds fine with
> RTE_BITMAP_OPTIMIZATIONS enabled using gcc 4.7.3 for arm on ubuntu 14.04.
> Is there a dependency on gcc version ?
It seems, there is no need for this. I will remove it. DPDK compiles
well.
>
>
> +# PCI is usually not used on ARM
> > +CONFIG_RTE_EAL_IGB_UIO=n
> >
>
> Not sure "usually not used" is a good reason to disable something.
> Is there a real issue on arm with igb_uio code (compilation, pci accesses) ?
>
Well, it requires to set some options in Linux Kernel (at least PCI
support) which are usually disabled by the in-kernel *arm*_defconfigs.
Moreover, it seems I cannot enable it for some ARM architectures (I've
tried Altera SoC FPGA). That's because you hardly find an ARMv7 system
with a PCI bus. I suppose that if somebody _really_ needs this, she would
enable it by hand.
At the moment, it breaks my common builds... The driver is mostly
useless on ARMv7 and just takes space in the filesystem.
>
> Thanks.
>
Regards
Jan
--
Jan Viktorin E-mail: Viktorin@RehiveTech.com
System Architect Web: www.RehiveTech.com
RehiveTech
Brno, Czech Republic
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
2015-10-28 10:56 ` Jan Viktorin
@ 2015-10-28 13:40 ` David Marchand
2015-10-28 13:44 ` Hunt, David
1 sibling, 0 replies; 72+ messages in thread
From: David Marchand @ 2015-10-28 13:40 UTC (permalink / raw)
To: Jan Viktorin; +Cc: dev, Vlastimil Kosar
On Wed, Oct 28, 2015 at 11:56 AM, Jan Viktorin <viktorin@rehivetech.com>
wrote:
> On Wed, 28 Oct 2015 11:09:21 +0100
> David Marchand <david.marchand@6wind.com> wrote:
>
> > +# PCI is usually not used on ARM
> > > +CONFIG_RTE_EAL_IGB_UIO=n
> > >
> >
> > Not sure "usually not used" is a good reason to disable something.
> > Is there a real issue on arm with igb_uio code (compilation, pci
> accesses) ?
> >
>
> Well, it requires to set some options in Linux Kernel (at least PCI
> support) which are usually disabled by the in-kernel *arm*_defconfigs.
> Moreover, it seems I cannot enable it for some ARM architectures (I've
> tried Altera SoC FPGA). That's because you hardly find an ARMv7 system
> with a PCI bus. I suppose that if somebody _really_ needs this, she would
> enable it by hand.
>
> At the moment, it breaks my common builds... The driver is mostly
> useless on ARMv7 and just takes space in the filesystem.
>
>
Ok, well, at the moment, you seem to be the only user :-)
Let's see what other people say.
--
David Marchand
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 01/17] mk: Introduce ARMv7 architecture
2015-10-28 10:56 ` Jan Viktorin
2015-10-28 13:40 ` David Marchand
@ 2015-10-28 13:44 ` Hunt, David
1 sibling, 0 replies; 72+ messages in thread
From: Hunt, David @ 2015-10-28 13:44 UTC (permalink / raw)
To: Jan Viktorin, David Marchand; +Cc: dev, Vlastimil Kosar
On 28/10/2015 10:56, Jan Viktorin wrote:
> On Wed, 28 Oct 2015 11:09:21 +0100
> David Marchand <david.marchand@6wind.com> wrote:
>
>> Hello Jan,
>>
>> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
>> wrote:
>>
>> +# PCI is usually not used on ARM
>>> +CONFIG_RTE_EAL_IGB_UIO=n
>>>
>>
>> Not sure "usually not used" is a good reason to disable something.
>> Is there a real issue on arm with igb_uio code (compilation, pci accesses) ?
>>
>
> Well, it requires to set some options in Linux Kernel (at least PCI
> support) which are usually disabled by the in-kernel *arm*_defconfigs.
> Moreover, it seems I cannot enable it for some ARM architectures (I've
> tried Altera SoC FPGA). That's because you hardly find an ARMv7 system
> with a PCI bus. I suppose that if somebody _really_ needs this, she would
> enable it by hand.
>
> At the moment, it breaks my common builds... The driver is mostly
> useless on ARMv7 and just takes space in the filesystem.
I have an ARMv8 board here that I've built a new kernel for the purposes
of an ARMv8 port, and it took quite a while to get the PCI
functionality all working, including implementing a fix to the kernel
PCI driver to expose the mmap resources in sysfs properly. But after
that, igb_uio compiles fine (on the ARMv8 patch) and works with a
Niantic to pass traffic between ports.
If the majority of ARMv7 boards don't have a PCI bus, then I'd suggest
leaving igb_uio disabled. Those few boards with PCI will most likely
have a correctly kernel (and source) ready to go, so enabling igb_uio
for them will be easy, but disabling seems a more sensible default for
the majority of ARMv7 users.
Rgds,
Dave.
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 02/17] eal/arm: atomic operations for ARM
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 01/17] mk: Introduce " Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 03/17] eal/arm: byte order " Jan Viktorin
` (16 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
From: Vlastimil Kosar <kosar@rehivetech.com>
This patch adds architecture specific atomic operation file
for ARM architecture. It utilizes compiler intrinsics only.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v1 -> v2:
* improve rte_wmb()
* use __atomic_* or __sync_*? (may affect the required GCC version)
---
.../common/include/arch/arm/rte_atomic.h | 256 +++++++++++++++++++++
1 file changed, 256 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
new file mode 100644
index 0000000..1815766
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -0,0 +1,256 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_ARM_H_
+#define _RTE_ATOMIC_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define rte_mb() __sync_synchronize()
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define rte_wmb() do { asm volatile ("dmb st" : : : "memory"); } while(0)
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define rte_rmb() __sync_synchronize()
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+ return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+ __ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+ return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+ __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+ __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+ return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+ return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+ return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+ __ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+ return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+ __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+ __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+ return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+ return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+ return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+ __ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+ int success = 0;
+ uint64_t tmp;
+
+ while (success == 0) {
+ tmp = v->cnt;
+ success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+ tmp, 0);
+ }
+}
+
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+ int success = 0;
+ uint64_t tmp;
+
+ while (success == 0) {
+ tmp = v->cnt;
+ /* replace the value by itself */
+ success = rte_atomic64_cmpset((volatile uint64_t *) &v->cnt,
+ tmp, tmp);
+ }
+ return tmp;
+}
+
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+ int success = 0;
+ uint64_t tmp;
+
+ while (success == 0) {
+ tmp = v->cnt;
+ success = rte_atomic64_cmpset((volatile uint64_t *)&v->cnt,
+ tmp, new_value);
+ }
+}
+
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+ __atomic_fetch_add(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+ __atomic_fetch_sub(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+ __atomic_fetch_add(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+ __atomic_fetch_sub(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+ return __atomic_add_fetch(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+ return __atomic_sub_fetch(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+ return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+ return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+ return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ * A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+ rte_atomic64_set(v, 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 03/17] eal/arm: byte order operations for ARM
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 01/17] mk: Introduce " Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 02/17] eal/arm: atomic operations for ARM Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 04/17] eal/arm: cpu cycle " Jan Viktorin
` (15 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
From: Vlastimil Kosar <kosar@rehivetech.com>
This patch adds architecture specific byte order operations
for ARM. The architecture supports both big and little endian.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
.../common/include/arch/arm/rte_byteorder.h | 148 +++++++++++++++++++++
1 file changed, 148 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
new file mode 100644
index 0000000..04e7b87
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
@@ -0,0 +1,148 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_ARM_H_
+#define _RTE_BYTEORDER_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+ register uint16_t x = _x;
+ asm volatile ("rev16 %[x1],%[x2]"
+ : [x1] "=r" (x)
+ : [x2] "r" (x)
+ );
+ return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+ register uint32_t x = _x;
+ asm volatile ("rev %[x1],%[x2]"
+ : [x1] "=r" (x)
+ : [x2] "r" (x)
+ );
+ return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+ return __builtin_bswap64(_x);
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ? \
+ rte_constant_bswap16(x) : \
+ rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ? \
+ rte_constant_bswap32(x) : \
+ rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ? \
+ rte_constant_bswap64(x) : \
+ rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ? \
+ rte_constant_bswap16(x) : \
+ rte_arch_bswap16(x)))
+#endif
+#endif
+
+/* ARM architecture is bi-endian (both big and little). */
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#else /* RTE_BIG_ENDIAN */
+
+#define rte_cpu_to_le_16(x) rte_bswap16(x)
+#define rte_cpu_to_le_32(x) rte_bswap32(x)
+#define rte_cpu_to_le_64(x) rte_bswap64(x)
+
+#define rte_cpu_to_be_16(x) (x)
+#define rte_cpu_to_be_32(x) (x)
+#define rte_cpu_to_be_64(x) (x)
+
+#define rte_le_to_cpu_16(x) rte_bswap16(x)
+#define rte_le_to_cpu_32(x) rte_bswap32(x)
+#define rte_le_to_cpu_64(x) rte_bswap64(x)
+
+#define rte_be_to_cpu_16(x) (x)
+#define rte_be_to_cpu_32(x) (x)
+#define rte_be_to_cpu_64(x) (x)
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 04/17] eal/arm: cpu cycle operations for ARM
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (2 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 03/17] eal/arm: byte order " Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 05/17] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
` (14 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
From: Vlastimil Kosar <kosar@rehivetech.com>
ARM architecture doesn't have a suitable source of CPU cycles. This
patch uses clock_gettime instead. The implementation should be improved
in the future.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
.../common/include/arch/arm/rte_cycles.h | 85 ++++++++++++++++++++++
1 file changed, 85 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
new file mode 100644
index 0000000..ff66ae2
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -0,0 +1,85 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_ARM_H_
+#define _RTE_CYCLES_ARM_H_
+
+/* ARM v7 does not have suitable source of clock signals. The only clock counter
+ available in the core is 32 bit wide. Therefore it is unsuitable as the
+ counter overlaps every few seconds and probably is not accessible by
+ userspace programs. Therefore we use clock_gettime(CLOCK_MONOTONIC_RAW) to
+ simulate counter running at 1GHz.
+*/
+
+#include <time.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ * The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+ struct timespec val;
+ uint64_t v;
+
+ while (clock_gettime(CLOCK_MONOTONIC_RAW, &val) != 0)
+ /* no body */;
+
+ v = (uint64_t) val.tv_sec * 1000000000LL;
+ v += (uint64_t) val.tv_nsec;
+ return v;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+ rte_mb();
+ return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 05/17] eal/arm: implement rdtsc by PMU or clock_gettime
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (3 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 04/17] eal/arm: cpu cycle " Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 06/17] eal/arm: prefetch operations for ARM Jan Viktorin
` (13 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin
Enable to choose a preferred way to read timer based on the
configuration entry CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU.
It requires a kernel module that is not included to work.
Based on the patch by David Hunt and Armuta Zende:
lib: added support for armv7 architecture
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
.../common/include/arch/arm/rte_cycles.h | 38 +++++++++++++++++++++-
1 file changed, 37 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index ff66ae2..5dcef25 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -54,8 +54,14 @@ extern "C" {
* @return
* The time base for this lcore.
*/
+#ifndef CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
+
+/**
+ * This call is easily portable to any ARM architecture, however,
+ * it may be damn slow and inprecise for some tasks.
+ */
static inline uint64_t
-rte_rdtsc(void)
+__rte_rdtsc_syscall(void)
{
struct timespec val;
uint64_t v;
@@ -67,6 +73,36 @@ rte_rdtsc(void)
v += (uint64_t) val.tv_nsec;
return v;
}
+#define rte_rdtsc __rte_rdtsc_syscall
+
+#else
+
+/**
+ * This function requires to configure the PMCCNTR and enable
+ * userspace access to it:
+ *
+ * asm volatile("mcr p15, 0, %0, c9, c14, 0" : : "r"(1));
+ * asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(29));
+ * asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(0x8000000f));
+ *
+ * which is possible only from the priviledged mode (kernel space).
+ */
+static inline uint64_t
+__rte_rdtsc_pmccntr(void)
+{
+ unsigned tsc;
+ uint64_t final_tsc;
+
+ /* Read PMCCNTR */
+ asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(tsc));
+ /* 1 tick = 64 clocks */
+ final_tsc = ((uint64_t)tsc) << 6;
+
+ return (uint64_t)final_tsc;
+}
+#define rte_rdtsc __rte_rdtsc_pmccntr
+
+#endif /* RTE_ARM_EAL_RDTSC_USE_PMU */
static inline uint64_t
rte_rdtsc_precise(void)
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 06/17] eal/arm: prefetch operations for ARM
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (4 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 05/17] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 07/17] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
` (12 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
From: Vlastimil Kosar <kosar@rehivetech.com>
This patch adds architecture specific prefetch operations
for ARM architecture. It utilizes the pld instruction that
starts filling the appropriate cache line without blocking.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
.../common/include/arch/arm/rte_prefetch.h | 61 ++++++++++++++++++++++
1 file changed, 61 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
new file mode 100644
index 0000000..8d75fe6
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -0,0 +1,61 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_H_
+#define _RTE_PREFETCH_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(const volatile void *p)
+{
+ asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+ asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+ asm volatile ("pld [%[p]]" : : [p] "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 07/17] eal/arm: spinlock operations for ARM (without HTM)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (5 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 06/17] eal/arm: prefetch operations for ARM Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 08/17] eal/arm: vector memcpy for ARM Jan Viktorin
` (11 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
From: Vlastimil Kosar <kosar@rehivetech.com>
This patch adds spinlock operations for ARM architecture.
We do not support HTM in spinlocks on ARM.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
.../common/include/arch/arm/rte_spinlock.h | 114 +++++++++++++++++++++
1 file changed, 114 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
new file mode 100644
index 0000000..cd5ab8b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
@@ -0,0 +1,114 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_ARM_H_
+#define _RTE_SPINLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include "generic/rte_spinlock.h"
+
+/* Intrinsics are used to implement the spinlock on ARM architecture */
+
+#ifndef RTE_FORCE_INTRINSICS
+
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+ while (__sync_lock_test_and_set(&sl->locked, 1))
+ while (sl->locked)
+ rte_pause();
+}
+
+static inline void
+rte_spinlock_unlock(rte_spinlock_t *sl)
+{
+ __sync_lock_release(&sl->locked);
+}
+
+static inline int
+rte_spinlock_trylock(rte_spinlock_t *sl)
+{
+ return (__sync_lock_test_and_set(&sl->locked, 1) == 0);
+}
+
+#endif
+
+static inline int rte_tm_supported(void)
+{
+ return 0;
+}
+
+static inline void
+rte_spinlock_lock_tm(rte_spinlock_t *sl)
+{
+ rte_spinlock_lock(sl); /* fall-back */
+}
+
+static inline int
+rte_spinlock_trylock_tm(rte_spinlock_t *sl)
+{
+ return rte_spinlock_trylock(sl);
+}
+
+static inline void
+rte_spinlock_unlock_tm(rte_spinlock_t *sl)
+{
+ rte_spinlock_unlock(sl);
+}
+
+static inline void
+rte_spinlock_recursive_lock_tm(rte_spinlock_recursive_t *slr)
+{
+ rte_spinlock_recursive_lock(slr); /* fall-back */
+}
+
+static inline void
+rte_spinlock_recursive_unlock_tm(rte_spinlock_recursive_t *slr)
+{
+ rte_spinlock_recursive_unlock(slr);
+}
+
+static inline int
+rte_spinlock_recursive_trylock_tm(rte_spinlock_recursive_t *slr)
+{
+ return rte_spinlock_recursive_trylock(slr);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 08/17] eal/arm: vector memcpy for ARM
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (6 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 07/17] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 09/17] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
` (10 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
From: Vlastimil Kosar <kosar@rehivetech.com>
The SSE based memory copy in DPDK only support x86. This patch
adds ARM NEON based memory copy functions for ARM architecture.
The implementation improves memory copy of short or well aligned
data buffers. The following measurements show improvements over
the libc memcpy on Cortex CPUs.
by X % faster
Length (B) a15 a7 a9
1 4.9 15.2 3.2
7 56.9 48.2 40.3
8 37.3 39.8 29.6
9 69.3 38.7 33.9
15 60.8 35.3 23.7
16 50.6 35.9 35.0
17 57.7 35.7 31.1
31 16.0 23.3 9.0
32 65.9 13.5 21.4
33 3.9 10.3 -3.7
63 2.0 12.9 -2.0
64 66.5 0.0 16.5
65 2.7 7.6 -35.6
127 0.1 4.5 -18.9
128 66.2 1.5 -51.4
129 -0.8 3.2 -35.8
255 -3.1 -0.9 -69.1
256 67.9 1.2 7.2
257 -3.6 -1.9 -36.9
320 67.7 1.4 0.0
384 66.8 1.4 -14.2
511 -44.9 -2.3 -41.9
512 67.3 1.4 -6.8
513 -41.7 -3.0 -36.2
1023 -82.4 -2.8 -41.2
1024 68.3 1.4 -11.6
1025 -80.1 -3.3 -38.1
1518 -47.3 -5.0 -38.3
1522 -48.3 -6.0 -37.9
1600 65.4 1.3 -27.3
2048 59.5 1.5 -10.9
3072 52.3 1.5 -12.2
4096 45.3 1.4 -12.5
5120 40.6 1.5 -14.5
6144 35.4 1.4 -13.4
7168 32.9 1.4 -13.9
8192 28.2 1.4 -15.1
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
.../common/include/arch/arm/rte_memcpy.h | 270 +++++++++++++++++++++
1 file changed, 270 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
new file mode 100644
index 0000000..ac885e9
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -0,0 +1,270 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_ARM_H_
+#define _RTE_MEMCPY_ARM_H_
+
+#include <stdint.h>
+#include <string.h>
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+ vst1q_u8(dst, vld1q_u8(src));
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile ("vld1.8 {d0-d3}, [%[src]]\n\t"
+ "vst1.8 {d0-d3}, [%[dst]]\n\t"
+ : [src] "+r" (src), [dst] "+r" (dst)
+ : : "memory", "d0", "d1", "d2", "d3");
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+ "vld1.8 {d4-d5}, [%[src]]\n\t"
+ "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+ "vst1.8 {d4-d5}, [%[dst]]\n\t"
+ : [src] "+r" (src), [dst] "+r" (dst)
+ : : "memory", "d0", "d1", "d2", "d3", "d4", "d5");
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+ "vld1.8 {d4-d7}, [%[src]]\n\t"
+ "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+ "vst1.8 {d4-d7}, [%[dst]]\n\t"
+ : [src] "+r" (src), [dst] "+r" (dst)
+ : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7");
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile ("pld [%[src], #64]" : : [src] "r" (src));
+ asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+ "vld1.8 {d4-d7}, [%[src]]!\n\t"
+ "vld1.8 {d8-d11}, [%[src]]!\n\t"
+ "vld1.8 {d12-d15}, [%[src]]\n\t"
+ "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+ "vst1.8 {d4-d7}, [%[dst]]!\n\t"
+ "vst1.8 {d8-d11}, [%[dst]]!\n\t"
+ "vst1.8 {d12-d15}, [%[dst]]\n\t"
+ : [src] "+r" (src), [dst] "+r" (dst)
+ : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+ "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile ("pld [%[src], #64]" : : [src] "r" (src));
+ asm volatile ("pld [%[src], #128]" : : [src] "r" (src));
+ asm volatile ("pld [%[src], #192]" : : [src] "r" (src));
+ asm volatile ("pld [%[src], #256]" : : [src] "r" (src));
+ asm volatile ("pld [%[src], #320]" : : [src] "r" (src));
+ asm volatile ("pld [%[src], #384]" : : [src] "r" (src));
+ asm volatile ("pld [%[src], #448]" : : [src] "r" (src));
+ asm volatile ("vld1.8 {d0-d3}, [%[src]]!\n\t"
+ "vld1.8 {d4-d7}, [%[src]]!\n\t"
+ "vld1.8 {d8-d11}, [%[src]]!\n\t"
+ "vld1.8 {d12-d15}, [%[src]]!\n\t"
+ "vld1.8 {d16-d19}, [%[src]]!\n\t"
+ "vld1.8 {d20-d23}, [%[src]]!\n\t"
+ "vld1.8 {d24-d27}, [%[src]]!\n\t"
+ "vld1.8 {d28-d31}, [%[src]]\n\t"
+ "vst1.8 {d0-d3}, [%[dst]]!\n\t"
+ "vst1.8 {d4-d7}, [%[dst]]!\n\t"
+ "vst1.8 {d8-d11}, [%[dst]]!\n\t"
+ "vst1.8 {d12-d15}, [%[dst]]!\n\t"
+ "vst1.8 {d16-d19}, [%[dst]]!\n\t"
+ "vst1.8 {d20-d23}, [%[dst]]!\n\t"
+ "vst1.8 {d24-d27}, [%[dst]]!\n\t"
+ "vst1.8 {d28-d31}, [%[dst]]!\n\t"
+ : [src] "+r" (src), [dst] "+r" (dst)
+ : : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+ "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15",
+ "d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23",
+ "d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31");
+}
+
+#define rte_memcpy(dst, src, n) \
+ ({ (__builtin_constant_p(n)) ? \
+ memcpy((dst), (src), (n)) : \
+ rte_memcpy_func((dst), (src), (n)); })
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+ void *ret = dst;
+
+ /* We can't copy < 16 bytes using XMM registers so do it manually. */
+ if (n < 16) {
+ if (n & 0x01) {
+ *(uint8_t *)dst = *(const uint8_t *)src;
+ dst = (uint8_t *)dst + 1;
+ src = (const uint8_t *)src + 1;
+ }
+ if (n & 0x02) {
+ *(uint16_t *)dst = *(const uint16_t *)src;
+ dst = (uint16_t *)dst + 1;
+ src = (const uint16_t *)src + 1;
+ }
+ if (n & 0x04) {
+ *(uint32_t *)dst = *(const uint32_t *)src;
+ dst = (uint32_t *)dst + 1;
+ src = (const uint32_t *)src + 1;
+ }
+ if (n & 0x08) {
+ /* ARMv7 can not handle unaligned access to long long
+ * (uint64_t). Therefore two uint32_t operations are used.
+ * TODO: use NEON too?
+ */
+ *(uint32_t *)dst = *(const uint32_t *)src;
+ dst = (uint32_t *)dst + 1;
+ src = (const uint32_t *)src + 1;
+ *(uint32_t *)dst = *(const uint32_t *)src;
+ }
+ return ret;
+ }
+
+ /* Special fast cases for <= 128 bytes */
+ if (n <= 32) {
+ rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+ rte_mov16((uint8_t *)dst - 16 + n,
+ (const uint8_t *)src - 16 + n);
+ return ret;
+ }
+
+ if (n <= 64) {
+ rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+ rte_mov32((uint8_t *)dst - 32 + n,
+ (const uint8_t *)src - 32 + n);
+ return ret;
+ }
+
+ if (n <= 128) {
+ rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+ rte_mov64((uint8_t *)dst - 64 + n,
+ (const uint8_t *)src - 64 + n);
+ return ret;
+ }
+
+ /*
+ * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+ * copies was found to be faster than doing 128 and 32 byte copies as
+ * well.
+ */
+ for ( ; n >= 256; n -= 256) {
+ rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+ dst = (uint8_t *)dst + 256;
+ src = (const uint8_t *)src + 256;
+ }
+
+ /*
+ * We split the remaining bytes (which will be less than 256) into
+ * 64byte (2^6) chunks.
+ * Using incrementing integers in the case labels of a switch statement
+ * enourages the compiler to use a jump table. To get incrementing
+ * integers, we shift the 2 relevant bits to the LSB position to first
+ * get decrementing integers, and then subtract.
+ */
+ switch (3 - (n >> 6)) {
+ case 0x00:
+ rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+ n -= 64;
+ dst = (uint8_t *)dst + 64;
+ src = (const uint8_t *)src + 64; /* fallthrough */
+ case 0x01:
+ rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+ n -= 64;
+ dst = (uint8_t *)dst + 64;
+ src = (const uint8_t *)src + 64; /* fallthrough */
+ case 0x02:
+ rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+ n -= 64;
+ dst = (uint8_t *)dst + 64;
+ src = (const uint8_t *)src + 64; /* fallthrough */
+ default:
+ ;
+ }
+
+ /*
+ * We split the remaining bytes (which will be less than 64) into
+ * 16byte (2^4) chunks, using the same switch structure as above.
+ */
+ switch (3 - (n >> 4)) {
+ case 0x00:
+ rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+ n -= 16;
+ dst = (uint8_t *)dst + 16;
+ src = (const uint8_t *)src + 16; /* fallthrough */
+ case 0x01:
+ rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+ n -= 16;
+ dst = (uint8_t *)dst + 16;
+ src = (const uint8_t *)src + 16; /* fallthrough */
+ case 0x02:
+ rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+ n -= 16;
+ dst = (uint8_t *)dst + 16;
+ src = (const uint8_t *)src + 16; /* fallthrough */
+ default:
+ ;
+ }
+
+ /* Copy any remaining bytes, without going beyond end of buffers */
+ if (n != 0)
+ rte_mov16((uint8_t *)dst - 16 + n,
+ (const uint8_t *)src - 16 + n);
+ return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 09/17] eal/arm: use vector memcpy only when NEON is enabled
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (7 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 08/17] eal/arm: vector memcpy for ARM Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 10/17] eal/arm: cpu flag checks for ARM Jan Viktorin
` (9 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin
The GCC can be configured to avoid using NEON extensions.
For that purpose, we provide just the memcpy implementation
of the rte_memcpy.
Based on the patch by David Hunt and Armuta Zende:
lib: added support for armv7 architecture
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
.../common/include/arch/arm/rte_memcpy.h | 59 +++++++++++++++++++++-
1 file changed, 57 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index ac885e9..75e8bda 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -35,8 +35,6 @@
#include <stdint.h>
#include <string.h>
-/* ARM NEON Intrinsics are used to copy data */
-#include <arm_neon.h>
#ifdef __cplusplus
extern "C" {
@@ -44,6 +42,11 @@ extern "C" {
#include "generic/rte_memcpy.h"
+#ifdef __ARM_NEON_FP
+
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
static inline void
rte_mov16(uint8_t *dst, const uint8_t *src)
{
@@ -263,6 +266,58 @@ rte_memcpy_func(void *dst, const void *src, size_t n)
return ret;
}
+#else
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 256);
+}
+
+static inline void *
+rte_memcpy(void *dst, const void *src, size_t n)
+{
+ return memcpy(dst, src, n);
+}
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+ return memcpy(dst, src, n);
+}
+
+#endif /* __ARM_NEON_FP */
+
#ifdef __cplusplus
}
#endif
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 10/17] eal/arm: cpu flag checks for ARM
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (8 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 09/17] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 11/17] eal/arm: detect arm architecture in cpu flags Jan Viktorin
` (8 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
From: Vlastimil Kosar <kosar@rehivetech.com>
This implementation is based on IBM POWER version of
rte_cpuflags. We use software emulation of HW capability
registers, because those are usually not directly accessible
from userspace on ARM.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
app/test/test_cpuflags.c | 5 +
.../common/include/arch/arm/rte_cpuflags.h | 177 +++++++++++++++++++++
mk/rte.cpuflags.mk | 6 +
3 files changed, 188 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 5b92061..557458f 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -115,6 +115,11 @@ test_cpuflags(void)
CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
#endif
+#if defined(RTE_ARCH_ARM)
+ printf("Check for NEON:\t\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+#endif
+
#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
printf("Check for SSE:\t\t");
CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
new file mode 100644
index 0000000..1eadb33
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -0,0 +1,177 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_ARM_H_
+#define _RTE_CPUFLAGS_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <elf.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+
+#include "generic/rte_cpuflags.h"
+
+#ifndef AT_HWCAP
+#define AT_HWCAP 16
+#endif
+
+#ifndef AT_HWCAP2
+#define AT_HWCAP2 26
+#endif
+
+/* software based registers */
+enum cpu_register_t {
+ REG_HWCAP = 0,
+ REG_HWCAP2,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+ RTE_CPUFLAG_SWP = 0,
+ RTE_CPUFLAG_HALF,
+ RTE_CPUFLAG_THUMB,
+ RTE_CPUFLAG_A26BIT,
+ RTE_CPUFLAG_FAST_MULT,
+ RTE_CPUFLAG_FPA,
+ RTE_CPUFLAG_VFP,
+ RTE_CPUFLAG_EDSP,
+ RTE_CPUFLAG_JAVA,
+ RTE_CPUFLAG_IWMMXT,
+ RTE_CPUFLAG_CRUNCH,
+ RTE_CPUFLAG_THUMBEE,
+ RTE_CPUFLAG_NEON,
+ RTE_CPUFLAG_VFPv3,
+ RTE_CPUFLAG_VFPv3D16,
+ RTE_CPUFLAG_TLS,
+ RTE_CPUFLAG_VFPv4,
+ RTE_CPUFLAG_IDIVA,
+ RTE_CPUFLAG_IDIVT,
+ RTE_CPUFLAG_VFPD32,
+ RTE_CPUFLAG_LPAE,
+ RTE_CPUFLAG_EVTSTRM,
+ RTE_CPUFLAG_AES,
+ RTE_CPUFLAG_PMULL,
+ RTE_CPUFLAG_SHA1,
+ RTE_CPUFLAG_SHA2,
+ RTE_CPUFLAG_CRC32,
+ /* The last item */
+ RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+ FEAT_DEF(SWP, 0x00000001, 0, REG_HWCAP, 0)
+ FEAT_DEF(HALF, 0x00000001, 0, REG_HWCAP, 1)
+ FEAT_DEF(THUMB, 0x00000001, 0, REG_HWCAP, 2)
+ FEAT_DEF(A26BIT, 0x00000001, 0, REG_HWCAP, 3)
+ FEAT_DEF(FAST_MULT, 0x00000001, 0, REG_HWCAP, 4)
+ FEAT_DEF(FPA, 0x00000001, 0, REG_HWCAP, 5)
+ FEAT_DEF(VFP, 0x00000001, 0, REG_HWCAP, 6)
+ FEAT_DEF(EDSP, 0x00000001, 0, REG_HWCAP, 7)
+ FEAT_DEF(JAVA, 0x00000001, 0, REG_HWCAP, 8)
+ FEAT_DEF(IWMMXT, 0x00000001, 0, REG_HWCAP, 9)
+ FEAT_DEF(CRUNCH, 0x00000001, 0, REG_HWCAP, 10)
+ FEAT_DEF(THUMBEE, 0x00000001, 0, REG_HWCAP, 11)
+ FEAT_DEF(NEON, 0x00000001, 0, REG_HWCAP, 12)
+ FEAT_DEF(VFPv3, 0x00000001, 0, REG_HWCAP, 13)
+ FEAT_DEF(VFPv3D16, 0x00000001, 0, REG_HWCAP, 14)
+ FEAT_DEF(TLS, 0x00000001, 0, REG_HWCAP, 15)
+ FEAT_DEF(VFPv4, 0x00000001, 0, REG_HWCAP, 16)
+ FEAT_DEF(IDIVA, 0x00000001, 0, REG_HWCAP, 17)
+ FEAT_DEF(IDIVT, 0x00000001, 0, REG_HWCAP, 18)
+ FEAT_DEF(VFPD32, 0x00000001, 0, REG_HWCAP, 19)
+ FEAT_DEF(LPAE, 0x00000001, 0, REG_HWCAP, 20)
+ FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 21)
+ FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP2, 0)
+ FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP2, 1)
+ FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP2, 2)
+ FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP2, 3)
+ FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4)
+};
+
+/*
+ * Read AUXV software register and get cpu features for ARM
+ */
+static inline void
+rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
+ __attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
+{
+ int auxv_fd;
+ Elf32_auxv_t auxv;
+
+ auxv_fd = open("/proc/self/auxv", O_RDONLY);
+ assert(auxv_fd);
+ while (read(auxv_fd, &auxv,
+ sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+ if (auxv.a_type == AT_HWCAP)
+ out[REG_HWCAP] = auxv.a_un.a_val;
+ else if (auxv.a_type == AT_HWCAP2)
+ out[REG_HWCAP2] = auxv.a_un.a_val;
+ }
+}
+
+/*
+ * Checks if a particular flag is available on current machine.
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+ const struct feature_entry *feat;
+ cpuid_registers_t regs = {0};
+
+ if (feature >= RTE_CPUFLAG_NUMFLAGS)
+ /* Flag does not match anything in the feature tables */
+ return -ENOENT;
+
+ feat = &cpu_feature_table[feature];
+
+ if (!feat->leaf)
+ /* This entry in the table wasn't filled out! */
+ return -EFAULT;
+
+ /* get the cpuid leaf containing the desired feature */
+ rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+ /* check if the feature is enabled */
+ return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_ARM_H_ */
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index f595cd0..bec7bdd 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -106,6 +106,12 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
CPUFLAGS += VSX
endif
+# ARM flags
+ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)
+CPUFLAGS += NEON
+endif
+
+
MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
# To strip whitespace
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 11/17] eal/arm: detect arm architecture in cpu flags
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (9 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 10/17] eal/arm: cpu flag checks for ARM Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 12/17] eal/arm: rwlock support for ARM Jan Viktorin
` (7 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin
Based on the patch by David Hunt and Armuta Zende:
lib: added support for armv7 architecture
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
v2 -> v3: fixed forgotten include of string.h
---
lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 1eadb33..7ce9d14 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -41,6 +41,7 @@ extern "C" {
#include <fcntl.h>
#include <assert.h>
#include <unistd.h>
+#include <string.h>
#include "generic/rte_cpuflags.h"
@@ -52,10 +53,15 @@ extern "C" {
#define AT_HWCAP2 26
#endif
+#ifndef AT_PLATFORM
+#define AT_PLATFORM 15
+#endif
+
/* software based registers */
enum cpu_register_t {
REG_HWCAP = 0,
REG_HWCAP2,
+ REG_PLATFORM,
};
/**
@@ -89,6 +95,8 @@ enum rte_cpu_flag_t {
RTE_CPUFLAG_SHA1,
RTE_CPUFLAG_SHA2,
RTE_CPUFLAG_CRC32,
+ RTE_CPUFLAG_AARCH32,
+ RTE_CPUFLAG_AARCH64,
/* The last item */
RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
};
@@ -121,6 +129,8 @@ static const struct feature_entry cpu_feature_table[] = {
FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP2, 2)
FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP2, 3)
FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4)
+ FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0)
+ FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1)
};
/*
@@ -141,6 +151,12 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
out[REG_HWCAP] = auxv.a_un.a_val;
else if (auxv.a_type == AT_HWCAP2)
out[REG_HWCAP2] = auxv.a_un.a_val;
+ else if (auxv.a_type == AT_PLATFORM) {
+ if (!strcmp((const char *)auxv.a_un.a_val, "aarch32"))
+ out[REG_PLATFORM] = 0x0001;
+ else if (!strcmp((const char *)auxv.a_un.a_val, "aarch64"))
+ out[REG_PLATFORM] = 0x0002;
+ }
}
}
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 12/17] eal/arm: rwlock support for ARM
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (10 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 11/17] eal/arm: detect arm architecture in cpu flags Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build Jan Viktorin
` (6 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin
Just a copy from PPC.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
.../common/include/arch/arm/rte_rwlock.h | 40 ++++++++++++++++++++++
1 file changed, 40 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_rwlock.h b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
new file mode 100644
index 0000000..664bec8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
@@ -0,0 +1,40 @@
+/* copied from ppc_64 */
+
+#ifndef _RTE_RWLOCK_ARM_H_
+#define _RTE_RWLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_rwlock.h"
+
+static inline void
+rte_rwlock_read_lock_tm(rte_rwlock_t *rwl)
+{
+ rte_rwlock_read_lock(rwl);
+}
+
+static inline void
+rte_rwlock_read_unlock_tm(rte_rwlock_t *rwl)
+{
+ rte_rwlock_read_unlock(rwl);
+}
+
+static inline void
+rte_rwlock_write_lock_tm(rte_rwlock_t *rwl)
+{
+ rte_rwlock_write_lock(rwl);
+}
+
+static inline void
+rte_rwlock_write_unlock_tm(rte_rwlock_t *rwl)
+{
+ rte_rwlock_write_unlock(rwl);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RWLOCK_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (11 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 12/17] eal/arm: rwlock support for ARM Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-28 12:16 ` David Marchand
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 14/17] maintainers: claim responsibility for ARMv7 Jan Viktorin
` (5 subsequent siblings)
18 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
There several issues with alignment when compiling for ARMv7.
They are not considered to be fatal (ARMv7 supports unaligned
access of 32b words), so we just leave them as warnings. They
should be solved later, however.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
---
mk/toolchain/gcc/rte.vars.mk | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
index 0f51c66..8f9c396 100644
--- a/mk/toolchain/gcc/rte.vars.mk
+++ b/mk/toolchain/gcc/rte.vars.mk
@@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual
WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
WERROR_FLAGS += -Wundef -Wwrite-strings
+# There are many issues reported for ARMv7 architecture
+# which are not necessarily fatal. Report as warnings.
+ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
+WERROR_FLAGS += -Wno-error
+endif
+
# process cpu flags
include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build Jan Viktorin
@ 2015-10-28 12:16 ` David Marchand
2015-10-28 17:34 ` Jan Viktorin
0 siblings, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 12:16 UTC (permalink / raw)
To: Jan Viktorin; +Cc: dev, Vlastimil Kosar
On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:
> There several issues with alignment when compiling for ARMv7.
> They are not considered to be fatal (ARMv7 supports unaligned
> access of 32b words), so we just leave them as warnings. They
> should be solved later, however.
>
> Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
> ---
> mk/toolchain/gcc/rte.vars.mk | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
> index 0f51c66..8f9c396 100644
> --- a/mk/toolchain/gcc/rte.vars.mk
> +++ b/mk/toolchain/gcc/rte.vars.mk
> @@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs
> -Wcast-qual
> WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
> WERROR_FLAGS += -Wundef -Wwrite-strings
>
> +# There are many issues reported for ARMv7 architecture
> +# which are not necessarily fatal. Report as warnings.
> +ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
> +WERROR_FLAGS += -Wno-error
> +endif
> +
>
Can we disable only "known" problems ?
Something like :
WERROR_FLAGS += -Wno-error=cast-align
--
David Marchand
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build
2015-10-28 12:16 ` David Marchand
@ 2015-10-28 17:34 ` Jan Viktorin
0 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-28 17:34 UTC (permalink / raw)
To: David Marchand; +Cc: dev, Vlastimil Kosar
On Wed, 28 Oct 2015 13:16:24 +0100
David Marchand <david.marchand@6wind.com> wrote:
> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
> wrote:
>
> > There several issues with alignment when compiling for ARMv7.
> > They are not considered to be fatal (ARMv7 supports unaligned
> > access of 32b words), so we just leave them as warnings. They
> > should be solved later, however.
> >
> > Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
> > Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
> > ---
> > mk/toolchain/gcc/rte.vars.mk | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
> > index 0f51c66..8f9c396 100644
> > --- a/mk/toolchain/gcc/rte.vars.mk
> > +++ b/mk/toolchain/gcc/rte.vars.mk
> > @@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs
> > -Wcast-qual
> > WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
> > WERROR_FLAGS += -Wundef -Wwrite-strings
> >
> > +# There are many issues reported for ARMv7 architecture
> > +# which are not necessarily fatal. Report as warnings.
> > +ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
> > +WERROR_FLAGS += -Wno-error
> > +endif
> > +
> >
>
> Can we disable only "known" problems ?
>
> Something like :
> WERROR_FLAGS += -Wno-error=cast-align
>
>
Sure! That's better idea, I always forgot about this possibilities in
GCC...
Jan
--
Jan Viktorin E-mail: Viktorin@RehiveTech.com
System Architect Web: www.RehiveTech.com
RehiveTech
Brno, Czech Republic
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 14/17] maintainers: claim responsibility for ARMv7
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (12 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 13/17] gcc/arm: avoid alignment errors to break build Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 15/17] eal/arm: add very incomplete rte_vect Jan Viktorin
` (4 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
MAINTAINERS | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 080a8e8..a8933eb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -124,6 +124,10 @@ IBM POWER
M: Chao Zhu <chaozhu@linux.vnet.ibm.com>
F: lib/librte_eal/common/include/arch/ppc_64/
+ARM v7
+M: Jan Viktorin <viktorin@rehivetech.com>
+F: lib/librte_eal/common/include/arch/arm/
+
Intel x86
M: Bruce Richardson <bruce.richardson@intel.com>
M: Konstantin Ananyev <konstantin.ananyev@intel.com>
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 15/17] eal/arm: add very incomplete rte_vect
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (13 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 14/17] maintainers: claim responsibility for ARMv7 Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 16/17] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for non-x86 Jan Viktorin
` (3 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin
This patch does not map x86 SIMD operations to the ARM ones.
It just fills the necessary gap between the platforms to enable
compilation of libraries LPM (includes rte_vect.h, lpm_test needs
those SIMD functions) and ACL (includes rte_vect.h).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
lib/librte_eal/common/include/arch/arm/rte_vect.h | 81 +++++++++++++++++++++++
1 file changed, 81 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
new file mode 100644
index 0000000..b346c7d
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
@@ -0,0 +1,81 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_VECT_ARM_H_
+#define _RTE_VECT_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define XMM_SIZE 16
+#define XMM_MASK (XMM_MASK - 1)
+
+typedef struct {
+ union uint128 {
+ uint8_t uint8[16];
+ uint32_t uint32[4];
+ } val;
+} __m128i;
+
+static inline __m128i
+_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
+{
+ __m128i res;
+ res.val.uint32[0] = v0;
+ res.val.uint32[1] = v1;
+ res.val.uint32[2] = v2;
+ res.val.uint32[3] = v3;
+ return res;
+}
+
+static inline __m128i
+_mm_loadu_si128(__m128i * v)
+{
+ __m128i res;
+ res = *v;
+ return res;
+}
+
+static inline __m128i
+_mm_load_si128(__m128i * v)
+{
+ __m128i res;
+ res = *v;
+ return res;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 16/17] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for non-x86
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (14 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 15/17] eal/arm: add very incomplete rte_vect Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 17/17] acl: handle when SSE 4.1 is unsupported Jan Viktorin
` (2 subsequent siblings)
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin; +Cc: Vlastimil Kosar
From: Vlastimil Kosar <kosar@rehivetech.com>
LPM function rte_lpm_lookupx4() uses i686/x86_64 SIMD intrinsics. Therefore,
the function is reimplemented using non-vector operations for non-x86
architectures.
LPM now builds for ARM.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v2 -> v3: as SIMD operations have been moved to rte_vect.h,
this patch is now quite clear and just defines the
non-x86 version of rte_lpm_lookupx4
---
config/defconfig_arm-armv7-a-linuxapp-gcc | 1 -
lib/librte_lpm/rte_lpm.h | 24 +++++++++++++++++++++---
2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/config/defconfig_arm-armv7-a-linuxapp-gcc b/config/defconfig_arm-armv7-a-linuxapp-gcc
index 5a778cf..a2c8b95 100644
--- a/config/defconfig_arm-armv7-a-linuxapp-gcc
+++ b/config/defconfig_arm-armv7-a-linuxapp-gcc
@@ -55,7 +55,6 @@ CONFIG_RTE_EAL_IGB_UIO=n
# fails to compile on ARM
CONFIG_RTE_LIBRTE_ACL=n
-CONFIG_RTE_LIBRTE_LPM=n
# cannot use those on ARM
CONFIG_RTE_KNI_KMOD=n
diff --git a/lib/librte_lpm/rte_lpm.h b/lib/librte_lpm/rte_lpm.h
index c299ce2..c02b355 100644
--- a/lib/librte_lpm/rte_lpm.h
+++ b/lib/librte_lpm/rte_lpm.h
@@ -358,9 +358,6 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
return 0;
}
-/* Mask four results. */
-#define RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
-
/**
* Lookup four IP addresses in an LPM table.
*
@@ -382,6 +379,14 @@ rte_lpm_lookup_bulk_func(const struct rte_lpm *lpm, const uint32_t * ips,
*/
static inline void
rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+ uint16_t defv);
+
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
+/* Mask four results. */
+#define RTE_LPM_MASKX4_RES UINT64_C(0x00ff00ff00ff00ff)
+
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
uint16_t defv)
{
__m128i i24;
@@ -472,6 +477,19 @@ rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
hop[2] = (tbl[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[2] : defv;
hop[3] = (tbl[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)tbl[3] : defv;
}
+#else
+static inline void
+rte_lpm_lookupx4(const struct rte_lpm *lpm, __m128i ip, uint16_t hop[4],
+ uint16_t defv)
+{
+ rte_lpm_lookup_bulk(lpm, ip.val.uint32, hop, 4);
+
+ hop[0] = (hop[0] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[0] : defv;
+ hop[1] = (hop[1] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[1] : defv;
+ hop[2] = (hop[2] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[2] : defv;
+ hop[3] = (hop[3] & RTE_LPM_LOOKUP_SUCCESS) ? (uint8_t)hop[3] : defv;
+}
+#endif
#ifdef __cplusplus
}
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v3 17/17] acl: handle when SSE 4.1 is unsupported
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (15 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 16/17] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk for non-x86 Jan Viktorin
@ 2015-10-27 19:13 ` Jan Viktorin
2015-10-28 14:54 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture David Marchand
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
18 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-27 19:13 UTC (permalink / raw)
To: dev, David Hunt, David Marchand, Ananyev, Konstantin
The main goal of this check is to avoid passing the -msse4.1
option to the GCC that does not support it (like arm toolchains).
The ACL now builds for ARM.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v2 -> v3: handle missing SSE as suggested by K. Ananyev
---
lib/librte_acl/Makefile | 7 ++++++-
lib/librte_acl/rte_acl.c | 19 +++++++++++++++++--
2 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/lib/librte_acl/Makefile b/lib/librte_acl/Makefile
index 7a1cf8a..ed95f03 100644
--- a/lib/librte_acl/Makefile
+++ b/lib/librte_acl/Makefile
@@ -48,9 +48,14 @@ SRCS-$(CONFIG_RTE_LIBRTE_ACL) += rte_acl.c
SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_bld.c
SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_gen.c
SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_scalar.c
-SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
+CC_SSE4_1_SUPPORT := $(shell $(CC) -msse4.1 -dM -E - < /dev/null >/dev/null 2>&1 && echo 1)
+
+ifeq ($(CC_SSE4_1_SUPPORT),1)
+SRCS-$(CONFIG_RTE_LIBRTE_ACL) += acl_run_sse.c
+CFLAGS_rte_acl.o += -DCC_SSE41_SUPPORT
CFLAGS_acl_run_sse.o += -msse4.1
+endif
#
# If the compiler supports AVX2 instructions,
diff --git a/lib/librte_acl/rte_acl.c b/lib/librte_acl/rte_acl.c
index d60219f..e7822de 100644
--- a/lib/librte_acl/rte_acl.c
+++ b/lib/librte_acl/rte_acl.c
@@ -42,6 +42,20 @@ static struct rte_tailq_elem rte_acl_tailq = {
EAL_REGISTER_TAILQ(rte_acl_tailq)
/*
+ * If the compiler doesn't support SSE instructions,
+ * then the dummy one would be used instead for SSE classify method.
+ */
+int __attribute__ ((weak))
+rte_acl_classify_sse(__rte_unused const struct rte_acl_ctx *ctx,
+ __rte_unused const uint8_t **data,
+ __rte_unused uint32_t *results,
+ __rte_unused uint32_t num,
+ __rte_unused uint32_t categories)
+{
+ return -ENOTSUP;
+}
+
+/*
* If the compiler doesn't support AVX2 instructions,
* then the dummy one would be used instead for AVX2 classify method.
*/
@@ -97,10 +111,11 @@ rte_acl_init(void)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_AVX2))
alg = RTE_ACL_CLASSIFY_AVX2;
else if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
-#else
+ alg = RTE_ACL_CLASSIFY_SSE;
+#elif defined (CC_SSE41_SUPPORT)
if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_1))
-#endif
alg = RTE_ACL_CLASSIFY_SSE;
+#endif
rte_acl_set_default_classify(alg);
}
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (16 preceding siblings ...)
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 17/17] acl: handle when SSE 4.1 is unsupported Jan Viktorin
@ 2015-10-28 14:54 ` David Marchand
2015-10-28 17:38 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
18 siblings, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 14:54 UTC (permalink / raw)
To: Jan Viktorin; +Cc: dev
Hello Jan,
On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:
> Hello DPDK community,
>
> this is the third attempt to post support for ARMv7 into the DPDK.
> There are changes related to the LPM and ACL libraries only:
>
> * included rte_vect.h, however, it is more a placeholder
> * rte_lpm.h was simplified due to the previous point
> * ACL now compiles as we detect whether the compiler
> supports SSE 4.1
>
This patchset looks good to me (with the minor comments I sent).
And armv8 support should fit quite well in this.
A last few things :
- checkpatch is not happy with some patches, can you have a look at this ?
- can you update the 2.2 release notes as part of this patchset to announce
armv7 support ?
- I am not really sure the acl et lpm fixes really belong to this patchset
as a more larger cleanup is necessary to have all libraries compile fine on
non-x86
- since you introduce a new architecture, do you intend to run daily build
checks and send reports to the test-report mailing list ?
Thanks.
--
David Marchand
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
2015-10-28 14:54 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture David Marchand
@ 2015-10-28 17:38 ` Jan Viktorin
2015-10-28 17:58 ` David Marchand
0 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-28 17:38 UTC (permalink / raw)
To: David Marchand; +Cc: dev
On Wed, 28 Oct 2015 15:54:47 +0100
David Marchand <david.marchand@6wind.com> wrote:
> Hello Jan,
>
> On Tue, Oct 27, 2015 at 8:13 PM, Jan Viktorin <viktorin@rehivetech.com>
> wrote:
>
> > Hello DPDK community,
> >
> > this is the third attempt to post support for ARMv7 into the DPDK.
> > There are changes related to the LPM and ACL libraries only:
> >
> > * included rte_vect.h, however, it is more a placeholder
> > * rte_lpm.h was simplified due to the previous point
> > * ACL now compiles as we detect whether the compiler
> > supports SSE 4.1
> >
>
> This patchset looks good to me (with the minor comments I sent).
> And armv8 support should fit quite well in this.
>
> A last few things :
> - checkpatch is not happy with some patches, can you have a look at this ?
I will check this.
> - can you update the 2.2 release notes as part of this patchset to announce
> armv7 support ?
Yes, but where?
> - I am not really sure the acl et lpm fixes really belong to this patchset
> as a more larger cleanup is necessary to have all libraries compile fine on
> non-x86
So, you mean to omit those and disable them all? The LPM and ACL fixes
will be then included in 2.3?
> - since you introduce a new architecture, do you intend to run daily build
> checks and send reports to the test-report mailing list ?
I think, this is possible, if I automate it somehow. Do you mean to
test every individual patch? I have no tools for this (some ideas?). If
its just about git pull && test_script.sh, then it is quite OK.
I'd appreciate some help, ideas, advices, experiences in this area...
>
>
> Thanks.
>
Jan
--
Jan Viktorin E-mail: Viktorin@RehiveTech.com
System Architect Web: www.RehiveTech.com
RehiveTech
Brno, Czech Republic
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
2015-10-28 17:38 ` Jan Viktorin
@ 2015-10-28 17:58 ` David Marchand
2015-10-29 14:02 ` Thomas Monjalon
0 siblings, 1 reply; 72+ messages in thread
From: David Marchand @ 2015-10-28 17:58 UTC (permalink / raw)
To: Jan Viktorin; +Cc: dev
On Wed, Oct 28, 2015 at 6:38 PM, Jan Viktorin <viktorin@rehivetech.com>
wrote:
>
>
> > - can you update the 2.2 release notes as part of this patchset to
> announce
> > armv7 support ?
>
> Yes, but where?
>
I would say "New Features" in doc/guides/rel_notes/release_2_2.rst.
> > - I am not really sure the acl et lpm fixes really belong to this
> patchset
> > as a more larger cleanup is necessary to have all libraries compile fine
> on
> > non-x86
>
> So, you mean to omit those and disable them all? The LPM and ACL fixes
> will be then included in 2.3?
>
This sounds more sane to me, rather than workarounds only for arm.
> > - since you introduce a new architecture, do you intend to run daily
> build
> > checks and send reports to the test-report mailing list ?
>
> I think, this is possible, if I automate it somehow. Do you mean to
> test every individual patch? I have no tools for this (some ideas?). If
> its just about git pull && test_script.sh, then it is quite OK.
>
> I'd appreciate some help, ideas, advices, experiences in this area...
>
I am pretty sure Thomas has some ideas about this.
Thanks.
--
David Marchand
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
2015-10-28 17:58 ` David Marchand
@ 2015-10-29 14:02 ` Thomas Monjalon
2015-10-29 14:09 ` Jan Viktorin
0 siblings, 1 reply; 72+ messages in thread
From: Thomas Monjalon @ 2015-10-29 14:02 UTC (permalink / raw)
To: Jan Viktorin; +Cc: dev
2015-10-28 18:58, David Marchand:
> > > - since you introduce a new architecture, do you intend to run daily
> > > build checks and send reports to the test-report mailing list ?
> >
> > I think, this is possible, if I automate it somehow. Do you mean to
> > test every individual patch? I have no tools for this (some ideas?). If
> > its just about git pull && test_script.sh, then it is quite OK.
> >
> > I'd appreciate some help, ideas, advices, experiences in this area...
>
> I am pretty sure Thomas has some ideas about this.
In order to make sure new commits won't introduce a regression on ARM,
we need to run some tests before accepting the patch.
If those tests are not run, some daily tests could catch the recent
regressions.
The most basic test is the compilation, then there are some unit tests
and DTS.
Are you OK to start with a basic daily compilation test? It can be achieved
with a simple crontab job.
If you or someone else have some time and a machine to do more, it would
be great. I'm sure the DPDK/ARM will gain enough interest to attract some
volunteers for more automatic tests.
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
2015-10-29 14:02 ` Thomas Monjalon
@ 2015-10-29 14:09 ` Jan Viktorin
2015-10-29 15:02 ` Thomas Monjalon
0 siblings, 1 reply; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 14:09 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: dev
On Thu, 29 Oct 2015 15:02:03 +0100
Thomas Monjalon <thomas.monjalon@6wind.com> wrote:
> 2015-10-28 18:58, David Marchand:
> > > > - since you introduce a new architecture, do you intend to run daily
> > > > build checks and send reports to the test-report mailing list ?
> > >
> > > I think, this is possible, if I automate it somehow. Do you mean to
> > > test every individual patch? I have no tools for this (some ideas?). If
> > > its just about git pull && test_script.sh, then it is quite OK.
> > >
> > > I'd appreciate some help, ideas, advices, experiences in this area...
> >
> > I am pretty sure Thomas has some ideas about this.
>
> In order to make sure new commits won't introduce a regression on ARM,
> we need to run some tests before accepting the patch.
> If those tests are not run, some daily tests could catch the recent
> regressions.
I understand the purpose.
> The most basic test is the compilation, then there are some unit tests
> and DTS.
The unit tests are a problem at the moment. Probably, I will later find
a way, how to perform some unit tests in QEMU.
>
> Are you OK to start with a basic daily compilation test? It can be achieved
> with a simple crontab job.
I can do daily build test in Jenkins or a cron job.
Is it sufficient to check the master branch? Or you mean to pick patches from
patchwork? That seems quite complicated to me.
> If you or someone else have some time and a machine to do more, it would
> be great. I'm sure the DPDK/ARM will gain enough interest to attract some
> volunteers for more automatic tests.
>
^ permalink raw reply [flat|nested] 72+ messages in thread
* Re: [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture
2015-10-29 14:09 ` Jan Viktorin
@ 2015-10-29 15:02 ` Thomas Monjalon
0 siblings, 0 replies; 72+ messages in thread
From: Thomas Monjalon @ 2015-10-29 15:02 UTC (permalink / raw)
To: Jan Viktorin; +Cc: dev
2015-10-29 15:09, Jan Viktorin:
> On Thu, 29 Oct 2015 15:02:03 +0100
> Thomas Monjalon <thomas.monjalon@6wind.com> wrote:
> > The most basic test is the compilation, then there are some unit tests
> > and DTS.
>
> The unit tests are a problem at the moment. Probably, I will later find
> a way, how to perform some unit tests in QEMU.
>
> > Are you OK to start with a basic daily compilation test? It can be achieved
> > with a simple crontab job.
>
> I can do daily build test in Jenkins or a cron job.
>
> Is it sufficient to check the master branch? Or you mean to pick patches from
> patchwork? That seems quite complicated to me.
Yes do what you can. A daily build test of the master branch is the minimum
and may be enough to start.
As I said below, we may hope having some contributions in test coverage.
> > If you or someone else have some time and a machine to do more, it would
> > be great. I'm sure the DPDK/ARM will gain enough interest to attract some
> > volunteers for more automatic tests.
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 00/15] Support ARMv7 architecture
2015-10-27 19:13 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture Jan Viktorin
` (17 preceding siblings ...)
2015-10-28 14:54 ` [dpdk-dev] [PATCH v3 00/17] Support ARMv7 architecture David Marchand
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 01/15] eal/arm: atomic operations for ARM Jan Viktorin
` (14 more replies)
18 siblings, 15 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: dev
Hello DPDK community,
This is the 4th series of the ARMv7 patchset. I've cleaned up
most checkpatch errors:
* Whitespaces were fixed.
* The asm volatile syntax (checkpatch didn't like the named "%xyz"
parameters when listed in the InputOperands list as "[xyz]").
* There are still few complaints (documented in each patch) but
I consider them as unimportant (volatile, couple of line overlaps,
new typedef). If it still can be done better, please report me
your ideas.
Other changes:
* The "introduction patch" was moved to be almost the last patch as
suggested by D. Marchand.
* I've added a note into the release_2_2.rst.
* The RTE_BITMAP_OPTIMIZATIONS was removed.
* The ARMv7 defconfig was renamed as suggested by B. Richardson.
* I've removed the LPM and ACL fixes from the patchset for now.
The libraries table and pipeline are disabled as well.
* The igb_uio driver is disabled.
* The -Wno-error was restricted to cast-align only.
To be answered (from rte_atomic.h patch):
* use __atomic_* or __sync_*? (may affect the required GCC version)
Regards
Jan
---
You can pull the changes from
https://github.com/RehiveTech/dpdk.git arm-support-v4
since commit 82fb702077f67585d64a07de0080e5cb6a924a72:
ixgbe: support new flow director modes for X550 (2015-10-29 00:06:01 +0100)
up to 437c85fd6d9c5f3bdd2411fb9ddf703dc4cba5a5:
maintainers: claim responsibility for ARMv7 (2015-10-29 13:33:49 +0100)
---
Jan Viktorin (7):
eal/arm: implement rdtsc by PMU or clock_gettime
eal/arm: use vector memcpy only when NEON is enabled
eal/arm: detect arm architecture in cpu flags
eal/arm: rwlock support for ARM
eal/arm: add very incomplete rte_vect
gcc/arm: avoid alignment errors to break build
maintainers: claim responsibility for ARMv7
Vlastimil Kosar (8):
eal/arm: atomic operations for ARM
eal/arm: byte order operations for ARM
eal/arm: cpu cycle operations for ARM
eal/arm: prefetch operations for ARM
eal/arm: spinlock operations for ARM (without HTM)
eal/arm: vector memcpy for ARM
eal/arm: cpu flag checks for ARM
mk: Introduce ARMv7 architecture
MAINTAINERS | 4 +
app/test/test_cpuflags.c | 5 +
config/defconfig_arm-armv7a-linuxapp-gcc | 74 +++++
doc/guides/rel_notes/release_2_2.rst | 5 +
.../common/include/arch/arm/rte_atomic.h | 256 ++++++++++++++++
.../common/include/arch/arm/rte_byteorder.h | 150 +++++++++
.../common/include/arch/arm/rte_cpuflags.h | 193 ++++++++++++
.../common/include/arch/arm/rte_cycles.h | 121 ++++++++
.../common/include/arch/arm/rte_memcpy.h | 334 +++++++++++++++++++++
.../common/include/arch/arm/rte_prefetch.h | 61 ++++
.../common/include/arch/arm/rte_rwlock.h | 40 +++
.../common/include/arch/arm/rte_spinlock.h | 114 +++++++
lib/librte_eal/common/include/arch/arm/rte_vect.h | 84 ++++++
mk/arch/arm/rte.vars.mk | 39 +++
mk/machine/armv7-a/rte.vars.mk | 67 +++++
mk/rte.cpuflags.mk | 6 +
mk/toolchain/gcc/rte.vars.mk | 6 +
17 files changed, 1559 insertions(+)
create mode 100644 config/defconfig_arm-armv7a-linuxapp-gcc
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h
create mode 100644 mk/arch/arm/rte.vars.mk
create mode 100644 mk/machine/armv7-a/rte.vars.mk
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 01/15] eal/arm: atomic operations for ARM
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 02/15] eal/arm: byte order " Jan Viktorin
` (13 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev
From: Vlastimil Kosar <kosar@rehivetech.com>
This patch adds architecture specific atomic operation file
for ARM architecture. It utilizes compiler intrinsics only.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v1 -> v2:
* improve rte_wmb()
* use __atomic_* or __sync_*? (may affect the required GCC version)
v4:
* checkpatch complaints about volatile keyword (but seems to be OK to me)
* checkpatch complaints about do { ... } while (0) for single statement
with asm volatile (but I didn't find a way how to write it without
the checkpatch complaints)
* checkpatch is now happy with whitespaces
---
.../common/include/arch/arm/rte_atomic.h | 256 +++++++++++++++++++++
1 file changed, 256 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_atomic.h b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
new file mode 100644
index 0000000..ea1e485
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_atomic.h
@@ -0,0 +1,256 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_ATOMIC_ARM_H_
+#define _RTE_ATOMIC_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define rte_mb() __sync_synchronize()
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define rte_wmb() do { asm volatile ("dmb st" : : : "memory"); } while (0)
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define rte_rmb() __sync_synchronize()
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+ return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+ __ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+ return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+ __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+ __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+ return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+ return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+ return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+ __ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+ return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+ __atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+ __atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+ return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+ return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+ return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE,
+ __ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+ int success = 0;
+ uint64_t tmp;
+
+ while (success == 0) {
+ tmp = v->cnt;
+ success = rte_atomic64_cmpset(
+ (volatile uint64_t *)&v->cnt, tmp, 0);
+ }
+}
+
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+ int success = 0;
+ uint64_t tmp;
+
+ while (success == 0) {
+ tmp = v->cnt;
+ /* replace the value by itself */
+ success = rte_atomic64_cmpset(
+ (volatile uint64_t *) &v->cnt, tmp, tmp);
+ }
+ return tmp;
+}
+
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+ int success = 0;
+ uint64_t tmp;
+
+ while (success == 0) {
+ tmp = v->cnt;
+ success = rte_atomic64_cmpset(
+ (volatile uint64_t *)&v->cnt, tmp, new_value);
+ }
+}
+
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+ __atomic_fetch_add(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+ __atomic_fetch_sub(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+ __atomic_fetch_add(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+ __atomic_fetch_sub(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+ return __atomic_add_fetch(&v->cnt, inc, __ATOMIC_ACQUIRE);
+}
+
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+ return __atomic_sub_fetch(&v->cnt, dec, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+ return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+ return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+ return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ * A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+ rte_atomic64_set(v, 0);
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 02/15] eal/arm: byte order operations for ARM
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 01/15] eal/arm: atomic operations for ARM Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 03/15] eal/arm: cpu cycle " Jan Viktorin
` (12 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev
From: Vlastimil Kosar <kosar@rehivetech.com>
This patch adds architecture specific byte order operations
for ARM. The architecture supports both big and little endian.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v4: fix passing params to asm volatile for checkpatch
---
.../common/include/arch/arm/rte_byteorder.h | 150 +++++++++++++++++++++
1 file changed, 150 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_byteorder.h b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
new file mode 100644
index 0000000..5776997
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_byteorder.h
@@ -0,0 +1,150 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_BYTEORDER_ARM_H_
+#define _RTE_BYTEORDER_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+ register uint16_t x = _x;
+
+ asm volatile ("rev16 %0,%1"
+ : "=r" (x)
+ : "r" (x)
+ );
+ return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+ register uint32_t x = _x;
+
+ asm volatile ("rev %0,%1"
+ : "=r" (x)
+ : "r" (x)
+ );
+ return x;
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+ return __builtin_bswap64(_x);
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ? \
+ rte_constant_bswap16(x) : \
+ rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ? \
+ rte_constant_bswap32(x) : \
+ rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ? \
+ rte_constant_bswap64(x) : \
+ rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available gcc 4.8 and upwards
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ? \
+ rte_constant_bswap16(x) : \
+ rte_arch_bswap16(x)))
+#endif
+#endif
+
+/* ARM architecture is bi-endian (both big and little). */
+#if RTE_BYTE_ORDER == RTE_LITTLE_ENDIAN
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#else /* RTE_BIG_ENDIAN */
+
+#define rte_cpu_to_le_16(x) rte_bswap16(x)
+#define rte_cpu_to_le_32(x) rte_bswap32(x)
+#define rte_cpu_to_le_64(x) rte_bswap64(x)
+
+#define rte_cpu_to_be_16(x) (x)
+#define rte_cpu_to_be_32(x) (x)
+#define rte_cpu_to_be_64(x) (x)
+
+#define rte_le_to_cpu_16(x) rte_bswap16(x)
+#define rte_le_to_cpu_32(x) rte_bswap32(x)
+#define rte_le_to_cpu_64(x) rte_bswap64(x)
+
+#define rte_be_to_cpu_16(x) (x)
+#define rte_be_to_cpu_32(x) (x)
+#define rte_be_to_cpu_64(x) (x)
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 03/15] eal/arm: cpu cycle operations for ARM
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 01/15] eal/arm: atomic operations for ARM Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 02/15] eal/arm: byte order " Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 04/15] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
` (11 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev
From: Vlastimil Kosar <kosar@rehivetech.com>
ARM architecture doesn't have a suitable source of CPU cycles. This
patch uses clock_gettime instead. The implementation should be improved
in the future.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
.../common/include/arch/arm/rte_cycles.h | 85 ++++++++++++++++++++++
1 file changed, 85 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
new file mode 100644
index 0000000..ff66ae2
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -0,0 +1,85 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CYCLES_ARM_H_
+#define _RTE_CYCLES_ARM_H_
+
+/* ARM v7 does not have suitable source of clock signals. The only clock counter
+ available in the core is 32 bit wide. Therefore it is unsuitable as the
+ counter overlaps every few seconds and probably is not accessible by
+ userspace programs. Therefore we use clock_gettime(CLOCK_MONOTONIC_RAW) to
+ simulate counter running at 1GHz.
+*/
+
+#include <time.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ * The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+ struct timespec val;
+ uint64_t v;
+
+ while (clock_gettime(CLOCK_MONOTONIC_RAW, &val) != 0)
+ /* no body */;
+
+ v = (uint64_t) val.tv_sec * 1000000000LL;
+ v += (uint64_t) val.tv_nsec;
+ return v;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+ rte_mb();
+ return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 04/15] eal/arm: implement rdtsc by PMU or clock_gettime
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (2 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 03/15] eal/arm: cpu cycle " Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 05/15] eal/arm: prefetch operations for ARM Jan Viktorin
` (10 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: dev
Enable to choose a preferred way to read timer based on the
configuration entry CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU.
It requires a kernel module that is not included to work.
Based on the patch by David Hunt and Armuta Zende:
lib: added support for armv7 architecture
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
.../common/include/arch/arm/rte_cycles.h | 38 +++++++++++++++++++++-
1 file changed, 37 insertions(+), 1 deletion(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cycles.h b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
index ff66ae2..5dcef25 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cycles.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cycles.h
@@ -54,8 +54,14 @@ extern "C" {
* @return
* The time base for this lcore.
*/
+#ifndef CONFIG_RTE_ARM_EAL_RDTSC_USE_PMU
+
+/**
+ * This call is easily portable to any ARM architecture, however,
+ * it may be damn slow and inprecise for some tasks.
+ */
static inline uint64_t
-rte_rdtsc(void)
+__rte_rdtsc_syscall(void)
{
struct timespec val;
uint64_t v;
@@ -67,6 +73,36 @@ rte_rdtsc(void)
v += (uint64_t) val.tv_nsec;
return v;
}
+#define rte_rdtsc __rte_rdtsc_syscall
+
+#else
+
+/**
+ * This function requires to configure the PMCCNTR and enable
+ * userspace access to it:
+ *
+ * asm volatile("mcr p15, 0, %0, c9, c14, 0" : : "r"(1));
+ * asm volatile("mcr p15, 0, %0, c9, c12, 0" : : "r"(29));
+ * asm volatile("mcr p15, 0, %0, c9, c12, 1" : : "r"(0x8000000f));
+ *
+ * which is possible only from the priviledged mode (kernel space).
+ */
+static inline uint64_t
+__rte_rdtsc_pmccntr(void)
+{
+ unsigned tsc;
+ uint64_t final_tsc;
+
+ /* Read PMCCNTR */
+ asm volatile("mrc p15, 0, %0, c9, c13, 0" : "=r"(tsc));
+ /* 1 tick = 64 clocks */
+ final_tsc = ((uint64_t)tsc) << 6;
+
+ return (uint64_t)final_tsc;
+}
+#define rte_rdtsc __rte_rdtsc_pmccntr
+
+#endif /* RTE_ARM_EAL_RDTSC_USE_PMU */
static inline uint64_t
rte_rdtsc_precise(void)
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 05/15] eal/arm: prefetch operations for ARM
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (3 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 04/15] eal/arm: implement rdtsc by PMU or clock_gettime Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 06/15] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
` (9 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev
From: Vlastimil Kosar <kosar@rehivetech.com>
This patch adds architecture specific prefetch operations
for ARM architecture. It utilizes the pld instruction that
starts filling the appropriate cache line without blocking.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v4:
* checkpatch does not like the syntax of naming params
to asm volatile; switched to %0, %1 syntax
* checkpatch complatins about volatile (seems to be OK for me)
---
.../common/include/arch/arm/rte_prefetch.h | 61 ++++++++++++++++++++++
1 file changed, 61 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_prefetch.h b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
new file mode 100644
index 0000000..62c3991
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_prefetch.h
@@ -0,0 +1,61 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_PREFETCH_ARM_H_
+#define _RTE_PREFETCH_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(const volatile void *p)
+{
+ asm volatile ("pld [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch1(const volatile void *p)
+{
+ asm volatile ("pld [%0]" : : "r" (p));
+}
+
+static inline void rte_prefetch2(const volatile void *p)
+{
+ asm volatile ("pld [%0]" : : "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 06/15] eal/arm: spinlock operations for ARM (without HTM)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (4 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 05/15] eal/arm: prefetch operations for ARM Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 07/15] eal/arm: vector memcpy for ARM Jan Viktorin
` (8 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev
From: Vlastimil Kosar <kosar@rehivetech.com>
This patch adds spinlock operations for ARM architecture.
We do not support HTM in spinlocks on ARM.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
.../common/include/arch/arm/rte_spinlock.h | 114 +++++++++++++++++++++
1 file changed, 114 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_spinlock.h b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
new file mode 100644
index 0000000..cd5ab8b
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_spinlock.h
@@ -0,0 +1,114 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_SPINLOCK_ARM_H_
+#define _RTE_SPINLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include "generic/rte_spinlock.h"
+
+/* Intrinsics are used to implement the spinlock on ARM architecture */
+
+#ifndef RTE_FORCE_INTRINSICS
+
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+ while (__sync_lock_test_and_set(&sl->locked, 1))
+ while (sl->locked)
+ rte_pause();
+}
+
+static inline void
+rte_spinlock_unlock(rte_spinlock_t *sl)
+{
+ __sync_lock_release(&sl->locked);
+}
+
+static inline int
+rte_spinlock_trylock(rte_spinlock_t *sl)
+{
+ return (__sync_lock_test_and_set(&sl->locked, 1) == 0);
+}
+
+#endif
+
+static inline int rte_tm_supported(void)
+{
+ return 0;
+}
+
+static inline void
+rte_spinlock_lock_tm(rte_spinlock_t *sl)
+{
+ rte_spinlock_lock(sl); /* fall-back */
+}
+
+static inline int
+rte_spinlock_trylock_tm(rte_spinlock_t *sl)
+{
+ return rte_spinlock_trylock(sl);
+}
+
+static inline void
+rte_spinlock_unlock_tm(rte_spinlock_t *sl)
+{
+ rte_spinlock_unlock(sl);
+}
+
+static inline void
+rte_spinlock_recursive_lock_tm(rte_spinlock_recursive_t *slr)
+{
+ rte_spinlock_recursive_lock(slr); /* fall-back */
+}
+
+static inline void
+rte_spinlock_recursive_unlock_tm(rte_spinlock_recursive_t *slr)
+{
+ rte_spinlock_recursive_unlock(slr);
+}
+
+static inline int
+rte_spinlock_recursive_trylock_tm(rte_spinlock_recursive_t *slr)
+{
+ return rte_spinlock_recursive_trylock(slr);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 07/15] eal/arm: vector memcpy for ARM
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (5 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 06/15] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 08/15] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
` (7 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev
From: Vlastimil Kosar <kosar@rehivetech.com>
The SSE based memory copy in DPDK only support x86. This patch
adds ARM NEON based memory copy functions for ARM architecture.
The implementation improves memory copy of short or well aligned
data buffers. The following measurements show improvements over
the libc memcpy on Cortex CPUs.
by X % faster
Length (B) a15 a7 a9
1 4.9 15.2 3.2
7 56.9 48.2 40.3
8 37.3 39.8 29.6
9 69.3 38.7 33.9
15 60.8 35.3 23.7
16 50.6 35.9 35.0
17 57.7 35.7 31.1
31 16.0 23.3 9.0
32 65.9 13.5 21.4
33 3.9 10.3 -3.7
63 2.0 12.9 -2.0
64 66.5 0.0 16.5
65 2.7 7.6 -35.6
127 0.1 4.5 -18.9
128 66.2 1.5 -51.4
129 -0.8 3.2 -35.8
255 -3.1 -0.9 -69.1
256 67.9 1.2 7.2
257 -3.6 -1.9 -36.9
320 67.7 1.4 0.0
384 66.8 1.4 -14.2
511 -44.9 -2.3 -41.9
512 67.3 1.4 -6.8
513 -41.7 -3.0 -36.2
1023 -82.4 -2.8 -41.2
1024 68.3 1.4 -11.6
1025 -80.1 -3.3 -38.1
1518 -47.3 -5.0 -38.3
1522 -48.3 -6.0 -37.9
1600 65.4 1.3 -27.3
2048 59.5 1.5 -10.9
3072 52.3 1.5 -12.2
4096 45.3 1.4 -12.5
5120 40.6 1.5 -14.5
6144 35.4 1.4 -13.4
7168 32.9 1.4 -13.9
8192 28.2 1.4 -15.1
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v4:
* fix whitespace issues reported by checkpatch
* fix passing params to asm volatile for checkpatch
---
.../common/include/arch/arm/rte_memcpy.h | 279 +++++++++++++++++++++
1 file changed, 279 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
new file mode 100644
index 0000000..3662b81
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -0,0 +1,279 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_MEMCPY_ARM_H_
+#define _RTE_MEMCPY_ARM_H_
+
+#include <stdint.h>
+#include <string.h>
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+ vst1q_u8(dst, vld1q_u8(src));
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile (
+ "vld1.8 {d0-d3}, [%0]\n\t"
+ "vst1.8 {d0-d3}, [%1]\n\t"
+ : "+r" (src), "+r" (dst)
+ : : "memory", "d0", "d1", "d2", "d3");
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile (
+ "vld1.8 {d0-d3}, [%0]!\n\t"
+ "vld1.8 {d4-d5}, [%0]\n\t"
+ "vst1.8 {d0-d3}, [%1]!\n\t"
+ "vst1.8 {d4-d5}, [%1]\n\t"
+ : "+r" (src), "+r" (dst)
+ :
+ : "memory", "d0", "d1", "d2", "d3", "d4", "d5");
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile (
+ "vld1.8 {d0-d3}, [%0]!\n\t"
+ "vld1.8 {d4-d7}, [%0]\n\t"
+ "vst1.8 {d0-d3}, [%1]!\n\t"
+ "vst1.8 {d4-d7}, [%1]\n\t"
+ : "+r" (src), "+r" (dst)
+ :
+ : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7");
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile ("pld [%0, #64]" : : "r" (src));
+ asm volatile (
+ "vld1.8 {d0-d3}, [%0]!\n\t"
+ "vld1.8 {d4-d7}, [%0]!\n\t"
+ "vld1.8 {d8-d11}, [%0]!\n\t"
+ "vld1.8 {d12-d15}, [%0]\n\t"
+ "vst1.8 {d0-d3}, [%1]!\n\t"
+ "vst1.8 {d4-d7}, [%1]!\n\t"
+ "vst1.8 {d8-d11}, [%1]!\n\t"
+ "vst1.8 {d12-d15}, [%1]\n\t"
+ : "+r" (src), "+r" (dst)
+ :
+ : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+ "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15");
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+ asm volatile ("pld [%0, #64]" : : "r" (src));
+ asm volatile ("pld [%0, #128]" : : "r" (src));
+ asm volatile ("pld [%0, #192]" : : "r" (src));
+ asm volatile ("pld [%0, #256]" : : "r" (src));
+ asm volatile ("pld [%0, #320]" : : "r" (src));
+ asm volatile ("pld [%0, #384]" : : "r" (src));
+ asm volatile ("pld [%0, #448]" : : "r" (src));
+ asm volatile (
+ "vld1.8 {d0-d3}, [%0]!\n\t"
+ "vld1.8 {d4-d7}, [%0]!\n\t"
+ "vld1.8 {d8-d11}, [%0]!\n\t"
+ "vld1.8 {d12-d15}, [%0]!\n\t"
+ "vld1.8 {d16-d19}, [%0]!\n\t"
+ "vld1.8 {d20-d23}, [%0]!\n\t"
+ "vld1.8 {d24-d27}, [%0]!\n\t"
+ "vld1.8 {d28-d31}, [%0]\n\t"
+ "vst1.8 {d0-d3}, [%1]!\n\t"
+ "vst1.8 {d4-d7}, [%1]!\n\t"
+ "vst1.8 {d8-d11}, [%1]!\n\t"
+ "vst1.8 {d12-d15}, [%1]!\n\t"
+ "vst1.8 {d16-d19}, [%1]!\n\t"
+ "vst1.8 {d20-d23}, [%1]!\n\t"
+ "vst1.8 {d24-d27}, [%1]!\n\t"
+ "vst1.8 {d28-d31}, [%1]!\n\t"
+ : "+r" (src), "+r" (dst)
+ :
+ : "memory", "d0", "d1", "d2", "d3", "d4", "d5", "d6", "d7",
+ "d8", "d9", "d10", "d11", "d12", "d13", "d14", "d15",
+ "d16", "d17", "d18", "d19", "d20", "d21", "d22", "d23",
+ "d24", "d25", "d26", "d27", "d28", "d29", "d30", "d31");
+}
+
+#define rte_memcpy(dst, src, n) \
+ ({ (__builtin_constant_p(n)) ? \
+ memcpy((dst), (src), (n)) : \
+ rte_memcpy_func((dst), (src), (n)); })
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+ void *ret = dst;
+
+ /* We can't copy < 16 bytes using XMM registers so do it manually. */
+ if (n < 16) {
+ if (n & 0x01) {
+ *(uint8_t *)dst = *(const uint8_t *)src;
+ dst = (uint8_t *)dst + 1;
+ src = (const uint8_t *)src + 1;
+ }
+ if (n & 0x02) {
+ *(uint16_t *)dst = *(const uint16_t *)src;
+ dst = (uint16_t *)dst + 1;
+ src = (const uint16_t *)src + 1;
+ }
+ if (n & 0x04) {
+ *(uint32_t *)dst = *(const uint32_t *)src;
+ dst = (uint32_t *)dst + 1;
+ src = (const uint32_t *)src + 1;
+ }
+ if (n & 0x08) {
+ /* ARMv7 can not handle unaligned access to long long
+ * (uint64_t). Therefore two uint32_t operations are
+ * used.
+ */
+ *(uint32_t *)dst = *(const uint32_t *)src;
+ dst = (uint32_t *)dst + 1;
+ src = (const uint32_t *)src + 1;
+ *(uint32_t *)dst = *(const uint32_t *)src;
+ }
+ return ret;
+ }
+
+ /* Special fast cases for <= 128 bytes */
+ if (n <= 32) {
+ rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+ rte_mov16((uint8_t *)dst - 16 + n,
+ (const uint8_t *)src - 16 + n);
+ return ret;
+ }
+
+ if (n <= 64) {
+ rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+ rte_mov32((uint8_t *)dst - 32 + n,
+ (const uint8_t *)src - 32 + n);
+ return ret;
+ }
+
+ if (n <= 128) {
+ rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+ rte_mov64((uint8_t *)dst - 64 + n,
+ (const uint8_t *)src - 64 + n);
+ return ret;
+ }
+
+ /*
+ * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+ * copies was found to be faster than doing 128 and 32 byte copies as
+ * well.
+ */
+ for ( ; n >= 256; n -= 256) {
+ rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+ dst = (uint8_t *)dst + 256;
+ src = (const uint8_t *)src + 256;
+ }
+
+ /*
+ * We split the remaining bytes (which will be less than 256) into
+ * 64byte (2^6) chunks.
+ * Using incrementing integers in the case labels of a switch statement
+ * enourages the compiler to use a jump table. To get incrementing
+ * integers, we shift the 2 relevant bits to the LSB position to first
+ * get decrementing integers, and then subtract.
+ */
+ switch (3 - (n >> 6)) {
+ case 0x00:
+ rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+ n -= 64;
+ dst = (uint8_t *)dst + 64;
+ src = (const uint8_t *)src + 64; /* fallthrough */
+ case 0x01:
+ rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+ n -= 64;
+ dst = (uint8_t *)dst + 64;
+ src = (const uint8_t *)src + 64; /* fallthrough */
+ case 0x02:
+ rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+ n -= 64;
+ dst = (uint8_t *)dst + 64;
+ src = (const uint8_t *)src + 64; /* fallthrough */
+ default:
+ break;
+ }
+
+ /*
+ * We split the remaining bytes (which will be less than 64) into
+ * 16byte (2^4) chunks, using the same switch structure as above.
+ */
+ switch (3 - (n >> 4)) {
+ case 0x00:
+ rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+ n -= 16;
+ dst = (uint8_t *)dst + 16;
+ src = (const uint8_t *)src + 16; /* fallthrough */
+ case 0x01:
+ rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+ n -= 16;
+ dst = (uint8_t *)dst + 16;
+ src = (const uint8_t *)src + 16; /* fallthrough */
+ case 0x02:
+ rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+ n -= 16;
+ dst = (uint8_t *)dst + 16;
+ src = (const uint8_t *)src + 16; /* fallthrough */
+ default:
+ break;
+ }
+
+ /* Copy any remaining bytes, without going beyond end of buffers */
+ if (n != 0)
+ rte_mov16((uint8_t *)dst - 16 + n,
+ (const uint8_t *)src - 16 + n);
+ return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 08/15] eal/arm: use vector memcpy only when NEON is enabled
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (6 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 07/15] eal/arm: vector memcpy for ARM Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 09/15] eal/arm: cpu flag checks for ARM Jan Viktorin
` (6 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: dev
The GCC can be configured to avoid using NEON extensions.
For that purpose, we provide just the memcpy implementation
of the rte_memcpy.
Based on the patch by David Hunt and Armuta Zende:
lib: added support for armv7 architecture
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
.../common/include/arch/arm/rte_memcpy.h | 59 +++++++++++++++++++++-
1 file changed, 57 insertions(+), 2 deletions(-)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
index 3662b81..f41648a 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_memcpy.h
@@ -35,8 +35,6 @@
#include <stdint.h>
#include <string.h>
-/* ARM NEON Intrinsics are used to copy data */
-#include <arm_neon.h>
#ifdef __cplusplus
extern "C" {
@@ -44,6 +42,11 @@ extern "C" {
#include "generic/rte_memcpy.h"
+#ifdef __ARM_NEON_FP
+
+/* ARM NEON Intrinsics are used to copy data */
+#include <arm_neon.h>
+
static inline void
rte_mov16(uint8_t *dst, const uint8_t *src)
{
@@ -272,6 +275,58 @@ rte_memcpy_func(void *dst, const void *src, size_t n)
return ret;
}
+#else
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 16);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 32);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 48);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 64);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 128);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+ memcpy(dst, src, 256);
+}
+
+static inline void *
+rte_memcpy(void *dst, const void *src, size_t n)
+{
+ return memcpy(dst, src, n);
+}
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+ return memcpy(dst, src, n);
+}
+
+#endif /* __ARM_NEON_FP */
+
#ifdef __cplusplus
}
#endif
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 09/15] eal/arm: cpu flag checks for ARM
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (7 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 08/15] eal/arm: use vector memcpy only when NEON is enabled Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 10/15] eal/arm: detect arm architecture in cpu flags Jan Viktorin
` (5 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev
From: Vlastimil Kosar <kosar@rehivetech.com>
This implementation is based on IBM POWER version of
rte_cpuflags. We use software emulation of HW capability
registers, because those are usually not directly accessible
from userspace on ARM.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
app/test/test_cpuflags.c | 5 +
.../common/include/arch/arm/rte_cpuflags.h | 177 +++++++++++++++++++++
mk/rte.cpuflags.mk | 6 +
3 files changed, 188 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 5b92061..557458f 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -115,6 +115,11 @@ test_cpuflags(void)
CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
#endif
+#if defined(RTE_ARCH_ARM)
+ printf("Check for NEON:\t\t");
+ CHECK_FOR_FLAG(RTE_CPUFLAG_NEON);
+#endif
+
#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686)
printf("Check for SSE:\t\t");
CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
new file mode 100644
index 0000000..1eadb33
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -0,0 +1,177 @@
+/*
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_CPUFLAGS_ARM_H_
+#define _RTE_CPUFLAGS_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <elf.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+
+#include "generic/rte_cpuflags.h"
+
+#ifndef AT_HWCAP
+#define AT_HWCAP 16
+#endif
+
+#ifndef AT_HWCAP2
+#define AT_HWCAP2 26
+#endif
+
+/* software based registers */
+enum cpu_register_t {
+ REG_HWCAP = 0,
+ REG_HWCAP2,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+ RTE_CPUFLAG_SWP = 0,
+ RTE_CPUFLAG_HALF,
+ RTE_CPUFLAG_THUMB,
+ RTE_CPUFLAG_A26BIT,
+ RTE_CPUFLAG_FAST_MULT,
+ RTE_CPUFLAG_FPA,
+ RTE_CPUFLAG_VFP,
+ RTE_CPUFLAG_EDSP,
+ RTE_CPUFLAG_JAVA,
+ RTE_CPUFLAG_IWMMXT,
+ RTE_CPUFLAG_CRUNCH,
+ RTE_CPUFLAG_THUMBEE,
+ RTE_CPUFLAG_NEON,
+ RTE_CPUFLAG_VFPv3,
+ RTE_CPUFLAG_VFPv3D16,
+ RTE_CPUFLAG_TLS,
+ RTE_CPUFLAG_VFPv4,
+ RTE_CPUFLAG_IDIVA,
+ RTE_CPUFLAG_IDIVT,
+ RTE_CPUFLAG_VFPD32,
+ RTE_CPUFLAG_LPAE,
+ RTE_CPUFLAG_EVTSTRM,
+ RTE_CPUFLAG_AES,
+ RTE_CPUFLAG_PMULL,
+ RTE_CPUFLAG_SHA1,
+ RTE_CPUFLAG_SHA2,
+ RTE_CPUFLAG_CRC32,
+ /* The last item */
+ RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+ FEAT_DEF(SWP, 0x00000001, 0, REG_HWCAP, 0)
+ FEAT_DEF(HALF, 0x00000001, 0, REG_HWCAP, 1)
+ FEAT_DEF(THUMB, 0x00000001, 0, REG_HWCAP, 2)
+ FEAT_DEF(A26BIT, 0x00000001, 0, REG_HWCAP, 3)
+ FEAT_DEF(FAST_MULT, 0x00000001, 0, REG_HWCAP, 4)
+ FEAT_DEF(FPA, 0x00000001, 0, REG_HWCAP, 5)
+ FEAT_DEF(VFP, 0x00000001, 0, REG_HWCAP, 6)
+ FEAT_DEF(EDSP, 0x00000001, 0, REG_HWCAP, 7)
+ FEAT_DEF(JAVA, 0x00000001, 0, REG_HWCAP, 8)
+ FEAT_DEF(IWMMXT, 0x00000001, 0, REG_HWCAP, 9)
+ FEAT_DEF(CRUNCH, 0x00000001, 0, REG_HWCAP, 10)
+ FEAT_DEF(THUMBEE, 0x00000001, 0, REG_HWCAP, 11)
+ FEAT_DEF(NEON, 0x00000001, 0, REG_HWCAP, 12)
+ FEAT_DEF(VFPv3, 0x00000001, 0, REG_HWCAP, 13)
+ FEAT_DEF(VFPv3D16, 0x00000001, 0, REG_HWCAP, 14)
+ FEAT_DEF(TLS, 0x00000001, 0, REG_HWCAP, 15)
+ FEAT_DEF(VFPv4, 0x00000001, 0, REG_HWCAP, 16)
+ FEAT_DEF(IDIVA, 0x00000001, 0, REG_HWCAP, 17)
+ FEAT_DEF(IDIVT, 0x00000001, 0, REG_HWCAP, 18)
+ FEAT_DEF(VFPD32, 0x00000001, 0, REG_HWCAP, 19)
+ FEAT_DEF(LPAE, 0x00000001, 0, REG_HWCAP, 20)
+ FEAT_DEF(EVTSTRM, 0x00000001, 0, REG_HWCAP, 21)
+ FEAT_DEF(AES, 0x00000001, 0, REG_HWCAP2, 0)
+ FEAT_DEF(PMULL, 0x00000001, 0, REG_HWCAP2, 1)
+ FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP2, 2)
+ FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP2, 3)
+ FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4)
+};
+
+/*
+ * Read AUXV software register and get cpu features for ARM
+ */
+static inline void
+rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
+ __attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
+{
+ int auxv_fd;
+ Elf32_auxv_t auxv;
+
+ auxv_fd = open("/proc/self/auxv", O_RDONLY);
+ assert(auxv_fd);
+ while (read(auxv_fd, &auxv,
+ sizeof(Elf32_auxv_t)) == sizeof(Elf32_auxv_t)) {
+ if (auxv.a_type == AT_HWCAP)
+ out[REG_HWCAP] = auxv.a_un.a_val;
+ else if (auxv.a_type == AT_HWCAP2)
+ out[REG_HWCAP2] = auxv.a_un.a_val;
+ }
+}
+
+/*
+ * Checks if a particular flag is available on current machine.
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+ const struct feature_entry *feat;
+ cpuid_registers_t regs = {0};
+
+ if (feature >= RTE_CPUFLAG_NUMFLAGS)
+ /* Flag does not match anything in the feature tables */
+ return -ENOENT;
+
+ feat = &cpu_feature_table[feature];
+
+ if (!feat->leaf)
+ /* This entry in the table wasn't filled out! */
+ return -EFAULT;
+
+ /* get the cpuid leaf containing the desired feature */
+ rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+ /* check if the feature is enabled */
+ return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_ARM_H_ */
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index f595cd0..bec7bdd 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -106,6 +106,12 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
CPUFLAGS += VSX
endif
+# ARM flags
+ifneq ($(filter $(AUTO_CPUFLAGS),__ARM_NEON_FP),)
+CPUFLAGS += NEON
+endif
+
+
MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
# To strip whitespace
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 10/15] eal/arm: detect arm architecture in cpu flags
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (8 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 09/15] eal/arm: cpu flag checks for ARM Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 11/15] eal/arm: rwlock support for ARM Jan Viktorin
` (4 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: dev
Based on the patch by David Hunt and Armuta Zende:
lib: added support for armv7 architecture
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Amruta Zende <amruta.zende@intel.com>
Signed-off-by: David Hunt <david.hunt@intel.com>
---
v2 -> v3: fixed forgotten include of string.h
v4: checkpatch reports few characters over 80 for checking aarch64
---
lib/librte_eal/common/include/arch/arm/rte_cpuflags.h | 16 ++++++++++++++++
1 file changed, 16 insertions(+)
diff --git a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
index 1eadb33..7ce9d14 100644
--- a/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
+++ b/lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
@@ -41,6 +41,7 @@ extern "C" {
#include <fcntl.h>
#include <assert.h>
#include <unistd.h>
+#include <string.h>
#include "generic/rte_cpuflags.h"
@@ -52,10 +53,15 @@ extern "C" {
#define AT_HWCAP2 26
#endif
+#ifndef AT_PLATFORM
+#define AT_PLATFORM 15
+#endif
+
/* software based registers */
enum cpu_register_t {
REG_HWCAP = 0,
REG_HWCAP2,
+ REG_PLATFORM,
};
/**
@@ -89,6 +95,8 @@ enum rte_cpu_flag_t {
RTE_CPUFLAG_SHA1,
RTE_CPUFLAG_SHA2,
RTE_CPUFLAG_CRC32,
+ RTE_CPUFLAG_AARCH32,
+ RTE_CPUFLAG_AARCH64,
/* The last item */
RTE_CPUFLAG_NUMFLAGS,/**< This should always be the last! */
};
@@ -121,6 +129,8 @@ static const struct feature_entry cpu_feature_table[] = {
FEAT_DEF(SHA1, 0x00000001, 0, REG_HWCAP2, 2)
FEAT_DEF(SHA2, 0x00000001, 0, REG_HWCAP2, 3)
FEAT_DEF(CRC32, 0x00000001, 0, REG_HWCAP2, 4)
+ FEAT_DEF(AARCH32, 0x00000001, 0, REG_PLATFORM, 0)
+ FEAT_DEF(AARCH64, 0x00000001, 0, REG_PLATFORM, 1)
};
/*
@@ -141,6 +151,12 @@ rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
out[REG_HWCAP] = auxv.a_un.a_val;
else if (auxv.a_type == AT_HWCAP2)
out[REG_HWCAP2] = auxv.a_un.a_val;
+ else if (auxv.a_type == AT_PLATFORM) {
+ if (!strcmp((const char *)auxv.a_un.a_val, "aarch32"))
+ out[REG_PLATFORM] = 0x0001;
+ else if (!strcmp((const char *)auxv.a_un.a_val, "aarch64"))
+ out[REG_PLATFORM] = 0x0002;
+ }
}
}
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 11/15] eal/arm: rwlock support for ARM
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (9 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 10/15] eal/arm: detect arm architecture in cpu flags Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 12/15] eal/arm: add very incomplete rte_vect Jan Viktorin
` (3 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: dev
Just a copy from PPC.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
.../common/include/arch/arm/rte_rwlock.h | 40 ++++++++++++++++++++++
1 file changed, 40 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_rwlock.h b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
new file mode 100644
index 0000000..664bec8
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_rwlock.h
@@ -0,0 +1,40 @@
+/* copied from ppc_64 */
+
+#ifndef _RTE_RWLOCK_ARM_H_
+#define _RTE_RWLOCK_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_rwlock.h"
+
+static inline void
+rte_rwlock_read_lock_tm(rte_rwlock_t *rwl)
+{
+ rte_rwlock_read_lock(rwl);
+}
+
+static inline void
+rte_rwlock_read_unlock_tm(rte_rwlock_t *rwl)
+{
+ rte_rwlock_read_unlock(rwl);
+}
+
+static inline void
+rte_rwlock_write_lock_tm(rte_rwlock_t *rwl)
+{
+ rte_rwlock_write_lock(rwl);
+}
+
+static inline void
+rte_rwlock_write_unlock_tm(rte_rwlock_t *rwl)
+{
+ rte_rwlock_write_unlock(rwl);
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_RWLOCK_ARM_H_ */
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 12/15] eal/arm: add very incomplete rte_vect
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (10 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 11/15] eal/arm: rwlock support for ARM Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 13/15] gcc/arm: avoid alignment errors to break build Jan Viktorin
` (2 subsequent siblings)
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: dev
This patch does not map x86 SIMD operations to the ARM ones.
It just fills the necessary gap between the platforms to enable
compilation of libraries LPM (includes rte_vect.h, lpm_test needs
those SIMD functions) and ACL (includes rte_vect.h).
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v4: checkpatch reports warning for the new typedef
---
lib/librte_eal/common/include/arch/arm/rte_vect.h | 84 +++++++++++++++++++++++
1 file changed, 84 insertions(+)
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_vect.h
diff --git a/lib/librte_eal/common/include/arch/arm/rte_vect.h b/lib/librte_eal/common/include/arch/arm/rte_vect.h
new file mode 100644
index 0000000..7d5de97
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/arm/rte_vect.h
@@ -0,0 +1,84 @@
+/*-
+ * BSD LICENSE
+ *
+ * Copyright(c) 2015 RehiveTech. All rights reserved.
+ *
+ * Redistribution and use in source and binary forms, with or without
+ * modification, are permitted provided that the following conditions
+ * are met:
+ *
+ * * Redistributions of source code must retain the above copyright
+ * notice, this list of conditions and the following disclaimer.
+ * * Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in
+ * the documentation and/or other materials provided with the
+ * distribution.
+ * * Neither the name of RehiveTech nor the names of its
+ * contributors may be used to endorse or promote products derived
+ * from this software without specific prior written permission.
+ *
+ * THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ * "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ * LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ * OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ * SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ * LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ * DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ * THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ * OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ */
+
+#ifndef _RTE_VECT_ARM_H_
+#define _RTE_VECT_ARM_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#define XMM_SIZE 16
+#define XMM_MASK (XMM_MASK - 1)
+
+typedef struct {
+ union uint128 {
+ uint8_t uint8[16];
+ uint32_t uint32[4];
+ } val;
+} __m128i;
+
+static inline __m128i
+_mm_set_epi32(uint32_t v0, uint32_t v1, uint32_t v2, uint32_t v3)
+{
+ __m128i res;
+
+ res.val.uint32[0] = v0;
+ res.val.uint32[1] = v1;
+ res.val.uint32[2] = v2;
+ res.val.uint32[3] = v3;
+ return res;
+}
+
+static inline __m128i
+_mm_loadu_si128(__m128i *v)
+{
+ __m128i res;
+
+ res = *v;
+ return res;
+}
+
+static inline __m128i
+_mm_load_si128(__m128i *v)
+{
+ __m128i res;
+
+ res = *v;
+ return res;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 13/15] gcc/arm: avoid alignment errors to break build
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (11 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 12/15] eal/arm: add very incomplete rte_vect Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 14/15] mk: Introduce ARMv7 architecture Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 15/15] maintainers: claim responsibility for ARMv7 Jan Viktorin
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: dev, Vlastimil Kosar
There several issues with alignment when compiling for ARMv7.
They are not considered to be fatal (ARMv7 supports unaligned
access of 32b words), so we just leave them as warnings. They
should be solved later, however.
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
---
v4: restrict -Wno-error to the cast-align only
---
mk/toolchain/gcc/rte.vars.mk | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/mk/toolchain/gcc/rte.vars.mk b/mk/toolchain/gcc/rte.vars.mk
index 0f51c66..c2c5255 100644
--- a/mk/toolchain/gcc/rte.vars.mk
+++ b/mk/toolchain/gcc/rte.vars.mk
@@ -77,6 +77,12 @@ WERROR_FLAGS += -Wcast-align -Wnested-externs -Wcast-qual
WERROR_FLAGS += -Wformat-nonliteral -Wformat-security
WERROR_FLAGS += -Wundef -Wwrite-strings
+# There are many issues reported for ARMv7 architecture
+# which are not necessarily fatal. Report as warnings.
+ifeq ($(CONFIG_RTE_ARCH_ARMv7),y)
+WERROR_FLAGS += -Wno-error=cast-align
+endif
+
# process cpu flags
include $(RTE_SDK)/mk/toolchain/$(RTE_TOOLCHAIN)/rte.toolchain-compat.mk
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 14/15] mk: Introduce ARMv7 architecture
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (12 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 13/15] gcc/arm: avoid alignment errors to break build Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 15/15] maintainers: claim responsibility for ARMv7 Jan Viktorin
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: Vlastimil Kosar, dev
From: Vlastimil Kosar <kosar@rehivetech.com>
Make DPDK run on ARMv7-A architecture. This patch assumes
ARM Cortex-A9. However, it is known to be working on Cortex-A7
and Cortex-A15.
Signed-off-by: Vlastimil Kosar <kosar@rehivetech.com>
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
v2:
* the -mtune parameter of GCC is configurable now
* the -mfpu=neon can be turned off
v3: XMM_SIZE is defined in rte_vect.h in a following patch
v4:
* update release notes for 2.2
* get rid of CONFIG_RTE_BITMAP_OPTIMIZATIONS=0 setting
* rename arm defconfig: "armv7-a" -> "arvm7a"
* disable pipeline and table modules unless lpm is fixed
---
config/defconfig_arm-armv7a-linuxapp-gcc | 74 ++++++++++++++++++++++++++++++++
doc/guides/rel_notes/release_2_2.rst | 5 +++
mk/arch/arm/rte.vars.mk | 39 +++++++++++++++++
mk/machine/armv7-a/rte.vars.mk | 67 +++++++++++++++++++++++++++++
4 files changed, 185 insertions(+)
create mode 100644 config/defconfig_arm-armv7a-linuxapp-gcc
create mode 100644 mk/arch/arm/rte.vars.mk
create mode 100644 mk/machine/armv7-a/rte.vars.mk
diff --git a/config/defconfig_arm-armv7a-linuxapp-gcc b/config/defconfig_arm-armv7a-linuxapp-gcc
new file mode 100644
index 0000000..d623222
--- /dev/null
+++ b/config/defconfig_arm-armv7a-linuxapp-gcc
@@ -0,0 +1,74 @@
+# BSD LICENSE
+#
+# Copyright (C) 2015 RehiveTech. All right reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of RehiveTech nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "common_linuxapp"
+
+CONFIG_RTE_MACHINE="armv7-a"
+
+CONFIG_RTE_ARCH="arm"
+CONFIG_RTE_ARCH_ARM=y
+CONFIG_RTE_ARCH_ARMv7=y
+CONFIG_RTE_ARCH_ARM_TUNE="cortex-a9"
+CONFIG_RTE_ARCH_ARM_NEON=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
+# ARM doesn't have support for vmware TSC map
+CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
+
+# KNI is not supported on 32-bit
+CONFIG_RTE_LIBRTE_KNI=n
+
+# PCI is usually not used on ARM
+CONFIG_RTE_EAL_IGB_UIO=n
+
+# fails to compile on ARM
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_TABLE=n
+CONFIG_RTE_LIBRTE_PIPELINE=n
+
+# cannot use those on ARM
+CONFIG_RTE_KNI_KMOD=n
+CONFIG_RTE_LIBRTE_EM_PMD=n
+CONFIG_RTE_LIBRTE_IGB_PMD=n
+CONFIG_RTE_LIBRTE_CXGBE_PMD=n
+CONFIG_RTE_LIBRTE_E1000_PMD=n
+CONFIG_RTE_LIBRTE_ENIC_PMD=n
+CONFIG_RTE_LIBRTE_FM10K_PMD=n
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+CONFIG_RTE_LIBRTE_IXGBE_PMD=n
+CONFIG_RTE_LIBRTE_MLX4_PMD=n
+CONFIG_RTE_LIBRTE_MPIPE_PMD=n
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
+CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
+CONFIG_RTE_LIBRTE_PMD_BNX2X=n
diff --git a/doc/guides/rel_notes/release_2_2.rst b/doc/guides/rel_notes/release_2_2.rst
index be6f827..43a3a3c 100644
--- a/doc/guides/rel_notes/release_2_2.rst
+++ b/doc/guides/rel_notes/release_2_2.rst
@@ -23,6 +23,11 @@ New Features
* **Added vhost-user multiple queue support.**
+* **Introduce ARMv7 architecture**
+
+ It is now possible to build DPDK for the ARMv7 platform and test with
+ virtual PMD drivers.
+
Resolved Issues
---------------
diff --git a/mk/arch/arm/rte.vars.mk b/mk/arch/arm/rte.vars.mk
new file mode 100644
index 0000000..df0c043
--- /dev/null
+++ b/mk/arch/arm/rte.vars.mk
@@ -0,0 +1,39 @@
+# BSD LICENSE
+#
+# Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of RehiveTech nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+ARCH ?= arm
+CROSS ?=
+
+CPU_CFLAGS ?= -marm -DRTE_CACHE_LINE_SIZE=64 -munaligned-access
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/armv7-a/rte.vars.mk b/mk/machine/armv7-a/rte.vars.mk
new file mode 100644
index 0000000..48d3979
--- /dev/null
+++ b/mk/machine/armv7-a/rte.vars.mk
@@ -0,0 +1,67 @@
+# BSD LICENSE
+#
+# Copyright (C) 2015 RehiveTech. All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+#
+# * Redistributions of source code must retain the above copyright
+# notice, this list of conditions and the following disclaimer.
+# * Redistributions in binary form must reproduce the above copyright
+# notice, this list of conditions and the following disclaimer in
+# the documentation and/or other materials provided with the
+# distribution.
+# * Neither the name of RehiveTech nor the names of its
+# contributors may be used to endorse or promote products derived
+# from this software without specific prior written permission.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# machine:
+#
+# - can define ARCH variable (overridden by cmdline value)
+# - can define CROSS variable (overridden by cmdline value)
+# - define MACHINE_CFLAGS variable (overridden by cmdline value)
+# - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+# - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+# - can define CPU_CFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+# overrides the one defined in arch.
+# - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+CPU_CFLAGS += -mfloat-abi=softfp
+
+MACHINE_CFLAGS += -march=armv7-a
+
+ifdef CONFIG_RTE_ARCH_ARM_TUNE
+MACHINE_CFLAGS += -mtune=$(CONFIG_RTE_ARCH_ARM_TUNE)
+endif
+
+ifeq ($(CONFIG_RTE_ARCH_ARM_NEON),y)
+MACHINE_CFLAGS += -mfpu=neon
+endif
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread
* [dpdk-dev] [PATCH v4 15/15] maintainers: claim responsibility for ARMv7
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 00/15] " Jan Viktorin
` (13 preceding siblings ...)
2015-10-29 12:43 ` [dpdk-dev] [PATCH v4 14/15] mk: Introduce ARMv7 architecture Jan Viktorin
@ 2015-10-29 12:43 ` Jan Viktorin
14 siblings, 0 replies; 72+ messages in thread
From: Jan Viktorin @ 2015-10-29 12:43 UTC (permalink / raw)
To: David Hunt, David Marchand; +Cc: dev
Signed-off-by: Jan Viktorin <viktorin@rehivetech.com>
---
MAINTAINERS | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index 080a8e8..a8933eb 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -124,6 +124,10 @@ IBM POWER
M: Chao Zhu <chaozhu@linux.vnet.ibm.com>
F: lib/librte_eal/common/include/arch/ppc_64/
+ARM v7
+M: Jan Viktorin <viktorin@rehivetech.com>
+F: lib/librte_eal/common/include/arch/arm/
+
Intel x86
M: Bruce Richardson <bruce.richardson@intel.com>
M: Konstantin Ananyev <konstantin.ananyev@intel.com>
--
2.6.1
^ permalink raw reply [flat|nested] 72+ messages in thread