From: Jan Viktorin
To: Thomas Monjalon, David Hunt, dev@dpdk.org
Date: Mon, 26 Oct 2015 17:37:22 +0100
Message-Id: <1445877458-31052-1-git-send-email-viktorin@rehivetech.com>
X-Mailer: git-send-email 2.6.1
Subject: [dpdk-dev] [PATCH v2 00/16] Support ARMv7 architecture
List-Id: patches and discussions about DPDK

Hello DPDK community, Thomas, Dave,

here I propose the second version of the ARM support patch series. I've
included some ideas from Dave's patch. There are no big changes to the
original series.

Important:

* The timer issue now has two solutions: the user may configure either
  the PMU counter or the clock_gettime API. The PMU counter may, however,
  break perf or other tools using the PMU Linux API, which is why I did
  not make it the default. Also, I didn't include the Linux kernel module
  that enables the PMU for userspace. There is a note about it in
  rte_cycles.h. You should know what you are doing if you use that, so
  you may also write that simple driver yourself or take it from Dave's
  patch. Later, we can integrate it, after we have some real PMD driver
  (and some supporting Linux kernel module infra...).

* There is the NEON implementation of memcpy. It is faster than the
  native one (you can see stats in the patch); however, we must be sure
  that the target CPU contains the NEON co-processor. Also, for longer
  data lengths, and depending on the ARM SoC, the NEON memcpy
  implementation can be much slower than the native one. So this is
  again configurable.

* The cpuflags now contains the best from my and Dave's patches.

* ACL build is broken. I've included a patch (16) that just prevents
  passing -msse4.1 to gcc if gcc does not support it. But that does not
  solve the whole issue.

* LPM build is broken unless you apply patch 15. However, this is not
  the right solution and I provided it just to have a workaround. I
  don't expect it to be merged.

* I've added myself to the MAINTAINERS. Dave, would you like to be there
  as well?

* The Cortex A7, A8, A9 cores are non-LPAE (non Large Physical Address
  Extension) and thus there is no upstream support for huge pages in the
  Linux kernel. Huge pages may sound useless for devices with max 4 GB
  of RAM (usually 0.5-2 GB). However, our measurements have shown that
  they improve performance. A patch is somewhere deep in the kernel.org
  mailing lists.

* Only the GCC toolchain is considered at the moment.

Other details are included in each individual commit.
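For readers who want a feel for the clock_gettime variant of the timer
mentioned in the first item, here is a minimal sketch. It is not the
code from the series: the names sketch_rdtsc_syscall, sketch_tsc_hz and
SKETCH_NSEC_PER_SEC are hypothetical, and the choice of CLOCK_MONOTONIC
is an assumption made for portability. The idea is simply to treat
nanoseconds of a monotonic clock as "cycles", which gives a counter
that ticks at a nominal 1 GHz. The PMU variant is not shown because it
depends on the kernel module that is not part of this series.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L /* expose clock_gettime() in strict -std= modes */
#endif
#include <stdint.h>
#include <time.h>

/* Hypothetical constant: nanoseconds per second. */
#define SKETCH_NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical stand-in for a TSC read on ARMv7: returns the current
 * value of the monotonic clock expressed in nanoseconds.
 */
static inline uint64_t
sketch_rdtsc_syscall(void)
{
	struct timespec ts;

	/* Retry until the call succeeds; with a valid clock id and a
	 * valid pointer a failure is not expected. */
	while (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		;

	return (uint64_t)ts.tv_sec * SKETCH_NSEC_PER_SEC +
		(uint64_t)ts.tv_nsec;
}

/* Nominal frequency of the counter above: one tick per nanosecond. */
static inline uint64_t
sketch_tsc_hz(void)
{
	return SKETCH_NSEC_PER_SEC;
}
```

Two consecutive reads can be subtracted to get an elapsed time in
nanoseconds. Each read may cost a system call or a vDSO call, so this
is slower than reading a hardware counter, but it does not interfere
with perf or other users of the PMU.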
---

You can pull the changes from

  https://github.com/RehiveTech/dpdk.git arm-support-v2

since commit d08d304508a8a8caf255baf622ab65db1fec952c:

  eal/linux: make alarm not affected by system time jump (2015-10-21 17:01:24 +0200)

up to 57396c958571b651b4d14f90683b3d1b2d42a70e:

  acl: check for SSE 4.1 support (2015-10-26 17:29:36 +0100)

---

Regards
Jan Viktorin

Jan Viktorin (7):
  eal/arm: implement rdtsc by PMU or clock_gettime
  eal/arm: use vector memcpy only when NEON is enabled
  eal/arm: detect arm architecture in cpu flags
  eal/arm: rwlock support for ARM
  gcc/arm: avoid alignment errors to break build
  maintainers: claim responsibility for ARMv7
  acl: check for SSE 4.1 support

Vlastimil Kosar (9):
  mk: Introduce ARMv7 architecture
  eal/arm: atomic operations for ARM
  eal/arm: byte order operations for ARM
  eal/arm: cpu cycle operations for ARM
  eal/arm: prefetch operations for ARM
  eal/arm: spinlock operations for ARM (without HTM)
  eal/arm: vector memcpy for ARM
  eal/arm: cpu flag checks for ARM
  lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86

 MAINTAINERS                                 |   4 +
 app/test/test_cpuflags.c                    |   5 +
 config/defconfig_arm-armv7-a-linuxapp-gcc   |  75 +++++
 lib/librte_acl/Makefile                     |   4 +
 .../common/include/arch/arm/rte_atomic.h    | 256 ++++++++++++++++
 .../common/include/arch/arm/rte_byteorder.h | 148 ++++++++++
 .../common/include/arch/arm/rte_cpuflags.h  | 192 ++++++++++++
 .../common/include/arch/arm/rte_cycles.h    | 121 ++++++++
 .../common/include/arch/arm/rte_memcpy.h    | 325 +++++++++++++++++++++
 .../common/include/arch/arm/rte_prefetch.h  |  61 ++++
 .../common/include/arch/arm/rte_rwlock.h    |  40 +++
 .../common/include/arch/arm/rte_spinlock.h  | 114 ++++++++
 lib/librte_lpm/rte_lpm.h                    |  71 +++++
 mk/arch/arm/rte.vars.mk                     |  39 +++
 mk/machine/armv7-a/rte.vars.mk              |  60 ++++
 mk/rte.cpuflags.mk                          |   6 +
 mk/toolchain/gcc/rte.vars.mk                |   6 +
 17 files changed, 1527 insertions(+)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

-- 
2.6.1