From: Jan Viktorin <viktorin@rehivetech.com>
To: dev@dpdk.org
Cc: Jan Viktorin <viktorin@rehivetech.com>
Subject: [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7)
Date: Sat, 3 Oct 2015 10:58:06 +0200
Message-ID: <cover.1443737626.git.viktorin@rehivetech.com>
Dear DPDK community,
I am proposing a patch series that adds support for the ARMv7
architecture to DPDK. The patch series does not introduce any PMD. It is
possible to compile it, boot it and test it with a virtual PMD (e.g.
pcap). It is rebased on top of v2.1.0.
All but the last two patches (11, 12) are quite straightforward
and mostly based on the ppc_64 architecture. Notes:
* we test on Cortex-A9 (mostly Xilinx Zynq at the moment)
* atomic operations and spinlocks are implemented with GCC intrinsics
* the CPU cycle counter is implemented with clock_gettime because there
is no standard 64-bit counter available
* we have to set -Wno-error to pass the build because quite a lot of
alignment problems are reported (we have not found any real issues
so far)
The last two patches (11, 12) are not meant to be merged into mainline.
They are just a temporary workaround for the two libraries (ACL, LPM)
that rely heavily on SSE. It is not easy to convert the SSE calls to
NEON SIMD operations.
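The idea behind the LPM workaround (a four-address lookup expressed through a scalar bulk lookup instead of SSE) can be sketched roughly as follows. This is a self-contained toy, not the DPDK API: the table layout, the flag value and all toy_* names are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical "entry is valid" flag; the low byte holds the next hop. */
#define TOY_LOOKUP_SUCCESS 0x0100

/* Toy table indexed by the top 8 bits of an IPv4 address. */
struct toy_lpm {
	uint16_t tbl[256];
};

/* Scalar bulk lookup: writes one raw table entry per address. */
static inline void toy_lookup_bulk(const struct toy_lpm *lpm,
		const uint32_t *ips, uint16_t *raw, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		raw[i] = lpm->tbl[ips[i] >> 24];
}

/* Four-wide lookup built on the bulk lookup, with no SIMD involved.
 * Misses are replaced by the caller-supplied default value. */
static inline void toy_lookupx4(const struct toy_lpm *lpm,
		const uint32_t ips[4], uint16_t hop[4], uint16_t defv)
{
	uint16_t raw[4];
	unsigned int i;

	toy_lookup_bulk(lpm, ips, raw, 4);
	for (i = 0; i < 4; i++)
		hop[i] = (raw[i] & TOY_LOOKUP_SUCCESS) ?
			(uint16_t)(raw[i] & 0xff) : defv;
}
```

The four-wide function keeps the same shape for callers, while all the work is done by plain scalar code that builds on any architecture.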
============
It is important to note that the current Linux kernel does not support
huge pages on non-LPAE ARM architectures (Cortex-A9). There is a patch
available on the Internet, but it is not going to be merged for now
(4/2014):
http://thread.gmane.org/gmane.linux.kernel.mm/115788
We ported this patch to 3.18 and it can improve performance. Below are
the results of our tests of several algorithms, showing the reduction in
execution time:
CPU median 3x3 - 0.2 %
NEON median 3x3 - 19.5 %
Random read - 0.0 %
Random write - 6.2 %
Matrix multiplication - 31.0 %
NEON copy - 4.2 %
============
We are working on the PMD + kernel-support part. At the moment, we have
a working PMD for Xilinx Zynq's EMAC. However, it uses some dirty
features, so we have to rethink it a bit before proposing it for
mainline. We have run into some problems during the implementation (some
are already being discussed on the mailing list):
* rte_eth_dev is defined as a PCI device. As ARMs are SoCs with an
integrated on-chip EMAC and an external phyter, we need a different
approach. An ARM computer can have PCI-E, but then you plug in a network
card and use a different kind of driver (this is not very common
at the moment).
* ARM does not have coherent memory for DMA transfers. It is possible to
allocate non-cacheable memory (DMA transfers can be as fast as possible)
but it slows down the payload processing on the CPU. To use cacheable
memory instead, we have to call dma_map/unmap_* in the kernel. A custom
kernel driver is needed and it should not be UIO because UIO is quite
limited (almost non-extendable mmap, no support for custom ioctl and
write).
* We are not going to put the PHY layer into userspace, so it will stay
in the kernel. There is also a need for the CLK control (clock gating)
in the PMD.
Regards
Jan Viktorin
Jan Viktorin (2):
eal/arm: rwlock support for ARM
gcc/arm: avoid alignment errors to break build
Vlastimil Kosar (10):
mk: Introduce ARMv7 architecture
eal/arm: atomic operations for ARM
eal/arm: byte order operations for ARM
eal/arm: cpu cycle operations for ARM
eal/arm: prefetch operations for ARM
eal/arm: spinlock operations for ARM (without HTM)
eal/arm: vector memcpy for ARM
eal/arm: cpu flag checks for ARM
lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on
for-x86
arm: Disable usage of SSE optimized code in librte_acl
app/test/test_cpuflags.c | 5 +
config/defconfig_arm-armv7-a-linuxapp-gcc | 72 ++++++
lib/librte_acl/acl.h | 2 +
lib/librte_acl/rte_acl.c | 8 +-
lib/librte_acl/rte_acl_osdep.h | 2 +
.../common/include/arch/arm/rte_atomic.h | 257 ++++++++++++++++++++
.../common/include/arch/arm/rte_byteorder.h | 148 +++++++++++
.../common/include/arch/arm/rte_cpuflags.h | 169 +++++++++++++
.../common/include/arch/arm/rte_cycles.h | 85 +++++++
.../common/include/arch/arm/rte_memcpy.h | 270 +++++++++++++++++++++
.../common/include/arch/arm/rte_prefetch.h | 61 +++++
.../common/include/arch/arm/rte_rwlock.h | 40 +++
.../common/include/arch/arm/rte_spinlock.h | 114 +++++++++
lib/librte_lpm/rte_lpm.h | 71 ++++++
mk/arch/arm/rte.vars.mk | 39 +++
mk/machine/armv7-a/rte.vars.mk | 60 +++++
mk/rte.cpuflags.mk | 6 +
mk/toolchain/gcc/rte.vars.mk | 6 +
18 files changed, 1414 insertions(+), 1 deletion(-)
create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
create mode 100644 mk/arch/arm/rte.vars.mk
create mode 100644 mk/machine/armv7-a/rte.vars.mk
--
2.5.2
Thread overview: 13+ messages
2015-10-03 8:58 Jan Viktorin [this message]
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 01/12] mk: Introduce ARMv7 architecture Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 02/12] eal/arm: atomic operations for ARM Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 03/12] eal/arm: byte order " Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 04/12] eal/arm: cpu cycle " Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 05/12] eal/arm: prefetch " Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 06/12] eal/arm: spinlock operations for ARM (without HTM) Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 07/12] eal/arm: vector memcpy for ARM Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 08/12] eal/arm: cpu flag checks " Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 09/12] eal/arm: rwlock support " Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 10/12] gcc/arm: avoid alignment errors to break build Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 11/12] lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on for-x86 Jan Viktorin
2015-10-03 8:58 ` [dpdk-dev] [PATCH v1 12/12] arm: Disable usage of SSE optimized code in librte_acl Jan Viktorin