From: Jan Viktorin <viktorin@rehivetech.com>
To: dev@dpdk.org
Cc: Jan Viktorin <viktorin@rehivetech.com>
Subject: [dpdk-dev] [PATCH v1 00/12] Support for ARM(v7)
Date: Sat,  3 Oct 2015 10:58:06 +0200
Message-ID: <cover.1443737626.git.viktorin@rehivetech.com>

Dear DPDK community,

I am proposing a patch series that adds support for the ARMv7
architecture to DPDK. The series does not introduce any PMD. It can be
compiled, booted and tested with a virtual PMD (e.g. pcap). It is
rebased on top of v2.1.0.

All but the last two patches (11, 12) are quite straightforward
and mostly based on the ppc_64 architecture. Notes:

* we test on Cortex-A9 (mostly Xilinx Zynq at the moment)
* atomic operations and spinlocks are implemented with GCC intrinsics
* the CPU cycle counter is implemented with clock_gettime because there
  is no standard 64-bit counter available
* we have to set -Wno-error to get through the build because quite a lot
  of alignment problems are reported (we have not found any real issues
  so far)
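
As a rough illustration of the second and third notes, the sketch below
shows how a cycle-counter substitute and basic locking primitives could
be built from clock_gettime and GCC builtins. It is not the code from
the patches: the sketch_* names, the choice of CLOCK_MONOTONIC, the
nanosecond resolution and the particular __sync builtins are all
assumptions made for the example.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L /* for clock_gettime in strict modes */
#endif
#include <stdint.h>
#include <time.h>

/* Assumption: nanoseconds are reported as "cycles" (nominal 1 GHz). */
#define SKETCH_NS_PER_SEC 1000000000ULL

/* Cycle-counter substitute: monotonic time in nanoseconds. */
static inline uint64_t
sketch_rdtsc(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return 0; /* clock unavailable, report zero */
	return (uint64_t)ts.tv_sec * SKETCH_NS_PER_SEC + (uint64_t)ts.tv_nsec;
}

/* 32-bit compare-and-set via a GCC builtin; returns 1 on success. */
static inline int
sketch_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
{
	return __sync_bool_compare_and_swap(dst, exp, src);
}

/* Minimal spinlock built on GCC builtins (hypothetical type). */
typedef struct {
	volatile int locked;
} sketch_spinlock_t;

static inline void
sketch_spinlock_lock(sketch_spinlock_t *sl)
{
	/* Atomically set the flag; if it was already set, wait and retry. */
	while (__sync_lock_test_and_set(&sl->locked, 1))
		while (sl->locked)
			; /* spin on a plain read until the lock looks free */
}

static inline void
sketch_spinlock_unlock(sketch_spinlock_t *sl)
{
	__sync_lock_release(&sl->locked); /* store 0 with release semantics */
}
```

A timer based on clock_gettime costs more per call than reading a
hardware counter, so code that samples the counter in a tight loop
would presumably feel the difference.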

The last two patches (11, 12) are not to be merged into mainline. They
are just a temporary workaround for the two libraries (ACL, LPM) which
rely heavily on SSE. It is not possible to easily convert the SSE calls
to NEON SIMD operations.
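
To give an idea of the kind of workaround patch 11 describes (a
four-wide lookup expressed through a scalar bulk lookup instead of SSE),
here is a minimal sketch. Everything in it is a hypothetical stand-in:
the toy table, toy_lookup_bulk, toy_lookupx4 and the TOY_LOOKUP_SUCCESS
flag are invented for the example and are neither the librte_lpm API nor
the code in the patch.

```c
#include <stdint.h>

#define TOY_LOOKUP_SUCCESS 0x0100 /* hypothetical "hit" flag in a result */

/* Hypothetical table: direct map keyed by the top 8 bits of an address. */
struct toy_table {
	uint16_t entry[256]; /* TOY_LOOKUP_SUCCESS | next hop, or 0 on miss */
};

/* Scalar bulk lookup: one table read per address, no SIMD involved. */
static inline void
toy_lookup_bulk(const struct toy_table *t, const uint32_t *ips,
		uint16_t *res, unsigned int n)
{
	unsigned int i;

	for (i = 0; i < n; i++)
		res[i] = t->entry[ips[i] >> 24];
}

/* Four-wide lookup built on the bulk call; misses get the default hop. */
static inline void
toy_lookupx4(const struct toy_table *t, const uint32_t ips[4],
	     uint16_t hop[4], uint16_t defv)
{
	uint16_t res[4];
	unsigned int i;

	toy_lookup_bulk(t, ips, res, 4);
	for (i = 0; i < 4; i++)
		hop[i] = (res[i] & TOY_LOOKUP_SUCCESS) ?
			(uint16_t)(res[i] & 0xff) : defv;
}
```

The caller keeps the same four-in, four-out interface, so only the
implementation behind it has to change per architecture.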

============

It is important to note that the current Linux kernel does not support
huge pages on non-LPAE ARM architectures (Cortex-A9). There is a patch
available on the Internet but it is not going to be merged for now
(4/2014):

 http://thread.gmane.org/gmane.linux.kernel.mm/115788

We ported this patch to kernel 3.18 and it can improve performance. The
results below show the reduction in execution time for several
algorithms we tested:

CPU median 3x3        -  0.2 %
NEON median 3x3       - 19.5 %
Random read           -  0.0 %
Random write          -  6.2 %
Matrix multiplication - 31.0 %
NEON copy             -  4.2 %

============

We are working on the PMD + kernel-support part. At the moment, we have
a working PMD for Xilinx Zynq's EMAC. However, it uses some dirty
features, so we have to rethink it a bit before it goes to mainline. We
are facing some problems during the implementation (some are already
being addressed on the mailing list):

* rte_eth_dev is defined as a PCI device. As ARMs are SoCs with an
  integrated EMAC on the chip and an external PHY, we need a different
  approach. There can be an ARM computer with PCI-E, in which case you
  plug in a network card and use a different kind of driver (but this
  is not very common at the moment).
* ARM does not have coherent memory for DMA transfers. It is possible to
  allocate non-cacheable memory (so DMA transfers are as fast as
  possible) but that slows down payload processing on the CPU. To use
  cacheable memory instead, we have to call dma_map/unmap_* in the
  kernel. A custom kernel driver is needed and it should not be UIO
  because UIO is quite limited (almost non-extendable mmap, no support
  for custom ioctl and write).
* We are not going to put the PHY layer into userspace, so it will stay
  in the kernel. There is also a need for the CLK control (clock gating)
  in the PMD.

Regards
Jan Viktorin


Jan Viktorin (2):
  eal/arm: rwlock support for ARM
  gcc/arm: avoid alignment errors to break build

Vlastimil Kosar (10):
  mk: Introduce ARMv7 architecture
  eal/arm: atomic operations for ARM
  eal/arm: byte order operations for ARM
  eal/arm: cpu cycle operations for ARM
  eal/arm: prefetch operations for ARM
  eal/arm: spinlock operations for ARM (without HTM)
  eal/arm: vector memcpy for ARM
  eal/arm: cpu flag checks for ARM
  lpm/arm: implement rte_lpm_lookupx4 using rte_lpm_lookup_bulk on
    for-x86
  arm: Disable usage of SSE optimized code in librte_acl

 app/test/test_cpuflags.c                           |   5 +
 config/defconfig_arm-armv7-a-linuxapp-gcc          |  72 ++++++
 lib/librte_acl/acl.h                               |   2 +
 lib/librte_acl/rte_acl.c                           |   8 +-
 lib/librte_acl/rte_acl_osdep.h                     |   2 +
 .../common/include/arch/arm/rte_atomic.h           | 257 ++++++++++++++++++++
 .../common/include/arch/arm/rte_byteorder.h        | 148 +++++++++++
 .../common/include/arch/arm/rte_cpuflags.h         | 169 +++++++++++++
 .../common/include/arch/arm/rte_cycles.h           |  85 +++++++
 .../common/include/arch/arm/rte_memcpy.h           | 270 +++++++++++++++++++++
 .../common/include/arch/arm/rte_prefetch.h         |  61 +++++
 .../common/include/arch/arm/rte_rwlock.h           |  40 +++
 .../common/include/arch/arm/rte_spinlock.h         | 114 +++++++++
 lib/librte_lpm/rte_lpm.h                           |  71 ++++++
 mk/arch/arm/rte.vars.mk                            |  39 +++
 mk/machine/armv7-a/rte.vars.mk                     |  60 +++++
 mk/rte.cpuflags.mk                                 |   6 +
 mk/toolchain/gcc/rte.vars.mk                       |   6 +
 18 files changed, 1414 insertions(+), 1 deletion(-)
 create mode 100644 config/defconfig_arm-armv7-a-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_rwlock.h
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_spinlock.h
 create mode 100644 mk/arch/arm/rte.vars.mk
 create mode 100644 mk/machine/armv7-a/rte.vars.mk

-- 
2.5.2
