DPDK patches and discussions
* Re: [dpdk-dev] [PATCH v3 01/14] Add compiling definitions for IBM Power architecture
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 01/14] Add compiling definitions for IBM " Chao Zhu
@ 2014-11-23 22:02   ` Neil Horman
  2014-11-25  3:51     ` Chao Zhu
  0 siblings, 1 reply; 31+ messages in thread
From: Neil Horman @ 2014-11-23 22:02 UTC (permalink / raw)
  To: Chao Zhu; +Cc: dev

On Sun, Nov 23, 2014 at 08:22:09PM -0500, Chao Zhu wrote:
> To make DPDK run on the IBM Power architecture, configuration files for
> the Power architecture are added, along with the related .mk files for
> compiling.
> 
> Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
> ---
>  config/common_linuxapp_powerpc              |  394 +++++++++++++++++++++++++++
>  config/defconfig_ppc_64-power8-linuxapp-gcc |   40 +++
>  mk/arch/ppc_64/rte.vars.mk                  |   39 +++
>  mk/machine/power8/rte.vars.mk               |   57 ++++
>  4 files changed, 530 insertions(+), 0 deletions(-)
>  create mode 100644 config/common_linuxapp_powerpc
>  create mode 100644 config/defconfig_ppc_64-power8-linuxapp-gcc
>  create mode 100644 mk/arch/ppc_64/rte.vars.mk
>  create mode 100644 mk/machine/power8/rte.vars.mk
> 
> diff --git a/config/common_linuxapp_powerpc b/config/common_linuxapp_powerpc
> new file mode 100644
> index 0000000..d230a0b
> --- /dev/null
> +++ b/config/common_linuxapp_powerpc
This filename is common_linuxapp_powerpc, but given that it explicitly specifies
all the build options, there isn't really anything common about it.  I think
what you want to do is rename this to defconfig_powerpc-native-linuxapp-gcc,
have it include common_linuxapp, and then change any Power-specific options
as you see fit.
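
Roughly, the renamed file might look like the sketch below (filename taken from
the suggestion above; the override values are only illustrative, picked from the
options this patch already sets):

```make
# defconfig_powerpc-native-linuxapp-gcc (sketch; overrides are illustrative)
#include "common_linuxapp"

CONFIG_RTE_MACHINE="power8"
CONFIG_RTE_ARCH="ppc_64"
CONFIG_RTE_ARCH_PPC_64=y
CONFIG_RTE_TOOLCHAIN="gcc"
CONFIG_RTE_TOOLCHAIN_GCC=y

# Power-specific overrides of the common_linuxapp defaults
CONFIG_RTE_MAX_LCORE=128
CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
CONFIG_RTE_LIBRTE_IXGBE_PMD=n
```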

Also, does BSD build on Power?  I presume so.  You likely want to create a
corresponding BSD Power config as well.

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture
@ 2014-11-24  1:22 Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 01/14] Add compiling definitions for IBM " Chao Zhu
                   ` (14 more replies)
  0 siblings, 15 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

This set of patches adds IBM Power architecture support to DPDK, providing the required
support in the EAL library. The set doesn't yet enable full DPDK functionality on Power
processors, so a separate common configuration file is used for Power to turn off the
functions that haven't been ported yet. To compile on the PPC64 architecture, GCC version
>= 4.8 must be used. This v3 patch set updates eal_memory.c to fix memory zone allocation
and also fixes the compilation problems of test-pmd.

Chao Zhu (14):
  Add compiling definitions for IBM Power architecture
  Add atomic operations for IBM Power architecture
  Add byte order operations for IBM Power architecture
  Add CPU cycle operations for IBM Power architecture
  Add prefetch operation for IBM Power architecture
  Add spinlock operation for IBM Power architecture
  Add vector memcpy for IBM Power architecture
  Add CPU flag checking for IBM Power architecture
  Remove iopl operation for IBM Power architecture
  Add cache size define for IBM Power Architecture
  Add huge page size define for IBM Power architecture
  Add eal memory support for IBM Power Architecture
  test_memzone:fix finding the second smallest segment
  Fix the compiling of test-pmd on IBM Power Architecture

 app/test-pmd/config.c                              |   33 +-
 app/test/test_cpuflags.c                           |   35 ++
 app/test/test_malloc.c                             |    8 +-
 app/test/test_memzone.c                            |  123 ++++++-
 config/common_linuxapp_powerpc                     |  394 +++++++++++++++++++
 config/defconfig_ppc_64-power8-linuxapp-gcc        |   42 ++
 config/defconfig_x86_64-native-linuxapp-clang      |    1 +
 config/defconfig_x86_64-native-linuxapp-gcc        |    1 +
 config/defconfig_x86_64-native-linuxapp-icc        |    1 +
 lib/librte_eal/common/eal_common_memzone.c         |   15 +-
 .../common/include/arch/ppc_64/rte_atomic.h        |  415 ++++++++++++++++++++
 .../common/include/arch/ppc_64/rte_byteorder.h     |  150 +++++++
 .../common/include/arch/ppc_64/rte_cpuflags.h      |  184 +++++++++
 .../common/include/arch/ppc_64/rte_cycles.h        |   86 ++++
 .../common/include/arch/ppc_64/rte_memcpy.h        |  224 +++++++++++
 .../common/include/arch/ppc_64/rte_prefetch.h      |   61 +++
 .../common/include/arch/ppc_64/rte_spinlock.h      |   73 ++++
 lib/librte_eal/common/include/rte_memory.h         |    9 +-
 lib/librte_eal/common/include/rte_memzone.h        |    8 +
 lib/librte_eal/linuxapp/eal/eal.c                  |   13 +-
 lib/librte_eal/linuxapp/eal/eal_memory.c           |   75 +++-
 mk/arch/ppc_64/rte.vars.mk                         |   39 ++
 mk/machine/power8/rte.vars.mk                      |   57 +++
 mk/rte.cpuflags.mk                                 |   17 +
 24 files changed, 2015 insertions(+), 49 deletions(-)
 create mode 100644 config/common_linuxapp_powerpc
 create mode 100644 config/defconfig_ppc_64-power8-linuxapp-gcc
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_cycles.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_spinlock.h
 create mode 100644 mk/arch/ppc_64/rte.vars.mk
 create mode 100644 mk/machine/power8/rte.vars.mk


* [dpdk-dev] [PATCH v3 01/14] Add compiling definitions for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-23 22:02   ` Neil Horman
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 02/14] Add atomic operations " Chao Zhu
                   ` (13 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

To make DPDK run on the IBM Power architecture, configuration files for
the Power architecture are added, along with the related .mk files for
compiling.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 config/common_linuxapp_powerpc              |  394 +++++++++++++++++++++++++++
 config/defconfig_ppc_64-power8-linuxapp-gcc |   40 +++
 mk/arch/ppc_64/rte.vars.mk                  |   39 +++
 mk/machine/power8/rte.vars.mk               |   57 ++++
 4 files changed, 530 insertions(+), 0 deletions(-)
 create mode 100644 config/common_linuxapp_powerpc
 create mode 100644 config/defconfig_ppc_64-power8-linuxapp-gcc
 create mode 100644 mk/arch/ppc_64/rte.vars.mk
 create mode 100644 mk/machine/power8/rte.vars.mk

diff --git a/config/common_linuxapp_powerpc b/config/common_linuxapp_powerpc
new file mode 100644
index 0000000..d230a0b
--- /dev/null
+++ b/config/common_linuxapp_powerpc
@@ -0,0 +1,394 @@
+#   BSD LICENSE
+#
+#   Copyright (C) IBM Corporation 2014.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of IBM Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+#
+# define executive environment
+#
+# CONFIG_RTE_EXEC_ENV can be linuxapp, baremetal, bsdapp
+#
+CONFIG_RTE_EXEC_ENV="linuxapp"
+CONFIG_RTE_EXEC_ENV_LINUXAPP=y
+
+#
+# Use intrinsics or assembly code for key routines
+#
+CONFIG_RTE_FORCE_INTRINSICS=n
+
+#
+# Compile as a shared library
+#
+CONFIG_RTE_BUILD_SHARED_LIB=n
+
+#
+# Combine into one single library
+#
+CONFIG_RTE_BUILD_COMBINE_LIBS=n
+CONFIG_RTE_LIBNAME="powerpc_dpdk"
+
+#
+# Compile libc directory
+#
+CONFIG_RTE_LIBC=n
+
+#
+# Compile newlib as libc from source
+#
+CONFIG_RTE_LIBC_NEWLIB_SRC=n
+
+#
+# Use binary newlib
+#
+CONFIG_RTE_LIBC_NEWLIB_BIN=n
+
+#
+# Use binary newlib
+#
+CONFIG_RTE_LIBC_NETINCS=n
+
+#
+# Compile libgloss (newlib-stubs)
+#
+CONFIG_RTE_LIBGLOSS=n
+
+#
+# Compile Environment Abstraction Layer
+# Note: Power8 has 96 cores, so increase CONFIG_RTE_MAX_LCORE from 64 to 128
+#
+CONFIG_RTE_LIBRTE_EAL=y
+CONFIG_RTE_MAX_LCORE=128
+CONFIG_RTE_MAX_NUMA_NODES=8
+CONFIG_RTE_MAX_MEMSEG=256
+CONFIG_RTE_MAX_MEMZONE=2560
+CONFIG_RTE_MAX_TAILQ=32
+CONFIG_RTE_LOG_LEVEL=8
+CONFIG_RTE_LOG_HISTORY=256
+CONFIG_RTE_LIBEAL_USE_HPET=n
+CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
+CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
+CONFIG_RTE_EAL_IGB_UIO=y
+CONFIG_RTE_EAL_VFIO=y
+
+#
+# Special configurations in PCI Config Space for high performance
+#
+CONFIG_RTE_PCI_CONFIG=n
+CONFIG_RTE_PCI_EXTENDED_TAG=""
+CONFIG_RTE_PCI_MAX_READ_REQUEST_SIZE=0
+
+#
+# Compile Environment Abstraction Layer for linux
+#
+CONFIG_RTE_LIBRTE_EAL_LINUXAPP=y
+
+#
+# Compile Environment Abstraction Layer for Bare metal
+#
+CONFIG_RTE_LIBRTE_EAL_BAREMETAL=n
+
+#
+# Compile Environment Abstraction Layer to support Vmware TSC map
+# Note: Power doesn't have this support
+#
+CONFIG_RTE_LIBRTE_EAL_VMWARE_TSC_MAP_SUPPORT=n
+
+#
+# Compile the argument parser library
+#
+CONFIG_RTE_LIBRTE_KVARGS=y
+
+#
+# Compile generic ethernet library
+#
+CONFIG_RTE_LIBRTE_ETHER=y
+CONFIG_RTE_LIBRTE_ETHDEV_DEBUG=n
+CONFIG_RTE_MAX_ETHPORTS=32
+CONFIG_RTE_LIBRTE_IEEE1588=n
+CONFIG_RTE_ETHDEV_QUEUE_STAT_CNTRS=16
+
+#
+# Support NIC bypass logic
+#
+CONFIG_RTE_NIC_BYPASS=n
+
+#
+# Note: Initially, compilation of all the PMD drivers is turned off on Power.
+# They will be turned on only after successful testing on Power
+#
+
+#
+# Compile burst-oriented IGB & EM PMD drivers
+#
+CONFIG_RTE_LIBRTE_EM_PMD=n
+CONFIG_RTE_LIBRTE_IGB_PMD=n
+CONFIG_RTE_LIBRTE_E1000_DEBUG_INIT=n
+CONFIG_RTE_LIBRTE_E1000_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_E1000_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_E1000_DEBUG_TX_FREE=n
+CONFIG_RTE_LIBRTE_E1000_DEBUG_DRIVER=n
+CONFIG_RTE_LIBRTE_E1000_PF_DISABLE_STRIP_CRC=n
+
+#
+# Compile burst-oriented IXGBE PMD driver
+#
+CONFIG_RTE_LIBRTE_IXGBE_PMD=n
+CONFIG_RTE_LIBRTE_IXGBE_DEBUG_INIT=n
+CONFIG_RTE_LIBRTE_IXGBE_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_IXGBE_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_IXGBE_DEBUG_TX_FREE=n
+CONFIG_RTE_LIBRTE_IXGBE_DEBUG_DRIVER=n
+CONFIG_RTE_LIBRTE_IXGBE_PF_DISABLE_STRIP_CRC=n
+CONFIG_RTE_LIBRTE_IXGBE_RX_ALLOW_BULK_ALLOC=y
+CONFIG_RTE_IXGBE_INC_VECTOR=y
+CONFIG_RTE_IXGBE_RX_OLFLAGS_ENABLE=y
+
+#
+# Compile burst-oriented I40E PMD driver
+#
+CONFIG_RTE_LIBRTE_I40E_PMD=n
+CONFIG_RTE_LIBRTE_I40E_DEBUG_INIT=n
+CONFIG_RTE_LIBRTE_I40E_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_I40E_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_I40E_DEBUG_TX_FREE=n
+CONFIG_RTE_LIBRTE_I40E_DEBUG_DRIVER=n
+CONFIG_RTE_LIBRTE_I40E_RX_ALLOW_BULK_ALLOC=y
+CONFIG_RTE_LIBRTE_I40E_16BYTE_RX_DESC=n
+CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VF=4
+CONFIG_RTE_LIBRTE_I40E_QUEUE_NUM_PER_VM=4
+# interval up to 8160 us, aligned to 2 (or default value)
+CONFIG_RTE_LIBRTE_I40E_ITR_INTERVAL=-1
+
+#
+# Compile burst-oriented VIRTIO PMD driver
+#
+CONFIG_RTE_LIBRTE_VIRTIO_PMD=n
+CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_INIT=n
+CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DRIVER=n
+CONFIG_RTE_LIBRTE_VIRTIO_DEBUG_DUMP=n
+
+#
+# Compile burst-oriented VMXNET3 PMD driver
+#
+CONFIG_RTE_LIBRTE_VMXNET3_PMD=n
+CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_INIT=n
+CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_RX=n
+CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX=n
+CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_TX_FREE=n
+CONFIG_RTE_LIBRTE_VMXNET3_DEBUG_DRIVER=n
+
+#
+# Compile example software rings based PMD
+#
+CONFIG_RTE_LIBRTE_PMD_RING=y
+CONFIG_RTE_PMD_RING_MAX_RX_RINGS=16
+CONFIG_RTE_PMD_RING_MAX_TX_RINGS=16
+
+#
+# Compile software PMD backed by PCAP files
+#
+CONFIG_RTE_LIBRTE_PMD_PCAP=n
+
+#
+# Compile link bonding PMD library
+#
+CONFIG_RTE_LIBRTE_PMD_BOND=n
+
+#
+# Compile Xen PMD
+#
+CONFIG_RTE_LIBRTE_PMD_XENVIRT=n
+
+#
+# Do prefetch of packet data within PMD driver receive function
+#
+CONFIG_RTE_PMD_PACKET_PREFETCH=y
+
+#
+# Compile librte_ring
+#
+CONFIG_RTE_LIBRTE_RING=y
+CONFIG_RTE_LIBRTE_RING_DEBUG=n
+CONFIG_RTE_RING_SPLIT_PROD_CONS=n
+
+#
+# Compile librte_mempool
+#
+CONFIG_RTE_LIBRTE_MEMPOOL=y
+CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE=512
+CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=n
+
+#
+# Compile librte_mbuf
+#
+CONFIG_RTE_LIBRTE_MBUF=y
+CONFIG_RTE_LIBRTE_MBUF_DEBUG=n
+CONFIG_RTE_MBUF_REFCNT=y
+CONFIG_RTE_MBUF_REFCNT_ATOMIC=y
+CONFIG_RTE_PKTMBUF_HEADROOM=128
+
+#
+# Compile librte_timer
+#
+CONFIG_RTE_LIBRTE_TIMER=y
+CONFIG_RTE_LIBRTE_TIMER_DEBUG=n
+
+#
+# Compile librte_malloc
+#
+CONFIG_RTE_LIBRTE_MALLOC=y
+CONFIG_RTE_LIBRTE_MALLOC_DEBUG=n
+CONFIG_RTE_MALLOC_MEMZONE_SIZE=11M
+
+#
+# Compile librte_cfgfile
+#
+CONFIG_RTE_LIBRTE_CFGFILE=y
+
+#
+# Compile librte_cmdline
+#
+CONFIG_RTE_LIBRTE_CMDLINE=y
+CONFIG_RTE_LIBRTE_CMDLINE_DEBUG=n
+
+#
+# Compile librte_hash
+#
+CONFIG_RTE_LIBRTE_HASH=y
+CONFIG_RTE_LIBRTE_HASH_DEBUG=n
+
+#
+# Compile librte_lpm
+#
+CONFIG_RTE_LIBRTE_LPM=n
+CONFIG_RTE_LIBRTE_LPM_DEBUG=n
+
+#
+# Compile librte_acl
+#
+CONFIG_RTE_LIBRTE_ACL=n
+CONFIG_RTE_LIBRTE_ACL_DEBUG=n
+CONFIG_RTE_LIBRTE_ACL_STANDALONE=n
+
+#
+# Compile librte_power
+#
+CONFIG_RTE_LIBRTE_POWER=y
+CONFIG_RTE_LIBRTE_POWER_DEBUG=n
+CONFIG_RTE_MAX_LCORE_FREQS=64
+
+#
+# Compile librte_net
+#
+CONFIG_RTE_LIBRTE_NET=y
+
+#
+# Compile librte_ip_frag
+#
+CONFIG_RTE_LIBRTE_IP_FRAG=y
+CONFIG_RTE_LIBRTE_IP_FRAG_DEBUG=n
+CONFIG_RTE_LIBRTE_IP_FRAG_MAX_FRAG=4
+CONFIG_RTE_LIBRTE_IP_FRAG_TBL_STAT=n
+
+#
+# Compile librte_meter
+#
+CONFIG_RTE_LIBRTE_METER=y
+
+#
+# Compile librte_sched
+#
+CONFIG_RTE_LIBRTE_SCHED=n
+CONFIG_RTE_SCHED_RED=n
+CONFIG_RTE_SCHED_COLLECT_STATS=n
+CONFIG_RTE_SCHED_SUBPORT_TC_OV=n
+CONFIG_RTE_SCHED_PORT_N_GRINDERS=8
+
+#
+# Compile the distributor library
+#
+CONFIG_RTE_LIBRTE_DISTRIBUTOR=y
+
+#
+# Compile librte_port
+#
+CONFIG_RTE_LIBRTE_PORT=n
+
+#
+# Compile librte_table
+#
+CONFIG_RTE_LIBRTE_TABLE=n
+
+#
+# Compile librte_pipeline
+#
+CONFIG_RTE_LIBRTE_PIPELINE=n
+
+#
+# Compile librte_kni
+#
+CONFIG_RTE_LIBRTE_KNI=y
+CONFIG_RTE_KNI_KO_DEBUG=n
+CONFIG_RTE_KNI_VHOST=n
+CONFIG_RTE_KNI_VHOST_MAX_CACHE_SIZE=1024
+CONFIG_RTE_KNI_VHOST_VNET_HDR_EN=n
+CONFIG_RTE_KNI_VHOST_DEBUG_RX=n
+CONFIG_RTE_KNI_VHOST_DEBUG_TX=n
+
+#
+# Compile vhost library
+# fuse-devel is needed to run vhost.
+# fuse-devel enables user space char driver development
+#
+CONFIG_RTE_LIBRTE_VHOST=n
+CONFIG_RTE_LIBRTE_VHOST_DEBUG=n
+
+#
+# Compile Xen domain0 support
+#
+CONFIG_RTE_LIBRTE_XEN_DOM0=n
+
+#
+# Enable warning directives
+#
+CONFIG_RTE_INSECURE_FUNCTION_WARNING=n
+
+#
+# Compile the test application
+#
+CONFIG_RTE_APP_TEST=y
+
+#
+# Compile the PMD test application
+#
+CONFIG_RTE_TEST_PMD=n
+CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=n
+CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
new file mode 100644
index 0000000..97d72ff
--- /dev/null
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -0,0 +1,40 @@
+#   BSD LICENSE
+#
+#   Copyright (C) IBM Corporation 2014.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of IBM Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#include "common_linuxapp_powerpc"
+
+CONFIG_RTE_MACHINE="power8"
+
+CONFIG_RTE_ARCH="ppc_64"
+CONFIG_RTE_ARCH_PPC_64=y
+
+CONFIG_RTE_TOOLCHAIN="gcc"
+CONFIG_RTE_TOOLCHAIN_GCC=y
+
diff --git a/mk/arch/ppc_64/rte.vars.mk b/mk/arch/ppc_64/rte.vars.mk
new file mode 100644
index 0000000..363fcd1
--- /dev/null
+++ b/mk/arch/ppc_64/rte.vars.mk
@@ -0,0 +1,39 @@
+#   BSD LICENSE
+#
+#   Copyright (C) IBM Corporation 2014.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of IBM Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+ARCH  ?= powerpc
+CROSS ?=
+
+CPU_CFLAGS  ?= -m64
+CPU_LDFLAGS ?=
+CPU_ASFLAGS ?= -felf64
+
+export ARCH CROSS CPU_CFLAGS CPU_LDFLAGS CPU_ASFLAGS
diff --git a/mk/machine/power8/rte.vars.mk b/mk/machine/power8/rte.vars.mk
new file mode 100644
index 0000000..05dccf4
--- /dev/null
+++ b/mk/machine/power8/rte.vars.mk
@@ -0,0 +1,57 @@
+#   BSD LICENSE
+#
+#   Copyright (C) IBM Corporation 2014.
+#
+#   Redistribution and use in source and binary forms, with or without
+#   modification, are permitted provided that the following conditions
+#   are met:
+#
+#     * Redistributions of source code must retain the above copyright
+#       notice, this list of conditions and the following disclaimer.
+#     * Redistributions in binary form must reproduce the above copyright
+#       notice, this list of conditions and the following disclaimer in
+#       the documentation and/or other materials provided with the
+#       distribution.
+#     * Neither the name of IBM Corporation nor the names of its
+#       contributors may be used to endorse or promote products derived
+#       from this software without specific prior written permission.
+#
+#   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+#   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+#   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+#   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+#   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+#   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+#   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+#   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+#   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+#   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+#   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+#
+# machine:
+#
+#   - can define ARCH variable (overridden by cmdline value)
+#   - can define CROSS variable (overridden by cmdline value)
+#   - define MACHINE_CFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_LDFLAGS variable (overridden by cmdline value)
+#   - define MACHINE_ASFLAGS variable (overridden by cmdline value)
+#   - can define CPU_CFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_LDFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - can define CPU_ASFLAGS variable (overridden by cmdline value) that
+#     overrides the one defined in arch.
+#   - may override any previously defined variable
+#
+
+# ARCH =
+# CROSS =
+# MACHINE_CFLAGS =
+# MACHINE_LDFLAGS =
+# MACHINE_ASFLAGS =
+# CPU_CFLAGS =
+# CPU_LDFLAGS =
+# CPU_ASFLAGS =
+
+MACHINE_CFLAGS = 
-- 
1.7.1


* [dpdk-dev] [PATCH v3 02/14] Add atomic operations for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 01/14] Add compiling definitions for IBM " Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 03/14] Add byte order " Chao Zhu
                   ` (12 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

This patch adds an architecture-specific atomic operations file for IBM
Power architecture CPUs.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 .../common/include/arch/ppc_64/rte_atomic.h        |  415 ++++++++++++++++++++
 1 files changed, 415 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
new file mode 100644
index 0000000..9c69935
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_atomic.h
@@ -0,0 +1,415 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+/*
+ * Inspired from FreeBSD src/sys/powerpc/include/atomic.h
+ * Copyright (c) 2008 Marcel Moolenaar
+ * Copyright (c) 2001 Benno Rice
+ * Copyright (c) 2001 David E. O'Brien
+ * Copyright (c) 1998 Doug Rabson
+ * All rights reserved.
+ */
+
+#ifndef _RTE_ATOMIC_PPC_64_H_
+#define _RTE_ATOMIC_PPC_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_atomic.h"
+
+/**
+ * General memory barrier.
+ *
+ * Guarantees that the LOAD and STORE operations generated before the
+ * barrier occur before the LOAD and STORE operations generated after.
+ */
+#define	rte_mb()  asm volatile("sync" : : : "memory")
+
+/**
+ * Write memory barrier.
+ *
+ * Guarantees that the STORE operations generated before the barrier
+ * occur before the STORE operations generated after.
+ */
+#define	rte_wmb() asm volatile("sync" : : : "memory")
+
+/**
+ * Read memory barrier.
+ *
+ * Guarantees that the LOAD operations generated before the barrier
+ * occur before the LOAD operations generated after.
+ */
+#define	rte_rmb() asm volatile("sync" : : : "memory")
+
+/*------------------------- 16 bit atomic operations -------------------------*/
+/* To be compatible with Power7, use GCC built-in functions for 16 bit operations */
+
+#ifndef RTE_FORCE_INTRINSICS
+static inline int
+rte_atomic16_cmpset(volatile uint16_t *dst, uint16_t exp, uint16_t src)
+{
+	return __atomic_compare_exchange(dst, &exp, &src, 0, __ATOMIC_ACQUIRE, __ATOMIC_ACQUIRE) ? 1 : 0;
+}
+
+static inline int rte_atomic16_test_and_set(rte_atomic16_t *v)
+{
+	return rte_atomic16_cmpset((volatile uint16_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic16_inc(rte_atomic16_t *v)
+{
+	__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline void
+rte_atomic16_dec(rte_atomic16_t *v)
+{
+	__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE);
+}
+
+static inline int rte_atomic16_inc_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_add_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+static inline int rte_atomic16_dec_and_test(rte_atomic16_t *v)
+{
+	return (__atomic_sub_fetch(&v->cnt, 1, __ATOMIC_ACQUIRE) == 0);
+}
+
+/*------------------------- 32 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic32_cmpset(volatile uint32_t *dst, uint32_t exp, uint32_t src)
+{
+	unsigned int ret = 0;
+
+	asm volatile(
+			"\tlwsync\n"
+			"1:\tlwarx %[ret], 0, %[dst]\n"
+			"cmplw %[exp], %[ret]\n"
+			"bne 2f\n"
+			"stwcx. %[src], 0, %[dst]\n"
+			"bne- 1b\n"
+			"li %[ret], 1\n"
+			"b 3f\n"
+			"2:\n"
+			"stwcx. %[ret], 0, %[dst]\n"
+			"li %[ret], 0\n"
+			"3:\n"
+			"isync\n"
+			: [ret] "=&r" (ret), "=m" (*dst)
+			: [dst] "r" (dst), [exp] "r" (exp), [src] "r" (src), "m" (*dst)
+			: "cc", "memory");
+
+	return ret;
+}
+
+static inline int rte_atomic32_test_and_set(rte_atomic32_t *v)
+{
+	return rte_atomic32_cmpset((volatile uint32_t *)&v->cnt, 0, 1);
+}
+
+static inline void
+rte_atomic32_inc(rte_atomic32_t *v)
+{
+	int t;
+
+	asm volatile(
+			"1: lwarx %[t],0,%[cnt]\n"
+			"addic %[t],%[t],1\n"
+			"stwcx. %[t],0,%[cnt]\n"
+			"bne- 1b\n"
+			: [t] "=&r" (t), "=m" (v->cnt)
+			: [cnt] "r" (&v->cnt), "m" (v->cnt)
+			: "cc", "xer", "memory");
+}
+
+static inline void
+rte_atomic32_dec(rte_atomic32_t *v)
+{
+	int t;
+
+	asm volatile(
+			"1: lwarx %[t],0,%[cnt]\n"
+			"addic %[t],%[t],-1\n"
+			"stwcx. %[t],0,%[cnt]\n"
+			"bne- 1b\n"
+			: [t] "=&r" (t), "=m" (v->cnt)
+			: [cnt] "r" (&v->cnt), "m" (v->cnt)
+			: "cc", "xer", "memory");
+}
+
+static inline int rte_atomic32_inc_and_test(rte_atomic32_t *v)
+{
+	int ret;
+
+	asm volatile(
+			"\n\tlwsync\n"
+			"1: lwarx %[ret],0,%[cnt]\n"
+			"addic	%[ret],%[ret],1\n"
+			"stwcx. %[ret],0,%[cnt]\n"
+			"bne- 1b\n"
+			"isync\n"
+			: [ret] "=&r" (ret)
+			: [cnt] "r" (&v->cnt)
+			: "cc", "xer", "memory");
+
+	return (ret == 0);
+}
+
+static inline int rte_atomic32_dec_and_test(rte_atomic32_t *v)
+{
+	int ret;
+
+	asm volatile(
+			"\n\tlwsync\n"
+			"1: lwarx %[ret],0,%[cnt]\n"
+			"addic %[ret],%[ret],-1\n"
+			"stwcx. %[ret],0,%[cnt]\n"
+			"bne- 1b\n"
+			"isync\n"
+			: [ret] "=&r" (ret)
+			: [cnt] "r" (&v->cnt)
+			: "cc", "xer", "memory");
+
+	return (ret == 0);
+}
+/*------------------------- 64 bit atomic operations -------------------------*/
+
+static inline int
+rte_atomic64_cmpset(volatile uint64_t *dst, uint64_t exp, uint64_t src)
+{
+	unsigned int ret = 0;
+
+	asm volatile (
+			"\tlwsync\n"
+			"1: ldarx %[ret], 0, %[dst]\n"
+			"cmpld %[exp], %[ret]\n"
+			"bne 2f\n"
+			"stdcx. %[src], 0, %[dst]\n"
+			"bne- 1b\n"
+			"li %[ret], 1\n"
+			"b 3f\n"
+			"2:\n"
+			"stdcx. %[ret], 0, %[dst]\n"
+			"li %[ret], 0\n"
+			"3:\n"
+			"isync\n"
+			: [ret] "=&r" (ret), "=m" (*dst)
+			: [dst] "r" (dst), [exp] "r" (exp), [src] "r" (src), "m" (*dst)
+			: "cc", "memory");
+	return ret;
+}
+
+static inline void
+rte_atomic64_init(rte_atomic64_t *v)
+{
+	v->cnt = 0;
+}
+
+static inline int64_t
+rte_atomic64_read(rte_atomic64_t *v)
+{
+	long ret;
+
+	asm volatile("ld%U1%X1 %[ret],%[cnt]" : [ret] "=r"(ret) : [cnt] "m"(v->cnt));
+
+	return ret;
+}
+
+static inline void
+rte_atomic64_set(rte_atomic64_t *v, int64_t new_value)
+{
+	asm volatile("std%U0%X0 %[new_value],%[cnt]" : [cnt] "=m"(v->cnt) : [new_value] "r"(new_value));
+}
+
+static inline void
+rte_atomic64_add(rte_atomic64_t *v, int64_t inc)
+{
+	long t;
+
+	asm volatile(
+			"1: ldarx %[t],0,%[cnt]\n"
+			"add %[t],%[inc],%[t]\n"
+			"stdcx. %[t],0,%[cnt]\n"
+			"bne- 1b\n"
+			: [t] "=&r" (t), "=m" (v->cnt)
+			: [cnt] "r" (&v->cnt), [inc] "r" (inc), "m" (v->cnt)
+			: "cc", "memory");
+}
+
+static inline void
+rte_atomic64_sub(rte_atomic64_t *v, int64_t dec)
+{
+	long t;
+
+	asm volatile(
+			"1: ldarx %[t],0,%[cnt]\n"
+			"subf %[t],%[dec],%[t]\n"
+			"stdcx. %[t],0,%[cnt]\n"
+			"bne- 1b\n"
+			: [t] "=&r" (t), "+m" (v->cnt)
+			: [cnt] "r" (&v->cnt), [dec] "r" (dec), "m" (v->cnt)
+			: "cc", "memory");
+}
+
+static inline void
+rte_atomic64_inc(rte_atomic64_t *v)
+{
+	long t;
+
+	asm volatile(
+			"1: ldarx %[t],0,%[cnt]\n"
+			"addic %[t],%[t],1\n"
+			"stdcx. %[t],0,%[cnt]\n"
+			"bne- 1b\n"
+			: [t] "=&r" (t), "+m" (v->cnt)
+			: [cnt] "r" (&v->cnt), "m" (v->cnt)
+			: "cc", "xer", "memory");
+}
+
+static inline void
+rte_atomic64_dec(rte_atomic64_t *v)
+{
+	long t;
+
+	asm volatile(
+			"1: ldarx %[t],0,%[cnt]\n"
+			"addic %[t],%[t],-1\n"
+			"stdcx. %[t],0,%[cnt]\n"
+			"bne- 1b\n"
+			: [t] "=&r" (t), "+m" (v->cnt)
+			: [cnt] "r" (&v->cnt), "m" (v->cnt)
+			: "cc", "xer", "memory");
+}
+
+static inline int64_t
+rte_atomic64_add_return(rte_atomic64_t *v, int64_t inc)
+{
+	long ret;
+
+	asm volatile(
+			"\n\tlwsync\n"
+			"1: ldarx %[ret],0,%[cnt]\n"
+			"add %[ret],%[inc],%[ret]\n"
+			"stdcx. %[ret],0,%[cnt]\n"
+			"bne- 1b\n"
+			"isync\n"
+			: [ret] "=&r" (ret)
+			: [inc] "r" (inc), [cnt] "r" (&v->cnt)
+			: "cc", "memory");
+
+	return ret;
+}
+
+static inline int64_t
+rte_atomic64_sub_return(rte_atomic64_t *v, int64_t dec)
+{
+	long ret;
+
+	asm volatile(
+			"\n\tlwsync\n"
+			"1: ldarx %[ret],0,%[cnt]\n"
+			"subf %[ret],%[dec],%[ret]\n"
+			"stdcx. %[ret],0,%[cnt]\n"
+			"bne- 1b\n"
+			"isync\n"
+			: [ret] "=&r" (ret)
+			: [dec] "r" (dec), [cnt] "r" (&v->cnt)
+			: "cc", "memory");
+
+	return ret;
+}
+
+static inline int rte_atomic64_inc_and_test(rte_atomic64_t *v)
+{
+	long ret;
+
+	asm volatile(
+			"\n\tlwsync\n"
+			"1: ldarx %[ret],0,%[cnt]\n"
+			"addic %[ret],%[ret],1\n"
+			"stdcx. %[ret],0,%[cnt]\n"
+			"bne- 1b\n"
+			"isync\n"
+			: [ret] "=&r" (ret)
+			: [cnt] "r" (&v->cnt)
+			: "cc", "xer", "memory");
+
+	return (ret == 0);
+}
+
+static inline int rte_atomic64_dec_and_test(rte_atomic64_t *v)
+{
+	long ret;
+
+	asm volatile(
+			"\n\tlwsync\n"
+			"1: ldarx %[ret],0,%[cnt]\n"
+			"addic %[ret],%[ret],-1\n"
+			"stdcx. %[ret],0,%[cnt]\n"
+			"bne- 1b\n"
+			"isync\n"
+			: [ret] "=&r" (ret)
+			: [cnt] "r" (&v->cnt)
+			: "cc", "xer", "memory");
+
+	return (ret == 0);
+}
+
+static inline int rte_atomic64_test_and_set(rte_atomic64_t *v)
+{
+	return rte_atomic64_cmpset((volatile uint64_t *)&v->cnt, 0, 1);
+}
+
+/**
+ * Atomically set a 64-bit counter to 0.
+ *
+ * @param v
+ *   A pointer to the atomic counter.
+ */
+static inline void rte_atomic64_clear(rte_atomic64_t *v)
+{
+	v->cnt = 0;
+}
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_ATOMIC_PPC_64_H_ */
+
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread
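[Editorial aside: the lwsync/ldarx/stdcx./isync sequence in the patch above implements a 64-bit compare-and-set. As a sanity check of the intended contract — not of the Power asm itself — the same semantics can be expressed with a GCC builtin; `cmpset64_ref` is an illustrative name, not part of the patch.]

```c
#include <stdint.h>

/* Reference semantics of rte_atomic64_cmpset(): if *dst == exp, store src
 * and return non-zero; otherwise leave *dst untouched and return 0.  The
 * GCC builtin provides the full barrier the lwsync/isync pair encodes. */
static int
cmpset64_ref(volatile uint64_t *dst, uint64_t exp, uint64_t src)
{
	return __sync_bool_compare_and_swap(dst, exp, src);
}
```

Any replacement asm must keep exactly this hit/miss behaviour, including leaving `*dst` unchanged on a mismatch.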

* [dpdk-dev] [PATCH v3 03/14] Add byte order operations for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 01/14] Add compiling definations for IBM " Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 02/14] Add atomic operations " Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  8:11   ` Qiu, Michael
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 04/14] Add CPU cycle " Chao Zhu
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

This patch adds architecture-specific byte order operations for the IBM
Power architecture. The Power architecture supports both big-endian and
little-endian modes. This patch also adds an RTE_ARCH_BIG_ENDIAN macro.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 config/defconfig_ppc_64-power8-linuxapp-gcc        |    1 +
 .../common/include/arch/ppc_64/rte_byteorder.h     |  150 ++++++++++++++++++++
 2 files changed, 151 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h

diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index 97d72ff..b10f60c 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -34,6 +34,7 @@ CONFIG_RTE_MACHINE="power8"
 
 CONFIG_RTE_ARCH="ppc_64"
 CONFIG_RTE_ARCH_PPC_64=y
+CONFIG_RTE_ARCH_BIG_ENDIAN=y
 
 CONFIG_RTE_TOOLCHAIN="gcc"
 CONFIG_RTE_TOOLCHAIN_GCC=y
diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h b/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
new file mode 100644
index 0000000..a593e8a
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
@@ -0,0 +1,150 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+/* Inspired from FreeBSD src/sys/powerpc/include/endian.h
+ * Copyright (c) 1987, 1991, 1993
+ * The Regents of the University of California.  All rights reserved.
+*/
+
+#ifndef _RTE_BYTEORDER_PPC_64_H_
+#define _RTE_BYTEORDER_PPC_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_byteorder.h"
+
+/*
+ * An architecture-optimized byte swap for a 16-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap16().
+ */
+static inline uint16_t rte_arch_bswap16(uint16_t _x)
+{
+	return ((_x >> 8) | ((_x << 8) & 0xff00));
+}
+
+/*
+ * An architecture-optimized byte swap for a 32-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap32().
+ */
+static inline uint32_t rte_arch_bswap32(uint32_t _x)
+{
+	return ((_x >> 24) | ((_x >> 8) & 0xff00) | ((_x << 8) & 0xff0000) |
+		((_x << 24) & 0xff000000));
+}
+
+/*
+ * An architecture-optimized byte swap for a 64-bit value.
+ *
+ * Do not use this function directly. The preferred function is rte_bswap64().
+ */
+/* 64-bit mode */
+static inline uint64_t rte_arch_bswap64(uint64_t _x)
+{
+	return ((_x >> 56) | ((_x >> 40) & 0xff00) | ((_x >> 24) & 0xff0000) |
+		((_x >> 8) & 0xff000000) | ((_x << 8) & (0xffULL << 32)) |
+		((_x << 24) & (0xffULL << 40)) |
+		((_x << 40) & (0xffULL << 48)) | ((_x << 56)));
+}
+
+#ifndef RTE_FORCE_INTRINSICS
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+
+#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap32(x) :		\
+				   rte_arch_bswap32(x)))
+
+#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap64(x) :		\
+				   rte_arch_bswap64(x)))
+#else
+/*
+ * __builtin_bswap16 is only available in GCC 4.8 and later
+ */
+#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
+#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
+				   rte_constant_bswap16(x) :		\
+				   rte_arch_bswap16(x)))
+#endif
+#endif
+
+/* Power 8 supports both little-endian and big-endian modes;
+ * Power 7 supports big endian only
+ */
+#ifndef RTE_ARCH_BIG_ENDIAN
+
+#define rte_cpu_to_le_16(x) (x)
+#define rte_cpu_to_le_32(x) (x)
+#define rte_cpu_to_le_64(x) (x)
+
+#define rte_cpu_to_be_16(x) rte_bswap16(x)
+#define rte_cpu_to_be_32(x) rte_bswap32(x)
+#define rte_cpu_to_be_64(x) rte_bswap64(x)
+
+#define rte_le_to_cpu_16(x) (x)
+#define rte_le_to_cpu_32(x) (x)
+#define rte_le_to_cpu_64(x) (x)
+
+#define rte_be_to_cpu_16(x) rte_bswap16(x)
+#define rte_be_to_cpu_32(x) rte_bswap32(x)
+#define rte_be_to_cpu_64(x) rte_bswap64(x)
+
+#else
+
+#define rte_cpu_to_le_16(x) rte_bswap16(x)
+#define rte_cpu_to_le_32(x) rte_bswap32(x)
+#define rte_cpu_to_le_64(x) rte_bswap64(x)
+
+#define rte_cpu_to_be_16(x) (x)
+#define rte_cpu_to_be_32(x) (x)
+#define rte_cpu_to_be_64(x) (x)
+
+#define rte_le_to_cpu_16(x) rte_bswap16(x)
+#define rte_le_to_cpu_32(x) rte_bswap32(x)
+#define rte_le_to_cpu_64(x) rte_bswap64(x)
+
+#define rte_be_to_cpu_16(x) (x)
+#define rte_be_to_cpu_32(x) (x)
+#define rte_be_to_cpu_64(x) (x)
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_BYTEORDER_PPC_64_H_ */
+
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread
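[Editorial aside: the shift/mask arithmetic in rte_arch_bswap32()/rte_arch_bswap64() above is easy to get off by one nibble. A host-side reference — illustrative helper names, not part of the patch — makes the math checkable on any architecture.]

```c
#include <stdint.h>

/* Reference versions of the patch's byte swaps, for checking the
 * shift/mask constants in rte_arch_bswap32()/rte_arch_bswap64(). */
static uint32_t
bswap32_ref(uint32_t x)
{
	return (x >> 24) | ((x >> 8) & 0xff00) |
	       ((x << 8) & 0xff0000) | (x << 24);
}

/* A 64-bit swap is two 32-bit swaps with the halves exchanged. */
static uint64_t
bswap64_ref(uint64_t x)
{
	return ((uint64_t)bswap32_ref((uint32_t)x) << 32) |
	       bswap32_ref((uint32_t)(x >> 32));
}
```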

* [dpdk-dev] [PATCH v3 04/14] Add CPU cycle operations for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (2 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 03/14] Add byte order " Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 05/14] Add prefetch operation " Chao Zhu
                   ` (10 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

The IBM Power architecture doesn't have a TSC register for reading CPU
cycles. This patch reads the time base register on IBM Power instead of
the TSC register used on x86.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 .../common/include/arch/ppc_64/rte_cycles.h        |   86 ++++++++++++++++++++
 1 files changed, 86 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_cycles.h

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_cycles.h b/lib/librte_eal/common/include/arch/ppc_64/rte_cycles.h
new file mode 100644
index 0000000..ed66b48
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_cycles.h
@@ -0,0 +1,86 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_CYCLES_PPC_64_H_
+#define _RTE_CYCLES_PPC_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_cycles.h"
+
+/**
+ * Read the time base register.
+ *
+ * @return
+ *   The time base for this lcore.
+ */
+static inline uint64_t
+rte_rdtsc(void)
+{
+	union {
+		uint64_t tsc_64;
+		struct {
+			uint32_t hi_32;
+			uint32_t lo_32;
+		};
+	} tsc;
+	uint32_t tmp;
+	asm volatile(
+			"0:\n"
+			"mftbu   %[hi32]\n"
+			"mftb    %[lo32]\n"
+			"mftbu   %[tmp]\n"
+			"cmpw    %[tmp],%[hi32]\n"
+			"bne     0b\n"
+			: [hi32] "=r"(tsc.hi_32), [lo32] "=r"(tsc.lo_32), [tmp] "=r"(tmp)
+		    );
+	return tsc.tsc_64;
+}
+
+static inline uint64_t
+rte_rdtsc_precise(void)
+{
+	rte_mb();
+	return rte_rdtsc();
+}
+
+static inline uint64_t
+rte_get_tsc_cycles(void) { return rte_rdtsc(); }
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CYCLES_PPC_64_H_ */
+
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread
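[Editorial aside: the mftbu/mftb/mftbu loop in rte_rdtsc() above re-reads the upper half to catch a carry out of the lower 32 bits between the two reads. The re-read discipline itself is architecture-neutral; a sketch over a simulated split counter — `fake_tb`, `read_hi`, `read_lo` are stand-ins for the SPR reads, not DPDK names.]

```c
#include <stdint.h>

/* Simulated 64-bit time base; read_lo() ticks it once per call so the
 * wrap-around case is actually exercised. */
static uint64_t fake_tb;

static uint32_t read_hi(void) { return (uint32_t)(fake_tb >> 32); }
static uint32_t read_lo(void) { fake_tb++; return (uint32_t)fake_tb; }

/* Same pattern as rte_rdtsc(): if the high word changed while the low
 * word was read, a carry slipped in between the reads -- retry. */
static uint64_t
read_tb64(void)
{
	uint32_t hi, lo, tmp;

	do {
		hi  = read_hi();
		lo  = read_lo();
		tmp = read_hi();
	} while (tmp != hi);

	return ((uint64_t)hi << 32) | lo;
}
```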

* [dpdk-dev] [PATCH v3 05/14] Add prefetch operation for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (3 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 04/14] Add CPU cycle " Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 06/14] Add spinlock " Chao Zhu
                   ` (9 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

This patch adds architecture-specific prefetch operations for the IBM
Power architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 .../common/include/arch/ppc_64/rte_prefetch.h      |   61 ++++++++++++++++++++
 1 files changed, 61 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h b/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h
new file mode 100644
index 0000000..9df0d13
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_prefetch.h
@@ -0,0 +1,61 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_PREFETCH_PPC_64_H_
+#define _RTE_PREFETCH_PPC_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_prefetch.h"
+
+static inline void rte_prefetch0(volatile void *p)
+{
+	asm volatile ("dcbt 0,%[p],1" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch1(volatile void *p)
+{
+	asm volatile ("dcbt 0,%[p],1" : : [p] "r" (p));
+}
+
+static inline void rte_prefetch2(volatile void *p)
+{
+	asm volatile ("dcbt 0,%[p],1" : : [p] "r" (p));
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_PREFETCH_PPC_64_H_ */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread
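[Editorial aside: all three hints above map to the same `dcbt` form, which is worth a reviewer's note, but the intended call pattern is the usual one — prefetch a few elements ahead of the one being processed. A portable sketch using `__builtin_prefetch` as a stand-in for rte_prefetch0(); the lookahead distance of 8 is an illustrative tuning choice, not a DPDK constant.]

```c
#include <stddef.h>

/* Touch a cache line a fixed distance ahead of the element being summed,
 * so the load has (ideally) completed by the time the loop reaches it. */
static long
sum_with_prefetch(const long *a, size_t n)
{
	long s = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (i + 8 < n)
			__builtin_prefetch(&a[i + 8], 0 /* read */, 3 /* keep */);
		s += a[i];
	}
	return s;
}
```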

* [dpdk-dev] [PATCH v3 06/14] Add spinlock operation for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (4 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 05/14] Add prefetch operation " Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 07/14] Add vector memcpy " Chao Zhu
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

This patch adds spinlock operations for IBM Power architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 .../common/include/arch/ppc_64/rte_spinlock.h      |   73 ++++++++++++++++++++
 1 files changed, 73 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_spinlock.h

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_spinlock.h b/lib/librte_eal/common/include/arch/ppc_64/rte_spinlock.h
new file mode 100644
index 0000000..ba028fe
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_spinlock.h
@@ -0,0 +1,73 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_SPINLOCK_PPC_64_H_
+#define _RTE_SPINLOCK_PPC_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <rte_common.h>
+#include "generic/rte_spinlock.h"
+
+/* Fixme: Use intrinsics to implement the spinlock on Power architecture */
+
+#ifndef RTE_FORCE_INTRINSICS
+
+static inline void
+rte_spinlock_lock(rte_spinlock_t *sl)
+{
+	while (__sync_lock_test_and_set(&sl->locked, 1))
+		while (sl->locked)
+			rte_pause();
+}
+
+static inline void
+rte_spinlock_unlock(rte_spinlock_t *sl)
+{
+	__sync_lock_release(&sl->locked);
+}
+
+static inline int
+rte_spinlock_trylock(rte_spinlock_t *sl)
+{
+	return (__sync_lock_test_and_set(&sl->locked, 1) == 0);
+}
+
+#endif
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_SPINLOCK_PPC_64_H_ */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread
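[Editorial aside: the lock loop above is a test-and-test-and-set — after a failed atomic exchange it spins on a plain load of `sl->locked`, so waiters generate cache-line reads rather than contended stores. The same shape with GCC builtins; `lock_t` and the function names are illustrative, and the single-int layout is an assumption matching the generic DPDK spinlock.]

```c
/* Stand-in for rte_spinlock_t: 0 = free, 1 = held. */
typedef struct { volatile int locked; } lock_t;

static void
lock_acquire(lock_t *l)
{
	/* Acquire: atomic exchange; on failure, spin read-only until the
	 * lock looks free, then retry the exchange. */
	while (__sync_lock_test_and_set(&l->locked, 1))
		while (l->locked)
			;
}

static void
lock_release(lock_t *l)
{
	__sync_lock_release(&l->locked);	/* release barrier + store 0 */
}
```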

* [dpdk-dev] [PATCH v3 07/14] Add vector memcpy for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (5 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 06/14] Add spinlock " Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 08/14] Add CPU flag checking " Chao Zhu
                   ` (7 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

The SSE-based memory copy in DPDK only supports x86. This patch adds
AltiVec-based memory copy functions for the IBM Power architecture. This
patch includes altivec.h, which requires GCC version >= 4.8.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 .../common/include/arch/ppc_64/rte_memcpy.h        |  224 ++++++++++++++++++++
 1 files changed, 224 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h

diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
new file mode 100644
index 0000000..b9b8ddc
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_memcpy.h
@@ -0,0 +1,224 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_MEMCPY_PPC_64_H_
+#define _RTE_MEMCPY_PPC_64_H_
+
+#include <stdint.h>
+#include <string.h>
+/* To include altivec.h, GCC version must be >= 4.8 */
+#include <altivec.h>
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include "generic/rte_memcpy.h"
+
+static inline void
+rte_mov16(uint8_t *dst, const uint8_t *src)
+{
+	vec_vsx_st(vec_vsx_ld(0, src), 0, dst);
+}
+
+static inline void
+rte_mov32(uint8_t *dst, const uint8_t *src)
+{
+	vec_vsx_st(vec_vsx_ld(0, src), 0, dst);
+	vec_vsx_st(vec_vsx_ld(16, src), 16, dst);
+}
+
+static inline void
+rte_mov48(uint8_t *dst, const uint8_t *src)
+{
+	vec_vsx_st(vec_vsx_ld(0, src), 0, dst);
+	vec_vsx_st(vec_vsx_ld(16, src), 16, dst);
+	vec_vsx_st(vec_vsx_ld(32, src), 32, dst);
+}
+
+static inline void
+rte_mov64(uint8_t *dst, const uint8_t *src)
+{
+	vec_vsx_st(vec_vsx_ld(0, src), 0, dst);
+	vec_vsx_st(vec_vsx_ld(16, src), 16, dst);
+	vec_vsx_st(vec_vsx_ld(32, src), 32, dst);
+	vec_vsx_st(vec_vsx_ld(48, src), 48, dst);
+}
+
+static inline void
+rte_mov128(uint8_t *dst, const uint8_t *src)
+{
+	vec_vsx_st(vec_vsx_ld(0, src), 0, dst);
+	vec_vsx_st(vec_vsx_ld(16, src), 16, dst);
+	vec_vsx_st(vec_vsx_ld(32, src), 32, dst);
+	vec_vsx_st(vec_vsx_ld(48, src), 48, dst);
+	vec_vsx_st(vec_vsx_ld(64, src), 64, dst);
+	vec_vsx_st(vec_vsx_ld(80, src), 80, dst);
+	vec_vsx_st(vec_vsx_ld(96, src), 96, dst);
+	vec_vsx_st(vec_vsx_ld(112, src), 112, dst);
+}
+
+static inline void
+rte_mov256(uint8_t *dst, const uint8_t *src)
+{
+	rte_mov128(dst, src);
+	rte_mov128(dst + 128, src + 128);
+}
+
+#define rte_memcpy(dst, src, n)              \
+	((__builtin_constant_p(n)) ?          \
+	memcpy((dst), (src), (n)) :          \
+	rte_memcpy_func((dst), (src), (n)))
+
+static inline void *
+rte_memcpy_func(void *dst, const void *src, size_t n)
+{
+	void *ret = dst;
+
+	/* We can't copy < 16 bytes using vector registers, so do it manually. */
+	if (n < 16) {
+		if (n & 0x01) {
+			*(uint8_t *)dst = *(const uint8_t *)src;
+			dst = (uint8_t *)dst + 1;
+			src = (const uint8_t *)src + 1;
+		}
+		if (n & 0x02) {
+			*(uint16_t *)dst = *(const uint16_t *)src;
+			dst = (uint16_t *)dst + 1;
+			src = (const uint16_t *)src + 1;
+		}
+		if (n & 0x04) {
+			*(uint32_t *)dst = *(const uint32_t *)src;
+			dst = (uint32_t *)dst + 1;
+			src = (const uint32_t *)src + 1;
+		}
+		if (n & 0x08) {
+			*(uint64_t *)dst = *(const uint64_t *)src;
+		}
+		return ret;
+	}
+
+	/* Special fast cases for <= 128 bytes */
+	if (n <= 32) {
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+		return ret;
+	}
+
+	if (n <= 64) {
+		rte_mov32((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov32((uint8_t *)dst - 32 + n, (const uint8_t *)src - 32 + n);
+		return ret;
+	}
+
+	if (n <= 128) {
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		rte_mov64((uint8_t *)dst - 64 + n, (const uint8_t *)src - 64 + n);
+		return ret;
+	}
+
+	/*
+	 * For large copies > 128 bytes. This combination of 256, 64 and 16 byte
+	 * copies was found to be faster than doing 128 and 32 byte copies as
+	 * well.
+	 */
+	for ( ; n >= 256; n -= 256) {
+		rte_mov256((uint8_t *)dst, (const uint8_t *)src);
+		dst = (uint8_t *)dst + 256;
+		src = (const uint8_t *)src + 256;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 256) into
+	 * 64byte (2^6) chunks.
+	 * Using incrementing integers in the case labels of a switch statement
+	 * encourages the compiler to use a jump table. To get incrementing
+	 * integers, we shift the 2 relevant bits to the LSB position to first
+	 * get decrementing integers, and then subtract.
+	 */
+	switch (3 - (n >> 6)) {
+	case 0x00:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x01:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	case 0x02:
+		rte_mov64((uint8_t *)dst, (const uint8_t *)src);
+		n -= 64;
+		dst = (uint8_t *)dst + 64;
+		src = (const uint8_t *)src + 64;      /* fallthrough */
+	default:
+		;
+	}
+
+	/*
+	 * We split the remaining bytes (which will be less than 64) into
+	 * 16byte (2^4) chunks, using the same switch structure as above.
+	 */
+	switch (3 - (n >> 4)) {
+	case 0x00:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x01:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	case 0x02:
+		rte_mov16((uint8_t *)dst, (const uint8_t *)src);
+		n -= 16;
+		dst = (uint8_t *)dst + 16;
+		src = (const uint8_t *)src + 16;      /* fallthrough */
+	default:
+		;
+	}
+
+	/* Copy any remaining bytes, without going beyond end of buffers */
+	if (n != 0) {
+		rte_mov16((uint8_t *)dst - 16 + n, (const uint8_t *)src - 16 + n);
+	}
+	return ret;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_MEMCPY_PPC_64_H_ */
+
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread
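[Editorial aside: the `n < 16` path above decomposes the remaining length into its 1-, 2-, 4- and 8-byte bits and issues one scalar copy per set bit. The same logic, written with `memcpy` for the multi-byte moves so it is aliasing-safe and checkable on any host; `copy_small` is an illustrative name, and the patch uses direct uint16_t/uint32_t/uint64_t stores instead.]

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One copy per set bit of n: 1 + 2 + 4 + 8 covers every length below 16. */
static void
copy_small(void *dst, const void *src, size_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	if (n & 0x01) { *d = *s; d++; s++; }
	if (n & 0x02) { memcpy(d, s, 2); d += 2; s += 2; }
	if (n & 0x04) { memcpy(d, s, 4); d += 4; s += 4; }
	if (n & 0x08) memcpy(d, s, 8);
}
```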

* [dpdk-dev] [PATCH v3 08/14] Add CPU flag checking for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (6 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 07/14] Add vector memcpy " Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24 14:14   ` Neil Horman
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 09/14] Remove iopl operation " Chao Zhu
                   ` (6 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

IBM Power processors don't expose CPU feature flags through hardware
registers. This patch reads the aux vector (a software register set) to
get the CPU flags, and adds CPU flag checking support for the IBM Power
architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 app/test/test_cpuflags.c                           |   35 ++++
 .../common/include/arch/ppc_64/rte_cpuflags.h      |  184 ++++++++++++++++++++
 mk/rte.cpuflags.mk                                 |   17 ++
 3 files changed, 236 insertions(+), 0 deletions(-)
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
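
[Editorial aside: the patch defines AT_HWCAP/AT_HWCAP2 itself and walks the ELF auxiliary vector to recover the kernel's HWCAP words. On a glibc >= 2.16 system the same words are available through getauxval(); a minimal sketch — the helper name is illustrative, and real mask values would come from the kernel's PPC_FEATURE_* bit definitions.]

```c
#include <sys/auxv.h>	/* getauxval(), glibc >= 2.16 assumed */
#include <elf.h>	/* AT_HWCAP / AT_HWCAP2 */

/* Feature check against one of the kernel-provided HWCAP words.
 * getauxval() returns 0 when the requested entry is absent. */
static int
cpu_has_hwcap_bit(unsigned long type, unsigned long mask)
{
	return (getauxval(type) & mask) != 0;
}
```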

diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
index 82c0197..5aeba5d 100644
--- a/app/test/test_cpuflags.c
+++ b/app/test/test_cpuflags.c
@@ -80,6 +80,40 @@ test_cpuflags(void)
 	int result;
 	printf("\nChecking for flags from different registers...\n");
 
+#ifdef RTE_ARCH_PPC_64
+	printf("Check for PPC64:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_PPC64);
+
+	printf("Check for PPC32:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_PPC32);
+
+	printf("Check for VSX:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_VSX);
+
+	printf("Check for DFP:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_DFP);
+
+	printf("Check for FPU:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_FPU);
+
+	printf("Check for SMT:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_SMT);
+
+	printf("Check for MMU:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_MMU);
+
+	printf("Check for ALTIVEC:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_ALTIVEC);
+
+	printf("Check for ARCH_2_06:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_ARCH_2_06);
+
+	printf("Check for ARCH_2_07:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_ARCH_2_07);
+
+	printf("Check for ICACHE_SNOOP:\t\t");
+	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
+#else
 	printf("Check for SSE:\t\t");
 	CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
 
@@ -117,6 +151,7 @@ test_cpuflags(void)
 	CHECK_FOR_FLAG(RTE_CPUFLAG_INVTSC);
 
 
+#endif
 
 	/*
 	 * Check if invalid data is handled properly
diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h b/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
new file mode 100644
index 0000000..6b38f1c
--- /dev/null
+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
@@ -0,0 +1,184 @@
+/*
+ *   BSD LICENSE
+ *
+ *   Copyright (C) IBM Corporation 2014.
+ *
+ *   Redistribution and use in source and binary forms, with or without
+ *   modification, are permitted provided that the following conditions
+ *   are met:
+ *
+ *     * Redistributions of source code must retain the above copyright
+ *       notice, this list of conditions and the following disclaimer.
+ *     * Redistributions in binary form must reproduce the above copyright
+ *       notice, this list of conditions and the following disclaimer in
+ *       the documentation and/or other materials provided with the
+ *       distribution.
+ *     * Neither the name of IBM Corporation nor the names of its
+ *       contributors may be used to endorse or promote products derived
+ *       from this software without specific prior written permission.
+ *
+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+*/
+
+#ifndef _RTE_CPUFLAGS_PPC_64_H_
+#define _RTE_CPUFLAGS_PPC_64_H_
+
+#ifdef __cplusplus
+extern "C" {
+#endif
+
+#include <elf.h>
+#include <fcntl.h>
+#include <assert.h>
+#include <unistd.h>
+
+#include "generic/rte_cpuflags.h"
+
+/* Symbolic values for the entries in the auxiliary table */
+#define AT_HWCAP  16
+#define AT_HWCAP2 26
+
+/* software based registers */
+enum cpu_register_t {
+	REG_HWCAP = 0,
+	REG_HWCAP2,
+};
+
+/**
+ * Enumeration of all CPU features supported
+ */
+enum rte_cpu_flag_t {
+	RTE_CPUFLAG_PPC_LE = 0,
+	RTE_CPUFLAG_TRUE_LE,
+	RTE_CPUFLAG_PSERIES_PERFMON_COMPAT,
+	RTE_CPUFLAG_VSX,
+	RTE_CPUFLAG_ARCH_2_06,
+	RTE_CPUFLAG_POWER6_EXT,
+	RTE_CPUFLAG_DFP,
+	RTE_CPUFLAG_PA6T,
+	RTE_CPUFLAG_ARCH_2_05,
+	RTE_CPUFLAG_ICACHE_SNOOP,
+	RTE_CPUFLAG_SMT,
+	RTE_CPUFLAG_BOOKE,
+	RTE_CPUFLAG_CELLBE,
+	RTE_CPUFLAG_POWER5_PLUS,
+	RTE_CPUFLAG_POWER5,
+	RTE_CPUFLAG_POWER4,
+	RTE_CPUFLAG_NOTB,
+	RTE_CPUFLAG_EFP_DOUBLE,
+	RTE_CPUFLAG_EFP_SINGLE,
+	RTE_CPUFLAG_SPE,
+	RTE_CPUFLAG_UNIFIED_CACHE,
+	RTE_CPUFLAG_4xxMAC,
+	RTE_CPUFLAG_MMU,
+	RTE_CPUFLAG_FPU,
+	RTE_CPUFLAG_ALTIVEC,
+	RTE_CPUFLAG_PPC601,
+	RTE_CPUFLAG_PPC64,
+	RTE_CPUFLAG_PPC32,
+	RTE_CPUFLAG_TAR,
+	RTE_CPUFLAG_LSEL,
+	RTE_CPUFLAG_EBB,
+	RTE_CPUFLAG_DSCR,
+	RTE_CPUFLAG_HTM,
+	RTE_CPUFLAG_ARCH_2_07,
+	/* The last item */
+	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
+};
+
+static const struct feature_entry cpu_feature_table[] = {
+	FEAT_DEF(PPC_LE, 0x00000001, 0, REG_HWCAP,  0)
+	FEAT_DEF(TRUE_LE, 0x00000001, 0, REG_HWCAP,  1)
+	FEAT_DEF(PSERIES_PERFMON_COMPAT, 0x00000001, 0, REG_HWCAP,  6)
+	FEAT_DEF(VSX, 0x00000001, 0, REG_HWCAP,  7)
+	FEAT_DEF(ARCH_2_06, 0x00000001, 0, REG_HWCAP,  8)
+	FEAT_DEF(POWER6_EXT, 0x00000001, 0, REG_HWCAP,  9)
+	FEAT_DEF(DFP, 0x00000001, 0, REG_HWCAP,  10)
+	FEAT_DEF(PA6T, 0x00000001, 0, REG_HWCAP,  11)
+	FEAT_DEF(ARCH_2_05, 0x00000001, 0, REG_HWCAP,  12)
+	FEAT_DEF(ICACHE_SNOOP, 0x00000001, 0, REG_HWCAP,  13)
+	FEAT_DEF(SMT, 0x00000001, 0, REG_HWCAP,  14)
+	FEAT_DEF(BOOKE, 0x00000001, 0, REG_HWCAP,  15)
+	FEAT_DEF(CELLBE, 0x00000001, 0, REG_HWCAP,  16)
+	FEAT_DEF(POWER5_PLUS, 0x00000001, 0, REG_HWCAP,  17)
+	FEAT_DEF(POWER5, 0x00000001, 0, REG_HWCAP,  18)
+	FEAT_DEF(POWER4, 0x00000001, 0, REG_HWCAP,  19)
+	FEAT_DEF(NOTB, 0x00000001, 0, REG_HWCAP,  20)
+	FEAT_DEF(EFP_DOUBLE, 0x00000001, 0, REG_HWCAP,  21)
+	FEAT_DEF(EFP_SINGLE, 0x00000001, 0, REG_HWCAP,  22)
+	FEAT_DEF(SPE, 0x00000001, 0, REG_HWCAP,  23)
+	FEAT_DEF(UNIFIED_CACHE, 0x00000001, 0, REG_HWCAP,  24)
+	FEAT_DEF(4xxMAC, 0x00000001, 0, REG_HWCAP,  25)
+	FEAT_DEF(MMU, 0x00000001, 0, REG_HWCAP,  26)
+	FEAT_DEF(FPU, 0x00000001, 0, REG_HWCAP,  27)
+	FEAT_DEF(ALTIVEC, 0x00000001, 0, REG_HWCAP,  28)
+	FEAT_DEF(PPC601, 0x00000001, 0, REG_HWCAP,  29)
+	FEAT_DEF(PPC64, 0x00000001, 0, REG_HWCAP,  30)
+	FEAT_DEF(PPC32, 0x00000001, 0, REG_HWCAP,  31)
+	FEAT_DEF(TAR, 0x00000001, 0, REG_HWCAP2,  26)
+	FEAT_DEF(LSEL, 0x00000001, 0, REG_HWCAP2,  27)
+	FEAT_DEF(EBB, 0x00000001, 0, REG_HWCAP2,  28)
+	FEAT_DEF(DSCR, 0x00000001, 0, REG_HWCAP2,  29)
+	FEAT_DEF(HTM, 0x00000001, 0, REG_HWCAP2,  30)
+	FEAT_DEF(ARCH_2_07, 0x00000001, 0, REG_HWCAP2,  31)
+};
+
+/*
+ * Read AUXV software register and get cpu features for Power
+ */
+static inline void
+rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
+		__attribute__((unused)) uint32_t subleaf,
+		cpuid_registers_t out)
+{
+	int auxv_fd;
+	Elf64_auxv_t auxv;
+
+	auxv_fd = open("/proc/self/auxv", O_RDONLY);
+	assert(auxv_fd != -1);
+	while (read(auxv_fd, &auxv, sizeof(Elf64_auxv_t)) ==
+			sizeof(Elf64_auxv_t)) {
+		if (auxv.a_type == AT_HWCAP)
+			out[REG_HWCAP] = auxv.a_un.a_val;
+		else if (auxv.a_type == AT_HWCAP2)
+			out[REG_HWCAP2] = auxv.a_un.a_val;
+	}
+	close(auxv_fd);
+}
+
+/*
+ * Checks if a particular flag is available on current machine.
+ */
+static inline int
+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
+{
+	const struct feature_entry *feat;
+	cpuid_registers_t regs={0};
+
+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
+		/* Flag does not match anything in the feature tables */
+		return -ENOENT;
+
+	feat = &cpu_feature_table[feature];
+
+	if (!feat->leaf)
+		/* This entry in the table wasn't filled out! */
+		return -EFAULT;
+
+	/* get the cpuid leaf containing the desired feature */
+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
+
+	/* check if the feature is enabled */
+	return (regs[feat->reg] >> feat->bit) & 1;
+}
+
+#ifdef __cplusplus
+}
+#endif
+
+#endif /* _RTE_CPUFLAGS_PPC_64_H_ */
diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
index 65332e1..f595cd0 100644
--- a/mk/rte.cpuflags.mk
+++ b/mk/rte.cpuflags.mk
@@ -89,6 +89,23 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__AVX2__),)
 CPUFLAGS += AVX2
 endif
 
+# IBM Power CPU flags
+ifneq ($(filter $(AUTO_CPUFLAGS),__PPC64__),)
+CPUFLAGS += PPC64
+endif
+
+ifneq ($(filter $(AUTO_CPUFLAGS),__PPC32__),)
+CPUFLAGS += PPC32
+endif
+
+ifneq ($(filter $(AUTO_CPUFLAGS),__vector),)
+CPUFLAGS += ALTIVEC
+endif
+
+ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
+CPUFLAGS += VSX
+endif
+
 MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
 
 # To strip whitespace
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-dev] [PATCH v3 09/14] Remove iopl operation for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (7 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 08/14] Add CPU flag checking " Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 10/14] Add cache size define for IBM Power Architecture Chao Zhu
                   ` (5 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

The iopl() call exists mostly on the i386 architecture; on Power and
other architectures it is not available. This patch modifies
rte_eal_iopl_init() to return -1 on Power and other architectures, so
that rte_config.flags will not contain the EAL_FLG_HIGH_IOPL flag
there.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 lib/librte_eal/linuxapp/eal/eal.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 7a1d087..0bf81be 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -50,7 +50,9 @@
 #include <errno.h>
 #include <sys/mman.h>
 #include <sys/queue.h>
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) 
 #include <sys/io.h>
+#endif
 
 #include <rte_common.h>
 #include <rte_debug.h>
@@ -752,13 +754,19 @@ rte_eal_mcfg_complete(void)
 
 /*
  * Request iopl privilege for all RPL, returns 0 on success
+ * iopl() call is mostly for the i386 architecture. For other architectures,
+ * return -1 to indicate that IO privilege cannot be changed this way.
  */
 int
 rte_eal_iopl_init(void)
 {
+#if defined(RTE_ARCH_X86_64) || defined(RTE_ARCH_I686) 
 	if (iopl(3) != 0)
 		return -1;
 	return 0;
+#else
+	return -1;
+#endif
 }
 
 /* Launch threads, called at application init(). */
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-dev] [PATCH v3 10/14] Add cache size define for IBM Power Architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (8 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 09/14] Remove iopl operation " Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 11/14] Add huge page size define for IBM Power architecture Chao Zhu
                   ` (4 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

The IBM Power architecture has a different cache line size (128 bytes)
than x86 (64 bytes). This patch defines CACHE_LINE_SIZE as 128 bytes,
overriding the default of 64 bytes, to support the IBM Power
architecture.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 app/test/test_malloc.c     |    8 ++++----
 mk/arch/ppc_64/rte.vars.mk |    2 +-
 2 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/app/test/test_malloc.c b/app/test/test_malloc.c
index ee34ca3..63e6b32 100644
--- a/app/test/test_malloc.c
+++ b/app/test/test_malloc.c
@@ -300,9 +300,9 @@ test_big_alloc(void)
 	size_t size =rte_str_to_size(MALLOC_MEMZONE_SIZE)*2;
 	int align = 0;
 #ifndef RTE_LIBRTE_MALLOC_DEBUG
-	int overhead = 64 + 64;
+	int overhead = CACHE_LINE_SIZE + CACHE_LINE_SIZE;
 #else
-	int overhead = 64 + 64 + 64;
+	int overhead = CACHE_LINE_SIZE + CACHE_LINE_SIZE + CACHE_LINE_SIZE;
 #endif
 
 	rte_malloc_get_socket_stats(socket, &pre_stats);
@@ -356,9 +356,9 @@ test_multi_alloc_statistics(void)
 #ifndef RTE_LIBRTE_MALLOC_DEBUG
 	int trailer_size = 0;
 #else
-	int trailer_size = 64;
+	int trailer_size = CACHE_LINE_SIZE;
 #endif
-	int overhead = 64 + trailer_size;
+	int overhead = CACHE_LINE_SIZE + trailer_size;
 
 	rte_malloc_get_socket_stats(socket, &pre_stats);
 
diff --git a/mk/arch/ppc_64/rte.vars.mk b/mk/arch/ppc_64/rte.vars.mk
index 363fcd1..dfdeaea 100644
--- a/mk/arch/ppc_64/rte.vars.mk
+++ b/mk/arch/ppc_64/rte.vars.mk
@@ -32,7 +32,7 @@
 ARCH  ?= powerpc
 CROSS ?=
 
-CPU_CFLAGS  ?= -m64
+CPU_CFLAGS  ?= -m64 -DCACHE_LINE_SIZE=128
 CPU_LDFLAGS ?=
 CPU_ASFLAGS ?= -felf64
 
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-dev] [PATCH v3 11/14] Add huge page size define for IBM Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (9 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 10/14] Add cache size define for IBM Power Architecture Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 12/14] Add eal memory support for IBM Power Architecture Chao Zhu
                   ` (3 subsequent siblings)
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

The IBM Power architecture supports different huge page sizes (16MB,
16GB) than x86. This patch defines RTE_PGSIZE_16M and RTE_PGSIZE_16G in
the rte_page_sizes enum and adds huge page size support for the IBM
Power architecture to DPDK.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 app/test/test_memzone.c                     |  119 ++++++++++++++++++++++++++-
 lib/librte_eal/common/eal_common_memzone.c  |   15 +++-
 lib/librte_eal/common/include/rte_memory.h  |    9 ++-
 lib/librte_eal/common/include/rte_memzone.h |    8 ++
 lib/librte_eal/linuxapp/eal/eal.c           |    5 +-
 5 files changed, 147 insertions(+), 9 deletions(-)

diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
index 381f643..8668103 100644
--- a/app/test/test_memzone.c
+++ b/app/test/test_memzone.c
@@ -133,6 +133,8 @@ test_memzone_reserve_flags(void)
 	const struct rte_memseg *ms;
 	int hugepage_2MB_avail = 0;
 	int hugepage_1GB_avail = 0;
+	int hugepage_16MB_avail = 0;
+	int hugepage_16GB_avail = 0;
 	const size_t size = 100;
 	int i = 0;
 	ms = rte_eal_get_physmem_layout();
@@ -141,12 +143,20 @@ test_memzone_reserve_flags(void)
 			hugepage_2MB_avail = 1;
 		if (ms[i].hugepage_sz == RTE_PGSIZE_1G)
 			hugepage_1GB_avail = 1;
+		if (ms[i].hugepage_sz == RTE_PGSIZE_16M)
+			hugepage_16MB_avail = 1;
+		if (ms[i].hugepage_sz == RTE_PGSIZE_16G)
+			hugepage_16GB_avail = 1;
 	}
-	/* Display the availability of 2MB and 1GB pages */
+	/* Display the availability of 2MB, 1GB, 16MB, 16GB pages */
 	if (hugepage_2MB_avail)
 		printf("2MB Huge pages available\n");
 	if (hugepage_1GB_avail)
 		printf("1GB Huge pages available\n");
+	if (hugepage_16MB_avail)
+		printf("16MB Huge pages available\n");
+	if (hugepage_16GB_avail)
+		printf("16GB Huge pages available\n");
 	/*
 	 * If 2MB pages available, check that a small memzone is correctly
 	 * reserved from 2MB huge pages when requested by the RTE_MEMZONE_2MB flag.
@@ -255,6 +265,113 @@ test_memzone_reserve_flags(void)
 			}
 		}
 	}
+	/*
+	 * This option is for IBM Power. If 16MB pages are available, check that a
+	 * small memzone is correctly reserved from 16MB huge pages when requested
+	 * by the RTE_MEMZONE_16MB flag. Also check that RTE_MEMZONE_SIZE_HINT_ONLY
+	 * only defaults to an available page size (i.e. 16GB) when 16MB pages are
+	 * unavailable.
+	 */
+	if (hugepage_16MB_avail){
+		mz = rte_memzone_reserve("flag_zone_16M", size, SOCKET_ID_ANY,
+				RTE_MEMZONE_16MB);
+		if (mz == NULL) {
+			printf("MEMZONE FLAG 16MB\n");
+			return -1;
+		}
+		if (mz->hugepage_sz != RTE_PGSIZE_16M) {
+			printf("hugepage_sz not equal 16M\n");
+			return -1;
+		}
+
+		mz = rte_memzone_reserve("flag_zone_16M_HINT", size, SOCKET_ID_ANY,
+				RTE_MEMZONE_16MB|RTE_MEMZONE_SIZE_HINT_ONLY);
+		if (mz == NULL) {
+			printf("MEMZONE FLAG 16MB & HINT\n");
+			return -1;
+		}
+		if (mz->hugepage_sz != RTE_PGSIZE_16M) {
+			printf("hugepage_sz not equal 16M\n");
+			return -1;
+		}
+
+		/* Check that if 16GB huge pages are unavailable, the function
+		 * fails unless the HINT flag is given
+		 */
+		if (!hugepage_16GB_avail) {
+			mz = rte_memzone_reserve("flag_zone_16G_HINT", size, SOCKET_ID_ANY,
+					RTE_MEMZONE_16GB|RTE_MEMZONE_SIZE_HINT_ONLY);
+			if (mz == NULL) {
+				printf("MEMZONE FLAG 16GB & HINT\n");
+				return -1;
+			}
+			if (mz->hugepage_sz != RTE_PGSIZE_16M) {
+				printf("hugepage_sz not equal 16M\n");
+				return -1;
+			}
+
+			mz = rte_memzone_reserve("flag_zone_16G", size, SOCKET_ID_ANY,
+					RTE_MEMZONE_16GB);
+			if (mz != NULL) {
+				printf("MEMZONE FLAG 16GB\n");
+				return -1;
+			}
+		}
+	}
+	/* As with the 16MB tests above, for 16GB huge page requests */
+	if (hugepage_16GB_avail){
+		mz = rte_memzone_reserve("flag_zone_16G", size, SOCKET_ID_ANY,
+				RTE_MEMZONE_16GB);
+		if (mz == NULL) {
+			printf("MEMZONE FLAG 16GB\n");
+			return -1;
+		}
+		if (mz->hugepage_sz != RTE_PGSIZE_16G) {
+			printf("hugepage_sz not equal 16G\n");
+			return -1;
+		}
+
+		mz = rte_memzone_reserve("flag_zone_16G_HINT", size, SOCKET_ID_ANY,
+				RTE_MEMZONE_16GB|RTE_MEMZONE_SIZE_HINT_ONLY);
+		if (mz == NULL) {
+			printf("MEMZONE FLAG 16GB & HINT\n");
+			return -1;
+		}
+		if (mz->hugepage_sz != RTE_PGSIZE_16G) {
+			printf("hugepage_sz not equal 16G\n");
+			return -1;
+		}
+
+		/* Check that if 16MB huge pages are unavailable, the function
+		 * fails unless the HINT flag is given
+		 */
+		if (!hugepage_16MB_avail) {
+			mz = rte_memzone_reserve("flag_zone_16M_HINT", size, SOCKET_ID_ANY,
+					RTE_MEMZONE_16MB|RTE_MEMZONE_SIZE_HINT_ONLY);
+			if (mz == NULL){
+				printf("MEMZONE FLAG 16MB & HINT\n");
+				return -1;
+			}
+			if (mz->hugepage_sz != RTE_PGSIZE_16G) {
+				printf("hugepage_sz not equal 16G\n");
+				return -1;
+			}
+			mz = rte_memzone_reserve("flag_zone_16M", size, SOCKET_ID_ANY,
+					RTE_MEMZONE_16MB);
+			if (mz != NULL) {
+				printf("MEMZONE FLAG 16MB\n");
+				return -1;
+			}
+		}
+
+		if (hugepage_16MB_avail && hugepage_16GB_avail) {
+			mz = rte_memzone_reserve("flag_zone_16M_HINT", size, SOCKET_ID_ANY,
+								RTE_MEMZONE_16MB|RTE_MEMZONE_16GB);
+			if (mz != NULL) {
+				printf("BOTH SIZES SET\n");
+				return -1;
+			}
+		}
+	}
 	return 0;
 }
 
diff --git a/lib/librte_eal/common/eal_common_memzone.c b/lib/librte_eal/common/eal_common_memzone.c
index 5acd9ce..e552c7a 100644
--- a/lib/librte_eal/common/eal_common_memzone.c
+++ b/lib/librte_eal/common/eal_common_memzone.c
@@ -221,6 +221,12 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		if ((flags & RTE_MEMZONE_1GB) &&
 				free_memseg[i].hugepage_sz == RTE_PGSIZE_2M )
 			continue;
+		if ((flags & RTE_MEMZONE_16MB) &&
+				free_memseg[i].hugepage_sz == RTE_PGSIZE_16G )
+			continue;
+		if ((flags & RTE_MEMZONE_16GB) &&
+				free_memseg[i].hugepage_sz == RTE_PGSIZE_16M )
+			continue;
 
 		/* this segment is the best until now */
 		if (memseg_idx == -1) {
@@ -256,7 +262,8 @@ memzone_reserve_aligned_thread_unsafe(const char *name, size_t len,
 		 * try allocating again without the size parameter otherwise -fail.
 		 */
 		if ((flags & RTE_MEMZONE_SIZE_HINT_ONLY)  &&
-		    ((flags & RTE_MEMZONE_1GB) || (flags & RTE_MEMZONE_2MB)))
+		    ((flags & RTE_MEMZONE_1GB) || (flags & RTE_MEMZONE_2MB) 
+		     || (flags & RTE_MEMZONE_16MB) || (flags & RTE_MEMZONE_16GB)))
 			return memzone_reserve_aligned_thread_unsafe(name,
 				len, socket_id, 0, align, bound);
 
@@ -313,7 +320,8 @@ rte_memzone_reserve_aligned(const char *name, size_t len,
 	const struct rte_memzone *mz = NULL;
 
 	/* both sizes cannot be explicitly called for */
-	if ((flags & RTE_MEMZONE_1GB) && (flags & RTE_MEMZONE_2MB)) {
+	if (((flags & RTE_MEMZONE_1GB) && (flags & RTE_MEMZONE_2MB)) 
+		|| ((flags & RTE_MEMZONE_16MB) && (flags & RTE_MEMZONE_16GB))) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
@@ -344,7 +352,8 @@ rte_memzone_reserve_bounded(const char *name, size_t len,
 	const struct rte_memzone *mz = NULL;
 
 	/* both sizes cannot be explicitly called for */
-	if ((flags & RTE_MEMZONE_1GB) && (flags & RTE_MEMZONE_2MB)) {
+	if (((flags & RTE_MEMZONE_1GB) && (flags & RTE_MEMZONE_2MB)) 
+		|| ((flags & RTE_MEMZONE_16MB) && (flags & RTE_MEMZONE_16GB))) {
 		rte_errno = EINVAL;
 		return NULL;
 	}
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 4cf8ea9..2ed2637 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -53,9 +53,12 @@ extern "C" {
 #endif
 
 enum rte_page_sizes {
-	RTE_PGSIZE_4K = 1 << 12,
-	RTE_PGSIZE_2M = RTE_PGSIZE_4K << 9,
-	RTE_PGSIZE_1G = RTE_PGSIZE_2M <<9
+	RTE_PGSIZE_4K = 1ULL << 12,
+	RTE_PGSIZE_2M = 1ULL << 21,
+	RTE_PGSIZE_1G = 1ULL << 30,
+	RTE_PGSIZE_64K = 1ULL << 16,
+	RTE_PGSIZE_16M = 1ULL << 24,
+	RTE_PGSIZE_16G = 1ULL << 34
 };
 
 #define SOCKET_ID_ANY -1                    /**< Any NUMA socket. */
diff --git a/lib/librte_eal/common/include/rte_memzone.h b/lib/librte_eal/common/include/rte_memzone.h
index 5014409..7d47bff 100644
--- a/lib/librte_eal/common/include/rte_memzone.h
+++ b/lib/librte_eal/common/include/rte_memzone.h
@@ -60,6 +60,8 @@ extern "C" {
 
 #define RTE_MEMZONE_2MB            0x00000001   /**< Use 2MB pages. */
 #define RTE_MEMZONE_1GB            0x00000002   /**< Use 1GB pages. */
+#define RTE_MEMZONE_16MB            0x00000100   /**< Use 16MB pages. */
+#define RTE_MEMZONE_16GB            0x00000200   /**< Use 16GB pages. */
 #define RTE_MEMZONE_SIZE_HINT_ONLY 0x00000004   /**< Use available page size */
 
 /**
@@ -111,6 +113,8 @@ struct rte_memzone {
  *   taken from 1GB or 2MB hugepages.
  *   - RTE_MEMZONE_2MB - Reserve from 2MB pages
  *   - RTE_MEMZONE_1GB - Reserve from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserve from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserve from 16GB pages
  *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
  *                                  the requested page size is unavailable.
  *                                  If this flag is not set, the function
@@ -156,6 +160,8 @@ const struct rte_memzone *rte_memzone_reserve(const char *name,
  *   taken from 1GB or 2MB hugepages.
  *   - RTE_MEMZONE_2MB - Reserve from 2MB pages
  *   - RTE_MEMZONE_1GB - Reserve from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserve from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserve from 16GB pages
  *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
  *                                  the requested page size is unavailable.
  *                                  If this flag is not set, the function
@@ -206,6 +212,8 @@ const struct rte_memzone *rte_memzone_reserve_aligned(const char *name,
  *   taken from 1GB or 2MB hugepages.
  *   - RTE_MEMZONE_2MB - Reserve from 2MB pages
  *   - RTE_MEMZONE_1GB - Reserve from 1GB pages
+ *   - RTE_MEMZONE_16MB - Reserve from 16MB pages
+ *   - RTE_MEMZONE_16GB - Reserve from 16GB pages
  *   - RTE_MEMZONE_SIZE_HINT_ONLY - Allow alternative page size to be used if
  *                                  the requested page size is unavailable.
  *                                  If this flag is not set, the function
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 0bf81be..f9517c7 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -455,9 +455,10 @@ eal_parse_base_virtaddr(const char *arg)
 		return -1;
 #endif
 
-	/* align the addr on 2M boundary */
+	/* Align the addr on a 16MB boundary; 16MB is the minimum huge page size on
+	 * IBM Power. A 16MB-aligned address is also 2MB-aligned, so x86 works too. */
 	internal_config.base_virtaddr = RTE_PTR_ALIGN_CEIL((uintptr_t)addr,
-	                                                   RTE_PGSIZE_2M);
+	                                                   RTE_PGSIZE_16M);
 
 	return 0;
 }
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-dev] [PATCH v3 12/14] Add eal memory support for IBM Power Architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (10 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 11/14] Add huge page size define for IBM Power architecture Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24 15:17   ` David Marchand
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 13/14] test_memzone:fix finding the second smallest segment Chao Zhu
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

The mmap of hugepage files on IBM Power proceeds from high addresses to
low addresses, which is different from x86. This patch modifies the
memory segment detection code to get the correct memory segment layout
on the Power architecture. It also adds a common RTE_ARCH_64 definition
for 64-bit systems.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 config/defconfig_ppc_64-power8-linuxapp-gcc   |    1 +
 config/defconfig_x86_64-native-linuxapp-clang |    1 +
 config/defconfig_x86_64-native-linuxapp-gcc   |    1 +
 config/defconfig_x86_64-native-linuxapp-icc   |    1 +
 lib/librte_eal/linuxapp/eal/eal_memory.c      |   75 ++++++++++++++++++-------
 5 files changed, 59 insertions(+), 20 deletions(-)

diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
index b10f60c..23a5591 100644
--- a/config/defconfig_ppc_64-power8-linuxapp-gcc
+++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
@@ -35,6 +35,7 @@ CONFIG_RTE_MACHINE="power8"
 CONFIG_RTE_ARCH="ppc_64"
 CONFIG_RTE_ARCH_PPC_64=y
 CONFIG_RTE_ARCH_BIG_ENDIAN=y
+CONFIG_RTE_ARCH_64=y
 
 CONFIG_RTE_TOOLCHAIN="gcc"
 CONFIG_RTE_TOOLCHAIN_GCC=y
diff --git a/config/defconfig_x86_64-native-linuxapp-clang b/config/defconfig_x86_64-native-linuxapp-clang
index bbda080..5f3074e 100644
--- a/config/defconfig_x86_64-native-linuxapp-clang
+++ b/config/defconfig_x86_64-native-linuxapp-clang
@@ -36,6 +36,7 @@ CONFIG_RTE_MACHINE="native"
 
 CONFIG_RTE_ARCH="x86_64"
 CONFIG_RTE_ARCH_X86_64=y
+CONFIG_RTE_ARCH_64=y
 
 CONFIG_RTE_TOOLCHAIN="clang"
 CONFIG_RTE_TOOLCHAIN_CLANG=y
diff --git a/config/defconfig_x86_64-native-linuxapp-gcc b/config/defconfig_x86_64-native-linuxapp-gcc
index 3de818a..60baf5b 100644
--- a/config/defconfig_x86_64-native-linuxapp-gcc
+++ b/config/defconfig_x86_64-native-linuxapp-gcc
@@ -36,6 +36,7 @@ CONFIG_RTE_MACHINE="native"
 
 CONFIG_RTE_ARCH="x86_64"
 CONFIG_RTE_ARCH_X86_64=y
+CONFIG_RTE_ARCH_64=y
 
 CONFIG_RTE_TOOLCHAIN="gcc"
 CONFIG_RTE_TOOLCHAIN_GCC=y
diff --git a/config/defconfig_x86_64-native-linuxapp-icc b/config/defconfig_x86_64-native-linuxapp-icc
index 795333b..71d1e28 100644
--- a/config/defconfig_x86_64-native-linuxapp-icc
+++ b/config/defconfig_x86_64-native-linuxapp-icc
@@ -36,6 +36,7 @@ CONFIG_RTE_MACHINE="native"
 
 CONFIG_RTE_ARCH="x86_64"
 CONFIG_RTE_ARCH_X86_64=y
+CONFIG_RTE_ARCH_64=y
 
 CONFIG_RTE_TOOLCHAIN="icc"
 CONFIG_RTE_TOOLCHAIN_ICC=y
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index f2454f4..a8e7421 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -316,11 +316,11 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
 #endif
 			hugepg_tbl[i].filepath[sizeof(hugepg_tbl[i].filepath) - 1] = '\0';
 		}
-#ifndef RTE_ARCH_X86_64
-		/* for 32-bit systems, don't remap 1G pages, just reuse original
+#ifndef RTE_ARCH_64
+		/* for 32-bit systems, don't remap 1G and 16G pages, just reuse original
 		 * map address as final map address.
 		 */
-		else if (hugepage_sz == RTE_PGSIZE_1G){
+		else if ((hugepage_sz == RTE_PGSIZE_1G) || (hugepage_sz == RTE_PGSIZE_16G)){
 			hugepg_tbl[i].final_va = hugepg_tbl[i].orig_va;
 			hugepg_tbl[i].orig_va = NULL;
 			continue;
@@ -335,9 +335,16 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
 			 * physical block: count the number of
 			 * contiguous physical pages. */
 			for (j = i+1; j < hpi->num_pages[0] ; j++) {
+#ifdef RTE_ARCH_PPC_64
+				/* The physical addresses are sorted in descending order on PPC64 */
+				if (hugepg_tbl[j].physaddr !=
+				    hugepg_tbl[j-1].physaddr - hugepage_sz)
+					break;
+#else
 				if (hugepg_tbl[j].physaddr !=
 				    hugepg_tbl[j-1].physaddr + hugepage_sz)
 					break;
+#endif
 			}
 			num_pages = j - i;
 			vma_len = num_pages * hugepage_sz;
@@ -412,11 +419,11 @@ remap_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
 
 	while (i < hpi->num_pages[0]) {
 
-#ifndef RTE_ARCH_X86_64
-		/* for 32-bit systems, don't remap 1G pages, just reuse original
+#ifndef RTE_ARCH_64
+		/* for 32-bit systems, don't remap 1G pages and 16G pages, just reuse original
 		 * map address as final map address.
 		 */
-		if (hugepage_sz == RTE_PGSIZE_1G){
+		if ((hugepage_sz == RTE_PGSIZE_1G) || (hugepage_sz == RTE_PGSIZE_16G)){
 			hugepg_tbl[i].final_va = hugepg_tbl[i].orig_va;
 			hugepg_tbl[i].orig_va = NULL;
 			i++;
@@ -428,9 +435,15 @@ remap_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
 		 * physical block: count the number of
 		 * contiguous physical pages. */
 		for (j = i+1; j < hpi->num_pages[0] ; j++) {
+#ifdef RTE_ARCH_PPC_64
+			/* The physical addresses are sorted in descending order on PPC64 */
+			if (hugepg_tbl[j].physaddr != hugepg_tbl[j-1].physaddr - hugepage_sz)
+				break;
+#else
 			if (hugepg_tbl[j].physaddr != hugepg_tbl[j-1].physaddr + hugepage_sz)
 				break;
-		}
+#endif
+		}
 		num_pages = j - i;
 		vma_len = num_pages * hugepage_sz;
 
@@ -652,21 +665,21 @@ error:
 }
 
 /*
- * Sort the hugepg_tbl by physical address (lower addresses first). We
- * use a slow algorithm, but we won't have millions of pages, and this
+ * Sort the hugepg_tbl by physical address (lower addresses first on x86, higher address first
+ * on powerpc). We use a slow algorithm, but we won't have millions of pages, and this
  * is only done at init time.
  */
 static int
 sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
 {
 	unsigned i, j;
-	int smallest_idx;
-	uint64_t smallest_addr;
+	int compare_idx;
+	uint64_t compare_addr;
 	struct hugepage_file tmp;
 
 	for (i = 0; i < hpi->num_pages[0]; i++) {
-		smallest_addr = 0;
-		smallest_idx = -1;
+		compare_addr = 0;
+		compare_idx = -1;
 
 		/*
 		 * browse all entries starting at 'i', and find the
@@ -674,22 +687,26 @@ sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
 		 */
 		for (j=i; j< hpi->num_pages[0]; j++) {
 
-			if (smallest_addr == 0 ||
-			    hugepg_tbl[j].physaddr < smallest_addr) {
-				smallest_addr = hugepg_tbl[j].physaddr;
-				smallest_idx = j;
+			if (compare_addr == 0 ||
+#ifdef RTE_ARCH_PPC_64
+			    hugepg_tbl[j].physaddr > compare_addr) {
+#else
+			    hugepg_tbl[j].physaddr < compare_addr) {
+#endif
+				compare_addr = hugepg_tbl[j].physaddr;
+				compare_idx = j;
 			}
 		}
 
 		/* should not happen */
-		if (smallest_idx == -1) {
+		if (compare_idx == -1) {
 			RTE_LOG(ERR, EAL, "%s(): error in physaddr sorting\n", __func__);
 			return -1;
 		}
 
 		/* swap the 2 entries in the table */
-		memcpy(&tmp, &hugepg_tbl[smallest_idx], sizeof(struct hugepage_file));
-		memcpy(&hugepg_tbl[smallest_idx], &hugepg_tbl[i],
+		memcpy(&tmp, &hugepg_tbl[compare_idx], sizeof(struct hugepage_file));
+		memcpy(&hugepg_tbl[compare_idx], &hugepg_tbl[i],
 				sizeof(struct hugepage_file));
 		memcpy(&hugepg_tbl[i], &tmp, sizeof(struct hugepage_file));
 	}
@@ -1260,12 +1277,24 @@ rte_eal_hugepage_init(void)
 			new_memseg = 1;
 		else if (hugepage[i].size != hugepage[i-1].size)
 			new_memseg = 1;
+
+#ifdef RTE_ARCH_PPC_64
+		/* On PPC64, mmap always proceeds from higher virtual addresses
+		 * to lower ones; here both the physical and virtual addresses
+		 * are in descending order. */
+		else if ((hugepage[i-1].physaddr - hugepage[i].physaddr) !=
+		    hugepage[i].size)
+			new_memseg = 1;
+		else if (((unsigned long)hugepage[i-1].final_va -
+		    (unsigned long)hugepage[i].final_va) != hugepage[i].size)
+			new_memseg = 1;
+#else
 		else if ((hugepage[i].physaddr - hugepage[i-1].physaddr) !=
 		    hugepage[i].size)
 			new_memseg = 1;
 		else if (((unsigned long)hugepage[i].final_va -
 		    (unsigned long)hugepage[i-1].final_va) != hugepage[i].size)
 			new_memseg = 1;
+#endif
 
 		if (new_memseg) {
 			j += 1;
@@ -1284,6 +1313,12 @@ rte_eal_hugepage_init(void)
 		}
 		/* continuation of previous memseg */
 		else {
+#ifdef RTE_ARCH_PPC_64
+			/* Use the physical and virtual address of the last page as
+			 * the segment address on the IBM Power architecture */
+			mcfg->memseg[j].phys_addr = hugepage[i].physaddr;
+			mcfg->memseg[j].addr = hugepage[i].final_va;
+#endif
 			mcfg->memseg[j].len += mcfg->memseg[j].hugepage_sz;
 		}
 		hugepage[i].memseg_id = j;
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-dev] [PATCH v3 13/14] test_memzone:fix finding the second smallest segment
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (11 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 12/14] Add eal memory support for IBM Power Architecture Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 14/14] Fix the compiling of test-pmd on IBM Power Architecture Chao Zhu
  2014-11-24 15:05 ` [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture David Marchand
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

The current implementation in test_memzone.c has a bug in finding the
second smallest memory segment: it actually finds the last smallest
segment, not the second smallest one. This bug may cause test failures
in some cases. This patch fixes it.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 app/test/test_memzone.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/app/test/test_memzone.c b/app/test/test_memzone.c
index 8668103..f3da2c1 100644
--- a/app/test/test_memzone.c
+++ b/app/test/test_memzone.c
@@ -794,7 +794,7 @@ test_memzone_reserve_memory_in_smallest_segment(void)
 			/* set new smallest */
 			min_ms = ms;
 		}
-		else if (prev_min_ms == NULL) {
+		else if ((prev_min_ms == NULL) || (prev_min_ms->len > ms->len)) {
 			prev_min_ms = ms;
 		}
 	}
@@ -874,7 +874,7 @@ test_memzone_reserve_memory_with_smallest_offset(void)
 			/* set new smallest */
 			min_ms = ms;
 		}
-		else if (prev_min_ms == NULL) {
+		else if ((prev_min_ms == NULL) || (prev_min_ms->len > ms->len)) {
 			prev_min_ms = ms;
 		}
 	}
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-dev] [PATCH v3 14/14] Fix the compiling of test-pmd on IBM Power Architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (12 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 13/14] test_memzone:fix finding the second smallest segment Chao Zhu
@ 2014-11-24  1:22 ` Chao Zhu
  2014-11-24 15:05 ` [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture David Marchand
  14 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-24  1:22 UTC (permalink / raw)
  To: dev

This patch fixes compiling problems on the IBM Power architecture and
turns on the test-pmd compiling option in the configuration file. This
is, in effect, a big-endian compiling fix.

Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
---
 app/test-pmd/config.c          |   33 +++++++++++++++++++--------------
 config/common_linuxapp_powerpc |    6 +++---
 2 files changed, 22 insertions(+), 17 deletions(-)

diff --git a/app/test-pmd/config.c b/app/test-pmd/config.c
index 9bc08f4..ba26da1 100644
--- a/app/test-pmd/config.c
+++ b/app/test-pmd/config.c
@@ -612,8 +612,13 @@ ring_dma_zone_lookup(const char *ring_name, uint8_t port_id, uint16_t q_id)
 union igb_ring_dword {
 	uint64_t dword;
 	struct {
+#ifdef RTE_ARCH_BIG_ENDIAN
+		uint32_t lo;
+		uint32_t hi;
+#else
 		uint32_t hi;
 		uint32_t lo;
+#endif
 	} words;
 };
 
@@ -656,23 +661,23 @@ ring_rx_descriptor_display(const struct rte_memzone *ring_mz,
 		/* 32 bytes RX descriptor, i40e only */
 		struct igb_ring_desc_32_bytes *ring =
 			(struct igb_ring_desc_32_bytes *)ring_mz->addr;
+		ring[desc_id].lo_dword.dword = rte_le_to_cpu_64(ring[desc_id].lo_dword.dword);
+		ring_rxd_display_dword(ring[desc_id].lo_dword);
+		ring[desc_id].hi_dword.dword = rte_le_to_cpu_64(ring[desc_id].hi_dword.dword);
+		ring_rxd_display_dword(ring[desc_id].hi_dword);
+		ring[desc_id].resv1.dword = rte_le_to_cpu_64(ring[desc_id].resv1.dword);
+		ring_rxd_display_dword(ring[desc_id].resv1);
+		ring[desc_id].resv2.dword = rte_le_to_cpu_64(ring[desc_id].resv2.dword);
+		ring_rxd_display_dword(ring[desc_id].resv2);
 
-		ring_rxd_display_dword(rte_le_to_cpu_64(
-				ring[desc_id].lo_dword));
-		ring_rxd_display_dword(rte_le_to_cpu_64(
-				ring[desc_id].hi_dword));
-		ring_rxd_display_dword(rte_le_to_cpu_64(
-				ring[desc_id].resv1));
-		ring_rxd_display_dword(rte_le_to_cpu_64(
-				ring[desc_id].resv2));
 		return;
 	}
 #endif
 	/* 16 bytes RX descriptor */
-	ring_rxd_display_dword(rte_le_to_cpu_64(
-			ring[desc_id].lo_dword));
-	ring_rxd_display_dword(rte_le_to_cpu_64(
-			ring[desc_id].hi_dword));
+	ring[desc_id].lo_dword.dword = rte_le_to_cpu_64(ring[desc_id].lo_dword.dword);
+	ring_rxd_display_dword(ring[desc_id].lo_dword);
+	ring[desc_id].hi_dword.dword = rte_le_to_cpu_64(ring[desc_id].hi_dword.dword);
+	ring_rxd_display_dword(ring[desc_id].hi_dword);
 }
 
 static void
@@ -682,8 +687,8 @@ ring_tx_descriptor_display(const struct rte_memzone *ring_mz, uint16_t desc_id)
 	struct igb_ring_desc_16_bytes txd;
 
 	ring = (struct igb_ring_desc_16_bytes *)ring_mz->addr;
-	txd.lo_dword = rte_le_to_cpu_64(ring[desc_id].lo_dword);
-	txd.hi_dword = rte_le_to_cpu_64(ring[desc_id].hi_dword);
+	txd.lo_dword.dword = rte_le_to_cpu_64(ring[desc_id].lo_dword.dword);
+	txd.hi_dword.dword = rte_le_to_cpu_64(ring[desc_id].hi_dword.dword);
 	printf("    0x%08X - 0x%08X / 0x%08X - 0x%08X\n",
 			(unsigned)txd.lo_dword.words.lo,
 			(unsigned)txd.lo_dword.words.hi,
diff --git a/config/common_linuxapp_powerpc b/config/common_linuxapp_powerpc
index d230a0b..68f1b6b 100644
--- a/config/common_linuxapp_powerpc
+++ b/config/common_linuxapp_powerpc
@@ -146,8 +146,8 @@ CONFIG_RTE_NIC_BYPASS=n
 #
 # Compile burst-oriented IGB & EM PMD drivers
 #
-CONFIG_RTE_LIBRTE_EM_PMD=n
-CONFIG_RTE_LIBRTE_IGB_PMD=n
+CONFIG_RTE_LIBRTE_EM_PMD=y
+CONFIG_RTE_LIBRTE_IGB_PMD=y
 CONFIG_RTE_LIBRTE_E1000_DEBUG_INIT=n
 CONFIG_RTE_LIBRTE_E1000_DEBUG_RX=n
 CONFIG_RTE_LIBRTE_E1000_DEBUG_TX=n
@@ -389,6 +389,6 @@ CONFIG_RTE_APP_TEST=y
 #
 # Compile the PMD test application
 #
-CONFIG_RTE_TEST_PMD=n
+CONFIG_RTE_TEST_PMD=y
 CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=n
 CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
-- 
1.7.1

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 03/14] Add byte order operations for IBM Power architecture
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 03/14] Add byte order " Chao Zhu
@ 2014-11-24  8:11   ` Qiu, Michael
  2014-11-26  2:35     ` Chao Zhu
  0 siblings, 1 reply; 31+ messages in thread
From: Qiu, Michael @ 2014-11-24  8:11 UTC (permalink / raw)
  To: Chao Zhu, dev

On 11/23/2014 9:22 PM, Chao Zhu wrote:
> This patch adds architecture-specific byte order operations for the IBM
> Power architecture. The Power architecture supports both big endian and
> little endian. This patch also adds an RTE_ARCH_BIG_ENDIAN macro.
>
> Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
> ---
>  config/defconfig_ppc_64-power8-linuxapp-gcc        |    1 +
>  .../common/include/arch/ppc_64/rte_byteorder.h     |  150 ++++++++++++++++++++
>  2 files changed, 151 insertions(+), 0 deletions(-)
>  create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
>
> diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
> index 97d72ff..b10f60c 100644
> --- a/config/defconfig_ppc_64-power8-linuxapp-gcc
> +++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
> @@ -34,6 +34,7 @@ CONFIG_RTE_MACHINE="power8"
>  
>  CONFIG_RTE_ARCH="ppc_64"
>  CONFIG_RTE_ARCH_PPC_64=y
> +CONFIG_RTE_ARCH_BIG_ENDIAN=y

Does this mean the default is big endian? If I run it in little-endian
mode, do I need to change it manually?
>  
>  CONFIG_RTE_TOOLCHAIN="gcc"
>  CONFIG_RTE_TOOLCHAIN_GCC=y
> diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h b/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
> new file mode 100644
> index 0000000..a593e8a
> --- /dev/null
> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
> @@ -0,0 +1,150 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IBM Corporation 2014.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of IBM Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +*/
> +
> +/* Inspired from FreeBSD src/sys/powerpc/include/endian.h
> + * Copyright (c) 1987, 1991, 1993
> + * The Regents of the University of California.  All rights reserved.
> +*/
> +
> +#ifndef _RTE_BYTEORDER_PPC_64_H_
> +#define _RTE_BYTEORDER_PPC_64_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include "generic/rte_byteorder.h"
> +
> +/*
> + * An architecture-optimized byte swap for a 16-bit value.
> + *
> + * Do not use this function directly. The preferred function is rte_bswap16().
> + */
> +static inline uint16_t rte_arch_bswap16(uint16_t _x)
> +{
> +	return ((_x >> 8) | ((_x << 8) & 0xff00));
> +}
> +
> +/*
> + * An architecture-optimized byte swap for a 32-bit value.
> + *
> + * Do not use this function directly. The preferred function is rte_bswap32().
> + */
> +static inline uint32_t rte_arch_bswap32(uint32_t _x)
> +{
> +	return ((_x >> 24) | ((_x >> 8) & 0xff00) | ((_x << 8) & 0xff0000) |
> +		((_x << 24) & 0xff000000));
> +}
> +
> +/*
> + * An architecture-optimized byte swap for a 64-bit value.
> + *
> + * Do not use this function directly. The preferred function is rte_bswap64().
> + */
> +/* 64-bit mode */
> +static inline uint64_t rte_arch_bswap64(uint64_t _x)
> +{
> +	return ((_x >> 56) | ((_x >> 40) & 0xff00) | ((_x >> 24) & 0xff0000) |
> +		((_x >> 8) & 0xff000000) | ((_x << 8) & (0xffULL << 32)) |
> +		((_x << 24) & (0xffULL << 40)) |
> +		((_x << 40) & (0xffULL << 48)) | ((_x << 56)));
> +}
> +
> +#ifndef RTE_FORCE_INTRINSICS
> +#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
> +				   rte_constant_bswap16(x) :		\
> +				   rte_arch_bswap16(x)))
> +
> +#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
> +				   rte_constant_bswap32(x) :		\
> +				   rte_arch_bswap32(x)))
> +
> +#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
> +				   rte_constant_bswap64(x) :		\
> +				   rte_arch_bswap64(x)))
> +#else
> +/*
> + * __builtin_bswap16 is only available gcc 4.8 and upwards
> + */
> +#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
> +#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
> +				   rte_constant_bswap16(x) :		\
> +				   rte_arch_bswap16(x)))
> +#endif
> +#endif
> +
> +/* Power 8 supports both little endian and big endian modes;
> + * Power 7 only supports big endian.

Are you sure about this? What I've heard is that all Power CPUs (at least
Power7 and Power8) support both, but I haven't checked the spec.
> + */
> +#ifndef RTE_ARCH_BIG_ENDIAN
> +
> +#define rte_cpu_to_le_16(x) (x)
> +#define rte_cpu_to_le_32(x) (x)
> +#define rte_cpu_to_le_64(x) (x)
> +
> +#define rte_cpu_to_be_16(x) rte_bswap16(x)
> +#define rte_cpu_to_be_32(x) rte_bswap32(x)
> +#define rte_cpu_to_be_64(x) rte_bswap64(x)
> +
> +#define rte_le_to_cpu_16(x) (x)
> +#define rte_le_to_cpu_32(x) (x)
> +#define rte_le_to_cpu_64(x) (x)
> +
> +#define rte_be_to_cpu_16(x) rte_bswap16(x)
> +#define rte_be_to_cpu_32(x) rte_bswap32(x)
> +#define rte_be_to_cpu_64(x) rte_bswap64(x)
> +
> +#else
> +
> +#define rte_cpu_to_le_16(x) rte_bswap16(x)
> +#define rte_cpu_to_le_32(x) rte_bswap32(x)
> +#define rte_cpu_to_le_64(x) rte_bswap64(x)
> +
> +#define rte_cpu_to_be_16(x) (x)
> +#define rte_cpu_to_be_32(x) (x)
> +#define rte_cpu_to_be_64(x) (x)
> +
> +#define rte_le_to_cpu_16(x) rte_bswap16(x)
> +#define rte_le_to_cpu_32(x) rte_bswap32(x)
> +#define rte_le_to_cpu_64(x) rte_bswap64(x)
> +
> +#define rte_be_to_cpu_16(x) (x)
> +#define rte_be_to_cpu_32(x) (x)
> +#define rte_be_to_cpu_64(x) (x)
> +#endif
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_BYTEORDER_PPC_64_H_ */
> +


^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 08/14] Add CPU flag checking for IBM Power architecture
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 08/14] Add CPU flag checking " Chao Zhu
@ 2014-11-24 14:14   ` Neil Horman
  2014-11-25  3:27     ` Chao Zhu
  0 siblings, 1 reply; 31+ messages in thread
From: Neil Horman @ 2014-11-24 14:14 UTC (permalink / raw)
  To: Chao Zhu; +Cc: dev

On Sun, Nov 23, 2014 at 08:22:16PM -0500, Chao Zhu wrote:
> IBM Power processors don't have CPU flag hardware registers. This patch
> uses the auxiliary vector software register to get CPU flags and adds
> CPU flag checking support for the IBM Power architecture.
> 
> Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
> ---
>  app/test/test_cpuflags.c                           |   35 ++++
>  .../common/include/arch/ppc_64/rte_cpuflags.h      |  184 ++++++++++++++++++++
>  mk/rte.cpuflags.mk                                 |   17 ++
>  3 files changed, 236 insertions(+), 0 deletions(-)
>  create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
> 
> diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
> index 82c0197..5aeba5d 100644
> --- a/app/test/test_cpuflags.c
> +++ b/app/test/test_cpuflags.c
> @@ -80,6 +80,40 @@ test_cpuflags(void)
>  	int result;
>  	printf("\nChecking for flags from different registers...\n");
>  
> +#ifdef RTE_ARCH_PPC_64
> +	printf("Check for PPC64:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_PPC64);
> +
> +	printf("Check for PPC32:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_PPC32);
> +
> +	printf("Check for VSX:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_VSX);
> +
> +	printf("Check for DFP:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_DFP);
> +
> +	printf("Check for FPU:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_FPU);
> +
> +	printf("Check for SMT:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_SMT);
> +
> +	printf("Check for MMU:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_MMU);
> +
> +	printf("Check for ALTIVEC:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_ALTIVEC);
> +
> +	printf("Check for ARCH_2_06:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_ARCH_2_06);
> +
> +	printf("Check for ARCH_2_07:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_ARCH_2_07);
> +
> +	printf("Check for ICACHE_SNOOP:\t\t");
> +	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
> +#else
>  	printf("Check for SSE:\t\t");
>  	CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
>  
> @@ -117,6 +151,7 @@ test_cpuflags(void)
>  	CHECK_FOR_FLAG(RTE_CPUFLAG_INVTSC);
>  
>  
> +#endif
>  
>  	/*
>  	 * Check if invalid data is handled properly
> diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h b/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
> new file mode 100644
> index 0000000..6b38f1c
> --- /dev/null
> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
> @@ -0,0 +1,184 @@
> +/*
> + *   BSD LICENSE
> + *
> + *   Copyright (C) IBM Corporation 2014.
> + *
> + *   Redistribution and use in source and binary forms, with or without
> + *   modification, are permitted provided that the following conditions
> + *   are met:
> + *
> + *     * Redistributions of source code must retain the above copyright
> + *       notice, this list of conditions and the following disclaimer.
> + *     * Redistributions in binary form must reproduce the above copyright
> + *       notice, this list of conditions and the following disclaimer in
> + *       the documentation and/or other materials provided with the
> + *       distribution.
> + *     * Neither the name of IBM Corporation nor the names of its
> + *       contributors may be used to endorse or promote products derived
> + *       from this software without specific prior written permission.
> + *
> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +*/
> +
> +#ifndef _RTE_CPUFLAGS_PPC_64_H_
> +#define _RTE_CPUFLAGS_PPC_64_H_
> +
> +#ifdef __cplusplus
> +extern "C" {
> +#endif
> +
> +#include <elf.h>
> +#include <fcntl.h>
> +#include <assert.h>
> +#include <unistd.h>
> +
> +#include "generic/rte_cpuflags.h"
> +
> +/* Symbolic values for the entries in the auxiliary table */
> +#define AT_HWCAP  16
> +#define AT_HWCAP2 26
> +
> +/* software based registers */
> +enum cpu_register_t {
> +	REG_HWCAP = 0,
> +	REG_HWCAP2,
> +};
> +
> +/**
> + * Enumeration of all CPU features supported
> + */
> +enum rte_cpu_flag_t {
> +	RTE_CPUFLAG_PPC_LE = 0,
> +	RTE_CPUFLAG_TRUE_LE,
> +	RTE_CPUFLAG_PSERIES_PERFMON_COMPAT,
> +	RTE_CPUFLAG_VSX,
> +	RTE_CPUFLAG_ARCH_2_06,
> +	RTE_CPUFLAG_POWER6_EXT,
> +	RTE_CPUFLAG_DFP,
> +	RTE_CPUFLAG_PA6T,
> +	RTE_CPUFLAG_ARCH_2_05,
> +	RTE_CPUFLAG_ICACHE_SNOOP,
> +	RTE_CPUFLAG_SMT,
> +	RTE_CPUFLAG_BOOKE,
> +	RTE_CPUFLAG_CELLBE,
> +	RTE_CPUFLAG_POWER5_PLUS,
> +	RTE_CPUFLAG_POWER5,
> +	RTE_CPUFLAG_POWER4,
> +	RTE_CPUFLAG_NOTB,
> +	RTE_CPUFLAG_EFP_DOUBLE,
> +	RTE_CPUFLAG_EFP_SINGLE,
> +	RTE_CPUFLAG_SPE,
> +	RTE_CPUFLAG_UNIFIED_CACHE,
> +	RTE_CPUFLAG_4xxMAC,
> +	RTE_CPUFLAG_MMU,
> +	RTE_CPUFLAG_FPU,
> +	RTE_CPUFLAG_ALTIVEC,
> +	RTE_CPUFLAG_PPC601,
> +	RTE_CPUFLAG_PPC64,
> +	RTE_CPUFLAG_PPC32,
> +	RTE_CPUFLAG_TAR,
> +	RTE_CPUFLAG_LSEL,
> +	RTE_CPUFLAG_EBB,
> +	RTE_CPUFLAG_DSCR,
> +	RTE_CPUFLAG_HTM,
> +	RTE_CPUFLAG_ARCH_2_07,
> +	/* The last item */
> +	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
> +};
> +
> +static const struct feature_entry cpu_feature_table[] = {
> +	FEAT_DEF(PPC_LE, 0x00000001, 0, REG_HWCAP,  0)
> +	FEAT_DEF(TRUE_LE, 0x00000001, 0, REG_HWCAP,  1)
> +	FEAT_DEF(PSERIES_PERFMON_COMPAT, 0x00000001, 0, REG_HWCAP,  6)
> +	FEAT_DEF(VSX, 0x00000001, 0, REG_HWCAP,  7)
> +	FEAT_DEF(ARCH_2_06, 0x00000001, 0, REG_HWCAP,  8)
> +	FEAT_DEF(POWER6_EXT, 0x00000001, 0, REG_HWCAP,  9)
> +	FEAT_DEF(DFP, 0x00000001, 0, REG_HWCAP,  10)
> +	FEAT_DEF(PA6T, 0x00000001, 0, REG_HWCAP,  11)
> +	FEAT_DEF(ARCH_2_05, 0x00000001, 0, REG_HWCAP,  12)
> +	FEAT_DEF(ICACHE_SNOOP, 0x00000001, 0, REG_HWCAP,  13)
> +	FEAT_DEF(SMT, 0x00000001, 0, REG_HWCAP,  14)
> +	FEAT_DEF(BOOKE, 0x00000001, 0, REG_HWCAP,  15)
> +	FEAT_DEF(CELLBE, 0x00000001, 0, REG_HWCAP,  16)
> +	FEAT_DEF(POWER5_PLUS, 0x00000001, 0, REG_HWCAP,  17)
> +	FEAT_DEF(POWER5, 0x00000001, 0, REG_HWCAP,  18)
> +	FEAT_DEF(POWER4, 0x00000001, 0, REG_HWCAP,  19)
> +	FEAT_DEF(NOTB, 0x00000001, 0, REG_HWCAP,  20)
> +	FEAT_DEF(EFP_DOUBLE, 0x00000001, 0, REG_HWCAP,  21)
> +	FEAT_DEF(EFP_SINGLE, 0x00000001, 0, REG_HWCAP,  22)
> +	FEAT_DEF(SPE, 0x00000001, 0, REG_HWCAP,  23)
> +	FEAT_DEF(UNIFIED_CACHE, 0x00000001, 0, REG_HWCAP,  24)
> +	FEAT_DEF(4xxMAC, 0x00000001, 0, REG_HWCAP,  25)
> +	FEAT_DEF(MMU, 0x00000001, 0, REG_HWCAP,  26)
> +	FEAT_DEF(FPU, 0x00000001, 0, REG_HWCAP,  27)
> +	FEAT_DEF(ALTIVEC, 0x00000001, 0, REG_HWCAP,  28)
> +	FEAT_DEF(PPC601, 0x00000001, 0, REG_HWCAP,  29)
> +	FEAT_DEF(PPC64, 0x00000001, 0, REG_HWCAP,  30)
> +	FEAT_DEF(PPC32, 0x00000001, 0, REG_HWCAP,  31)
> +	FEAT_DEF(TAR, 0x00000001, 0, REG_HWCAP2,  26)
> +	FEAT_DEF(LSEL, 0x00000001, 0, REG_HWCAP2,  27)
> +	FEAT_DEF(EBB, 0x00000001, 0, REG_HWCAP2,  28)
> +	FEAT_DEF(DSCR, 0x00000001, 0, REG_HWCAP2,  29)
> +	FEAT_DEF(HTM, 0x00000001, 0, REG_HWCAP2,  30)
> +	FEAT_DEF(ARCH_2_07, 0x00000001, 0, REG_HWCAP2,  31)
> +};
> +
> +/*
> + * Read AUXV software register and get cpu features for Power
> + */
> +static inline void
> +rte_cpu_get_features(__attribute__((unused)) uint32_t leaf,
> +	__attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
> +{
> +	int auxv_fd;
> +	Elf64_auxv_t auxv;
> +
> +	auxv_fd = open("/proc/self/auxv", O_RDONLY);
> +	assert(auxv_fd != -1);
> +	while (read(auxv_fd, &auxv, sizeof(Elf64_auxv_t)) ==
> +			sizeof(Elf64_auxv_t)) {
> +		if (auxv.a_type == AT_HWCAP)
> +			out[REG_HWCAP] = auxv.a_un.a_val;
> +		else if (auxv.a_type == AT_HWCAP2)
> +			out[REG_HWCAP2] = auxv.a_un.a_val;
> +	}
> +	close(auxv_fd);
> +}
> +
> +/*
> + * Checks if a particular flag is available on current machine.
> + */
> +static inline int
> +rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
> +{
> +	const struct feature_entry *feat;
> +	cpuid_registers_t regs={0};
> +
> +	if (feature >= RTE_CPUFLAG_NUMFLAGS)
> +		/* Flag does not match anything in the feature tables */
> +		return -ENOENT;
> +
> +	feat = &cpu_feature_table[feature];
> +
> +	if (!feat->leaf)
> +		/* This entry in the table wasn't filled out! */
> +		return -EFAULT;
> +
> +	/* get the cpuid leaf containing the desired feature */
> +	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
> +
> +	/* check if the feature is enabled */
> +	return (regs[feat->reg] >> feat->bit) & 1;
> +}
> +
> +#ifdef __cplusplus
> +}
> +#endif
> +
> +#endif /* _RTE_CPUFLAGS_PPC_64_H_ */
> diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
> index 65332e1..f595cd0 100644
> --- a/mk/rte.cpuflags.mk
> +++ b/mk/rte.cpuflags.mk
> @@ -89,6 +89,23 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__AVX2__),)
>  CPUFLAGS += AVX2
>  endif
>  
> +# IBM Power CPU flags
> +ifneq ($(filter $(AUTO_CPUFLAGS),__PPC64__),)
> +CPUFLAGS += PPC64
> +endif
> +
> +ifneq ($(filter $(AUTO_CPUFLAGS),__PPC32__),)
> +CPUFLAGS += PPC32
> +endif
> +
> +ifneq ($(filter $(AUTO_CPUFLAGS),__vector),)
> +CPUFLAGS += ALTIVEC
> +endif
> +
> +ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
> +CPUFLAGS += VSX
> +endif
> +
>  MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
>  
>  # To strip whitespace
> -- 
> 1.7.1
> 
> 

Something occurs to me with this patch.  rte_cpu_get_flag_enabled is a public
API call.  Internally and externally we might use this call for checking cpu
support (rte_acl_init is an example).  Because the API call accepts an
rte_cpu_flag_t type as an input, all the enumerated values need to be defined
all the time, or we will get build breakage (i.e. with this patch above, I expect
you never compiled the ACL library, as RTE_CPUFLAG_SSE4_1 wouldn't be defined,
and you would get build breakage).  What we probably need to do is merge the
cpuflags into a single enumeration that is available for all arches.

Neil
 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture
  2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
                   ` (13 preceding siblings ...)
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 14/14] Fix the compiling of test-pmd on IBM Power Architecture Chao Zhu
@ 2014-11-24 15:05 ` David Marchand
  2014-11-24 15:49   ` chaozhu
  2014-11-25  2:49   ` Chao Zhu
  14 siblings, 2 replies; 31+ messages in thread
From: David Marchand @ 2014-11-24 15:05 UTC (permalink / raw)
  To: Chao Zhu; +Cc: dev

Hello Chao,

On Mon, Nov 24, 2014 at 2:22 AM, Chao Zhu <chaozhu@linux.vnet.ibm.com>
wrote:

> This set of patches adds IBM Power architecture support to DPDK. It adds
> the required support to the EAL library. This set of patches doesn't
> support full DPDK function on Power processors, so a separate common
> configuration file is used for Power to turn off some un-migrated
> functions. To compile on the PPC64 architecture, GCC version >= 4.8 must
> be used. This v3 patch updates eal_memory.c to fix the memory zone
> allocation and also solves the compiling problems of test-pmd.
>

Please run checkpatch on this patchset.
There are some issues.

Thanks.

-- 
David Marchand

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 12/14] Add eal memory support for IBM Power Architecture
  2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 12/14] Add eal memory support for IBM Power Architecture Chao Zhu
@ 2014-11-24 15:17   ` David Marchand
  2014-11-24 15:18     ` [dpdk-dev] [PATCH] eal: fix remaining checks for other 64bits architectures David Marchand
  0 siblings, 1 reply; 31+ messages in thread
From: David Marchand @ 2014-11-24 15:17 UTC (permalink / raw)
  To: Chao Zhu; +Cc: dev

Chao,

I think there are two remaining issues, in
lib/librte_eal/linuxapp/eal/eal.c
and lib/librte_eal/linuxapp/eal/eal_hugepage_info.c.
I will send a patch in reply to this patch.
I think it can be integrated into your patchset.

Thanks.

-- 
David Marchand

On Mon, Nov 24, 2014 at 2:22 AM, Chao Zhu <chaozhu@linux.vnet.ibm.com>
wrote:

> The mmap of hugepage files on IBM Power starts from high address to low
> address. This is different from x86. This patch modifies the memory
> segment detection code to get the correct memory segment layout on the
> Power architecture. This patch also adds a common RTE_ARCH_64 definition
> for 64-bit systems.
>
> Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
> ---
>  config/defconfig_ppc_64-power8-linuxapp-gcc   |    1 +
>  config/defconfig_x86_64-native-linuxapp-clang |    1 +
>  config/defconfig_x86_64-native-linuxapp-gcc   |    1 +
>  config/defconfig_x86_64-native-linuxapp-icc   |    1 +
>  lib/librte_eal/linuxapp/eal/eal_memory.c      |   75 ++++++++++++++++++-------
>  5 files changed, 59 insertions(+), 20 deletions(-)
>
> diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc
> b/config/defconfig_ppc_64-power8-linuxapp-gcc
> index b10f60c..23a5591 100644
> --- a/config/defconfig_ppc_64-power8-linuxapp-gcc
> +++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
> @@ -35,6 +35,7 @@ CONFIG_RTE_MACHINE="power8"
>  CONFIG_RTE_ARCH="ppc_64"
>  CONFIG_RTE_ARCH_PPC_64=y
>  CONFIG_RTE_ARCH_BIG_ENDIAN=y
> +CONFIG_RTE_ARCH_64=y
>
>  CONFIG_RTE_TOOLCHAIN="gcc"
>  CONFIG_RTE_TOOLCHAIN_GCC=y
> diff --git a/config/defconfig_x86_64-native-linuxapp-clang
> b/config/defconfig_x86_64-native-linuxapp-clang
> index bbda080..5f3074e 100644
> --- a/config/defconfig_x86_64-native-linuxapp-clang
> +++ b/config/defconfig_x86_64-native-linuxapp-clang
> @@ -36,6 +36,7 @@ CONFIG_RTE_MACHINE="native"
>
>  CONFIG_RTE_ARCH="x86_64"
>  CONFIG_RTE_ARCH_X86_64=y
> +CONFIG_RTE_ARCH_64=y
>
>  CONFIG_RTE_TOOLCHAIN="clang"
>  CONFIG_RTE_TOOLCHAIN_CLANG=y
> diff --git a/config/defconfig_x86_64-native-linuxapp-gcc
> b/config/defconfig_x86_64-native-linuxapp-gcc
> index 3de818a..60baf5b 100644
> --- a/config/defconfig_x86_64-native-linuxapp-gcc
> +++ b/config/defconfig_x86_64-native-linuxapp-gcc
> @@ -36,6 +36,7 @@ CONFIG_RTE_MACHINE="native"
>
>  CONFIG_RTE_ARCH="x86_64"
>  CONFIG_RTE_ARCH_X86_64=y
> +CONFIG_RTE_ARCH_64=y
>
>  CONFIG_RTE_TOOLCHAIN="gcc"
>  CONFIG_RTE_TOOLCHAIN_GCC=y
> diff --git a/config/defconfig_x86_64-native-linuxapp-icc
> b/config/defconfig_x86_64-native-linuxapp-icc
> index 795333b..71d1e28 100644
> --- a/config/defconfig_x86_64-native-linuxapp-icc
> +++ b/config/defconfig_x86_64-native-linuxapp-icc
> @@ -36,6 +36,7 @@ CONFIG_RTE_MACHINE="native"
>
>  CONFIG_RTE_ARCH="x86_64"
>  CONFIG_RTE_ARCH_X86_64=y
> +CONFIG_RTE_ARCH_64=y
>
>  CONFIG_RTE_TOOLCHAIN="icc"
>  CONFIG_RTE_TOOLCHAIN_ICC=y
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c
> b/lib/librte_eal/linuxapp/eal/eal_memory.c
> index f2454f4..a8e7421 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memory.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
> @@ -316,11 +316,11 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
>  #endif
>
> hugepg_tbl[i].filepath[sizeof(hugepg_tbl[i].filepath) - 1] = '\0';
>                 }
> -#ifndef RTE_ARCH_X86_64
> -               /* for 32-bit systems, don't remap 1G pages, just reuse original
> +#ifndef RTE_ARCH_64
> +               /* for 32-bit systems, don't remap 1G and 16G pages, just reuse original
>                  * map address as final map address.
>                  */
> -               else if (hugepage_sz == RTE_PGSIZE_1G){
> +               else if ((hugepage_sz == RTE_PGSIZE_1G) || (hugepage_sz == RTE_PGSIZE_16G)){
>                         hugepg_tbl[i].final_va = hugepg_tbl[i].orig_va;
>                         hugepg_tbl[i].orig_va = NULL;
>                         continue;
> @@ -335,9 +335,16 @@ map_all_hugepages(struct hugepage_file *hugepg_tbl,
>                          * physical block: count the number of
>                          * contiguous physical pages. */
>                         for (j = i+1; j < hpi->num_pages[0] ; j++) {
> +#ifdef RTE_ARCH_PPC_64
> +                               /* The physical addresses are sorted in descending order on PPC64 */
> +                               if (hugepg_tbl[j].physaddr !=
> +                                   hugepg_tbl[j-1].physaddr - hugepage_sz)
> +                                       break;
> +#else
>                                 if (hugepg_tbl[j].physaddr !=
>                                     hugepg_tbl[j-1].physaddr + hugepage_sz)
>                                         break;
> +#endif
>                         }
>                         num_pages = j - i;
>                         vma_len = num_pages * hugepage_sz;
> @@ -412,11 +419,11 @@ remap_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
>
>         while (i < hpi->num_pages[0]) {
>
> -#ifndef RTE_ARCH_X86_64
> -               /* for 32-bit systems, don't remap 1G pages, just reuse original
> +#ifndef RTE_ARCH_64
> +               /* for 32-bit systems, don't remap 1G pages and 16G pages, just reuse original
>                  * map address as final map address.
>                  */
> -               if (hugepage_sz == RTE_PGSIZE_1G){
> +               if ((hugepage_sz == RTE_PGSIZE_1G) || (hugepage_sz == RTE_PGSIZE_16G)){
>                         hugepg_tbl[i].final_va = hugepg_tbl[i].orig_va;
>                         hugepg_tbl[i].orig_va = NULL;
>                         i++;
> @@ -428,9 +435,15 @@ remap_all_hugepages(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
>                  * physical block: count the number of
>                  * contiguous physical pages. */
>                 for (j = i+1; j < hpi->num_pages[0] ; j++) {
> +#ifdef RTE_ARCH_PPC_64
> +            /*  The physical addresses are sorted in descending order on
> PPC64 */
> +                       if (hugepg_tbl[j].physaddr !=
> hugepg_tbl[j-1].physaddr - hugepage_sz)
> +                               break;
> +#else
>                         if (hugepg_tbl[j].physaddr !=
> hugepg_tbl[j-1].physaddr + hugepage_sz)
>                                 break;
> -               }
> +#endif
> +       }
>                 num_pages = j - i;
>                 vma_len = num_pages * hugepage_sz;
>
> @@ -652,21 +665,21 @@ error:
>  }
>
>  /*
> - * Sort the hugepg_tbl by physical address (lower addresses first). We
> - * use a slow algorithm, but we won't have millions of pages, and this
> + * Sort the hugepg_tbl by physical address (lower addresses first on x86, higher address first
> + * on powerpc). We use a slow algorithm, but we won't have millions of pages, and this
>   * is only done at init time.
>   */
>  static int
>  sort_by_physaddr(struct hugepage_file *hugepg_tbl, struct hugepage_info *hpi)
>  {
>         unsigned i, j;
> -       int smallest_idx;
> -       uint64_t smallest_addr;
> +       int compare_idx;
> +       uint64_t compare_addr;
>         struct hugepage_file tmp;
>
>         for (i = 0; i < hpi->num_pages[0]; i++) {
> -               smallest_addr = 0;
> -               smallest_idx = -1;
> +               compare_addr = 0;
> +               compare_idx = -1;
>
>                 /*
>                  * browse all entries starting at 'i', and find the
> @@ -674,22 +687,26 @@ sort_by_physaddr(struct hugepage_file *hugepg_tbl,
> struct hugepage_info *hpi)
>                  */
>                 for (j=i; j< hpi->num_pages[0]; j++) {
>
> -                       if (smallest_addr == 0 ||
> -                           hugepg_tbl[j].physaddr < smallest_addr) {
> -                               smallest_addr = hugepg_tbl[j].physaddr;
> -                               smallest_idx = j;
> +                       if (compare_addr == 0 ||
> +#ifdef RTE_ARCH_PPC_64
> +                           hugepg_tbl[j].physaddr > compare_addr) {
> +#else
> +                           hugepg_tbl[j].physaddr < compare_addr) {
> +#endif
> +                compare_addr = hugepg_tbl[j].physaddr;
> +                compare_idx = j;
>                         }
>                 }
>
>                 /* should not happen */
> -               if (smallest_idx == -1) {
> +               if (compare_idx == -1) {
>                         RTE_LOG(ERR, EAL, "%s(): error in physaddr
> sorting\n", __func__);
>                         return -1;
>                 }
>
>                 /* swap the 2 entries in the table */
> -               memcpy(&tmp, &hugepg_tbl[smallest_idx], sizeof(struct
> hugepage_file));
> -               memcpy(&hugepg_tbl[smallest_idx], &hugepg_tbl[i],
> +               memcpy(&tmp, &hugepg_tbl[compare_idx], sizeof(struct
> hugepage_file));
> +               memcpy(&hugepg_tbl[compare_idx], &hugepg_tbl[i],
>                                 sizeof(struct hugepage_file));
>                 memcpy(&hugepg_tbl[i], &tmp, sizeof(struct hugepage_file));
>         }
> @@ -1260,12 +1277,24 @@ rte_eal_hugepage_init(void)
>                         new_memseg = 1;
>                 else if (hugepage[i].size != hugepage[i-1].size)
>                         new_memseg = 1;
> +
> +#ifdef RTE_ARCH_PPC_64
> +               /* On PPC64 architecture, the mmap always start from
> higher virtual address to lower address.
> +        * Here, both the physical address and virtual address are in
> descending order */
> +               else if ((hugepage[i-1].physaddr - hugepage[i].physaddr) !=
> +                   hugepage[i].size)
> +                       new_memseg = 1;
> +               else if (((unsigned long)hugepage[i-1].final_va -
> +                   (unsigned long)hugepage[i].final_va) !=
> hugepage[i].size)
> +                       new_memseg = 1;
> +#else
>                 else if ((hugepage[i].physaddr - hugepage[i-1].physaddr) !=
>                     hugepage[i].size)
>                         new_memseg = 1;
>                 else if (((unsigned long)hugepage[i].final_va -
>                     (unsigned long)hugepage[i-1].final_va) !=
> hugepage[i].size)
>                         new_memseg = 1;
> +#endif
>
>                 if (new_memseg) {
>                         j += 1;
> @@ -1284,6 +1313,12 @@ rte_eal_hugepage_init(void)
>                 }
>                 /* continuation of previous memseg */
>                 else {
> +#ifdef RTE_ARCH_PPC_64
> +               /* Use the phy and virt address of the last page as
> segment address
> +                 * for IBM Power architecture */
> +                       mcfg->memseg[j].phys_addr = hugepage[i].physaddr;
> +                       mcfg->memseg[j].addr = hugepage[i].final_va;
> +#endif
>                         mcfg->memseg[j].len += mcfg->memseg[j].hugepage_sz;
>                 }
>                 hugepage[i].memseg_id = j;
> --
> 1.7.1
>
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* [dpdk-dev] [PATCH] eal: fix remaining checks for other 64bits architectures
  2014-11-24 15:17   ` David Marchand
@ 2014-11-24 15:18     ` David Marchand
  2014-11-24 15:58       ` chaozhu
  0 siblings, 1 reply; 31+ messages in thread
From: David Marchand @ 2014-11-24 15:18 UTC (permalink / raw)
  To: chaozhu; +Cc: dev

RTE_ARCH_X86_64 cannot be used as a way to determine whether we are building for
64-bit CPUs. Instead, RTE_ARCH_64 should be used.

Signed-off-by: David Marchand <david.marchand@6wind.com>
---
 lib/librte_eal/linuxapp/eal/eal.c               |    2 +-
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c |    2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index f9517c7..e321524 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -450,7 +450,7 @@ eal_parse_base_virtaddr(const char *arg)
 		return -1;
 
 	/* make sure we don't exceed 32-bit boundary on 32-bit target */
-#ifndef RTE_ARCH_X86_64
+#ifndef RTE_ARCH_64
 	if (addr >= UINTPTR_MAX)
 		return -1;
 #endif
diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
index 73d1cdb..590cb56 100644
--- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
+++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
@@ -324,7 +324,7 @@ eal_hugepage_info_init(void)
 				 * later they will be sorted */
 				hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
 
-#ifndef RTE_ARCH_X86_64
+#ifndef RTE_ARCH_64
 				/* for 32-bit systems, limit number of hugepages to 1GB per page size */
 				hpi->num_pages[0] = RTE_MIN(hpi->num_pages[0],
 						RTE_PGSIZE_1G / hpi->hugepage_sz);
-- 
1.7.10.4

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture
  2014-11-24 15:05 ` [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture David Marchand
@ 2014-11-24 15:49   ` chaozhu
  2014-11-25  2:49   ` Chao Zhu
  1 sibling, 0 replies; 31+ messages in thread
From: chaozhu @ 2014-11-24 15:49 UTC (permalink / raw)
  To: David Marchand; +Cc: dev

David,

My email server just came back up. Sorry for the delay.
I'm running checkpatch and I'll send out the updates later.
Thanks!

Quoting David Marchand <david.marchand@6wind.com>:

> Hello Chao,
>
> On Mon, Nov 24, 2014 at 2:22 AM, Chao Zhu <chaozhu@linux.vnet.ibm.com>
> wrote:
>
>> The set of patches add IBM Power architecture to the DPDK. It adds the
>> required support to the
>> EAL library. This set of patches doesn't support full DPDK function on
>> Power processors. So a
>> separate common configuration file is used for Power to turn off some
>> un-migrated functions. To
>> compile on PPC64 architecture, GCC version >= 4.8 must be used. This v3
>> patch updates eal_memory.c
>> to fix the memory zone allocation and also solves the compiling problems
>> of test-pmd.
>>
>
> Please run a little checkpath on this patchset.
> There are some issues.
>
> Thanks.
>
> --
> David Marchand

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH] eal: fix remaining checks for other 64bits architectures
  2014-11-24 15:18     ` [dpdk-dev] [PATCH] eal: fix remaining checks for other 64bits architectures David Marchand
@ 2014-11-24 15:58       ` chaozhu
  2014-11-27  7:47         ` Thomas Monjalon
  0 siblings, 1 reply; 31+ messages in thread
From: chaozhu @ 2014-11-24 15:58 UTC (permalink / raw)
  To: David Marchand; +Cc: dev


Quoting David Marchand <david.marchand@6wind.com>:

> RTE_ARCH_X86_64 can not be used as a way to determine if we are building for
> 64bits cpus. Instead, RTE_ARCH_64 should be used.
>
> Signed-off-by: David Marchand <david.marchand@6wind.com>
> ---
>  lib/librte_eal/linuxapp/eal/eal.c               |    2 +-
>  lib/librte_eal/linuxapp/eal/eal_hugepage_info.c |    2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c  
> b/lib/librte_eal/linuxapp/eal/eal.c
> index f9517c7..e321524 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -450,7 +450,7 @@ eal_parse_base_virtaddr(const char *arg)
>  		return -1;
>
>  	/* make sure we don't exceed 32-bit boundary on 32-bit target */
> -#ifndef RTE_ARCH_X86_64
> +#ifndef RTE_ARCH_64
>  	if (addr >= UINTPTR_MAX)
>  		return -1;
>  #endif
> diff --git a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c  
> b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> index 73d1cdb..590cb56 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_hugepage_info.c
> @@ -324,7 +324,7 @@ eal_hugepage_info_init(void)
>  				 * later they will be sorted */
>  				hpi->num_pages[0] = get_num_hugepages(dirent->d_name);
>
> -#ifndef RTE_ARCH_X86_64
> +#ifndef RTE_ARCH_64
>  				/* for 32-bit systems, limit number of hugepages to 1GB per page size */
>  				hpi->num_pages[0] = RTE_MIN(hpi->num_pages[0],
>  						RTE_PGSIZE_1G / hpi->hugepage_sz);
> --
> 1.7.10.4
Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture
  2014-11-24 15:05 ` [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture David Marchand
  2014-11-24 15:49   ` chaozhu
@ 2014-11-25  2:49   ` Chao Zhu
  1 sibling, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-25  2:49 UTC (permalink / raw)
  To: David Marchand; +Cc: dev

David,

I submitted an updated patchset.
I fixed all of the checkpatch errors, except one, which I
think is a false positive.
Thanks a lot!

On 2014/11/24 23:05, David Marchand wrote:
> Hello Chao,
>
> On Mon, Nov 24, 2014 at 2:22 AM, Chao Zhu <chaozhu@linux.vnet.ibm.com 
> <mailto:chaozhu@linux.vnet.ibm.com>> wrote:
>
>     The set of patches add IBM Power architecture to the DPDK. It adds
>     the required support to the
>     EAL library. This set of patches doesn't support full DPDK
>     function on Power processors. So a
>     separate common configuration file is used for Power to turn off
>     some un-migrated functions. To
>     compile on PPC64 architecture, GCC version >= 4.8 must be used.
>     This v3 patch updates eal_memory.c
>     to fix the memory zone allocation and also solves the compiling
>     problems of test-pmd.
>
>
> Please run a little checkpath on this patchset.
> There are some issues.
>
> Thanks.
>
> -- 
> David Marchand

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 08/14] Add CPU flag checking for IBM Power architecture
  2014-11-24 14:14   ` Neil Horman
@ 2014-11-25  3:27     ` Chao Zhu
  2014-11-25 11:37       ` Neil Horman
  0 siblings, 1 reply; 31+ messages in thread
From: Chao Zhu @ 2014-11-25  3:27 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

Neil,

I didn't compile the ACL library on Power because SSE is not supported
there; that is why ACL compilation was turned off on Power.
rte_cpu_flag_t is an architecture-specific type: each architecture has
its own rte_cpu_flag_t, and the Power one has no influence on x86, so I
think there should be no build problem on x86. However, your suggestion
is very good. It would ease the effort of migrating from x86 to other
architectures. We probably need to do it later.

On 2014/11/24 22:14, Neil Horman wrote:
> On Sun, Nov 23, 2014 at 08:22:16PM -0500, Chao Zhu wrote:
>> IBM Power processor doesn't have CPU flag hardware registers. This patch
>> uses aux vector software register to get CPU flags and add CPU flag
>> checking support for IBM Power architecture.
>>
>> Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
>> ---
>>   app/test/test_cpuflags.c                           |   35 ++++
>>   .../common/include/arch/ppc_64/rte_cpuflags.h      |  184 ++++++++++++++++++++
>>   mk/rte.cpuflags.mk                                 |   17 ++
>>   3 files changed, 236 insertions(+), 0 deletions(-)
>>   create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
>>
>> diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
>> index 82c0197..5aeba5d 100644
>> --- a/app/test/test_cpuflags.c
>> +++ b/app/test/test_cpuflags.c
>> @@ -80,6 +80,40 @@ test_cpuflags(void)
>>   	int result;
>>   	printf("\nChecking for flags from different registers...\n");
>>   
>> +#ifdef RTE_ARCH_PPC_64
>> +	printf("Check for PPC64:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_PPC64);
>> +
>> +	printf("Check for PPC32:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_PPC32);
>> +
>> +	printf("Check for VSX:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_VSX);
>> +
>> +	printf("Check for DFP:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_DFP);
>> +
>> +	printf("Check for FPU:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_FPU);
>> +
>> +	printf("Check for SMT:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_SMT);
>> +
>> +	printf("Check for MMU:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_MMU);
>> +
>> +	printf("Check for ALTIVEC:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_ALTIVEC);
>> +
>> +	printf("Check for ARCH_2_06:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_ARCH_2_06);
>> +
>> +	printf("Check for ARCH_2_07:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_ARCH_2_07);
>> +
>> +	printf("Check for ICACHE_SNOOP:\t\t");
>> +	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
>> +#else
>>   	printf("Check for SSE:\t\t");
>>   	CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
>>   
>> @@ -117,6 +151,7 @@ test_cpuflags(void)
>>   	CHECK_FOR_FLAG(RTE_CPUFLAG_INVTSC);
>>   
>>   
>> +#endif
>>   
>>   	/*
>>   	 * Check if invalid data is handled properly
>> diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h b/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
>> new file mode 100644
>> index 0000000..6b38f1c
>> --- /dev/null
>> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
>> @@ -0,0 +1,184 @@
>> +/*
>> + *   BSD LICENSE
>> + *
>> + *   Copyright (C) IBM Corporation 2014.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of IBM Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> +*/
>> +
>> +#ifndef _RTE_CPUFLAGS_PPC_64_H_
>> +#define _RTE_CPUFLAGS_PPC_64_H_
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include <elf.h>
>> +#include <fcntl.h>
>> +#include <assert.h>
>> +#include <unistd.h>
>> +
>> +#include "generic/rte_cpuflags.h"
>> +
>> +/* Symbolic values for the entries in the auxiliary table */
>> +#define AT_HWCAP  16
>> +#define AT_HWCAP2 26
>> +
>> +/* software based registers */
>> +enum cpu_register_t {
>> +	REG_HWCAP = 0,
>> +	REG_HWCAP2,
>> +};
>> +
>> +/**
>> + * Enumeration of all CPU features supported
>> + */
>> +enum rte_cpu_flag_t {
>> +	RTE_CPUFLAG_PPC_LE = 0,
>> +	RTE_CPUFLAG_TRUE_LE,
>> +	RTE_CPUFLAG_PSERIES_PERFMON_COMPAT,
>> +	RTE_CPUFLAG_VSX,
>> +	RTE_CPUFLAG_ARCH_2_06,
>> +	RTE_CPUFLAG_POWER6_EXT,
>> +	RTE_CPUFLAG_DFP,
>> +	RTE_CPUFLAG_PA6T,
>> +	RTE_CPUFLAG_ARCH_2_05,
>> +	RTE_CPUFLAG_ICACHE_SNOOP,
>> +	RTE_CPUFLAG_SMT,
>> +	RTE_CPUFLAG_BOOKE,
>> +	RTE_CPUFLAG_CELLBE,
>> +	RTE_CPUFLAG_POWER5_PLUS,
>> +	RTE_CPUFLAG_POWER5,
>> +	RTE_CPUFLAG_POWER4,
>> +	RTE_CPUFLAG_NOTB,
>> +	RTE_CPUFLAG_EFP_DOUBLE,
>> +	RTE_CPUFLAG_EFP_SINGLE,
>> +	RTE_CPUFLAG_SPE,
>> +	RTE_CPUFLAG_UNIFIED_CACHE,
>> +	RTE_CPUFLAG_4xxMAC,
>> +	RTE_CPUFLAG_MMU,
>> +	RTE_CPUFLAG_FPU,
>> +	RTE_CPUFLAG_ALTIVEC,
>> +	RTE_CPUFLAG_PPC601,
>> +	RTE_CPUFLAG_PPC64,
>> +	RTE_CPUFLAG_PPC32,
>> +	RTE_CPUFLAG_TAR,
>> +	RTE_CPUFLAG_LSEL,
>> +	RTE_CPUFLAG_EBB,
>> +	RTE_CPUFLAG_DSCR,
>> +	RTE_CPUFLAG_HTM,
>> +	RTE_CPUFLAG_ARCH_2_07,
>> +	/* The last item */
>> +	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
>> +};
>> +
>> +static const struct feature_entry cpu_feature_table[] = {
>> +	FEAT_DEF(PPC_LE, 0x00000001, 0, REG_HWCAP,  0)
>> +	FEAT_DEF(TRUE_LE, 0x00000001, 0, REG_HWCAP,  1)
>> +	FEAT_DEF(PSERIES_PERFMON_COMPAT, 0x00000001, 0, REG_HWCAP,  6)
>> +	FEAT_DEF(VSX, 0x00000001, 0, REG_HWCAP,  7)
>> +	FEAT_DEF(ARCH_2_06, 0x00000001, 0, REG_HWCAP,  8)
>> +	FEAT_DEF(POWER6_EXT, 0x00000001, 0, REG_HWCAP,  9)
>> +	FEAT_DEF(DFP, 0x00000001, 0, REG_HWCAP,  10)
>> +	FEAT_DEF(PA6T, 0x00000001, 0, REG_HWCAP,  11)
>> +	FEAT_DEF(ARCH_2_05, 0x00000001, 0, REG_HWCAP,  12)
>> +	FEAT_DEF(ICACHE_SNOOP, 0x00000001, 0, REG_HWCAP,  13)
>> +	FEAT_DEF(SMT, 0x00000001, 0, REG_HWCAP,  14)
>> +	FEAT_DEF(BOOKE, 0x00000001, 0, REG_HWCAP,  15)
>> +	FEAT_DEF(CELLBE, 0x00000001, 0, REG_HWCAP,  16)
>> +	FEAT_DEF(POWER5_PLUS, 0x00000001, 0, REG_HWCAP,  17)
>> +	FEAT_DEF(POWER5, 0x00000001, 0, REG_HWCAP,  18)
>> +	FEAT_DEF(POWER4, 0x00000001, 0, REG_HWCAP,  19)
>> +	FEAT_DEF(NOTB, 0x00000001, 0, REG_HWCAP,  20)
>> +	FEAT_DEF(EFP_DOUBLE, 0x00000001, 0, REG_HWCAP,  21)
>> +	FEAT_DEF(EFP_SINGLE, 0x00000001, 0, REG_HWCAP,  22)
>> +	FEAT_DEF(SPE, 0x00000001, 0, REG_HWCAP,  23)
>> +	FEAT_DEF(UNIFIED_CACHE, 0x00000001, 0, REG_HWCAP,  24)
>> +	FEAT_DEF(4xxMAC, 0x00000001, 0, REG_HWCAP,  25)
>> +	FEAT_DEF(MMU, 0x00000001, 0, REG_HWCAP,  26)
>> +	FEAT_DEF(FPU, 0x00000001, 0, REG_HWCAP,  27)
>> +	FEAT_DEF(ALTIVEC, 0x00000001, 0, REG_HWCAP,  28)
>> +	FEAT_DEF(PPC601, 0x00000001, 0, REG_HWCAP,  29)
>> +	FEAT_DEF(PPC64, 0x00000001, 0, REG_HWCAP,  30)
>> +	FEAT_DEF(PPC32, 0x00000001, 0, REG_HWCAP,  31)
>> +	FEAT_DEF(TAR, 0x00000001, 0, REG_HWCAP2,  26)
>> +	FEAT_DEF(LSEL, 0x00000001, 0, REG_HWCAP2,  27)
>> +	FEAT_DEF(EBB, 0x00000001, 0, REG_HWCAP2,  28)
>> +	FEAT_DEF(DSCR, 0x00000001, 0, REG_HWCAP2,  29)
>> +	FEAT_DEF(HTM, 0x00000001, 0, REG_HWCAP2,  30)
>> +	FEAT_DEF(ARCH_2_07, 0x00000001, 0, REG_HWCAP2,  31)
>> +};
>> +
>> +/*
>> + * Read AUXV software register and get cpu features for Power
>> + */
>> +static inline void
>> +rte_cpu_get_features( __attribute__((unused)) uint32_t leaf, __attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
>> +{
>> +  int auxv_fd;
>> +  Elf64_auxv_t auxv;
>> +  auxv_fd = open("/proc/self/auxv", O_RDONLY);
>> +  assert(auxv_fd);
>> +  while (read(auxv_fd, &auxv, sizeof(Elf64_auxv_t))== sizeof(Elf64_auxv_t)) {
>> +    if (auxv.a_type == AT_HWCAP)
>> +      out[REG_HWCAP] = auxv.a_un.a_val;
>> +    else if (auxv.a_type == AT_HWCAP2)
>> +      out[REG_HWCAP2] = auxv.a_un.a_val;
>> +  }
>> +}
>> +
>> +/*
>> + * Checks if a particular flag is available on current machine.
>> + */
>> +static inline int
>> +rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
>> +{
>> +	const struct feature_entry *feat;
>> +	cpuid_registers_t regs={0};
>> +
>> +	if (feature >= RTE_CPUFLAG_NUMFLAGS)
>> +		/* Flag does not match anything in the feature tables */
>> +		return -ENOENT;
>> +
>> +	feat = &cpu_feature_table[feature];
>> +
>> +	if (!feat->leaf)
>> +		/* This entry in the table wasn't filled out! */
>> +		return -EFAULT;
>> +
>> +	/* get the cpuid leaf containing the desired feature */
>> +	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
>> +
>> +	/* check if the feature is enabled */
>> +	return (regs[feat->reg] >> feat->bit) & 1;
>> +}
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_CPUFLAGS_PPC_64_H_ */
>> diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
>> index 65332e1..f595cd0 100644
>> --- a/mk/rte.cpuflags.mk
>> +++ b/mk/rte.cpuflags.mk
>> @@ -89,6 +89,23 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__AVX2__),)
>>   CPUFLAGS += AVX2
>>   endif
>>   
>> +# IBM Power CPU flags
>> +ifneq ($(filter $(AUTO_CPUFLAGS),__PPC64__),)
>> +CPUFLAGS += PPC64
>> +endif
>> +
>> +ifneq ($(filter $(AUTO_CPUFLAGS),__PPC32__),)
>> +CPUFLAGS += PPC32
>> +endif
>> +
>> +ifneq ($(filter $(AUTO_CPUFLAGS),__vector),)
>> +CPUFLAGS += ALTIVEC
>> +endif
>> +
>> +ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
>> +CPUFLAGS += VSX
>> +endif
>> +
>>   MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
>>   
>>   # To strip whitespace
>> -- 
>> 1.7.1
>>
>>
> Something occurs to me with this patch.  rte_cpu_get_flag_enabled is a public
> API call.  Interally, and externally we might use this call for checking cpu
> support (rte_acl_init is an example).  Because the API call accepts an
> rte_cpu_flag_t type as an input, all the ennumerated values need to be defined
> all the time, or we will get build breakage (I.e. with this patch above, I exect
> you never compiled the ACL library, as RTE_CPUFLAG_SSE4_1 shouldn't be defined,
> and you would get build breakage).  What we probably need to do is merge the
> cpufalgs to a single enumeration that is available for all arches.
>
> Neil
>
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/14] Add compiling definations for IBM Power architecture
  2014-11-23 22:02   ` Neil Horman
@ 2014-11-25  3:51     ` Chao Zhu
  2014-11-25  8:44       ` Bruce Richardson
  0 siblings, 1 reply; 31+ messages in thread
From: Chao Zhu @ 2014-11-25  3:51 UTC (permalink / raw)
  To: Neil Horman; +Cc: dev

Neil,
The current Power-related patches are not fully functional: some of the
libraries have not been migrated yet, so common_linuxapp_powerpc is used
to turn off the unported parts. This file is a copy of common_linuxapp
and is intended to be removed once all of the libraries are migrated to
Power. common_linuxapp is currently the common file for Linux and other
OSes such as BSD. However, I didn't try compiling on BSD, but that
probably needs to be done.

On 2014/11/24 6:02, Neil Horman wrote:
> On Sun, Nov 23, 2014 at 08:22:09PM -0500, Chao Zhu wrote:
>> To make DPDK run on IBM Power architecture, configuration files for
>> Power architecuture are added. Also, the compiling related .mk files are
>> added.
>>
>> Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
>> ---
>>   config/common_linuxapp_powerpc              |  394 +++++++++++++++++++++++++++
>>   config/defconfig_ppc_64-power8-linuxapp-gcc |   40 +++
>>   mk/arch/ppc_64/rte.vars.mk                  |   39 +++
>>   mk/machine/power8/rte.vars.mk               |   57 ++++
>>   4 files changed, 530 insertions(+), 0 deletions(-)
>>   create mode 100644 config/common_linuxapp_powerpc
>>   create mode 100644 config/defconfig_ppc_64-power8-linuxapp-gcc
>>   create mode 100644 mk/arch/ppc_64/rte.vars.mk
>>   create mode 100644 mk/machine/power8/rte.vars.mk
>>
>> diff --git a/config/common_linuxapp_powerpc b/config/common_linuxapp_powerpc
>> new file mode 100644
>> index 0000000..d230a0b
>> --- /dev/null
>> +++ b/config/common_linuxapp_powerpc
> This filename is common_linuxapp_powerpc, but given that it explicitly specifies
> all the build options, there isn't really anything common about it.  I think
> what you want to do is rename this defconfig_powerpc-native-linuxapp-gcc, and
> have it include common_linuxapp, then change any power-specific option you see
> fit.
>
> Also, does BSD build on power?  I presume so. You likely want to create a
> corresponding bsd power config
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/14] Add compiling definations for IBM Power architecture
  2014-11-25  3:51     ` Chao Zhu
@ 2014-11-25  8:44       ` Bruce Richardson
  2014-11-25  9:19         ` Chao Zhu
  0 siblings, 1 reply; 31+ messages in thread
From: Bruce Richardson @ 2014-11-25  8:44 UTC (permalink / raw)
  To: Chao Zhu; +Cc: dev

On Tue, Nov 25, 2014 at 11:51:13AM +0800, Chao Zhu wrote:
> Neil,
> Current Power related patches are not a full functional one. Some of the
> libraries are not migrated. So
> common_linuxapp_powerpc is used to turn off the uncompiled part.

Hi Chao,
just to re-echo what Neil says - this would be better as a 
defconfig_powerpc-native-linuxapp-gcc config file including common_linuxapp.
Anything you need to turn off in the config can be turned off in the defconfig
file after you include the common_linuxapp one - later definitions override
earlier ones. It also makes things clearer to read as you end up with a 
powerpc config file that essentially reads as "use common linux settings except
for this, and this, and this, etc...."
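
For illustration, such a defconfig might look roughly like this (the option
names below are hypothetical examples, not taken from the actual patch):

```makefile
# defconfig_ppc_64-power8-linuxapp-gcc (sketch)
#include "common_linuxapp"

CONFIG_RTE_MACHINE="power8"
CONFIG_RTE_ARCH="ppc_64"
CONFIG_RTE_ARCH_PPC_64=y
CONFIG_RTE_ARCH_64=y

# Later definitions override the common_linuxapp defaults,
# so unported libraries can simply be switched off here:
CONFIG_RTE_LIBRTE_ACL=n
```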

	Regards,
	/Bruce

> This file
> is a copy of the common_linuxapp. And this file is intended to be removed
> when all of the libraries are migrated to Power. Actually, it's the current
> common file for linux and other OS, such as BSD.  However, I didn't try the
> compilation on BSD.  But this probably needs to be done.
> 
> On 2014/11/24 6:02, Neil Horman wrote:
> >On Sun, Nov 23, 2014 at 08:22:09PM -0500, Chao Zhu wrote:
> >>To make DPDK run on IBM Power architecture, configuration files for
> >>Power architecuture are added. Also, the compiling related .mk files are
> >>added.
> >>
> >>Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
> >>---
> >>  config/common_linuxapp_powerpc              |  394 +++++++++++++++++++++++++++
> >>  config/defconfig_ppc_64-power8-linuxapp-gcc |   40 +++
> >>  mk/arch/ppc_64/rte.vars.mk                  |   39 +++
> >>  mk/machine/power8/rte.vars.mk               |   57 ++++
> >>  4 files changed, 530 insertions(+), 0 deletions(-)
> >>  create mode 100644 config/common_linuxapp_powerpc
> >>  create mode 100644 config/defconfig_ppc_64-power8-linuxapp-gcc
> >>  create mode 100644 mk/arch/ppc_64/rte.vars.mk
> >>  create mode 100644 mk/machine/power8/rte.vars.mk
> >>
> >>diff --git a/config/common_linuxapp_powerpc b/config/common_linuxapp_powerpc
> >>new file mode 100644
> >>index 0000000..d230a0b
> >>--- /dev/null
> >>+++ b/config/common_linuxapp_powerpc
> >This filename is common_linuxapp_powerpc, but given that it explicitly specifies
> >all the build options, there isn't really anything common about it.  I think
> >what you want to do is rename this defconfig_powerpc-native-linuxapp-gcc, and
> >have it include common_linuxapp, then change any power-specific option you see
> >fit.
> >
> >Also, does BSD build on power?  I presume so. You likely want to create a
> >corresponding bsd power config
> >
> 
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 01/14] Add compiling definations for IBM Power architecture
  2014-11-25  8:44       ` Bruce Richardson
@ 2014-11-25  9:19         ` Chao Zhu
  0 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-25  9:19 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: dev

Bruce,

Good point! I'll update the current patches.
Thanks for your suggestions!

On 2014/11/25 16:44, Bruce Richardson wrote:
> On Tue, Nov 25, 2014 at 11:51:13AM +0800, Chao Zhu wrote:
>> Neil,
>> Current Power related patches are not a full functional one. Some of the
>> libraries are not migrated. So
>> common_linuxapp_powerpc is used to turn off the uncompiled part.
> Hi Chao,
> just to re-echo what Neil says - this would be better as a
> defconfig_powerpc-native-linuxapp-gcc config file including common_linuxapp.
> Anything you need to turn off in the config can be turned off in the defconfig
> file after you include the common_linuxapp one - later definitions override
> earlier ones. It also makes things clearer to read as you end up with a
> powerpc config file that essentially reads as "use common linux settings except
> for this, and this, and this, etc...."
>
> 	Regards,
> 	/Bruce
>
>> This file
>> is a copy of the common_linuxapp. And this file is intended to be removed
>> when all of the libraries are migrated to Power. Actually, it's the current
>> common file for linux and other OS, such as BSD.  However, I didn't try the
>> compilation on BSD.  But this probably needs to be done.
>>
>> On 2014/11/24 6:02, Neil Horman wrote:
>>> On Sun, Nov 23, 2014 at 08:22:09PM -0500, Chao Zhu wrote:
>>>> To make DPDK run on IBM Power architecture, configuration files for
>>>> Power architecuture are added. Also, the compiling related .mk files are
>>>> added.
>>>>
>>>> Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
>>>> ---
>>>>   config/common_linuxapp_powerpc              |  394 +++++++++++++++++++++++++++
>>>>   config/defconfig_ppc_64-power8-linuxapp-gcc |   40 +++
>>>>   mk/arch/ppc_64/rte.vars.mk                  |   39 +++
>>>>   mk/machine/power8/rte.vars.mk               |   57 ++++
>>>>   4 files changed, 530 insertions(+), 0 deletions(-)
>>>>   create mode 100644 config/common_linuxapp_powerpc
>>>>   create mode 100644 config/defconfig_ppc_64-power8-linuxapp-gcc
>>>>   create mode 100644 mk/arch/ppc_64/rte.vars.mk
>>>>   create mode 100644 mk/machine/power8/rte.vars.mk
>>>>
>>>> diff --git a/config/common_linuxapp_powerpc b/config/common_linuxapp_powerpc
>>>> new file mode 100644
>>>> index 0000000..d230a0b
>>>> --- /dev/null
>>>> +++ b/config/common_linuxapp_powerpc
>>> This filename is common_linuxapp_powerpc, but given that it explicitly specifies
>>> all the build options, there isn't really anything common about it.  I think
>>> what you want to do is rename this defconfig_powerpc-native-linuxapp-gcc, and
>>> have it include common_linuxapp, then change any power-specific option you see
>>> fit.
>>>
>>> Also, does BSD build on power?  I presume so. You likely want to create a
>>> corresponding bsd power config
>>>
>>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 08/14] Add CPU flag checking for IBM Power architecture
  2014-11-25  3:27     ` Chao Zhu
@ 2014-11-25 11:37       ` Neil Horman
  0 siblings, 0 replies; 31+ messages in thread
From: Neil Horman @ 2014-11-25 11:37 UTC (permalink / raw)
  To: Chao Zhu; +Cc: dev

On Tue, Nov 25, 2014 at 11:27:31AM +0800, Chao Zhu wrote:
> Neil,
> 
> I didn't compiled ACL library on Power because SSE is not supported by
> Power. This is why ACL compiling was
> turned off on Power. rte_cpu_flag_t is an architecture specific value, each
> CPU has its own rte_cpu_flag_t . The Power one has no influence on x86, so I
> think there should be no building problem on x86. However, you suggestion is
> very good. It can ease the migration effort from x86 to other architectures.
> Probably we need to do it later.
> 
Yes please, this is a real problem.  It's not so much a problem with your patch,
but with the current layout of the cpuflag interface.  Because each cpuflag is
unique to its architecture, the cpuflag API cannot be used in any common code in
a portable way, and as the first proposal on the list to support a new
architecture, I think you need to address this, because it will lead to
permanently non-functional libraries and applications that can only build on
one arch if you don't.
Neil

> On 2014/11/24 22:14, Neil Horman wrote:
> >On Sun, Nov 23, 2014 at 08:22:16PM -0500, Chao Zhu wrote:
> >>IBM Power processor doesn't have CPU flag hardware registers. This patch
> >>uses aux vector software register to get CPU flags and add CPU flag
> >>checking support for IBM Power architecture.
> >>
> >>Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
> >>---
> >>  app/test/test_cpuflags.c                           |   35 ++++
> >>  .../common/include/arch/ppc_64/rte_cpuflags.h      |  184 ++++++++++++++++++++
> >>  mk/rte.cpuflags.mk                                 |   17 ++
> >>  3 files changed, 236 insertions(+), 0 deletions(-)
> >>  create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
> >>
> >>diff --git a/app/test/test_cpuflags.c b/app/test/test_cpuflags.c
> >>index 82c0197..5aeba5d 100644
> >>--- a/app/test/test_cpuflags.c
> >>+++ b/app/test/test_cpuflags.c
> >>@@ -80,6 +80,40 @@ test_cpuflags(void)
> >>  	int result;
> >>  	printf("\nChecking for flags from different registers...\n");
> >>+#ifdef RTE_ARCH_PPC_64
> >>+	printf("Check for PPC64:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_PPC64);
> >>+
> >>+	printf("Check for PPC32:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_PPC32);
> >>+
> >>+	printf("Check for VSX:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_VSX);
> >>+
> >>+	printf("Check for DFP:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_DFP);
> >>+
> >>+	printf("Check for FPU:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_FPU);
> >>+
> >>+	printf("Check for SMT:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_SMT);
> >>+
> >>+	printf("Check for MMU:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_MMU);
> >>+
> >>+	printf("Check for ALTIVEC:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_ALTIVEC);
> >>+
> >>+	printf("Check for ARCH_2_06:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_ARCH_2_06);
> >>+
> >>+	printf("Check for ARCH_2_07:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_ARCH_2_07);
> >>+
> >>+	printf("Check for ICACHE_SNOOP:\t\t");
> >>+	CHECK_FOR_FLAG(RTE_CPUFLAG_ICACHE_SNOOP);
> >>+#else
> >>  	printf("Check for SSE:\t\t");
> >>  	CHECK_FOR_FLAG(RTE_CPUFLAG_SSE);
> >>@@ -117,6 +151,7 @@ test_cpuflags(void)
> >>  	CHECK_FOR_FLAG(RTE_CPUFLAG_INVTSC);
> >>+#endif
> >>  	/*
> >>  	 * Check if invalid data is handled properly
> >>diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h b/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
> >>new file mode 100644
> >>index 0000000..6b38f1c
> >>--- /dev/null
> >>+++ b/lib/librte_eal/common/include/arch/ppc_64/rte_cpuflags.h
> >>@@ -0,0 +1,184 @@
> >>+/*
> >>+ *   BSD LICENSE
> >>+ *
> >>+ *   Copyright (C) IBM Corporation 2014.
> >>+ *
> >>+ *   Redistribution and use in source and binary forms, with or without
> >>+ *   modification, are permitted provided that the following conditions
> >>+ *   are met:
> >>+ *
> >>+ *     * Redistributions of source code must retain the above copyright
> >>+ *       notice, this list of conditions and the following disclaimer.
> >>+ *     * Redistributions in binary form must reproduce the above copyright
> >>+ *       notice, this list of conditions and the following disclaimer in
> >>+ *       the documentation and/or other materials provided with the
> >>+ *       distribution.
> >>+ *     * Neither the name of IBM Corporation nor the names of its
> >>+ *       contributors may be used to endorse or promote products derived
> >>+ *       from this software without specific prior written permission.
> >>+ *
> >>+ *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> >>+ *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> >>+ *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> >>+ *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> >>+ *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> >>+ *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> >>+ *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> >>+ *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> >>+ *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> >>+ *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> >>+ *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> >>+*/
> >>+
> >>+#ifndef _RTE_CPUFLAGS_PPC_64_H_
> >>+#define _RTE_CPUFLAGS_PPC_64_H_
> >>+
> >>+#ifdef __cplusplus
> >>+extern "C" {
> >>+#endif
> >>+
> >>+#include <elf.h>
> >>+#include <fcntl.h>
> >>+#include <assert.h>
> >>+#include <unistd.h>
> >>+
> >>+#include "generic/rte_cpuflags.h"
> >>+
> >>+/* Symbolic values for the entries in the auxiliary table */
> >>+#define AT_HWCAP  16
> >>+#define AT_HWCAP2 26
> >>+
> >>+/* software based registers */
> >>+enum cpu_register_t {
> >>+	REG_HWCAP = 0,
> >>+	REG_HWCAP2,
> >>+};
> >>+
> >>+/**
> >>+ * Enumeration of all CPU features supported
> >>+ */
> >>+enum rte_cpu_flag_t {
> >>+	RTE_CPUFLAG_PPC_LE = 0,
> >>+	RTE_CPUFLAG_TRUE_LE,
> >>+	RTE_CPUFLAG_PSERIES_PERFMON_COMPAT,
> >>+	RTE_CPUFLAG_VSX,
> >>+	RTE_CPUFLAG_ARCH_2_06,
> >>+	RTE_CPUFLAG_POWER6_EXT,
> >>+	RTE_CPUFLAG_DFP,
> >>+	RTE_CPUFLAG_PA6T,
> >>+	RTE_CPUFLAG_ARCH_2_05,
> >>+	RTE_CPUFLAG_ICACHE_SNOOP,
> >>+	RTE_CPUFLAG_SMT,
> >>+	RTE_CPUFLAG_BOOKE,
> >>+	RTE_CPUFLAG_CELLBE,
> >>+	RTE_CPUFLAG_POWER5_PLUS,
> >>+	RTE_CPUFLAG_POWER5,
> >>+	RTE_CPUFLAG_POWER4,
> >>+	RTE_CPUFLAG_NOTB,
> >>+	RTE_CPUFLAG_EFP_DOUBLE,
> >>+	RTE_CPUFLAG_EFP_SINGLE,
> >>+	RTE_CPUFLAG_SPE,
> >>+	RTE_CPUFLAG_UNIFIED_CACHE,
> >>+	RTE_CPUFLAG_4xxMAC,
> >>+	RTE_CPUFLAG_MMU,
> >>+	RTE_CPUFLAG_FPU,
> >>+	RTE_CPUFLAG_ALTIVEC,
> >>+	RTE_CPUFLAG_PPC601,
> >>+	RTE_CPUFLAG_PPC64,
> >>+	RTE_CPUFLAG_PPC32,
> >>+	RTE_CPUFLAG_TAR,
> >>+	RTE_CPUFLAG_LSEL,
> >>+	RTE_CPUFLAG_EBB,
> >>+	RTE_CPUFLAG_DSCR,
> >>+	RTE_CPUFLAG_HTM,
> >>+	RTE_CPUFLAG_ARCH_2_07,
> >>+	/* The last item */
> >>+	RTE_CPUFLAG_NUMFLAGS,               /**< This should always be the last! */
> >>+};
> >>+
> >>+static const struct feature_entry cpu_feature_table[] = {
> >>+	FEAT_DEF(PPC_LE, 0x00000001, 0, REG_HWCAP,  0)
> >>+	FEAT_DEF(TRUE_LE, 0x00000001, 0, REG_HWCAP,  1)
> >>+	FEAT_DEF(PSERIES_PERFMON_COMPAT, 0x00000001, 0, REG_HWCAP,  6)
> >>+	FEAT_DEF(VSX, 0x00000001, 0, REG_HWCAP,  7)
> >>+	FEAT_DEF(ARCH_2_06, 0x00000001, 0, REG_HWCAP,  8)
> >>+	FEAT_DEF(POWER6_EXT, 0x00000001, 0, REG_HWCAP,  9)
> >>+	FEAT_DEF(DFP, 0x00000001, 0, REG_HWCAP,  10)
> >>+	FEAT_DEF(PA6T, 0x00000001, 0, REG_HWCAP,  11)
> >>+	FEAT_DEF(ARCH_2_05, 0x00000001, 0, REG_HWCAP,  12)
> >>+	FEAT_DEF(ICACHE_SNOOP, 0x00000001, 0, REG_HWCAP,  13)
> >>+	FEAT_DEF(SMT, 0x00000001, 0, REG_HWCAP,  14)
> >>+	FEAT_DEF(BOOKE, 0x00000001, 0, REG_HWCAP,  15)
> >>+	FEAT_DEF(CELLBE, 0x00000001, 0, REG_HWCAP,  16)
> >>+	FEAT_DEF(POWER5_PLUS, 0x00000001, 0, REG_HWCAP,  17)
> >>+	FEAT_DEF(POWER5, 0x00000001, 0, REG_HWCAP,  18)
> >>+	FEAT_DEF(POWER4, 0x00000001, 0, REG_HWCAP,  19)
> >>+	FEAT_DEF(NOTB, 0x00000001, 0, REG_HWCAP,  20)
> >>+	FEAT_DEF(EFP_DOUBLE, 0x00000001, 0, REG_HWCAP,  21)
> >>+	FEAT_DEF(EFP_SINGLE, 0x00000001, 0, REG_HWCAP,  22)
> >>+	FEAT_DEF(SPE, 0x00000001, 0, REG_HWCAP,  23)
> >>+	FEAT_DEF(UNIFIED_CACHE, 0x00000001, 0, REG_HWCAP,  24)
> >>+	FEAT_DEF(4xxMAC, 0x00000001, 0, REG_HWCAP,  25)
> >>+	FEAT_DEF(MMU, 0x00000001, 0, REG_HWCAP,  26)
> >>+	FEAT_DEF(FPU, 0x00000001, 0, REG_HWCAP,  27)
> >>+	FEAT_DEF(ALTIVEC, 0x00000001, 0, REG_HWCAP,  28)
> >>+	FEAT_DEF(PPC601, 0x00000001, 0, REG_HWCAP,  29)
> >>+	FEAT_DEF(PPC64, 0x00000001, 0, REG_HWCAP,  30)
> >>+	FEAT_DEF(PPC32, 0x00000001, 0, REG_HWCAP,  31)
> >>+	FEAT_DEF(TAR, 0x00000001, 0, REG_HWCAP2,  26)
> >>+	FEAT_DEF(LSEL, 0x00000001, 0, REG_HWCAP2,  27)
> >>+	FEAT_DEF(EBB, 0x00000001, 0, REG_HWCAP2,  28)
> >>+	FEAT_DEF(DSCR, 0x00000001, 0, REG_HWCAP2,  29)
> >>+	FEAT_DEF(HTM, 0x00000001, 0, REG_HWCAP2,  30)
> >>+	FEAT_DEF(ARCH_2_07, 0x00000001, 0, REG_HWCAP2,  31)
> >>+};
> >>+
> >>+/*
> >>+ * Read AUXV software register and get cpu features for Power
> >>+ */
> >>+static inline void
> >>+rte_cpu_get_features( __attribute__((unused)) uint32_t leaf, __attribute__((unused)) uint32_t subleaf, cpuid_registers_t out)
> >>+{
> >>+  int auxv_fd;
> >>+  Elf64_auxv_t auxv;
> >>+  auxv_fd = open("/proc/self/auxv", O_RDONLY);
> >>+  assert(auxv_fd);
> >>+  while (read(auxv_fd, &auxv, sizeof(Elf64_auxv_t))== sizeof(Elf64_auxv_t)) {
> >>+    if (auxv.a_type == AT_HWCAP)
> >>+      out[REG_HWCAP] = auxv.a_un.a_val;
> >>+    else if (auxv.a_type == AT_HWCAP2)
> >>+      out[REG_HWCAP2] = auxv.a_un.a_val;
> >>+  }
> >>+}
> >>+
> >>+/*
> >>+ * Checks if a particular flag is available on current machine.
> >>+ */
> >>+static inline int
> >>+rte_cpu_get_flag_enabled(enum rte_cpu_flag_t feature)
> >>+{
> >>+	const struct feature_entry *feat;
> >>+	cpuid_registers_t regs={0};
> >>+
> >>+	if (feature >= RTE_CPUFLAG_NUMFLAGS)
> >>+		/* Flag does not match anything in the feature tables */
> >>+		return -ENOENT;
> >>+
> >>+	feat = &cpu_feature_table[feature];
> >>+
> >>+	if (!feat->leaf)
> >>+		/* This entry in the table wasn't filled out! */
> >>+		return -EFAULT;
> >>+
> >>+	/* get the cpuid leaf containing the desired feature */
> >>+	rte_cpu_get_features(feat->leaf, feat->subleaf, regs);
> >>+
> >>+	/* check if the feature is enabled */
> >>+	return (regs[feat->reg] >> feat->bit) & 1;
> >>+}
> >>+
> >>+#ifdef __cplusplus
> >>+}
> >>+#endif
> >>+
> >>+#endif /* _RTE_CPUFLAGS_PPC_64_H_ */
> >>diff --git a/mk/rte.cpuflags.mk b/mk/rte.cpuflags.mk
> >>index 65332e1..f595cd0 100644
> >>--- a/mk/rte.cpuflags.mk
> >>+++ b/mk/rte.cpuflags.mk
> >>@@ -89,6 +89,23 @@ ifneq ($(filter $(AUTO_CPUFLAGS),__AVX2__),)
> >>  CPUFLAGS += AVX2
> >>  endif
> >>+# IBM Power CPU flags
> >>+ifneq ($(filter $(AUTO_CPUFLAGS),__PPC64__),)
> >>+CPUFLAGS += PPC64
> >>+endif
> >>+
> >>+ifneq ($(filter $(AUTO_CPUFLAGS),__PPC32__),)
> >>+CPUFLAGS += PPC32
> >>+endif
> >>+
> >>+ifneq ($(filter $(AUTO_CPUFLAGS),__vector),)
> >>+CPUFLAGS += ALTIVEC
> >>+endif
> >>+
> >>+ifneq ($(filter $(AUTO_CPUFLAGS),__builtin_vsx_xvnmaddadp),)
> >>+CPUFLAGS += VSX
> >>+endif
> >>+
> >>  MACHINE_CFLAGS += $(addprefix -DRTE_MACHINE_CPUFLAG_,$(CPUFLAGS))
> >>  # To strip whitespace
> >>-- 
> >>1.7.1
> >>
> >>
> >Something occurs to me with this patch.  rte_cpu_get_flag_enabled is a public
> >API call.  Internally and externally we might use this call for checking cpu
> >support (rte_acl_init is an example).  Because the API call accepts an
> >rte_cpu_flag_t type as an input, all the enumerated values need to be defined
> >all the time, or we will get build breakage (i.e. with this patch above, I
> >expect you never compiled the ACL library, as RTE_CPUFLAG_SSE4_1 wouldn't be
> >defined, and you would get build breakage).  What we probably need to do is
> >merge the cpuflags into a single enumeration that is available for all arches.
> >
> >Neil
> >
> >
> 
> 
> 

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH v3 03/14] Add byte order operations for IBM Power architecture
  2014-11-24  8:11   ` Qiu, Michael
@ 2014-11-26  2:35     ` Chao Zhu
  0 siblings, 0 replies; 31+ messages in thread
From: Chao Zhu @ 2014-11-26  2:35 UTC (permalink / raw)
  To: Qiu, Michael, dev

Michael,

The default endianness of Power7/8 is big endian, so I set big endian in 
the configuration file. To build for little endian, just change the 
configuration file. Of course, there are ways to determine the 
endianness at run time; however, the original DPDK didn't do this, and I 
think it can be improved later.
About your second question: Power7 can support little endian, but only 
as an emulated mode, not a CPU hardware feature. Also, there is no 
official little-endian support for Power7, so I marked Power7 as 
supporting big endian only.

On 2014/11/24 16:11, Qiu, Michael wrote:
> On 11/23/2014 9:22 PM, Chao Zhu wrote:
>> This patch adds architecture specific byte order operations for IBM Power
>> architecture. The Power architecture supports both big endian and little
>> endian. This patch also adds an RTE_ARCH_BIG_ENDIAN macro.
>>
>> Signed-off-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>
>> ---
>>   config/defconfig_ppc_64-power8-linuxapp-gcc        |    1 +
>>   .../common/include/arch/ppc_64/rte_byteorder.h     |  150 ++++++++++++++++++++
>>   2 files changed, 151 insertions(+), 0 deletions(-)
>>   create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
>>
>> diff --git a/config/defconfig_ppc_64-power8-linuxapp-gcc b/config/defconfig_ppc_64-power8-linuxapp-gcc
>> index 97d72ff..b10f60c 100644
>> --- a/config/defconfig_ppc_64-power8-linuxapp-gcc
>> +++ b/config/defconfig_ppc_64-power8-linuxapp-gcc
>> @@ -34,6 +34,7 @@ CONFIG_RTE_MACHINE="power8"
>>   
>>   CONFIG_RTE_ARCH="ppc_64"
>>   CONFIG_RTE_ARCH_PPC_64=y
>> +CONFIG_RTE_ARCH_BIG_ENDIAN=y
> Does this mean the default is big endian, and that if I run it in
> little-endian mode I need to change it manually?
>>   
>>   CONFIG_RTE_TOOLCHAIN="gcc"
>>   CONFIG_RTE_TOOLCHAIN_GCC=y
>> diff --git a/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h b/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
>> new file mode 100644
>> index 0000000..a593e8a
>> --- /dev/null
>> +++ b/lib/librte_eal/common/include/arch/ppc_64/rte_byteorder.h
>> @@ -0,0 +1,150 @@
>> +/*
>> + *   BSD LICENSE
>> + *
>> + *   Copyright (C) IBM Corporation 2014.
>> + *
>> + *   Redistribution and use in source and binary forms, with or without
>> + *   modification, are permitted provided that the following conditions
>> + *   are met:
>> + *
>> + *     * Redistributions of source code must retain the above copyright
>> + *       notice, this list of conditions and the following disclaimer.
>> + *     * Redistributions in binary form must reproduce the above copyright
>> + *       notice, this list of conditions and the following disclaimer in
>> + *       the documentation and/or other materials provided with the
>> + *       distribution.
>> + *     * Neither the name of IBM Corporation nor the names of its
>> + *       contributors may be used to endorse or promote products derived
>> + *       from this software without specific prior written permission.
>> + *
>> + *   THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
>> + *   "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
>> + *   LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
>> + *   A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
>> + *   OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
>> + *   SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
>> + *   LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
>> + *   DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
>> + *   THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
>> + *   (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
>> + *   OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
>> +*/
>> +
>> +/* Inspired from FreeBSD src/sys/powerpc/include/endian.h
>> + * Copyright (c) 1987, 1991, 1993
>> + * The Regents of the University of California.  All rights reserved.
>> +*/
>> +
>> +#ifndef _RTE_BYTEORDER_PPC_64_H_
>> +#define _RTE_BYTEORDER_PPC_64_H_
>> +
>> +#ifdef __cplusplus
>> +extern "C" {
>> +#endif
>> +
>> +#include "generic/rte_byteorder.h"
>> +
>> +/*
>> + * An architecture-optimized byte swap for a 16-bit value.
>> + *
>> + * Do not use this function directly. The preferred function is rte_bswap16().
>> + */
>> +static inline uint16_t rte_arch_bswap16(uint16_t _x)
>> +{
>> +	return ((_x >> 8) | ((_x << 8) & 0xff00));
>> +}
>> +
>> +/*
>> + * An architecture-optimized byte swap for a 32-bit value.
>> + *
>> + * Do not use this function directly. The preferred function is rte_bswap32().
>> + */
>> +static inline uint32_t rte_arch_bswap32(uint32_t _x)
>> +{
>> +	return ((_x >> 24) | ((_x >> 8) & 0xff00) | ((_x << 8) & 0xff0000) |
>> +		((_x << 24) & 0xff000000));
>> +}
>> +
>> +/*
>> + * An architecture-optimized byte swap for a 64-bit value.
>> + *
>> +  * Do not use this function directly. The preferred function is rte_bswap64().
>> + */
>> +/* 64-bit mode */
>> +static inline uint64_t rte_arch_bswap64(uint64_t _x)
>> +{
>> +	return ((_x >> 56) | ((_x >> 40) & 0xff00) | ((_x >> 24) & 0xff0000) |
>> +		((_x >> 8) & 0xff000000) | ((_x << 8) & (0xffULL << 32)) |
>> +		((_x << 24) & (0xffULL << 40)) |
>> +		((_x << 40) & (0xffULL << 48)) | ((_x << 56)));
>> +}
>> +
>> +#ifndef RTE_FORCE_INTRINSICS
>> +#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
>> +				   rte_constant_bswap16(x) :		\
>> +				   rte_arch_bswap16(x)))
>> +
>> +#define rte_bswap32(x) ((uint32_t)(__builtin_constant_p(x) ?		\
>> +				   rte_constant_bswap32(x) :		\
>> +				   rte_arch_bswap32(x)))
>> +
>> +#define rte_bswap64(x) ((uint64_t)(__builtin_constant_p(x) ?		\
>> +				   rte_constant_bswap64(x) :		\
>> +				   rte_arch_bswap64(x)))
>> +#else
>> +/*
>> + * __builtin_bswap16 is only available gcc 4.8 and upwards
>> + */
>> +#if __GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 8)
>> +#define rte_bswap16(x) ((uint16_t)(__builtin_constant_p(x) ?		\
>> +				   rte_constant_bswap16(x) :		\
>> +				   rte_arch_bswap16(x)))
>> +#endif
>> +#endif
>> +
>> +/* Power 8 have both little endian and big endian mode
>> + * Power 7 only support big endian
> Are you sure about this?  What I've heard is that all Power CPUs (at least
> Power7 and Power8) support both, but I haven't checked the spec.
>> + */
>> +#ifndef RTE_ARCH_BIG_ENDIAN
>> +
>> +#define rte_cpu_to_le_16(x) (x)
>> +#define rte_cpu_to_le_32(x) (x)
>> +#define rte_cpu_to_le_64(x) (x)
>> +
>> +#define rte_cpu_to_be_16(x) rte_bswap16(x)
>> +#define rte_cpu_to_be_32(x) rte_bswap32(x)
>> +#define rte_cpu_to_be_64(x) rte_bswap64(x)
>> +
>> +#define rte_le_to_cpu_16(x) (x)
>> +#define rte_le_to_cpu_32(x) (x)
>> +#define rte_le_to_cpu_64(x) (x)
>> +
>> +#define rte_be_to_cpu_16(x) rte_bswap16(x)
>> +#define rte_be_to_cpu_32(x) rte_bswap32(x)
>> +#define rte_be_to_cpu_64(x) rte_bswap64(x)
>> +
>> +#else
>> +
>> +#define rte_cpu_to_le_16(x) rte_bswap16(x)
>> +#define rte_cpu_to_le_32(x) rte_bswap32(x)
>> +#define rte_cpu_to_le_64(x) rte_bswap64(x)
>> +
>> +#define rte_cpu_to_be_16(x) (x)
>> +#define rte_cpu_to_be_32(x) (x)
>> +#define rte_cpu_to_be_64(x) (x)
>> +
>> +#define rte_le_to_cpu_16(x) rte_bswap16(x)
>> +#define rte_le_to_cpu_32(x) rte_bswap32(x)
>> +#define rte_le_to_cpu_64(x) rte_bswap64(x)
>> +
>> +#define rte_be_to_cpu_16(x) (x)
>> +#define rte_be_to_cpu_32(x) (x)
>> +#define rte_be_to_cpu_64(x) (x)
>> +#endif
>> +
>> +#ifdef __cplusplus
>> +}
>> +#endif
>> +
>> +#endif /* _RTE_BYTEORDER_PPC_64_H_ */
>> +
>

^ permalink raw reply	[flat|nested] 31+ messages in thread

* Re: [dpdk-dev] [PATCH] eal: fix remaining checks for other 64bits architectures
  2014-11-24 15:58       ` chaozhu
@ 2014-11-27  7:47         ` Thomas Monjalon
  0 siblings, 0 replies; 31+ messages in thread
From: Thomas Monjalon @ 2014-11-27  7:47 UTC (permalink / raw)
  To: David Marchand; +Cc: dev

> > RTE_ARCH_X86_64 cannot be used as a way to determine whether we are building
> > for 64-bit CPUs. Instead, RTE_ARCH_64 should be used.
> >
> > Signed-off-by: David Marchand <david.marchand@6wind.com>
> 
> Acked-by: Chao Zhu <chaozhu@linux.vnet.ibm.com>

Applied

Thanks
-- 
Thomas

^ permalink raw reply	[flat|nested] 31+ messages in thread

end of thread, other threads:[~2014-11-27  7:48 UTC | newest]

Thread overview: 31+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2014-11-24  1:22 [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 01/14] Add compiling definations for IBM " Chao Zhu
2014-11-23 22:02   ` Neil Horman
2014-11-25  3:51     ` Chao Zhu
2014-11-25  8:44       ` Bruce Richardson
2014-11-25  9:19         ` Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 02/14] Add atomic operations " Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 03/14] Add byte order " Chao Zhu
2014-11-24  8:11   ` Qiu, Michael
2014-11-26  2:35     ` Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 04/14] Add CPU cycle " Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 05/14] Add prefetch operation " Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 06/14] Add spinlock " Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 07/14] Add vector memcpy " Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 08/14] Add CPU flag checking " Chao Zhu
2014-11-24 14:14   ` Neil Horman
2014-11-25  3:27     ` Chao Zhu
2014-11-25 11:37       ` Neil Horman
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 09/14] Remove iopl operation " Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 10/14] Add cache size define for IBM Power Architecture Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 11/14] Add huge page size define for IBM Power architecture Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 12/14] Add eal memory support for IBM Power Architecture Chao Zhu
2014-11-24 15:17   ` David Marchand
2014-11-24 15:18     ` [dpdk-dev] [PATCH] eal: fix remaining checks for other 64bits architectures David Marchand
2014-11-24 15:58       ` chaozhu
2014-11-27  7:47         ` Thomas Monjalon
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 13/14] test_memzone:fix finding the second smallest segment Chao Zhu
2014-11-24  1:22 ` [dpdk-dev] [PATCH v3 14/14] Fix the compiling of test-pmd on IBM Power Architecture Chao Zhu
2014-11-24 15:05 ` [dpdk-dev] [PATCH v3 00/14] Patches for DPDK to support Power architecture David Marchand
2014-11-24 15:49   ` chaozhu
2014-11-25  2:49   ` Chao Zhu

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).