From: Shahaf Shuler <shahafs@mellanox.com>
To: Mordechay Haimovsky <motih@mellanox.com>,
Yongseok Koh <yskoh@mellanox.com>,
Adrien Mazarguil <adrien.mazarguil@6wind.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, Mordechay Haimovsky <motih@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH] net/mlx5: add support for 32bit systems
Date: Mon, 2 Jul 2018 07:05:02 +0000 [thread overview]
Message-ID: <DB7PR05MB4426F8CEEB5912BA76EC899FC3430@DB7PR05MB4426.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <1530169969-6708-1-git-send-email-motih@mellanox.com>
Hi Moty,
Few nits,
Also please fix the check patch warning :
### net/mlx5: add support for 32bit systems
CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#235: FILE: drivers/net/mlx5/mlx5_rxtx.c:1591:
+ addr_64 = rte_cpu_to_be_64(
total: 0 errors, 0 warnings, 1 checks, 311 lines checked
Thursday, June 28, 2018 10:13 AM, Moti Haimovsky:
> Subject: [dpdk-dev] [PATCH] net/mlx5: add support for 32bit systems
>
> This patch adds support for building and running mlx5 PMD on 32bit systems
> such as i686.
>
> The main issue to tackle was handling the 32bit access to the UAR as quoted
> from the mlx5 PRM:
> QP and CQ DoorBells require 64-bit writes. For best performance, it is
> recommended to execute the QP/CQ DoorBell as a single 64-bit write
> operation. For platforms that do not support 64 bit writes, it is possible to
> issue the 64 bits DoorBells through two consecutive writes, each write 32
> bits, as described below:
> * The order of writing each of the Dwords is from lower to upper
> addresses.
> * No other DoorBell can be rung (or even start ringing) in the midst of
> an on-going write of a DoorBell over a given UAR page.
> The last rule implies that in a multi-threaded environment, the access to a
> UAR page (which can be accessible by all threads in the process) must be
> synchronized (for example, using a semaphore) unless an atomic write of 64
> bits in a single bus operation is guaranteed. Such a synchronization is not
> required for when ringing DoorBells on different UAR pages.
>
> Signed-off-by: Moti Haimovsky <motih@mellanox.com>
> ---
> doc/guides/nics/features/mlx5.ini | 1 +
> doc/guides/nics/mlx5.rst | 11 +++++++
> drivers/net/mlx5/mlx5.c | 8 ++++-
> drivers/net/mlx5/mlx5.h | 5 +++
> drivers/net/mlx5/mlx5_defs.h | 18 ++++++++--
> drivers/net/mlx5/mlx5_rxq.c | 6 +++-
> drivers/net/mlx5/mlx5_rxtx.c | 22 +++++++------
> drivers/net/mlx5/mlx5_rxtx.h | 69
> ++++++++++++++++++++++++++++++++++++++-
> drivers/net/mlx5/mlx5_txq.c | 13 +++++++-
> 9 files changed, 137 insertions(+), 16 deletions(-)
>
> diff --git a/doc/guides/nics/features/mlx5.ini
> b/doc/guides/nics/features/mlx5.ini
> index e75b14b..b28b43e 100644
> --- a/doc/guides/nics/features/mlx5.ini
> +++ b/doc/guides/nics/features/mlx5.ini
> @@ -43,5 +43,6 @@ Multiprocess aware = Y
> Other kdrv = Y
> ARMv8 = Y
> Power8 = Y
> +x86-32 = Y
> x86-64 = Y
> Usage doc = Y
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst index
> 7dd9c1c..cb9d5d8 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -50,6 +50,8 @@ Features
> --------
>
> - Multi arch support: x86_64, POWER8, ARMv8.
> +- Support for i686 is available only when working with
> + rdma-core version 18.0 or above, built with 32bit support.
I think we can just add i686 to the supported arch. The limitation on the rdma-core version is well documented below.
> - Multiple TX and RX queues.
> - Support for scattered TX and RX frames.
> - IPv4, IPv6, TCPv4, TCPv6, UDPv4 and UDPv6 RSS on any number of queues.
> @@ -136,6 +138,11 @@ Limitations
> enabled (``rxq_cqe_comp_en``) at the same time, RSS hash result is not
> fully
> supported. Some Rx packets may not have PKT_RX_RSS_HASH.
>
> +- Building for i686 is only supported with:
> +
> + - rdma-core version 18.0 or above built with 32bit support.
> + - Kernel version 4.14.41 or above.
Why the kernel is related? The rdma-core I understand.
> +
> Statistics
> ----------
>
> @@ -477,6 +484,10 @@ RMDA Core with Linux Kernel
> - Minimal kernel version : v4.14 or the most recent 4.14-rc (see `Linux
> installation documentation`_)
> - Minimal rdma-core version: v15+ commit 0c5f5765213a ("Merge pull
> request #227 from yishaih/tm")
> (see `RDMA Core installation documentation`_)
> +- When building for i686 use:
> +
> + - rdma-core version 18.0 or above built with 32bit support.
> + - Kernel version 4.14.41 or above.
>
> .. _`Linux installation documentation`:
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit.
> kernel.org%2Fpub%2Fscm%2Flinux%2Fkernel%2Fgit%2Fstable%2Flinux-
> stable.git%2Fplain%2FDocumentation%2Fadmin-
> guide%2FREADME.rst&data=02%7C01%7Cshahafs%40mellanox.com%7C3793
> 359a175d46b47c2508d5dcc69ff1%7Ca652971c7d2e4d9ba6a4d149256f461b%7
> C0%7C0%7C636657668016130861&sdata=yFHd7tQET5SqIcPgj66BSuwJp3sydo
> ujC0ldCMkChVE%3D&reserved=0
> .. _`RDMA Core installation documentation`:
> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fra
> w.githubusercontent.com%2Flinux-rdma%2Frdma-
> core%2Fmaster%2FREADME.md&data=02%7C01%7Cshahafs%40mellanox.co
> m%7C3793359a175d46b47c2508d5dcc69ff1%7Ca652971c7d2e4d9ba6a4d1492
> 56f461b%7C0%7C0%7C636657668016130861&sdata=4LNh%2Fr5vM4BJeizvEIxi
> ShMrfcx0NrlBFWz4V2wA%2FkY%3D&reserved=0
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index
> f0e6ed7..5d0f706 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -567,7 +567,7 @@
> rte_memseg_walk(find_lower_va_bound, &addr);
>
> /* keep distance to hugepages to minimize potential conflicts. */
> - addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
> + addr = RTE_PTR_SUB(addr, (uintptr_t)(MLX5_UAR_OFFSET +
> +MLX5_UAR_SIZE));
> /* anonymous mmap, no real memory consumption. */
> addr = mmap(addr, MLX5_UAR_SIZE,
> PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> @@ -953,6 +953,12 @@
> priv->port = port;
> priv->pd = pd;
> priv->mtu = ETHER_MTU;
> +#ifndef RTE_ARCH_64
> + /* Initialize UAR access locks for 32bit implementations. */
> + rte_spinlock_init(&priv->uar_lock_cq);
> + for (i = 0; i < MLX5_UAR_PAGE_NUM_MAX; i++)
> + rte_spinlock_init(&priv->uar_lock[i]);
> +#endif
> err = mlx5_args(&config, pci_dev->device.devargs);
> if (err) {
> err = rte_errno;
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h index
> 997b04a..2da32cd 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -198,6 +198,11 @@ struct priv {
> /* Context for Verbs allocator. */
> int nl_socket; /* Netlink socket. */
> uint32_t nl_sn; /* Netlink message sequence number. */
> +#ifndef RTE_ARCH_64
> + rte_spinlock_t uar_lock_cq; /* CQs share a common distinct UAR */
> + rte_spinlock_t uar_lock[MLX5_UAR_PAGE_NUM_MAX];
> + /* UAR same-page access control required in 32bit implementations.
> */
> +#endif
> };
>
> #define PORT_ID(priv) ((priv)->dev_data->port_id) diff --git
> a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h index
> 5bbbec2..f6ec415 100644
> --- a/drivers/net/mlx5/mlx5_defs.h
> +++ b/drivers/net/mlx5/mlx5_defs.h
> @@ -87,14 +87,28 @@
> #define MLX5_LINK_STATUS_TIMEOUT 10
>
> /* Reserved address space for UAR mapping. */ -#define MLX5_UAR_SIZE
> (1ULL << 32)
> +#define MLX5_UAR_SIZE (1ULL << (sizeof(uintptr_t) * 4))
>
> /* Offset of reserved UAR address space to hugepage memory. Offset is
> used here
> * to minimize possibility of address next to hugepage being used by other
> code
> * in either primary or secondary process, failing to map TX UAR would make
> TX
> * packets invisible to HW.
> */
> -#define MLX5_UAR_OFFSET (1ULL << 32)
> +#define MLX5_UAR_OFFSET (1ULL << (sizeof(uintptr_t) * 4))
> +
> +/* Maximum number of UAR pages used by a port,
> + * These are the size and mask for an array of mutexes used to
> +synchronize
> + * the access to port's UARs on platforms that do not support 64 bit writes.
> + * In such systems it is possible to issue the 64 bits DoorBells
> +through two
> + * consecutive writes, each write 32 bits. The access to a UAR page
> +(which can
> + * be accessible by all threads in the process) must be synchronized
> + * (for example, using a semaphore). Such a synchronization is not
> +required
> + * when ringing DoorBells on different UAR pages.
> + * A port with 512 Tx queues uses 8, 4kBytes, UAR pages which are
> +shared
> + * among the ports.
> + */
> +#define MLX5_UAR_PAGE_NUM_MAX 64
> +#define MLX5_UAR_PAGE_NUM_MASK ((MLX5_UAR_PAGE_NUM_MAX)
> - 1)
>
> /* Log 2 of the default number of strides per WQE for Multi-Packet RQ. */
> #define MLX5_MPRQ_STRIDE_NUM_N 6U diff --git
> a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c index
> 08dd559..820048f 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -643,7 +643,8 @@
> doorbell = (uint64_t)doorbell_hi << 32;
> doorbell |= rxq->cqn;
> rxq->cq_db[MLX5_CQ_ARM_DB] = rte_cpu_to_be_32(doorbell_hi);
> - rte_write64(rte_cpu_to_be_64(doorbell), cq_db_reg);
> + mlx5_uar_write64(rte_cpu_to_be_64(doorbell),
> + cq_db_reg, rxq->uar_lock_cq);
> }
>
> /**
> @@ -1445,6 +1446,9 @@ struct mlx5_rxq_ctrl *
> tmpl->rxq.elts_n = log2above(desc);
> tmpl->rxq.elts =
> (struct rte_mbuf *(*)[1 << tmpl->rxq.elts_n])(tmpl + 1);
> +#ifndef RTE_ARCH_64
> + tmpl->rxq.uar_lock_cq = &priv->uar_lock_cq; #endif
> tmpl->idx = idx;
> rte_atomic32_inc(&tmpl->refcnt);
> LIST_INSERT_HEAD(&priv->rxqsctrl, tmpl, next); diff --git
> a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c index
> a7ed8d8..ec35ea0 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -495,6 +495,7 @@
> volatile struct mlx5_wqe_ctrl *last_wqe = NULL;
> unsigned int segs_n = 0;
> const unsigned int max_inline = txq->max_inline;
> + uint64_t addr_64;
>
> if (unlikely(!pkts_n))
> return 0;
> @@ -711,12 +712,12 @@
> ds = 3;
> use_dseg:
> /* Add the remaining packet as a simple ds. */
> - addr = rte_cpu_to_be_64(addr);
> + addr_64 = rte_cpu_to_be_64(addr);
> *dseg = (rte_v128u32_t){
> rte_cpu_to_be_32(length),
> mlx5_tx_mb2mr(txq, buf),
> - addr,
> - addr >> 32,
> + addr_64,
> + addr_64 >> 32,
> };
> ++ds;
> if (!segs_n)
> @@ -750,12 +751,12 @@
> total_length += length;
> #endif
> /* Store segment information. */
> - addr = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf,
> uintptr_t));
> + addr_64 = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf,
> uintptr_t));
> *dseg = (rte_v128u32_t){
> rte_cpu_to_be_32(length),
> mlx5_tx_mb2mr(txq, buf),
> - addr,
> - addr >> 32,
> + addr_64,
> + addr_64 >> 32,
> };
> (*txq->elts)[++elts_head & elts_m] = buf;
> if (--segs_n)
> @@ -1450,6 +1451,7 @@
> unsigned int mpw_room = 0;
> unsigned int inl_pad = 0;
> uint32_t inl_hdr;
> + uint64_t addr_64;
> struct mlx5_mpw mpw = {
> .state = MLX5_MPW_STATE_CLOSED,
> };
> @@ -1586,13 +1588,13 @@
> ((uintptr_t)mpw.data.raw +
> inl_pad);
> (*txq->elts)[elts_head++ & elts_m] = buf;
> - addr = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf,
> - uintptr_t));
> + addr_64 = rte_cpu_to_be_64(
> + rte_pktmbuf_mtod(buf, uintptr_t));
> *dseg = (rte_v128u32_t) {
> rte_cpu_to_be_32(length),
> mlx5_tx_mb2mr(txq, buf),
> - addr,
> - addr >> 32,
> + addr_64,
> + addr_64 >> 32,
> };
> mpw.data.raw = (volatile void *)(dseg + 1);
> mpw.total_len += (inl_pad + sizeof(*dseg)); diff --git
> a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h index
> 0007be0..2448d73 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -26,6 +26,8 @@
> #include <rte_common.h>
> #include <rte_hexdump.h>
> #include <rte_atomic.h>
> +#include <rte_spinlock.h>
> +#include <rte_io.h>
>
> #include "mlx5_utils.h"
> #include "mlx5.h"
> @@ -115,6 +117,10 @@ struct mlx5_rxq_data {
> void *cq_uar; /* CQ user access region. */
> uint32_t cqn; /* CQ number. */
> uint8_t cq_arm_sn; /* CQ arm seq number. */
> +#ifndef RTE_ARCH_64
> + rte_spinlock_t *uar_lock_cq;
> + /* CQ (UAR) access lock required for 32bit implementations */ #endif
> uint32_t tunnel; /* Tunnel information. */ } __rte_cache_aligned;
>
> @@ -196,6 +202,10 @@ struct mlx5_txq_data {
> volatile void *bf_reg; /* Blueflame register remapped. */
> struct rte_mbuf *(*elts)[]; /* TX elements. */
> struct mlx5_txq_stats stats; /* TX queue counters. */
> +#ifndef RTE_ARCH_64
> + rte_spinlock_t *uar_lock;
> + /* UAR access lock required for 32bit implementations */ #endif
> } __rte_cache_aligned;
>
> /* Verbs Rx queue elements. */
> @@ -348,6 +358,63 @@ uint16_t mlx5_rx_burst_vec(void *dpdk_txq, struct
> rte_mbuf **pkts, uint32_t mlx5_rx_addr2mr_bh(struct mlx5_rxq_data
> *rxq, uintptr_t addr); uint32_t mlx5_tx_addr2mr_bh(struct mlx5_txq_data
> *txq, uintptr_t addr);
>
> +/**
> + * Provide safe 64bit store operation to mlx5 UAR region for both 32bit
> +and
> + * 64bit architectures.
> + *
> + * @param val
> + * value to write in CPU endian format.
> + * @param addr
> + * Address to write to.
> + * @param lock
> + * Address of the lock to use for that UAR access.
> + */
> +static __rte_always_inline void
> +__mlx5_uar_write64_relaxed(uint64_t val, volatile void *addr,
> + rte_spinlock_t *lock __rte_unused) { #ifdef
> RTE_ARCH_64
> + rte_write64_relaxed(val, addr);
> +#else /* !RTE_ARCH_64 */
> + rte_spinlock_lock(lock);
> + rte_write32_relaxed(val, addr);
> + rte_io_wmb();
> + rte_write32_relaxed(val >> 32,
> + (volatile void *)((volatile char *)addr + 4));
> + rte_spinlock_unlock(lock);
> +#endif
> +}
> +
> +/**
> + * Provide safe 64bit store operation to mlx5 UAR region for both 32bit
> +and
> + * 64bit architectures while guaranteeing the order of execution with
> +the
> + * code being executed.
> + *
> + * @param val
> + * value to write in CPU endian format.
> + * @param addr
> + * Address to write to.
> + * @param lock
> + * Address of the lock to use for that UAR access.
> + */
> +static __rte_always_inline void
> +__mlx5_uar_write64(uint64_t val, volatile void *addr, rte_spinlock_t
> +*lock) {
> + rte_io_wmb();
> + __mlx5_uar_write64_relaxed(val, addr, lock); }
> +
> +/* Assist macros, used instead of directly calling the functions the
> +wrap. */ #ifdef RTE_ARCH_64 #define mlx5_uar_write64_relaxed(val, dst,
> +lock) \
> + __mlx5_uar_write64_relaxed(val, dst, NULL) #define
> +mlx5_uar_write64(val, dst, lock) __mlx5_uar_write64(val, dst, NULL)
> +#else #define mlx5_uar_write64_relaxed(val, dst, lock) \
> + __mlx5_uar_write64_relaxed(val, dst, lock) #define
> +mlx5_uar_write64(val, dst, lock) __mlx5_uar_write64(val, dst, lock)
> +#endif
> +
> #ifndef NDEBUG
> /**
> * Verify or set magic value in CQE.
> @@ -614,7 +681,7 @@ uint16_t mlx5_rx_burst_vec(void *dpdk_txq, struct
> rte_mbuf **pkts,
> *txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
> /* Ensure ordering between DB record and BF copy. */
> rte_wmb();
> - *dst = *src;
> + mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
> if (cond)
> rte_wmb();
> }
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index 669b913..dc786d4 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -255,6 +255,9 @@
> struct mlx5_txq_ctrl *txq_ctrl;
> int already_mapped;
> size_t page_size = sysconf(_SC_PAGESIZE);
> +#ifndef RTE_ARCH_64
> + unsigned int lock_idx;
> +#endif
>
> memset(pages, 0, priv->txqs_n * sizeof(uintptr_t));
> /*
> @@ -281,7 +284,7 @@
> }
> /* new address in reserved UAR address space. */
> addr = RTE_PTR_ADD(priv->uar_base,
> - uar_va & (MLX5_UAR_SIZE - 1));
> + uar_va & (uintptr_t)(MLX5_UAR_SIZE - 1));
> if (!already_mapped) {
> pages[pages_n++] = uar_va;
> /* fixed mmap to specified address in reserved @@ -
> 305,6 +308,12 @@
> else
> assert(txq_ctrl->txq.bf_reg ==
> RTE_PTR_ADD((void *)addr, off));
> +#ifndef RTE_ARCH_64
> + /* Assign a UAR lock according to UAR page number */
> + lock_idx = (txq_ctrl->uar_mmap_offset / page_size) &
> + MLX5_UAR_PAGE_NUM_MASK;
> + txq->uar_lock = &priv->uar_lock[lock_idx]; #endif
> }
> return 0;
> }
> @@ -511,6 +520,8 @@ struct mlx5_txq_ibv *
> rte_atomic32_inc(&txq_ibv->refcnt);
> if (qp.comp_mask & MLX5DV_QP_MASK_UAR_MMAP_OFFSET) {
> txq_ctrl->uar_mmap_offset = qp.uar_mmap_offset;
> + DRV_LOG(DEBUG, "port %u: uar_mmap_offset 0x%lx",
> + dev->data->port_id, txq_ctrl->uar_mmap_offset);
> } else {
> DRV_LOG(ERR,
> "port %u failed to retrieve UAR info, invalid"
> --
> 1.8.3.1
next prev parent reply other threads:[~2018-07-02 7:05 UTC|newest]
Thread overview: 17+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-28 7:12 Moti Haimovsky
2018-07-02 7:05 ` Shahaf Shuler [this message]
2018-07-02 10:39 ` Mordechay Haimovsky
2018-07-02 11:11 ` [dpdk-dev] [PATCH v2] " Moti Haimovsky
2018-07-02 20:59 ` Yongseok Koh
2018-07-03 12:03 ` Shahaf Shuler
2018-07-04 13:48 ` Ferruh Yigit
2018-07-05 10:09 ` Mordechay Haimovsky
2018-07-05 11:27 ` Ferruh Yigit
2018-07-11 12:22 ` Shahaf Shuler
2018-07-05 17:07 ` Mordechay Haimovsky
2018-07-05 17:49 ` Ferruh Yigit
2018-07-09 7:23 ` Shahaf Shuler
2018-07-08 17:04 ` Mordechay Haimovsky
2018-07-12 12:01 ` [dpdk-dev] [PATCH v3] " Moti Haimovsky
2018-07-13 6:16 ` Shahaf Shuler
2018-07-18 8:08 ` Ferruh Yigit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=DB7PR05MB4426F8CEEB5912BA76EC899FC3430@DB7PR05MB4426.eurprd05.prod.outlook.com \
--to=shahafs@mellanox.com \
--cc=adrien.mazarguil@6wind.com \
--cc=dev@dpdk.org \
--cc=motih@mellanox.com \
--cc=yskoh@mellanox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).