From: Shahaf Shuler
To: Mordechay Haimovsky, Yongseok Koh, Adrien Mazarguil
CC: dev@dpdk.org, Mordechay Haimovsky
Subject: Re: [dpdk-dev] [PATCH] net/mlx5: add support for 32bit systems
Date: Mon, 2 Jul 2018 07:05:02 +0000
In-Reply-To: <1530169969-6708-1-git-send-email-motih@mellanox.com>
References: <1530169969-6708-1-git-send-email-motih@mellanox.com>
Hi Moty,

A few nits below. Also, please fix the checkpatch warning:

### net/mlx5: add support for 32bit systems

CHECK:OPEN_ENDED_LINE: Lines should not end with a '('
#235: FILE: drivers/net/mlx5/mlx5_rxtx.c:1591:
+			addr_64 = rte_cpu_to_be_64(

total: 0 errors, 0 warnings, 1 checks, 311 lines checked

Thursday, June 28, 2018 10:13 AM, Moti Haimovsky:
> Subject: [dpdk-dev] [PATCH] net/mlx5: add support for 32bit systems
>
> This patch adds support for building and running mlx5 PMD on 32bit
> systems such as i686.
>
> The main issue to tackle was handling the 32bit access to the UAR as
> quoted from the mlx5 PRM:
> QP and CQ DoorBells require 64-bit writes. For best performance, it is
> recommended to execute the QP/CQ DoorBell as a single 64-bit write
> operation. For platforms that do not support 64 bit writes, it is
> possible to issue the 64 bits DoorBells through two consecutive writes,
> each write 32 bits, as described below:
> * The order of writing each of the Dwords is from lower to upper
>   addresses.
> * No other DoorBell can be rung (or even start ringing) in the midst of
>   an on-going write of a DoorBell over a given UAR page.
> The last rule implies that in a multi-threaded environment, the access
> to a UAR page (which can be accessible by all threads in the process)
> must be synchronized (for example, using a semaphore) unless an atomic
> write of 64 bits in a single bus operation is guaranteed. Such a
> synchronization is not required when ringing DoorBells on different
> UAR pages.
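Nice summary of the PRM requirement above. Just to spell out what it boils
down to (this mirrors the __mlx5_uar_write64_relaxed() helper the patch adds
in mlx5_rxtx.h below, using rte_io.h/rte_spinlock.h; the function name here
is only for illustration): on 32bit the doorbell becomes two 32-bit stores,
lower dword first, done under a per-UAR-page lock so no other doorbell can
interleave on the same page:

	/* Illustration only - 'val' is already byte-swapped by the caller
	 * (rte_cpu_to_be_64()), 'lock' protects the UAR page 'addr' is on.
	 */
	static void
	uar_db_write64_32bit(uint64_t val, volatile void *addr,
			     rte_spinlock_t *lock)
	{
		rte_spinlock_lock(lock);	/* serialize the UAR page */
		rte_write32_relaxed(val, addr);	/* lower dword first */
		rte_io_wmb();			/* keep the dword order */
		rte_write32_relaxed(val >> 32,
				    (volatile char *)addr + 4);
		rte_spinlock_unlock(lock);
	}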
>
> Signed-off-by: Moti Haimovsky
> ---
>  doc/guides/nics/features/mlx5.ini |  1 +
>  doc/guides/nics/mlx5.rst          | 11 +++++++
>  drivers/net/mlx5/mlx5.c           |  8 ++++-
>  drivers/net/mlx5/mlx5.h           |  5 +++
>  drivers/net/mlx5/mlx5_defs.h      | 18 ++++++++--
>  drivers/net/mlx5/mlx5_rxq.c       |  6 +++-
>  drivers/net/mlx5/mlx5_rxtx.c      | 22 +++++++------
>  drivers/net/mlx5/mlx5_rxtx.h      | 69 ++++++++++++++++++++++++++++++++++++++-
>  drivers/net/mlx5/mlx5_txq.c       | 13 +++++++-
>  9 files changed, 137 insertions(+), 16 deletions(-)
>
> diff --git a/doc/guides/nics/features/mlx5.ini b/doc/guides/nics/features/mlx5.ini
> index e75b14b..b28b43e 100644
> --- a/doc/guides/nics/features/mlx5.ini
> +++ b/doc/guides/nics/features/mlx5.ini
> @@ -43,5 +43,6 @@ Multiprocess aware   = Y
>  Other kdrv           = Y
>  ARMv8                = Y
>  Power8               = Y
> +x86-32               = Y
>  x86-64               = Y
>  Usage doc            = Y
> diff --git a/doc/guides/nics/mlx5.rst b/doc/guides/nics/mlx5.rst
> index 7dd9c1c..cb9d5d8 100644
> --- a/doc/guides/nics/mlx5.rst
> +++ b/doc/guides/nics/mlx5.rst
> @@ -50,6 +50,8 @@ Features
>  --------
>
>  - Multi arch support: x86_64, POWER8, ARMv8.
> +- Support for i686 is available only when working with
> +  rdma-core version 18.0 or above, built with 32bit support.

I think we can just add i686 to the list of supported architectures here;
the rdma-core version limitation is already well documented below.

>  - Multiple TX and RX queues.
>  - Support for scattered TX and RX frames.
>  - IPv4, IPv6, TCPv4, TCPv6, UDPv4 and UDPv6 RSS on any number of queues.
> @@ -136,6 +138,11 @@ Limitations
>    enabled (``rxq_cqe_comp_en``) at the same time, RSS hash result is not fully
>    supported. Some Rx packets may not have PKT_RX_RSS_HASH.
>
> +- Building for i686 is only supported with:
> +
> +  - rdma-core version 18.0 or above built with 32bit support.
> +  - Kernel version 4.14.41 or above.

Why is the kernel version relevant here? The rdma-core requirement I understand.

> +
>  Statistics
>  ----------
>
> @@ -477,6 +484,10 @@ RMDA Core with Linux Kernel
>  - Minimal kernel version : v4.14 or the most recent 4.14-rc (see `Linux
>    installation documentation`_)
>  - Minimal rdma-core version: v15+ commit 0c5f5765213a ("Merge pull request #227 from yishaih/tm")
>    (see `RDMA Core installation documentation`_)
> +- When building for i686 use:
> +
> +  - rdma-core version 18.0 or above built with 32bit support.
> +  - Kernel version 4.14.41 or above.
>
>  .. _`Linux installation documentation`:
>     https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/plain/Documentation/admin-guide/README.rst
>  .. _`RDMA Core installation documentation`:
>     https://raw.githubusercontent.com/linux-rdma/rdma-core/master/README.md
>
> diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c
> index f0e6ed7..5d0f706 100644
> --- a/drivers/net/mlx5/mlx5.c
> +++ b/drivers/net/mlx5/mlx5.c
> @@ -567,7 +567,7 @@
>  	rte_memseg_walk(find_lower_va_bound, &addr);
>
>  	/* keep distance to hugepages to minimize potential conflicts. */
> -	addr = RTE_PTR_SUB(addr, MLX5_UAR_OFFSET + MLX5_UAR_SIZE);
> +	addr = RTE_PTR_SUB(addr, (uintptr_t)(MLX5_UAR_OFFSET +
> +			   MLX5_UAR_SIZE));
>  	/* anonymous mmap, no real memory consumption. */
>  	addr = mmap(addr, MLX5_UAR_SIZE,
>  		    PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
> @@ -953,6 +953,12 @@
>  		priv->port = port;
>  		priv->pd = pd;
>  		priv->mtu = ETHER_MTU;
> +#ifndef RTE_ARCH_64
> +		/* Initialize UAR access locks for 32bit implementations. */
> +		rte_spinlock_init(&priv->uar_lock_cq);
> +		for (i = 0; i < MLX5_UAR_PAGE_NUM_MAX; i++)
> +			rte_spinlock_init(&priv->uar_lock[i]);
> +#endif
>  		err = mlx5_args(&config, pci_dev->device.devargs);
>  		if (err) {
>  			err = rte_errno;
> diff --git a/drivers/net/mlx5/mlx5.h b/drivers/net/mlx5/mlx5.h
> index 997b04a..2da32cd 100644
> --- a/drivers/net/mlx5/mlx5.h
> +++ b/drivers/net/mlx5/mlx5.h
> @@ -198,6 +198,11 @@ struct priv {
>  	/* Context for Verbs allocator. */
>  	int nl_socket; /* Netlink socket. */
>  	uint32_t nl_sn; /* Netlink message sequence number. */
> +#ifndef RTE_ARCH_64
> +	rte_spinlock_t uar_lock_cq; /* CQs share a common distinct UAR */
> +	rte_spinlock_t uar_lock[MLX5_UAR_PAGE_NUM_MAX];
> +	/* UAR same-page access control required in 32bit implementations. */
> +#endif
>  };
>
>  #define PORT_ID(priv) ((priv)->dev_data->port_id)
> diff --git a/drivers/net/mlx5/mlx5_defs.h b/drivers/net/mlx5/mlx5_defs.h
> index 5bbbec2..f6ec415 100644
> --- a/drivers/net/mlx5/mlx5_defs.h
> +++ b/drivers/net/mlx5/mlx5_defs.h
> @@ -87,14 +87,28 @@
>  #define MLX5_LINK_STATUS_TIMEOUT 10
>
>  /* Reserved address space for UAR mapping. */
> -#define MLX5_UAR_SIZE (1ULL << 32)
> +#define MLX5_UAR_SIZE (1ULL << (sizeof(uintptr_t) * 4))
>
>  /* Offset of reserved UAR address space to hugepage memory. Offset is used here
>   * to minimize possibility of address next to hugepage being used by other code
>   * in either primary or secondary process, failing to map TX UAR would make TX
>   * packets invisible to HW.
>   */
> -#define MLX5_UAR_OFFSET (1ULL << 32)
> +#define MLX5_UAR_OFFSET (1ULL << (sizeof(uintptr_t) * 4))
> +
> +/* Maximum number of UAR pages used by a port,
> + * These are the size and mask for an array of mutexes used to synchronize
> + * the access to port's UARs on platforms that do not support 64 bit writes.
> + * In such systems it is possible to issue the 64 bits DoorBells through two
> + * consecutive writes, each write 32 bits. The access to a UAR page (which can
> + * be accessible by all threads in the process) must be synchronized
> + * (for example, using a semaphore). Such a synchronization is not required
> + * when ringing DoorBells on different UAR pages.
> + * A port with 512 Tx queues uses 8, 4kBytes, UAR pages which are shared
> + * among the ports.
> + */
> +#define MLX5_UAR_PAGE_NUM_MAX 64
> +#define MLX5_UAR_PAGE_NUM_MASK ((MLX5_UAR_PAGE_NUM_MAX) - 1)
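Minor: it may be worth making the comment a bit more explicit about how a
queue picks its lock. If I read the mlx5_txq.c hunk further down correctly,
it is simply the queue's UAR page index masked into this 64-entry array;
with made-up numbers (page size and mmap offset below are hypothetical):

	size_t page_size = 4096;
	uint64_t uar_mmap_offset = 0x9000;	/* hypothetical */
	unsigned int lock_idx = (uar_mmap_offset / page_size) &
				MLX5_UAR_PAGE_NUM_MASK;	/* -> 9 */
	/* every queue whose UAR lands on that page uses priv->uar_lock[9] */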
>
>  /* Log 2 of the default number of strides per WQE for Multi-Packet RQ. */
>  #define MLX5_MPRQ_STRIDE_NUM_N 6U
> diff --git a/drivers/net/mlx5/mlx5_rxq.c b/drivers/net/mlx5/mlx5_rxq.c
> index 08dd559..820048f 100644
> --- a/drivers/net/mlx5/mlx5_rxq.c
> +++ b/drivers/net/mlx5/mlx5_rxq.c
> @@ -643,7 +643,8 @@
>  	doorbell = (uint64_t)doorbell_hi << 32;
>  	doorbell |= rxq->cqn;
>  	rxq->cq_db[MLX5_CQ_ARM_DB] = rte_cpu_to_be_32(doorbell_hi);
> -	rte_write64(rte_cpu_to_be_64(doorbell), cq_db_reg);
> +	mlx5_uar_write64(rte_cpu_to_be_64(doorbell),
> +			 cq_db_reg, rxq->uar_lock_cq);
>  }
>
>  /**
> @@ -1445,6 +1446,9 @@ struct mlx5_rxq_ctrl *
>  	tmpl->rxq.elts_n = log2above(desc);
>  	tmpl->rxq.elts =
>  		(struct rte_mbuf *(*)[1 << tmpl->rxq.elts_n])(tmpl + 1);
> +#ifndef RTE_ARCH_64
> +	tmpl->rxq.uar_lock_cq = &priv->uar_lock_cq;
> +#endif
>  	tmpl->idx = idx;
>  	rte_atomic32_inc(&tmpl->refcnt);
>  	LIST_INSERT_HEAD(&priv->rxqsctrl, tmpl, next);
> diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> index a7ed8d8..ec35ea0 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.c
> +++ b/drivers/net/mlx5/mlx5_rxtx.c
> @@ -495,6 +495,7 @@
>  	volatile struct mlx5_wqe_ctrl *last_wqe = NULL;
>  	unsigned int segs_n = 0;
>  	const unsigned int max_inline = txq->max_inline;
> +	uint64_t addr_64;
>
>  	if (unlikely(!pkts_n))
>  		return 0;
> @@ -711,12 +712,12 @@
>  			ds = 3;
>  use_dseg:
>  			/* Add the remaining packet as a simple ds. */
> -			addr = rte_cpu_to_be_64(addr);
> +			addr_64 = rte_cpu_to_be_64(addr);
>  			*dseg = (rte_v128u32_t){
>  				rte_cpu_to_be_32(length),
>  				mlx5_tx_mb2mr(txq, buf),
> -				addr,
> -				addr >> 32,
> +				addr_64,
> +				addr_64 >> 32,
>  			};
>  			++ds;
>  			if (!segs_n)
> @@ -750,12 +751,12 @@
>  		total_length += length;
>  #endif
>  		/* Store segment information. */
> -		addr = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf, uintptr_t));
> +		addr_64 = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf, uintptr_t));
>  		*dseg = (rte_v128u32_t){
>  			rte_cpu_to_be_32(length),
>  			mlx5_tx_mb2mr(txq, buf),
> -			addr,
> -			addr >> 32,
> +			addr_64,
> +			addr_64 >> 32,
>  		};
>  		(*txq->elts)[++elts_head & elts_m] = buf;
>  		if (--segs_n)
> @@ -1450,6 +1451,7 @@
>  	unsigned int mpw_room = 0;
>  	unsigned int inl_pad = 0;
>  	uint32_t inl_hdr;
> +	uint64_t addr_64;
>  	struct mlx5_mpw mpw = {
>  		.state = MLX5_MPW_STATE_CLOSED,
>  	};
> @@ -1586,13 +1588,13 @@
>  				((uintptr_t)mpw.data.raw + inl_pad);
>  			(*txq->elts)[elts_head++ & elts_m] = buf;
> -			addr = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf,
> -						uintptr_t));
> +			addr_64 = rte_cpu_to_be_64(
> +					rte_pktmbuf_mtod(buf, uintptr_t));
>  			*dseg = (rte_v128u32_t) {
>  				rte_cpu_to_be_32(length),
>  				mlx5_tx_mb2mr(txq, buf),
> -				addr,
> -				addr >> 32,
> +				addr_64,
> +				addr_64 >> 32,
>  			};
>  			mpw.data.raw = (volatile void *)(dseg + 1);
>  			mpw.total_len += (inl_pad + sizeof(*dseg));
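The hunk just above is the one checkpatch flags (OPEN_ENDED_LINE at
mlx5_rxtx.c:1591), since the new line ends with a '('. Breaking after the
inner argument instead should keep checkpatch quiet, e.g. something like:

			addr_64 = rte_cpu_to_be_64(rte_pktmbuf_mtod(buf,
								    uintptr_t));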
> diff --git a/drivers/net/mlx5/mlx5_rxtx.h b/drivers/net/mlx5/mlx5_rxtx.h
> index 0007be0..2448d73 100644
> --- a/drivers/net/mlx5/mlx5_rxtx.h
> +++ b/drivers/net/mlx5/mlx5_rxtx.h
> @@ -26,6 +26,8 @@
>  #include
>  #include
>  #include
> +#include
> +#include
>
>  #include "mlx5_utils.h"
>  #include "mlx5.h"
> @@ -115,6 +117,10 @@ struct mlx5_rxq_data {
>  	void *cq_uar; /* CQ user access region. */
>  	uint32_t cqn; /* CQ number. */
>  	uint8_t cq_arm_sn; /* CQ arm seq number. */
> +#ifndef RTE_ARCH_64
> +	rte_spinlock_t *uar_lock_cq;
> +	/* CQ (UAR) access lock required for 32bit implementations */
> +#endif
>  	uint32_t tunnel; /* Tunnel information. */
>  } __rte_cache_aligned;
>
> @@ -196,6 +202,10 @@ struct mlx5_txq_data {
>  	volatile void *bf_reg; /* Blueflame register remapped. */
>  	struct rte_mbuf *(*elts)[]; /* TX elements. */
>  	struct mlx5_txq_stats stats; /* TX queue counters. */
> +#ifndef RTE_ARCH_64
> +	rte_spinlock_t *uar_lock;
> +	/* UAR access lock required for 32bit implementations */
> +#endif
>  } __rte_cache_aligned;
>
>  /* Verbs Rx queue elements. */
> @@ -348,6 +358,63 @@ uint16_t mlx5_rx_burst_vec(void *dpdk_txq, struct rte_mbuf **pkts,
>  uint32_t mlx5_rx_addr2mr_bh(struct mlx5_rxq_data *rxq, uintptr_t addr);
>  uint32_t mlx5_tx_addr2mr_bh(struct mlx5_txq_data *txq, uintptr_t addr);
>
> +/**
> + * Provide safe 64bit store operation to mlx5 UAR region for both 32bit and
> + * 64bit architectures.
> + *
> + * @param val
> + *   value to write in CPU endian format.
> + * @param addr
> + *   Address to write to.
> + * @param lock
> + *   Address of the lock to use for that UAR access.
> + */
> +static __rte_always_inline void
> +__mlx5_uar_write64_relaxed(uint64_t val, volatile void *addr,
> +			   rte_spinlock_t *lock __rte_unused)
> +{
> +#ifdef RTE_ARCH_64
> +	rte_write64_relaxed(val, addr);
> +#else /* !RTE_ARCH_64 */
> +	rte_spinlock_lock(lock);
> +	rte_write32_relaxed(val, addr);
> +	rte_io_wmb();
> +	rte_write32_relaxed(val >> 32,
> +			    (volatile void *)((volatile char *)addr + 4));
> +	rte_spinlock_unlock(lock);
> +#endif
> +}
> +
> +/**
> + * Provide safe 64bit store operation to mlx5 UAR region for both 32bit and
> + * 64bit architectures while guaranteeing the order of execution with the
> + * code being executed.
> + *
> + * @param val
> + *   value to write in CPU endian format.
> + * @param addr
> + *   Address to write to.
> + * @param lock
> + *   Address of the lock to use for that UAR access.
> + */
> +static __rte_always_inline void
> +__mlx5_uar_write64(uint64_t val, volatile void *addr, rte_spinlock_t *lock)
> +{
> +	rte_io_wmb();
> +	__mlx5_uar_write64_relaxed(val, addr, lock);
> +}
> +
> +/* Assist macros, used instead of directly calling the functions they wrap. */
> +#ifdef RTE_ARCH_64
> +#define mlx5_uar_write64_relaxed(val, dst, lock) \
> +		__mlx5_uar_write64_relaxed(val, dst, NULL)
> +#define mlx5_uar_write64(val, dst, lock) __mlx5_uar_write64(val, dst, NULL)
> +#else
> +#define mlx5_uar_write64_relaxed(val, dst, lock) \
> +		__mlx5_uar_write64_relaxed(val, dst, lock)
> +#define mlx5_uar_write64(val, dst, lock) __mlx5_uar_write64(val, dst, lock)
> +#endif
> +
>  #ifndef NDEBUG
>  /**
>   * Verify or set magic value in CQE.
> @@ -614,7 +681,7 @@ uint16_t mlx5_rx_burst_vec(void *dpdk_txq, struct rte_mbuf **pkts,
>  	*txq->qp_db = rte_cpu_to_be_32(txq->wqe_ci);
>  	/* Ensure ordering between DB record and BF copy. */
>  	rte_wmb();
> -	*dst = *src;
> +	mlx5_uar_write64_relaxed(*src, dst, txq->uar_lock);
>  	if (cond)
>  		rte_wmb();
>  }
> diff --git a/drivers/net/mlx5/mlx5_txq.c b/drivers/net/mlx5/mlx5_txq.c
> index 669b913..dc786d4 100644
> --- a/drivers/net/mlx5/mlx5_txq.c
> +++ b/drivers/net/mlx5/mlx5_txq.c
> @@ -255,6 +255,9 @@
>  	struct mlx5_txq_ctrl *txq_ctrl;
>  	int already_mapped;
>  	size_t page_size = sysconf(_SC_PAGESIZE);
> +#ifndef RTE_ARCH_64
> +	unsigned int lock_idx;
> +#endif
>
>  	memset(pages, 0, priv->txqs_n * sizeof(uintptr_t));
>  	/*
> @@ -281,7 +284,7 @@
>  		}
>  		/* new address in reserved UAR address space. */
>  		addr = RTE_PTR_ADD(priv->uar_base,
> -				   uar_va & (MLX5_UAR_SIZE - 1));
> +				   uar_va & (uintptr_t)(MLX5_UAR_SIZE - 1));
>  		if (!already_mapped) {
>  			pages[pages_n++] = uar_va;
>  			/* fixed mmap to specified address in reserved
> @@ -305,6 +308,12 @@
>  		else
>  			assert(txq_ctrl->txq.bf_reg ==
>  			       RTE_PTR_ADD((void *)addr, off));
> +#ifndef RTE_ARCH_64
> +		/* Assign a UAR lock according to UAR page number */
> +		lock_idx = (txq_ctrl->uar_mmap_offset / page_size) &
> +			   MLX5_UAR_PAGE_NUM_MASK;
> +		txq->uar_lock = &priv->uar_lock[lock_idx];
> +#endif
>  	}
>  	return 0;
>  }
> @@ -511,6 +520,8 @@ struct mlx5_txq_ibv *
>  	rte_atomic32_inc(&txq_ibv->refcnt);
>  	if (qp.comp_mask & MLX5DV_QP_MASK_UAR_MMAP_OFFSET) {
>  		txq_ctrl->uar_mmap_offset = qp.uar_mmap_offset;
> +		DRV_LOG(DEBUG, "port %u: uar_mmap_offset 0x%lx",
> +			dev->data->port_id, txq_ctrl->uar_mmap_offset);
>  	} else {
>  		DRV_LOG(ERR,
>  			"port %u failed to retrieve UAR info, invalid"
> --
> 1.8.3.1