From: Slava Ovsiienko
To: "Phil Yang (Arm Technology China)", Yongseok Koh, Matan Azrad,
 Nélio Laranjeiro, "dev@dpdk.org"
CC: Thomas Monjalon, "jerinj@marvell.com", Honnappa Nagarahalli,
 "Gavin Hu (Arm Technology China)", nd, "stable@dpdk.org", nd
Date: Fri, 6 Sep 2019 12:26:34 +0000
References: <1567680908-31210-1-git-send-email-phil.yang@arm.com>
 <1567680908-31210-2-git-send-email-phil.yang@arm.com>
Subject: Re: [dpdk-dev] [PATCH 2/2] net/mlx5: fix Tx CQ doorbell
 synchronization on aarch64

Hi, Phil

Thanks for the explanations, please see below.

> -----Original Message-----
> From: Phil Yang (Arm Technology China)
> Sent: Friday, September 6, 2019 10:20
> To: Slava Ovsiienko; Yongseok Koh; Matan Azrad; Nélio Laranjeiro;
> dev@dpdk.org
> Cc: Thomas Monjalon; jerinj@marvell.com; Honnappa Nagarahalli;
> Gavin Hu (Arm Technology China); nd; stable@dpdk.org; nd
> Subject: RE: [PATCH 2/2] net/mlx5: fix Tx CQ doorbell synchronization on
> aarch64
>
> Hi, Slava
>
> Thanks for your comments.
>
> > -----Original Message-----
> > From: Slava Ovsiienko
> > Sent: Thursday, September 5, 2019 8:12 PM
> > To: Phil Yang (Arm Technology China); yskoh@mellanox.com; Matan Azrad;
> > Nélio Laranjeiro; dev@dpdk.org
> > Cc: thomas@monjalon.net; jerinj@marvell.com; Honnappa Nagarahalli;
> > Gavin Hu (Arm Technology China); nd; stable@dpdk.org
> > Subject: RE: [PATCH 2/2] net/mlx5: fix Tx CQ doorbell synchronization
> > on aarch64
> >
> > Hi, Phil
> >
> > This point is in datapath and performance is very critical.
> > The rte_cio_wmb() may take a lot of CPU cycles, waiting till all
> > previous writes become visible for all external (relating to core) agents.
> > The Tx CQE doorbelling does not need any writes to other locations to
> > be completed,
>
> In my understanding, the PMD needs to wait till all txq fields update is
> completed then ring the doorbell for HW.
> Before the Tx CQE doorbelling, it will update the producer index of work
> queue in Tx queue descriptor (at line 2037).

txq->wqe_pi is an exclusively software field, not related to HW directly.
We should not wait for write completions to this one (assuming tx_burst()
must be called with strict affinity settings and the core can't be changed).
There may be some concern about reading from "last_cqe->wqe_counter" at
line 2037. The compiler barrier was implemented to guarantee this read
is issued before the doorbell write.

As for possible reordering of these operations (reading the index from the
CQE at 2037 and writing to the CQ doorbell register at 2046):

a) The read is performed from an already cached area (we touched this CQE
   while performing the ownership check very recently), so it is quite
   unlikely to be completed after the doorbell write.
b) The only risk of reading wrong data is the case of the CQE being
   overwritten by HW on CQ buffer overflow. We create the CQ ring buffer
   with some extra space, so completions which are "in-flight" can't
   overwrite the CQE being read. A new completion request may be issued by
   setting flags in WQE descriptors and the following SQ doorbell write,
   which is already preceded by a wmb (in mlx5_tx_dbrec_cond_wmb(),
   line 4733). So, it seems there is no chance for the CQE to be
   overwritten.

> The compiler barrier cannot guarantee the ordering of these operations. So
> use the explicit HW fence to achieve that.
>
> As same as the HW Tx doorbell in vectorized Tx burst routine, it uses a write
> memory barrier to enforce the register update visible to HW immediately.
> Section 32.5.2 in
> https://doc.dpdk.org/guides/nics/mlx5.html

This is quite a different case. The PMD builds descriptors (WQEs) in memory
and must guarantee these data are visible to external agents before SQ
(sending queue, not completion queue) doorbelling. There are no vectorized
Tx routines now (since 19.08), but, of course, we still have the "true"
write memory barrier (in mlx5_tx_dbrec_cond_wmb) for this case.

>
> > the only concern is not to reorder/merge the writes to the same
> > doorbell register of the same sending queue in the tx_burst() internal
> > sending loop/subsequent calls.
> >
> > As far as I know - the writes to the same location should not be
> > reordered by any arch (may be merged if memory settings allow this, it
> > is not critical for CQE doorbell), could you, please, explain why we
> > need explicit hardware fence before CQE doorbell update? Do you think
> > doorbell write might be rearranged with previously reads from the ring
> > buffer?
> >
> > WBR,
> > Slava
> >
> > > -----Original Message-----
> > > From: Phil Yang
> > > Sent: Thursday, September 5, 2019 13:55
> > > To: Yongseok Koh; Slava Ovsiienko; Matan Azrad; Nélio Laranjeiro;
> > > dev@dpdk.org
> > > Cc: Thomas Monjalon; jerinj@marvell.com;
> > > Honnappa.Nagarahalli@arm.com; gavin.hu@arm.com; nd@arm.com;
> > > stable@dpdk.org
> > > Subject: [PATCH 2/2] net/mlx5: fix Tx CQ doorbell synchronization on
> > > aarch64
> > >
> > > For the weaker memory model processors, the compiler barrier is not
> > > sufficient to guarantee the coherent memory update be observed by
> > > I/O device.
> > > It needs the coherent I/O memory barrier to enforce the ordering of
> > > Tx completion queue doorbell operation.
> > >
> > > Fixes: da1df1ccabad ("net/mlx5: fix completion queue drain loop")
> > > Cc: stable@dpdk.org
> > >
> > > Suggested-by: Gavin Hu
> > > Signed-off-by: Phil Yang
> > > Reviewed-by: Gavin Hu
> > > ---
> > >  drivers/net/mlx5/mlx5_rxtx.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/net/mlx5/mlx5_rxtx.c b/drivers/net/mlx5/mlx5_rxtx.c
> > > index 4c01187..c11148b 100644
> > > --- a/drivers/net/mlx5/mlx5_rxtx.c
> > > +++ b/drivers/net/mlx5/mlx5_rxtx.c
> > > @@ -2042,7 +2042,7 @@ mlx5_tx_comp_flush(struct mlx5_txq_data *restrict txq,
> > >  	} else {
> > >  		return;
> > >  	}
> > > -	rte_compiler_barrier();
> > > +	rte_cio_wmb();
> > >  	*txq->cq_db = rte_cpu_to_be_32(txq->cq_ci);
> > >  	if (likely(tail != txq->elts_tail)) {
> > >  		mlx5_tx_free_elts(txq, tail, olx);
> > > --
> > > 2.7.4
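
For readers following the thread, the two variants under discussion can be
sketched in standalone C. This is an illustration only, not the mlx5 code:
struct demo_txq and the demo_* functions are hypothetical names, and the C11
fences are assumed stand-ins for the DPDK macros. atomic_signal_fence()
restrains only the compiler (roughly what rte_compiler_barrier() does), while
atomic_thread_fence(memory_order_release) additionally emits a fence
instruction on weakly ordered CPUs such as aarch64. Neither is claimed to
match what rte_cio_wmb() expands to.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for the Tx queue fields named in the thread;
 * this is NOT the real struct mlx5_txq_data. */
struct demo_txq {
	uint16_t cq_ci;            /* software CQ consumer index */
	uint16_t wqe_pi;           /* software-only field, never read by HW */
	volatile uint32_t *cq_db;  /* CQ doorbell record, big-endian */
};

/* Portable host-to-big-endian conversion (stand-in for rte_cpu_to_be_32). */
static uint32_t demo_cpu_to_be_32(uint32_t v)
{
	union { uint32_t word; uint8_t bytes[4]; } out;

	out.bytes[0] = (uint8_t)(v >> 24);
	out.bytes[1] = (uint8_t)(v >> 16);
	out.bytes[2] = (uint8_t)(v >> 8);
	out.bytes[3] = (uint8_t)v;
	return out.word;
}

/* Variant before the patch: only the compiler is prevented from moving
 * the bookkeeping access across the doorbell store. */
static void demo_comp_flush_compiler_barrier(struct demo_txq *txq,
					     uint16_t wqe_counter)
{
	txq->wqe_pi = wqe_counter;                 /* value read from the CQE */
	atomic_signal_fence(memory_order_seq_cst); /* compiler-only barrier */
	*txq->cq_db = demo_cpu_to_be_32(txq->cq_ci);
}

/* Variant after the patch (assumed analogue): a hardware-visible fence
 * orders earlier loads and stores before the doorbell store. */
static void demo_comp_flush_hw_fence(struct demo_txq *txq,
				     uint16_t wqe_counter)
{
	txq->wqe_pi = wqe_counter;
	atomic_thread_fence(memory_order_release); /* may cost CPU cycles */
	*txq->cq_db = demo_cpu_to_be_32(txq->cq_ci);
}
```

Both functions leave the same values in memory on a single core. They differ
only in which reorderings the CPU is still allowed to perform, which is
exactly the cost versus safety trade-off argued above.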