From: Raslan Darawsheh
To: Alexander Kozyrev, "dev@dpdk.org"
CC: "stable@dpdk.org", Slava Ovsiienko
Date: Thu, 23 Jul 2020 08:57:27 +0000
In-Reply-To: <20200722203238.14250-1-akozyrev@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH] net/mlx5: fix vectorized mini-CQE prefetching

Hi,

> -----Original Message-----
> From: Alexander Kozyrev
> Sent: Wednesday, July 22, 2020 11:33 PM
> To: dev@dpdk.org
> Cc: stable@dpdk.org; Raslan Darawsheh; Slava Ovsiienko
> Subject: [PATCH] net/mlx5: fix vectorized mini-CQE prefetching
>
> There was optimization work to prefetch all the CQEs before
> their invalidation. It allowed us to speed up the mini-CQE
> decompression process by preheating the cache in the vectorized
> Rx routine.
>
> Prefetching of the next mini-CQE, on the other hand, showed
> no performance difference on the x86 platform. So, that was
> removed. Unfortunately, this caused a performance drop on ARM.
>
> Prefetch the mini-CQE as well as all the soon-to-be-invalidated
> CQEs to get both CQE and mini-CQE on the hot path.
>
> Fixes: 28a4b9632 ("net/mlx5: prefetch CQEs for a faster decompression")
> Cc: stable@dpdk.org
>
> Signed-off-by: Alexander Kozyrev
> Acked-by: Viacheslav Ovsiienko
> ---
>  drivers/net/mlx5/mlx5_rxtx_vec_altivec.h | 3 ++-
>  drivers/net/mlx5/mlx5_rxtx_vec_neon.h    | 3 +++
>  drivers/net/mlx5/mlx5_rxtx_vec_sse.h     | 3 ++-
>  3 files changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> index f5414eebad..cb4ce1a099 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_altivec.h
> @@ -158,7 +158,6 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
>  		for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i)
>  			if (likely(pos + i < mcqe_n))
>  				rte_prefetch0((void *)(cq + pos + i));
> -
>  		/* A.1 load mCQEs into a 128bit register. */
>  		mcqe1 = (vector unsigned char)vec_vsx_ld(0,
>  			(signed int const *)&mcq[pos % 8]);
> @@ -287,6 +286,8 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
>  		pos += MLX5_VPMD_DESCS_PER_LOOP;
>  		/* Move to next CQE and invalidate consumed CQEs. */
>  		if (!(pos & 0x7) && pos < mcqe_n) {
> +			if (pos + 8 < mcqe_n)
> +				rte_prefetch0((void *)(cq + pos + 8));
>  			mcq = (void *)&(cq + pos)->pkt_info;
>  			for (i = 0; i < 8; ++i)
>  				cq[inv++].op_own = MLX5_CQE_INVALIDATE;
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> index 555c342626..6c3149523e 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_neon.h
> @@ -145,6 +145,7 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
>  			     -1UL << ((mcqe_n - pos) *
>  				      sizeof(uint16_t) * 8) : 0);
>  #endif
> +
>  		for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i)
>  			if (likely(pos + i < mcqe_n))
>  				rte_prefetch0((void *)(cq + pos + i));
> @@ -227,6 +228,8 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
>  		pos += MLX5_VPMD_DESCS_PER_LOOP;
>  		/* Move to next CQE and invalidate consumed CQEs. */
>  		if (!(pos & 0x7) && pos < mcqe_n) {
> +			if (pos + 8 < mcqe_n)
> +				rte_prefetch0((void *)(cq + pos + 8));
>  			mcq = (void *)&(cq + pos)->pkt_info;
>  			for (i = 0; i < 8; ++i)
>  				cq[inv++].op_own = MLX5_CQE_INVALIDATE;
> diff --git a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> index 34e3397115..554924d7fc 100644
> --- a/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> +++ b/drivers/net/mlx5/mlx5_rxtx_vec_sse.h
> @@ -135,7 +135,6 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
>  		for (i = 0; i < MLX5_VPMD_DESCS_PER_LOOP; ++i)
>  			if (likely(pos + i < mcqe_n))
>  				rte_prefetch0((void *)(cq + pos + i));
> -
>  		/* A.1 load mCQEs into a 128bit register. */
>  		mcqe1 = _mm_loadu_si128((__m128i *)&mcq[pos % 8]);
>  		mcqe2 = _mm_loadu_si128((__m128i *)&mcq[pos % 8 + 2]);
> @@ -214,6 +213,8 @@ rxq_cq_decompress_v(struct mlx5_rxq_data *rxq, volatile struct mlx5_cqe *cq,
>  		pos += MLX5_VPMD_DESCS_PER_LOOP;
>  		/* Move to next CQE and invalidate consumed CQEs. */
>  		if (!(pos & 0x7) && pos < mcqe_n) {
> +			if (pos + 8 < mcqe_n)
> +				rte_prefetch0((void *)(cq + pos + 8));
>  			mcq = (void *)(cq + pos);
>  			for (i = 0; i < 8; ++i)
>  				cq[inv++].op_own = MLX5_CQE_INVALIDATE;
> --
> 2.24.1

Patch applied to next-net-mlx,

Kindest regards,
Raslan Darawsheh
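
For readers following the thread, the look-ahead prefetch that the quoted patch restores can be sketched outside the driver roughly as below. This is only an illustration of the loop shape, not the mlx5 code: struct demo_cqe, demo_prefetch0(), demo_walk(), DEMO_DESCS_PER_LOOP and DEMO_INVALIDATE are hypothetical names, the group size of 4 is an assumption, and __builtin_prefetch stands in for rte_prefetch0(). The function counts the look-ahead prefetches it issues so the behaviour can be checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; values are assumptions, not taken from the driver. */
#define DEMO_DESCS_PER_LOOP 4
#define DEMO_INVALIDATE     0xf0

/* Hypothetical 64-byte completion entry, not the real struct mlx5_cqe. */
struct demo_cqe {
	uint8_t payload[63];
	uint8_t op_own;
};

/* Stand-in for rte_prefetch0(): a read prefetch with high temporal locality. */
static inline void demo_prefetch0(const volatile void *p)
{
#if defined(__GNUC__) || defined(__clang__)
	__builtin_prefetch((const void *)p, 0, 3);
#else
	(void)p; /* no-op where the builtin is unavailable */
#endif
}

/*
 * Walk mcqe_n entries in groups of DEMO_DESCS_PER_LOOP.
 * Returns the number of look-ahead prefetches issued.
 */
static unsigned int demo_walk(volatile struct demo_cqe *cq, unsigned int mcqe_n)
{
	unsigned int pos = 0, inv = 0, lookahead = 0, i;

	assert(cq != NULL);
	while (pos < mcqe_n) {
		/* Prefetch the entries that are about to be invalidated. */
		for (i = 0; i < DEMO_DESCS_PER_LOOP; ++i)
			if (pos + i < mcqe_n)
				demo_prefetch0(cq + pos + i);

		/* ... decompression of the current group would happen here ... */

		pos += DEMO_DESCS_PER_LOOP;
		/* Every 8 positions, move on and invalidate the consumed entries. */
		if (!(pos & 0x7) && pos < mcqe_n) {
			/* Look-ahead: warm the entry one block of 8 ahead. */
			if (pos + 8 < mcqe_n) {
				demo_prefetch0(cq + pos + 8);
				++lookahead;
			}
			for (i = 0; i < 8; ++i)
				cq[inv++].op_own = DEMO_INVALIDATE;
		}
	}
	return lookahead;
}
```

With a 32-entry array, the boundary checks fire at positions 8, 16 and 24, and only the first two satisfy pos + 8 < mcqe_n, so two look-ahead prefetches are issued and entries 0 to 23 end up invalidated. Whether the extra prefetch helps on a given CPU is an empirical question, as the commit message notes for x86 versus ARM.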