From: Slava Ovsiienko
To: Aaron Conole, "dev@dpdk.org"
CC: John Romein, Raslan Darawsheh, Elena Agostini, Dmitry Kozlyuk, Matan Azrad, Ori Kam, Suanming Mou
Subject: RE: [PATCH v2] common/mlx5: Optimize mlx5 mempool get extmem
Date: Wed, 1 Nov 2023 08:29:49 +0000
References: <20231010143800.102459-1-aconole@redhat.com>
In-Reply-To: <20231010143800.102459-1-aconole@redhat.com>
List-Id: DPDK patches and discussions

Hi,

Thank you for this optimization patch.
My concern is this line:

> + heap = malloc(mp->size * sizeof(struct mlx5_range));

The pool size can be huge, so this might cause a large memory allocation
(on the host CPU side).

What causes the "hours" of registering? A realloc for each pool element?

The mp struct has a "struct rte_mempool_memhdr_list mem_list" member.
I think we should consider populating this list with data from
"struct rte_pktmbuf_extmem *ext_mem" on pool creation.

It seems the rte_mempool_mem_iter() functionality is completely broken
for pools with external memory, and that is why mlx5 implemented
a dedicated branch to handle their registration.

With best regards,
Slava

> -----Original Message-----
> From: Aaron Conole
> Sent: Tuesday, October 10, 2023 5:38 PM
> To: dev@dpdk.org
> Cc: John Romein; Raslan Darawsheh; Elena Agostini; Dmitry Kozlyuk;
> Matan Azrad; Slava Ovsiienko; Ori Kam; Suanming Mou
> Subject: [PATCH v2] common/mlx5: Optimize mlx5 mempool get extmem
>
> From: John Romein
>
> This patch reduces the time to allocate and register tens of gigabytes of GPU
> memory from hours to seconds, by sorting the heap only once instead of for
> each object in the mempool.
>
> Fixes: 690b2a88c2f7 ("common/mlx5: add mempool registration facilities")
>
> Signed-off-by: John Romein
> ---
>  drivers/common/mlx5/mlx5_common_mr.c | 69 ++++++++--------------------
>  1 file changed, 20 insertions(+), 49 deletions(-)
>
> diff --git a/drivers/common/mlx5/mlx5_common_mr.c b/drivers/common/mlx5/mlx5_common_mr.c
> index 40ff9153bd..77b66e444b 100644
> --- a/drivers/common/mlx5/mlx5_common_mr.c
> +++ b/drivers/common/mlx5/mlx5_common_mr.c
> @@ -1389,63 +1389,23 @@ mlx5_mempool_get_chunks(struct rte_mempool *mp, struct mlx5_range **out,
>  	return 0;
>  }
>
> -struct mlx5_mempool_get_extmem_data {
> -	struct mlx5_range *heap;
> -	unsigned int heap_size;
> -	int ret;
> -};
> -
>  static void
>  mlx5_mempool_get_extmem_cb(struct rte_mempool *mp, void *opaque,
>  			   void *obj, unsigned int obj_idx)
>  {
> -	struct mlx5_mempool_get_extmem_data *data = opaque;
> +	struct mlx5_range *heap = opaque;
>  	struct rte_mbuf *mbuf = obj;
>  	uintptr_t addr = (uintptr_t)mbuf->buf_addr;
> -	struct mlx5_range *seg, *heap;
>  	struct rte_memseg_list *msl;
>  	size_t page_size;
>  	uintptr_t page_start;
> -	unsigned int pos = 0, len = data->heap_size, delta;
>
>  	RTE_SET_USED(mp);
> -	RTE_SET_USED(obj_idx);
> -	if (data->ret < 0)
> -		return;
> -	/* Binary search for an already visited page. */
> -	while (len > 1) {
> -		delta = len / 2;
> -		if (addr < data->heap[pos + delta].start) {
> -			len = delta;
> -		} else {
> -			pos += delta;
> -			len -= delta;
> -		}
> -	}
> -	if (data->heap != NULL) {
> -		seg = &data->heap[pos];
> -		if (seg->start <= addr && addr < seg->end)
> -			return;
> -	}
> -	/* Determine the page boundaries and remember them. */
> -	heap = realloc(data->heap, sizeof(heap[0]) * (data->heap_size + 1));
> -	if (heap == NULL) {
> -		free(data->heap);
> -		data->heap = NULL;
> -		data->ret = -1;
> -		return;
> -	}
> -	data->heap = heap;
> -	data->heap_size++;
> -	seg = &heap[data->heap_size - 1];
>  	msl = rte_mem_virt2memseg_list((void *)addr);
>  	page_size = msl != NULL ? msl->page_sz : rte_mem_page_size();
>  	page_start = RTE_PTR_ALIGN_FLOOR(addr, page_size);
> -	seg->start = page_start;
> -	seg->end = page_start + page_size;
> -	/* Maintain the heap order. */
> -	qsort(data->heap, data->heap_size, sizeof(heap[0]),
> -	      mlx5_range_compare_start);
> +	heap[obj_idx].start = page_start;
> +	heap[obj_idx].end = page_start + page_size;
>  }
>
>  /**
> @@ -1457,15 +1417,26 @@ static int
>  mlx5_mempool_get_extmem(struct rte_mempool *mp, struct mlx5_range **out,
>  			unsigned int *out_n)
>  {
> -	struct mlx5_mempool_get_extmem_data data;
> +	unsigned int out_size = 1;
> +	struct mlx5_range *heap;
>
>  	DRV_LOG(DEBUG, "Recovering external pinned pages of mempool %s",
>  		mp->name);
> -	memset(&data, 0, sizeof(data));
> -	rte_mempool_obj_iter(mp, mlx5_mempool_get_extmem_cb, &data);
> -	*out = data.heap;
> -	*out_n = data.heap_size;
> -	return data.ret;
> +	heap = malloc(mp->size * sizeof(struct mlx5_range));
> +	if (heap == NULL)
> +		return -1;
> +	rte_mempool_obj_iter(mp, mlx5_mempool_get_extmem_cb, heap);
> +	qsort(heap, mp->size, sizeof(heap[0]), mlx5_range_compare_start);
> +	/* remove duplicates */
> +	for (unsigned int i = 1; i < mp->size; i++)
> +		if (heap[out_size - 1].start != heap[i].start)
> +			heap[out_size++] = heap[i];
> +	heap = realloc(heap, out_size * sizeof(struct mlx5_range));
> +	if (heap == NULL)
> +		return -1;
> +	*out = heap;
> +	*out_n = out_size;
> +	return 0;
>  }
>
>  /**
> --
> 2.41.0
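
For reference, the sort-once approach in the quoted patch can be sketched
outside DPDK roughly as follows. This is only an illustration under stated
assumptions, not the driver code: struct page_range, page_range_compare_start()
and unique_pages() are hypothetical names standing in for struct mlx5_range,
mlx5_range_compare_start() and mlx5_mempool_get_extmem(), the buffer addresses
are passed in as a plain array instead of being visited by
rte_mempool_obj_iter(), and the page size is assumed to be a single power of
two.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct mlx5_range: a half-open [start, end) range. */
struct page_range {
	uintptr_t start;
	uintptr_t end;
};

/* qsort() comparator ordering ranges by their start address. */
static int
page_range_compare_start(const void *lhs, const void *rhs)
{
	const struct page_range *a = lhs;
	const struct page_range *b = rhs;

	if (a->start > b->start)
		return 1;
	if (a->start < b->start)
		return -1;
	return 0;
}

/*
 * Map every buffer address to its page, sort the whole array once,
 * then compact it in place so each page appears exactly once.
 * page_size is assumed to be a power of two.
 * Returns 0 on success, -1 if the allocation fails.
 */
static int
unique_pages(const uintptr_t *addrs, size_t n, size_t page_size,
	     struct page_range **out, size_t *out_n)
{
	struct page_range *ranges;
	size_t i, used = 1;

	*out = NULL;
	*out_n = 0;
	if (n == 0)
		return 0;
	/* One entry per object: the allocation the review comment asks about. */
	ranges = malloc(n * sizeof(*ranges));
	if (ranges == NULL)
		return -1;
	for (i = 0; i < n; i++) {
		uintptr_t start = addrs[i] & ~((uintptr_t)page_size - 1);

		ranges[i].start = start;
		ranges[i].end = start + page_size;
	}
	/* A single O(n log n) sort instead of one sort per inserted page. */
	qsort(ranges, n, sizeof(*ranges), page_range_compare_start);
	/* Remove duplicates; equal pages are adjacent after sorting. */
	for (i = 1; i < n; i++)
		if (ranges[used - 1].start != ranges[i].start)
			ranges[used++] = ranges[i];
	*out = ranges;
	*out_n = used;
	return 0;
}
```

The caller owns the returned array and releases it with free(). The sketch
deliberately skips the shrinking realloc() from the patch, so the array stays
sized for all n objects; like the patch, its temporary memory use grows with
the number of objects rather than the number of distinct pages, which is the
concern raised above.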