From: Yongseok Koh
To: Ferruh Yigit
CC: Adrien Mazarguil, Nélio Laranjeiro, "dev@dpdk.org"
Date: Thu, 10 May 2018 03:00:44 +0000
Message-ID: <2DDD3185-6284-4BA3-A187-AED4C96407EB@mellanox.com>
In-Reply-To: <274843e9-ed82-9576-baf8-a704babf64c5@intel.com>
References: <20180502231654.7596-1-yskoh@mellanox.com> <20180509110906.19462-1-yskoh@mellanox.com> <20180509110906.19462-5-yskoh@mellanox.com> <274843e9-ed82-9576-baf8-a704babf64c5@intel.com>
Subject: Re: [dpdk-dev] [PATCH v2 4/4] net/mlx4: add new Memory Region support
List-Id: DPDK patches and discussions

> On May 9, 2018, at 4:12 PM, Ferruh Yigit wrote:
>
> On 5/9/2018 12:09 PM, Yongseok Koh wrote:
>> This is the new design of Memory Region (MR) for mlx PMD, in order to:
>> - Accommodate the new memory hotplug model.
>> - Support non-contiguous Mempool.
>>
>> There are multiple layers for MR search.
>>
>> L0 is to look up the last-hit entry which is pointed by mr_ctrl->mru (Most
>> Recently Used). If L0 misses, L1 is to look up the address in a fixed-sized
>> array by linear search. L0/L1 is in an inline function -
>> mlx4_mr_lookup_cache().
>>
>> If L1 misses, the bottom-half function is called to look up the address
>> from the bigger local cache of the queue. This is L2 - mlx4_mr_addr2mr_bh()
>> and it is not an inline function. Data structure for L2 is the Binary Tree.
>>
>> If L2 misses, the search falls into the slowest path which takes locks in
>> order to access global device cache (priv->mr.cache) which is also a B-tree
>> and caches the original MR list (priv->mr.mr_list) of the device. Unless
>> the global cache is overflowed, it is all-inclusive of the MR list. This is
>> L3 - mlx4_mr_lookup_dev(). The size of the L3 cache table is limited and
>> can't be expanded on the fly due to deadlock. Refer to the comments in the
>> code for the details - mr_lookup_dev(). If L3 is overflowed, the list will
>> have to be searched directly bypassing the cache although it is slower.
>>
>> If L3 misses, a new MR for the address should be created -
>> mlx4_mr_create(). When it creates a new MR, it tries to register adjacent
>> memsegs as much as possible which are virtually contiguous around the
>> address. This must take two locks - memory_hotplug_lock and
>> priv->mr.rwlock. Due to memory_hotplug_lock, there can't be any
>> allocation/free of memory inside.
>>
>> In the free callback of the memory hotplug event, freed space is searched
>> from the MR list and corresponding bits are cleared from the bitmap of MRs.
>> This can fragment a MR and the MR will have multiple search entries in the
>> caches. Once there's a change by the event, the global cache must be
>> rebuilt and all the per-queue caches will be flushed as well. If memory is
>> frequently freed in run-time, that may cause jitter on dataplane processing
>> in the worst case by incurring MR cache flush and rebuild. But, it would be
>> the least probable scenario.
>>
>> To guarantee the most optimal performance, it is highly recommended to use
>> an EAL option - '--socket-mem'. Then, the reserved memory will be pinned
>> and won't be freed dynamically. And it is also recommended to configure
>> per-lcore cache of Mempool. Even though there're many MRs for a device or
>> MRs are highly fragmented, the cache of Mempool will be much helpful to
>> reduce misses on per-queue caches anyway.
>>
>> '--legacy-mem' is also supported.
>>
>> Signed-off-by: Yongseok Koh
>
> <...>
>
>> +/**
>> + * Insert an entry to B-tree lookup table.
>> + *
>> + * @param bt
>> + *   Pointer to B-tree structure.
>> + * @param entry
>> + *   Pointer to new entry to insert.
>> + *
>> + * @return
>> + *   0 on success, -1 on failure.
>> + */
>> +static int
>> +mr_btree_insert(struct mlx4_mr_btree *bt, struct mlx4_mr_cache *entry)
>> +{
>> +	struct mlx4_mr_cache *lkp_tbl;
>> +	uint16_t idx = 0;
>> +	size_t shift;
>> +
>> +	assert(bt != NULL);
>> +	assert(bt->len <= bt->size);
>> +	assert(bt->len > 0);
>> +	lkp_tbl = *bt->table;
>> +	/* Find out the slot for insertion. */
>> +	if (mr_btree_lookup(bt, &idx, entry->start) != UINT32_MAX) {
>> +		DEBUG("abort insertion to B-tree(%p):"
>> +		      " already exist at idx=%u [0x%lx, 0x%lx) lkey=0x%x",
>> +		      (void *)bt, idx, entry->start, entry->end, entry->lkey);
>
> This and various other logs causing 32bits build error because of %lx usage. Can
> you please check them?
>
> I am feeling sad to complain a patch like this just because of log format issue,
> we should find a solution to this issue as community, either checkpatch checks
> or automated 32bit builds, I don't know.

Bummer. I have to change my bad habit of using %lx. We will also add a 32-bit
build check to our internal system to catch this kind of mistake beforehand.

I will work with Shahaf to fix it and rebase next-net-mlx.

Thanks,
Yongseok
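
For reference, a minimal sketch of one portable way to write such a log format. This is not the actual fix applied to the PMD. It assumes the start and end fields are uintptr_t and lkey is uint32_t; struct mr_entry and mr_entry_format() are hypothetical names used only for illustration. The PRIxPTR and PRIx32 macros from <inttypes.h> expand to the matching conversion on both 32-bit and 64-bit targets, whereas %lx is only correct when long is as wide as the argument.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the cache entry; field types are assumed. */
struct mr_entry {
	uintptr_t start; /* Inclusive start address. */
	uintptr_t end;   /* Exclusive end address. */
	uint32_t lkey;   /* Memory region key. */
};

/*
 * Format an entry without assuming sizeof(long) == sizeof(uintptr_t).
 * Returns the snprintf() result: the number of characters that would
 * have been written, excluding the terminating NUL.
 */
static int
mr_entry_format(char *buf, size_t len, unsigned int idx,
		const struct mr_entry *e)
{
	return snprintf(buf, len,
			"idx=%u [0x%" PRIxPTR ", 0x%" PRIxPTR ") lkey=0x%" PRIx32,
			idx, e->start, e->end, e->lkey);
}
```

The same format string could be passed to a logging macro instead of snprintf(). Casting each argument to unsigned long and keeping %lx would also build on 32-bit, but the cast hides the real type from the compiler's format checking.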