From: Andrew Rybchenko
To: dev@dpdk.org
CC: Olivier Matz, Santosh Shukla, Jerin Jacob, Hemant Agrawal,
    Shreyansh Jain
Date: Tue, 23 Jan 2018 13:15:55 +0000
Message-ID: <1516713372-10572-1-git-send-email-arybchenko@solarflare.com>
In-Reply-To: <1511539591-20966-1-git-send-email-arybchenko@solarflare.com>
References: <1511539591-20966-1-git-send-email-arybchenko@solarflare.com>
Subject: [dpdk-dev] [RFC v2 00/17] mempool: add bucket mempool driver

The patch series starts from generic enhancements suggested by Olivier.
Basically, it adds driver callbacks to calculate the required memory size
and to populate objects using a provided memory area. This makes it
possible to remove the so-called capability flags which were previously
used to tell the generic code how to allocate and slice the allocated
memory into mempool objects.

The clean-up which removes get_capabilities and register_memory_area is
not strictly required, but I think it is the right thing to do. Existing
mempool drivers are updated accordingly.

I've kept rte_mempool_populate_iova_tab() intact since it does not seem
to be directly related to the XMEM API functions.

The patch series adds a bucket mempool driver which allocates (both
physically and virtually) contiguous blocks of objects, and adds mempool
API to dequeue such blocks. The driver is still capable of providing
separate objects, but it is definitely more heavy-weight than the
ring/stack drivers. It will be used by future Solarflare driver
enhancements which utilize physically contiguous blocks in the NIC
hardware/firmware.

The target use case is to dequeue objects in blocks and to enqueue
separate objects back (which are collected in buckets to be dequeued
again). So the mempool with the bucket driver is created by an
application and provided to a networking PMD receive queue. The choice
of the bucket driver is made using rte_eth_dev_pool_ops_supported().
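For illustration only, here is a minimal sketch of how an application
might select the bucket driver for its Rx mbuf pool. It is not part of
the series: the helper name, pool name, sizing values and the ops name
"bucket" are assumptions, while rte_eth_dev_pool_ops_supported() and
rte_pktmbuf_pool_create_by_ops() are existing ethdev/mbuf APIs.

#include <rte_ethdev.h>
#include <rte_mbuf.h>
#include <rte_mempool.h>

/*
 * Illustrative sketch only (not part of the series).  Create the Rx mbuf
 * pool with the bucket ops when the PMD reports them as supported and
 * preferred (return value 1); otherwise fall back to the default ops by
 * passing NULL to rte_pktmbuf_pool_create_by_ops().  The pool name,
 * cache size and ops name "bucket" are assumptions.
 */
static struct rte_mempool *
rx_pool_create(uint16_t port_id, unsigned int nb_mbufs, int socket_id)
{
	const char *ops_name = "bucket";

	if (rte_eth_dev_pool_ops_supported(port_id, ops_name) != 1)
		ops_name = NULL;	/* let the default mempool ops be used */

	return rte_pktmbuf_pool_create_by_ops("rx_pool", nb_mbufs,
					      256 /* cache */, 0 /* priv */,
					      RTE_MBUF_DEFAULT_BUF_SIZE,
					      socket_id, ops_name);
}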
A PMD that relies upon contiguous block allocation should report the
bucket driver as the only supported and preferred one.

The benefit of the contiguous block dequeue operation is demonstrated by
performance measurements using the mempool autotest with minor
enhancements:
 - in the original test the bulk sizes are powers of two, which is
   unacceptable for us, so they are changed to multiples of
   contig_block_size;
 - the test code is duplicated to support plain dequeue and
   dequeue_contig_blocks;
 - all the extra test variations (with/without cache etc.) are
   eliminated;
 - a fake read from the dequeued buffer is added (in both cases) to
   simulate mbuf access.

start performance test for bucket (without cache)
mempool_autotest cache= 0 cores= 1 n_get_bulk= 15 n_put_bulk= 1 n_keep= 30 Srate_persec= 111935488
mempool_autotest cache= 0 cores= 1 n_get_bulk= 15 n_put_bulk= 1 n_keep= 60 Srate_persec= 115290931
mempool_autotest cache= 0 cores= 1 n_get_bulk= 15 n_put_bulk= 15 n_keep= 30 Srate_persec= 353055539
mempool_autotest cache= 0 cores= 1 n_get_bulk= 15 n_put_bulk= 15 n_keep= 60 Srate_persec= 353330790
mempool_autotest cache= 0 cores= 2 n_get_bulk= 15 n_put_bulk= 1 n_keep= 30 Srate_persec= 224657407
mempool_autotest cache= 0 cores= 2 n_get_bulk= 15 n_put_bulk= 1 n_keep= 60 Srate_persec= 230411468
mempool_autotest cache= 0 cores= 2 n_get_bulk= 15 n_put_bulk= 15 n_keep= 30 Srate_persec= 706700902
mempool_autotest cache= 0 cores= 2 n_get_bulk= 15 n_put_bulk= 15 n_keep= 60 Srate_persec= 703673139
mempool_autotest cache= 0 cores= 4 n_get_bulk= 15 n_put_bulk= 1 n_keep= 30 Srate_persec= 425236887
mempool_autotest cache= 0 cores= 4 n_get_bulk= 15 n_put_bulk= 1 n_keep= 60 Srate_persec= 437295512
mempool_autotest cache= 0 cores= 4 n_get_bulk= 15 n_put_bulk= 15 n_keep= 30 Srate_persec= 1343409356
mempool_autotest cache= 0 cores= 4 n_get_bulk= 15 n_put_bulk= 15 n_keep= 60 Srate_persec= 1336567397

start performance test for bucket (without cache + contiguous dequeue)
mempool_autotest cache= 0 cores= 1 n_get_bulk= 15 n_put_bulk= 1 n_keep= 30 Crate_persec= 122945536
mempool_autotest cache= 0 cores= 1 n_get_bulk= 15 n_put_bulk= 1 n_keep= 60 Crate_persec= 126458265
mempool_autotest cache= 0 cores= 1 n_get_bulk= 15 n_put_bulk= 15 n_keep= 30 Crate_persec= 374262988
mempool_autotest cache= 0 cores= 1 n_get_bulk= 15 n_put_bulk= 15 n_keep= 60 Crate_persec= 377316966
mempool_autotest cache= 0 cores= 2 n_get_bulk= 15 n_put_bulk= 1 n_keep= 30 Crate_persec= 244842496
mempool_autotest cache= 0 cores= 2 n_get_bulk= 15 n_put_bulk= 1 n_keep= 60 Crate_persec= 251618917
mempool_autotest cache= 0 cores= 2 n_get_bulk= 15 n_put_bulk= 15 n_keep= 30 Crate_persec= 751226060
mempool_autotest cache= 0 cores= 2 n_get_bulk= 15 n_put_bulk= 15 n_keep= 60 Crate_persec= 756233010
mempool_autotest cache= 0 cores= 4 n_get_bulk= 15 n_put_bulk= 1 n_keep= 30 Crate_persec= 462068120
mempool_autotest cache= 0 cores= 4 n_get_bulk= 15 n_put_bulk= 1 n_keep= 60 Crate_persec= 476997221
mempool_autotest cache= 0 cores= 4 n_get_bulk= 15 n_put_bulk= 15 n_keep= 30 Crate_persec= 1432171313
mempool_autotest cache= 0 cores= 4 n_get_bulk= 15 n_put_bulk= 15 n_keep= 60 Crate_persec= 1438829771

The number of objects in a contiguous block is a function of the bucket
memory size (a .config option) and the total element size. In the
future, an additional API with the possibility to pass parameters at
mempool allocation time may be added.

The series breaks the ABI since it changes rte_mempool_ops. It also
removes rte_mempool_ops_register_memory_area() and
rte_mempool_ops_get_capabilities() since the corresponding callbacks are
removed.
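Going back to the block dequeue operation, here is a rough,
non-authoritative sketch of a user dequeuing one contiguous block and
returning its objects individually. The names rte_mempool_ops_get_info(),
struct rte_mempool_info with a contig_block_size field, and
rte_mempool_get_contig_blocks() are assumed from this series and may
still change; the per-object stride is assumed to be the total element
size.

#include <errno.h>

#include <rte_common.h>
#include <rte_mempool.h>

/*
 * Rough sketch of the block dequeue usage.  API names are taken from this
 * series (rte_mempool_ops_get_info, rte_mempool_get_contig_blocks,
 * struct rte_mempool_info::contig_block_size) and may still change.
 * The per-object stride is assumed to be the total element size.
 */
static int
use_one_contig_block(struct rte_mempool *mp)
{
	const size_t stride = mp->header_size + mp->elt_size + mp->trailer_size;
	struct rte_mempool_info info;
	void *first_obj;
	unsigned int i;

	if (rte_mempool_ops_get_info(mp, &info) < 0 ||
	    info.contig_block_size == 0)
		return -ENOTSUP;	/* driver cannot dequeue blocks */

	/* Dequeue a single contiguous block of contig_block_size objects. */
	if (rte_mempool_get_contig_blocks(mp, &first_obj, 1) < 0)
		return -ENOBUFS;	/* no complete block available */

	for (i = 0; i < info.contig_block_size; i++) {
		void *obj = RTE_PTR_ADD(first_obj, i * stride);

		(void)*(volatile char *)obj;	/* fake read, as in the autotest */
		rte_mempool_put(mp, obj);	/* objects go back one by one */
	}

	return 0;
}

This mirrors the target use case above: whole blocks are dequeued, while
objects are enqueued back one by one and re-collected into buckets by the
driver.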
The target DPDK release is 18.05.

v2:
 - add driver ops to calculate the required memory size and to populate
   mempool objects, and remove the extra flags which were previously
   required to control it
 - transition the octeontx and dpaa drivers to the new callbacks
 - change the info API so that the API user can obtain information from
   the driver, such as the contiguous block size
 - remove get_capabilities (not required any more; it may be substituted
   with more data in the info get API)
 - remove register_memory_area since it is superseded by the populate
   callback, which can do more
 - use SPDX tags
 - avoid affinity of all objects to a single lcore
 - fix bucket get_count
 - deprecate the XMEM API
 - avoid introduction of a new function to flush the cache
 - fix the NO_CACHE_ALIGN case in the bucket mempool

Andrew Rybchenko (10):
  mempool: fix phys contig check if populate default skipped
  mempool: add op to calculate memory size to be allocated
  mempool/octeontx: add callback to calculate memory size
  mempool: add op to populate objects using provided memory
  mempool/octeontx: implement callback to populate objects
  mempool: remove callback to get capabilities
  mempool: deprecate xmem functions
  mempool/octeontx: prepare to remove register memory area op
  mempool/dpaa: convert to use populate driver op
  mempool: remove callback to register memory area

Artem V. Andreev (7):
  mempool: ensure the mempool is initialized before populating
  mempool/bucket: implement bucket mempool manager
  mempool: support flushing the default cache of the mempool
  mempool: implement abstract mempool info API
  mempool: support block dequeue operation
  mempool/bucket: implement block dequeue operation
  mempool/bucket: do not allow one lcore to grab all buckets

 MAINTAINERS                                        |   9 +
 config/common_base                                 |   2 +
 drivers/mempool/Makefile                           |   1 +
 drivers/mempool/bucket/Makefile                    |  27 +
 drivers/mempool/bucket/rte_mempool_bucket.c        | 626 +++++++++++++++++++++
 .../mempool/bucket/rte_mempool_bucket_version.map  |   4 +
 drivers/mempool/dpaa/dpaa_mempool.c                |  13 +-
 drivers/mempool/octeontx/rte_mempool_octeontx.c    |  63 ++-
 lib/librte_mempool/rte_mempool.c                   | 192 ++++---
 lib/librte_mempool/rte_mempool.h                   | 366 +++++++++---
 lib/librte_mempool/rte_mempool_ops.c               |  48 +-
 lib/librte_mempool/rte_mempool_version.map         |  11 +-
 mk/rte.app.mk                                      |   1 +
 13 files changed, 1184 insertions(+), 179 deletions(-)
 create mode 100644 drivers/mempool/bucket/Makefile
 create mode 100644 drivers/mempool/bucket/rte_mempool_bucket.c
 create mode 100644 drivers/mempool/bucket/rte_mempool_bucket_version.map

--
2.7.4