From: Bao-Long Tran <tranbaolong@niometrics.com>
To: anatoly.burakov@intel.com, olivier.matz@6wind.com, arybchenko@solarflare.com
Cc: dev@dpdk.org, users@dpdk.org, ricudis@niometrics.com
Date: Mon, 23 Dec 2019 19:09:29 +0800
Subject: [dpdk-dev] Inconsistent behavior of mempool with regards to hugepage allocation

Hi,

I'm not sure if this is a bug, but I've seen an inconsistency in the behavior
of DPDK with regard to hugepage allocation for rte_mempool. Basically, for the
same mempool size, the number of hugepages allocated changes from run to run.

Here's how I reproduce with DPDK 19.11, IOVA=pa (the default):

1. Reserve 16x 1G hugepages on socket 0.
2. Replace examples/skeleton/basicfwd.c with the code below, then build and run:
   make && ./build/basicfwd
3. At the same time, watch the number of hugepages allocated:
   "watch -n.1 ls /dev/hugepages"
4. Repeat step 2.

If you can reproduce this, you should see that on some runs DPDK allocates 5
hugepages, while on others it allocates 6. When it allocates 6, the output from
step 3 shows that DPDK first tries to allocate 5 hugepages, then unmaps all 5,
retries, and ends up with 6.

For our use case, it's important that DPDK allocate the same number of
hugepages on every run so we can get reproducible results.

Studying the code, this seems to be the behavior of
rte_mempool_populate_default(). If I understand correctly, when the first
attempt fails to get 5 IOVA-contiguous pages, it retries with the
IOVA-contiguity requirement relaxed and eventually winds up with 6 hugepages.
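To see how the pool's memory ends up laid out on a given run, a helper along
these lines can be dropped into the reproducer below, right after
rte_pktmbuf_pool_create() (just an illustrative sketch; dump_memchunk and
dump_pool_layout are made-up names, not part of the reproducer):

#include <stdio.h>
#include <inttypes.h>

#include <rte_mempool.h>

/* Callback for rte_mempool_mem_iter(): print the virtual address, IOVA and
 * length of each memory chunk backing the pool. */
static void
dump_memchunk(struct rte_mempool *mp, void *opaque,
	      struct rte_mempool_memhdr *memhdr, unsigned mem_idx)
{
	(void)mp;
	(void)opaque;
	printf("chunk %u: addr=%p iova=0x%" PRIx64 " len=%zu\n",
	       mem_idx, memhdr->addr, (uint64_t)memhdr->iova, memhdr->len);
}

/* Call this right after rte_pktmbuf_pool_create() in the reproducer. */
static void
dump_pool_layout(struct rte_mempool *mp)
{
	rte_mempool_mem_iter(mp, dump_memchunk, NULL);
}

The number of chunks and their IOVAs should show whether the pool ended up in
one IOVA-contiguous reservation or in several smaller ones.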
Questions:

1. Why does the API sometimes fail to get IOVA-contiguous memory even when
   hugepage memory is abundant?
2. Why does the retry need N+1 hugepages?

Some insight into Q1: from my experiments, it seems the IOVA of the first
hugepage is not guaranteed to be at the start of the IOVA space
(understandably). That could explain the retry when the IOVA of the first
hugepage is near the end of the IOVA space, but I have also seen situations
where the first hugepage is near the beginning of the IOVA space and the first
attempt still failed.

Here's the code:

#include <stdio.h>
#include <stdlib.h>

#include <rte_eal.h>
#include <rte_debug.h>
#include <rte_mbuf.h>

int
main(int argc, char *argv[])
{
	struct rte_mempool *mbuf_pool;
	/* Pool size that needs ~5x 1G hugepages on my setup. */
	unsigned mbuf_pool_size = 2097151;

	int ret = rte_eal_init(argc, argv);
	if (ret < 0)
		rte_exit(EXIT_FAILURE, "Error with EAL initialization\n");

	printf("Creating mbuf pool size=%u\n", mbuf_pool_size);
	mbuf_pool = rte_pktmbuf_pool_create("MBUF_POOL", mbuf_pool_size,
			256, 0, RTE_MBUF_DEFAULT_BUF_SIZE, 0);
	printf("mbuf_pool %p\n", mbuf_pool);

	return 0;
}

Best regards,
BL
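P.S. In case it helps, this is the kind of snippet I used to look at where each
hugepage lands in IOVA space (again just an illustrative sketch; dump_memseg and
dump_memsegs are made-up names, and I call the walk after the pool has been
created):

#include <stdio.h>
#include <inttypes.h>

#include <rte_memory.h>

/* Callback for rte_memseg_walk(): print the VA, IOVA, length and page size of
 * every memory segment EAL has mapped so far. */
static int
dump_memseg(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
	    void *arg)
{
	(void)msl;
	(void)arg;
	printf("va=%p iova=0x%" PRIx64 " len=%zu hugepage_sz=%" PRIu64 "\n",
	       ms->addr, (uint64_t)ms->iova, ms->len, ms->hugepage_sz);
	return 0; /* keep walking */
}

/* Walk all memsegs, e.g. right after rte_pktmbuf_pool_create(). */
static void
dump_memsegs(void)
{
	rte_memseg_walk(dump_memseg, NULL);
}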