Subject: RE: [PATCH] mempool: fix get objects from mempool with cache
Date: Tue, 18 Jan 2022 09:25:22 +0100
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D86E0F@smartserver.smartshare.dk>
References: <98CBD80474FA8B44BF855DF32C47DC35D86DB2@smartserver.smartshare.dk>
 <20220114163650.94288-1-mb@smartsharesystems.com>
From: Morten Brørup <mb@smartsharesystems.com>
To: Bruce Richardson <bruce.richardson@intel.com>

> From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> Sent: Monday, 17 January 2022 18.35
>
> On Fri, Jan 14, 2022 at 05:36:50PM +0100, Morten Brørup wrote:
> > A flush threshold for the mempool cache was introduced in DPDK version
> > 1.3, but rte_mempool_do_generic_get() was not completely updated back
> > then, and some inefficiencies were introduced.
> >
> > This patch fixes the following in rte_mempool_do_generic_get():
> >
> > 1. The code that initially screens the cache request was not updated
> > with the change in DPDK version 1.3.
> > The initial screening compared the request length to the cache size,
> > which was correct before, but became irrelevant with the introduction
> > of the flush threshold. E.g. the cache can hold up to flushthresh
> > objects, which is more than its size, so some requests were not served
> > from the cache, even though they could be.
> > The initial screening has now been corrected to match the initial
> > screening in rte_mempool_do_generic_put(), which verifies that a cache
> > is present, and that the length of the request does not overflow the
> > memory allocated for the cache.
> >
> > 2. The function is a helper for rte_mempool_generic_get(), so it must
> > behave according to the description of that function.
> > Specifically, objects must first be returned from the cache,
> > subsequently from the ring.
> > After the change in DPDK version 1.3, this was not the behavior when
> > the request was partially satisfied from the cache; instead, the
> > objects from the ring were returned ahead of the objects from the
> > cache. This is bad for CPUs with a small L1 cache, which benefit from
> > having the hot objects first in the returned array. (This is also the
> > reason why the function returns the objects in reverse order.)
> > Now, all code paths first return objects from the cache, subsequently
> > from the ring.
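
To illustrate points 1 and 2 above, the corrected flow is roughly as
follows. This is a simplified sketch assuming DPDK's <rte_mempool.h>
and <rte_branch_prediction.h>; cache_serve() and ring_dequeue_bulk()
are hypothetical stand-ins for the cache copy-out and the ring/driver
dequeue, so this is not the verbatim patch:

    /*
     * Simplified sketch of the corrected flow; NOT the verbatim patch.
     * cache_serve() and ring_dequeue_bulk() are hypothetical helpers.
     */
    static int
    generic_get_sketch(struct rte_mempool *mp, void **obj_table,
            unsigned int n, struct rte_mempool_cache *cache)
    {
        unsigned int served;

        /* 1. Screening mirrors rte_mempool_do_generic_put(): a cache
         * must be present, and the request must not overflow the
         * memory allocated for the cache. */
        if (unlikely(cache == NULL || n > RTE_MEMPOOL_CACHE_MAX_SIZE))
            return ring_dequeue_bulk(mp, obj_table, n);

        if (likely(n <= cache->len)) {
            /* The likely path: the whole request is served from the
             * cache. */
            cache_serve(cache, obj_table, n);
            return 0;
        }

        /* 2. Partial hit: the hot objects from the cache go first in
         * the returned array, then the remainder comes from the ring.
         * (The real code also tries to backfill the cache from the
         * ring here, cf. point 3 below; omitted for brevity.) */
        served = cache->len;
        cache_serve(cache, obj_table, served);
        return ring_dequeue_bulk(mp, &obj_table[served], n - served);
    }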
> >
> > 3. If the cache could not be backfilled, the function would attempt
> > to get all the requested objects from the ring (instead of only the
> > number of requested objects minus the objects available in the cache),
> > and the function would fail if that failed.
> > Now, the first part of the request is always satisfied from the cache,
> > and if the subsequent backfilling of the cache from the ring fails,
> > only the remaining requested objects are retrieved from the ring.
> >
> > 4. The code flow for satisfying the request from the cache was
> > slightly inefficient:
> > The likely code path, where the objects are simply served from the
> > cache, was treated as unlikely. Now it is treated as likely.
> > And in the code path where the cache was backfilled first, numbers
> > were added to and subtracted from the cache length; now this code path
> > simply sets the cache length to its final value.
> >
> > 5. Some comments were no longer correct.
> > The comments have been updated.
> > Most importantly, the description of the successful return value was
> > inaccurate. Success only returns 0, not >= 0.
> >
> > Signed-off-by: Morten Brørup <mb@smartsharesystems.com>
> > ---
>
> I am a little uncertain about the reversing of the copies taking things
> out of the mempool - for machines where we are not that cache
> constrained, will we lose out on possible optimizations where the
> compiler optimizes the copy loop as a memcpy?

The objects are also returned in reverse order in the code it replaces,
so this behavior is not introduced by this patch; I only describe the
reason for it.

I floated a previous patch, in which the objects were returned in order,
but Jerin argued [1] that we should keep it the way it was, unless I
could show a performance improvement.

So I retracted that patch and split it into two independent patches
instead: this patch for get(), and [3] for put().

While experimenting with rte_memcpy() for these, I couldn't achieve a
performance boost - quite the opposite. So I gave up on it.

Reviewing the x86 variant of rte_memcpy() [2] makes me think that it is
inefficient for copying small bulks of pointers, especially when n is
unknown at compile time, and its code path goes through a great many
branches.

>
> Otherwise the logic all looks correct to me.
>
> /Bruce

[1]: http://inbox.dpdk.org/dev/CALBAE1OjCswxUfaNLWg5y-tnPkFhvvKQ8sJ3JpBoo7ObgeB5OA@mail.gmail.com/
[2]: http://code.dpdk.org/dpdk/latest/source/lib/eal/x86/include/rte_memcpy.h
[3]: http://inbox.dpdk.org/dev/20220117115231.8060-1-mb@smartsharesystems.com/
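
PS: For reference, the copy discussed above is essentially the trivial
reverse-order loop sketched below (a simplification with a hypothetical
function name, not the exact mempool code). Being a reversed copy, it
cannot be turned into a plain memcpy anyway; and with n unknown at
compile time, a loop this simple is easy for the compiler to unroll,
whereas rte_memcpy() first has to branch its way to the right copy size:

    /*
     * Simplified sketch of the reverse-order copy from the cache; not
     * the verbatim rte_mempool_do_generic_get() code. The cache is
     * used as a stack, so the hottest objects sit at the highest
     * indexes of cache_objs.
     */
    static inline void
    cache_copy_reversed(void **obj_table, void * const *cache_objs,
            unsigned int cache_len, unsigned int n)
    {
        unsigned int i;

        for (i = 0; i < n; i++)
            obj_table[i] = cache_objs[cache_len - 1 - i];
    }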