From: "Wang, Xiao W"
To: Olivier Matz, Andrew Rybchenko
CC: "dev@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH] mempool: optimize copy in cache get
Date: Mon, 1 Jul 2019 14:21:41 +0000

Hi,

> -----Original Message-----
> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> Sent: Monday, July 1, 2019 9:11 PM
> To: Andrew Rybchenko
> Cc: Wang, Xiao W; dev@dpdk.org
> Subject: Re: [PATCH] mempool: optimize copy in cache get
>
> Hi,
>
> On Tue, May 21, 2019 at 12:34:55PM +0300, Andrew Rybchenko wrote:
> > On 5/21/19 12:03 PM, Xiao Wang wrote:
> > > Use rte_memcpy to improve the pointer array copy.
> > > This optimization method has already been applied to
> > > __mempool_generic_put() [1]; this patch applies it to
> > > __mempool_generic_get(). A slight performance gain can be observed
> > > in the testpmd txonly test.
> > >
> > > [1] 863bfb47449 ("mempool: optimize copy in cache")
> > >
> > > Signed-off-by: Xiao Wang
> > > ---
> > >  lib/librte_mempool/rte_mempool.h | 7 +------
> > >  1 file changed, 1 insertion(+), 6 deletions(-)
> > >
> > > diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> > > index 8053f7a04..975da8d22 100644
> > > --- a/lib/librte_mempool/rte_mempool.h
> > > +++ b/lib/librte_mempool/rte_mempool.h
> > > @@ -1344,15 +1344,11 @@ __mempool_generic_get(struct rte_mempool *mp, void **obj_table,
> > >  		      unsigned int n, struct rte_mempool_cache *cache)
> > >  {
> > >  	int ret;
> > > -	uint32_t index, len;
> > > -	void **cache_objs;
> > >
> > >  	/* No cache provided or cannot be satisfied from cache */
> > >  	if (unlikely(cache == NULL || n >= cache->size))
> > >  		goto ring_dequeue;
> > >
> > > -	cache_objs = cache->objs;
> > > -
> > >  	/* Can this be satisfied from the cache? */
> > >  	if (cache->len < n) {
> > >  		/* No. Backfill the cache first, and then fill from it */
> > > @@ -1375,8 +1371,7 @@ __mempool_generic_get(struct rte_mempool *mp, void **obj_table,
> > >  	}
> > >
> > >  	/* Now fill in the response ... */
> > > -	for (index = 0, len = cache->len - 1; index < n; ++index, len--, obj_table++)
> > > -		*obj_table = cache_objs[len];
> > > +	rte_memcpy(obj_table, &cache->objs[cache->len - n], sizeof(void *) * n);
> > >
> > >  	cache->len -= n;
> >
> > I think the idea of the loop above is to get objects in reverse order
> > so as to reuse the cache top objects (put last) first. It should
> > improve the cache hit rate, etc. So the performance effect of the
> > patch could be very different on various CPUs (with different cache
> > sizes) and various workloads.
> >
> > So, I doubt that it is a step in the right direction.
>
> For reference, this was already discussed 3 years ago:
>
> https://mails.dpdk.org/archives/dev/2016-May/039873.html
> https://mails.dpdk.org/archives/dev/2016-June/040029.html
>
> I'm still not convinced that reversing object addresses (as it's done
> today) is really important. But Andrew is probably right: the impact of
> this kind of patch probably varies depending on many factors. More
> performance numbers on real-life use-cases would help to decide what to
> do.
>
> Regards,
> Olivier

I agree, and thanks for the reference links. So theoretically neither way
can be a definite best choice; it depends on various real-life factors.
I'm thinking about how to make app developers aware of this so that they
can make the choice themselves. Or is it not worth doing, given the small
perf gain?

BRs,
Xiao
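
P.S. To make the ordering difference concrete, below is a minimal
standalone sketch (illustrative only, not part of the patch). Plain
memcpy and toy integer "objects" stand in for rte_memcpy and real
mempool objects so it builds without DPDK. Both strategies hand back the
same N pointers, but the existing loop places the most recently cached
(hottest) object at obj_table[0], while the bulk copy leaves it at
obj_table[N - 1].

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define CACHE_LEN 8	/* toy cache depth, hypothetical value */
#define N 4		/* number of objects requested by the "get" */

int main(void)
{
	void *cache_objs[CACHE_LEN];
	void *table_loop[N], *table_bulk[N];
	void **obj_table;
	unsigned int index, len;

	/* Toy cache: slot CACHE_LEN - 1 holds the most recently put
	 * (presumably cache-hottest) object. */
	for (index = 0; index < CACHE_LEN; index++)
		cache_objs[index] = (void *)(uintptr_t)(index + 1);

	/* Current code: walk the cache tail backwards, so the hottest
	 * object lands in obj_table[0]. */
	obj_table = table_loop;
	for (index = 0, len = CACHE_LEN - 1; index < N; ++index, len--, obj_table++)
		*obj_table = cache_objs[len];

	/* Patched code: one bulk copy of the last N slots, preserving the
	 * cache order, so the hottest object lands in obj_table[N - 1]. */
	memcpy(table_bulk, &cache_objs[CACHE_LEN - N], sizeof(void *) * N);

	for (index = 0; index < N; index++)
		printf("slot %u: loop=%p bulk=%p\n", index,
		       table_loop[index], table_bulk[index]);
	return 0;
}

Running it prints the two result tables side by side: the same set of
pointers, in opposite order.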