Date: Tue, 18 Apr 2023 13:55:26 +0100
From: Bruce Richardson
To: Morten Brørup
Subject: Re: [PATCH] mempool: optimize get objects with constant n
References: <20230411064845.37713-1-mb@smartsharesystems.com> <98CBD80474FA8B44BF855DF32C47DC35D8788D@smartserver.smartshare.dk>
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D8788D@smartserver.smartshare.dk>
List-Id: DPDK patches and discussions

On Tue, Apr 18, 2023 at 01:29:49PM +0200, Morten Brørup wrote:
> > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > Sent: Tuesday, 18 April 2023 13.07
> >
> > On Tue, Apr 11, 2023 at 08:48:45AM +0200, Morten Brørup wrote:
> > > When getting objects from the mempool, the number of objects to get is
> > > often constant at build time.
> > >
> > > This patch adds another code path for this case, so the compiler can
> > > optimize more, e.g. unroll the copy loop when the entire request is
> > > satisfied from the cache.
> > >
> > > On an Intel(R) Xeon(R) E5-2620 v4 CPU, and compiled with gcc 9.4.0,
> > > mempool_perf_test with constant n shows an increase in rate_persec by an
> > > average of 17 %, minimum 9.5 %, maximum 24 %.
> > >
> > > The code path where the number of objects to get is unknown at build time
> > > remains essentially unchanged.
> > >
> > > Signed-off-by: Morten Brørup
> >
> > Change looks a good idea. Some suggestions inline below, which you may want to
> > take on board for any future version. I'd strongly suggest adding some
> > extra clarifying code comments, as I suggest below.
> > With those extra code comments:
> >
> > Acked-by: Bruce Richardson
> >
> > > ---
> > >  lib/mempool/rte_mempool.h | 24 +++++++++++++++++++++---
> > >  1 file changed, 21 insertions(+), 3 deletions(-)
> > >
> > > diff --git a/lib/mempool/rte_mempool.h b/lib/mempool/rte_mempool.h
> > > index 9f530db24b..ade0100ec7 100644
> > > --- a/lib/mempool/rte_mempool.h
> > > +++ b/lib/mempool/rte_mempool.h
> > > @@ -1500,15 +1500,33 @@ rte_mempool_do_generic_get(struct rte_mempool *mp, void **obj_table,
> > >      if (unlikely(cache == NULL))
> > >          goto driver_dequeue;
> > >
> > > -    /* Use the cache as much as we have to return hot objects first */
> > > -    len = RTE_MIN(remaining, cache->len);
> > >      cache_objs = &cache->objs[cache->len];
> > > +
> > > +    if (__extension__(__builtin_constant_p(n)) && n <= cache->len) {
> > > +        /*
> > > +         * The request size is known at build time, and
> > > +         * the entire request can be satisfied from the cache,
> > > +         * so let the compiler unroll the fixed length copy loop.
> > > +         */
> > > +        cache->len -= n;
> > > +        for (index = 0; index < n; index++)
> > > +            *obj_table++ = *--cache_objs;
> > > +
> >
> > This loop looks a little awkward to me. Would it be clearer (and perhaps
> > easier for compilers to unroll efficiently) if it was rewritten as:
> >
> >     cache->len -= n;
> >     cache_objs = &cache->objs[cache->len];
> >     for (index = 0; index < n; index++)
> >         obj_table[index] = cache_objs[index];
>
> The mempool cache is a stack, so the copy loop needs to get the objects in
> decrementing order. I.e. the source index decrements and the destination
> index increments.
>

BTW: Please add this as a comment in the code too, above the loop, to avoid
future developers (or even future me) asking this question again!

/Bruce
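
For anyone reading this thread later, the point about stack order can be
sketched in isolation. The code below is only an illustration under stated
assumptions: struct toy_cache, toy_get_lifo() and toy_get_forward() are
hypothetical names made up for this example, not part of the DPDK API, and
the toy cache is a simplified stand-in for the real mempool cache. The first
function copies the way the patch does (source decrements, destination
increments); the second copies the way the suggested rewrite does (both
indexes increment).

```c
#include <assert.h>

#define TOY_CACHE_SIZE 8

/* Hypothetical stand-in for a mempool cache: a LIFO stack of pointers. */
struct toy_cache {
    unsigned int len;            /* number of valid entries in objs[] */
    void *objs[TOY_CACHE_SIZE];  /* objs[len - 1] is the most recently put */
};

/* Copy as in the patch: source pointer decrements, destination increments. */
static inline void
toy_get_lifo(struct toy_cache *cache, void **obj_table, unsigned int n)
{
    void **cache_objs = &cache->objs[cache->len];
    unsigned int index;

    assert(n <= cache->len);
    cache->len -= n;
    for (index = 0; index < n; index++)
        *obj_table++ = *--cache_objs;
}

/* Copy as in the suggested rewrite: source and destination both increment. */
static inline void
toy_get_forward(struct toy_cache *cache, void **obj_table, unsigned int n)
{
    void **cache_objs;
    unsigned int index;

    assert(n <= cache->len);
    cache->len -= n;
    cache_objs = &cache->objs[cache->len];
    for (index = 0; index < n; index++)
        obj_table[index] = cache_objs[index];
}
```

With four objects a, b, c, d put into the toy cache in that order, getting
two objects with toy_get_lifo() yields d then c, i.e. most recently put
first. toy_get_forward() removes the same two objects but yields c then d,
so the caller no longer sees them in stack order.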