From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 19 Apr 2018 21:25:18 +0530
From: Pavan Nikhilesh
To: Bruce Richardson, Ferruh Yigit, thomas@monjalon.net, jerin.jacob@caviumnetworks.com, techboard@dpdk.org
Cc: dev@dpdk.org
Message-ID: <20180419155517.GA12194@ltp-pvn>
In-Reply-To: <20180419153723.GB41580@bricha3-MOBL.ger.corp.intel.com>
Subject: Re: [dpdk-dev] [PATCH 1/2] eal: add macro to mark variable mostly read only
List-Id: DPDK patches and discussions

On Thu, Apr 19, 2018 at 04:37:23PM +0100, Bruce Richardson wrote:
> On Thu, Apr 19, 2018 at 08:48:33PM +0530, Pavan Nikhilesh wrote:
> > On Thu, Apr 19, 2018 at 01:09:58PM +0100, Bruce Richardson wrote:
> > > On Thu, Apr 19, 2018 at 02:50:52PM +0530, Pavan Nikhilesh wrote:
> > > > On Wed, Apr 18, 2018 at 07:03:06PM +0100, Ferruh Yigit wrote:
> > > > > On 4/18/2018 6:55 PM, Pavan Nikhilesh wrote:
> > > > > > On Wed, Apr 18, 2018 at 06:43:11PM +0100, Ferruh Yigit wrote:
> > > > > >> On 4/18/2018 4:30 PM, Pavan Nikhilesh wrote:
> > > > > >>> Add macro to mark a variable to be mostly read only and place it in a
> > > > > >>> separate section.
> > > > > >>>
> > > > > >>> Signed-off-by: Pavan Nikhilesh
> > > > > >>> ---
> > > > > >>>
> > > > > >>> Group together mostly read only data to avoid cacheline bouncing, also
> > > > > >>> useful for auditing purposes.
> > > > > >>>
> > > > > >>>  lib/librte_eal/common/include/rte_common.h | 5 +++++
> > > > > >>>  1 file changed, 5 insertions(+)
> > > > > >>>
> > > > > >>> diff --git a/lib/librte_eal/common/include/rte_common.h b/lib/librte_eal/common/include/rte_common.h
> > > > > >>> index 6c5bc5a76..f2ff2e9e6 100644
> > > > > >>> --- a/lib/librte_eal/common/include/rte_common.h
> > > > > >>> +++ b/lib/librte_eal/common/include/rte_common.h
> > > > > >>> @@ -114,6 +114,11 @@ static void __attribute__((constructor(prio), used)) func(void)
> > > > > >>>   */
> > > > > >>>  #define __rte_noinline  __attribute__((noinline))
> > > > > >>>
> > > > > >>> +/**
> > > > > >>> + * Mark a variable to be mostly read only and place it in a separate section.
> > > > > >>> + */
> > > > > >>> +#define __rte_read_mostly __attribute__((__section__(".read_mostly")))
> > > > > >>
> > > > > >
> > > > > > Hi Ferruh,
> > > > > >
> > > > > >> Hi Pavan,
> > > > > >>
> > > > > >> Is the section ".read_mostly" treated specially [1] or is this just for grouping
> > > > > >> symbols together (to reduce cacheline bouncing)?
> > > > > >
> > > > > > The section .read_mostly is not treated specially it's just for grouping
> > > > > > symbols.
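
For reference, a minimal sketch of how the macro from the quoted patch would
be applied. The macro definition is the one from the patch; every example_*
name is hypothetical and only there for illustration. It assumes GCC or Clang
targeting ELF, since the section attribute is spelled differently on other
object formats.

```c
#include <assert.h>
#include <stdint.h>

/* Same definition as in the quoted patch. */
#define __rte_read_mostly __attribute__((__section__(".read_mostly")))

/* Hypothetical globals: set once at init time, then only read. */
int example_logtype __rte_read_mostly = 5;
uint16_t example_nb_devices __rte_read_mostly = 2;

/* Hypothetical hot counter: untagged, so it stays in the default .bss. */
uint64_t example_rx_packets;

/*
 * Hypothetical fast-path helper: it reads the tagged data and writes only
 * the counter. The tag changes where the variables are placed, not how
 * they are accessed.
 */
uint64_t
example_rx_burst(uint16_t nb_pkts)
{
	assert(example_nb_devices > 0);
	example_rx_packets += nb_pkts;
	return example_rx_packets;
}
```

With such a build, the tagged symbols should show up under a .read_mostly
output section in the linker map file, which is what makes them easy to
compare across builds.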
> > > > >
> > > > > I came across a blog post claiming this does not work:
> > > > >
> > > > > "
> > > > > The problem with the above approach is that once all the __read_mostly variables
> > > > > are grouped into one section, the remaining "non-read-mostly" variables end-up
> > > > > together too. This increases the chances that two frequently used elements (in
> > > > > the "non-read-mostly" region) will end-up competing for the same position (or
> > > > > cache-line, the basic fixed-sized block for memory<-->cache transfers) in the
> > > > > cache. Thus frequent accesses will cause excessive cache thrashing on that
> > > > > particular cache-line thereby degrading the overall system performance.
> > > > > "
> > > > >
> > > > > https://thecodeartist.blogspot.com/2011/12/why-readmostly-does-not-work-as-it.html
> > > > >
> > > >
> > > > The author is concerned about processors with lower cache set-associativity;
> > > > almost all modern processors have >= 16-way set associativity. And the above
> > > > issue can happen even now when two frequently written global variables are
> > > > placed next to each other.
> > > >
> > > > Currently, we don't have much control over how the global variables are
> > > > arranged, and a single addition/deletion to the global variables causes a change
> > > > in alignment and in some cases a minor performance regression.
> > > > By tagging them as __read_mostly we can easily identify the alignment changes
> > > > across builds by comparing the global variable section of the map files.
> > > >
> > > > I have verified the patch-set on arm64 (16-way set-associative) and didn't
> > > > notice any performance regression.
> > > > Did you have a chance to verify if there is any performance regression?
> > > >
> > > Is there a performance improvement? It seems a relatively strange change
> > > to me, so I'd like to know that it really improves performance in test
> > > cases.
> >
> > We had a performance regression of ~200k between 17.11 and 18.02 due to
> > enabling dpaa/dpaa2 in the default config; this was caused by new global
> > variables being added and changing the alignment.
> > Moving read mostly global variables (logtypes/device arrays) to a separate
> > section helps when tracking performance regression between builds.
> >
> So it's of use when debugging, rather than providing a performance boost in
> and of itself, right?
>
> If performance regressions are appearing, should we then see about marking
> globals with __rte_cache_aligned to force them all onto different
> cachelines?

I think that would be a bit of overkill considering the number of logtype
variables.

Currently there are 29 global variables not found in any map file.

List of globals:

  fw_file, mode_8023ad_ports, ecore_mz_count, igb_filter_rss_list,
  crc16_ccitt_pmull, igb_filter_flex_list, rte_crypto_devices,
  igb_flow_list, qman_clk, bman_ccsr_map, dpaa_portal_key, bman_ip_rev,
  cons_filter, rte_rawdevices, internal_config, bman_pool_max,
  qman_version, qman_ccsr_map, rte_event_devices, qman_ip_rev,
  netcfg_interface, igb_filter_ntuple_list, igb_filter_ethertype_list,
  skeldev_init_once, igb_filter_syn_list, crc32_eth_pmull,
  global_portals_used, virtio_hw_internal, vhost_devices

These counts exclude the logtype variables across all drivers.
With logtype variables included there are 70+.
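
To put a rough number on the "overkill", here is a small sketch of the
footprint arithmetic. It rests on assumptions that are not measured values:
a 64-byte cache line, exactly 70 globals, and each global being a plain int
like a logtype (the device arrays listed above are larger). The example_*
names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_CACHE_LINE 64	/* assumed cache line size in bytes */
#define EXAMPLE_NB_GLOBALS 70	/* assumed count, from the "70+" above */

/* An int forced onto its own cache line; the compiler pads it to 64 bytes. */
struct example_aligned_int {
	int value;
} __attribute__((aligned(EXAMPLE_CACHE_LINE)));

static_assert(sizeof(struct example_aligned_int) == EXAMPLE_CACHE_LINE,
	      "an aligned int occupies a full cache line");

/* Bytes used if every global gets its own cache line. */
size_t
example_footprint_aligned(void)
{
	return EXAMPLE_NB_GLOBALS * sizeof(struct example_aligned_int);
}

/* Bytes used if the same globals are packed back to back in one section. */
size_t
example_footprint_packed(void)
{
	return EXAMPLE_NB_GLOBALS * sizeof(int);
}
```

Under these assumptions, aligning each variable costs 70 * 64 = 4480 bytes
spread over 70 cache lines, while the packed layout needs 70 * sizeof(int)
bytes, i.e. a handful of lines when int is 4 bytes.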