Date: Mon, 10 Oct 2022 10:57:52 +0100
From: Bruce Richardson
To: Mattias Rönnblom
CC: Morten Brørup
Subject: Re: [PATCH] eal: non-temporal memcpy
References: <98CBD80474FA8B44BF855DF32C47DC35D8728A@smartserver.smartshare.dk>
 <20221006203426.78743-1-mb@smartsharesystems.com>
 <98CBD80474FA8B44BF855DF32C47DC35D873BC@smartserver.smartshare.dk>
 <730193b1-9574-ff59-28be-c1449cba0ffc@lysator.liu.se>
In-Reply-To: <730193b1-9574-ff59-28be-c1449cba0ffc@lysator.liu.se>
List-Id: DPDK patches and discussions

On Mon, Oct 10, 2022 at 10:58:57AM +0200, Mattias Rönnblom wrote:
> On 2022-10-10 09:35, Morten Brørup wrote:
> > Mattias, Konstantin, Honnappa, Stephen,
> >
> > In my patch for non-temporal memcpy, I have been aiming for using as much non-temporal store as possible. E.g. copying 16 byte to a 16 byte aligned address will be done using non-temporal store instructions.
> >
> > Now, I am seriously considering this alternative:
> >
> > Only using non-temporal stores for complete cache lines, and using normal stores for partial cache lines.
> >
>
> This is how I've done it in the past, in DPDK applications. That was both to
> simplify (and potentially optimize) the code somewhat, and because I had my
> doubt there was any actual benefits from using non-temporal stores for the
> beginning or the end of the memory block.
>
> That latter reason however, was pure conjecture. I think it would be great
> if Intel, ARM, AMD, IBM etc.
> DPDK developers could dig in the manuals or go
> find the appropriate CPU expert, to find out if that is true.
>
> More specifically, my question is:
>
> A) Consider a scenario where a core does a regular store against some cache
> line, and then pretty much immediately does a non-temporal store against a
> different address in the same cache line. How will this cache line be
> treated?
>
> B) Consider the same scenario, but where no regular stores preceded (or
> followed) the non-temporal store, and the non-temporal stores performed did
> not cover the entirety of the cache line.
>

For Intel CPUs, the best reference I am aware of is section 10.4.6.2 in
Vol 1 of the Software Developer's Manual [1]. The part relevant to your
scenarios above is:

  "If a program specifies a non-temporal store with one of these
  instructions and the memory type of the destination region is write back
  (WB), write through (WT), or write combining (WC), the processor will do
  the following:

  • If the memory location being written to is present in the cache
    hierarchy, the data in the caches is evicted.
  • The non-temporal data is written to memory with WC semantics"

Hope this helps a little.

Regards,
/Bruce

[1] https://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-vol-1-manual.pdf#G11.44032
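
For illustration, here is a minimal sketch of the alternative Morten describes
in the quoted text (non-temporal stores only for complete cache lines, normal
stores for the partial head and tail). It is not the code from the patch. It
assumes x86 with SSE2 and a 64-byte cache line; the function name
nt_copy_full_lines and the CACHE_LINE constant are hypothetical and are not
part of DPDK. On builds without SSE2 it simply falls back to memcpy().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#if defined(__SSE2__)
#include <emmintrin.h>
#endif

#define CACHE_LINE 64 /* assumption: 64-byte cache lines */

/* Hypothetical helper, not a DPDK API. */
static void *nt_copy_full_lines(void *dst, const void *src, size_t len)
{
    uint8_t *d = dst;
    const uint8_t *s = src;

    assert(dst != NULL && src != NULL);

    /* Head: normal stores up to the first cache-line boundary of dst. */
    size_t head = (size_t)(-(uintptr_t)d & (CACHE_LINE - 1));
    if (head > len)
        head = len;
    memcpy(d, s, head);
    d += head;
    s += head;
    len -= head;

#if defined(__SSE2__)
    /* Body: d is now cache-line aligned; stream whole lines only. */
    while (len >= CACHE_LINE) {
        for (size_t i = 0; i < CACHE_LINE; i += 16) {
            /* The source may be unaligned, so use an unaligned load. */
            __m128i v = _mm_loadu_si128((const __m128i *)(s + i));
            _mm_stream_si128((__m128i *)(d + i), v);
        }
        d += CACHE_LINE;
        s += CACHE_LINE;
        len -= CACHE_LINE;
    }
    /* Order the weakly-ordered streaming stores before later stores. */
    _mm_sfence();
#endif

    /* Tail (or everything, without SSE2): normal stores. */
    memcpy(d, s, len);
    return dst;
}
```

With this split, within a single call each destination cache line is written
either entirely by non-temporal stores or entirely by normal stores, so the
copy itself never mixes the two kinds of store in one line.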