From: "Pai G, Sunil"
To: "Richardson, Bruce", Ilya Maximets, Chengwen Feng, "Radha Mohan Chintakuntla", Veerasenareddy Burru, Gagandeep Singh, Nipun Gupta
CC: "Stokes, Ian", "Hu, Jiayu", "Ferriter, Cian", "Van Haaren, Harry", "Maxime Coquelin (maxime.coquelin@redhat.com)", "ovs-dev@openvswitch.org", "dev@dpdk.org", "Mcnamara, John", "O'Driscoll, Tim", "Finn, Emma"
Subject: RE: OVS DPDK DMA-Dev library/Design Discussion
Date: Fri, 8 Apr 2022 06:29:11 +0000
References: <22e3ff73-f3d9-abae-1866-90d133af5528@ovn.org> <0633e31c-68fc-618c-e4f8-78a74662078c@ovn.org>
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Richardson, Bruce
> Sent: Tuesday, April 5, 2022 5:38 PM
> To: Ilya Maximets; Chengwen Feng; Radha Mohan Chintakuntla; Veerasenareddy Burru; Gagandeep Singh; Nipun Gupta
> Cc: Pai G, Sunil; Stokes, Ian; Hu, Jiayu; Ferriter, Cian; Van Haaren, Harry; Maxime Coquelin (maxime.coquelin@redhat.com); ovs-dev@openvswitch.org; dev@dpdk.org; Mcnamara, John; O'Driscoll, Tim; Finn, Emma
> Subject: Re: OVS DPDK DMA-Dev library/Design Discussion
>
> On Tue, Apr 05, 2022 at 01:29:25PM +0200, Ilya Maximets wrote:
> > On 3/30/22 16:09, Bruce Richardson wrote:
> > > On Wed, Mar 30, 2022 at 01:41:34PM +0200, Ilya Maximets wrote:
> > >> On 3/30/22 13:12, Bruce Richardson wrote:
> > >>> On Wed, Mar 30, 2022 at 12:52:15PM +0200, Ilya Maximets wrote:
> > >>>> On 3/30/22 12:41, Ilya Maximets wrote:
> > >>>>> Forking the thread to discuss a memory consistency/ordering model.
> > >>>>>
> > >>>>> AFAICT, dmadev can be anything from part of a CPU to a
> > >>>>> completely separate PCI device. However, I don't see any memory
> > >>>>> ordering being enforced or even described in the dmadev API or
> > >>>>> documentation.
> > >>>>> Please, point me to the correct documentation, if I somehow missed it.
> > >>>>>
> > >>>>> We have a DMA device (A) and a CPU core (B) writing respectively
> > >>>>> the data and the descriptor info. CPU core (C) is reading the
> > >>>>> descriptor and the data it points to.
> > >>>>>
> > >>>>> A few things about that process:
> > >>>>>
> > >>>>> 1. There is no memory barrier between writes A and B (Did I miss
> > >>>>>    them?). Meaning that those operations can be seen by C in a
> > >>>>>    different order regardless of barriers issued by C and
> > >>>>>    regardless of the nature of devices A and B.
> > >>>>>
> > >>>>> 2. Even if there is a write barrier between A and B, there is
> > >>>>>    no guarantee that C will see these writes in the same order
> > >>>>>    as C doesn't use real memory barriers because vhost
> > >>>>>    advertises
> > >>>>
> > >>>> s/advertises/does not advertise/
> > >>>>
> > >>>>>    VIRTIO_F_ORDER_PLATFORM.
> > >>>>>
> > >>>>> So, I'm getting to the conclusion that there is a missing write
> > >>>>> barrier on the vhost side and vhost itself must not advertise
> > >>>>> the
> > >>>>
> > >>>> s/must not/must/
> > >>>>
> > >>>> Sorry, I wrote things backwards. :)
> > >>>>
> > >>>>> VIRTIO_F_ORDER_PLATFORM, so the virtio driver can use actual
> > >>>>> memory barriers.
> > >>>>>
> > >>>>> Would like to hear some thoughts on that topic. Is it a real issue?
> > >>>>> Is it an issue considering all possible CPU architectures and
> > >>>>> DMA HW variants?
> > >>>>>
> > >>>
> > >>> In terms of ordering of operations using dmadev:
> > >>>
> > >>> * Some DMA HW will perform all operations strictly in order e.g. Intel
> > >>>   IOAT, while other hardware may not guarantee order of operations/do
> > >>>   things in parallel e.g. Intel DSA. Therefore the dmadev API provides the
> > >>>   fence operation which allows the order to be enforced. The fence can be
> > >>>   thought of as a full memory barrier, meaning no jobs after the barrier can
> > >>>   be started until all those before it have completed.
> > >>>   Obviously, for HW where order is always enforced, this will be a
> > >>>   no-op, but for hardware that parallelizes, we want to reduce the
> > >>>   fences to get best performance.
> > >>>
> > >>> * For synchronization between DMA devices and CPUs, where a CPU can only
> > >>>   write after a DMA copy has been done, the CPU must wait for the DMA
> > >>>   completion to guarantee ordering. Once the completion has been returned
> > >>>   the completed operation is globally visible to all cores.
> > >>
> > >> Thanks for the explanation! Some questions though:
> > >>
> > >> In our case one CPU waits for completion and another CPU is
> > >> actually using the data. IOW, "CPU must wait" is a bit ambiguous.
> > >> Which CPU must wait?
> > >>
> > >> Or should it be "Once the completion is visible on any core, the
> > >> completed operation is globally visible to all cores." ?
> > >>
> > >
> > > The latter.
> > > Once the change to memory/cache is visible to any core, it is
> > > visible to all of them. This applies to regular CPU memory writes too -
> > > at least on IA, and I expect on many other architectures - once the
> > > write is visible outside the current core it is visible to every
> > > other core. Once the data hits the L1 or L2 cache of any core, any
> > > subsequent requests for that data from any other core will "snoop"
> > > the latest data from the core's cache, even if it has not made its
> > > way down to a shared cache, e.g. L3 on most IA systems.
> >
> > It sounds like you're referring to the "multicopy atomicity" of the
> > architecture. However, that is not a universally supported thing.
> > AFAICT, POWER and older ARM systems don't support it, so writes
> > performed by one core are not necessarily available to all other cores
> > at the same time.
> > That means that if the CPU0 writes the data and the
> > completion flag, CPU1 reads the completion flag and writes the ring,
> > CPU2 may see the ring write, but may still not see the write of the
> > data, even though there was a control dependency on CPU1.
> > There should be a full memory barrier on CPU1 in order to fulfill the
> > memory ordering requirements for CPU2, IIUC.
> >
> > In our scenario the CPU0 is a DMA device, which may or may not be part
> > of a CPU and may have different memory consistency/ordering
> > requirements. So, the question is: does the DPDK DMA API guarantee
> > multicopy atomicity between the DMA device and all CPU cores regardless
> > of CPU architecture and the nature of the DMA device?
> >
>
> Right now, it doesn't because this never came up in discussion. In order
> to be useful, it sounds like it explicitly should do so. At least for the
> Intel ioat and idxd driver cases, this will be supported, so we just need
> to ensure all other drivers currently upstreamed can offer this too. If
> they cannot, we cannot offer it as a global guarantee, and we should see
> about adding a capability flag for this to indicate when the guarantee is
> there or not.
>
> Maintainers of dma/cnxk, dma/dpaa and dma/hisilicon - are we ok to
> document for dmadev that once a DMA operation is completed, the op is
> guaranteed visible to all cores/threads? If not, any thoughts on what
> guarantees we can provide in this regard, or what capabilities should be
> exposed?

Hi @Chengwen Feng, @Radha Mohan Chintakuntla, @Veerasenareddy Burru, @Gagandeep Singh, @Nipun Gupta,

Requesting your valuable opinions on the queries in this thread.

>
> /Bruce
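
For illustration only, the three-party hand-off discussed in this thread (a writer producing data plus a completion flag, CPU1 seeing the completion and writing the ring, CPU2 seeing the ring and reading the data) can be sketched in portable C11. This is a CPU-only analogy and not dmadev or vhost code: the "DMA device" is modelled as an ordinary thread, and all names (payload, completion, ring_idx, run_handoff) are hypothetical. It assumes every party takes part in C11 release/acquire synchronization, which is exactly what a real DMA engine may or may not provide, so it says nothing about what any particular driver guarantees.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical stand-ins for the objects discussed in the thread. */
static int payload;            /* data written by the "DMA" thread (CPU0)  */
static atomic_int completion;  /* completion flag written by the "DMA"     */
static atomic_int ring_idx;    /* ring/descriptor update written by CPU1   */
static int observed;           /* value CPU2 read from payload             */

/* CPU0 / "DMA device": write the data, then publish the completion. */
static void *dma_thread(void *arg)
{
    (void)arg;
    payload = 42;
    atomic_store_explicit(&completion, 1, memory_order_release);
    return NULL;
}

/* CPU1: wait for the completion, then publish the ring update.
 * The acquire load followed by the release store is what orders the
 * payload write before the ring write for any later observer. */
static void *cpu1_thread(void *arg)
{
    (void)arg;
    while (atomic_load_explicit(&completion, memory_order_acquire) == 0)
        ;
    atomic_store_explicit(&ring_idx, 1, memory_order_release);
    return NULL;
}

/* CPU2: wait for the ring update, then read the data it refers to. */
static void *cpu2_thread(void *arg)
{
    (void)arg;
    while (atomic_load_explicit(&ring_idx, memory_order_acquire) == 0)
        ;
    observed = payload;
    return NULL;
}

/* Run one hand-off and return what CPU2 observed. */
int run_handoff(void)
{
    pthread_t t0, t1, t2;

    payload = 0;
    observed = 0;
    atomic_store(&completion, 0);
    atomic_store(&ring_idx, 0);

    pthread_create(&t2, NULL, cpu2_thread, NULL);
    pthread_create(&t1, NULL, cpu1_thread, NULL);
    pthread_create(&t0, NULL, dma_thread, NULL);

    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return observed;
}
```

Built with -pthread, run_handoff() returns 42, because in the C11 model the release/acquire pairs chain into a single happens-before order from the payload write to CPU2's read. If CPU1 used relaxed accesses instead (only a control dependency, as in the scenario described above), the C11 model would no longer give CPU2 that guarantee.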