From: Radha Chintakuntla
To: Bruce Richardson, fengchengwen
Cc: "Pai G, Sunil", Ilya Maximets, Veerasenareddy Burru, Gagandeep Singh,
 Nipun Gupta, "Stokes, Ian", "Hu, Jiayu", "Ferriter, Cian",
 "Van Haaren, Harry", "Maxime Coquelin (maxime.coquelin@redhat.com)",
 ovs-dev@openvswitch.org, dev@dpdk.org, "Mcnamara, John",
 "O'Driscoll, Tim", "Finn, Emma"
Subject: RE: [EXT] Re: OVS DPDK DMA-Dev library/Design Discussion
Date: Mon, 16 May 2022 22:31:24 +0000
> -----Original Message-----
> From: Bruce Richardson
> Sent: Friday, May 13, 2022 3:34 AM
> To: fengchengwen
> Cc: Pai G, Sunil; Ilya Maximets; Radha Chintakuntla; Veerasenareddy Burru;
> Gagandeep Singh; Nipun Gupta; Stokes, Ian; Hu, Jiayu; Ferriter, Cian;
> Van Haaren, Harry; Maxime Coquelin (maxime.coquelin@redhat.com);
> ovs-dev@openvswitch.org; dev@dpdk.org; Mcnamara, John; O'Driscoll, Tim;
> Finn, Emma
> Subject: [EXT] Re: OVS DPDK DMA-Dev library/Design Discussion
>
> External Email
>
> ----------------------------------------------------------------------
> On Fri, May 13, 2022 at 05:48:35PM +0800, fengchengwen wrote:
> > On 2022/5/13 17:10, Bruce Richardson wrote:
> > > On Fri, May 13, 2022 at 04:52:10PM +0800, fengchengwen wrote:
> > >> On 2022/4/8 14:29, Pai G, Sunil wrote:
> > >>>> -----Original Message-----
> > >>>> From: Richardson, Bruce
> > >>>> Sent: Tuesday, April 5, 2022 5:38 PM
> > >>>> To: Ilya Maximets; Chengwen Feng; Radha Mohan Chintakuntla;
> > >>>> Veerasenareddy Burru; Gagandeep Singh; Nipun Gupta
> > >>>> Cc: Pai G, Sunil; Stokes, Ian; Hu, Jiayu; Ferriter, Cian;
> > >>>> Van Haaren, Harry; Maxime Coquelin (maxime.coquelin@redhat.com);
> > >>>> ovs-dev@openvswitch.org; dev@dpdk.org; Mcnamara, John;
> > >>>> O'Driscoll, Tim; Finn, Emma
> > >>>> Subject: Re: OVS DPDK DMA-Dev library/Design Discussion
> > >>>>
> > >>>> On Tue, Apr 05, 2022 at 01:29:25PM +0200, Ilya Maximets wrote:
> > >>>>> On 3/30/22 16:09, Bruce Richardson wrote:
> > >>>>>> On Wed, Mar 30, 2022 at 01:41:34PM +0200, Ilya Maximets wrote:
> > >>>>>>> On 3/30/22 13:12, Bruce Richardson wrote:
> > >>>>>>>> On Wed, Mar 30, 2022 at 12:52:15PM +0200, Ilya Maximets wrote:
> > >>>>>>>>> On 3/30/22 12:41, Ilya Maximets wrote:
> > >>>>>>>>>> Forking the thread to discuss a memory consistency/ordering model.
> > >>>>>>>>>>
> > >>>>>>>>>> AFAICT, dmadev can be anything from part of a CPU to a
> > >>>>>>>>>> completely separate PCI device. However, I don't see any
> > >>>>>>>>>> memory ordering being enforced or even described in the
> > >>>>>>>>>> dmadev API or documentation.
> > >>>>>>>>>> Please, point me to the correct documentation, if I somehow
> > >>>>>>>>>> missed it.
> > >>>>>>>>>>
> > >>>>>>>>>> We have a DMA device (A) and a CPU core (B) writing,
> > >>>>>>>>>> respectively, the data and the descriptor info. CPU core
> > >>>>>>>>>> (C) is reading the descriptor and the data it points to.
> > >>>>>>>>>>
> > >>>>>>>>>> A few things about that process:
> > >>>>>>>>>>
> > >>>>>>>>>> 1. There is no memory barrier between writes A and B (Did I miss
> > >>>>>>>>>>    them?). Meaning that those operations can be seen by C in a
> > >>>>>>>>>>    different order regardless of barriers issued by C and
> > >>>>>>>>>>    regardless of the nature of devices A and B.
> > >>>>>>>>>>
> > >>>>>>>>>> 2. Even if there is a write barrier between A and B, there is
> > >>>>>>>>>>    no guarantee that C will see these writes in the same order,
> > >>>>>>>>>>    as C doesn't use real memory barriers because vhost
> > >>>>>>>>>>    advertises
> > >>>>>>>>>
> > >>>>>>>>> s/advertises/does not advertise/
> > >>>>>>>>>
> > >>>>>>>>>> VIRTIO_F_ORDER_PLATFORM.
> > >>>>>>>>>>
> > >>>>>>>>>> So, I'm getting to the conclusion that there is a missing write
> > >>>>>>>>>> barrier on the vhost side and vhost itself must not
> > >>>>>>>>>> advertise the
> > >>>>>>>>>
> > >>>>>>>>> s/must not/must/
> > >>>>>>>>>
> > >>>>>>>>> Sorry, I wrote things backwards. :)
> > >>>>>>>>>
> > >>>>>>>>>> VIRTIO_F_ORDER_PLATFORM, so the virtio driver can use
> > >>>>>>>>>> actual memory barriers.
> > >>>>>>>>>>
> > >>>>>>>>>> Would like to hear some thoughts on that topic. Is it a real issue?
> > >>>>>>>>>> Is it an issue considering all possible CPU architectures
> > >>>>>>>>>> and DMA HW variants?
> > >>>>>>>>>>
> > >>>>>>>>
> > >>>>>>>> In terms of ordering of operations using dmadev:
> > >>>>>>>>
> > >>>>>>>> * Some DMA HW will perform all operations strictly in order, e.g.
> > >>>>>>>>   Intel IOAT, while other hardware may not guarantee the order of
> > >>>>>>>>   operations and may do things in parallel, e.g. Intel DSA.
> > >>>>>>>>   Therefore the dmadev API provides a fence operation which allows
> > >>>>>>>>   the order to be enforced. The fence can be thought of as a full
> > >>>>>>>>   memory barrier, meaning no jobs after the barrier can be started
> > >>>>>>>>   until all those before it have completed. Obviously, for HW where
> > >>>>>>>>   order is always enforced, this will be a no-op, but for hardware
> > >>>>>>>>   that parallelizes, we want to reduce the fences to get best
> > >>>>>>>>   performance.
> > >>>>>>>>
> > >>>>>>>> * For synchronization between DMA devices and CPUs, where a CPU can
> > >>>>>>>>   only write after a DMA copy has been done, the CPU must wait for
> > >>>>>>>>   the DMA completion to guarantee ordering. Once the completion has
> > >>>>>>>>   been returned, the completed operation is globally visible to all
> > >>>>>>>>   cores.
> > >>>>>>>
> > >>>>>>> Thanks for the explanation! Some questions though:
> > >>>>>>>
> > >>>>>>> In our case one CPU waits for completion and another CPU is
> > >>>>>>> actually using the data. IOW, "CPU must wait" is a bit ambiguous.
> > >>>>>>> Which CPU must wait?
> > >>>>>>>
> > >>>>>>> Or should it be "Once the completion is visible on any core,
> > >>>>>>> the completed operation is globally visible to all cores."?
> > >>>>>>>
> > >>>>>>
> > >>>>>> The latter.
> > >>>>>> Once the change to memory/cache is visible to any core, it is
> > >>>>>> visible to all of them. This applies to regular CPU memory writes
> > >>>>>> too - at least on IA, and I expect on many other architectures -
> > >>>>>> once the write is visible outside the current core it is visible to
> > >>>>>> every other core. Once the data hits the L1 or L2 cache of any core,
> > >>>>>> any subsequent request for that data from any other core will "snoop"
> > >>>>>> the latest data from that core's cache, even if it has not made its
> > >>>>>> way down to a shared cache, e.g. L3 on most IA systems.
> > >>>>>
> > >>>>> It sounds like you're referring to the "multicopy atomicity" of
> > >>>>> the architecture. However, that is not a universally supported thing.
> > >>>>> AFAICT, POWER and older ARM systems don't support it, so writes
> > >>>>> performed by one core are not necessarily available to all other
> > >>>>> cores at the same time. That means that if CPU0 writes the data and
> > >>>>> the completion flag, and CPU1 reads the completion flag and writes
> > >>>>> the ring, CPU2 may see the ring write but still not see the write of
> > >>>>> the data, even though there was a control dependency on CPU1.
> > >>>>> There should be a full memory barrier on CPU1 in order to fulfill
> > >>>>> the memory ordering requirements for CPU2, IIUC.
> > >>>>>
> > >>>>> In our scenario CPU0 is a DMA device, which may or may not be part
> > >>>>> of a CPU and may have different memory consistency/ordering
> > >>>>> requirements. So, the question is: does the DPDK DMA API guarantee
> > >>>>> multicopy atomicity between the DMA device and all CPU cores,
> > >>>>> regardless of CPU architecture and the nature of the DMA device?
> > >>>>>
> > >>>>
> > >>>> Right now, it doesn't, because this never came up in discussion.
> > >>>> In order to be useful, it sounds like it explicitly should do so.
> > >>>> At least for the Intel ioat and idxd driver cases, this will be
> > >>>> supported, so we just need to ensure all other drivers currently
> > >>>> upstreamed can offer this too. If they cannot, we cannot offer it
> > >>>> as a global guarantee, and we should see about adding a capability
> > >>>> flag for this to indicate when the guarantee is there or not.
> > >>>>
> > >>>> Maintainers of dma/cnxk, dma/dpaa and dma/hisilicon - are we OK
> > >>>> to document for dmadev that once a DMA operation is completed,
> > >>>> the op is guaranteed visible to all cores/threads? If not, any
> > >>>> thoughts on what guarantees we can provide in this regard, or
> > >>>> what capabilities should be exposed?
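
To make the pattern under discussion concrete, below is a minimal sketch using
the public dmadev calls (rte_dma_copy(), rte_dma_submit(), rte_dma_completed())
with RTE_DMA_OP_FLAG_FENCE between the payload and descriptor copies. The
release store used to signal the consuming core, and all names/parameters, are
illustrative assumptions about the application side, not something the thread
or the dmadev API prescribes:

    #include <stdbool.h>
    #include <stdint.h>
    #include <rte_dmadev.h>

    /* Illustrative only: enqueue a payload copy and then a descriptor copy
     * with a fence between them, wait for completion, then signal a consumer
     * core (error handling omitted for brevity). */
    static void
    copy_and_publish(int16_t dev_id, uint16_t vchan,
                     rte_iova_t data_src, rte_iova_t data_dst, uint32_t data_len,
                     rte_iova_t desc_src, rte_iova_t desc_dst, uint32_t desc_len,
                     uint32_t *avail_flag)
    {
            uint16_t idx;
            bool error;

            /* Payload first, then descriptor; the fence keeps HW that executes
             * jobs in parallel (e.g. DSA) from reordering the two. */
            rte_dma_copy(dev_id, vchan, data_src, data_dst, data_len, 0);
            rte_dma_copy(dev_id, vchan, desc_src, desc_dst, desc_len,
                         RTE_DMA_OP_FLAG_FENCE);
            rte_dma_submit(dev_id, vchan);

            /* Busy-wait for the completion.  The open question in the thread
             * is whether, once this returns, the copied data is also visible
             * to a third core that never polled, on every architecture and
             * DMA device. */
            while (rte_dma_completed(dev_id, vchan, 1, &idx, &error) == 0)
                    ;

            /* Tell the consuming core the descriptor is ready (illustrative). */
            __atomic_store_n(avail_flag, 1, __ATOMIC_RELEASE);
    }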
> > >>>
> > >>>
> > >>> Hi @Chengwen Feng, @Radha Mohan Chintakuntla, @Veerasenareddy Burru,
> > >>> @Gagandeep Singh, @Nipun Gupta, requesting your valuable opinions on
> > >>> the queries in this thread.
> > >>
> > >> Sorry for the late reply; I didn't follow this thread.
> > >>
> > >> I don't think the DMA API should provide such a guarantee, because:
> > >> 1. DMA is an acceleration device, in the same way as an
> > >>    encryption/decryption device or a network device.
> > >> 2. For the Hisilicon Kunpeng platform:
> > >>    The DMA device supports:
> > >>      a) IO coherency: it can read the latest data, which may still sit
> > >>         in a cache, and on writes it will invalidate the cached data and
> > >>         write the data to DDR.
> > >>      b) Ordering within one request: it only writes the completion
> > >>         descriptor after the copy is done.
> > >>         Note: ordering between multiple requests can be implemented
> > >>         through the fence mechanism.
> > >>    The DMA driver only needs to:
> > >>      a) add one write memory barrier (a lightweight mb) when ringing the
> > >>         doorbell.
> > >>    So once the DMA is completed, the operation is guaranteed visible to
> > >>    all cores, and the 3rd core will observe the right order: core B
> > >>    prepares the data and issues the request to the DMA, the DMA starts
> > >>    work, core B gets the completion status.
> > >> 3. I worked on a TI multi-core SoC many years ago; the SoC didn't
> > >>    support cache coherence or consistency between cores. The SoC also
> > >>    had a DMA device with many channels. Here is a hypothetical design of
> > >>    its DMA driver within the DPDK DMA framework:
> > >>    The DMA driver should:
> > >>      a) write back the DMA's src buffer, so no dirty cache data remains
> > >>         while the DMA is running,
> > >>      b) invalidate the DMA's dst buffer,
> > >>      c) issue a full mb,
> > >>      d) update the DMA's registers.
> > >>    The DMA will then execute the copy task: it copies from DDR and
> > >>    writes to DDR, and after the copy it updates its status register to
> > >>    "completed". In this case, the 3rd core will also observe the right
> > >>    order. A particular point here: if one buffer is shared across
> > >>    multiple cores, the application should explicitly maintain the cache.
> > >>
> > >> Based on the above, I don't think the DMA API should explicitly add
> > >> this description; it's the driver's, and even the application's (e.g.
> > >> the TI SoC above), duty to make sure of it.
> > >>
> > > Hi,
> > >
> > > thanks for that. So if I understand correctly, your current HW does
> > > provide this guarantee, but you don't think it should always be the
> > > case for dmadev, correct?
> >
> > Yes, our HW will provide the guarantee.
> > If some HW cannot provide it, it's the driver's, and maybe the
> > application's, duty to provide it.
> >
> > >
> > > Based on that, what do you think should be the guarantee on completion?
> > > Once a job is completed, is the completion visible to the submitting
> > > core, or the core reading the completion? Do you think it's
> > > acceptable to add a
> >
> > It will be visible to both cores.
> >
> > > capability flag for drivers to indicate that they do support a
> > > "globally visible" guarantee?
> >
> > I think the driver (together with the HW) should support the "globally
> > visible" guarantee.
> > And for some HW, even the application (or middleware) should care about it.
> >
>
> From a dmadev API viewpoint, whether the driver handles it or the HW itself
> does not matter. However, if the application needs to take special actions
> to guarantee visibility, then that needs to be flagged as part of the dmadev
> API.
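
To make the "special actions" case concrete, here is a rough sketch of the
submit path for the hypothetical non-cache-coherent SoC driver described
above. The cache-maintenance helpers soc_cache_writeback()/soc_cache_invalidate()
and the doorbell layout are stand-ins for platform-specific code and are not
DPDK APIs; only rte_mb() and rte_write32() are real DPDK calls:

    #include <stddef.h>
    #include <stdint.h>
    #include <rte_atomic.h>
    #include <rte_io.h>

    /* Platform-specific stand-ins a real port for such an SoC would provide. */
    void soc_cache_writeback(const void *addr, size_t len); /* clean lines to DDR */
    void soc_cache_invalidate(void *addr, size_t len);      /* drop stale lines   */

    static void
    noncoherent_dma_submit(volatile void *doorbell_reg, uint32_t tail,
                           const void *src, void *dst, size_t len)
    {
            /* a) write back the source so the engine reads current data from DDR */
            soc_cache_writeback(src, len);
            /* b) invalidate the destination so the CPU won't later read stale lines */
            soc_cache_invalidate(dst, len);
            /* c) full barrier so the maintenance above is ordered before the doorbell */
            rte_mb();
            /* d) ring the doorbell (rte_write32() itself orders the MMIO store) */
            rte_write32(tail, doorbell_reg);
    }

On such a platform, a buffer shared by multiple cores would additionally need
cache maintenance in the application itself, which is the argument above for
not baking the visibility guarantee into the API.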
>
> I see three possibilities:
> 1. Wait until we have a driver that does not have global visibility on
>    return from rte_dma_completed, and at that point add a flag indicating
>    the lack of that support. Until then, document that results of ops will
>    be globally visible.
> 2. Add a flag now to allow drivers to indicate *lack* of global visibility,
>    and document that results are visible unless the flag is set.
> 3. Add a flag now to allow drivers to call out that all results are g.v.,
>    and update drivers to use this flag.
>
> I would be very much in favour of #1, because:
> * YAGNI principle - (subject to confirmation by other maintainers) if we
>   don't have a driver right now that needs non-g.v. behaviour, we may never
>   need one.
> * In the absence of a concrete case where g.v. is not guaranteed, we may
>   struggle to document correctly what the actual guarantees are, especially
>   if the submitter core and the completer core are different.
>
> @Radha Mohan Chintakuntla, @Veerasenareddy Burru, @Gagandeep Singh,
> @Nipun Gupta, as driver maintainers, can you please confirm whether, on
> receipt of a completion from the HW/driver, the operation results are
> visible on all application cores, i.e. the app does not need additional
> barriers to propagate visibility to other cores. Your opinions on this
> discussion would also be useful.

[Radha Chintakuntla] Yes, as of today on our HW the completion is visible on
all cores.

>
> Regards,
> /Bruce
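
For reference, if option 2 or 3 above were adopted, the application-side check
would presumably look something like the sketch below. RTE_DMA_CAPA_GLOBAL_VISIBILITY
is a hypothetical capability bit invented here purely for illustration; no
such flag exists in the dmadev API at the time of this thread:

    #include <stdbool.h>
    #include <stdint.h>
    #include <rte_bitops.h>
    #include <rte_dmadev.h>

    /* Hypothetical capability bit for option 3; not part of the real dmadev API. */
    #define RTE_DMA_CAPA_GLOBAL_VISIBILITY RTE_BIT64(33)

    static bool
    dma_results_globally_visible(int16_t dev_id)
    {
            struct rte_dma_info info;

            if (rte_dma_info_get(dev_id, &info) != 0)
                    return false;
            /* Without the (hypothetical) capability, the application would need
             * its own barriers/cache maintenance before handing data to peers. */
            return (info.dev_capa & RTE_DMA_CAPA_GLOBAL_VISIBILITY) != 0;
    }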