Date: Wed, 30 Mar 2022 13:41:34 +0200
From: Ilya Maximets
To: Bruce Richardson
Cc: i.maximets@ovn.org, "Pai G, Sunil", "Stokes, Ian", "Hu, Jiayu",
 "Ferriter, Cian", "Van Haaren, Harry",
 "Maxime Coquelin (maxime.coquelin@redhat.com)", ovs-dev@openvswitch.org,
 dev@dpdk.org, "Mcnamara, John", "O'Driscoll, Tim", "Finn, Emma"
Subject: Re: OVS DPDK DMA-Dev library/Design Discussion

On 3/30/22 13:12, Bruce Richardson wrote:
> On Wed, Mar 30, 2022 at 12:52:15PM +0200, Ilya Maximets wrote:
>> On 3/30/22 12:41, Ilya Maximets wrote:
>>> Forking the thread to discuss a memory consistency/ordering model.
>>>
>>> AFAICT, dmadev can be anything from part of a CPU to a completely
>>> separate PCI device.  However, I don't see any memory ordering being
>>> enforced or even described in the dmadev API or documentation.
>>> Please, point me to the correct documentation, if I somehow missed it.
>>>
>>> We have a DMA device (A) and a CPU core (B) writing respectively
>>> the data and the descriptor info.  CPU core (C) is reading the
>>> descriptor and the data it points to.
>>>
>>> A few things about that process:
>>>
>>> 1. There is no memory barrier between writes A and B (Did I miss
>>>    them?).  Meaning that those operations can be seen by C in a
>>>    different order regardless of barriers issued by C and regardless
>>>    of the nature of devices A and B.
>>>
>>> 2. Even if there is a write barrier between A and B, there is
>>>    no guarantee that C will see these writes in the same order
>>>    as C doesn't use real memory barriers because vhost advertises
>>
>> s/advertises/does not advertise/
>>
>>> VIRTIO_F_ORDER_PLATFORM.
>>>
>>> So, I'm getting to the conclusion that there is a missing write barrier
>>> on the vhost side and vhost itself must not advertise the
>>
>> s/must not/must/
>>
>> Sorry, I wrote things backwards. :)
>>
>>> VIRTIO_F_ORDER_PLATFORM, so the virtio driver can use actual memory
>>> barriers.
>>>
>>> Would like to hear some thoughts on that topic.  Is it a real issue?
>>> Is it an issue considering all possible CPU architectures and DMA
>>> HW variants?
>>>
>
> In terms of ordering of operations using dmadev:
>
> * Some DMA HW will perform all operations strictly in order, e.g. Intel
>   IOAT, while other hardware may not guarantee the order of operations
>   and may do things in parallel, e.g. Intel DSA.
>   Therefore, the dmadev API provides the
>   fence operation, which allows the order to be enforced.  The fence can be
>   thought of as a full memory barrier, meaning that no jobs after the barrier
>   can be started until all those before it have completed.  Obviously, for HW
>   where order is always enforced, this will be a no-op, but for hardware that
>   parallelizes, we want to reduce the fences to get the best performance.
>
> * For synchronization between DMA devices and CPUs, where a CPU can only
>   write after a DMA copy has been done, the CPU must wait for the DMA
>   completion to guarantee ordering.  Once the completion has been returned,
>   the completed operation is globally visible to all cores.

Thanks for the explanation!  Some questions, though:

In our case, one CPU waits for the completion and another CPU is actually
using the data.  IOW, "CPU must wait" is a bit ambiguous.  Which CPU must
wait?

Or should it be "Once the completion is visible on any core, the
completed operation is globally visible to all cores."?

And the main question:
  Are these synchronization claims documented somewhere?

Best regards, Ilya Maximets.
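
P.S. For reference, below is a minimal, untested sketch of the sequence under
discussion, using the public dmadev calls rte_dma_copy() and
rte_dma_completed().  The device/vchan ids, the buffer addresses and the
desc_flags publication step are hypothetical placeholders rather than actual
vhost code, and the rte_smp_wmb() before publishing the descriptor is the
write barrier whose necessity is being questioned above.

/*
 * Minimal sketch (not tested): core B offloads a copy to a dmadev and
 * only then publishes the descriptor that core C polls.  The device id,
 * vchan, addresses and the desc_flags publication are hypothetical
 * placeholders for whatever vhost actually does.
 */
#include <stdbool.h>
#include <rte_dmadev.h>
#include <rte_atomic.h>

static void
copy_and_publish(int16_t dma_dev_id, uint16_t vchan,
                 rte_iova_t src, rte_iova_t dst, uint32_t len,
                 volatile uint16_t *desc_flags)
{
    uint16_t last_idx;
    bool has_error = false;

    /* Enqueue and submit the copy.  The fence flag makes this job start
     * only after all previously enqueued jobs have completed, which
     * matters on HW that otherwise processes jobs in parallel. */
    if (rte_dma_copy(dma_dev_id, vchan, src, dst, len,
                     RTE_DMA_OP_FLAG_FENCE | RTE_DMA_OP_FLAG_SUBMIT) < 0)
        return;  /* Real code would fall back to a CPU copy here. */

    /* Core B (the core driving the dmadev) waits for the completion
     * before touching the descriptor. */
    while (rte_dma_completed(dma_dev_id, vchan, 1, &last_idx,
                             &has_error) == 0)
        ;

    /* Write barrier so that core C cannot observe the descriptor update
     * before the copied data.  This is the barrier discussed above;
     * whether the virtio side pairs it with a real read barrier depends
     * on VIRTIO_F_ORDER_PLATFORM. */
    rte_smp_wmb();
    *desc_flags = 1;  /* e.g. mark the vring descriptor as used. */
}

The open question from the thread then remains: is the completion returned on
core B enough for core C, which only reads the descriptor and the data, or
does the virtio driver on C need actual read barriers as well?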