Date: Tue, 29 Mar 2022 18:03:17 +0100
From: Bruce Richardson
To: Morten Brørup
Cc: Maxime Coquelin, "Van Haaren, Harry", "Pai G, Sunil", "Stokes, Ian",
 "Hu, Jiayu", "Ferriter, Cian", Ilya Maximets, ovs-dev@openvswitch.org,
 dev@dpdk.org, "Mcnamara, John", "O'Driscoll, Tim", "Finn, Emma"
Subject: Re: OVS DPDK DMA-Dev library/Design Discussion

On Tue, Mar 29, 2022 at 06:45:19PM +0200, Morten Brørup wrote:
> > From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> > Sent: Tuesday, 29 March 2022 18.24
> >
> > Hi Morten,
> >
> > On 3/29/22 16:44, Morten Brørup wrote:
> > >> From: Van Haaren, Harry [mailto:harry.van.haaren@intel.com]
> > >> Sent: Tuesday, 29 March 2022 15.02
> > >>
> > >>> From: Morten Brørup
> > >>> Sent: Tuesday, March 29, 2022 1:51 PM
> > >>>
> > >>> Having thought more about it, I think that a completely different
> > architectural approach is required:
> > >>>
> > >>> Many of the DPDK Ethernet PMDs implement a variety of RX and TX
> > packet burst functions, each optimized for different CPU vector
> > instruction sets. The availability of a DMA engine should be treated
> > the same way.
> > So I suggest that PMDs copying packet contents, e.g.
> > memif, pcap, vmxnet3, should implement DMA optimized RX and TX packet
> > burst functions.
> > >>>
> > >>> Similarly for the DPDK vhost library.
> > >>>
> > >>> In such an architecture, it would be the application's job to
> > allocate DMA channels and assign them to the specific PMDs that should
> > use them. But the actual use of the DMA channels would move down below
> > the application and into the DPDK PMDs and libraries.
> > >>>
> > >>>
> > >>> Med venlig hilsen / Kind regards,
> > >>> -Morten Brørup
> > >>
> > >> Hi Morten,
> > >>
> > >> That's *exactly* how this architecture is designed & implemented.
> > >> 1. The DMA configuration and initialization is up to the application
> > (OVS).
> > >> 2. The VHost library is passed the DMA-dev ID, and its new async
> > rx/tx APIs, and uses the DMA device to accelerate the copy.
> > >>
> > >> Looking forward to talking on the call that just started. Regards, -
> > Harry
> > >>
> > >
> > > OK, thanks - as I said on the call, I haven't looked at the patches.
> > >
> > > Then, I suppose that the TX completions can be handled in the TX
> > function, and the RX completions can be handled in the RX function,
> > just like the Ethdev PMDs handle packet descriptors:
> > >
> > > TX_Burst(tx_packet_array):
> > > 1. Clean up descriptors processed by the NIC chip. --> Process TX
> > DMA channel completions. (Effectively, the 2nd pipeline stage.)
> > > 2. Pass on the tx_packet_array to the NIC chip descriptors. --> Pass
> > on the tx_packet_array to the TX DMA channel. (Effectively, the 1st
> > pipeline stage.)
> >
> > The problem is Tx function might not be called again, so enqueued
> > packets in 2. may never be completed from a Virtio point of view. IOW,
> > the packets will be copied to the Virtio descriptors buffers, but the
> > descriptors will not be made available to the Virtio driver.
>
> In that case, the application needs to call TX_Burst() periodically with
> an empty array, for completion purposes.
>
> Or some sort of TX_Keepalive() function can be added to the DPDK library,
> to handle DMA completion. It might even handle multiple DMA channels, if
> convenient - and if possible without locking or other weird complexity.
>
> Here is another idea, inspired by a presentation at one of the DPDK
> Userspace conferences. It may be wishful thinking, though:
>
> Add an additional transaction to each DMA burst; a special transaction
> containing the memory write operation that makes the descriptors
> available to the Virtio driver.
>
That is something that can work, so long as the receiver is operating in
polling mode. For cases where virtio interrupts are enabled, you still
need to do a write to the eventfd in the kernel in vhost to signal the
virtio side. That's not something that can be offloaded to a DMA engine,
sadly, so we still need some form of completion call.

/Bruce
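
[Editorial note: below is a minimal sketch of how the two-stage TX_Burst()
pattern discussed in this thread could look against the public dmadev API.
It is illustrative only and is not the vhost async code from the patches
under discussion: struct dma_tx_state, tx_burst_dma() and the src/dst/len
parameters are invented names, while rte_dma_copy(), rte_dma_submit() and
rte_dma_completed() are real DPDK calls. Calling it with nb == 0 gives the
"empty array, for completion purposes" behaviour, and in interrupt mode the
eventfd kick would still have to happen in the completion stage, as Bruce
points out above.]

/*
 * Illustrative sketch (not the vhost async implementation): the two-stage
 * TX burst pattern from the discussion, expressed with the dmadev API.
 */
#include <stdbool.h>
#include <stdint.h>
#include <rte_dmadev.h>

struct dma_tx_state {
	int16_t dev_id;    /* dmadev selected and configured by the application */
	uint16_t vchan;    /* virtual channel on that device */
	uint16_t inflight; /* copies submitted but not yet reaped */
};

static uint16_t
tx_burst_dma(struct dma_tx_state *s, const rte_iova_t *src,
	     const rte_iova_t *dst, const uint32_t *len, uint16_t nb)
{
	uint16_t last_idx, done, i;
	bool error = false;

	/*
	 * 2nd pipeline stage, done first: reap completions of copies
	 * submitted on earlier calls. Only at this point could the
	 * corresponding virtio descriptors be made visible to the guest
	 * (and, in interrupt mode, the eventfd written).
	 */
	done = rte_dma_completed(s->dev_id, s->vchan, s->inflight,
				 &last_idx, &error);
	s->inflight -= done;
	/* ... flush the 'done' descriptors to the virtio ring here;
	 * error handling omitted in this sketch ... */

	/* 1st pipeline stage: hand the new batch of copies to the DMA channel. */
	for (i = 0; i < nb; i++) {
		if (rte_dma_copy(s->dev_id, s->vchan, src[i], dst[i],
				 len[i], 0) < 0)
			break; /* descriptor ring full; caller retries the rest later */
		s->inflight++;
	}
	if (i > 0)
		rte_dma_submit(s->dev_id, s->vchan); /* one doorbell per batch */

	return i;
}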