DPDK patches and discussions
From: Bruce Richardson <bruce.richardson@intel.com>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>,
	"Van Haaren, Harry" <harry.van.haaren@intel.com>,
	"Pai G, Sunil" <sunil.pai.g@intel.com>,
	"Stokes, Ian" <ian.stokes@intel.com>,
	"Hu, Jiayu" <jiayu.hu@intel.com>,
	"Ferriter, Cian" <cian.ferriter@intel.com>,
	Ilya Maximets <i.maximets@ovn.org>,
	ovs-dev@openvswitch.org, dev@dpdk.org, "Mcnamara,
	John" <john.mcnamara@intel.com>,
	"O'Driscoll, Tim" <tim.odriscoll@intel.com>,
	"Finn, Emma" <emma.finn@intel.com>
Subject: Re: OVS DPDK DMA-Dev library/Design Discussion
Date: Tue, 29 Mar 2022 18:03:17 +0100	[thread overview]
Message-ID: <YkM71aqX00pY6hVf@bricha3-MOBL.ger.corp.intel.com> (raw)
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D86F7E@smartserver.smartshare.dk>

On Tue, Mar 29, 2022 at 06:45:19PM +0200, Morten Brørup wrote:
> > From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> > Sent: Tuesday, 29 March 2022 18.24
> > 
> > Hi Morten,
> > 
> > On 3/29/22 16:44, Morten Brørup wrote:
> > >> From: Van Haaren, Harry [mailto:harry.van.haaren@intel.com]
> > >> Sent: Tuesday, 29 March 2022 15.02
> > >>
> > >>> From: Morten Brørup <mb@smartsharesystems.com>
> > >>> Sent: Tuesday, March 29, 2022 1:51 PM
> > >>>
> > >>> Having thought more about it, I think that a completely different
> > architectural approach is required:
> > >>>
> > >>> Many of the DPDK Ethernet PMDs implement a variety of RX and TX
> > packet burst functions, each optimized for different CPU vector
> > instruction sets. The availability of a DMA engine should be treated
> > the same way. So I suggest that PMDs copying packet contents, e.g.
> > memif, pcap, vmxnet3, should implement DMA optimized RX and TX packet
> > burst functions.
> > >>>
> > >>> Similarly for the DPDK vhost library.
> > >>>
> > >>> In such an architecture, it would be the application's job to
> > allocate DMA channels and assign them to the specific PMDs that should
> > use them. But the actual use of the DMA channels would move down below
> > the application and into the DPDK PMDs and libraries.
> > >>>
> > >>>
> > >>> Med venlig hilsen / Kind regards,
> > >>> -Morten Brørup
> > >>
> > >> Hi Morten,
> > >>
> > >> That's *exactly* how this architecture is designed & implemented.
> > >> 1.	The DMA configuration and initialization is up to the application
> > (OVS).
> > >> 2.	The vhost library is passed the DMA-dev ID through its new async
> > rx/tx APIs, and uses the DMA device to accelerate the copy.
> > >>
> > >> Looking forward to talking on the call that just started. Regards, -
> > Harry
> > >>
> > >
> > > OK, thanks - as I said on the call, I haven't looked at the patches.
> > >
> > > Then, I suppose that the TX completions can be handled in the TX
> > function, and the RX completions can be handled in the RX function,
> > just like the Ethdev PMDs handle packet descriptors:
> > >
> > > TX_Burst(tx_packet_array):
> > > 1.	Clean up descriptors processed by the NIC chip. --> Process TX
> > DMA channel completions. (Effectively, the 2nd pipeline stage.)
> > > 2.	Pass on the tx_packet_array to the NIC chip descriptors. --> Pass
> > on the tx_packet_array to the TX DMA channel. (Effectively, the 1st
> > pipeline stage.)
> > 
> > The problem is that the Tx function might not be called again, so
> > packets enqueued in step 2 may never be completed from a Virtio point
> > of view. IOW, the packets will be copied into the Virtio descriptor
> > buffers, but the descriptors will not be made available to the Virtio
> > driver.
> 
> In that case, the application needs to call TX_Burst() periodically with an empty array, for completion purposes.
> 
> Or some sort of TX_Keepalive() function can be added to the DPDK library, to handle DMA completion. It might even handle multiple DMA channels, if convenient - and if possible without locking or other weird complexity.
> 
> Here is another idea, inspired by a presentation at one of the DPDK Userspace conferences. It may be wishful thinking, though:
> 
> Add an additional transaction to each DMA burst; a special transaction containing the memory write operation that makes the descriptors available to the Virtio driver.
> 

That is something that can work, so long as the receiver is operating in
polling mode. For cases where virtio interrupts are enabled, you still need
to write to the vhost eventfd in the kernel to signal the virtio side.
That's not something that can be offloaded to a DMA engine, sadly, so we
still need some form of completion call.

/Bruce

