DPDK patches and discussions
From: "Pai G, Sunil" <sunil.pai.g@intel.com>
To: "Richardson, Bruce" <bruce.richardson@intel.com>,
	Ilya Maximets <i.maximets@ovn.org>,
	Chengwen Feng <fengchengwen@huawei.com>,
	"Radha Mohan Chintakuntla" <radhac@marvell.com>,
	Veerasenareddy Burru <vburru@marvell.com>,
	Gagandeep Singh <g.singh@nxp.com>,
	Nipun Gupta <nipun.gupta@nxp.com>
Cc: "Stokes, Ian" <ian.stokes@intel.com>,
	"Hu, Jiayu" <jiayu.hu@intel.com>,
	"Ferriter, Cian" <cian.ferriter@intel.com>,
	"Van Haaren, Harry" <harry.van.haaren@intel.com>,
	"Maxime Coquelin (maxime.coquelin@redhat.com)"
	<maxime.coquelin@redhat.com>,
	"ovs-dev@openvswitch.org" <ovs-dev@openvswitch.org>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Mcnamara, John" <john.mcnamara@intel.com>,
	"O'Driscoll, Tim" <tim.odriscoll@intel.com>,
	"Finn, Emma" <emma.finn@intel.com>
Subject: RE: OVS DPDK DMA-Dev library/Design Discussion
Date: Fri, 8 Apr 2022 06:29:11 +0000	[thread overview]
Message-ID: <CO6PR11MB5603A9314AE76184512877BDBDE99@CO6PR11MB5603.namprd11.prod.outlook.com> (raw)
In-Reply-To: <YkwxFZUqOfnL9cJC@bricha3-MOBL.ger.corp.intel.com>

> -----Original Message-----
> From: Richardson, Bruce <bruce.richardson@intel.com>
> Sent: Tuesday, April 5, 2022 5:38 PM
> To: Ilya Maximets <i.maximets@ovn.org>; Chengwen Feng
> <fengchengwen@huawei.com>; Radha Mohan Chintakuntla <radhac@marvell.com>;
> Veerasenareddy Burru <vburru@marvell.com>; Gagandeep Singh
> <g.singh@nxp.com>; Nipun Gupta <nipun.gupta@nxp.com>
> Cc: Pai G, Sunil <sunil.pai.g@intel.com>; Stokes, Ian
> <ian.stokes@intel.com>; Hu, Jiayu <jiayu.hu@intel.com>; Ferriter, Cian
> <cian.ferriter@intel.com>; Van Haaren, Harry <harry.van.haaren@intel.com>;
> Maxime Coquelin (maxime.coquelin@redhat.com) <maxime.coquelin@redhat.com>;
> ovs-dev@openvswitch.org; dev@dpdk.org; Mcnamara, John
> <john.mcnamara@intel.com>; O'Driscoll, Tim <tim.odriscoll@intel.com>;
> Finn, Emma <emma.finn@intel.com>
> Subject: Re: OVS DPDK DMA-Dev library/Design Discussion
> 
> On Tue, Apr 05, 2022 at 01:29:25PM +0200, Ilya Maximets wrote:
> > On 3/30/22 16:09, Bruce Richardson wrote:
> > > On Wed, Mar 30, 2022 at 01:41:34PM +0200, Ilya Maximets wrote:
> > >> On 3/30/22 13:12, Bruce Richardson wrote:
> > >>> On Wed, Mar 30, 2022 at 12:52:15PM +0200, Ilya Maximets wrote:
> > >>>> On 3/30/22 12:41, Ilya Maximets wrote:
> > >>>>> Forking the thread to discuss a memory consistency/ordering model.
> > >>>>>
> > >>>>> AFAICT, dmadev can be anything from part of a CPU to a
> > >>>>> completely separate PCI device.  However, I don't see any memory
> > >>>>> ordering being enforced or even described in the dmadev API or
> > >>>>> documentation.  Please, point me to the correct documentation, if
> > >>>>> I somehow missed it.
> > >>>>>
> > >>>>> We have a DMA device (A) and a CPU core (B) writing respectively
> > >>>>> the data and the descriptor info.  CPU core (C) is reading the
> > >>>>> descriptor and the data it points to.
> > >>>>>
> > >>>>> A few things about that process:
> > >>>>>
> > >>>>> 1. There is no memory barrier between writes A and B (Did I miss
> > >>>>>    them?).  Meaning that those operations can be seen by C in a
> > >>>>>    different order regardless of barriers issued by C and
> > >>>>>    regardless of the nature of devices A and B.
> > >>>>>
> > >>>>> 2. Even if there is a write barrier between A and B, there is
> > >>>>>    no guarantee that C will see these writes in the same order
> > >>>>>    as C doesn't use real memory barriers because vhost
> > >>>>> advertises
> > >>>>
> > >>>> s/advertises/does not advertise/
> > >>>>
> > >>>>>    VIRTIO_F_ORDER_PLATFORM.
> > >>>>>
> > >>>>> So, I'm coming to the conclusion that there is a missing write
> > >>>>> barrier on the vhost side and vhost itself must not advertise
> > >>>>> the
> > >>>>
> > >>>> s/must not/must/
> > >>>>
> > >>>> Sorry, I wrote things backwards. :)
> > >>>>
> > >>>>> VIRTIO_F_ORDER_PLATFORM, so the virtio driver can use actual
> > >>>>> memory barriers.
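> > >>>>>
> > >>>>> (As an illustration of what "actual memory barriers" means here,
> > >>>>> a rough sketch of how a virtio driver could pick its barrier
> > >>>>> strength from this feature bit - variable name illustrative, not
> > >>>>> the exact driver code:
> > >>>>>
> > >>>>>     /* weak_barriers: VIRTIO_F_ORDER_PLATFORM was NOT negotiated,
> > >>>>>      * i.e. the device is assumed to be software, e.g. vhost. */
> > >>>>>     if (weak_barriers)
> > >>>>>         rte_smp_wmb(); /* orders stores vs. other CPUs only */
> > >>>>>     else
> > >>>>>         rte_wmb();     /* full store barrier, also vs. devices */
> > >>>>> )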
> > >>>>>
> > >>>>> Would like to hear some thoughts on that topic.  Is it a real
> > >>>>> issue?
> > >>>>> Is it an issue considering all possible CPU architectures and
> > >>>>> DMA HW variants?
> > >>>>>
> > >>>
> > >>> In terms of ordering of operations using dmadev:
> > >>>
> > >>> * Some DMA HW will perform all operations strictly in order, e.g.
> > >>>   Intel IOAT, while other hardware may not guarantee the order of
> > >>>   operations/may do things in parallel, e.g. Intel DSA. Therefore
> > >>>   the dmadev API provides the fence operation, which allows the
> > >>>   order to be enforced. The fence can be thought of as a full
> > >>>   memory barrier, meaning no jobs after the barrier can be started
> > >>>   until all those before it have completed. Obviously, for HW where
> > >>>   order is always enforced, this will be a no-op, but for hardware
> > >>>   that parallelizes, we want to reduce the fences to get the best
> > >>>   performance.
> > >>>
> > >>> * For synchronization between DMA devices and CPUs, where a CPU can
> > >>>   only write after a DMA copy has been done, the CPU must wait for
> > >>>   the DMA completion to guarantee ordering. Once the completion has
> > >>>   been returned, the completed operation is globally visible to all
> > >>>   cores. (A minimal sketch of both points follows below.)
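> > >>>
> > >>> As a minimal sketch of both points in dmadev API terms (dev_id/vchan
> > >>> setup, iova handling and error checks omitted; purely illustrative):
> > >>>
> > >>>     #include <rte_dmadev.h>
> > >>>
> > >>>     /* FENCE on the 2nd copy: it may not start until the 1st has
> > >>>      * completed, even on HW that otherwise runs jobs in parallel. */
> > >>>     rte_dma_copy(dev_id, vchan, src1, dst1, len1, 0);
> > >>>     rte_dma_copy(dev_id, vchan, src2, dst2, len2,
> > >>>                  RTE_DMA_OP_FLAG_FENCE | RTE_DMA_OP_FLAG_SUBMIT);
> > >>>
> > >>>     /* CPU/DMA sync: wait for the completion before the CPU touches
> > >>>      * the copied-to region. */
> > >>>     uint16_t last_idx;
> > >>>     bool has_error;
> > >>>     while (rte_dma_completed(dev_id, vchan, 1, &last_idx,
> > >>>                              &has_error) == 0)
> > >>>         ; /* poll */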
> > >>
> > >> Thanks for the explanation!  Some questions though:
> > >>
> > >> In our case one CPU waits for completion and another CPU is
> > >> actually using the data.  IOW, "CPU must wait" is a bit ambiguous.
> > >> Which CPU must wait?
> > >>
> > >> Or should it be "Once the completion is visible on any core, the
> > >> completed operation is globally visible to all cores." ?
> > >>
> > >
> > > The latter.
> > > Once the change to memory/cache is visible to any core, it is
> > > visible to all of them. This applies to regular CPU memory writes
> > > too - at least on IA, and I expect on many other architectures -
> > > once the write is visible outside the current core it is visible to
> > > every other core. Once the data hits the L1 or L2 cache of any core,
> > > any subsequent requests for that data from any other core will
> > > "snoop" the latest data from that core's cache, even if it has not
> > > made its way down to a shared cache, e.g. L3 on most IA systems.
> >
> > It sounds like you're referring to the "multicopy atomicity" of the
> > architecture.  However, that is not a universally supported property.
> > AFAICT, POWER and older ARM systems don't support it, so writes
> > performed by one core are not necessarily visible to all other cores
> > at the same time.  That means that if CPU0 writes the data and the
> > completion flag, CPU1 reads the completion flag and writes the ring,
> > CPU2 may see the ring write, but may still not see the write of the
> > data, even though there was a control dependency on CPU1.
> > There should be a full memory barrier on CPU1 in order to fulfill the
> > memory ordering requirements for CPU2, IIUC.
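> >
> > A hand-wavy sketch of that sequence (plain C-style pseudocode, names
> > illustrative; rte_smp_mb() as a stand-in for a full barrier):
> >
> >     /* CPU0 (or the DMA device): */
> >     data = X;
> >     done_flag = 1;
> >
> >     /* CPU1: */
> >     while (done_flag != 1)
> >         ;
> >     rte_smp_mb();    /* without this, the control dependency alone
> >                       * does not order 'data' for CPU2 on non-MCA
> >                       * systems (POWER, older ARM) */
> >     ring[idx] = &data;
> >
> >     /* CPU2: */
> >     p = ring[idx];   /* may observe the ring write ... */
> >     v = *p;          /* ... yet still read stale 'data' if CPU1
> >                       * issued no full barrier */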
> >
> > In our scenario CPU0 is a DMA device, which may or may not be part
> > of a CPU and may have different memory consistency/ordering
> > requirements.  So, the question is: does the DPDK DMA API guarantee
> > multicopy atomicity between the DMA device and all CPU cores
> > regardless of CPU architecture and the nature of the DMA device?
> >
> 
> Right now, it doesn't, because this never came up in discussion. To be
> useful, it sounds like it explicitly should. At least for the Intel
> ioat and idxd driver cases, this will be supported, so we just need to
> ensure all other drivers currently upstreamed can offer this too. If
> they cannot, we cannot offer it as a global guarantee, and we should
> see about adding a capability flag to indicate when the guarantee is
> there or not.
> 
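> As a sketch of how such a capability could surface to applications -
> RTE_DMA_CAPA_MEM_ORDER below is hypothetical, not an existing flag:
>
>     struct rte_dma_info info;
>
>     if (rte_dma_info_get(dev_id, &info) == 0 &&
>         (info.dev_capa & RTE_DMA_CAPA_MEM_ORDER)) {
>         /* completed ops are guaranteed globally visible to all
>          * cores once the completion has been returned */
>     }
>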
> Maintainers of dma/cnxk, dma/dpaa and dma/hisilicon - are we ok to
> document for dmadev that once a DMA operation is completed, the op is
> guaranteed to be visible to all cores/threads? If not, any thoughts on
> what guarantees we can provide in this regard, or what capabilities
> should be exposed?



Hi @Chengwen Feng, @Radha Mohan Chintakuntla, @Veerasenareddy Burru, @Gagandeep Singh, @Nipun Gupta,
Requesting your valuable opinions on the queries in this thread.


> 
> /Bruce


Thread overview: 59+ messages
2022-03-24 15:36 Stokes, Ian
2022-03-28 18:19 ` Pai G, Sunil
2022-03-29 12:51   ` Morten Brørup
2022-03-29 13:01     ` Van Haaren, Harry
2022-03-29 14:44       ` Morten Brørup
2022-03-29 16:24         ` Maxime Coquelin
2022-03-29 16:45           ` Morten Brørup
2022-03-29 17:03             ` Bruce Richardson
2022-03-29 17:13               ` Morten Brørup
2022-03-29 17:45                 ` Ilya Maximets
2022-03-29 18:46                   ` Morten Brørup
2022-03-30  2:02                   ` Hu, Jiayu
2022-03-30  9:25                     ` Maxime Coquelin
2022-03-30 10:20                       ` Bruce Richardson
2022-03-30 14:27                       ` Hu, Jiayu
2022-03-29 17:46                 ` Van Haaren, Harry
2022-03-29 19:59                   ` Morten Brørup
2022-03-30  9:01                     ` Van Haaren, Harry
2022-04-07 14:04                       ` Van Haaren, Harry
2022-04-07 14:25                         ` Maxime Coquelin
2022-04-07 14:39                           ` Ilya Maximets
2022-04-07 14:42                             ` Van Haaren, Harry
2022-04-07 15:01                               ` Ilya Maximets
2022-04-07 15:46                                 ` Maxime Coquelin
2022-04-07 16:04                                   ` Bruce Richardson
2022-04-08  7:13                             ` Hu, Jiayu
2022-04-08  8:21                               ` Morten Brørup
2022-04-08  9:57                               ` Ilya Maximets
2022-04-20 15:39                                 ` Mcnamara, John
2022-04-20 16:41                                 ` Mcnamara, John
2022-04-25 21:46                                   ` Ilya Maximets
2022-04-27 14:55                                     ` Mcnamara, John
2022-04-27 20:34                                     ` Bruce Richardson
2022-04-28 12:59                                       ` Ilya Maximets
2022-04-28 13:55                                         ` Bruce Richardson
2022-05-03 19:38                                         ` Van Haaren, Harry
2022-05-10 14:39                                           ` Van Haaren, Harry
2022-05-24 12:12                                           ` Ilya Maximets
2022-03-30 10:41   ` Ilya Maximets
2022-03-30 10:52     ` Ilya Maximets
2022-03-30 11:12       ` Bruce Richardson
2022-03-30 11:41         ` Ilya Maximets
2022-03-30 14:09           ` Bruce Richardson
2022-04-05 11:29             ` Ilya Maximets
2022-04-05 12:07               ` Bruce Richardson
2022-04-08  6:29                 ` Pai G, Sunil [this message]
2022-05-13  8:52                   ` fengchengwen
2022-05-13  9:10                     ` Bruce Richardson
2022-05-13  9:48                       ` fengchengwen
2022-05-13 10:34                         ` Bruce Richardson
2022-05-16  9:04                           ` Morten Brørup
2022-05-16 22:31                           ` [EXT] " Radha Chintakuntla
  -- strict thread matches above, loose matches on Subject: below --
2022-04-25 15:19 Mcnamara, John
2022-04-21 14:57 Mcnamara, John
     [not found] <DM6PR11MB3227AC0014F321EB901BE385FC199@DM6PR11MB3227.namprd11.prod.outlook.com>
2022-04-21 11:51 ` Mcnamara, John
     [not found] <DM8PR11MB5605B4A5DBD79FFDB4B1C3B2BD0A9@DM8PR11MB5605.namprd11.prod.outlook.com>
2022-03-21 18:23 ` Pai G, Sunil
2022-03-15 15:48 Stokes, Ian
2022-03-15 13:17 Stokes, Ian
2022-03-15 11:15 Stokes, Ian
