DPDK patches and discussions
 help / color / mirror / Atom feed
From: Ilya Maximets <i.maximets@samsung.com>
To: Shahaf Shuler <shahafs@mellanox.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>,
	Xiao Wang <xiao.w.wang@intel.com>,
	"jfreimann@redhat.com" <jfreimann@redhat.com>
Cc: Tiwei Bie <tiwei.bie@intel.com>,
	Zhihong Wang <zhihong.wang@intel.com>,
	Jason Wang <jasowang@redhat.com>,
	"xiaolong.ye@intel.com" <xiaolong.ye@intel.com>,
	"alejandro.lucero@netronome.com" <alejandro.lucero@netronome.com>,
	Daniel Marcovitch <danielm@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v2] net/virtio: add platform memory ordering feature support
Date: Wed, 9 Jan 2019 17:34:38 +0300	[thread overview]
Message-ID: <2832c88f-9b43-997c-5937-ef5ae6482fd5@samsung.com> (raw)
In-Reply-To: <DB7PR05MB4426DD67FB44510971CECAEDC3B60@DB7PR05MB4426.eurprd05.prod.outlook.com>

On 27.12.2018 13:07, Shahaf Shuler wrote:
> Hi Ilya, 
> 
> Wednesday, December 26, 2018 6:37 PM, Ilya Maximets:
>> Subject: [dpdk-dev] [PATCH v2] net/virtio: add platform memory ordering
>> feature support
>>
>> VIRTIO_F_ORDER_PLATFORM is required to use proper memory barriers in
>> case of HW vhost implementations like vDPA.
>>
>> DMA barriers (rte_cio_*) are sufficent for that purpose.
>>
>> Previously known as VIRTIO_F_IO_BARRIER.
>>
>> Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
>> ---
>>
>> Version 2:
>>   * rebased on current master (packed rings).
>>
>> RFC --> Version 1:
>>   * Dropped vendor-specific hack to determine if we need real barriers.
>>   * Added VIRTIO_F_ORDER_PLATFORM feature definition and checking.
>>
>> Note: Patch to change the name of the feature from VIRTIO_F_IO_BARRIER
>>       to VIRTIO_F_ORDER_PLATFORM is not merged yet:
>>
>> https://emea01.safelinks.protection.outlook.com/?url=https%3A%2F%2Fwww.m
>> ail-archive.com%2Fvirtio-dev%40lists.oasis-
>> open.org%2Fmsg04114.html&amp;data=02%7C01%7Cshahafs%40mellanox.co
>> m%7C147684bb9b754e30781408d66b506965%7Ca652971c7d2e4d9ba6a4d149
>> 256f461b%7C0%7C0%7C636814390449088148&amp;sdata=f4W1YLftBtZ4zAQ3
>> %2Bv7LV5w%2FDhM5aWpROV2c2gHY8iU%3D&amp;reserved=0
>>
>>  drivers/net/virtio/virtio_ethdev.c |  2 ++  drivers/net/virtio/virtio_ethdev.h |  3
>> ++-
>>  drivers/net/virtio/virtio_pci.h    |  7 ++++++
>>  drivers/net/virtio/virtio_rxtx.c   | 16 ++++++------
>>  drivers/net/virtio/virtqueue.h     | 39 ++++++++++++++++++++++++------
>>  5 files changed, 51 insertions(+), 16 deletions(-)
> 
> [...]
> 
>>
>>  /*
>> - * Per virtio_config.h in Linux.
>> + * Per virtio_ring.h in Linux.
>>   *     For virtio_pci on SMP, we don't need to order with respect to MMIO
>>   *     accesses through relaxed memory I/O windows, so smp_mb() et al are
>>   *     sufficient.
>>   *
>> + *     For using virtio to talk to real devices (eg. vDPA) we do need real
>> + *     barriers.
>>   */
>> -#define virtio_mb()	rte_smp_mb()
>> -#define virtio_rmb()	rte_smp_rmb()
>> -#define virtio_wmb()	rte_smp_wmb()
>> +static inline void
>> +virtio_mb(uint8_t weak_barriers)
>> +{
>> +	if (weak_barriers)
>> +		rte_smp_mb();
>> +	else
>> +		rte_mb();
>> +}
>> +
>> +static inline void
>> +virtio_rmb(uint8_t weak_barriers)
>> +{
>> +	if (weak_barriers)
>> +		rte_smp_rmb();
>> +	else
>> +		rte_cio_rmb();
>> +}
>> +
>> +static inline void
>> +virtio_wmb(uint8_t weak_barriers)
>> +{
>> +	if (weak_barriers)
>> +		rte_smp_wmb();
>> +	else
>> +		rte_cio_wmb();
> 
> Just wondering if the cio barrier is enough here.
> This virtio_wmb will be called, for example in the following sequence for transmit (not of packed queue):
> if (likely(nb_tx)) {                                                 
>         vq_update_avail_idx(vq);                <--- virito_wmb()                     
>                                                                      
>         if (unlikely(virtqueue_kick_prepare(vq))) {                 <--- no barrier 
>                 virtqueue_notify(vq);                                
>                 PMD_TX_LOG(DEBUG, "Notified backend after xmit");    
>         }                                                            
> }                                                                    
> 
> Assuming the backhand has vDPA device. The notify will be most likely mapped to the device PCI bar as write combining.
> This means we need to keep ordering here between two memory domains - the PCI and the host local memory. We must make sure that when the notify reaches the device, the store to the host local memory is already written to memory (and not in the write buffer).
> Is complier barrier is enough for such purpose? Or we need something stronger like sfence? 
> 
> Note #1 - This issue (if exists) is not caused directly by your code, however you are making things right here w/ respect to memory ordering so worth taking care also on this one
> Note #2 - if indeed there is an issue here, for packed queue we are somewhat OK, since virtio_mb() will be called before the kick. I am not sure why we need such hard barrier and sfence is not enough. Do you know? 

Hmm. Thanks for spotting this.
The issue exists, but it's a bit different.
Looks like we have missed virtio_mb() for the split case inside the
virtqueue_kick_prepare(). It's required because we must ensure the
ordering between writing the data (updating avail->idx) and reading the
used->flags. Otherwise we could catch the case where data written, but
virtqueue wasn't notified and vhost waits for notification.
This is not the vDPA related issue. This could happen with kernel vhost.
Adding the virtio_mb() to virtqueue_kick_prepare() will solve the vDPA
case automatically. I'll prepare the patches.


Beside that, Jens, between v10 and v11 of the packed queues support
patch-set you added the virtio_mb() to kick_prepare function.
I guess, this could be the root cause of the performance drop you
spotted in compare with split queues. Maybe you could check and prove
that point?
I didn't found why you done that change (there are no such review
comments on the list), but it was right decision.

virtio_mb() is really heavy. I'd like to avoid it somehow, but I don't
know how to do this yet.

> 
> 
>> +}
>>
>>  #ifdef RTE_PMD_PACKET_PREFETCH
>>  #define rte_packet_prefetch(p)  rte_prefetch1(p) @@ -325,7 +350,7 @@
>> virtqueue_enable_intr_packed(struct virtqueue *vq)
>>
>>
>>  	if (vq->event_flags_shadow == RING_EVENT_FLAGS_DISABLE) {
>> -		virtio_wmb();
>> +		virtio_wmb(vq->hw->weak_barriers);
>>  		vq->event_flags_shadow = RING_EVENT_FLAGS_ENABLE;
>>  		*event_flags = vq->event_flags_shadow;
>>  	}
>> @@ -391,7 +416,7 @@ void vq_ring_free_inorder(struct virtqueue *vq,
>> uint16_t desc_idx,  static inline void  vq_update_avail_idx(struct virtqueue *vq)
>> {
>> -	virtio_wmb();
>> +	virtio_wmb(vq->hw->weak_barriers);
>>  	vq->vq_ring.avail->idx = vq->vq_avail_idx;  }
>>
>> @@ -423,7 +448,7 @@ virtqueue_kick_prepare_packed(struct virtqueue *vq)  {
>>  	uint16_t flags;
>>
>> -	virtio_mb();
>> +	virtio_mb(vq->hw->weak_barriers);
>>  	flags = vq->ring_packed.device_event->desc_event_flags;
>>
>>  	return flags != RING_EVENT_FLAGS_DISABLE;
>> --
>> 2.17.1
> 

  reply	other threads:[~2019-01-09 14:34 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <CGME20181214153817eucas1p19a41cdd791879252e1f3a5d77c427845@eucas1p1.samsung.com>
2018-12-14 15:38 ` [dpdk-dev] [PATCH] " Ilya Maximets
2018-12-14 17:00   ` Michael S. Tsirkin
2018-12-14 17:23     ` Ilya Maximets
     [not found]   ` <CGME20181226163717eucas1p15276eb45e35abe2c9cf3e7c1e0050823@eucas1p1.samsung.com>
2018-12-26 16:37     ` [dpdk-dev] [PATCH v2] " Ilya Maximets
2018-12-27 10:07       ` Shahaf Shuler
2019-01-09 14:34         ` Ilya Maximets [this message]
2019-01-09 15:50           ` Michael S. Tsirkin
2019-01-10 20:36             ` Shahaf Shuler
2019-01-15  6:33               ` Shahaf Shuler
2019-01-15  8:29                 ` Ilya Maximets
2019-01-15  8:55                   ` Shahaf Shuler
2019-01-15 10:23                     ` Ilya Maximets
2019-02-12 17:50                     ` Michael S. Tsirkin
     [not found]       ` <CGME20190109145021eucas1p1bfe194ffafaaaa5df62243c92b2ed6cd@eucas1p1.samsung.com>
2019-01-09 14:50         ` [dpdk-dev] [PATCH v3 0/3] Missing barriers and VIRTIO_F_ORDER_PLATFORM Ilya Maximets
     [not found]           ` <CGME20190109145027eucas1p2437215de0df4c691eb84d4e84bfc71e5@eucas1p2.samsung.com>
2019-01-09 14:50             ` [dpdk-dev] [PATCH v3 1/3] net/virtio: add missing barrier before reading the flags Ilya Maximets
2019-01-10 14:31               ` Maxime Coquelin
     [not found]           ` <CGME20190109145034eucas1p2183e275e316b87917b96fa184fc7d7cb@eucas1p2.samsung.com>
2019-01-09 14:50             ` [dpdk-dev] [PATCH v3 2/3] net/virtio: update memory ordering comment for vq notify Ilya Maximets
2019-01-10  8:19               ` Gavin Hu (Arm Technology China)
2019-01-10  9:18                 ` Maxime Coquelin
2019-01-10  9:55                   ` Ilya Maximets
2019-01-10 14:56                     ` Michael S. Tsirkin
2019-01-10 14:31               ` Maxime Coquelin
     [not found]           ` <CGME20190109145040eucas1p2d9afc678ef94986544bde07b77373e6f@eucas1p2.samsung.com>
2019-01-09 14:50             ` [dpdk-dev] [PATCH v3 3/3] net/virtio: add platform memory ordering feature support Ilya Maximets
2019-01-10 14:31               ` Maxime Coquelin
2019-01-09 14:55           ` [dpdk-dev] [PATCH v3 0/3] Missing barriers and VIRTIO_F_ORDER_PLATFORM Michael S. Tsirkin
2019-01-09 15:24             ` Ilya Maximets
2019-01-09 16:53               ` Ferruh Yigit
2019-01-10 15:19             ` Maxime Coquelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2832c88f-9b43-997c-5937-ef5ae6482fd5@samsung.com \
    --to=i.maximets@samsung.com \
    --cc=alejandro.lucero@netronome.com \
    --cc=danielm@mellanox.com \
    --cc=dev@dpdk.org \
    --cc=jasowang@redhat.com \
    --cc=jfreimann@redhat.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=mst@redhat.com \
    --cc=shahafs@mellanox.com \
    --cc=tiwei.bie@intel.com \
    --cc=xiao.w.wang@intel.com \
    --cc=xiaolong.ye@intel.com \
    --cc=zhihong.wang@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).