DPDK patches and discussions
From: Jason Wang <jasowang@redhat.com>
To: Ilya Maximets <i.maximets@samsung.com>,
	Maxime Coquelin <maxime.coquelin@redhat.com>,
	dev@dpdk.org, jfreimann@redhat.com, tiwei.bie@intel.com,
	zhihong.wang@intel.com
Cc: stable@dpdk.org, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [dpdk-dev] [1/5] vhost: enforce avail index and desc read ordering
Date: Thu, 6 Dec 2018 21:25:56 +0800
Message-ID: <c21d8b80-d504-17a1-67bc-363a99da6bb1@redhat.com>
In-Reply-To: <db9c6933-0817-8947-38de-e0df079275de@samsung.com>


On 2018/12/6 8:48 PM, Ilya Maximets wrote:
> On 06.12.2018 7:17, Jason Wang wrote:
>> On 2018/12/5 7:30 PM, Ilya Maximets wrote:
>>> On 05.12.2018 12:49, Maxime Coquelin wrote:
>>>> A read barrier is required to ensure the ordering between
>>>> available index and the descriptor reads is enforced.
>>>>
>>>> Fixes: 4796ad63ba1f ("examples/vhost: import userspace vhost application")
>>>> Cc: stable@dpdk.org
>>>>
>>>> Reported-by: Jason Wang <jasowang@redhat.com>
>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> ---
>>>>    lib/librte_vhost/virtio_net.c | 12 ++++++++++++
>>>>    1 file changed, 12 insertions(+)
>>>>
>>>> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
>>>> index 5e1a1a727..f11ebb54f 100644
>>>> --- a/lib/librte_vhost/virtio_net.c
>>>> +++ b/lib/librte_vhost/virtio_net.c
>>>> @@ -791,6 +791,12 @@ virtio_dev_rx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>>>        rte_prefetch0(&vq->avail->ring[vq->last_avail_idx & (vq->size - 1)]);
>>>>        avail_head = *((volatile uint16_t *)&vq->avail->idx);
>>>> +    /*
>>>> +     * The ordering between avail index and
>>>> +     * desc reads needs to be enforced.
>>>> +     */
>>>> +    rte_smp_rmb();
>>>> +
>>> Hmm. This looks weird to me.
>>> Could you please describe the bad scenario here? (It would be good to
>>> have it in the commit message too.)
>>>
>>> As I understand, you're enforcing the read of avail->idx to happen before
>>> reading the avail->ring[avail_idx]. Is it correct?
>>>
>>> But we have the following code sequence:
>>>
>>> 1. read avail->idx (avail_head).
>>> 2. check that last_avail_idx != avail_head.
>>> 3. read from the ring using last_avail_idx.
>>>
>>> So, there is a strict dependency between all 3 steps and the memory
>>> transaction will be finished at the step #2 in any case. There is no
>>> way to read the ring before reading the avail->idx.
>>>
>>> Am I missing something?
>>
>> Nope, I kind of get what you mean now. And even if we go on to
>>
>> 4. read the descriptor from the descriptor ring using the id read in 3
>>
>> 5. read the descriptor content according to the address from 4
>>
>> these are still dependent memory accesses. So there's no need for an rmb.
>>
> On second glance, I changed my mind.
> The code looks like this:
>
> 1. read avail_head = avail->idx
> 2. read cur_idx    = last_avail_idx
> if (cur_idx != avail_head) {
>      3. read idx = avail->ring[cur_idx]
>      4. read desc[idx]
> }
>
> There is an address (data) dependency: 2 -> 3 -> 4.
> These reads could not be reordered.
>
> But there is only a control dependency between 1 and (3, 4), because
> 'avail_head' is not used to calculate 'cur_idx'. With aggressive
> speculative execution, 1 could be reordered with 3, resulting in a read
> of a not-yet-updated 'idx'.
>
> Not sure if speculative execution could go that far while 'avail_head' is
> not read yet, but it should be possible in theory.
>
> Thoughts ?


I think I've changed my mind as well; this is similar to the discussion
of desc_is_avail(). So I think it's possible.
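
For illustration only, here is a minimal sketch of the pattern under
discussion. It is not the vhost code: `struct avail_ring`, `RING_SIZE`,
`ring_publish()` and `ring_consume()` are hypothetical names, and a C11
acquire fence stands in for rte_smp_rmb(). The point is that step 3 takes
its address from last_avail_idx, so only the branch links it to the read
of avail->idx, and the fence is what orders the two reads.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define RING_SIZE 8u /* hypothetical size; must be a power of two */
static_assert((RING_SIZE & (RING_SIZE - 1)) == 0, "RING_SIZE must be 2^n");

/* Hypothetical, simplified stand-in for the split-ring avail area. */
struct avail_ring {
	_Atomic uint16_t idx;     /* published by the producer (guest side) */
	uint16_t ring[RING_SIZE]; /* descriptor ids */
};

/* Producer: fill the slot first, then publish the new index. */
static void
ring_publish(struct avail_ring *r, uint16_t desc_id)
{
	uint16_t idx = atomic_load_explicit(&r->idx, memory_order_relaxed);

	r->ring[idx & (RING_SIZE - 1)] = desc_id;
	/* Release: the slot write becomes visible before the new idx. */
	atomic_store_explicit(&r->idx, (uint16_t)(idx + 1),
			      memory_order_release);
}

/*
 * Consumer: returns 1 and stores the descriptor id in *desc_id if an
 * entry is available, 0 otherwise.
 */
static int
ring_consume(struct avail_ring *r, uint16_t *last_avail_idx,
	     uint16_t *desc_id)
{
	/* Step 1: read avail->idx. */
	uint16_t avail_head = atomic_load_explicit(&r->idx,
						   memory_order_relaxed);

	/* Step 2: only a control dependency links this check to step 3. */
	if (*last_avail_idx == avail_head)
		return 0;

	/* Read barrier: plays the role of rte_smp_rmb() in the patch. */
	atomic_thread_fence(memory_order_acquire);

	/* Step 3: the address depends on last_avail_idx, not avail_head. */
	*desc_id = r->ring[*last_avail_idx & (RING_SIZE - 1)];
	(*last_avail_idx)++;
	return 1;
}
```

Without the fence, nothing in this sketch would stop a weakly ordered CPU
from performing the ring read ahead of the index read.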


>
>>>>        for (pkt_idx = 0; pkt_idx < count; pkt_idx++) {
>>>>            uint32_t pkt_len = pkts[pkt_idx]->pkt_len + dev->vhost_hlen;
>>>>            uint16_t nr_vec = 0;
>>>> @@ -1373,6 +1379,12 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>>>        if (free_entries == 0)
>>>>            return 0;
>>>> +    /*
>>>> +     * The ordering between avail index and
>>>> +     * desc reads needs to be enforced.
>>>> +     */
>>>> +    rte_smp_rmb();
>>>> +
>>> This one is strange too.
>>>
>>>      free_entries = *((volatile uint16_t *)&vq->avail->idx) -
>>>              vq->last_avail_idx;
>>>      if (free_entries == 0)
>>>          return 0;
>>>
>>> The code reads the value of avail->idx and uses it on the next line,
>>> regardless of compiler optimizations. There is no way for the CPU to
>>> postpone the actual read.
>>
>> Yes.
>>
> It's a similar situation here, but 'avail_head' is involved in the
> 'cur_idx' calculation because of
> 	fill_vec_buf_split(..., vq->last_avail_idx + i, ...)
> and 'i' depends on 'free_entries'.


I agree it depends on the compiler; it can choose to remove such a data
dependency.


> But we need to look at the exact asm
> code to be sure.


I think it's hard to reach a conclusion by checking the asm code
generated by one specific version or kind of compiler.


> I think we may add a barrier here to avoid possible issues.


Yes.
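
As a rough sketch of that conclusion (hypothetical helper name
`avail_entries()`, with a C11 acquire fence standing in for
rte_smp_rmb(); not the actual virtio_dev_tx_split() code), the barrier
can follow the free_entries check so that correctness does not rely on
the compiler preserving a data dependency:

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Number of entries made available since last_avail_idx. Both indices
 * are free-running 16-bit counters, so the subtraction wraps modulo 2^16.
 */
static uint16_t
avail_entries(_Atomic uint16_t *avail_idx, uint16_t last_avail_idx)
{
	uint16_t head = atomic_load_explicit(avail_idx, memory_order_relaxed);
	uint16_t free_entries = (uint16_t)(head - last_avail_idx);

	if (free_entries == 0)
		return 0;

	/*
	 * Explicit read barrier: later ring and descriptor reads are
	 * ordered after the index read, whatever code the compiler emits.
	 */
	atomic_thread_fence(memory_order_acquire);
	return free_entries;
}
```

For example, with the index at 2 and last_avail_idx at 65534, the
wrapped difference is 4 entries.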


Thanks.


>
>> Thanks
>>
>>
>>>>        VHOST_LOG_DEBUG(VHOST_DATA, "(%d) %s\n", dev->vid, __func__);
>>>>          count = RTE_MIN(count, MAX_PKT_BURST);
>>>>
>>
