From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: "Ding, Xuan" <xuan.ding@intel.com>,
Maxime Coquelin <maxime.coquelin@redhat.com>,
"dev@dpdk.org" <dev@dpdk.org>,
"david.marchand@redhat.com" <david.marchand@redhat.com>,
"Xia, Chenbo" <chenbo.xia@intel.com>
Cc: "Hu, Jiayu" <jiayu.hu@intel.com>
Subject: Re: [dpdk-dev] [PATCH] vhost: fix async DMA map
Date: Tue, 26 Oct 2021 11:58:57 +0100 [thread overview]
Message-ID: <6714a5cf-b0b9-e1a8-60c4-83904ef4b9b9@intel.com> (raw)
In-Reply-To: <BN9PR11MB5513DC83249885B387AABBD7E7849@BN9PR11MB5513.namprd11.prod.outlook.com>
On 26-Oct-21 11:27 AM, Ding, Xuan wrote:
> Hi,
>
>> -----Original Message-----
>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>> Sent: Tuesday, October 26, 2021 5:49 PM
>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>> david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>> Subject: Re: [PATCH] vhost: fix async DMA map
>>
>>
>>
>> On 10/26/21 10:49, Ding, Xuan wrote:
>>> Hi Maxime,
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: Tuesday, October 26, 2021 2:53 PM
>>>> To: Ding, Xuan <xuan.ding@intel.com>; dev@dpdk.org;
>>>> david.marchand@redhat.com; Xia, Chenbo <chenbo.xia@intel.com>
>>>> Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
>>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>>
>>>>
>>>>
>>>> On 10/26/21 04:07, Ding, Xuan wrote:
>>>>> Hi Maxime,
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>> Sent: Tuesday, October 26, 2021 4:47 AM
>>>>>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
>>>>>> <chenbo.xia@intel.com>; Ding, Xuan <xuan.ding@intel.com>
>>>>>> Subject: Re: [PATCH] vhost: fix async DMA map
>>>>>>
>>>>>> Hi Xuan,
>>>>>>
>>>>>> On 10/25/21 22:33, Maxime Coquelin wrote:
>>>>>>> This patch fixes possible NULL-pointer dereferencing
>>>>>>> reported by Coverity and also fixes NUMA reallocation
>>>>>>> of the async DMA map.
>>>>>>>
>>>>>>> Fixes: 7c61fa08b716 ("vhost: enable IOMMU for async vhost")
>>>>>>>
>>>>>>> Coverity issue: 373655
>>>>>>>
>>>>>>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>>>>> ---
>>>>>>> lib/vhost/vhost_user.c | 45 +++++++++++++++++++-----------------------
>>>>>>> 1 file changed, 20 insertions(+), 25 deletions(-)
>>>>>>>
>>>>>>
>>>>>> I posted this patch to fix the issue reported by Coverity and also other
>>>>>> issue on NUMA realloc that I found at the same time. But I wonder
>>>>>> whether all this async_map_status is needed.
>>>>>
>>>>> Thanks for your fix! I can help to review and test the patch later.
>>>>>
>>>>> I add the async_map_status in v2 for compatibility. Some DMA device,
>>>>> like DSA, may use kernel idxd driver only. If there is no device bound to
>>>>> DPDK vfio and kernel vfio module is modprobed to ensure
>>>> rte_vfio_is_enabled() is true,
>>>>> we will unavoidably do DMA map/unmap and it will fail.
>>>>>
>>>>> Therefore, the dma_map_status here is used to filter this situation by
>>>> preventing
>>>>> unnecessary DMA unmap.
>>>>
>>>> Ok, then I think we can just remove the async DMA map.
>>>>
>>>>>>
>>>>>> Indeed, if the only place where we DMA map is in
>>>>>> vhost_user_mmap_region(). If it fails, the error is propagated, the mem
>>>>>> table are freed and NACK is replied to the master. IOW, the device will
>>>>>> be in an unusable state.
>>>>>
>>>>> I agree with you, this is the place I consider right to do DMA map
>>>>> because we also do SW mapping here, any suggestions?
>>>>
>>>> No suggestion, I was just explaining that at the only place where
>>>> DMA map were done, mapping errors were properly handled and
>> propagated.
>>>
>>> What about just setting async_copy to false, and allow switching to sync
>> path.
>>>
>>>>
>>>>>>
>>>>>> Removing the async DMA map will simplify a lot the code, do you agree
>> to
>>>>>> remove it or there is something I missed?
>>>>>
>>>>> See above. Indeed, it adds a lot of code. But we can't know the driver for
>>>>> each device in vhost lib, or we can only restrict the user to bind some
>>>> devices
>>>>> to DPDK vfio if async logic needed.
>>>>
>>>> I would think we don't care if DMA unmap fails, we can just do the same
>>>> as what you do for DMA map, i.e. just ignore the error.
>>>
>>> Get your idea, we can do the same as DMA map, and in this way
>> dma_map_status flag can be removed.
>>>
>>>>
>>>> Thanks to this discussion, I have now more concerns on how it works. I
>>>> think we have a problem here in case of DMA device hotplug, that device
>>>> could miss the necessary map entries from Vhost if no VFIO devices were
>>>> attached at VHST_USER_SET_MEM_TABLE time. How would you handle
>> that
>>>> case?
>>>
>>> DMA device is uncore, so I don't see the hotplug issue here.
>>
>> I'm not sure what 'uncore' is, I suppose you mean your device cannot be
>> physically added/removed to the system.
>
> Yes, we are at the same understanding.
>
>>
>> I was not clear enough in my question. I meant that for example, the
>> application is started and the Vhost port is created. Then, the DMA
>> device is bound to VFIO, and probed by the DPDK application. Finally,
>> the application register the DMA device as an async channel in Vhost.
>>
>> I think it will not work as the SET_MEM_TABLE will already have
>> happened, so the guest memory won't be mapped in the VFIO container.
>
> Assuming the DMA device supports hotplug, this situation was not actually took into consideration, and maybe the handling should be added in app, with an event callback.
>
>>
>> Do you have the same understanding?
>>
>>> I will have another patch containing compatibility with sync path, and
>> async_map_status flag will be removed.
>>> Hope to get your insights.
>>
>> What do you mean by compatibility with sync path?
>
> I mean whatever DMA map succeeds or fails, we return 0 so as not to prevent
> SET_EM_TABLE.
>
>>
>> Thanks for taking care of the async map status removal.
>
> Found a tricky point for removing dma_map_status flag, currently DMA unmap API will directly access the vfio_config,
> And if without dma_map_status, we cannot get rte_errno at this moment.
This is a bug in VFIO API. We always assume VFIO type is valid, but it
is only valid whenever 1) VFIO support is detected at startup, and 2) we
assign at least one device to VFIO. If VFIO was detected but no devices
were assigned, the VFIO type will not be assigned, and the pointer will
be NULL. In this case, we should not attempt any unmaps, because there
are no active devices.
I'll submit a patch to fix this, thanks for catching this issue!
>
> BTW, I will consider this Coverity fix in new patch. :)
>
> Thanks,
> Xuan
>
>>
>> Maxime
>>> Thanks,
>>> Xuan
>>>
>>>>
>>>> Regards,
>>>> Maxime
>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Maxime
>>>>>
>>>
>
--
Thanks,
Anatoly
next prev parent reply other threads:[~2021-10-26 10:59 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2021-10-25 20:33 Maxime Coquelin
2021-10-25 20:47 ` Maxime Coquelin
2021-10-26 2:07 ` Ding, Xuan
2021-10-26 6:52 ` Maxime Coquelin
2021-10-26 8:49 ` Ding, Xuan
2021-10-26 9:49 ` Maxime Coquelin
2021-10-26 10:27 ` Ding, Xuan
2021-10-26 10:58 ` Burakov, Anatoly [this message]
2021-10-26 15:24 ` Burakov, Anatoly
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6714a5cf-b0b9-e1a8-60c4-83904ef4b9b9@intel.com \
--to=anatoly.burakov@intel.com \
--cc=chenbo.xia@intel.com \
--cc=david.marchand@redhat.com \
--cc=dev@dpdk.org \
--cc=jiayu.hu@intel.com \
--cc=maxime.coquelin@redhat.com \
--cc=xuan.ding@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).