DPDK patches and discussions
From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Tiwei Bie <tiwei.bie@intel.com>
Cc: yliu@fridaylinux.org, jianfeng.tan@intel.com,
	jfreimann@redhat.com, ferruh.yigit@intel.com, dev@dpdk.org,
	stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] vhost: fix iotlb pool out-of-memory handling
Date: Fri, 26 Jan 2018 16:23:03 +0100	[thread overview]
Message-ID: <44259095-a0c6-bdad-7e5b-88f0bda369bd@redhat.com> (raw)
In-Reply-To: <20180126080316.xm7gqvozuaasqvfl@debian-xvivbkq.sh.intel.com>

Hi Tiwei,

On 01/26/2018 09:03 AM, Tiwei Bie wrote:
> On Thu, Jan 25, 2018 at 04:06:53PM +0100, Maxime Coquelin wrote:
>> In the unlikely case the IOTLB memory pool runs out of memory,
>> an issue may happen if all entries are used by the IOTLB cache
>> and an IOTLB miss happens. If the IOTLB pending list is empty,
>> then no memory is freed and the allocation fails a second time.
>>
>> This patch fixes this by doing an IOTLB cache random evict if
>> the IOTLB pending list is empty, ensuring the second allocation
>> try will succeed.
>>
>> In the same spirit, the opposite is done when inserting an
>> IOTLB entry in the IOTLB cache fails because the pool is out of
>> memory. In this case, the IOTLB pending list is flushed if the
>> IOTLB cache is empty, to ensure the new entry can be inserted.
>>
>> Fixes: d012d1f293f4 ("vhost: add IOTLB helper functions")
>> Fixes: f72c2ad63aeb ("vhost: add pending IOTLB miss request list and helpers")
>>
> [...]
>> @@ -95,9 +98,11 @@ vhost_user_iotlb_pending_insert(struct vhost_virtqueue *vq,
>>   
>>   	ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
>>   	if (ret) {
>> -		RTE_LOG(INFO, VHOST_CONFIG,
>> -				"IOTLB pool empty, clear pending misses\n");
>> -		vhost_user_iotlb_pending_remove_all(vq);
>> +		RTE_LOG(INFO, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");
>> +		if (!TAILQ_EMPTY(&vq->iotlb_pending_list))
>> +			vhost_user_iotlb_pending_remove_all(vq);
>> +		else
>> +			vhost_user_iotlb_cache_random_evict(vq);
>>   		ret = rte_mempool_get(vq->iotlb_pool, (void **)&node);
>>   		if (ret) {
>>   			RTE_LOG(ERR, VHOST_CONFIG, "IOTLB pool still empty, failure\n");
>> @@ -186,8 +191,11 @@ vhost_user_iotlb_cache_insert(struct vhost_virtqueue *vq, uint64_t iova,
>>   
>>   	ret = rte_mempool_get(vq->iotlb_pool, (void **)&new_node);
>>   	if (ret) {
>> -		RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, evict one entry\n");
>> -		vhost_user_iotlb_cache_random_evict(vq);
>> +		RTE_LOG(DEBUG, VHOST_CONFIG, "IOTLB pool empty, clear entries\n");
> 
> Maybe we should use the same log level for both cases?
> Currently, the first one is INFO, and this one is DEBUG.

Right, I will set DEBUG log level for both.

> 
>> +		if (!TAILQ_EMPTY(&vq->iotlb_list))
>> +			vhost_user_iotlb_cache_random_evict(vq);
>> +		else
>> +			vhost_user_iotlb_pending_remove_all(vq);
> 
> The IOTLB entry will be invalidated explicitly if it's
> unmapped in the frontend. So all the IOTLB entries in the
> IOTLB cache are supposed to be valid. I therefore think it
> may be better to always prefer freeing memory from the
> IOTLB pending list. Something like:
> 
> 	if (TAILQ_EMPTY(&vq->iotlb_list))
> 		vhost_user_iotlb_pending_remove_all(vq);
> 	else if (TAILQ_EMPTY(&vq->iotlb_pending_list))
> 		vhost_user_iotlb_cache_random_evict(vq);
> 	else
> 		vhost_user_iotlb_pending_random_evict(vq);

I agree that all the IOTLB entries in the cache are supposed to
be valid. But removing entries from the pending list first might not
be a good idea: if a pending entry is valid, it means an IOTLB update
is on its way to the backend. Removing it could cause another MISS
request to be sent for the same address if the PMD thread retries the
translation afterwards.
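
To make the resulting preference concrete, the out-of-memory
fallback in vhost_user_iotlb_cache_insert() ends up looking roughly
like this (a sketch assembled from the hunk quoted above, with the
surrounding allocation and insertion code elided):

	ret = rte_mempool_get(vq->iotlb_pool, (void **)&new_node);
	if (ret) {
		RTE_LOG(DEBUG, VHOST_CONFIG,
				"IOTLB pool empty, clear entries\n");
		/*
		 * Prefer evicting a cached entry: cached entries are
		 * valid, but dropping a pending one could trigger a
		 * duplicate MISS request for an update that is already
		 * on its way from the frontend.
		 */
		if (!TAILQ_EMPTY(&vq->iotlb_list))
			vhost_user_iotlb_cache_random_evict(vq);
		else
			vhost_user_iotlb_pending_remove_all(vq);
		ret = rte_mempool_get(vq->iotlb_pool, (void **)&new_node);
	}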

> Besides, in __vhost_iova_to_vva():
> 
> uint64_t
> __vhost_iova_to_vva(struct virtio_net *dev, struct vhost_virtqueue *vq,
> 		    uint64_t iova, uint64_t size, uint8_t perm)
> {
> 	uint64_t vva, tmp_size;
> 
> 	if (unlikely(!size))
> 		return 0;
> 
> 	tmp_size = size;
> 
> 	vva = vhost_user_iotlb_cache_find(vq, iova, &tmp_size, perm);
> 	if (tmp_size == size)
> 		return vva;
> 
> 	if (!vhost_user_iotlb_pending_miss(vq, iova + tmp_size, perm)) {
> 		/*
> 		 * iotlb_lock is read-locked for a full burst,
> 		 * but it only protects the iotlb cache.
> 		 * In case of IOTLB miss, we might block on the socket,
> 		 * which could cause a deadlock with QEMU if an IOTLB update
> 		 * is being handled. We can safely unlock here to avoid it.
> 		 */
> 		vhost_user_iotlb_rd_unlock(vq);
> 
> 		vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
> 		vhost_user_iotlb_miss(dev, iova + tmp_size, perm);
> 
> The vhost_user_iotlb_miss() may fail. If it fails,
> the inserted pending entry should be removed from
> the pending list.

Right, thanks for spotting this.
I will fix this in a separate patch.
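
Something like this, I would expect (an untested sketch; it assumes
the existing vhost_user_iotlb_pending_remove() helper, and the
actual patch may of course differ):

	vhost_user_iotlb_pending_insert(vq, iova + tmp_size, perm);
	if (vhost_user_iotlb_miss(dev, iova + tmp_size, perm)) {
		RTE_LOG(ERR, VHOST_CONFIG,
				"IOTLB miss request failed for IOVA 0x%" PRIx64 "\n",
				iova + tmp_size);
		/*
		 * Drop the pending entry we just inserted, so that a
		 * later retry can send a new MISS request.
		 */
		vhost_user_iotlb_pending_remove(vq, iova + tmp_size,
				1, perm);
	}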

Regards,
Maxime
> Best regards,
> Tiwei Bie
> 

Thread overview: 4+ messages
2018-01-25 15:06 Maxime Coquelin
2018-01-26  8:03 ` Tiwei Bie
2018-01-26 15:23   ` Maxime Coquelin [this message]
2018-01-26 12:40 ` Jens Freimann
