From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: Cheng Jiang <cheng1.jiang@intel.com>, maxime.coquelin@redhat.com, Chenbo.Xia@intel.com
Cc: dev@dpdk.org, jiayu.hu@intel.com, yvonnex.yang@intel.com
Subject: Re: [dpdk-dev] [PATCH 1/2] vhost: add unsafe API to drain pkts in async vhost
Date: Mon, 7 Jun 2021 15:46:07 +0200
Message-ID: <65444651-0a8e-3495-8d4e-91453d6b2069@redhat.com>
In-Reply-To: <20210602042802.31943-2-cheng1.jiang@intel.com>
References: <20210602042802.31943-1-cheng1.jiang@intel.com> <20210602042802.31943-2-cheng1.jiang@intel.com>

On 6/2/21 6:28 AM, Cheng Jiang wrote:
> Applications need to stop DMA transfers and finish all the in-flight
> pkts when in VM memory hot-plug case and async vhost is used. This
> patch is to provide an unsafe API to drain in-flight pkts which are
> submitted to DMA engine in vhost async data path. And enable it in
> vhost example.
>
> Signed-off-by: Cheng Jiang <cheng1.jiang@intel.com>
> ---
>  examples/vhost/main.c       | 48 +++++++++++++++++++-
>  examples/vhost/main.h       |  1 +
>  lib/vhost/rte_vhost_async.h | 22 +++++++++
>  lib/vhost/version.map       |  3 ++
>  lib/vhost/virtio_net.c      | 90 +++++++++++++++++++++++++++----------
>  5 files changed, 139 insertions(+), 25 deletions(-)

Please split example and lib changes in dedicated patches.
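For context, here is a minimal sketch of how an application would be
expected to call the new API when a queue must be quiesced (vring
disabled, VM memory hot-unplug). The pkts_inflight counter and the
free_pkts() helper are borrowed from the example app changes below;
they are application-side assumptions, not part of the lib API:

#include <rte_mbuf.h>
#include <rte_vhost_async.h>

/* Application-side mbuf release helper, as in examples/vhost. */
extern void free_pkts(struct rte_mbuf **pkts, uint16_t count);

static void
drain_queue(int vid, uint16_t queue_id, uint16_t pkts_inflight)
{
	/* The array must be large enough for all in-flight packets. */
	struct rte_mbuf *m_cpl[pkts_inflight];
	uint16_t n_pkt;

	/* Thread-unsafe by design: the caller must guarantee that no
	 * other thread processes this virtqueue while draining. */
	n_pkt = rte_vhost_drain_queue_thread_unsafe(vid, queue_id,
			m_cpl, pkts_inflight);
	free_pkts(m_cpl, n_pkt);
}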
>
> diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> index d2179eadb9..70bb67c7f8 100644
> --- a/examples/vhost/main.c
> +++ b/examples/vhost/main.c
> @@ -851,8 +851,11 @@ complete_async_pkts(struct vhost_dev *vdev)
>
>  	complete_count = rte_vhost_poll_enqueue_completed(vdev->vid,
>  					VIRTIO_RXQ, p_cpl, MAX_PKT_BURST);
> -	if (complete_count)
> +	if (complete_count) {
>  		free_pkts(p_cpl, complete_count);
> +		__atomic_sub_fetch(&vdev->pkts_inflight, complete_count, __ATOMIC_SEQ_CST);
> +	}
> +
>  }
>
>  static __rte_always_inline void
> @@ -895,6 +898,7 @@ drain_vhost(struct vhost_dev *vdev)
>  	complete_async_pkts(vdev);
>  	ret = rte_vhost_submit_enqueue_burst(vdev->vid, VIRTIO_RXQ,
>  					m, nr_xmit, m_cpu_cpl, &cpu_cpl_nr);
> +	__atomic_add_fetch(&vdev->pkts_inflight, ret - cpu_cpl_nr, __ATOMIC_SEQ_CST);
>
>  	if (cpu_cpl_nr)
>  		free_pkts(m_cpu_cpl, cpu_cpl_nr);
> @@ -1226,6 +1230,9 @@ drain_eth_rx(struct vhost_dev *vdev)
>  	enqueue_count = rte_vhost_submit_enqueue_burst(vdev->vid,
>  					VIRTIO_RXQ, pkts, rx_count,
>  					m_cpu_cpl, &cpu_cpl_nr);
> +	__atomic_add_fetch(&vdev->pkts_inflight, enqueue_count - cpu_cpl_nr,
> +				__ATOMIC_SEQ_CST);
> +
>  	if (cpu_cpl_nr)
>  		free_pkts(m_cpu_cpl, cpu_cpl_nr);
>
> @@ -1397,8 +1404,15 @@ destroy_device(int vid)
>  		"(%d) device has been removed from data core\n",
>  		vdev->vid);
>
> -	if (async_vhost_driver)
> +	if (async_vhost_driver) {
> +		uint16_t n_pkt = 0;
> +		struct rte_mbuf *m_cpl[vdev->pkts_inflight];
> +		n_pkt = rte_vhost_drain_queue_thread_unsafe(vid, VIRTIO_RXQ, m_cpl,
> +					vdev->pkts_inflight);
> +
> +		free_pkts(m_cpl, n_pkt);
>  		rte_vhost_async_channel_unregister(vid, VIRTIO_RXQ);
> +	}
>
>  	rte_free(vdev);
>  }
> @@ -1487,6 +1501,35 @@ new_device(int vid)
>  	return 0;
>  }
>
> +static int
> +vring_state_changed(int vid, uint16_t queue_id, int enable)
> +{
> +	struct vhost_dev *vdev = NULL;
> +
> +	TAILQ_FOREACH(vdev, &vhost_dev_list, global_vdev_entry) {
> +		if (vdev->vid == vid)
> +			break;
> +	}
> +	if (!vdev)
> +		return -1;
> +
> +	if (queue_id != VIRTIO_RXQ)
> +		return 0;
> +
> +	if (async_vhost_driver) {
> +		if (!enable) {
> +			uint16_t n_pkt;
> +			struct rte_mbuf *m_cpl[vdev->pkts_inflight];
> +
> +			n_pkt = rte_vhost_drain_queue_thread_unsafe(vid, queue_id,
> +						m_cpl, vdev->pkts_inflight);
> +			free_pkts(m_cpl, n_pkt);
> +		}
> +	}
> +
> +	return 0;
> +}
> +
>  /*
>   * These callback allow devices to be added to the data core when configuration
>   * has been fully complete.
> @@ -1495,6 +1538,7 @@ static const struct vhost_device_ops virtio_net_device_ops =
>  {
>  	.new_device = new_device,
>  	.destroy_device = destroy_device,
> +	.vring_state_changed = vring_state_changed,
>  };
>
>  /*
> diff --git a/examples/vhost/main.h b/examples/vhost/main.h
> index 0ccdce4b4a..e7b1ac60a6 100644
> --- a/examples/vhost/main.h
> +++ b/examples/vhost/main.h
> @@ -51,6 +51,7 @@ struct vhost_dev {
>  	uint64_t features;
>  	size_t hdr_len;
>  	uint16_t nr_vrings;
> +	uint16_t pkts_inflight;
>  	struct rte_vhost_memory *mem;
>  	struct device_statistics stats;
>  	TAILQ_ENTRY(vhost_dev) global_vdev_entry;
> diff --git a/lib/vhost/rte_vhost_async.h b/lib/vhost/rte_vhost_async.h
> index 6faa31f5ad..041f40cf04 100644
> --- a/lib/vhost/rte_vhost_async.h
> +++ b/lib/vhost/rte_vhost_async.h
> @@ -193,4 +193,26 @@ __rte_experimental
>  uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
>  		struct rte_mbuf **pkts, uint16_t count);
>
> +/**
> + * This function checks async completion status and empty all pakcets
> + * for a specific vhost device queue. Packets which are inflight will
> + * be returned in an array.
> + *
> + * @note This function does not perform any locking
> + *
> + * @param vid
> + *  id of vhost device to enqueue data
> + * @param queue_id
> + *  queue id to enqueue data
> + * @param pkts
> + *  blank array to get return packet pointer
> + * @param count
> + *  size of the packet array
> + * @return
> + *  num of packets returned
> + */
> +__rte_experimental
> +uint16_t rte_vhost_drain_queue_thread_unsafe(int vid, uint16_t queue_id,
> +		struct rte_mbuf **pkts, uint16_t count);
> +
>  #endif /* _RTE_VHOST_ASYNC_H_ */
> diff --git a/lib/vhost/version.map b/lib/vhost/version.map
> index 9103a23cd4..f480f188af 100644
> --- a/lib/vhost/version.map
> +++ b/lib/vhost/version.map
> @@ -79,4 +79,7 @@ EXPERIMENTAL {
>
>  	# added in 21.05
>  	rte_vhost_get_negotiated_protocol_features;
> +
> +	# added in 21.08
> +	rte_vhost_drain_queue_thread_unsafe;
>  };
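Side note for API consumers: the symbol is exported in the
EXPERIMENTAL section and tagged __rte_experimental, so callers have to
opt in before using it, for instance:

#define ALLOW_EXPERIMENTAL_API
#include <rte_vhost_async.h>

or by building with -DALLOW_EXPERIMENTAL_API; otherwise the compiler
flags every use of the symbol as deprecated.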
> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> index 8da8a86a10..793510974a 100644
> --- a/lib/vhost/virtio_net.c
> +++ b/lib/vhost/virtio_net.c
> @@ -2082,36 +2082,18 @@ write_back_completed_descs_packed(struct vhost_virtqueue *vq,
>  	} while (nr_left > 0);
>  }
>
> -uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
> +static __rte_always_inline uint16_t
> +vhost_poll_enqueue_completed(struct virtio_net *dev, uint16_t queue_id,
>  		struct rte_mbuf **pkts, uint16_t count)
>  {
> -	struct virtio_net *dev = get_device(vid);
>  	struct vhost_virtqueue *vq;
>  	uint16_t n_pkts_cpl = 0, n_pkts_put = 0, n_descs = 0, n_buffers = 0;
>  	uint16_t start_idx, pkts_idx, vq_size;
>  	struct async_inflight_info *pkts_info;
>  	uint16_t from, i;
>
> -	if (!dev)
> -		return 0;
> -
> -	VHOST_LOG_DATA(DEBUG, "(%d) %s\n", dev->vid, __func__);
> -	if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->nr_vring))) {
> -		VHOST_LOG_DATA(ERR, "(%d) %s: invalid virtqueue idx %d.\n",
> -			dev->vid, __func__, queue_id);
> -		return 0;
> -	}
> -
>  	vq = dev->virtqueue[queue_id];
>
> -	if (unlikely(!vq->async_registered)) {
> -		VHOST_LOG_DATA(ERR, "(%d) %s: async not registered for queue id %d.\n",
> -			dev->vid, __func__, queue_id);
> -		return 0;
> -	}
> -
> -	rte_spinlock_lock(&vq->access_lock);
> -
>  	pkts_idx = vq->async_pkts_idx % vq->size;
>  	pkts_info = vq->async_pkts_info;
>  	vq_size = vq->size;
> @@ -2119,14 +2101,14 @@ uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
>  		vq_size, vq->async_pkts_inflight_n);
>
>  	if (count > vq->async_last_pkts_n)
> -		n_pkts_cpl = vq->async_ops.check_completed_copies(vid,
> +		n_pkts_cpl = vq->async_ops.check_completed_copies(dev->vid,
>  			queue_id, 0, count - vq->async_last_pkts_n);
>  	n_pkts_cpl += vq->async_last_pkts_n;
>
>  	n_pkts_put = RTE_MIN(count, n_pkts_cpl);
>  	if (unlikely(n_pkts_put == 0)) {
>  		vq->async_last_pkts_n = n_pkts_cpl;
> -		goto done;
> +		return 0;
>  	}
>
>  	if (vq_is_packed(dev)) {
> @@ -2165,12 +2147,74 @@ uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
>  		vq->last_async_desc_idx_split += n_descs;
>  	}
>
> -done:
> +	return n_pkts_put;
> +}
> +
> +uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,
> +		struct rte_mbuf **pkts, uint16_t count)
> +{
> +	struct virtio_net *dev = get_device(vid);
> +	struct vhost_virtqueue *vq;
> +	uint16_t n_pkts_put = 0;
> +
> +	if (!dev)
> +		return 0;
> +
> +	VHOST_LOG_DATA(DEBUG, "(%d) %s\n", dev->vid, __func__);
> +	if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->nr_vring))) {
> +		VHOST_LOG_DATA(ERR, "(%d) %s: invalid virtqueue idx %d.\n",
> +			dev->vid, __func__, queue_id);
> +		return 0;
> +	}
> +
> +	vq = dev->virtqueue[queue_id];
> +
> +	if (unlikely(!vq->async_registered)) {
> +		VHOST_LOG_DATA(ERR, "(%d) %s: async not registered for queue id %d.\n",
> +			dev->vid, __func__, queue_id);
> +		return 0;
> +	}
> +
> +	rte_spinlock_lock(&vq->access_lock);
> +
> +	n_pkts_put = vhost_poll_enqueue_completed(dev, queue_id, pkts, count);
> +
>  	rte_spinlock_unlock(&vq->access_lock);
>
>  	return n_pkts_put;
>  }
>
> +uint16_t rte_vhost_drain_queue_thread_unsafe(int vid, uint16_t queue_id,
> +		struct rte_mbuf **pkts, uint16_t count)
> +{
> +	struct virtio_net *dev = get_device(vid);
> +	struct vhost_virtqueue *vq;
> +	uint16_t n_pkts = count;
> +
> +	if (!dev)
> +		return 0;
> +
> +	VHOST_LOG_DATA(DEBUG, "(%d) %s\n", dev->vid, __func__);
> +	if (unlikely(!is_valid_virt_queue_idx(queue_id, 0, dev->nr_vring))) {
> +		VHOST_LOG_DATA(ERR, "(%d) %s: invalid virtqueue idx %d.\n",
> +			dev->vid, __func__, queue_id);
> +		return 0;
> +	}
> +
> +	vq = dev->virtqueue[queue_id];
> +
> +	if (unlikely(!vq->async_registered)) {
> +		VHOST_LOG_DATA(ERR, "(%d) %s: async not registered for queue id %d.\n",
> +			dev->vid, __func__, queue_id);
> +		return 0;
> +	}
> +
> +	while (count)
> +		count -= vhost_poll_enqueue_completed(dev, queue_id, pkts, count);
> +
> +	return n_pkts;
> +}
> +
>  static __rte_always_inline uint32_t
>  virtio_dev_rx_async_submit(struct virtio_net *dev, uint16_t queue_id,
>  	struct rte_mbuf **pkts, uint32_t count,
>
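For readers following the virtio_net.c refactoring: the public API is
reduced to validation plus locking around a new internal helper, and
the drain function reuses that same helper without taking the lock. A
condensed toy sketch of the wrapper/helper split (the toy_* names are
invented for illustration, not the library's internals):

#include <stdint.h>
#include <rte_common.h>
#include <rte_spinlock.h>

struct toy_vq {
	rte_spinlock_t access_lock;
	uint16_t inflight;
};

/* Unlocked helper: all real completion work would happen here; the
 * caller is responsible for synchronization. */
static uint16_t
toy_poll_completed(struct toy_vq *vq, uint16_t count)
{
	uint16_t done = RTE_MIN(count, vq->inflight);

	vq->inflight -= done;
	return done;
}

/* Locked wrapper, mirroring rte_vhost_poll_enqueue_completed(): take
 * the per-virtqueue lock, delegate, release. */
static uint16_t
toy_poll_completed_locked(struct toy_vq *vq, uint16_t count)
{
	uint16_t done;

	rte_spinlock_lock(&vq->access_lock);
	done = toy_poll_completed(vq, count);
	rte_spinlock_unlock(&vq->access_lock);
	return done;
}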