From: Jason Wang <jasowang@redhat.com>
To: Maxime Coquelin <maxime.coquelin@redhat.com>, dev@dpdk.org, chenbo.xia@intel.com, amorenoz@redhat.com
Cc: stable@dpdk.org
Date: Wed, 25 Nov 2020 10:18:01 +0800
Message-ID: <5e17e6d2-efe0-2781-00a5-6c5b9f185f19@redhat.com>
In-Reply-To: <20201124124557.140048-1-maxime.coquelin@redhat.com>
References: <20201124124557.140048-1-maxime.coquelin@redhat.com>
Subject: Re: [dpdk-dev] [PATCH 21.02 v1] net/virtio: fix memory init with vDPA backend

On 2020/11/24 8:45 PM, Maxime Coquelin wrote:
> This patch fixes an overhead seen with the mlx5-vdpa kernel
> driver, where for every page in the mapped area, all the
> memory tables get updated. For example, with 2MB hugepages,
> a single IOTLB_UPDATE for a 1GB region causes 512 memory
> updates on the mlx5-vdpa side.
>
> Using batching mode, the mlx5 driver will only trigger a
> single memory update for all the IOTLB updates that happen
> between the batch begin and batch end commands.
>
> Fixes: 6b901437056e ("net/virtio: introduce vhost-vDPA backend")
> Cc: stable@dpdk.org
>
> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> ---
>  drivers/net/virtio/virtio_user/vhost_vdpa.c | 90 +++++++++++++++++++--
>  1 file changed, 85 insertions(+), 5 deletions(-)

Hi Maxime:

To be safe, it's better to check whether or not the kernel supports
the batching flag. This could be done by using
VHOST_GET_BACKEND_FEATURES to test whether VHOST_BACKEND_F_IOTLB_BATCH
is there (a minimal sketch follows the quoted diff below).

Thanks

>
> diff --git a/drivers/net/virtio/virtio_user/vhost_vdpa.c b/drivers/net/virtio/virtio_user/vhost_vdpa.c
> index c7b9349fc8..6d0200516d 100644
> --- a/drivers/net/virtio/virtio_user/vhost_vdpa.c
> +++ b/drivers/net/virtio/virtio_user/vhost_vdpa.c
> @@ -66,6 +66,8 @@ struct vhost_iotlb_msg {
>  #define VHOST_IOTLB_UPDATE         2
>  #define VHOST_IOTLB_INVALIDATE     3
>  #define VHOST_IOTLB_ACCESS_FAIL    4
> +#define VHOST_IOTLB_BATCH_BEGIN    5
> +#define VHOST_IOTLB_BATCH_END      6
>  	uint8_t type;
>  };
>
> @@ -80,6 +82,40 @@ struct vhost_msg {
>  	};
>  };
>
> +static int
> +vhost_vdpa_iotlb_batch_begin(struct virtio_user_dev *dev)
> +{
> +	struct vhost_msg msg = {};
> +
> +	msg.type = VHOST_IOTLB_MSG_V2;
> +	msg.iotlb.type = VHOST_IOTLB_BATCH_BEGIN;
> +
> +	if (write(dev->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
> +		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch begin (%s)",
> +				strerror(errno));
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
> +static int
> +vhost_vdpa_iotlb_batch_end(struct virtio_user_dev *dev)
> +{
> +	struct vhost_msg msg = {};
> +
> +	msg.type = VHOST_IOTLB_MSG_V2;
> +	msg.iotlb.type = VHOST_IOTLB_BATCH_END;
> +
> +	if (write(dev->vhostfd, &msg, sizeof(msg)) != sizeof(msg)) {
> +		PMD_DRV_LOG(ERR, "Failed to send IOTLB batch end (%s)",
> +				strerror(errno));
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
>  static int
>  vhost_vdpa_dma_map(struct virtio_user_dev *dev, void *addr,
>  				  uint64_t iova, size_t len)
> @@ -122,6 +158,39 @@ vhost_vdpa_dma_unmap(struct virtio_user_dev *dev, __rte_unused void *addr,
>  	return 0;
>  }
>
> +static int
> +vhost_vdpa_dma_map_batch(struct virtio_user_dev *dev, void *addr,
> +				  uint64_t iova, size_t len)
> +{
> +	int ret;
> +
> +	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
> +		return -1;
> +
> +	ret = vhost_vdpa_dma_map(dev, addr, iova, len);
> +
> +	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
> +		return -1;
> +
> +	return ret;
> +}
> +
> +static int
> +vhost_vdpa_dma_unmap_batch(struct virtio_user_dev *dev, void *addr,
> +				  uint64_t iova, size_t len)
> +{
> +	int ret;
> +
> +	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
> +		return -1;
> +
> +	ret = vhost_vdpa_dma_unmap(dev, addr, iova, len);
> +
> +	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
> +		return -1;
> +
> +	return ret;
> +}
> +
>  static int
>  vhost_vdpa_map_contig(const struct rte_memseg_list *msl,
> @@ -159,21 +228,32 @@ vhost_vdpa_map(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
>  static int
>  vhost_vdpa_dma_map_all(struct virtio_user_dev *dev)
>  {
> +	int ret;
> +
> +	if (vhost_vdpa_iotlb_batch_begin(dev) < 0)
> +		return -1;
> +
>  	vhost_vdpa_dma_unmap(dev, NULL, 0, SIZE_MAX);
>
>  	if (rte_eal_iova_mode() == RTE_IOVA_VA) {
>  		/* with IOVA as VA mode, we can get away with mapping contiguous
>  		 * chunks rather than going page-by-page.
>  		 */
> -		int ret = rte_memseg_contig_walk_thread_unsafe(
> +		ret = rte_memseg_contig_walk_thread_unsafe(
>  				vhost_vdpa_map_contig, dev);
>  		if (ret)
> -			return ret;
> +			goto batch_end;
>  		/* we have to continue the walk because we've skipped the
>  		 * external segments during the config walk.
>  		 */
>  	}
> -	return rte_memseg_walk_thread_unsafe(vhost_vdpa_map, dev);
> +	ret = rte_memseg_walk_thread_unsafe(vhost_vdpa_map, dev);
> +
> +batch_end:
> +	if (vhost_vdpa_iotlb_batch_end(dev) < 0)
> +		return -1;
> +
> +	return ret;
>  }
>
>  /* with below features, vhost vdpa does not need to do the checksum and TSO,
> @@ -293,6 +373,6 @@ struct virtio_user_backend_ops virtio_ops_vdpa = {
>  	.setup = vhost_vdpa_setup,
>  	.send_request = vhost_vdpa_ioctl,
>  	.enable_qp = vhost_vdpa_enable_queue_pair,
> -	.dma_map = vhost_vdpa_dma_map,
> -	.dma_unmap = vhost_vdpa_dma_unmap,
> +	.dma_map = vhost_vdpa_dma_map_batch,
> +	.dma_unmap = vhost_vdpa_dma_unmap_batch,
>  };
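
For illustration, below is a minimal sketch of the feature check
suggested above. It assumes the VHOST_GET_BACKEND_FEATURES and
VHOST_SET_BACKEND_FEATURES ioctls from linux/vhost.h; the helper name
vhost_vdpa_negotiate_iotlb_batch and the exact negotiation policy are
illustrative, not part of this patch:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/vhost.h>	/* VHOST_GET/SET_BACKEND_FEATURES */

/* Bit numbers from linux/vhost_types.h, defined locally in case the
 * build system's kernel headers predate them.
 */
#ifndef VHOST_BACKEND_F_IOTLB_MSG_V2
#define VHOST_BACKEND_F_IOTLB_MSG_V2	0x1
#endif
#ifndef VHOST_BACKEND_F_IOTLB_BATCH
#define VHOST_BACKEND_F_IOTLB_BATCH	0x2
#endif

/* Probe and acknowledge IOTLB batching support on a vhost-vDPA fd.
 * Returns 1 if batching was negotiated, 0 if the kernel does not
 * offer it (caller falls back to unbatched updates), -1 on error.
 */
static int
vhost_vdpa_negotiate_iotlb_batch(int vhostfd)
{
	uint64_t features;

	if (ioctl(vhostfd, VHOST_GET_BACKEND_FEATURES, &features) < 0)
		return -1;

	if (!(features & (1ULL << VHOST_BACKEND_F_IOTLB_BATCH)))
		return 0;

	/* Only ack the bits this backend actually uses. */
	features = (1ULL << VHOST_BACKEND_F_IOTLB_MSG_V2) |
		   (1ULL << VHOST_BACKEND_F_IOTLB_BATCH);
	if (ioctl(vhostfd, VHOST_SET_BACKEND_FEATURES, &features) < 0)
		return -1;

	return 1;
}

With such a check in place, vhost_vdpa_iotlb_batch_begin() and
vhost_vdpa_iotlb_batch_end() could simply return 0 when the flag was
not negotiated, so the batched map/unmap paths above would keep
working on kernels without VHOST_BACKEND_F_IOTLB_BATCH.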