From mboxrd@z Thu Jan  1 00:00:00 1970
From: Maxime Coquelin
To: "Xia, Chenbo", "dev@dpdk.org", "david.marchand@redhat.com"
Cc: "stable@dpdk.org"
Subject: Re: [dpdk-dev] [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
Date: Fri, 18 Jun 2021 10:48:18 +0200
References: <20210617153739.178011-1-maxime.coquelin@redhat.com>
 <20210617153739.178011-5-maxime.coquelin@redhat.com>
 <49426b63-82c8-230e-727f-91743657a471@redhat.com>

On 6/18/21 10:21 AM, Xia, Chenbo wrote:
> Hi Maxime,
>
>> -----Original Message-----
>> From: Maxime Coquelin
>> Sent: Friday, June 18, 2021 4:01 PM
>> To: Xia, Chenbo; dev@dpdk.org; david.marchand@redhat.com
>> Cc: stable@dpdk.org
>> Subject: Re: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
>>
>> On 6/18/21 6:34 AM, Xia, Chenbo wrote:
>>> Hi Maxime,
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin
>>>> Sent: Thursday, June 17, 2021 11:38 PM
>>>> To: dev@dpdk.org; david.marchand@redhat.com; Xia, Chenbo
>>>> Cc: Maxime Coquelin; stable@dpdk.org
>>>> Subject: [PATCH v4 4/7] vhost: fix NUMA reallocation with multiqueue
>>>>
>>>> Since the Vhost-user device initialization has been reworked,
>>>> enabling the application to start using the device as soon as
>>>> the first queue pair is ready, NUMA reallocation no longer
>>>> happens on queue pairs other than the first one, since
>>>> numa_realloc() returns early if the device is running.
>>>>
>>>> This patch fixes the issue by only preventing the device
>>>> metadata from being reallocated while the device is running. For
>>>> the virtqueues, a vring state change notification is sent to
>>>> notify the application that the ring is disabled. Since the
>>>> callback is supposed to be blocking, it is safe to reallocate
>>>> the virtqueue afterwards.
>>>
>>> Is there a corner case? numa_realloc() may happen during the
>>> vhost-user set_vring_addr/kick and set_mem_table messages, and
>>> during IOTLB messages. And IOTLB messages do not take the vq
>>> access lock. Could numa_realloc() happen on an IOTLB message
>>> while the app is accessing the vq in the meantime?
>>
>> I think we are safe wrt numa_realloc(), because the app's
>> .vring_state_changed() callback only returns when it is no longer
>> processing the rings.
>
> Yes, I think it should be. But in this IOTLB message case (take the
> vhost PMD for example), can't the vhost PMD still access the vq
> since the vq access lock is not taken? Am I missing something?

The vhost PMD sends RTE_ETH_EVENT_QUEUE_STATE, and my assumption was
that the application would stop processing the rings when handling
this event and only return from the callback once it is done, but it
seems that is not done, at least in testpmd. So we may not rely on
that after all :/.

We cannot rely on the VQ's access lock, since the goal of
numa_realloc() is to reallocate the vhost_virtqueue itself, which
contains the access_lock. Relying on it would cause a use-after-free.

Maybe the safest thing to do is to just skip the reallocation if
vq->ready == true. Having vq->ready == true means we have already
received all the vring info from QEMU, which means the driver has
already initialized the device. It should not change runtime behavior
compared to this patch, since it would not reallocate anyway.

What do you think?
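For clarity, here is a rough, untested sketch of the check I have in
mind. It reuses the numa_realloc() signature from the patch below;
the rest of the function body would stay unchanged:

	/* Sketch only: skip reallocation entirely once the virtqueue
	 * is ready, i.e. all vring info has been received from QEMU
	 * and the driver has already initialized the device. */
	static struct virtio_net *
	numa_realloc(struct virtio_net *dev, int index)
	{
		struct vhost_virtqueue *vq = dev->virtqueue[index];

		if (vq->ready)
			return dev;

		/* ... existing reallocation logic unchanged ... */

		return dev;
	}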
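And to illustrate the blocking contract I was assuming above, here is
a hypothetical application-side callback. MAX_QUEUES and ring_busy
are illustrative names maintained by the application, not part of any
API; the vhost_device_ops struct and
rte_vhost_driver_callback_register() are the usual registration path:

	#include <rte_vhost.h>
	#include <stdatomic.h>
	#include <sched.h>

	#define MAX_QUEUES 128 /* illustrative */

	/* Hypothetical per-ring flag set by the app's datapath. */
	extern atomic_bool ring_busy[MAX_QUEUES];

	static int
	app_vring_state_changed(int vid, uint16_t queue_id, int enable)
	{
		if (!enable) {
			/* Do not return while the datapath may still
			 * be touching the ring: this is what would
			 * make the reallocation safe. */
			while (atomic_load(&ring_busy[queue_id]))
				sched_yield();
		}
		return 0;
	}

	static const struct vhost_device_ops app_ops = {
		.vring_state_changed = app_vring_state_changed,
	};

The ops would then be registered with
rte_vhost_driver_callback_register() as usual.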
> Thanks,
> Chenbo
>
>>
>>
>>> Thanks,
>>> Chenbo
>>>
>>>>
>>>> Fixes: d0fcc38f5fa4 ("vhost: improve device readiness notifications")
>>>> Cc: stable@dpdk.org
>>>>
>>>> Signed-off-by: Maxime Coquelin
>>>> ---
>>>>  lib/vhost/vhost_user.c | 11 ++++++++---
>>>>  1 file changed, 8 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c
>>>> index 0e9e26ebe0..6e7b327ef8 100644
>>>> --- a/lib/vhost/vhost_user.c
>>>> +++ b/lib/vhost/vhost_user.c
>>>> @@ -488,9 +488,6 @@ numa_realloc(struct virtio_net *dev, int index)
>>>>  	struct batch_copy_elem *new_batch_copy_elems;
>>>>  	int ret;
>>>>
>>>> -	if (dev->flags & VIRTIO_DEV_RUNNING)
>>>> -		return dev;
>>>> -
>>>>  	old_dev = dev;
>>>>  	vq = old_vq = dev->virtqueue[index];
>>>>
>>>> @@ -506,6 +503,11 @@ numa_realloc(struct virtio_net *dev, int index)
>>>>  		return dev;
>>>>  	}
>>>>  	if (oldnode != newnode) {
>>>> +		if (vq->ready) {
>>>> +			vq->ready = false;
>>>> +			vhost_user_notify_queue_state(dev, index, 0);
>>>> +		}
>>>> +
>>>>  		VHOST_LOG_CONFIG(INFO,
>>>>  			"reallocate vq from %d to %d node\n", oldnode, newnode);
>>>>  		vq = rte_malloc_socket(NULL, sizeof(*vq), 0, newnode);
>>>> @@ -558,6 +560,9 @@ numa_realloc(struct virtio_net *dev, int index)
>>>>  		rte_free(old_vq);
>>>>  	}
>>>>
>>>> +	if (dev->flags & VIRTIO_DEV_RUNNING)
>>>> +		goto out;
>>>> +
>>>>  	/* check if we need to reallocate dev */
>>>>  	ret = get_mempolicy(&oldnode, NULL, 0, old_dev,
>>>>  				MPOL_F_NODE | MPOL_F_ADDR);
>>>> --
>>>> 2.31.1
>>>
>