From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 0F82CA0540;
	Wed, 6 Jul 2022 14:32:19 +0200 (CEST)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id A3E8940A7F;
	Wed, 6 Jul 2022 14:32:18 +0200 (CEST)
Received: from us-smtp-delivery-124.mimecast.com
	(us-smtp-delivery-124.mimecast.com [170.10.129.124])
	by mails.dpdk.org (Postfix) with ESMTP id EC39D40691
	for ; Wed, 6 Jul 2022 14:32:16 +0200 (CEST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1657110736;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	to:to:cc:cc:mime-version:mime-version:content-type:content-type:
	content-transfer-encoding:content-transfer-encoding:
	in-reply-to:in-reply-to:references:references;
	bh=ZI2YpgsHqJER59f2DZ3+x3x6sniQ+F/emjZ8BCpF4kI=;
	b=GZPt+IbXx2pHdM+51Y/fjeIT4tS0pJ0xfHceRjVpaWcY4XuQyOiGXUcXo8yiSqiACSSZ5v
	g6aHJk+1/PkGO/B5s8qDnsyh9QpztmVGyY1vWFxbX+5GXLz3gtlrTIWbU8I3DDSN6uYAiW
	KvXCV5NdgO9+0kpBx1IblAIvY2AwvyY=
Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com
	[66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS
	(version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384)
	id us-mta-391-AdeN0I_YOQ6VSMJt5uiiTQ-1; Wed, 06 Jul 2022 08:32:13 -0400
X-MC-Unique: AdeN0I_YOQ6VSMJt5uiiTQ-1
Received: from smtp.corp.redhat.com
	(int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mimecast-mx02.redhat.com (Postfix) with ESMTPS id CAF2F811E87;
	Wed, 6 Jul 2022 12:32:12 +0000 (UTC)
Received: from [10.39.208.34] (unknown [10.39.208.34])
	by smtp.corp.redhat.com (Postfix) with ESMTPS id AD68BC3598B;
	Wed, 6 Jul 2022 12:32:11 +0000 (UTC)
Message-ID: <0a1f7f0a-9522-8ebb-cb2a-9652251c70a6@redhat.com>
Date: Wed, 6 Jul 2022 14:32:10 +0200
MIME-Version: 1.0
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101
	Thunderbird/91.11.0
Subject: Re: [PATCH] vdpa/sfc: resolve race between libvhost and dev_conf
To: abhimanyu.saini@xilinx.com, dev@dpdk.org
Cc: chenbo.xia@intel.com, andrew.rybchenko@oktetlabs.ru, Abhimanyu Saini
References: <20220706092401.36815-1-asaini@xilinx.com>
From: Maxime Coquelin
In-Reply-To: <20220706092401.36815-1-asaini@xilinx.com>
X-Scanned-By: MIMEDefang 2.85 on 10.11.54.8
Authentication-Results: relay.mimecast.com; auth=pass
	smtp.auth=CUSA124A263 smtp.mailfrom=maxime.coquelin@redhat.com
X-Mimecast-Spam-Score: 0
X-Mimecast-Originator: redhat.com
Content-Language: en-US
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

Hi Abhimanyu,

On 7/6/22 11:24, abhimanyu.saini@xilinx.com wrote:
> From: Abhimanyu Saini
>
> libvhost calls dev_conf() before processing the
> VHOST_USER_SET_VRING_CALL message for the last VQ. So
> this message is processed after dev_conf() returns.
>
> However, the dev_conf() function spawns a thread to set
> rte_vhost_host_notifier_ctrl() before returning control to
> libvhost. This parallel thread in turn invokes get_notify_area().
> To get the notify_area, the vdpa driver needs to query the HW and
> for this query it needs an enabled VQ.
>
> But at the same time libvhost is processing the last
> VHOST_USER_SET_VRING_CALL, and to do that it disables the last VQ.
>
> Hence there is a race b/w the libvhost and the vdpa driver.
>
> To resolve this race condition, query the HW and cache notify_area
> inside dev_conf() instead of doing it in the parallel thread.
>
> Signed-off-by: Abhimanyu Saini
> ---
>   drivers/vdpa/sfc/sfc_vdpa_ops.c | 36 ++++++++++++++++++------------------
>   drivers/vdpa/sfc/sfc_vdpa_ops.h |  1 +
>   2 files changed, 19 insertions(+), 18 deletions(-)

We are really late in the v22.07 release cycle.
Does this issue reproduce easily, i.e. is it a blocker if not applied
in v22.07?

How confident are you about this fix? If we take it in -rc4 and it
introduces a regression, we might not be able to fix it in time for
the final v22.07.

Also, it is missing the Fixes tag, and stable is not CC'ed (the driver
was introduced in v21.11).

Regards,
Maxime

> diff --git a/drivers/vdpa/sfc/sfc_vdpa_ops.c b/drivers/vdpa/sfc/sfc_vdpa_ops.c
> index 63aa52d..b84699d 100644
> --- a/drivers/vdpa/sfc/sfc_vdpa_ops.c
> +++ b/drivers/vdpa/sfc/sfc_vdpa_ops.c
> @@ -222,6 +222,7 @@
>  sfc_vdpa_virtq_start(struct sfc_vdpa_ops_data *ops_data, int vq_num)
>  {
>  	int rc;
> +	uint32_t doorbell;
>  	efx_virtio_vq_t *vq;
>  	struct sfc_vdpa_vring_info vring;
>  	efx_virtio_vq_cfg_t vq_cfg;
> @@ -270,22 +271,35 @@
>  	/* Start virtqueue */
>  	rc = efx_virtio_qstart(vq, &vq_cfg, &vq_dyncfg);
>  	if (rc != 0) {
> -		/* destroy virtqueue */
>  		sfc_vdpa_err(ops_data->dev_handle,
>  			     "virtqueue start failed: %s",
>  			     rte_strerror(rc));
> -		efx_virtio_qdestroy(vq);
>  		goto fail_virtio_qstart;
>  	}
>
>  	sfc_vdpa_info(ops_data->dev_handle,
>  		      "virtqueue started successfully for vq_num %d", vq_num);
>
> +	rc = efx_virtio_get_doorbell_offset(vq, &doorbell);
> +	if (rc != 0) {
> +		sfc_vdpa_err(ops_data->dev_handle,
> +			     "failed to get doorbell offset: %s",
> +			     rte_strerror(rc));
> +		goto fail_doorbell;
> +	}
> +
> +	/*
> +	 * Cache the bar_offset here for each VQ here, it will come
> +	 * in handy when sfc_vdpa_get_notify_area() is invoked.
> +	 */
> +	ops_data->vq_cxt[vq_num].doorbell = (void *)(uintptr_t)doorbell;
>  	ops_data->vq_cxt[vq_num].enable = B_TRUE;
>
>  	return rc;
>
> +fail_doorbell:
>  fail_virtio_qstart:
> +	efx_virtio_qdestroy(vq);
>  fail_vring_info:
>  	return rc;
>  }
> @@ -792,8 +806,6 @@
>  	int ret;
>  	efx_nic_t *nic;
>  	int vfio_dev_fd;
> -	efx_rc_t rc;
> -	unsigned int bar_offset;
>  	volatile void *doorbell;
>  	struct rte_pci_device *pci_dev;
>  	struct rte_vdpa_device *vdpa_dev;
> @@ -824,19 +836,6 @@
>  		return -1;
>  	}
>
> -	if (ops_data->vq_cxt[qid].enable != B_TRUE) {
> -		sfc_vdpa_err(dev, "vq is not enabled");
> -		return -1;
> -	}
> -
> -	rc = efx_virtio_get_doorbell_offset(ops_data->vq_cxt[qid].vq,
> -					    &bar_offset);
> -	if (rc != 0) {
> -		sfc_vdpa_err(dev, "failed to get doorbell offset: %s",
> -			     rte_strerror(rc));
> -		return rc;
> -	}
> -
>  	reg.index = sfc_vdpa_adapter_by_dev_handle(dev)->mem_bar.esb_rid;
>  	ret = ioctl(vfio_dev_fd, VFIO_DEVICE_GET_REGION_INFO, &reg);
>  	if (ret != 0) {
> @@ -845,7 +844,8 @@
>  		return ret;
>  	}
>
> -	*offset = reg.offset + bar_offset;
> +	/* Use bar_offset that was cached during sfc_vdpa_virtq_start() */
> +	*offset = reg.offset + (uint64_t)ops_data->vq_cxt[qid].doorbell;
>
>  	len = (1U << encp->enc_vi_window_shift) / 2;
>  	if (len >= sysconf(_SC_PAGESIZE)) {
> diff --git a/drivers/vdpa/sfc/sfc_vdpa_ops.h b/drivers/vdpa/sfc/sfc_vdpa_ops.h
> index 6d790fd..9dbd5b8 100644
> --- a/drivers/vdpa/sfc/sfc_vdpa_ops.h
> +++ b/drivers/vdpa/sfc/sfc_vdpa_ops.h
> @@ -35,6 +35,7 @@ struct sfc_vdpa_vring_info {
>  };
>
>  typedef struct sfc_vdpa_vq_context_s {
> +	volatile void *doorbell;
>  	uint8_t enable;
>  	uint32_t pidx;
>  	uint32_t cidx;
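
For readers following the thread, the idea behind the patch (query the HW once while the VQ is known to be enabled, cache the result, and have the later lookup read only the cache) can be sketched in isolation. This is a simplified, single-threaded illustration and not the driver code: struct vq_ctx, hw_get_doorbell_offset(), vq_start(), get_notify_offset(), MAX_VQ and the offset layout are all hypothetical stand-ins, and no real efx or vhost API is used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_VQ 4 /* hypothetical queue count for this sketch */

/* Hypothetical per-queue context, loosely modelled on the patch. */
struct vq_ctx {
	bool enabled;       /* may be toggled by the vhost side at any time */
	bool offset_valid;  /* true once the doorbell offset is cached */
	uint32_t db_offset; /* doorbell offset cached at queue start */
};

static struct vq_ctx vqs[MAX_VQ];

/* Stand-in for the HW query: it only succeeds on an enabled queue. */
static int
hw_get_doorbell_offset(const struct vq_ctx *vq, unsigned int qid,
		       uint32_t *off)
{
	if (!vq->enabled)
		return -1;
	*off = 0x1000u * (qid + 1); /* made-up layout */
	return 0;
}

/* Start a queue and cache its doorbell offset while it is enabled. */
static int
vq_start(unsigned int qid)
{
	uint32_t off;

	assert(qid < MAX_VQ);
	vqs[qid].enabled = true;
	if (hw_get_doorbell_offset(&vqs[qid], qid, &off) != 0) {
		vqs[qid].enabled = false;
		return -1;
	}
	vqs[qid].db_offset = off;
	vqs[qid].offset_valid = true;
	return 0;
}

/*
 * Later lookup: reads only the cached value, so it no longer depends
 * on whether the queue happens to be enabled at this moment.
 */
static int
get_notify_offset(unsigned int qid, uint64_t region_base, uint64_t *offset)
{
	assert(qid < MAX_VQ);
	if (!vqs[qid].offset_valid)
		return -1;
	*offset = region_base + vqs[qid].db_offset;
	return 0;
}
```

In this sketch, calling vq_start(1), then clearing vqs[1].enabled (as the vhost side might while handling a late VHOST_USER_SET_VRING_CALL), still lets get_notify_offset(1, ...) succeed, which is the property the patch aims for. Whether the cached field needs extra synchronization in the real driver is not addressed here.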