From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7FB74A0A02; Thu, 13 May 2021 17:03:56 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id E5FCF4067E; Thu, 13 May 2021 17:03:55 +0200 (CEST) Received: from mga06.intel.com (mga06.intel.com [134.134.136.31]) by mails.dpdk.org (Postfix) with ESMTP id 747D74003F for ; Thu, 13 May 2021 17:03:52 +0200 (CEST) IronPort-SDR: 0Nh4X2lIIg6SSy8GNe9uTk12mFPXbI9MWEUz3yPSzoXVutx4T8it3gDwjmLt/HprXxoxZX8kG9 dstHsO+SRUrg== X-IronPort-AV: E=McAfee;i="6200,9189,9982"; a="261208823" X-IronPort-AV: E=Sophos;i="5.82,296,1613462400"; d="scan'208";a="261208823" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 May 2021 08:03:23 -0700 IronPort-SDR: jIg1pFQHqTuNZHJsU5MYbYxGcp/6hB2EXAkQIaTANIWKAMORPfzcxs4wfcATRajpH6EsQNCaef sTI9BwgL5pxg== X-IronPort-AV: E=Sophos;i="5.82,296,1613462400"; d="scan'208";a="431280316" Received: from fyigit-mobl1.ger.corp.intel.com (HELO [10.213.242.51]) ([10.213.242.51]) by orsmga007-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 13 May 2021 08:03:20 -0700 To: Kevin Traynor , David Marchand , Chenbo Xia , Maxime Coquelin Cc: dev , Pei Zhang , Thomas Monjalon References: <20210513122826.49910-1-chenbo.xia@intel.com> From: Ferruh Yigit X-User: ferruhy Message-ID: <5e9026cf-6d29-8dd4-b41b-9334f48256fa@intel.com> Date: Thu, 13 May 2021 16:03:16 +0100 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH] vhost: fix wrong IOTLB initialization X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 5/13/2021 3:38 PM, Kevin Traynor wrote: > On 13/05/2021 15:11, David Marchand wrote: >> On Thu, May 13, 2021 at 2:38 PM Chenbo Xia wrote: >>> >>> This patch fixes an issue of application crash because of vhost iotlb >>> not initialized when virtio has multiqueue enabled. >>> >>> iotlb messages will be sent when some queues are not enabled. If we >>> initialize iotlb in vhost_user_set_vring_num, it could happen that >>> iotlb update comes when iotlb pool of disabled queues are not >>> initialized. >> >> This makes the problem I reproduced disappear at init, but I noticed >> the segfault after restarting testpmd once. >> And a little bit after this, my vm crashed. >> >> This is not systematic, so I guess there is some condition with how >> the virtio device is initialised in the vm. >> > > Ok, no point in Red Hat QA testing RC3 yet, if it is still faulty. > > fyi - if you want to fix with a new patch it will likely delay Red Hat > QA testing RC3 (maybe others?) and probably they will only have cycles > for one RC3 test run. > > If you choose to revert, we can ask Red Hat QA to test RC3 without > further delay. Please let us know when you consider the options. > If the patch is not good to go as it is I suggest reverting it, as far as I know Chenbo will be off for Friday & Monday, so it doesn't leave much time to update/test a new version. >> >> One question below. >> >> >> Bugzilla ID: 703 >> >>> Fixes: 968bbc7e2e50 ("vhost: avoid IOTLB mempool allocation while IOMMU disabled") >>> >> >> Reported-by: Pei Zhang >> >>> Signed-off-by: Chenbo Xia >>> --- >>> lib/vhost/vhost_user.c | 13 +++++++++---- >>> 1 file changed, 9 insertions(+), 4 deletions(-) >>> >>> diff --git a/lib/vhost/vhost_user.c b/lib/vhost/vhost_user.c >>> index 611ff209e3..ae4df8eb69 100644 >>> --- a/lib/vhost/vhost_user.c >>> +++ b/lib/vhost/vhost_user.c >>> @@ -311,6 +311,7 @@ vhost_user_set_features(struct virtio_net **pdev, struct VhostUserMsg *msg, >>> uint64_t features = msg->payload.u64; >>> uint64_t vhost_features = 0; >>> struct rte_vdpa_device *vdpa_dev; >>> + uint32_t i; >>> >>> if (validate_msg_fds(msg, 0) != 0) >>> return RTE_VHOST_MSG_RESULT_ERR; >>> @@ -389,6 +390,14 @@ vhost_user_set_features(struct virtio_net **pdev, struct VhostUserMsg *msg, >>> vdpa_dev->ops->set_features(dev->vid); >>> >>> dev->flags &= ~VIRTIO_DEV_FEATURES_FAILED; >>> + >>> + if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM)) { >>> + for (i = 0; i < dev->nr_vring; i++) { >> >> I don't know the vhost-user protocol. >> At this point of the device init/life, are we sure nr_vring is set to >> the max number of vring? >> The logs I have tend to say it is the case, but is there a guarantee >> in the protocol? >> >> >> Another way to fix would be to allocate on the first >> VHOST_USER_IOTLB_MSG message received for a vring. >> >> >>> + if (vhost_user_iotlb_init(dev, i)) >>> + return RTE_VHOST_MSG_RESULT_ERR; >>> + } >>> + } >>> + >>> return RTE_VHOST_MSG_RESULT_OK; >>> } >>> >> >> >