Message-ID: <9efc5d7e-1e44-4338-8264-6d544a895d2d@redhat.com>
Date: Wed, 31 Jan 2024 14:32:28 +0100
Subject: Re: [PATCH 1/2] vhost: fix memory leak in Virtio Tx split path
To: David Marchand
Cc: dev@dpdk.org, chenbox@nvidia.com, bnemeth@redhat.com, echaudro@redhat.com, stable@dpdk.org
References: <20240131093113.2208894-1-maxime.coquelin@redhat.com>
From: Maxime Coquelin
List-Id: DPDK patches and discussions

Hi David,

On 1/31/24 14:19, David Marchand wrote:
> On Wed, Jan 31, 2024 at 10:31 AM Maxime Coquelin wrote:
>>
>> When vIOMMU is enabled and Virtio device is bound to kernel
>> driver in guest, rte_vhost_dequeue_burst() will often return
>> early because of IOTLB misses.
>>
>> This patch fixes a mbuf leak occurring in this case.
>>
>> Fixes: 242695f6122a ("vhost: allocate and free packets in bulk in Tx split")
>> Cc: stable@dpdk.org
>>
>> Signed-off-by: Maxime Coquelin
>> ---
>>  lib/vhost/virtio_net.c | 20 ++++++++------------
>>  1 file changed, 8 insertions(+), 12 deletions(-)
>>
>> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
>> index 280d4845f8..db9985c9b9 100644
>> --- a/lib/vhost/virtio_net.c
>> +++ b/lib/vhost/virtio_net.c
>> @@ -3120,11 +3120,8 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>  						VHOST_ACCESS_RO) < 0))
>>  			break;
>>
>> -		update_shadow_used_ring_split(vq, head_idx, 0);
>> -
>>  		if (unlikely(buf_len <= dev->vhost_hlen)) {
>> -			dropped += 1;
>> -			i++;
>> +			dropped = 1;
>>  			break;
>>  		}
>
> ``i`` was used for both filling the returned mbuf array, but also to
> update the shadow_used_idx / array.
>
> So this change here also affects how the currently considered
> descriptor is returned through the used ring.
>
> See below...
>
>>
>> @@ -3143,8 +3140,7 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>  					buf_len, mbuf_pool->name);
>>  				allocerr_warned = true;
>>  			}
>> -			dropped += 1;
>> -			i++;
>> +			dropped = 1;
>>  			break;
>>  		}
>>
>> @@ -3155,17 +3151,17 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>>  				VHOST_DATA_LOG(dev->ifname, ERR, "failed to copy desc to mbuf.");
>>  				allocerr_warned = true;
>>  			}
>> -			dropped += 1;
>> -			i++;
>> +			dropped = 1;
>>  			break;
>>  		}
>>
>> +		update_shadow_used_ring_split(vq, head_idx, 0);
>>  	}
>>
>> -	if (dropped)
>> -		rte_pktmbuf_free_bulk(&pkts[i - 1], count - i + 1);
>> +	if (unlikely(count != i))
>> +		rte_pktmbuf_free_bulk(&pkts[i], count - i);
>>
>> -	vq->last_avail_idx += i;
>> +	vq->last_avail_idx += i + dropped;
>>
>>  	do_data_copy_dequeue(vq);
>>  	if (unlikely(i < count))
>
> ... I am copying the rest of the context:
> 	if (unlikely(i < count))
> 		vq->shadow_used_idx = i;
>
> Before the patch, when breaking and doing the i++ stuff,
> vq->shadow_used_idx was probably already equal to i because
> update_shadow_used_ring_split had been called earlier.
> Because of this, the "dropped" descriptor was part of the shadow_used
> array for returning it through the used ring.
>
> With the patch, since we break without touching i, it means that the
> "dropped" descriptor is not returned anymore.
>
> Fixing this issue could take the form of restoring the call to
> update_shadow_used_ring_split in the loop and adjust
> vq->shadow_used_idx to i + dropped.
>
> But I think we can go one step further, by noting that
> vq->last_avail_idx is being incremented by the same "i + dropped"
> value.
> Meaning that we could simply rely on calls to
> update_shadow_used_ring_split and reuse vq->shadow_used_idx to adjust
> vq->last_avail_idx.
> By doing this, there is no need for a "dropped" variable, and no need
> for touching of vq->shadow_used_idx manually, which is probably better
> for robustness / easing readability.

I fully agree with your suggestion as discussed off-list.
On top of that, it also makes the split code closer to the packed one.

>
> Note: we could also move the call do_data_copy_dequeue() under the
> check on vq->shadow_used_idx != 0, though it won't probably change
> anything.

I will move the call to do_data_copy_dequeue() under
vq->shadow_used_idx != 0 check.

Thanks,
Maxime

>
> This gives the following (untested) diff, on top of your fix:
> diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> index 211d24b36a..4e1d61bd54 100644
> --- a/lib/vhost/virtio_net.c
> +++ b/lib/vhost/virtio_net.c
> @@ -3104,7 +3104,6 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  {
>  	uint16_t i;
>  	uint16_t avail_entries;
> -	uint16_t dropped = 0;
>  	static bool allocerr_warned;
>
>  	/*
> @@ -3141,10 +3140,10 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  						VHOST_ACCESS_RO) < 0))
>  			break;
>
> -		if (unlikely(buf_len <= dev->vhost_hlen)) {
> -			dropped = 1;
> +		update_shadow_used_ring_split(vq, head_idx, 0);
> +
> +		if (unlikely(buf_len <= dev->vhost_hlen))
>  			break;
> -		}
>
>  		buf_len -= dev->vhost_hlen;
>
> @@ -3161,7 +3160,6 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  					buf_len, mbuf_pool->name);
>  				allocerr_warned = true;
>  			}
> -			dropped = 1;
>  			break;
>  		}
>
> @@ -3172,22 +3170,16 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
>  				VHOST_DATA_LOG(dev->ifname, ERR, "failed to copy desc to mbuf.");
>  				allocerr_warned = true;
>  			}
> -			dropped = 1;
>  			break;
>  		}
> -
> -		update_shadow_used_ring_split(vq, head_idx, 0);
>  	}
>
>  	if (unlikely(count != i))
>  		rte_pktmbuf_free_bulk(&pkts[i], count - i);
>
> -	vq->last_avail_idx += i + dropped;
> -
>  	do_data_copy_dequeue(vq);
> -	if (unlikely(i < count))
> -		vq->shadow_used_idx = i;
>  	if (likely(vq->shadow_used_idx)) {
> +		vq->last_avail_idx += vq->shadow_used_idx;
>  		flush_shadow_used_ring_split(dev, vq);
>  		vhost_vring_call_split(dev, vq);
>  	}
>
>
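
To make the accounting discussed above concrete, here is a minimal standalone sketch of the idea: every consumed descriptor is queued in the shadow used array before any check that may break out of the loop, and last_avail_idx is then advanced by shadow_used_idx rather than by "i + dropped". This is not DPDK code. struct toy_vq, toy_update_shadow(), toy_dequeue() and the fail_at parameter are hypothetical names made up for illustration, and the avail ring lookup, IOTLB handling and mbuf management are deliberately left out.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_BURST 8

/* Hypothetical, heavily simplified stand-in for struct vhost_virtqueue. */
struct toy_vq {
	uint16_t last_avail_idx;         /* next avail entry to consume */
	uint16_t shadow_used_idx;        /* descriptors queued for return */
	uint16_t shadow_used[TOY_BURST]; /* head indexes queued for return */
	uint16_t returned;               /* descriptors flushed so far */
};

/* Queue one descriptor for return through the (modelled) used ring. */
static void toy_update_shadow(struct toy_vq *vq, uint16_t head_idx)
{
	vq->shadow_used[vq->shadow_used_idx++] = head_idx;
}

/*
 * Toy dequeue loop. fail_at is the loop index whose packet is dropped;
 * pass a value >= count for a burst with no failure.
 * Returns the number of packets delivered.
 */
static uint16_t toy_dequeue(struct toy_vq *vq, uint16_t count, uint16_t fail_at)
{
	uint16_t i;

	assert(count <= TOY_BURST);

	for (i = 0; i < count; i++) {
		/* Assumption: head index equals avail index in this model. */
		uint16_t head_idx = (uint16_t)(vq->last_avail_idx + i);

		/* Queue the descriptor before any check that may break. */
		toy_update_shadow(vq, head_idx);

		if (i == fail_at)
			break; /* dropped: descriptor consumed, no packet */
	}

	if (vq->shadow_used_idx) {
		/* No "dropped" counter: the shadow index already covers it. */
		vq->last_avail_idx += vq->shadow_used_idx;
		vq->returned += vq->shadow_used_idx; /* models the flush */
		vq->shadow_used_idx = 0;
	}

	return i;
}
```

For instance, starting from a zeroed struct toy_vq, toy_dequeue(&vq, 4, 2) delivers 2 packets but advances last_avail_idx by 3 and returns 3 descriptors, so the dropped descriptor is neither leaked nor read again on the next call.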