From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 2067D43A1E; Wed, 31 Jan 2024 14:20:13 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id A002E402A1; Wed, 31 Jan 2024 14:20:12 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 42A8C40270 for ; Wed, 31 Jan 2024 14:20:11 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1706707210; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fgRreWgW6q2iR/2DloVaZyrVnZetmVVFGeNZNROj9gc=; b=g/DPAIZPa8kG1zyRJhwePJ0/keSPKLedEzHIvS5H5HkN4qv2hqNNjVobPe/LRKjQvw2jF6 PV0BKMTyp27OryvMsdZf1KYn25Mu8C/Fh5eNM8PCnJlvhlWNPBpnmLcnppuEFRQ++VIM8I 3EI/kXpbYaLbwPpu+H5LRlXlH66Ry7Q= Received: from mail-lj1-f198.google.com (mail-lj1-f198.google.com [209.85.208.198]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-587-TbLSFrnRP6yx_h_edj8K3Q-1; Wed, 31 Jan 2024 08:20:09 -0500 X-MC-Unique: TbLSFrnRP6yx_h_edj8K3Q-1 Received: by mail-lj1-f198.google.com with SMTP id 38308e7fff4ca-2cf4166eb82so37609471fa.0 for ; Wed, 31 Jan 2024 05:20:09 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1706707208; x=1707312008; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fgRreWgW6q2iR/2DloVaZyrVnZetmVVFGeNZNROj9gc=; b=j+IDJzTO+dZLAGBPXK5QVzyys4zZSuPcQlgefQ6ZbkpRIsRB4oiP/rIAsmHklYqqOb 8zRGqkHT/AeNJ5wuZcDJYiVCcfkjnwUiohy7jFwXgq8eC6aKWpHz/gmvDlomtAhMuRtD CQNZQ+k1AzTLIJDfprpcbrA+hViUP+olb3/6athVYToWZ7XREYgBSWHsTDKZGMxoDoLw o+2JCMhJ1k7ZWuu2o2Fg+pc3yHyrwNR/1gLG8SdX6WQTcejudi3V8xxn06nhPyEw2S85 MesNzNWcHTuFFVKAltZYOwAXOsAIwKjtRkCTgSFaRJweA6JRqWFc3UiEbpswMJ/MM4WY XYTA== X-Gm-Message-State: AOJu0YxcwRkKIvDH3V0SDwLGEr/McV+QeL/GqalzB/z6lnkdR4KF1ees uB4L79kQFUZ/X1/nsRMXSTtU8wZ/mPq2IH5KJorr5i41iXDJiSiRihZAtv43rOchmXlnmlH+Anx dJ+D4elDR5gyMoXaPqQRKCUh09cqGbx6CrE1ZJjxv194VxEta/q/nAYF/nV750g1Kw7fmu7IMi+ MLyrkK5N5fVE5gKZk= X-Received: by 2002:a2e:988e:0:b0:2d0:4c1c:4c10 with SMTP id b14-20020a2e988e000000b002d04c1c4c10mr1000089ljj.53.1706707208003; Wed, 31 Jan 2024 05:20:08 -0800 (PST) X-Google-Smtp-Source: AGHT+IFhX/RJ9TVaRNhL+2WYDt1HWMByhIG9m1D4vs/T1iBhCO/2MnH6BK9Y4SbdZYjnfdeH1osL229UmEYMUhZcLWk= X-Received: by 2002:a2e:988e:0:b0:2d0:4c1c:4c10 with SMTP id b14-20020a2e988e000000b002d04c1c4c10mr1000076ljj.53.1706707207535; Wed, 31 Jan 2024 05:20:07 -0800 (PST) MIME-Version: 1.0 References: <20240131093113.2208894-1-maxime.coquelin@redhat.com> In-Reply-To: <20240131093113.2208894-1-maxime.coquelin@redhat.com> From: David Marchand Date: Wed, 31 Jan 2024 14:19:55 +0100 Message-ID: Subject: Re: [PATCH 1/2] vhost: fix memory leak in Virtio Tx split path To: Maxime Coquelin Cc: dev@dpdk.org, chenbox@nvidia.com, bnemeth@redhat.com, echaudro@redhat.com, stable@dpdk.org X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On Wed, Jan 31, 2024 at 10:31=E2=80=AFAM Maxime Coquelin wrote: > > When vIOMMU is enabled and Virtio device is bound to kernel > driver in guest, rte_vhost_dequeue_burst() will often return > early because of IOTLB misses. > > This patch fixes a mbuf leak occurring in this case. > > Fixes: 242695f6122a ("vhost: allocate and free packets in bulk in Tx spli= t") > Cc: stable@dpdk.org > > Signed-off-by: Maxime Coquelin > --- > lib/vhost/virtio_net.c | 20 ++++++++------------ > 1 file changed, 8 insertions(+), 12 deletions(-) > > diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c > index 280d4845f8..db9985c9b9 100644 > --- a/lib/vhost/virtio_net.c > +++ b/lib/vhost/virtio_net.c > @@ -3120,11 +3120,8 @@ virtio_dev_tx_split(struct virtio_net *dev, struct= vhost_virtqueue *vq, > VHOST_ACCESS_RO) < 0)) > break; > > - c(vq, head_idx, 0); > - > if (unlikely(buf_len <=3D dev->vhost_hlen)) { > - dropped +=3D 1; > - i++; > + dropped =3D 1; > break; > } ``i`` was used for both filling the returned mbuf array, but also to update the shadow_used_idx / array. So this change here also affects how the currently considered descriptor is returned through the used ring. See below... > > @@ -3143,8 +3140,7 @@ virtio_dev_tx_split(struct virtio_net *dev, struct = vhost_virtqueue *vq, > buf_len, mbuf_pool->name); > allocerr_warned =3D true; > } > - dropped +=3D 1; > - i++; > + dropped =3D 1; > break; > } > > @@ -3155,17 +3151,17 @@ virtio_dev_tx_split(struct virtio_net *dev, struc= t vhost_virtqueue *vq, > VHOST_DATA_LOG(dev->ifname, ERR, "failed = to copy desc to mbuf."); > allocerr_warned =3D true; > } > - dropped +=3D 1; > - i++; > + dropped =3D 1; > break; > } > > + update_shadow_used_ring_split(vq, head_idx, 0); > } > > - if (dropped) > - rte_pktmbuf_free_bulk(&pkts[i - 1], count - i + 1); > + if (unlikely(count !=3D i)) > + rte_pktmbuf_free_bulk(&pkts[i], count - i); > > - vq->last_avail_idx +=3D i; > + vq->last_avail_idx +=3D i + dropped; > > do_data_copy_dequeue(vq); > if (unlikely(i < count)) ... I am copying the rest of the context: if (unlikely(i < count)) vq->shadow_used_idx =3D i; Before the patch, when breaking and doing the i++ stuff, vq->shadow_used_idx was probably already equal to i because update_shadow_used_ring_split had been called earlier. Because of this, the "dropped" descriptor was part of the shadow_used array for returning it through the used ring. With the patch, since we break without touching i, it means that the "dropped" descriptor is not returned anymore. Fixing this issue could take the form of restoring the call to update_shadow_used_ring_split in the loop and adjust vq->shadow_used_idx to i + dropped. But I think we can go one step further, by noting that vq->last_avail_idx is being incremented by the same "i + dropped" value. Meaning that we could simply rely on calls to update_shadow_used_ring_split and reuse vq->shadow_used_idx to adjust vq->last_avail_idx. By doing this, there is no need for a "dropped" variable, and no need for touching of vq->shadow_used_idx manually, which is probably better for robustness / easing readability. Note: we could also move the call do_data_copy_dequeue() under the check on vq->shadow_used_idx !=3D 0, though it won't probably change anything. This gives the following (untested) diff, on top of your fix: diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index 211d24b36a..4e1d61bd54 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -3104,7 +3104,6 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, { uint16_t i; uint16_t avail_entries; - uint16_t dropped =3D 0; static bool allocerr_warned; /* @@ -3141,10 +3140,10 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, VHOST_ACCESS_RO) < 0)) break; - if (unlikely(buf_len <=3D dev->vhost_hlen)) { - dropped =3D 1; + update_shadow_used_ring_split(vq, head_idx, 0); + + if (unlikely(buf_len <=3D dev->vhost_hlen)) break; - } buf_len -=3D dev->vhost_hlen; @@ -3161,7 +3160,6 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, buf_len, mbuf_pool->name); allocerr_warned =3D true; } - dropped =3D 1; break; } @@ -3172,22 +3170,16 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, VHOST_DATA_LOG(dev->ifname, ERR, "failed to copy desc to mbuf."); allocerr_warned =3D true; } - dropped =3D 1; break; } - - update_shadow_used_ring_split(vq, head_idx, 0); } if (unlikely(count !=3D i)) rte_pktmbuf_free_bulk(&pkts[i], count - i); - vq->last_avail_idx +=3D i + dropped; - do_data_copy_dequeue(vq); - if (unlikely(i < count)) - vq->shadow_used_idx =3D i; if (likely(vq->shadow_used_idx)) { + vq->last_avail_idx +=3D vq->shadow_used_idx; flush_shadow_used_ring_split(dev, vq); vhost_vring_call_split(dev, vq); } --=20 David Marchand