From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D521543A1F for ; Wed, 31 Jan 2024 14:20:13 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id C69B3402D6; Wed, 31 Jan 2024 14:20:13 +0100 (CET) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id 60C22402A1 for ; Wed, 31 Jan 2024 14:20:11 +0100 (CET) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1706707210; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fgRreWgW6q2iR/2DloVaZyrVnZetmVVFGeNZNROj9gc=; b=g/DPAIZPa8kG1zyRJhwePJ0/keSPKLedEzHIvS5H5HkN4qv2hqNNjVobPe/LRKjQvw2jF6 PV0BKMTyp27OryvMsdZf1KYn25Mu8C/Fh5eNM8PCnJlvhlWNPBpnmLcnppuEFRQ++VIM8I 3EI/kXpbYaLbwPpu+H5LRlXlH66Ry7Q= Received: from mail-lj1-f199.google.com (mail-lj1-f199.google.com [209.85.208.199]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-99-WnmFN9xBP_-87h1YNNi-TA-1; Wed, 31 Jan 2024 08:20:09 -0500 X-MC-Unique: WnmFN9xBP_-87h1YNNi-TA-1 Received: by mail-lj1-f199.google.com with SMTP id 38308e7fff4ca-2cf3568559fso35332361fa.2 for ; Wed, 31 Jan 2024 05:20:09 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1706707208; x=1707312008; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fgRreWgW6q2iR/2DloVaZyrVnZetmVVFGeNZNROj9gc=; b=sDMTvWgVGXWPv5Ol6aWU6QX5T/dlXvmTLy+x+JcEGr7QPfjXWOS8btnPD/UAAqSowE cwN95qOJE7rePbJkX4bGnCTiO+wfbMdUuyKBKz+Ln7FtS2Wx600KMpsh6/YZUqj2DYPM 6g2zYhWCB0T2cZuWeuTbDodFTkKdunxl7boUFGFLuOy7qi3xRAXNIsutufAbU8njZKbJ l9QfScOQoyA8Rbk7ARFRHeOxzNM/iR8TrRy5DpgYXAonacJ/ppUEUgTAUoIxMDz9p72Y dzmak4nGtixbldfUOmLc2sBYc/gPsKbo3QECEkTMkD4FgwZsXvs1RFvVFC4RhqqCnz3Z YLpw== X-Gm-Message-State: AOJu0YxzkVKR8Cy+EcU2UNLo7I9z+8rIEDbs9OuI6CWR1VRsNIT5qsAg Etv4noAOBjbi1t/tgtPXQbPc5jEpsTdTZ/zsvpfmHTGjPH+RMVTqqMfAqU/6WStCQ4tYHvxfjcH b7avwO4WHyS5MWP+ZRtObhVRBsL3C/f4lyMWmSm62JTKgGpcaUUjJ5fR+EYzGmn1D0NQbouKje0 FOmJVwGl22ppRORTlxTHU= X-Received: by 2002:a2e:988e:0:b0:2d0:4c1c:4c10 with SMTP id b14-20020a2e988e000000b002d04c1c4c10mr1000093ljj.53.1706707208007; Wed, 31 Jan 2024 05:20:08 -0800 (PST) X-Google-Smtp-Source: AGHT+IFhX/RJ9TVaRNhL+2WYDt1HWMByhIG9m1D4vs/T1iBhCO/2MnH6BK9Y4SbdZYjnfdeH1osL229UmEYMUhZcLWk= X-Received: by 2002:a2e:988e:0:b0:2d0:4c1c:4c10 with SMTP id b14-20020a2e988e000000b002d04c1c4c10mr1000076ljj.53.1706707207535; Wed, 31 Jan 2024 05:20:07 -0800 (PST) MIME-Version: 1.0 References: <20240131093113.2208894-1-maxime.coquelin@redhat.com> In-Reply-To: <20240131093113.2208894-1-maxime.coquelin@redhat.com> From: David Marchand Date: Wed, 31 Jan 2024 14:19:55 +0100 Message-ID: Subject: Re: [PATCH 1/2] vhost: fix memory leak in Virtio Tx split path To: Maxime Coquelin Cc: dev@dpdk.org, chenbox@nvidia.com, bnemeth@redhat.com, echaudro@redhat.com, stable@dpdk.org X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-BeenThere: stable@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: patches for DPDK stable branches List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: stable-bounces@dpdk.org On Wed, Jan 31, 2024 at 10:31=E2=80=AFAM Maxime Coquelin wrote: > > When vIOMMU is enabled and Virtio device is bound to kernel > driver in guest, rte_vhost_dequeue_burst() will often return > early because of IOTLB misses. > > This patch fixes a mbuf leak occurring in this case. > > Fixes: 242695f6122a ("vhost: allocate and free packets in bulk in Tx spli= t") > Cc: stable@dpdk.org > > Signed-off-by: Maxime Coquelin > --- > lib/vhost/virtio_net.c | 20 ++++++++------------ > 1 file changed, 8 insertions(+), 12 deletions(-) > > diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c > index 280d4845f8..db9985c9b9 100644 > --- a/lib/vhost/virtio_net.c > +++ b/lib/vhost/virtio_net.c > @@ -3120,11 +3120,8 @@ virtio_dev_tx_split(struct virtio_net *dev, struct= vhost_virtqueue *vq, > VHOST_ACCESS_RO) < 0)) > break; > > - c(vq, head_idx, 0); > - > if (unlikely(buf_len <=3D dev->vhost_hlen)) { > - dropped +=3D 1; > - i++; > + dropped =3D 1; > break; > } ``i`` was used for both filling the returned mbuf array, but also to update the shadow_used_idx / array. So this change here also affects how the currently considered descriptor is returned through the used ring. See below... > > @@ -3143,8 +3140,7 @@ virtio_dev_tx_split(struct virtio_net *dev, struct = vhost_virtqueue *vq, > buf_len, mbuf_pool->name); > allocerr_warned =3D true; > } > - dropped +=3D 1; > - i++; > + dropped =3D 1; > break; > } > > @@ -3155,17 +3151,17 @@ virtio_dev_tx_split(struct virtio_net *dev, struc= t vhost_virtqueue *vq, > VHOST_DATA_LOG(dev->ifname, ERR, "failed = to copy desc to mbuf."); > allocerr_warned =3D true; > } > - dropped +=3D 1; > - i++; > + dropped =3D 1; > break; > } > > + update_shadow_used_ring_split(vq, head_idx, 0); > } > > - if (dropped) > - rte_pktmbuf_free_bulk(&pkts[i - 1], count - i + 1); > + if (unlikely(count !=3D i)) > + rte_pktmbuf_free_bulk(&pkts[i], count - i); > > - vq->last_avail_idx +=3D i; > + vq->last_avail_idx +=3D i + dropped; > > do_data_copy_dequeue(vq); > if (unlikely(i < count)) ... I am copying the rest of the context: if (unlikely(i < count)) vq->shadow_used_idx =3D i; Before the patch, when breaking and doing the i++ stuff, vq->shadow_used_idx was probably already equal to i because update_shadow_used_ring_split had been called earlier. Because of this, the "dropped" descriptor was part of the shadow_used array for returning it through the used ring. With the patch, since we break without touching i, it means that the "dropped" descriptor is not returned anymore. Fixing this issue could take the form of restoring the call to update_shadow_used_ring_split in the loop and adjust vq->shadow_used_idx to i + dropped. But I think we can go one step further, by noting that vq->last_avail_idx is being incremented by the same "i + dropped" value. Meaning that we could simply rely on calls to update_shadow_used_ring_split and reuse vq->shadow_used_idx to adjust vq->last_avail_idx. By doing this, there is no need for a "dropped" variable, and no need for touching of vq->shadow_used_idx manually, which is probably better for robustness / easing readability. Note: we could also move the call do_data_copy_dequeue() under the check on vq->shadow_used_idx !=3D 0, though it won't probably change anything. This gives the following (untested) diff, on top of your fix: diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c index 211d24b36a..4e1d61bd54 100644 --- a/lib/vhost/virtio_net.c +++ b/lib/vhost/virtio_net.c @@ -3104,7 +3104,6 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, { uint16_t i; uint16_t avail_entries; - uint16_t dropped =3D 0; static bool allocerr_warned; /* @@ -3141,10 +3140,10 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, VHOST_ACCESS_RO) < 0)) break; - if (unlikely(buf_len <=3D dev->vhost_hlen)) { - dropped =3D 1; + update_shadow_used_ring_split(vq, head_idx, 0); + + if (unlikely(buf_len <=3D dev->vhost_hlen)) break; - } buf_len -=3D dev->vhost_hlen; @@ -3161,7 +3160,6 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, buf_len, mbuf_pool->name); allocerr_warned =3D true; } - dropped =3D 1; break; } @@ -3172,22 +3170,16 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq, VHOST_DATA_LOG(dev->ifname, ERR, "failed to copy desc to mbuf."); allocerr_warned =3D true; } - dropped =3D 1; break; } - - update_shadow_used_ring_split(vq, head_idx, 0); } if (unlikely(count !=3D i)) rte_pktmbuf_free_bulk(&pkts[i], count - i); - vq->last_avail_idx +=3D i + dropped; - do_data_copy_dequeue(vq); - if (unlikely(i < count)) - vq->shadow_used_idx =3D i; if (likely(vq->shadow_used_idx)) { + vq->last_avail_idx +=3D vq->shadow_used_idx; flush_shadow_used_ring_split(dev, vq); vhost_vring_call_split(dev, vq); } --=20 David Marchand