From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id C2CB3A3168 for ; Wed, 16 Oct 2019 16:14:58 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 7E0A21E8AB; Wed, 16 Oct 2019 16:14:57 +0200 (CEST) Received: from sysclose.org (smtp.sysclose.org [69.164.214.230]) by dpdk.org (Postfix) with ESMTP id 581371C11B for ; Wed, 16 Oct 2019 16:14:56 +0200 (CEST) Received: by sysclose.org (Postfix, from userid 5001) id 87C0C677B; Wed, 16 Oct 2019 14:15:22 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 sysclose.org 87C0C677B DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sysclose.org; s=201903; t=1571235322; bh=myHSILWIPGFfnehtUbMmaCF46xSBR9cHBn/H0xEJyFM=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=EOIEOusw/qdm88CtqJCVD87cDvhXhc+JxjpwVZLI+s5+iSUJOQmgr7ctkyFyXiANv 2YYComnuUAbYHYjLFquEQ+KXGfY83/NC6733Fu+j8RFPGBHdybwMDQrVb75hoclbaS mltfc9BxtccFznYmPpevOc6MVvGR+q4bnsOXfgSG/57csS7m1ATwhSFFVGC4B4Vc43 yk6lHw4+SRyF6qFd7nvi4ZFpUssNsF2jluQ/YBu+70srPFxOTx2gSiLKQEp8AZCbAw yDrXMQ/bxEOUty1EP7G+jdr8roHNQyh6YwoWBqalq5MAE7mOknKyo8e0e9dMv3oDCL nipoxCJ/u140Q== X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on mail.sysclose.org X-Spam-Level: X-Spam-Status: No, score=-1.1 required=5.0 tests=ALL_TRUSTED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,URIBL_BLOCKED autolearn=unavailable autolearn_force=no version=3.4.0 Received: from p50.lan (unknown [177.183.215.210]) by sysclose.org (Postfix) with ESMTPSA id 541C96509; Wed, 16 Oct 2019 14:15:19 +0000 (UTC) DKIM-Filter: OpenDKIM Filter v2.11.0 sysclose.org 541C96509 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sysclose.org; s=201903; t=1571235320; bh=myHSILWIPGFfnehtUbMmaCF46xSBR9cHBn/H0xEJyFM=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=EByrIjplcxpITgyVdpz9p8OlNf/pytKxW0aYA1SzoimkajJkpJmjfYYq2553WJR0k 1DTJ2sH50vuM2ax6zt59vznuspXtlWOIDAOG59rDFdK5keMlTye+BpksPTBP2fJFhX l9Xbi7/0Cj5zH+EDmHHIRuA5Y4K/Js6Qwc+nBJ5uNXEj4qv21UBwA1Cqks7bfb/llA 8J14HiUl88R5/9guiQq9tdyGI09sT8Onv6TEsewIIAhHRyPB7Ti67Y0qf+Aode6kSd Kq3YC43DtfGIL5AciUjVh2J7joPUCmxHherJ5Ec48fTCIBvEDVNuYvjt7HhFwcvnhY ljMb7UXAvobxQ== Date: Wed, 16 Oct 2019 11:14:46 -0300 From: Flavio Leitner To: Ilya Maximets Cc: Maxime Coquelin , dev@dpdk.org, Shahaf Shuler , David Marchand , Tiwei Bie , Obrembski MichalX , Stokes Ian Message-ID: <20191016111446.31d5163d@p50.lan> In-Reply-To: References: <20191015161727.32570-1-fbl@sysclose.org> <20191015185951.6295-1-fbl@sysclose.org> <5900f514-f1d9-7f7d-0ea1-f394b9e5d01f@ovn.org> <62fb6a50-d4ab-584d-8cb2-4820c141f137@redhat.com> <20191016110243.14005e1c@p50.lan> X-Mailer: Claws Mail 3.17.4 (GTK+ 2.24.32; x86_64-redhat-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [dpdk-dev] [PATCH v5] vhost: add support for large buffers X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Wed, 16 Oct 2019 16:08:54 +0200 Ilya Maximets wrote: > On 16.10.2019 16:02, Flavio Leitner wrote: > > On Wed, 16 Oct 2019 15:46:15 +0200 > > Maxime Coquelin wrote: > > =20 > >> On 10/16/19 3:32 PM, Ilya Maximets wrote: =20 > >>> On 16.10.2019 13:13, Maxime Coquelin wrote: =20 > >>>> > >>>> > >>>> On 10/15/19 8:59 PM, Flavio Leitner wrote: =20 > >>>>> The rte_vhost_dequeue_burst supports two ways of dequeuing data. > >>>>> If the data fits into a buffer, then all data is copied and a > >>>>> single linear buffer is returned. Otherwise it allocates > >>>>> additional mbufs and chains them together to return a multiple > >>>>> segments mbuf. > >>>>> > >>>>> While that covers most use cases, it forces applications that > >>>>> need to work with larger data sizes to support multiple segments > >>>>> mbufs. The non-linear characteristic brings complexity and > >>>>> performance implications to the application. > >>>>> > >>>>> To resolve the issue, add support to attach external buffer > >>>>> to a pktmbuf and let the host provide during registration if > >>>>> attaching an external buffer to pktmbuf is supported and if > >>>>> only linear buffer are supported. > >>>>> > >>>>> Signed-off-by: Flavio Leitner > >>>>> --- > >>>>> =C2=A0 doc/guides/prog_guide/vhost_lib.rst |=C2=A0 35 +++++++++ > >>>>> =C2=A0 lib/librte_vhost/rte_vhost.h=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0=C2=A0 |=C2=A0=C2=A0 4 + > >>>>> =C2=A0 lib/librte_vhost/socket.c=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0 22 ++++++ > >>>>> =C2=A0 lib/librte_vhost/vhost.c=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0 22 ++++++ > >>>>> =C2=A0 lib/librte_vhost/vhost.h=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2= =A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 |=C2=A0=C2=A0 4 + > >>>>> =C2=A0 lib/librte_vhost/virtio_net.c=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0= =C2=A0 | 109 > >>>>> ++++++++++++++++++++++++---- 6 files changed, 182 insertions(+), > >>>>> 14 deletions(-) > >>>>> > >>>>> - Changelog: > >>>>> =C2=A0=C2=A0 v5: > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - fixed to destroy mutex if incompatible = flags > >>>>> =C2=A0=C2=A0 v4: > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - allow to use pktmbuf if there is exact = space > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - removed log message if the buffer is to= o big > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - fixed the length to include align paddi= ng > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - free allocated buf if shinfo fails > >>>>> =C2=A0=C2=A0 v3: > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - prevent the new features to be used wit= h zero copy > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - fixed sizeof() usage > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - fixed log msg indentation > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - removed/replaced asserts > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - used the correct virt2iova function > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - fixed the patch's title > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - OvS PoC code: > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 https://github.com/fleitner/o= vs/tree/rte_malloc-v3 > >>>>> =C2=A0=C2=A0 v2: > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - Used rte_malloc() instead of another me= mpool as > >>>>> suggested by Shahaf. > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - Added the documentation section. > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - Using driver registration to negotiate = the features. > >>>>> =C2=A0=C2=A0=C2=A0=C2=A0 - OvS PoC code: > >>>>> =20 > >>>>> https://github.com/fleitner/ovs/commit/8fc197c40b1d4fda331686a7b919= e9e2b670dda7 > >>>>> =20 > >>>> > >>>> Applied to dpdk-next-virtio/master. =20 > >>> > >>> Thanks Maxime, > >>> > >>> But can we return back the mbuf allocation failure message? =20 > >> > >> Good point, I missed it. > >> =20 > >>> I mean: > >>> > >>> diff --git a/lib/librte_vhost/virtio_net.c > >>> b/lib/librte_vhost/virtio_net.c index 66f0c7206..f8af4e0b3 100644 > >>> --- a/lib/librte_vhost/virtio_net.c > >>> +++ b/lib/librte_vhost/virtio_net.c > >>> @@ -1354,8 +1354,11 @@ virtio_dev_pktmbuf_alloc(struct virtio_net > >>> *dev, struct rte_mempool *mp, > >>> =C2=A0{ > >>> =C2=A0=C2=A0=C2=A0=C2=A0 struct rte_mbuf *pkt =3D rte_pktmbuf_alloc(= mp); > >>> =20 > >>> -=C2=A0=C2=A0=C2=A0 if (unlikely(pkt =3D=3D NULL)) > >>> +=C2=A0=C2=A0=C2=A0 if (unlikely(pkt =3D=3D NULL)) { > >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 RTE_LOG(ERR, VHOST_DATA, > >>> +=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 "= Failed to allocate memory for mbuf.\n"); > >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return NULL; > >>> +=C2=A0=C2=A0=C2=A0 } > >>> =20 > >>> =C2=A0=C2=A0=C2=A0=C2=A0 if (rte_pktmbuf_tailroom(pkt) >=3D data_len) > >>> =C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0=C2=A0 return pkt; > >>> --- > >>> > >>> It's a hard failure that highlights some significant issues with > >>> memory pool size or a mbuf leak. =20 > >> > >> I agree, we need this error message. =20 > >=20 > > If it runs out of memory and there are many packets coming, then it > > will flood the logs. Maybe we could use something to report in a > > more moderated way, for example: > >=20 > > diff --git a/lib/librte_vhost/virtio_net.c > > b/lib/librte_vhost/virtio_net.c index da69ab1db..b1572b3a4 100644 > > --- a/lib/librte_vhost/virtio_net.c > > +++ b/lib/librte_vhost/virtio_net.c > > @@ -1354,8 +1354,14 @@ virtio_dev_pktmbuf_alloc(struct virtio_net > > *dev, struct rte_mempool *mp, { > > struct rte_mbuf *pkt =3D rte_pktmbuf_alloc(mp); > > =20 > > - if (unlikely(pkt =3D=3D NULL)) > > + if (unlikely(pkt =3D=3D NULL)) { > > + static int rate_lim =3D 0; > > + > > + if (rate_lim++ % 1000 =3D=3D 0) > > + RTE_LOG... > > + > > return NULL; > > + } > > =20 > > if (rte_pktmbuf_tailroom(pkt) >=3D data_len) > > return pkt; > >=20 > >=20 > > Or track this at mempool level and keep stats of failed requests and > > report that in OvS. That would cover other points too. =20 >=20 > OVS is already rete-limiting DPDK logs. Basically, I introduced this > functionality to limit exactly this message. > See: 9fd38f6867df ("netdev-dpdk: Limit rate of DPDK logs.") Thanks, I missed that commit. fbl