From: Ilya Maximets <i.maximets@ovn.org>
To: Maxime Coquelin, Ilya Maximets, Flavio Leitner, dev@dpdk.org
Cc: Shahaf Shuler, David Marchand, Tiwei Bie, Obrembski MichalX, Stokes Ian
Date: Wed, 16 Oct 2019 16:05:18 +0200
Message-ID: <2b963349-66d9-179d-b480-6e15263733f3@ovn.org>
In-Reply-To: <62fb6a50-d4ab-584d-8cb2-4820c141f137@redhat.com>
References: <20191015161727.32570-1-fbl@sysclose.org> <20191015185951.6295-1-fbl@sysclose.org> <5900f514-f1d9-7f7d-0ea1-f394b9e5d01f@ovn.org> <62fb6a50-d4ab-584d-8cb2-4820c141f137@redhat.com>
Subject: Re: [dpdk-dev] [PATCH v5] vhost: add support for large buffers
List-Id: DPDK patches and discussions

On 16.10.2019 15:46, Maxime Coquelin wrote:
> On 10/16/19 3:32 PM, Ilya Maximets wrote:
>> On 16.10.2019 13:13, Maxime Coquelin wrote:
>>> On 10/15/19 8:59 PM, Flavio Leitner
>>> wrote:
>>>> The rte_vhost_dequeue_burst supports two ways of dequeuing data.
>>>> If the data fits into a single buffer, then all data is copied and
>>>> a single linear buffer is returned. Otherwise it allocates
>>>> additional mbufs and chains them together to return a
>>>> multi-segment mbuf.
>>>>
>>>> While that covers most use cases, it forces applications that
>>>> need to work with larger data sizes to support multi-segment
>>>> mbufs. The non-linear characteristic brings complexity and
>>>> performance implications to the application.
>>>>
>>>> To resolve the issue, add support for attaching an external buffer
>>>> to a pktmbuf, and let the application indicate during registration
>>>> whether attaching an external buffer to a pktmbuf is supported and
>>>> whether only linear buffers are supported.
>>>>
>>>> Signed-off-by: Flavio Leitner
>>>> ---
>>>>   doc/guides/prog_guide/vhost_lib.rst |  35 +++++++++
>>>>   lib/librte_vhost/rte_vhost.h        |   4 +
>>>>   lib/librte_vhost/socket.c           |  22 ++++++
>>>>   lib/librte_vhost/vhost.c            |  22 ++++++
>>>>   lib/librte_vhost/vhost.h            |   4 +
>>>>   lib/librte_vhost/virtio_net.c       | 109 ++++++++++++++++++++++++----
>>>>   6 files changed, 182 insertions(+), 14 deletions(-)
>>>>
>>>> - Changelog:
>>>>    v5:
>>>>      - fixed to destroy mutex if incompatible flags
>>>>    v4:
>>>>      - allow to use pktmbuf if there is exact space
>>>>      - removed log message if the buffer is too big
>>>>      - fixed the length to include align padding
>>>>      - free allocated buf if shinfo fails
>>>>    v3:
>>>>      - prevent the new features from being used with zero copy
>>>>      - fixed sizeof() usage
>>>>      - fixed log msg indentation
>>>>      - removed/replaced asserts
>>>>      - used the correct virt2iova function
>>>>      - fixed the patch's title
>>>>      - OvS PoC code:
>>>>        https://github.com/fleitner/ovs/tree/rte_malloc-v3
>>>>    v2:
>>>>      - Used rte_malloc() instead of another mempool as
>>>>        suggested by Shahaf.
>>>>      - Added the documentation section.
>>>>      - Using driver registration to negotiate the features.
>>>>      - OvS PoC code:
>>>>        https://github.com/fleitner/ovs/commit/8fc197c40b1d4fda331686a7b919e9e2b670dda7
>>>
>>> Applied to dpdk-next-virtio/master.
>>
>> Thanks Maxime,
>>
>> But can we bring back the mbuf allocation failure message?
>
> Good point, I missed it.
>
>> I mean:
>>
>> diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
>> index 66f0c7206..f8af4e0b3 100644
>> --- a/lib/librte_vhost/virtio_net.c
>> +++ b/lib/librte_vhost/virtio_net.c
>> @@ -1354,8 +1354,11 @@ virtio_dev_pktmbuf_alloc(struct virtio_net *dev, struct rte_mempool *mp,
>>  {
>>      struct rte_mbuf *pkt = rte_pktmbuf_alloc(mp);
>>
>> -    if (unlikely(pkt == NULL))
>> +    if (unlikely(pkt == NULL)) {
>> +        RTE_LOG(ERR, VHOST_DATA,
>> +            "Failed to allocate memory for mbuf.\n");
>>          return NULL;
>> +    }
>>
>>      if (rte_pktmbuf_tailroom(pkt) >= data_len)
>>          return pkt;
>> ---
>>
>> It's a hard failure that points to a significant issue: an undersized
>> memory pool or an mbuf leak.
>
> I agree, we need this error message.
>
>> We still have the message for subsequent chained mbufs, but not for
>> the first one.  Without this error message we could never notice that
>> something is wrong with our memory pool; the network traffic would
>> simply stop flowing.
>> The message was very useful previously while catching the root cause
>> of the mbuf leak in OVS.
>>
>> I could send a separate patch for this if needed.
>
> We need a separate patch.
> If you want to do it, please do! Otherwise I'll do it.

OK.  I'll send a patch soon.

Best regards, Ilya Maximets.