From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id B480FA04B1; Thu, 5 Nov 2020 10:10:27 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 86A9B7CAE; Thu, 5 Nov 2020 10:10:26 +0100 (CET) Received: from mail-wr1-f68.google.com (mail-wr1-f68.google.com [209.85.221.68]) by dpdk.org (Postfix) with ESMTP id 3B9396947 for ; Thu, 5 Nov 2020 10:10:25 +0100 (CET) Received: by mail-wr1-f68.google.com with SMTP id c17so808032wrc.11 for ; Thu, 05 Nov 2020 01:10:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind.com; s=google; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to:user-agent; bh=OCFNNpW/Gq1yo3sSupn5ME3wfDsVUWDovh5X1IuPDdk=; b=AkyuWMOcDJdY0V9HlOVSLweTGzx4xbdbWeqnDSxZJhSuBhGRhi/HIrpp2s4kqOEmgP aE8kG7gwvrNgPn5SQKXH/6rpjNI5jo3Sl/8PIYlWzit2hmtrQl/lzrXeCzZYSSGfcJmu x43wQV2mxLjLY4Ige1366fPWpP52mpTfJ0lDayHt0M1BBdoggbwYMvN3RodpCquEQ3+P h6u9q04kkQtrdheV6tRiZHhos8qQbGTFSrqlYTEGfdMczlWobH3AxUn0LuAdOyFgDS07 riC2zC78aAsUiH73JYcN55SKL5/2Btxm1k8NtIXG60h0UVYT4Uy/1FRCy37bi0Nd56sH zX7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to:user-agent; bh=OCFNNpW/Gq1yo3sSupn5ME3wfDsVUWDovh5X1IuPDdk=; b=WNB6B7jZlgUJk8Eh/T7Milw/CbmElzOUorFtPn+iy5YMwAVFN0fUjdUYTSaZt3QFXc 14RAA5FyWWo+fXGAJFF06g26TrT5QruopGZd9c3MRZSJj1jbhGR//zV+BfOxhhUMaM2m MA5Q7U4f4UiWNAv9QRPLARtpIPzLa9NxVqQRgyZgv5FhRp+JP6Vs7EMCH8sPRpCjjCMc nK1FiInBOMg9UYrWM4EgblQJyFeNT/fM2qpJy4MT7Ks2rGEaZJpoyYQy3mE6T/yqKJM1 U09JX55tnJ4GnVMMCDarou02sIWu8Fj92FJ7jrwItK+MdJitQtQAOuL34T0vsIOfW4k8 W8fA== X-Gm-Message-State: AOAM533E4WD16MJaW8XawxWT49IdpIf6qy1Cxefmu5d8xBAEnDAdJjNE PzmxJI/WMBAHYw9InWc0O2HFTA== X-Google-Smtp-Source: ABdhPJzlCioCQwQQ/VEl8lVbIHh+cs1RVGdi68tN83ovWHPpHWvY3jhOUP7MO0Y3GKmlpcpI4fvyZQ== X-Received: by 2002:a5d:6086:: with SMTP id w6mr1600887wrt.231.1604567423926; Thu, 05 Nov 2020 01:10:23 -0800 (PST) Received: from 6wind.com (2a01cb0c0005a600345636f7e65ed1a0.ipv6.abo.wanadoo.fr. [2a01:cb0c:5:a600:3456:36f7:e65e:d1a0]) by smtp.gmail.com with ESMTPSA id t2sm1489420wrw.95.2020.11.05.01.10.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 05 Nov 2020 01:10:23 -0800 (PST) Date: Thu, 5 Nov 2020 10:10:22 +0100 From: Olivier Matz To: Andrew Rybchenko Cc: dev@dpdk.org Message-ID: <20201105091022.GO1898@platinum> References: <20201104170007.8026-1-olivier.matz@6wind.com> <20201105074626.GL1898@platinum> <10b675b5-c376-746f-9ae2-210685b0b05d@oktetlabs.ru> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <10b675b5-c376-746f-9ae2-210685b0b05d@oktetlabs.ru> User-Agent: Mutt/1.10.1 (2018-07-13) Subject: Re: [dpdk-dev] [PATCH] mbuf: fix reset on mbuf free X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Thu, Nov 05, 2020 at 11:26:51AM +0300, Andrew Rybchenko wrote: > On 11/5/20 10:46 AM, Olivier Matz wrote: > > On Thu, Nov 05, 2020 at 12:15:49AM +0000, Ananyev, Konstantin wrote: > >> > >> Hi Olivier, > >> > >>> m->nb_seg must be reset on mbuf free whatever the value of m->next, > >>> because it can happen that m->nb_seg is != 1. For instance in this > >>> case: > >>> > >>> m1 = rte_pktmbuf_alloc(mp); > >>> rte_pktmbuf_append(m1, 500); > >>> m2 = rte_pktmbuf_alloc(mp); > >>> rte_pktmbuf_append(m2, 500); > >>> rte_pktmbuf_chain(m1, m2); > >>> m0 = rte_pktmbuf_alloc(mp); > >>> rte_pktmbuf_append(m0, 500); > >>> rte_pktmbuf_chain(m0, m1); > >>> > >>> As rte_pktmbuf_chain() does not reset nb_seg in the initial m1 > >>> segment (this is not required), after this code the mbuf chain > >>> have 3 segments: > >>> - m0: next=m1, nb_seg=3 > >>> - m1: next=m2, nb_seg=2 > >>> - m2: next=NULL, nb_seg=1 > >>> > >>> Freeing this mbuf chain will not restore nb_seg=1 in the second > >>> segment. > >> > >> Hmm, not sure why is that? > >> You are talking about freeing m1, right? > >> rte_pktmbuf_prefree_seg(struct rte_mbuf *m) > >> { > >> ... > >> if (m->next != NULL) { > >> m->next = NULL; > >> m->nb_segs = 1; > >> } > >> > >> m1->next != NULL, so it will enter the if() block, > >> and will reset both next and nb_segs. > >> What I am missing here? > >> Thinking in more generic way, that change: > >> - if (m->next != NULL) { > >> - m->next = NULL; > >> - m->nb_segs = 1; > >> - } > >> + m->next = NULL; > >> + m->nb_segs = 1; > > > > Ah, sorry. I oversimplified the example and now it does not > > show the issue... > > > > The full example also adds a split() to break the mbuf chain > > between m1 and m2. The kind of thing that would be done for > > software TCP segmentation. > > > > If so, may be the right solution is to care about nb_segs > when next is set to NULL on split? Any place when next is set > to NULL. Just to keep the optimization in a more generic place. The problem with that approach is that there are already several existing split() or trim() implementations in different DPDK-based applications. For instance, we have some in 6WINDGate. If we force applications to set nb_seg to 1 when resetting next, it has to be documented because it is not straightforward. I think the approach from this patch is safer. By the way, for 21.11, if we are able to do some optimizations and have both pool (index?) and next in the first cache line, we may reconsider the fact that next and nb_segs are already set for new allocated mbufs, because it is not straightforward either. > > After this operation, we have 2 mbuf chain: > > - m0 with 2 segments, the last one has next=NULL but nb_seg=2 > > - new_m with 1 segment > > > > Freeing m0 will not restore nb_seg=1 in the second segment. > > > >> Assumes that it is ok to have an mbuf with > >> nb_seg > 1 and next == NULL. > >> Which seems wrong to me. > > > > I don't think it is wrong: nb_seg is just ignored when not in the first > > segment, and there is nothing saying it should be set to 1. Typically, > > rte_pktmbuf_chain() does not change it, and I guess it's the same for > > many similar functions in applications. > > > > Olivier > > > >> > >> > >>> This is expected that mbufs stored in pool have their > >>> nb_seg field set to 1. > >>> > >>> Fixes: 8f094a9ac5d7 ("mbuf: set mbuf fields while in pool") > >>> Cc: stable@dpdk.org > >>> > >>> Signed-off-by: Olivier Matz > >>> --- > >>> lib/librte_mbuf/rte_mbuf.c | 6 ++---- > >>> lib/librte_mbuf/rte_mbuf.h | 12 ++++-------- > >>> 2 files changed, 6 insertions(+), 12 deletions(-) > >>> > >>> diff --git a/lib/librte_mbuf/rte_mbuf.c b/lib/librte_mbuf/rte_mbuf.c > >>> index 8a456e5e64..e632071c23 100644 > >>> --- a/lib/librte_mbuf/rte_mbuf.c > >>> +++ b/lib/librte_mbuf/rte_mbuf.c > >>> @@ -129,10 +129,8 @@ rte_pktmbuf_free_pinned_extmem(void *addr, void *opaque) > >>> > >>> rte_mbuf_ext_refcnt_set(m->shinfo, 1); > >>> m->ol_flags = EXT_ATTACHED_MBUF; > >>> - if (m->next != NULL) { > >>> - m->next = NULL; > >>> - m->nb_segs = 1; > >>> - } > >>> + m->next = NULL; > >>> + m->nb_segs = 1; > >>> rte_mbuf_raw_free(m); > >>> } > >>> > >>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h > >>> index a1414ed7cd..ef5800c8ef 100644 > >>> --- a/lib/librte_mbuf/rte_mbuf.h > >>> +++ b/lib/librte_mbuf/rte_mbuf.h > >>> @@ -1329,10 +1329,8 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m) > >>> return NULL; > >>> } > >>> > >>> - if (m->next != NULL) { > >>> - m->next = NULL; > >>> - m->nb_segs = 1; > >>> - } > >>> + m->next = NULL; > >>> + m->nb_segs = 1; > >>> > >>> return m; > >>> > >>> @@ -1346,10 +1344,8 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m) > >>> return NULL; > >>> } > >>> > >>> - if (m->next != NULL) { > >>> - m->next = NULL; > >>> - m->nb_segs = 1; > >>> - } > >>> + m->next = NULL; > >>> + m->nb_segs = 1; > >>> rte_mbuf_refcnt_set(m, 1); > >>> > >>> return m; > >>> -- > >>> 2.25.1 > >> >