From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail.droids-corp.org (zoll.droids-corp.org [94.23.50.67]) by dpdk.org (Postfix) with ESMTP id 4563CADE8 for ; Wed, 8 Jun 2016 16:11:42 +0200 (CEST) Received: from was59-1-82-226-113-214.fbx.proxad.net ([82.226.113.214] helo=[192.168.0.10]) by mail.droids-corp.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.84_2) (envelope-from ) id 1bAeF4-0000qD-6w; Wed, 08 Jun 2016 16:13:59 +0200 To: "Ananyev, Konstantin" , "dev@dpdk.org" , Adrien Mazarguil References: <1465374688-11729-1-git-send-email-adrien.mazarguil@6wind.com> <2601191342CEEE43887BDE71AB97725836B6CE18@irsmsx105.ger.corp.intel.com> <20160608122723.GK7621@6wind.com> <2601191342CEEE43887BDE71AB97725836B6CFDD@irsmsx105.ger.corp.intel.com> <20160608135751.GM7621@6wind.com> From: Olivier Matz Message-ID: <57582797.8050300@6wind.com> Date: Wed, 8 Jun 2016 16:11:35 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Icedove/38.6.0 MIME-Version: 1.0 In-Reply-To: <20160608135751.GM7621@6wind.com> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 08 Jun 2016 14:11:42 -0000 Hi Adrien, Konstantin, I'm jumping in this (interesting) discussion. I already talked a bit with Adrien in point to point, and I think its patch is valid. Please see some comments below. On 06/08/2016 03:57 PM, Adrien Mazarguil wrote: > On Wed, Jun 08, 2016 at 01:09:18PM +0000, Ananyev, Konstantin wrote: >>> >>> Hi Konstantin, >>> >>> On Wed, Jun 08, 2016 at 10:34:17AM +0000, Ananyev, Konstantin wrote: >>>> Hi Adrien, >>>> >>>>> >>>>> An assertion failure occurs in __rte_mbuf_raw_free() (called by a few PMDs) >>>>> when compiling DPDK with CONFIG_RTE_LOG_LEVEL=RTE_LOG_DEBUG and starting >>>>> applications with a log level high enough to trigger it. >>>>> >>>>> While rte_mbuf_raw_alloc() sets refcount to 1, __rte_mbuf_raw_free() >>>>> expects it to be 0. >>>>> Considering users are not expected to reset the >>>>> reference count to satisfy assert() and that raw functions are designed on >>>>> purpose without safety belts, remove these checks. >>>> >>>> Yes, it refcnt supposed to be set to 0 by __rte_pktmbuf_prefree_seg(). >>>> Wright now, it is a user responsibility to make sure refcnt==0 before pushing >>>> mbuf back to the pool. >>>> Not sure why do you consider that wrong? >>> >>> I do not consider this wrong and I'm all for using assert() to catch >>> programming errors, however in this specific case, I think they are >>> inconsistent and misleading. >> >> Honestly, I don't understand why. >> Right now the rule of thumb is - when mbuf is in the pool, it's refcnt should be equal zero. What is the purpose of this? Is there some code relying on this? >> Yes, as you pointed below - that rule probably can be changed to: >> when mbuf is in the pool, it's refcnt should equal one, and that would probably allow us >> to speedup things a bit, but I suppose that's the matter of another aptch/discussion. > > Agreed. I agree this is somehow another discussion, but let's dive into :) But since [1], it is allowed to call rte_mbuf_raw_alloc() in PMDs (all PMDs were calling an internal function before). We could argue that rte_mbuf_raw_free() should also be made public for PMDs. As you said below, no-one promised that the free() reverts the malloc(), but given the function names, one can legitimately expect that the following code is valid: m = __rte_mbuf_raw_alloc(); /* do nothing here */ __rte_mbuf_raw_free(m); If no code relies on having the refcnt set to 0 when a mbuf is in the pool, I suggest to relax this constraint as Adrien proposed. Then, my opinion is that the refcount should be set to 1 in rte_pktmbuf_reset(). And rte_pktmbuf_free() should not be allowed on an uninitialized mbuf (yes, this would require some changes in PMDs). This would open the door for bulk allocation/free in the mbuf api. This could be done in several steps: 1/ remove these assert(), add introduce a public rte_mbuf_raw_free() 2/ announce that rte_pktmbuf_free() won't work on uninitialized mbufs in a future version, and ensure that all PMD are inline with this requirement 3/ later, move refcount initialization in rte_pktmbuf_reset() What do you think? Another option is to remove the rte_mbuf_raw_alloc()/rte_mbuf_raw_free() and use mempool calls. But having a mbuf wrapper does not seem a bad thing to me. By the way, __rte_pktmbuf_prefree_seg() is also an internal function. I think we should try to update the mbuf API so that the PMDs do not need to call these internal functions. [1] http://dpdk.org/browse/dpdk/commit/?id=fbfd99551ca370266f4bfff58ce441cf5cb1203a Regards, Olivier > >>>> If the user calls __rte_mbuf_raw_free() manualy it is his responsibility to make >>>> sure mbuf's refcn==0. >>> >>> Sure, however what harm does it cause (besides assert() to fail), since the >>> allocation function sets refcount to 1? >>> >>> Why having the allocation function set the refcount if we are sure it is >>> already 0 (assert() proves it). Removing rte_mbuf_refcnt_set(m, 1) can >>> surely improve performance. >> >> That's' just an assert() enabled when MBUF_DEBUG is on. >> Its sole purpose is to help troubleshoot the bugs and help to catch situations >> when someone silently updates mbufs supposed to be free. > > I perfectly understand and I cannot agree more with this explanation, > however the fact these functions are not symmetrical remains an issue that > needs to be addressed somehow in my opinion. > >>>> BTW, why are you doing it? >>>> The comment clearly states that the function is for internal use: >>>> /** >>>> * @internal Put mbuf back into its original mempool. >>>> * The use of that function is reserved for RTE internal needs. >>>> * Please use rte_pktmbuf_free(). >>>> * >>>> * @param m >>>> * The mbuf to be freed. >>>> */ >>>> static inline void __attribute__((always_inline)) >>>> __rte_mbuf_raw_free(struct rte_mbuf *m) >>> >>> Several PMDs are using it anyway (won't name names, but I know one of them >>> quite well). >> >> Then it probably is a bug in these PMDs that need to be fixed. >> >>> I chose to modify this code instead of its users for the >>> following reasons: >>> >>> - Considering their names, these functions should be opposites and able to >>> work together like malloc()/free(). >> >> These are internal functions. >> Comments in mbuf clearly state that library users shouldn't call them directly. >> They are written to fit internal librte_mbuf needs, and no-one ever promised >> malloc/free() compatibility here. > > So it cannot be provided for the sake of not providing it or is there a good > reason? > > What I meant is that since PMDs already made the mistake of using these > functions to benefit from the improved performance, DPDK being all about > performance and stuff, let them use it as intended. Perhaps we should drop > those "__" like for rte_mbuf_raw_alloc(). > >>> >>> - PMDs are relying on these functions for performance reasons, we can assume >>> they took the extra care necessary to make sure it would work properly. >> >> That just doesn't seem correct to me. >> The proper way to do free fo mbuf segment is: >> >> static inline void __attribute__((always_inline)) >> rte_pktmbuf_free_seg(struct rte_mbuf *m) >> { >> if (likely(NULL != (m = __rte_pktmbuf_prefree_seg(m)))) { >> m->next = NULL; >> __rte_mbuf_raw_free(m); >> } >> } >> >> If by some reason you choose not to use this function, then it is your >> responsibility to perform similar actions on your own before pushing mbuf into the pool. >> That's what some TX functions for some Intel NICs do to improve performance: >> they call _prefree_seg() manually and try to put mbufs into the pool in groups. > > Not anymore it seems, but in the current code base both ena and mpipe PMDs > (soon mlx5 as well) seem to get this wrong. > >>> - Preventing it would make these PMDs slower and is not acceptable either. >> >> I can hardly imagine that __rte_pktmbuf_prefree_seg() impact would be that severe... >> But ok, probably you do have some very specific case, but then why you PMD just doesn't call: >> rte_mempool_put(m->pool, m); >> directly? > > To survive the upcoming transition to mbufs backed by libc malloc() without > having to fix them? Joke aside, I guess the reason is to use functions with > "mbuf" in their names when dealing with mbufs. > >> Why instead you choose to change common functions and compromise >> librte_mbuf debug ability? > > No, I'm fine with keeping the debug ability, however I did not find a way to > both keep it and fix the consistency issue without breaking something > (performance or refcount assumptions I'm not familiar with elsewhere). > >>> What remains is the consistency issue, I think these statements were only >>> added to catch multiple frees, >> >> Yes these asserts() here to help catch bugs, >> and I think it is a good thing to have them here. >> >>> and those should be caught at a higher >>> level, where other consistency checks are also performed. >> >> Like where? > > Possibly rte_pktmbuf_free_seg(). > >>>>> Signed-off-by: Adrien Mazarguil >>>>> --- >>>>> lib/librte_mbuf/rte_mbuf.h | 2 -- >>>>> 1 file changed, 2 deletions(-) >>>>> >>>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h >>>>> index 11fa06d..7070bb8 100644 >>>>> --- a/lib/librte_mbuf/rte_mbuf.h >>>>> +++ b/lib/librte_mbuf/rte_mbuf.h >>>>> @@ -1108,7 +1108,6 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp) >>>>> if (rte_mempool_get(mp, &mb) < 0) >>>>> return NULL; >>>>> m = (struct rte_mbuf *)mb; >>>>> - RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0); >>>>> rte_mbuf_refcnt_set(m, 1); >>>>> __rte_mbuf_sanity_check(m, 0); >>>>> >>>>> @@ -1133,7 +1132,6 @@ __rte_mbuf_raw_alloc(struct rte_mempool *mp) >>>>> static inline void __attribute__((always_inline)) >>>>> __rte_mbuf_raw_free(struct rte_mbuf *m) >>>>> { >>>>> - RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0); >>>>> rte_mempool_put(m->pool, m); >>>>> } >>>>> >>>>> -- >>>>> 2.1.4 >>>> >>> >>> -- >>> Adrien Mazarguil >>> 6WIND >