DPDK patches and discussions
 help / color / mirror / Atom feed
From: Olivier Matz <olivier.matz@6wind.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	Adrien Mazarguil <adrien.mazarguil@6wind.com>
Subject: Re: [dpdk-dev] [PATCH] mbuf: remove inconsistent assert statements
Date: Wed, 8 Jun 2016 16:11:35 +0200	[thread overview]
Message-ID: <57582797.8050300@6wind.com> (raw)
In-Reply-To: <20160608135751.GM7621@6wind.com>

Hi Adrien, Konstantin,

I'm jumping in this (interesting) discussion. I already talked a
bit with Adrien in point to point, and I think its patch is valid.
Please see some comments below.


On 06/08/2016 03:57 PM, Adrien Mazarguil wrote:
> On Wed, Jun 08, 2016 at 01:09:18PM +0000, Ananyev, Konstantin wrote:
>>>
>>> Hi Konstantin,
>>>
>>> On Wed, Jun 08, 2016 at 10:34:17AM +0000, Ananyev, Konstantin wrote:
>>>> Hi Adrien,
>>>>
>>>>>
>>>>> An assertion failure occurs in __rte_mbuf_raw_free() (called by a few PMDs)
>>>>> when compiling DPDK with CONFIG_RTE_LOG_LEVEL=RTE_LOG_DEBUG and starting
>>>>> applications with a log level high enough to trigger it.
>>>>>
>>>>> While rte_mbuf_raw_alloc() sets refcount to 1, __rte_mbuf_raw_free()
>>>>> expects it to be 0.
>>>>> Considering users are not expected to reset the
>>>>> reference count to satisfy assert() and that raw functions are designed on
>>>>> purpose without safety belts, remove these checks.
>>>>
>>>> Yes, it refcnt supposed to be set to 0 by __rte_pktmbuf_prefree_seg().
>>>> Wright now, it is a user responsibility to make sure refcnt==0 before pushing
>>>> mbuf back to the pool.
>>>> Not sure why do you consider that wrong?
>>>
>>> I do not consider this wrong and I'm all for using assert() to catch
>>> programming errors, however in this specific case, I think they are
>>> inconsistent and misleading.
>>
>> Honestly, I don't understand why.
>> Right now the rule of thumb is - when mbuf is in the pool, it's refcnt should be equal zero.

What is the purpose of this? Is there some code relying on this?

>> Yes, as you pointed below - that rule probably can be changed to: 
>> when mbuf is in the pool, it's refcnt should equal one, and that would probably allow us
>> to speedup things a bit, but I suppose that's the matter of another aptch/discussion.
> 
> Agreed.

I agree this is somehow another discussion, but let's dive into :)

But since [1], it is allowed to call rte_mbuf_raw_alloc() in PMDs (all
PMDs were calling an internal function before). We could argue that
rte_mbuf_raw_free() should also be made public for PMDs.

As you said below, no-one promised that the free() reverts the malloc(),
but given the function names, one can legitimately expect that the
following code is valid:

m = __rte_mbuf_raw_alloc();
/* do nothing here */
__rte_mbuf_raw_free(m);

If no code relies on having the refcnt set to 0 when a mbuf is in
the pool, I suggest to relax this constraint as Adrien proposed.

Then, my opinion is that the refcount should be set to 1 in
rte_pktmbuf_reset(). And rte_pktmbuf_free() should not be allowed on
an uninitialized mbuf (yes, this would require some changes in PMDs).
This would open the door for bulk allocation/free in the mbuf api.

This could be done in several steps:

1/ remove these assert(), add introduce a public rte_mbuf_raw_free()
2/ announce that rte_pktmbuf_free() won't work on uninitialized mbufs
in a future version, and ensure that all PMD are inline with this
requirement
3/ later, move refcount initialization in rte_pktmbuf_reset()

What do you think?

Another option is to remove the rte_mbuf_raw_alloc()/rte_mbuf_raw_free()
and use mempool calls. But having a mbuf wrapper does not seem a bad
thing to me.

By the way, __rte_pktmbuf_prefree_seg() is also an internal function.
I think we should try to update the mbuf API so that the PMDs do not
need to call these internal functions.


[1]
http://dpdk.org/browse/dpdk/commit/?id=fbfd99551ca370266f4bfff58ce441cf5cb1203a


Regards,
Olivier


> 
>>>> If the user calls __rte_mbuf_raw_free() manualy it is his responsibility to make
>>>> sure mbuf's refcn==0.
>>>
>>> Sure, however what harm does it cause (besides assert() to fail), since the
>>> allocation function sets refcount to 1?
>>>
>>> Why having the allocation function set the refcount if we are sure it is
>>> already 0 (assert() proves it). Removing rte_mbuf_refcnt_set(m, 1) can
>>> surely improve performance.
>>
>> That's' just an assert() enabled when MBUF_DEBUG  is on.
>> Its sole purpose is to help troubleshoot the bugs and help to catch situations
>> when someone silently updates mbufs supposed to be free.  
> 
> I perfectly understand and I cannot agree more with this explanation,
> however the fact these functions are not symmetrical remains an issue that
> needs to be addressed somehow in my opinion.
> 
>>>> BTW, why are you doing it?
>>>> The comment clearly states that the function is for internal use:
>>>> /**
>>>>  * @internal Put mbuf back into its original mempool.
>>>>  * The use of that function is reserved for RTE internal needs.
>>>>  * Please use rte_pktmbuf_free().
>>>>  *
>>>>  * @param m
>>>>  *   The mbuf to be freed.
>>>>  */
>>>> static inline void __attribute__((always_inline))
>>>> __rte_mbuf_raw_free(struct rte_mbuf *m)
>>>
>>> Several PMDs are using it anyway (won't name names, but I know one of them
>>> quite well).
>>
>> Then it probably is a bug in these PMDs that need to be fixed.
>>
>>> I chose to modify this code instead of its users for the
>>> following reasons:
>>>
>>> - Considering their names, these functions should be opposites and able to
>>>   work together like malloc()/free().
>>
>> These are internal functions.
>> Comments in mbuf clearly state that library users shouldn't call them directly.
>> They are written to fit internal librte_mbuf needs, and no-one ever promised
>> malloc/free() compatibility here. 
> 
> So it cannot be provided for the sake of not providing it or is there a good
> reason?
> 
> What I meant is that since PMDs already made the mistake of using these
> functions to benefit from the improved performance, DPDK being all about
> performance and stuff, let them use it as intended. Perhaps we should drop
> those "__" like for rte_mbuf_raw_alloc().
> 
>>>
>>> - PMDs are relying on these functions for performance reasons, we can assume
>>>   they took the extra care necessary to make sure it would work properly.
>>
>> That just doesn't seem correct to me.
>> The proper way to do free fo mbuf segment is:
>>
>> static inline void __attribute__((always_inline))
>> rte_pktmbuf_free_seg(struct rte_mbuf *m)
>> {
>>         if (likely(NULL != (m = __rte_pktmbuf_prefree_seg(m)))) {
>>                 m->next = NULL;
>>                 __rte_mbuf_raw_free(m);
>>         }
>> }
>>
>> If by some reason you choose not to use this function, then it is your
>> responsibility to perform similar actions on your own before pushing mbuf into the pool.
>> That's what some TX functions for some Intel NICs do to improve performance:
>> they call _prefree_seg() manually and try to put mbufs into the pool in groups.
> 
> Not anymore it seems, but in the current code base both ena and mpipe PMDs
> (soon mlx5 as well) seem to get this wrong.
> 
>>> - Preventing it would make these PMDs slower and is not acceptable either.
>>
>> I can hardly imagine that __rte_pktmbuf_prefree_seg() impact would be that severe...
>> But ok, probably  you do have some very specific case, but then why you PMD just doesn't call:
>> rte_mempool_put(m->pool, m); 
>> directly?
> 
> To survive the upcoming transition to mbufs backed by libc malloc() without
> having to fix them? Joke aside, I guess the reason is to use functions with
> "mbuf" in their names when dealing with mbufs.
> 
>> Why instead you choose to change common functions and compromise
>> librte_mbuf debug ability?
> 
> No, I'm fine with keeping the debug ability, however I did not find a way to
> both keep it and fix the consistency issue without breaking something
> (performance or refcount assumptions I'm not familiar with elsewhere).
> 
>>> What remains is the consistency issue, I think these statements were only
>>> added to catch multiple frees,
>>
>> Yes these asserts() here to help catch bugs,
>> and I think it is a good thing to have them here. 
>>
>>> and those should be caught at a higher
>>> level, where other consistency checks are also performed.
>>
>> Like where?
> 
> Possibly rte_pktmbuf_free_seg().
> 
>>>>> Signed-off-by: Adrien Mazarguil <adrien.mazarguil@6wind.com>
>>>>> ---
>>>>>  lib/librte_mbuf/rte_mbuf.h | 2 --
>>>>>  1 file changed, 2 deletions(-)
>>>>>
>>>>> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
>>>>> index 11fa06d..7070bb8 100644
>>>>> --- a/lib/librte_mbuf/rte_mbuf.h
>>>>> +++ b/lib/librte_mbuf/rte_mbuf.h
>>>>> @@ -1108,7 +1108,6 @@ static inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
>>>>>  	if (rte_mempool_get(mp, &mb) < 0)
>>>>>  		return NULL;
>>>>>  	m = (struct rte_mbuf *)mb;
>>>>> -	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0);
>>>>>  	rte_mbuf_refcnt_set(m, 1);
>>>>>  	__rte_mbuf_sanity_check(m, 0);
>>>>>
>>>>> @@ -1133,7 +1132,6 @@ __rte_mbuf_raw_alloc(struct rte_mempool *mp)
>>>>>  static inline void __attribute__((always_inline))
>>>>>  __rte_mbuf_raw_free(struct rte_mbuf *m)
>>>>>  {
>>>>> -	RTE_ASSERT(rte_mbuf_refcnt_read(m) == 0);
>>>>>  	rte_mempool_put(m->pool, m);
>>>>>  }
>>>>>
>>>>> --
>>>>> 2.1.4
>>>>
>>>
>>> --
>>> Adrien Mazarguil
>>> 6WIND
> 

  reply	other threads:[~2016-06-08 14:11 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-06-08  8:31 Adrien Mazarguil
2016-06-08 10:34 ` Ananyev, Konstantin
2016-06-08 12:27   ` Adrien Mazarguil
2016-06-08 13:09     ` Ananyev, Konstantin
2016-06-08 13:57       ` Adrien Mazarguil
2016-06-08 14:11         ` Olivier Matz [this message]
2016-06-08 16:07           ` Ananyev, Konstantin
2016-06-09  7:46             ` Olivier Matz
2016-06-09 13:21               ` Ananyev, Konstantin
2016-06-09 14:19                 ` Bruce Richardson
2016-06-09 15:27                 ` Thomas Monjalon
2016-06-09 15:45                   ` Ananyev, Konstantin
2016-06-09 18:42                     ` Don Provan
2016-06-20 13:49   ` Adrien Mazarguil

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=57582797.8050300@6wind.com \
    --to=olivier.matz@6wind.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).