From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <olivier.matz@6wind.com>
Received: from mail.droids-corp.org (zoll.droids-corp.org [94.23.50.67])
 by dpdk.org (Postfix) with ESMTP id 17AF55902
 for <dev@dpdk.org>; Mon, 16 May 2016 10:52:51 +0200 (CEST)
Received: from was59-1-82-226-113-214.fbx.proxad.net ([82.226.113.214]
 helo=[192.168.0.10])
 by mail.droids-corp.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128)
 (Exim 4.84_2) (envelope-from <olivier.matz@6wind.com>)
 id 1b2EIi-00050j-IM; Mon, 16 May 2016 10:54:56 +0200
To: Hiroyuki Mikita <h.mikita89@gmail.com>
References: <1463327436-6863-1-git-send-email-h.mikita89@gmail.com>
Cc: dev@dpdk.org, "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
From: Olivier Matz <olivier.matz@6wind.com>
Message-ID: <57398A5C.2050802@6wind.com>
Date: Mon, 16 May 2016 10:52:44 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101
 Icedove/38.6.0
MIME-Version: 1.0
In-Reply-To: <1463327436-6863-1-git-send-email-h.mikita89@gmail.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH] mbuf: decrease refcnt when detaching
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Mon, 16 May 2016 08:52:51 -0000

Hi Hiroyuki,


On 05/15/2016 05:50 PM, Hiroyuki Mikita wrote:
> The rte_pktmbuf_detach() function should decrease refcnt on a direct
> buffer.
> 
> Signed-off-by: Hiroyuki Mikita <h.mikita89@gmail.com>
> ---
>  lib/librte_mbuf/rte_mbuf.h | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> index 529debb..3b117ca 100644
> --- a/lib/librte_mbuf/rte_mbuf.h
> +++ b/lib/librte_mbuf/rte_mbuf.h
> @@ -1468,9 +1468,11 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
>   */
>  static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
>  {
> +	struct rte_mbuf *md = rte_mbuf_from_indirect(m);
>  	struct rte_mempool *mp = m->pool;
>  	uint32_t mbuf_size, buf_len, priv_size;
>  
> +	rte_mbuf_refcnt_update(md, -1);
>  	priv_size = rte_pktmbuf_priv_size(mp);
>  	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
>  	buf_len = rte_pktmbuf_data_room_size(mp);
> @@ -1498,7 +1500,7 @@ __rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
>  		if (RTE_MBUF_INDIRECT(m)) {
>  			struct rte_mbuf *md = rte_mbuf_from_indirect(m);
>  			rte_pktmbuf_detach(m);
> -			if (rte_mbuf_refcnt_update(md, -1) == 0)
> +			if (rte_mbuf_refcnt_read(md) == 0)
>  				__rte_mbuf_raw_free(md);
>  		}
>  		return m;
> 

Thanks for submitting this patch. I agree that rte_pktmbuf_attach()
and rte_pktmbuf_detach() are not symmetrical, but I think your patch
could trigger some race conditions.

Example:

- init: m, c1 and c2 are direct mbufs
- rte_pktmbuf_attach(c1, m);  # c1 becomes a clone of m
- rte_pktmbuf_attach(c2, m);  # c2 becomes another clone of m
- rte_pktmbuf_free(m);
- after that, we have:
  - m is a direct mbuf with refcnt = 2
  - c1 is an indirect mbuf pointing to data of m
  - c2 is an indirect mbuf pointing to data of m
- if we call rte_pktmbuf_free(c1) and rte_pktmbuf_free(c2) on 2
  different cores at the same time, m can be freed twice because
  (rte_mbuf_refcnt_read(md) == 0) can be true on both cores.
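
To make the interleaving concrete, here is a minimal sketch that replays it
sequentially. It does not use the real DPDK types: struct toy_mbuf and the
toy_* helpers are hypothetical stand-ins that model only the reference
counter, and the two "cores" are simulated by ordering the calls by hand.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct rte_mbuf: only the refcnt is modeled. */
struct toy_mbuf {
	atomic_int refcnt;
};

/* Patched detach path, step 1: drop the reference on the direct mbuf. */
static void toy_detach_dec(struct toy_mbuf *md)
{
	atomic_fetch_sub(&md->refcnt, 1);
}

/* Patched prefree path, step 2: a separate read decides whether to free. */
static int toy_should_free(struct toy_mbuf *md)
{
	return atomic_load(&md->refcnt) == 0;
}

/* Replay the bad interleaving; returns how many cores would free m. */
static int toy_replay_race(void)
{
	struct toy_mbuf m;
	int frees = 0;

	atomic_init(&m.refcnt, 2);     /* state after rte_pktmbuf_free(m) */

	toy_detach_dec(&m);            /* core A frees c1: refcnt 2 -> 1 */
	toy_detach_dec(&m);            /* core B frees c2: refcnt 1 -> 0 */
	frees += toy_should_free(&m);  /* core A reads 0 */
	frees += toy_should_free(&m);  /* core B reads 0 as well */

	return frees;
}
```

With this ordering toy_replay_race() returns 2: each decrement is atomic on
its own, but the decision to free is taken from a later, separate read, so
both cores can observe 0.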

I think the proper way to do this would be to have rte_pktmbuf_detach()
return the value of rte_mbuf_refcnt_update(md, -1), ensuring that
only one core will call __rte_mbuf_raw_free().
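
A rough sketch of that idea, again with hypothetical toy types rather than
the real DPDK structures: toy_detach() stands in for a detach that returns
the counter value produced by its own atomic decrement, and the caller
frees only when that returned value is 0.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct rte_mbuf: only the refcnt is modeled. */
struct toy_mbuf {
	atomic_int refcnt;
};

/*
 * Hypothetical detach: decrement and return the new value in one atomic
 * step, in the spirit of returning rte_mbuf_refcnt_update(md, -1).
 */
static int toy_detach(struct toy_mbuf *md)
{
	/* atomic_fetch_sub() returns the old value, so subtract 1 again. */
	return atomic_fetch_sub(&md->refcnt, 1) - 1;
}

/* Same ordering of the two cores as in the example above. */
static int toy_replay_fixed(void)
{
	struct toy_mbuf m;
	int frees = 0;
	int a, b;

	atomic_init(&m.refcnt, 2);  /* state after rte_pktmbuf_free(m) */

	a = toy_detach(&m);         /* core A frees c1: gets 1 */
	b = toy_detach(&m);         /* core B frees c2: gets 0 */
	frees += (a == 0);          /* core A must not free m */
	frees += (b == 0);          /* core B is the only one to free m */

	return frees;
}
```

Here toy_replay_fixed() returns 1. Since each caller acts on the result of
its own decrement, exactly one of them sees 0, whatever the ordering.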

In the unit tests, in test_attach_from_different_pool(), the mbuf m
is never freed due to this behavior. That shows the current API is
a bit misleading. I think it should also be fixed in the patch.

Another issue is that it will break the API.
To avoid issues in applications relying on the current behavior of
rte_pktmbuf_detach(), I'd say we should keep the function as-is and
mark it as deprecated. Then, introduce a new function
rte_pktmbuf_detach2() (or any better name :) ) that changes the
behavior to what you suggest. An entry in the release notes would
also be helpful.

The other option is to leave things as-is and just document the behavior
of rte_pktmbuf_detach(), explicitly saying that it is not symmetrical
with the attach. But I'd prefer the first option.


Thoughts?

Regards,
Olivier