From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 103C6A0514;
	Wed, 15 Jan 2020 13:52:18 +0100 (CET)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 2DC4E1BF80;
	Wed, 15 Jan 2020 13:52:17 +0100 (CET)
Received: from EUR04-DB3-obe.outbound.protection.outlook.com
 (mail-eopbgr60079.outbound.protection.outlook.com [40.107.6.79])
 by dpdk.org (Postfix) with ESMTP id 7FF101BF7E
 for <dev@dpdk.org>; Wed, 15 Jan 2020 13:52:15 +0100 (CET)
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=GUAvqmKbgBiOqZ0BUj0k1tq6UymK8KlCU+N7bf2nuUQtPt4eDBU5wKok8a2HPd3a7wnhMJ6ruiaU2khrS3M/sIL50wAbjPdoVsxfcx5YtU6S/TO9/twRtytLVS5j8RtjFmC8F+7PxPShvk046q6WpbattQXqROEYoZLfdLoT462oRgSGSaSnX0v8Ek8X7jOjcIY7flFPyyF/XSBJZs3B9iUvAkfHacNa4q++QBfDAdp/HlDIVzrPCDXokz+agkcL3PcvMTKgWNxy2HpMregMH39oPGAAPDeB0Rlo0FmCyHglByfeulIANMplw9DWnvo5pwe33TOkp/V38imG33AGqQ==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=BOBaiUzkYqdwLbdSuRquvP5xR7b9Von+j1hM09LuPII=;
 b=UdCwRzs2v6cxIBnDUCUQWurVGsGcK67R/bXyO2Lu3TNqPvKOMLJM77HHhSZRJn7rmdoOUMyvRu2fgzoIkE6ORBKXRSWZsQQ1nlp/jKzRXZMleYmBVQiO6ghqvfP58hbXJ1wKZJD8/ZYUjzFGOVVu1tNXOD/Ec2LpKITFo3hQV08dKy2nLc1ZB6oSO8njahX8PUAI9xEtbCuyIMZGvphNs2cnKHnJw5qbaB4FFs7rE7rvxaBDFDWQHmLZtk/KkoCZfLGIGfzeNLseLm/hECQCu8jWClnCVTCiwGZf9HbgJakaPaEXiMWtae+IyYSGKliphPSrTKDtD+thPQw4qUNNGQ==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass
 smtp.mailfrom=mellanox.com; dmarc=pass action=none header.from=mellanox.com;
 dkim=pass header.d=mellanox.com; arc=none
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com;
 s=selector1;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=BOBaiUzkYqdwLbdSuRquvP5xR7b9Von+j1hM09LuPII=;
 b=V9TZXTMsS5krCc0J4injUWhh7r8eQF4BpNUF6Kn4Et5TvaUHftUvdmKpehhCbQ3O46Vw6cnt5ZCowJ+JBiDs6l8+4UrQleCMpa9vlmbm+tTHiMmLvFtXphujQEIzOVkOaPGe/EH0FikZjH00MNNkOmXadT6yDkkRV5+ZYbzN6Nc=
Received: from AM4PR05MB3265.eurprd05.prod.outlook.com (10.171.188.154) by
 AM4PR05MB3267.eurprd05.prod.outlook.com (10.171.188.28) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.2644.19; Wed, 15 Jan 2020 12:52:13 +0000
Received: from AM4PR05MB3265.eurprd05.prod.outlook.com
 ([fe80::68eb:ad79:71f4:110f]) by AM4PR05MB3265.eurprd05.prod.outlook.com
 ([fe80::68eb:ad79:71f4:110f%3]) with mapi id 15.20.2623.018; Wed, 15 Jan 2020
 12:52:13 +0000
From: Slava Ovsiienko <viacheslavo@mellanox.com>
To: Olivier Matz <olivier.matz@6wind.com>
CC: "dev@dpdk.org" <dev@dpdk.org>, Matan Azrad <matan@mellanox.com>, Raslan
 Darawsheh <rasland@mellanox.com>, Ori Kam <orika@mellanox.com>, Shahaf Shuler
 <shahafs@mellanox.com>, "stephen@networkplumber.org"
 <stephen@networkplumber.org>
Thread-Topic: [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer
Thread-Index: AQHVyu8biJrXrE/1QU6oREG6CoHT6KfrqdGw
Date: Wed, 15 Jan 2020 12:52:13 +0000
Message-ID: <AM4PR05MB32654D9B4F6E3B01CAAFF3B0D2370@AM4PR05MB3265.eurprd05.prod.outlook.com>
References: <20191118094938.192850-1-shahafs@mellanox.com>
 <1578993305-15165-1-git-send-email-viacheslavo@mellanox.com>
 <1578993305-15165-2-git-send-email-viacheslavo@mellanox.com>
 <20200114152713.GS22738@platinum>
In-Reply-To: <20200114152713.GS22738@platinum>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
authentication-results: spf=none (sender IP is )
 smtp.mailfrom=viacheslavo@mellanox.com; 
x-originating-ip: [95.164.10.10]
x-ms-publictraffictype: Email
x-ms-office365-filtering-ht: Tenant
x-ms-office365-filtering-correlation-id: c3416a1c-c3d4-4505-115b-08d799b9be57
x-ms-traffictypediagnostic: AM4PR05MB3267:|AM4PR05MB3267:
x-ld-processed: a652971c-7d2e-4d9b-a6a4-d149256f461b,ExtFwd
x-ms-exchange-transport-forked: True
x-microsoft-antispam-prvs: <AM4PR05MB32678DDD84193AFD7EE7FF2BD2370@AM4PR05MB3267.eurprd05.prod.outlook.com>
x-ms-oob-tlc-oobclassifiers: OLM:10000;
x-forefront-prvs: 02830F0362
x-forefront-antispam-report: SFV:NSPM;
 SFS:(10009020)(4636009)(39860400002)(366004)(376002)(136003)(396003)(346002)(199004)(189003)(55016002)(6506007)(81156014)(9686003)(8936002)(53546011)(6916009)(66946007)(64756008)(66476007)(7696005)(26005)(66446008)(76116006)(5660300002)(81166006)(478600001)(71200400001)(316002)(2906002)(54906003)(52536014)(186003)(8676002)(4326008)(66556008)(33656002)(86362001);
 DIR:OUT; SFP:1101; SCL:1; SRVR:AM4PR05MB3267;
 H:AM4PR05MB3265.eurprd05.prod.outlook.com; FPR:; SPF:None; LANG:en;
 PTR:InfoNoRecords; A:1; MX:1; 
received-spf: None (protection.outlook.com: mellanox.com does not designate
 permitted sender hosts)
x-ms-exchange-senderadcheck: 1
x-microsoft-antispam: BCL:0;
x-microsoft-antispam-message-info: yz1fhD306do7h5u1uLgvMPn9dtsiQcUXtQ5JVAw5tvPq2/Q4OuN2CqGSpIYosr6s/7VasxXw/h2z8+OJxVroQAjUaRS9Y4OBDrLFdRwmI0uMAvTxRw23H8VKiXILczjW3wxeAWWat2pnyVuYtW/9Aca4Ji0WEx2Gect8vs/p4Ctpy9TiCcNp8kotNr5869/qDJI6HjypUXvhvUYFRiqEtMAqPwUFfrCtjaflFM4fwobSkBBQ8wyuFhyQFJZACeIvVpdWfDWO09UOsLqGeVmjjT4h5ePo9m72htL3Ft0zpPNCkEWDcmbmVmHACBadoInP/YYTOeoDTOgmjX3IEouEa6bUyTuzmNc1jqjlHpI1g71a++qmxfpczsJswuy3AiZHmTOlZPu4Di7l6pcbZKqxakYuEYH2k3i2hPwL3BQhrPgfxupBMzZOeRdgcoUaWY51
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-OriginatorOrg: Mellanox.com
X-MS-Exchange-CrossTenant-Network-Message-Id: c3416a1c-c3d4-4505-115b-08d799b9be57
X-MS-Exchange-CrossTenant-originalarrivaltime: 15 Jan 2020 12:52:13.6231 (UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b
X-MS-Exchange-CrossTenant-mailboxtype: HOSTED
X-MS-Exchange-CrossTenant-userprincipalname: vOokD94NSZkhhJ819zM3Sl5Fi7Nm1q6CQsE8P3sSnjBVUiphpbCnrPI9dmjzcR1N42jCR9VEkN+5q49QcBOiXtWrbNL6nDoZSY10zxPjA04=
X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM4PR05MB3267
Subject: Re: [dpdk-dev] [PATCH v3 1/4] mbuf: detach mbuf with pinned
	external buffer
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

> -----Original Message-----
> From: Olivier Matz <olivier.matz@6wind.com>
> Sent: Tuesday, January 14, 2020 17:27
> To: Slava Ovsiienko <viacheslavo@mellanox.com>
> Cc: dev@dpdk.org; Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Ori Kam <orika@mellanox.com>; Shahaf Shuler
> <shahafs@mellanox.com>; stephen@networkplumber.org
> Subject: Re: [PATCH v3 1/4] mbuf: detach mbuf with pinned external buffer
>=20
> Hi Viacheslav,
>=20
> Please see some comments below.
>=20
> On Tue, Jan 14, 2020 at 09:15:02AM +0000, Viacheslav Ovsiienko wrote:
> > Update detach routine to check the mbuf pool type.
> >
> > Signed-off-by: Shahaf Shuler <shahafs@mellanox.com>
> > Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 64
> > +++++++++++++++++++++++++++++++++++++++++++---
> >  1 file changed, 60 insertions(+), 4 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index 219b110..8f486af 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -306,6 +306,46 @@ struct rte_pktmbuf_pool_private {
> >  	uint32_t flags; /**< reserved for future use. */  };
> >
> > +/**
> > + * When set pktmbuf mempool will hold only mbufs with pinned external
> > + * buffer. The external buffer will be attached on the mbuf at the
> > + * memory pool creation and will never be detached by the mbuf free ca=
lls.
> > + * mbuf should not contain any room for data after the mbuf structure.
> > + */
> > +#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
>=20
> Out of curiosity, why using a pool flag instead of flagging the mbufs?
> The reason I see is that adding a new m->flag would impact places where w=
e
> copy or reset the flags (it should be excluded). Is there another reason?
>=20
Can we introduce the new static flag for mbuf?
Yes, there is some problem - there are many places in DPDK where the flags
in new allocated mbufs are disregarded (ol_flags field is just set to zero)=
.
So, any flag set on allocation (even static, dynamic one is not possible to=
 handle at all)
would get lost. We could fix it in new application (this feature is address=
ed to the
new ones) and PMDs supporting this, the question is whether we are allowed =
to
define the new mbuf static (in meaning not dynamic) flag.

> > +/**
> > + * Returns TRUE if given mbuf has an pinned external buffer, or FALSE
> > + * otherwise. The pinned external buffer is allocated at pool
> > +creation
> > + * time and should not be freed.
> > + *
> > + * External buffer is a user-provided anonymous buffer.
> > + */
> > +#ifdef ALLOW_EXPERIMENTAL_API
> > +#define RTE_MBUF_HAS_PINNED_EXTBUF(mb)
> rte_mbuf_has_pinned_extbuf(mb)
> > +#else #define RTE_MBUF_HAS_PINNED_EXTBUF(mb) false #endif
>=20
> I suppose you added these lines because the compilation was broken after
> introducing the new __rte_experimental API, which is called from detach()=
.
>=20
> I find a bit strange that we require to do this. I don't see what would b=
e
> broken without the ifdef: an application compiled for 19.11 cannot use th=
e
> pinned-ext-buf feature (because it did not exist), so the modification lo=
oks
> safe to me.
Without ifdef compilation fails if there is no experimental API usage confi=
gured.

>=20
> > +
> > +__rte_experimental
> > +static inline uint32_t
>=20
> I don't think uint32_t is really better than uint64_t. I agree with Steph=
en that
> bool is the preferred choice, however if it breaks compilation in some ca=
ses, I
> think int is better.
Yes, bool causes compilation issues (just including stdbool.h causes build =
failure),
and, for example bool  is reserved gcc keyword if AltiVec support is enable=
d.
uint32_t is the type of priv->flags.=20

>=20
> > +rte_mbuf_has_pinned_extbuf(const struct rte_mbuf *m) {
> > +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> > +		/*
> > +		 * The mbuf has the external attached buffer,
> > +		 * we should check the type of the memory pool where
> > +		 * the mbuf was allocated from.
> > +		 */
> > +		struct rte_pktmbuf_pool_private *priv =3D
> > +			(struct rte_pktmbuf_pool_private *)
> > +				rte_mempool_get_priv(m->pool);
> > +
> > +		return priv->flags &
> RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
>=20
> What about introducing a rte_pktmbuf_priv_flags() function, on the same
> model than rte_pktmbuf_priv_size() or rte_pktmbuf_data_room_size()?

Nice idea, thanks. I think this routine can be not experimental and we woul=
d
get rid of ifdef and other stuff.

>=20
>=20
> > +	}
> > +	return 0;
> > +}
> > +
> >  #ifdef RTE_LIBRTE_MBUF_DEBUG
> >
> >  /**  check mbuf type in debug mode */ @@ -571,7 +611,8 @@ static
> > inline struct rte_mbuf *rte_mbuf_raw_alloc(struct rte_mempool *mp)
> > static __rte_always_inline void  rte_mbuf_raw_free(struct rte_mbuf *m)
> > {
> > -	RTE_ASSERT(RTE_MBUF_DIRECT(m));
> > +	RTE_ASSERT(!RTE_MBUF_CLONED(m) &&
> > +		  (!RTE_MBUF_HAS_EXTBUF(m) ||
> RTE_MBUF_HAS_PINNED_EXTBUF(m)));
> >  	RTE_ASSERT(rte_mbuf_refcnt_read(m) =3D=3D 1);
> >  	RTE_ASSERT(m->next =3D=3D NULL);
> >  	RTE_ASSERT(m->nb_segs =3D=3D 1);
> > @@ -1141,11 +1182,26 @@ static inline void rte_pktmbuf_detach(struct
> rte_mbuf *m)
> >  	uint32_t mbuf_size, buf_len;
> >  	uint16_t priv_size;
> >
> > -	if (RTE_MBUF_HAS_EXTBUF(m))
> > +	if (RTE_MBUF_HAS_EXTBUF(m)) {
> > +		/*
> > +		 * The mbuf has the external attached buffed,
> > +		 * we should check the type of the memory pool where
> > +		 * the mbuf was allocated from to detect the pinned
> > +		 * external buffer.
> > +		 */
> > +		struct rte_pktmbuf_pool_private *priv =3D
> > +			(struct rte_pktmbuf_pool_private *)
> > +				rte_mempool_get_priv(mp);
> > +
>=20
> It could be rte_pktmbuf_priv_flags() as said above.
>=20
> > +		if (priv->flags & RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF) {
> > +			RTE_ASSERT(m->shinfo =3D=3D NULL);
> > +			m->ol_flags =3D EXT_ATTACHED_MBUF;
> > +			return;
> > +		}
>=20
> I think it is not possible to have m->shinfo =3D=3D NULL (this comment is=
 also
> related to next patch, because shinfo init is done there). If you try to =
clone a
> mbuf that comes from an ext-pinned pool, it will crash. Here is the code =
from
> attach():
>=20
> 	static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct
> rte_mbuf *m)
> 	{
> 	        RTE_ASSERT(RTE_MBUF_DIRECT(mi) &&
> 	            rte_mbuf_refcnt_read(mi) =3D=3D 1);
>=20
> 	        if (RTE_MBUF_HAS_EXTBUF(m)) {
> 	                rte_mbuf_ext_refcnt_update(m->shinfo, 1); << HERE
> 	                mi->ol_flags =3D m->ol_flags;
> 	                mi->shinfo =3D m->shinfo;
> 		...
>=20
> The 2 alternatives I see are:
>=20
> - do not allow to clone these mbufs, but today there is no return value
>   to attach() functions, so there is no way to know if it failed or not
>=20
> - manage shinfo to support clones
>=20
> I think just ignoring the rte_mbuf_ext_refcnt_update() if shinfo =3D=3D N=
ULL is not
> an option, because if could result in recycling the extbuf while a clone =
still
> references it.
>=20
>=20
The clone for the mbufs with pinned buffers is not supposed at all, this wa=
s
chosen as development precondition. We can't touch the buf_adr/iova_addr fi=
elds in mbuf,
because there is no way to restore these pointers (nomore fixed offset betw=
een mbuf and data  buffer),
so rte_mbuf_detach() would be not operational at all.

Also, we can't deduce the mbuf base address from data buffer address.=20
We could add RTE_ASSERT to prevent attaching (to) mbufs with pinned data, w=
hat do
you think?

> >  		__rte_pktmbuf_free_extbuf(m);
> > -	else
> > +	} else {
> >  		__rte_pktmbuf_free_direct(m);
> > -
> > +	}
> >  	priv_size =3D rte_pktmbuf_priv_size(mp);
> >  	mbuf_size =3D (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
> >  	buf_len =3D rte_pktmbuf_data_room_size(mp);
> > --
> > 1.8.3.1
> >