From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shahaf Shuler
To: "olivier.matz@6wind.com", Thomas Monjalon, "dev@dpdk.org", "arybchenko@solarflare.com"
CC: Asaf Penso, Olga Shern, Alex Rosenbaum, "eagostini@nvidia.com"
Date: Mon, 18 Nov 2019 09:50:07 +0000
Message-ID: <20191118094938.192850-1-shahafs@mellanox.com>
List-Id: DPDK patches and discussions
Subject: [dpdk-dev] [RFC v20.20] mbuf: introduce pktmbuf pool with pinned external buffers

Today's pktmbuf pool contains only mbufs with no external buffers.
This means the data buffer of each mbuf is placed right after the mbuf
structure (plus the private data, when enabled).

In some cases the application wants the buffers to be allocated from a
different device in the platform, so that packets can be zero-copied
directly to the device memory. Examples of such devices are a GPU or a
storage device. For such cases the native pktmbuf pool does not fit,
since each mbuf would need to point to an external buffer.

To support the above, the pktmbuf pool will be populated with mbufs
pointing to the device buffers, using the mbuf external buffer feature.
The PMD will populate its receive queues with those buffers, so that
every received packet is scattered directly to the device memory.
In the other direction, placing the buffer pointer on the transmit
queues of the NIC makes the DMA fetch the device memory using
peer-to-peer communication.

Such an mbuf with an external buffer must be handled with care when the
mbuf is freed. Mainly, the external buffer must not be detached, so that
it can be reused for the next packet received.

This patch introduces a new flag in the rte_pktmbuf_pool_private
structure to specify that the mempool is for mbufs with pinned external
buffers. Upon detach this flag is checked and, if it is set, the buffer
is not detached.

A new mempool create wrapper is also introduced to help the application
create and populate such a mempool.

Signed-off-by: Shahaf Shuler
---
 lib/librte_mbuf/rte_mbuf.h | 75 ++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 69 insertions(+), 6 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 92d81972ab..e631dfff30 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -295,6 +295,13 @@ rte_mbuf_to_priv(struct rte_mbuf *m)
 }
 
 /**
+ * When set, pktmbuf mempool will hold only mbufs with pinned external buffer.
+ * The external buffer will be attached on the mbuf creation and will not be
+ * detached by the mbuf free calls.
+ * mbuf should not contain any room for data after the mbuf structure.
+ */
+#define RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF (1 << 0)
+/**
  * Private data in case of pktmbuf pool.
  *
  * A structure that contains some pktmbuf_pool-specific data that are
@@ -303,6 +310,7 @@ rte_mbuf_to_priv(struct rte_mbuf *m)
 struct rte_pktmbuf_pool_private {
 	uint16_t mbuf_data_room_size; /**< Size of data space in each mbuf. */
 	uint16_t mbuf_priv_size;      /**< Size of private area in each mbuf. */
+	uint32_t flags; /**< Use RTE_PKTMBUF_POOL_F_*. */
 };
 
 #ifdef RTE_LIBRTE_MBUF_DEBUG
@@ -660,6 +668,50 @@ rte_pktmbuf_pool_create(const char *name, unsigned n,
 	int socket_id);
 
 /**
+ * Create a mbuf pool with pinned external buffers.
+ *
+ * This function creates and initializes a packet mbuf pool that contains
+ * only mbufs with external buffer. It is a wrapper to rte_mempool functions.
+ *
+ * @param name
+ *   The name of the mbuf pool.
+ * @param n
+ *   The number of elements in the mbuf pool. The optimum size (in terms
+ *   of memory usage) for a mempool is when n is a power of two minus one:
+ *   n = (2^q - 1).
+ * @param cache_size
+ *   Size of the per-core object cache. See rte_mempool_create() for
+ *   details.
+ * @param priv_size
+ *   Size of application private area between the rte_mbuf structure
+ *   and the data buffer. This value must be aligned to RTE_MBUF_PRIV_ALIGN.
+ * @param socket_id
+ *   The socket identifier where the mempool memory should be allocated. The
+ *   value can be *SOCKET_ID_ANY* if there is no NUMA constraint for the
+ *   reserved zone.
+ * @param buffers
+ *   Array of buffers to be attached to the mbufs in the pool.
+ *   Array size should be n.
+ * @param buffers_len
+ *   Array of buffer lengths. buffers_len[i] describes the length of the
+ *   buffer pointed to by buffers[i].
+ * @return
+ *   The pointer to the new allocated mempool, on success. NULL on error
+ *   with rte_errno set appropriately. Possible rte_errno values include:
+ *    - E_RTE_NO_CONFIG - function could not get pointer to rte_config structure
+ *    - E_RTE_SECONDARY - function was called from a secondary process instance
+ *    - EINVAL - cache size provided is too large, or priv_size is not aligned.
+ *    - ENOSPC - the maximum number of memzones has already been allocated
+ *    - EEXIST - a memzone with the same name already exists
+ *    - ENOMEM - no appropriate memory area found in which to create memzone
+ */
+struct rte_mempool *
+rte_pktmbuf_ext_buffer_pool_create(const char *name, unsigned n,
+	unsigned cache_size, uint16_t priv_size,
+	int socket_id, void **buffers,
+	uint16_t *buffers_len);
+
+/**
  * Create a mbuf pool with a given mempool ops name
  *
  * This function creates and initializes a packet mbuf pool. It is
@@ -1137,25 +1189,36 @@ __rte_pktmbuf_free_direct(struct rte_mbuf *m)
 static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 {
 	struct rte_mempool *mp = m->pool;
+	struct rte_pktmbuf_pool_private *priv =
+		(struct rte_pktmbuf_pool_private *)rte_mempool_get_priv(mp);
+	uint8_t pinned_ext_mbuf = priv->flags &
+				  RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF;
 	uint32_t mbuf_size, buf_len;
 	uint16_t priv_size;
 
-	if (RTE_MBUF_HAS_EXTBUF(m))
-		__rte_pktmbuf_free_extbuf(m);
-	else
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		if (pinned_ext_mbuf) {
+			m->ol_flags = EXT_ATTACHED_MBUF;
+			goto reset_data;
+		} else {
+			__rte_pktmbuf_free_extbuf(m);
+		}
+	} else {
 		__rte_pktmbuf_free_direct(m);
+	}
 
-	priv_size = rte_pktmbuf_priv_size(mp);
+	priv_size = priv->mbuf_priv_size;
 	mbuf_size = (uint32_t)(sizeof(struct rte_mbuf) + priv_size);
-	buf_len = rte_pktmbuf_data_room_size(mp);
+	buf_len = priv->mbuf_data_room_size;
 
 	m->priv_size = priv_size;
 	m->buf_addr = (char *)m + mbuf_size;
 	m->buf_iova = rte_mempool_virt2iova(m) + mbuf_size;
 	m->buf_len = (uint16_t)buf_len;
+	m->ol_flags = 0;
+reset_data:
 	rte_pktmbuf_reset_headroom(m);
 	m->data_len = 0;
-	m->ol_flags = 0;
 }
 
 /**
-- 
2.12.0
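
The detach behaviour described in this patch can be modelled outside DPDK.
What follows is a minimal, self-contained sketch under stated assumptions,
not the patch's implementation. Every toy_* name, the 256-byte inline room
and the headroom of 128 are hypothetical and chosen only for illustration.
The refcount handling done by __rte_pktmbuf_free_extbuf() and
__rte_pktmbuf_free_direct() is deliberately left out. The sketch shows only
the control flow: an mbuf from a pool carrying the pinned flag keeps its
external buffer across detach, while any other mbuf is pointed back at its
own inline data room.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All toy_* / TOY_* names are hypothetical stand-ins, not DPDK definitions. */
#define TOY_POOL_F_PINNED_EXT_BUF (1u << 0)    /* models RTE_PKTMBUF_POOL_F_PINNED_EXT_BUF */
#define TOY_EXT_ATTACHED          (1ull << 61) /* models EXT_ATTACHED_MBUF */
#define TOY_HEADROOM              128u         /* assumed default headroom */
#define TOY_INLINE_ROOM           256u         /* assumed inline data room */

/* Models rte_pktmbuf_pool_private with the new flags field. */
struct toy_pool_priv {
	uint16_t data_room_size;
	uint16_t priv_size;
	uint32_t flags;
};

/* Heavily reduced model of struct rte_mbuf. */
struct toy_mbuf {
	const struct toy_pool_priv *pool;
	void *buf_addr;
	uint16_t buf_len;
	uint16_t data_off;
	uint16_t data_len;
	uint64_t ol_flags;
	char inline_room[TOY_INLINE_ROOM];
};

/* Headroom is the default value, capped by the buffer length. */
static void toy_reset_headroom(struct toy_mbuf *m)
{
	m->data_off = (uint16_t)(m->buf_len < TOY_HEADROOM ?
				 m->buf_len : TOY_HEADROOM);
}

/* Done once, when the pool is populated: pin an external buffer. */
static void toy_attach_pinned(struct toy_mbuf *m,
			      const struct toy_pool_priv *pool,
			      void *buf, uint16_t len)
{
	assert(m != NULL && pool != NULL && buf != NULL);
	assert(pool->flags & TOY_POOL_F_PINNED_EXT_BUF);
	m->pool = pool;
	m->buf_addr = buf;
	m->buf_len = len;
	m->data_len = 0;
	m->ol_flags = TOY_EXT_ATTACHED;
	toy_reset_headroom(m);
}

/* Mirrors the control flow of the patched detach routine. */
static void toy_detach(struct toy_mbuf *m)
{
	assert(m != NULL && m->pool != NULL);
	int pinned = (m->pool->flags & TOY_POOL_F_PINNED_EXT_BUF) != 0;

	if ((m->ol_flags & TOY_EXT_ATTACHED) && pinned) {
		/* Keep buf_addr and buf_len; clear every other flag. */
		m->ol_flags = TOY_EXT_ATTACHED;
	} else {
		/* Regular path: point back at the inline data room. */
		assert(m->pool->data_room_size <= TOY_INLINE_ROOM);
		m->buf_addr = m->inline_room;
		m->buf_len = m->pool->data_room_size;
		m->ol_flags = 0;
	}
	toy_reset_headroom(m);
	m->data_len = 0;
}
```

In this model a pinned pool would be described with data_room_size set to 0,
matching the statement in the patch that such mbufs carry no data room after
the mbuf structure.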