From mboxrd@z Thu Jan 1 00:00:00 1970
From: Yongseok Koh
To: wenzhuo.lu@intel.com, jingjing.wu@intel.com, olivier.matz@6wind.com
Cc: dev@dpdk.org, konstantin.ananyev@intel.com, arybchenko@solarflare.com,
 stephen@networkplumber.org, thomas@monjalon.net, adrien.mazarguil@6wind.com,
 nelio.laranjeiro@6wind.com, Yongseok Koh
Date: Fri, 27 Apr 2018 10:22:51 -0700
Message-Id: <20180427172252.8153-1-yskoh@mellanox.com>
X-Mailer: git-send-email 2.11.0
In-Reply-To: <20180310012532.15809-1-yskoh@mellanox.com>
References: <20180310012532.15809-1-yskoh@mellanox.com>
MIME-Version: 1.0
Content-Type: text/plain
Subject: [dpdk-dev] [PATCH v8 1/2] mbuf: support attaching external buffer to mbuf
List-Id: DPDK patches and discussions

This patch introduces a new way of attaching an external buffer to a mbuf.

Attaching an external buffer is quite similar to mbuf indirection in
replacing buffer addresses and length of a mbuf, with a few differences:
  - When an indirect mbuf is attached, refcnt of the direct mbuf would be
    2 as long as the direct mbuf itself isn't freed after the attachment.
    In such cases, the buffer area of a direct mbuf must be read-only. But
    external buffer has its own refcnt and it starts from 1.
    Unless multiple mbufs are attached to a mbuf having an external buffer,
    the external buffer is writable.
  - There's no need to allocate buffer from a mempool. Any buffer can be
    attached with appropriate free callback.
  - Smaller metadata is required to maintain shared data such as refcnt.

Signed-off-by: Yongseok Koh
Acked-by: Konstantin Ananyev
Acked-by: Olivier Matz
Acked-by: Andrew Rybchenko
---

Deprecation of RTE_MBUF_INDIRECT() will follow after integration of this
patch.

** This patch can pass the mbuf_autotest. **

Submitting only non-mlx5 patches to meet deadline for RC1. mlx5 patches
will be submitted separately rebased on a different patchset which
accommodates new memory hotplug design to mlx PMDs.

v8:
* NO CHANGE.

v7:
* make buf_len param [in,out] in rte_pktmbuf_ext_shinfo_init_helper().
* a minor change from review.

v6:
* rte_pktmbuf_attach_extbuf() doesn't take NULL shinfo. Instead,
  rte_pktmbuf_ext_shinfo_init_helper() is added.
* bug fix in rte_pktmbuf_attach() - shinfo wasn't saved to mi.
* minor changes from review.

v5:
* rte_pktmbuf_attach_extbuf() sets headroom to 0.
* if shinfo is provided when attaching, user should initialize it.
* minor changes from review.

v4:
* rte_pktmbuf_attach_extbuf() takes new arguments - buf_iova and shinfo.
  user can pass memory for shared data via shinfo argument.
* minor changes from review.

v3:
* implement external buffer attachment instead of introducing buf_off for
  mbuf indirection.

 lib/librte_mbuf/rte_mbuf.h | 337 +++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 308 insertions(+), 29 deletions(-)

diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
index 0cd6a1c6b..4fd9a0d9e 100644
--- a/lib/librte_mbuf/rte_mbuf.h
+++ b/lib/librte_mbuf/rte_mbuf.h
@@ -345,7 +345,10 @@ extern "C" {
 		PKT_TX_MACSEC | \
 		PKT_TX_SEC_OFFLOAD)
 
-#define __RESERVED (1ULL << 61) /**< reserved for future mbuf use */
+/**
+ * Mbuf having an external buffer attached. shinfo in mbuf must be filled.
+ */
+#define EXT_ATTACHED_MBUF (1ULL << 61)
 
 #define IND_ATTACHED_MBUF (1ULL << 62) /**< Indirect attached mbuf */
 
@@ -585,8 +588,27 @@ struct rte_mbuf {
 	/** Sequence number. See also rte_reorder_insert(). */
 	uint32_t seqn;
 
+	/** Shared data for external buffer attached to mbuf. See
+	 * rte_pktmbuf_attach_extbuf().
+	 */
+	struct rte_mbuf_ext_shared_info *shinfo;
+
 } __rte_cache_aligned;
 
+/**
+ * Function typedef of callback to free externally attached buffer.
+ */
+typedef void (*rte_mbuf_extbuf_free_callback_t)(void *addr, void *opaque);
+
+/**
+ * Shared data at the end of an external buffer.
+ */
+struct rte_mbuf_ext_shared_info {
+	rte_mbuf_extbuf_free_callback_t free_cb; /**< Free callback function */
+	void *fcb_opaque; /**< Free callback argument */
+	rte_atomic16_t refcnt_atomic; /**< Atomically accessed refcnt */
+};
+
 /**< Maximum number of nb_segs allowed. */
 #define RTE_MBUF_MAX_NB_SEGS UINT16_MAX
 
@@ -707,14 +729,34 @@ rte_mbuf_to_baddr(struct rte_mbuf *md)
 }
 
 /**
+ * Returns TRUE if given mbuf is cloned by mbuf indirection, or FALSE
+ * otherwise.
+ *
+ * If a mbuf has its data in another mbuf and references it by mbuf
+ * indirection, this mbuf can be defined as a cloned mbuf.
+ */
+#define RTE_MBUF_CLONED(mb) ((mb)->ol_flags & IND_ATTACHED_MBUF)
+
+/**
  * Returns TRUE if given mbuf is indirect, or FALSE otherwise.
  */
-#define RTE_MBUF_INDIRECT(mb) ((mb)->ol_flags & IND_ATTACHED_MBUF)
+#define RTE_MBUF_INDIRECT(mb) RTE_MBUF_CLONED(mb)
+
+/**
+ * Returns TRUE if given mbuf has an external buffer, or FALSE otherwise.
+ *
+ * External buffer is a user-provided anonymous buffer.
+ */
+#define RTE_MBUF_HAS_EXTBUF(mb) ((mb)->ol_flags & EXT_ATTACHED_MBUF)
 
 /**
  * Returns TRUE if given mbuf is direct, or FALSE otherwise.
+ *
+ * If a mbuf embeds its own data after the rte_mbuf structure, this mbuf
+ * can be defined as a direct mbuf.
  */
-#define RTE_MBUF_DIRECT(mb) (!RTE_MBUF_INDIRECT(mb))
+#define RTE_MBUF_DIRECT(mb) \
+	(!((mb)->ol_flags & (IND_ATTACHED_MBUF | EXT_ATTACHED_MBUF)))
 
 /**
  * Private data in case of pktmbuf pool.
@@ -840,6 +882,58 @@ rte_mbuf_refcnt_set(struct rte_mbuf *m, uint16_t new_value)
 
 #endif /* RTE_MBUF_REFCNT_ATOMIC */
 
+/**
+ * Reads the refcnt of an external buffer.
+ *
+ * @param shinfo
+ *   Shared data of the external buffer.
+ * @return
+ *   Reference count number.
+ */
+static inline uint16_t
+rte_mbuf_ext_refcnt_read(const struct rte_mbuf_ext_shared_info *shinfo)
+{
+	return (uint16_t)(rte_atomic16_read(&shinfo->refcnt_atomic));
+}
+
+/**
+ * Set refcnt of an external buffer.
+ *
+ * @param shinfo
+ *   Shared data of the external buffer.
+ * @param new_value
+ *   Value set
+ */
+static inline void
+rte_mbuf_ext_refcnt_set(struct rte_mbuf_ext_shared_info *shinfo,
+	uint16_t new_value)
+{
+	rte_atomic16_set(&shinfo->refcnt_atomic, new_value);
+}
+
+/**
+ * Add given value to refcnt of an external buffer and return its new
+ * value.
+ *
+ * @param shinfo
+ *   Shared data of the external buffer.
+ * @param value
+ *   Value to add/subtract
+ * @return
+ *   Updated value
+ */
+static inline uint16_t
+rte_mbuf_ext_refcnt_update(struct rte_mbuf_ext_shared_info *shinfo,
+	int16_t value)
+{
+	if (likely(rte_mbuf_ext_refcnt_read(shinfo) == 1)) {
+		rte_mbuf_ext_refcnt_set(shinfo, 1 + value);
+		return 1 + value;
+	}
+
+	return (uint16_t)rte_atomic16_add_return(&shinfo->refcnt_atomic, value);
+}
+
 /** Mbuf prefetch */
 #define RTE_MBUF_PREFETCH_TO_FREE(m) do { \
 	if ((m) != NULL) \
@@ -1214,11 +1308,159 @@ static inline int rte_pktmbuf_alloc_bulk(struct rte_mempool *pool,
 }
 
 /**
+ * Initialize shared data at the end of an external buffer before attaching
+ * to a mbuf by ``rte_pktmbuf_attach_extbuf()``. This is not a mandatory
+ * initialization but a helper function to simply spare a few bytes at the
+ * end of the buffer for shared data. If shared data is allocated
+ * separately, this should not be called but application has to properly
+ * initialize the shared data according to its need.
+ *
+ * Free callback and its argument are saved and the refcnt is set to 1.
+ *
+ * @warning
+ * The value of buf_len will be reduced to RTE_PTR_DIFF(shinfo, buf_addr)
+ * after this initialization. This shall be used for
+ * ``rte_pktmbuf_attach_extbuf()``
+ *
+ * @param buf_addr
+ *   The pointer to the external buffer.
+ * @param [in,out] buf_len
+ *   The pointer to length of the external buffer. Input value must be
+ *   larger than the size of ``struct rte_mbuf_ext_shared_info`` and
+ *   padding for alignment. If not enough, this function will return NULL.
+ *   Adjusted buffer length will be returned through this pointer.
+ * @param free_cb
+ *   Free callback function to call when the external buffer needs to be
+ *   freed.
+ * @param fcb_opaque
+ *   Argument for the free callback function.
+ *
+ * @return
+ *   A pointer to the initialized shared data on success, return NULL
+ *   otherwise.
+ */
+static inline struct rte_mbuf_ext_shared_info *
+rte_pktmbuf_ext_shinfo_init_helper(void *buf_addr, uint16_t *buf_len,
+	rte_mbuf_extbuf_free_callback_t free_cb, void *fcb_opaque)
+{
+	struct rte_mbuf_ext_shared_info *shinfo;
+	void *buf_end = RTE_PTR_ADD(buf_addr, *buf_len);
+
+	shinfo = RTE_PTR_ALIGN_FLOOR(RTE_PTR_SUB(buf_end,
+				sizeof(*shinfo)), sizeof(uintptr_t));
+	if ((void *)shinfo <= buf_addr)
+		return NULL;
+
+	shinfo->free_cb = free_cb;
+	shinfo->fcb_opaque = fcb_opaque;
+	rte_mbuf_ext_refcnt_set(shinfo, 1);
+
+	*buf_len = RTE_PTR_DIFF(shinfo, buf_addr);
+	return shinfo;
+}
+
+/**
+ * Attach an external buffer to a mbuf.
+ *
+ * User-managed anonymous buffer can be attached to an mbuf. When attaching
+ * it, corresponding free callback function and its argument should be
+ * provided via shinfo. This callback function will be called once all the
+ * mbufs are detached from the buffer (refcnt becomes zero).
+ *
+ * The headroom for the attaching mbuf will be set to zero and this can be
+ * properly adjusted after attachment. For example, ``rte_pktmbuf_adj()``
+ * or ``rte_pktmbuf_reset_headroom()`` might be used.
+ *
+ * More mbufs can be attached to the same external buffer by
+ * ``rte_pktmbuf_attach()`` once the external buffer has been attached by
+ * this API.
+ *
+ * Detachment can be done by either ``rte_pktmbuf_detach_extbuf()`` or
+ * ``rte_pktmbuf_detach()``.
+ *
+ * Memory for shared data must be provided and user must initialize all of
+ * the content properly, especially free callback and refcnt. The pointer
+ * of shared data will be stored in m->shinfo.
+ * ``rte_pktmbuf_ext_shinfo_init_helper`` can help to simply spare a few
+ * bytes at the end of buffer for the shared data, store free callback and
+ * its argument and set the refcnt to 1. The following is an example:
+ *
+ *   struct rte_mbuf_ext_shared_info *shinfo =
+ *          rte_pktmbuf_ext_shinfo_init_helper(buf_addr, &buf_len,
+ *                                             free_cb, fcb_arg);
+ *   rte_pktmbuf_attach_extbuf(m, buf_addr, buf_iova, buf_len, shinfo);
+ *   rte_pktmbuf_reset_headroom(m);
+ *   rte_pktmbuf_adj(m, data_len);
+ *
+ * Attaching an external buffer is quite similar to mbuf indirection in
+ * replacing buffer addresses and length of a mbuf, with a few differences:
+ * - When an indirect mbuf is attached, refcnt of the direct mbuf would be
+ *   2 as long as the direct mbuf itself isn't freed after the attachment.
+ *   In such cases, the buffer area of a direct mbuf must be read-only. But
+ *   external buffer has its own refcnt and it starts from 1. Unless
+ *   multiple mbufs are attached to a mbuf having an external buffer, the
+ *   external buffer is writable.
+ * - There's no need to allocate buffer from a mempool. Any buffer can be
+ *   attached with appropriate free callback and its IO address.
+ * - Smaller metadata is required to maintain shared data such as refcnt.
+ *
+ * @warning
+ * @b EXPERIMENTAL: This API may change without prior notice.
+ * Once external buffer is enabled by allowing experimental API,
+ * ``RTE_MBUF_DIRECT()`` and ``RTE_MBUF_INDIRECT()`` are no longer
+ * exclusive. A mbuf can be considered direct if it is neither indirect nor
+ * having external buffer.
+ *
+ * @param m
+ *   The pointer to the mbuf.
+ * @param buf_addr
+ *   The pointer to the external buffer.
+ * @param buf_iova
+ *   IO address of the external buffer.
+ * @param buf_len
+ *   The size of the external buffer.
+ * @param shinfo
+ *   User-provided memory for shared data of the external buffer.
+ */
+static inline void __rte_experimental
+rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
+	rte_iova_t buf_iova, uint16_t buf_len,
+	struct rte_mbuf_ext_shared_info *shinfo)
+{
+	/* mbuf should not be read-only */
+	RTE_ASSERT(RTE_MBUF_DIRECT(m) && rte_mbuf_refcnt_read(m) == 1);
+	RTE_ASSERT(shinfo->free_cb != NULL);
+
+	m->buf_addr = buf_addr;
+	m->buf_iova = buf_iova;
+	m->buf_len = buf_len;
+
+	m->data_len = 0;
+	m->data_off = 0;
+
+	m->ol_flags |= EXT_ATTACHED_MBUF;
+	m->shinfo = shinfo;
+}
+
+/**
+ * Detach the external buffer attached to a mbuf, same as
+ * ``rte_pktmbuf_detach()``
+ *
+ * @param m
+ *   The mbuf having external buffer.
+ */
+#define rte_pktmbuf_detach_extbuf(m) rte_pktmbuf_detach(m)
+
+/**
  * Attach packet mbuf to another packet mbuf.
  *
- * After attachment we refer the mbuf we attached as 'indirect',
- * while mbuf we attached to as 'direct'.
- * The direct mbuf's reference counter is incremented.
+ * If the mbuf we are attaching to isn't a direct buffer and is attached to
+ * an external buffer, the mbuf being attached will be attached to the
+ * external buffer instead of mbuf indirection.
+ *
+ * Otherwise, the mbuf will be indirectly attached. After attachment we
+ * refer the mbuf we attached as 'indirect', while mbuf we attached to as
+ * 'direct'. The direct mbuf's reference counter is incremented.
  *
  * Right now, not supported:
  *  - attachment for already indirect mbuf (e.g. - mi has to be direct).
@@ -1232,19 +1474,20 @@ static inline int rte_pktmbuf_alloc_bulk(struct rte_mempool *pool,
  */
 static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
 {
-	struct rte_mbuf *md;
-
 	RTE_ASSERT(RTE_MBUF_DIRECT(mi) &&
 	    rte_mbuf_refcnt_read(mi) == 1);
 
-	/* if m is not direct, get the mbuf that embeds the data */
-	if (RTE_MBUF_DIRECT(m))
-		md = m;
-	else
-		md = rte_mbuf_from_indirect(m);
+	if (RTE_MBUF_HAS_EXTBUF(m)) {
+		rte_mbuf_ext_refcnt_update(m->shinfo, 1);
+		mi->ol_flags = m->ol_flags;
+		mi->shinfo = m->shinfo;
+	} else {
+		/* if m is not direct, get the mbuf that embeds the data */
+		rte_mbuf_refcnt_update(rte_mbuf_from_indirect(m), 1);
+		mi->priv_size = m->priv_size;
+		mi->ol_flags = m->ol_flags | IND_ATTACHED_MBUF;
+	}
 
-	rte_mbuf_refcnt_update(md, 1);
-	mi->priv_size = m->priv_size;
 	mi->buf_iova = m->buf_iova;
 	mi->buf_addr = m->buf_addr;
 	mi->buf_len = m->buf_len;
@@ -1260,7 +1503,6 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
 	mi->next = NULL;
 	mi->pkt_len = mi->data_len;
 	mi->nb_segs = 1;
-	mi->ol_flags = m->ol_flags | IND_ATTACHED_MBUF;
 	mi->packet_type = m->packet_type;
 	mi->timestamp = m->timestamp;
 
@@ -1269,12 +1511,52 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
 }
 
 /**
- * Detach an indirect packet mbuf.
+ * @internal used by rte_pktmbuf_detach().
  *
+ * Decrement the reference counter of the external buffer. When the
+ * reference counter becomes 0, the buffer is freed by pre-registered
+ * callback.
+ */
+static inline void
+__rte_pktmbuf_free_extbuf(struct rte_mbuf *m)
+{
+	RTE_ASSERT(RTE_MBUF_HAS_EXTBUF(m));
+	RTE_ASSERT(m->shinfo != NULL);
+
+	if (rte_mbuf_ext_refcnt_update(m->shinfo, -1) == 0)
+		m->shinfo->free_cb(m->buf_addr, m->shinfo->fcb_opaque);
+}
+
+/**
+ * @internal used by rte_pktmbuf_detach().
+ *
+ * Decrement the direct mbuf's reference counter. When the reference
+ * counter becomes 0, the direct mbuf is freed.
+ */
+static inline void
+__rte_pktmbuf_free_direct(struct rte_mbuf *m)
+{
+	struct rte_mbuf *md;
+
+	RTE_ASSERT(RTE_MBUF_INDIRECT(m));
+
+	md = rte_mbuf_from_indirect(m);
+
+	if (rte_mbuf_refcnt_update(md, -1) == 0) {
+		md->next = NULL;
+		md->nb_segs = 1;
+		rte_mbuf_refcnt_set(md, 1);
+		rte_mbuf_raw_free(md);
+	}
+}
+
+/**
+ * Detach a packet mbuf from external buffer or direct buffer.
+ *
+ *  - decrement refcnt and free the external/direct buffer if refcnt
+ *    becomes zero.
  *  - restore original mbuf address and length values.
  *  - reset pktmbuf data and data_len to their default values.
- *  - decrement the direct mbuf's reference counter. When the
- *  reference counter becomes 0, the direct mbuf is freed.
  *
  * All other fields of the given packet mbuf will be left intact.
  *
@@ -1283,10 +1565,14 @@ static inline void rte_pktmbuf_attach(struct rte_mbuf *mi, struct rte_mbuf *m)
  */
 static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 {
-	struct rte_mbuf *md = rte_mbuf_from_indirect(m);
 	struct rte_mempool *mp = m->pool;
 	uint32_t mbuf_size, buf_len, priv_size;
 
+	if (RTE_MBUF_HAS_EXTBUF(m))
+		__rte_pktmbuf_free_extbuf(m);
+	else
+		__rte_pktmbuf_free_direct(m);
+
 	priv_size = rte_pktmbuf_priv_size(mp);
 	mbuf_size = sizeof(struct rte_mbuf) + priv_size;
 	buf_len = rte_pktmbuf_data_room_size(mp);
@@ -1298,13 +1584,6 @@ static inline void rte_pktmbuf_detach(struct rte_mbuf *m)
 	rte_pktmbuf_reset_headroom(m);
 	m->data_len = 0;
 	m->ol_flags = 0;
-
-	if (rte_mbuf_refcnt_update(md, -1) == 0) {
-		md->next = NULL;
-		md->nb_segs = 1;
-		rte_mbuf_refcnt_set(md, 1);
-		rte_mbuf_raw_free(md);
-	}
 }
 
 /**
@@ -1328,7 +1607,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 
 	if (likely(rte_mbuf_refcnt_read(m) == 1)) {
 
-		if (RTE_MBUF_INDIRECT(m))
+		if (!RTE_MBUF_DIRECT(m))
 			rte_pktmbuf_detach(m);
 
 		if (m->next != NULL) {
@@ -1340,7 +1619,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
 
 	} else if (__rte_mbuf_refcnt_update(m, -1) == 0) {
 
-		if (RTE_MBUF_INDIRECT(m))
+		if (!RTE_MBUF_DIRECT(m))
 			rte_pktmbuf_detach(m);
 
 		if (m->next != NULL) {
-- 
2.11.0
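
Note for readers who want to experiment with the shared-data layout without
a DPDK build: the sketch below models, in plain C, how the helper carves the
shared data out of the tail of the buffer and how the free callback fires
when the last reference is dropped. It is not the code from this patch.
ext_shinfo, ext_shinfo_init(), ext_release() and counting_free_cb() are
hypothetical names standing in for struct rte_mbuf_ext_shared_info,
rte_pktmbuf_ext_shinfo_init_helper() and __rte_pktmbuf_free_extbuf(), and
the refcnt is assumed to be a plain int16_t instead of rte_atomic16_t, so
the model is single-threaded only.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for rte_mbuf_extbuf_free_callback_t. */
typedef void (*ext_free_cb_t)(void *addr, void *opaque);

/* Hypothetical stand-in for struct rte_mbuf_ext_shared_info.
 * Assumption: plain int16_t instead of rte_atomic16_t (not thread-safe). */
struct ext_shinfo {
	ext_free_cb_t free_cb; /* called when refcnt drops to zero */
	void *fcb_opaque;      /* second argument of free_cb */
	int16_t refcnt;        /* number of holders of the buffer */
};

/* Place the shared data at the end of the buffer, aligned down to
 * sizeof(uintptr_t), and shrink *buf_len to the usable part.
 * Returns NULL if the buffer is too small. */
static struct ext_shinfo *
ext_shinfo_init(void *buf_addr, uint16_t *buf_len,
	ext_free_cb_t free_cb, void *fcb_opaque)
{
	uintptr_t start = (uintptr_t)buf_addr;
	uintptr_t end = start + *buf_len;
	uintptr_t pos = (end - sizeof(struct ext_shinfo)) &
			~((uintptr_t)sizeof(uintptr_t) - 1);
	struct ext_shinfo *shinfo;

	if (pos <= start)
		return NULL;

	assert(free_cb != NULL);
	shinfo = (struct ext_shinfo *)pos;
	shinfo->free_cb = free_cb;
	shinfo->fcb_opaque = fcb_opaque;
	shinfo->refcnt = 1;

	*buf_len = (uint16_t)(pos - start);
	return shinfo;
}

/* Drop one reference and run the free callback on the last one.
 * Returns the number of references left. */
static int16_t
ext_release(struct ext_shinfo *shinfo, void *buf_addr)
{
	int16_t left = (int16_t)(shinfo->refcnt - 1);

	shinfo->refcnt = left;
	if (left == 0)
		shinfo->free_cb(buf_addr, shinfo->fcb_opaque);
	return left;
}

/* Example callback: counts invocations through opaque, then frees. */
static void
counting_free_cb(void *addr, void *opaque)
{
	(*(int *)opaque)++;
	free(addr);
}
```

With a malloc()ed buffer, calling ext_shinfo_init() and then taking one more
reference (refcnt++) means the first ext_release() leaves the buffer alive
and only the second one invokes the callback, which mirrors the behaviour
described for an external buffer shared by two mbufs.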