From: "Ananyev, Konstantin"
To: Yongseok Koh
CC: "Lu, Wenzhuo", "Wu, Jingjing", "olivier.matz@6wind.com", "dev@dpdk.org", "arybchenko@solarflare.com", "stephen@networkplumber.org", "thomas@monjalon.net", "adrien.mazarguil@6wind.com", "nelio.laranjeiro@6wind.com"
Date: Wed, 25 Apr 2018 17:23:20 +0000
Message-ID: <2601191342CEEE43887BDE71AB977258AEBCFD59@IRSMSX102.ger.corp.intel.com>
References: <20180310012532.15809-1-yskoh@mellanox.com> <20180425025341.10590-1-yskoh@mellanox.com> <2601191342CEEE43887BDE71AB977258AEBCF98C@IRSMSX102.ger.corp.intel.com> <20180425170638.GB3268@yongseok-MBP.local>
In-Reply-To: <20180425170638.GB3268@yongseok-MBP.local>
Subject: Re: [dpdk-dev] [PATCH v5 1/2] mbuf: support attaching external buffer to mbuf

> -----Original Message-----
> From: Yongseok Koh [mailto:yskoh@mellanox.com]
> Sent: Wednesday, April 25, 2018 6:07 PM
> To: Ananyev, Konstantin
> Cc: Lu, Wenzhuo; Wu, Jingjing; olivier.matz@6wind.com; dev@dpdk.org;
> arybchenko@solarflare.com; stephen@networkplumber.org; thomas@monjalon.net; adrien.mazarguil@6wind.com;
> nelio.laranjeiro@6wind.com
> Subject: Re: [PATCH v5 1/2] mbuf: support attaching external buffer to mbuf
>
> On Wed, Apr 25, 2018 at 01:31:42PM +0000, Ananyev, Konstantin wrote:
> [...]
> > >  /** Mbuf prefetch */
> > >  #define RTE_MBUF_PREFETCH_TO_FREE(m) do { \
> > >  	if ((m) != NULL) \
> > > @@ -1213,11 +1306,127 @@ static inline int rte_pktmbuf_alloc_bulk(struct rte_mempool *pool,
> > >  }
> > >
> > >  /**
> > > + * Attach an external buffer to a mbuf.
> > > + *
> > > + * User-managed anonymous buffer can be attached to an mbuf. When attaching
> > > + * it, corresponding free callback function and its argument should be
> > > + * provided.
This callback function will be called once all the mbufs are
> > > + * detached from the buffer.
> > > + *
> > > + * The headroom for the attaching mbuf will be set to zero and this can be
> > > + * properly adjusted after attachment. For example, ``rte_pktmbuf_adj()``
> > > + * or ``rte_pktmbuf_reset_headroom()`` can be used.
> > > + *
> > > + * More mbufs can be attached to the same external buffer by
> > > + * ``rte_pktmbuf_attach()`` once the external buffer has been attached by
> > > + * this API.
> > > + *
> > > + * Detachment can be done by either ``rte_pktmbuf_detach_extbuf()`` or
> > > + * ``rte_pktmbuf_detach()``.
> > > + *
> > > + * Attaching an external buffer is quite similar to mbuf indirection in
> > > + * replacing buffer addresses and length of a mbuf, but a few differences:
> > > + * - When an indirect mbuf is attached, refcnt of the direct mbuf would be
> > > + *   2 as long as the direct mbuf itself isn't freed after the attachment.
> > > + *   In such cases, the buffer area of a direct mbuf must be read-only. But
> > > + *   external buffer has its own refcnt and it starts from 1. Unless
> > > + *   multiple mbufs are attached to a mbuf having an external buffer, the
> > > + *   external buffer is writable.
> > > + * - There's no need to allocate buffer from a mempool. Any buffer can be
> > > + *   attached with appropriate free callback and its IO address.
> > > + * - Smaller metadata is required to maintain shared data such as refcnt.
> > > + *
> > > + * @warning
> > > + * @b EXPERIMENTAL: This API may change without prior notice.
> > > + * Once external buffer is enabled by allowing experimental API,
> > > + * ``RTE_MBUF_DIRECT()`` and ``RTE_MBUF_INDIRECT()`` are no longer
> > > + * exclusive. A mbuf can be considered direct if it is neither indirect nor
> > > + * having external buffer.
> > > + *
> > > + * @param m
> > > + *   The pointer to the mbuf.
> > > + * @param buf_addr
> > > + *   The pointer to the external buffer we're attaching to.
> > > + * @param buf_iova
> > > + *   IO address of the external buffer we're attaching to.
> > > + * @param buf_len
> > > + *   The size of the external buffer we're attaching to. If memory for
> > > + *   shared data is not provided, buf_len must be larger than the size of
> > > + *   ``struct rte_mbuf_ext_shared_info`` and padding for alignment. If not
> > > + *   enough, this function will return NULL.
> > > + * @param shinfo
> > > + *   User-provided memory for shared data. If NULL, a few bytes in the
> > > + *   trailer of the provided buffer will be dedicated for shared data and
> > > + *   the shared data will be properly initialized. Otherwise, user must
> > > + *   initialize the content except for free callback and its argument. The
> > > + *   pointer of shared data will be stored in m->shinfo.
> > > + * @param free_cb
> > > + *   Free callback function to call when the external buffer needs to be
> > > + *   freed.
> > > + * @param fcb_opaque
> > > + *   Argument for the free callback function.
> > > + *
> > > + * @return
> > > + *   A pointer to the new start of the data on success, return NULL
> > > + *   otherwise.
> > > + */
> > > +static inline char * __rte_experimental
> > > +rte_pktmbuf_attach_extbuf(struct rte_mbuf *m, void *buf_addr,
> > > +	rte_iova_t buf_iova, uint16_t buf_len,
> > > +	struct rte_mbuf_ext_shared_info *shinfo,
> > > +	rte_mbuf_extbuf_free_callback_t free_cb, void *fcb_opaque)
> > > +{
> > > +	/* Additional attachment should be done by rte_pktmbuf_attach() */
> > > +	RTE_ASSERT(!RTE_MBUF_HAS_EXTBUF(m));
> >
> > Shouldn't we have here something like:
> > RTE_ASSERT(RTE_MBUF_DIRECT(m) && rte_mbuf_refcnt_read(m) == 1);
> > ?
>
> Right. That's better. Attaching mbuf should be direct and writable.
>
> > > +
> > > +	m->buf_addr = buf_addr;
> > > +	m->buf_iova = buf_iova;
> > > +
> > > +	if (shinfo == NULL) {
> >
> > Instead of allocating shinfo ourselves - wouldn't it be better to rely
> > on caller always allocating and filling it for us (he can do that at the end/start of buffer,
> > or whenever he likes to).
>
> It is just for convenience. For some users, external attachment could be
> occasional and casual, e.g. punt control traffic from kernel/hv. For such
> non-serious cases, it is good to provide this small utility.

For such users that small utility could be a separate function then:
shinfo_inside_buf() or so.

>
> > Again in that case - caller can provide one shinfo to several mbufs (with different buf_addrs)
> > and would know for sure that free_cb wouldn't be overwritten by mistake.
> > I.E. mbuf code will only update refcnt inside shinfo.
>
> I think you missed the discussion with other people yesterday. This change is
> exactly for that purpose. Like I documented above, if this API is called with
> shinfo being provided, it will use the user-provided shinfo instead of sparing a
> few bytes in the trailer and won't touch the shinfo.

As far as I can see, your current code always updates free_cb and fcb_opaque.
Which is kind of strange: these fields should be the same for all instances of the shinfo.

> This code block happens only
> if user doesn't provide memory for shared data (shinfo is NULL).
>
> > > +		void *buf_end = RTE_PTR_ADD(buf_addr, buf_len);
> > > +
> > > +		shinfo = RTE_PTR_ALIGN_FLOOR(RTE_PTR_SUB(buf_end,
> > > +			sizeof(*shinfo)), sizeof(uintptr_t));
> > > +		if ((void *)shinfo <= buf_addr)
> > > +			return NULL;
> > > +
> > > +		m->buf_len = RTE_PTR_DIFF(shinfo, buf_addr);
> > > +		rte_mbuf_ext_refcnt_set(shinfo, 1);
> > > +	} else {
> > > +		m->buf_len = buf_len;
> >
> > I think you need to update shinfo->refcnt here too.
>
> Like explained above, if shinfo is provided, it doesn't alter anything except
> for callbacks and its arg.

Hm, but if I have 2 mbufs attached to the same external buffer via the same shinfo,
shouldn't shinfo.refcnt == 2?

>
>
> Thanks,
> Yongseok
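
For illustration only, the two review points above (a separate helper that
carves the shared info out of the buffer tail, and an attach path that touches
nothing in shinfo except the refcnt) can be sketched in standalone C. This is
not the patch's code and does not use DPDK headers: struct ext_shared_info,
struct toy_mbuf, shinfo_init(), toy_attach_extbuf(), toy_detach_extbuf() and
free_and_count() are hypothetical stand-ins, the helper name is borrowed from
the "shinfo_inside_buf() or so" suggestion, and the refcnt is a plain integer
where real code would presumably need an atomic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef void (*ext_free_cb_t)(void *addr, void *opaque);

/* Hypothetical stand-in for struct rte_mbuf_ext_shared_info. */
struct ext_shared_info {
	ext_free_cb_t free_cb;	/* set once by the buffer owner */
	void *fcb_opaque;	/* argument handed to free_cb */
	uint16_t refcnt;	/* number of mbufs using this shinfo */
};

/* Hypothetical stand-in for the few mbuf fields involved. */
struct toy_mbuf {
	void *buf_addr;
	uint16_t buf_len;
	struct ext_shared_info *shinfo;
};

/*
 * Place the shared info in the tail of the buffer, mirroring the
 * shinfo == NULL branch of the quoted patch. Returns NULL if the buffer
 * is too small; otherwise stores the remaining data length in *usable_len.
 */
static struct ext_shared_info *
shinfo_inside_buf(void *buf_addr, uint16_t buf_len, uint16_t *usable_len)
{
	uintptr_t start = (uintptr_t)buf_addr;
	uintptr_t addr;

	if (buf_len < sizeof(struct ext_shared_info))
		return NULL;
	/* Step back from the end, then align down to pointer size. */
	addr = start + buf_len - sizeof(struct ext_shared_info);
	addr &= ~(uintptr_t)(sizeof(uintptr_t) - 1);
	if (addr <= start)
		return NULL;
	*usable_len = (uint16_t)(addr - start);
	return (struct ext_shared_info *)addr;
}

/* The owner fills in the callback fields exactly once. */
static void
shinfo_init(struct ext_shared_info *si, ext_free_cb_t cb, void *opaque)
{
	si->free_cb = cb;
	si->fcb_opaque = opaque;
	si->refcnt = 0;
}

/* Attach only bumps the refcnt; free_cb and fcb_opaque stay untouched. */
static void
toy_attach_extbuf(struct toy_mbuf *m, void *buf_addr, uint16_t buf_len,
	struct ext_shared_info *si)
{
	assert(m->shinfo == NULL);	/* no external buffer attached yet */
	m->buf_addr = buf_addr;
	m->buf_len = buf_len;
	m->shinfo = si;
	si->refcnt++;
}

/* Detach drops one reference; the last one triggers the free callback. */
static void
toy_detach_extbuf(struct toy_mbuf *m)
{
	struct ext_shared_info *si = m->shinfo;
	void *addr = m->buf_addr;

	m->buf_addr = NULL;
	m->buf_len = 0;
	m->shinfo = NULL;
	if (--si->refcnt == 0)
		si->free_cb(addr, si->fcb_opaque);
}

/* Example callback: release a malloc()ed buffer and count the call. */
static void
free_and_count(void *addr, void *opaque)
{
	*(int *)opaque += 1;
	free(addr);
}
```

With this split, two toy mbufs attached through the same shinfo leave its
refcnt at 2, and the callback fires only when the second one is detached.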