Subject: Re: [PATCH v3] ip_frag: add IPv4 fragment copy packet API
From: Konstantin Ananyev
Date: Sat, 23 Jul 2022 19:25:19 +0100
To: Morten Brørup, Huichao Cai
Cc: dev@dpdk.org, Stephen Hemminger, Olivier Matz, Yuying Zhang, Beilei Xing, Matan Azrad, Viacheslav Ovsiienko
List-Id: DPDK patches and discussions

23/07/2022 09:24, Morten Brørup wrote:
> +CC: i40e maintainers
> +CC: mlx5 maintainers
>
>> From: Konstantin Ananyev [mailto:konstantin.v.ananyev@yandex.ru]
>> Sent: Saturday, 23 July 2022 00.35
>>
>> 22/07/2022 17:14, Morten Brørup wrote:
>>> From: Huichao Cai [mailto:chcchc88@163.com]
>>> Sent: Friday, 22 July 2022 17.59
>>>
>>>> At 2022-07-22 23:52:28, "Morten Brørup" wrote:
>>>>>> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
>>>>>> Sent: Friday, 22 July 2022 16.49
>>>>>>
>>>>>> On Fri, 22 Jul 2022 21:01:50 +0800
>>>>>> Huichao Cai wrote:
>>>>>>
>>>>>>> Some NIC drivers support MBUF_FAST_FREE(Device supports optimization
>>>>>>> for fast release of mbufs. When set application must guarantee that
>>>>>>> per-queue all mbufs comes from the same mempool and has refcnt = 1)
>>>>>>> offload. In order to adapt to this offload function, add this API.
>>>>>>> Add some test data for this API.
>>>>>>>
>>>>>>> Signed-off-by: Huichao Cai
>>>>>>
>>>>>> The code should just be checking that refcnt == 1 directly.
>>>>>>
>>>>>> There are cases where sender passes a cloned mbuf. This is independent
>>>>>> of the fast free optimization.
>>>>>>
>>>>>> Similar to what Linux kernel does with skb_cow().
>>>>>
>>>>> Olivier just confirmed that MBUF_FAST_FREE requires that the mbufs are direct and non-segmented, although these requirements are not yet documented.
>>>>>
>>>>> This means that you should not generate segmented mbufs with this patch. I don't know what to do instead; probably fail with an appropriate errno.
>>>>
>>>> When the bnxt driver sends mbuf, it will take the mbuf segments apart and hang it to the tx_buf_ring, so there is no mbuf segments when it is released. Does this mean that there can be mbuf segments?
>>>
>>> Only if the bnxt driver also resets the segmentation fields (nb_segs and next) in those mbufs, which I suppose it does, if it supports MBUF_FAST_FREE with segmented packets.
>>>
>>> However, other Ethernet drivers don't do that, so a generic library function cannot rely on it. These missing requirements for MBUF_FAST_FREE is a bug, either in the MBUF_FAST_FREE documentation, or in the drivers where MBUF_FAST_FREE only works correctly with direct and non-segmented mbufs.
>>>
>>
>> I believe multi-segment packets work ok with MBUF_FAST_FREE
>> (as long as other requirements are met).
>
> Looking at the i40e and mlx5 drivers, they both seem to call rte_mempool_put_bulk() without first calling rte_pktmbuf_prefree_seg(). So segmented packets freed with MBUF_FAST_FREE, will be stored in the mbuf pool without m->nb_segs and m->next being reset first.
>
> I don't have deep knowledge of these drivers, so maybe I have overlooked something.
>
> The point of MBUF_FAST_FREE is to bypass a lot of code under certain conditions. So I believe that these two undocumented requirements should remain, so the drivers can bypass this code. Otherwise, don't use MBUF_FAST_FREE.
>

Actually, after another look, I think you and Olivier are right - multi-seg packets should not be used together with MBUF_FAST_FREE.
I forgot that rte_pktmbuf_prefree_seg() is responsible for resetting both the 'next' and 'nb_segs' fields of the mbuf.
It might keep working for a simple forwarding app (like l3fwd), since most PMDs reset these fields on the RX path anyway, but that is a coincidence we shouldn't rely on.

We probably need to update l3fwd (and other examples) to disallow MBUF_FAST_FREE when TX_OFFLOAD_MULTI_SEGS is selected.

Konstantin
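
To make the hazard discussed in this thread concrete, here is a minimal, self-contained C sketch. It is a toy model under stated assumptions, not DPDK code: struct toy_mbuf, struct toy_pool, the free_chain_*() helpers, sanitize_tx_offloads() and the two TX_OFFLOAD_* bit values are all hypothetical stand-ins. The first free routine models a fast-free style path that returns segments to the pool untouched; the second models a path that resets 'next' and 'nb_segs' before returning them; the last helper sketches the kind of check an example app could apply to its requested TX offloads.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in flags; NOT the real DPDK bit values. */
#define TX_OFFLOAD_MULTI_SEGS     (UINT64_C(1) << 0)
#define TX_OFFLOAD_MBUF_FAST_FREE (UINT64_C(1) << 1)

/* Toy mbuf holding only the two fields discussed in this thread. */
struct toy_mbuf {
    struct toy_mbuf *next; /* next segment, NULL for the last one */
    uint16_t nb_segs;      /* segment count, meaningful in the head */
};

/* Toy LIFO pool standing in for a mempool. */
#define POOL_CAP 8
struct toy_pool {
    struct toy_mbuf *slot[POOL_CAP];
    unsigned int count;
};

static void pool_put(struct toy_pool *p, struct toy_mbuf *m)
{
    assert(p->count < POOL_CAP);
    p->slot[p->count++] = m;
}

static struct toy_mbuf *pool_get(struct toy_pool *p)
{
    return p->count ? p->slot[--p->count] : NULL;
}

/* Fast-free model: segments go back to the pool with no cleanup,
 * so 'next' and 'nb_segs' keep whatever values they had. */
static void free_chain_fast(struct toy_pool *p, struct toy_mbuf *m)
{
    while (m != NULL) {
        struct toy_mbuf *next = m->next;
        pool_put(p, m);
        m = next;
    }
}

/* Regular-free model: reset the segmentation fields of each segment
 * before it is returned to the pool. */
static void free_chain_reset(struct toy_pool *p, struct toy_mbuf *m)
{
    while (m != NULL) {
        struct toy_mbuf *next = m->next;
        m->next = NULL;
        m->nb_segs = 1;
        pool_put(p, m);
        m = next;
    }
}

/* Sketch of the proposed policy: if multi-segment TX is requested,
 * drop the fast-free offload from the requested set. */
static uint64_t sanitize_tx_offloads(uint64_t offloads)
{
    if (offloads & TX_OFFLOAD_MULTI_SEGS)
        offloads &= ~TX_OFFLOAD_MBUF_FAST_FREE;
    return offloads;
}
```

In this model, a two-segment chain freed with free_chain_fast() comes back out of the pool with its head still pointing at the old second segment, while free_chain_reset() hands back clean single-segment buffers. Whether a given real driver behaves like the first or the second routine has to be checked in that driver's TX cleanup code.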