From: Konstantin Ananyev
To: Dariusz Sosnowski, Stephen Hemminger
CC: "NBU-Contact-Thomas Monjalon (EXTERNAL)", Ferruh Yigit, Andrew Rybchenko, Ori Kam, dev@dpdk.org
Subject: RE: [RFC] ethdev: fast path async flow API
Date: Thu, 4 Jan 2024 08:47:02 +0000
Message-ID: <4efb00a7f6f3406ab819424ac7a25542@huawei.com>
References: <20231227105709.1951231-1-dsosnowski@nvidia.com> <20231228091657.14769682@hermes.local>

> > This is a blocker, showstopper for me.

+1

> > Have you considered having something like
> > 	rte_flow_create_bulk()
> >
> > or better yet a Linux iouring style API?
> >
> > A ring style API would allow for better mixed operations across the board and
> > get rid of the I-cache overhead which is the root cause of the need for inlining.
> The existing async flow API is somewhat close to the io_uring interface.
> The difference is that the queue is not directly exposed to the application.
> The application interacts with the queue using the rte_flow_async_* APIs (e.g., it places operations in the queue and pushes them to the HW).
> Such a design has some benefits over a flow API which exposes the queue to the user:
> - Easier to use - applications do not manage the queue directly; they do it through the exposed APIs.
> - Consistent with other DPDK APIs - in other libraries, queues are manipulated through an API, not directly by the application.
> - Lower memory usage - only HW primitives are needed (e.g., the HW queue on the PMD side); there is no need to allocate separate application queues.
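
For readers who have not used these calls, the enqueue/push/poll pattern described above looks roughly like the minimal sketch below. It is an illustration only: the helper name is made up, the template table and the pattern/actions arrays are assumed to have been created beforehand (rte_flow_configure(), template and table creation), and queue-capacity/error handling is trimmed.

#include <rte_flow.h>

/* Sketch: enqueue one flow rule creation on flow queue "queue_id",
 * push it to the HW and poll for its completion.
 */
static int
create_one_flow(uint16_t port_id, uint32_t queue_id,
		struct rte_flow_template_table *tbl,
		const struct rte_flow_item pattern[],
		const struct rte_flow_action actions[])
{
	const struct rte_flow_op_attr op_attr = { .postpone = 0 };
	struct rte_flow_op_result res[1];
	struct rte_flow_error error;
	struct rte_flow *flow;
	int n;

	/* Place the creation operation in the queue (template index 0
	 * for both pattern and actions, no user_data).
	 */
	flow = rte_flow_async_create(port_id, queue_id, &op_attr, tbl,
				     pattern, 0, actions, 0, NULL, &error);
	if (flow == NULL)
		return -1;

	/* Push all operations queued so far to the HW. */
	if (rte_flow_push(port_id, queue_id, &error) < 0)
		return -1;

	/* Poll until the result of the enqueued operation is available. */
	do {
		n = rte_flow_pull(port_id, queue_id, res, 1, &error);
	} while (n == 0);

	return (n == 1 && res[0].status == RTE_FLOW_OP_SUCCESS) ? 0 : -1;
}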
> Bulking of flow operations is a tricky subject.
> Compared to packet processing, where it is desirable to keep the manipulation of raw packet data to a minimum (e.g., only packet headers are accessed),
> during flow rule creation all items and actions must be processed by the PMD to create a flow rule.
> The amount of memory consumed by the items and actions themselves during this process might be non-negligible.
> If flow rule operations were bulked, the size of the working set of memory would increase, which could have negative consequences on cache behavior.
> So it might be the case that bulking removes the I-cache overhead but adds a D-cache overhead.

Is the rte_flow struct really that big?
We do bulk processing for mbufs, crypto_ops, etc., and usually bulk processing improves performance, not degrades it.
Of course, the bulk size has to be somewhat reasonable.

> On the other hand, creating flow rule operations (or enqueuing flow rule operations) one by one enables applications to reuse the same memory for different flow rules.
>
> In summary, in my opinion, extending the async flow API with bulking capabilities or exposing the queue directly to the application is not desirable.
> This proposal aims to reduce the I-cache overhead in the async flow API by reusing the existing design pattern in DPDK - fast path functions are inlined into the application code and they call cached PMD callbacks.

(A simplified sketch of that inline + cached-callbacks pattern is included at the end of this mail.)

> Best regards,
> Dariusz Sosnowski
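
For reference, the "existing design pattern" mentioned above is the one already used for the ethdev Rx/Tx burst fast path: per-port callbacks and per-queue pointers are cached in a flat, cache-aligned array, and the public function is a static inline wrapper around an indirect call into the driver. A much simplified sketch of the idea follows; the names are illustrative only, not the actual rte_eth_fp_ops/rte_eth_rx_burst definitions.

#include <stdint.h>
#include <rte_common.h>
#include <rte_config.h>
#include <rte_mbuf.h>

/* Illustrative per-port fast-path state, filled in by the PMD at
 * configuration time and kept in a flat array indexed by port id.
 */
struct fp_ops {
	void *rxq[RTE_MAX_QUEUES_PER_PORT];	/* per-queue private data */
	uint16_t (*rx_burst)(void *rxq, struct rte_mbuf **pkts, uint16_t n);
} __rte_cache_aligned;

extern struct fp_ops fp_ops_tbl[RTE_MAX_ETHPORTS];

/* The public fast-path call is inlined into the application and jumps
 * straight to the cached driver callback - no lookup of the generic
 * ops structure and no extra function call through the library layer.
 */
static inline uint16_t
fp_rx_burst(uint16_t port_id, uint16_t queue_id,
	    struct rte_mbuf **pkts, uint16_t n)
{
	struct fp_ops *ops = &fp_ops_tbl[port_id];

	return ops->rx_burst(ops->rxq[queue_id], pkts, n);
}

Applied to the rte_flow_async_* calls, as the quoted paragraph describes, each enqueue would then cost one inlined wrapper plus one indirect call into the PMD, instead of going through the generic rte_flow ops lookup.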