From: Konstantin Ananyev
To: Dariusz Sosnowski, Stephen Hemminger
CC: "NBU-Contact-Thomas Monjalon (EXTERNAL)", Ferruh Yigit, Andrew Rybchenko, Ori Kam, dev@dpdk.org
Subject: RE: [RFC] ethdev: fast path async flow API
Date: Thu, 4 Jan 2024 08:47:02 +0000
Message-ID: <4efb00a7f6f3406ab819424ac7a25542@huawei.com>
References: <20231227105709.1951231-1-dsosnowski@nvidia.com> <20231228091657.14769682@hermes.local>

> > This is a blocker, showstopper for me.

+1

> > Have you considered having something like
> > 	rte_flow_create_bulk()
> >
> > or better yet a Linux iouring style API?
> >
> > A ring style API would allow for better mixed operations across the board and
> > get rid of the I-cache overhead which is the root cause of the need for inlining.
> The existing async flow API is somewhat close to the io_uring interface.
> The difference is that the queue is not directly exposed to the application.
> The application interacts with the queue using the rte_flow_async_* APIs (e.g., it places operations in the queue and pushes them to the HW).
> Such a design has some benefits over a flow API which exposes the queue to the user:
> - Easier to use - applications do not manage the queue directly; they do it through the exposed APIs.
> - Consistent with other DPDK APIs - in other libraries, queues are manipulated through an API, not directly by the application.
> - Lower memory usage - only HW primitives are needed (e.g., the HW queue on the PMD side); there is no need to allocate separate application queues.
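
For readers who have not used these calls, the enqueue/push/poll pattern described above looks roughly like the minimal sketch below. It is an illustration only: the helper name is made up, the template table and the pattern/actions arrays are assumed to have been created beforehand (rte_flow_configure(), template and table creation), and queue-capacity/error handling is trimmed.

#include <rte_flow.h>

/* Sketch: enqueue one flow rule creation on flow queue "queue_id",
 * push it to the HW and poll for its completion.
 */
static int
create_one_flow(uint16_t port_id, uint32_t queue_id,
		struct rte_flow_template_table *tbl,
		const struct rte_flow_item pattern[],
		const struct rte_flow_action actions[])
{
	const struct rte_flow_op_attr op_attr = { .postpone = 0 };
	struct rte_flow_op_result res[1];
	struct rte_flow_error error;
	struct rte_flow *flow;
	int n;

	/* Place the creation operation in the queue (template index 0
	 * for both pattern and actions, no user_data).
	 */
	flow = rte_flow_async_create(port_id, queue_id, &op_attr, tbl,
				     pattern, 0, actions, 0, NULL, &error);
	if (flow == NULL)
		return -1;

	/* Push all operations queued so far to the HW. */
	if (rte_flow_push(port_id, queue_id, &error) < 0)
		return -1;

	/* Poll until the result of the enqueued operation is available. */
	do {
		n = rte_flow_pull(port_id, queue_id, res, 1, &error);
	} while (n == 0);

	return (n == 1 && res[0].status == RTE_FLOW_OP_SUCCESS) ? 0 : -1;
}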
> Bulking of flow operations is a tricky subject.
> Compared to packet processing, where it is desirable to keep the manipulation of raw packet data to a minimum (e.g., only packet headers are accessed),
> during flow rule creation all items and actions must be processed by the PMD to create a flow rule.
> The amount of memory consumed by the items and actions themselves during this process might be non-negligible.
> If flow rule operations were bulked, the size of the working set of memory would increase, which could have negative consequences on cache behavior.
> So it might be the case that bulking removes the I-cache overhead but adds a D-cache overhead.

Is the rte_flow struct really that big?
We do bulk processing for mbufs, crypto_ops, etc., and usually bulk processing improves performance, not degrades it.
Of course, the bulk size has to be somewhat reasonable.

> On the other hand, creating flow rule operations (or enqueuing flow rule operations) one by one enables applications to reuse the same memory for different flow rules.
>
> In summary, in my opinion, extending the async flow API with bulking capabilities or exposing the queue directly to the application is not desirable.
> This proposal aims to reduce the I-cache overhead in the async flow API by reusing the existing design pattern in DPDK - fast path functions are inlined into the application code and they call cached PMD callbacks.

(A simplified sketch of that inline + cached-callbacks pattern is included at the end of this mail.)

> Best regards,
> Dariusz Sosnowski
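
For reference, the "existing design pattern" mentioned above is the one already used for the ethdev Rx/Tx burst fast path: per-port callbacks and per-queue pointers are cached in a flat, cache-aligned array, and the public function is a static inline wrapper around an indirect call into the driver. A much simplified sketch of the idea follows; the names are illustrative only, not the actual rte_eth_fp_ops/rte_eth_rx_burst definitions.

#include <stdint.h>
#include <rte_common.h>
#include <rte_config.h>
#include <rte_mbuf.h>

/* Illustrative per-port fast-path state, filled in by the PMD at
 * configuration time and kept in a flat array indexed by port id.
 */
struct fp_ops {
	void *rxq[RTE_MAX_QUEUES_PER_PORT];	/* per-queue private data */
	uint16_t (*rx_burst)(void *rxq, struct rte_mbuf **pkts, uint16_t n);
} __rte_cache_aligned;

extern struct fp_ops fp_ops_tbl[RTE_MAX_ETHPORTS];

/* The public fast-path call is inlined into the application and jumps
 * straight to the cached driver callback - no lookup of the generic
 * ops structure and no extra function call through the library layer.
 */
static inline uint16_t
fp_rx_burst(uint16_t port_id, uint16_t queue_id,
	    struct rte_mbuf **pkts, uint16_t n)
{
	struct fp_ops *ops = &fp_ops_tbl[port_id];

	return ops->rx_burst(ops->rxq[queue_id], pkts, n);
}

Applied to the rte_flow_async_* calls, as the quoted paragraph describes, each enqueue would then cost one inlined wrapper plus one indirect call into the PMD, instead of going through the generic rte_flow ops lookup.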