From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 397A1423FD; Tue, 17 Jan 2023 14:36:25 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 21C45400EF; Tue, 17 Jan 2023 14:36:25 +0100 (CET) Received: from smartserver.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215]) by mails.dpdk.org (Postfix) with ESMTP id 6FBB3400D4 for ; Tue, 17 Jan 2023 14:36:23 +0100 (CET) Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Subject: RE: [RFC] Fix cryptodev socket id for devices on unknown NUMA node Date: Tue, 17 Jan 2023 14:36:21 +0100 X-MimeOLE: Produced By Microsoft Exchange V6.5 Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8767E@smartserver.smartshare.dk> In-Reply-To: X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: [RFC] Fix cryptodev socket id for devices on unknown NUMA node Thread-Index: AdkqdCxBuDxjTAg0QQminmdJWET/agAAXSuw References: <20230117101646.2521875-1-didier.pallard@6wind.com> <98CBD80474FA8B44BF855DF32C47DC35D8767D@smartserver.smartshare.dk> From: =?iso-8859-1?Q?Morten_Br=F8rup?= To: "Bruce Richardson" Cc: "Didier Pallard" , "Akhil Goyal" , "Fan Zhang" , "Olivier Matz" , , X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org > From: Bruce Richardson [mailto:bruce.richardson@intel.com] > Sent: Tuesday, 17 January 2023 14.04 >=20 > On Tue, Jan 17, 2023 at 12:32:14PM +0100, Morten Br=F8rup wrote: > > > From: Didier Pallard [mailto:didier.pallard@6wind.com] > > > Sent: Tuesday, 17 January 2023 11.17 > > > > > > Since DPDK 22.11 and below commit: > > > > = https://git.dpdk.org/dpdk/commit/?id=3D7dcd73e37965ba0bfa430efeac362fe183= > > > ed0ae2 > > > rte_cryptodev_socket_id() could return an incorrect value of 255. > > > Problem has been seen during configuration of the qat device > > > on an Atom C3000 architecture. On this arch, PCI is not depending > on > > > any numa socket, causing device numa_node to be equal to > SOCKET_ID_ANY. > > > > Disclaimer: I'm not up to speed with this topic or patch, so feel > free to ignore my comments here! I'm only speaking up because I fear = we > are increasing the risk of bugs here. But again, please bear with me, > if I have totally misunderstood this! > > > > I was under the impression that single-socket systems used socket_id > 0 as default. How can the PCI bus (or QAT device) not depend on any > socket? It must be connected somewhere. > > > > Doesn't assigning socket_id =3D -1 for devices (QAT or anything = else) > introduce a big risk of bugs, e.g. in comparisons? The special > socket_id value -1 should have only two meanings: 1) return value > "error", or 2) input value "any". Now it also can mean 3) "unknown"? > How do comparison functions work for that... is "any" =3D=3D = "unknown"? And > does searching for "0" match "unknown"? It might, or might not, but > searching for "any" does match "0". And how about searching for > "unknown", if such a value is propagate around in the system. > > > > And if we started considering socket_id =3D=3D -1 valid with that = patch, > should the return type of rte_socket_id(void) be signed instead of > unsigned? > > > The issue here is that not all PCI endpoints connect directly to a > socket, > some connect to the chipset instead, and so do not have any numa > affinity. > That was the original meaning of the "-1" value, and it came about = from > an > era before we had on-die PCI endpoints. Thank you for elaborating, Bruce. Now I get it! Then it does make sense instantiating devices with socket_id =3D -1. A minor detail: SOCKET_ID_ANY is defined in lib/eal/include/rte_memory.h = [1]. Since it is being used for other purposes than memory allocation, = it could move to a more central location. No good ideas from me, but = perhaps memory and devices have an appropriate header file in common. [1]: = https://elixir.bootlin.com/dpdk/latest/source/lib/eal/include/rte_memory.= h#L38 And the multiple purposes of SOCKET_ID_ANY (e.g. what you just = described) could be mentioned as a comment with its definition. And I'm still worried about the risk of comparison bugs, e.g. when = requesting allocation of a device resource, you cannot specify a = preference for a device that is connected to the chipset, because the = SOCKET_ID_ANY in the allocation request would be interpreted as "any = socket" instead of "no socket". Although that might just be me worrying too much. ;-) -Morten