From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id EE14741BC6; Sat, 4 Feb 2023 09:09:16 +0100 (CET) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 86025406A2; Sat, 4 Feb 2023 09:09:16 +0100 (CET) Received: from smartserver.smartsharesystems.com (smartserver.smartsharesystems.com [77.243.40.215]) by mails.dpdk.org (Postfix) with ESMTP id 0502840395 for ; Sat, 4 Feb 2023 09:09:14 +0100 (CET) X-MimeOLE: Produced By Microsoft Exchange V6.5 Content-class: urn:content-classes:message MIME-Version: 1.0 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable Subject: RE: Sign changes through function signatures Date: Sat, 4 Feb 2023 09:09:11 +0100 Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D876FC@smartserver.smartshare.dk> In-Reply-To: <20230203221247.GC4631@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> X-MS-Has-Attach: X-MS-TNEF-Correlator: Thread-Topic: Sign changes through function signatures Thread-Index: Adk4HK+4KuPKXxTuR0m6keBZSazDvAATakFw References: <20230202202604.GA23284@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> <5155790.0VBMTVartN@thomas> <98CBD80474FA8B44BF855DF32C47DC35D876F8@smartserver.smartshare.dk> <20230203221247.GC4631@linuxonhyperv3.guj3yctzbm1etfxqx2vob5hsef.xx.internal.cloudapp.net> From: =?iso-8859-1?Q?Morten_Br=F8rup?= To: "Tyler Retzlaff" , "Bruce Richardson" Cc: "Thomas Monjalon" , "Ben Magistro" , "Olivier Matz" , , , , , "Stefan Baranoff" , , X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com] > Sent: Friday, 3 February 2023 23.13 >=20 > On Fri, Feb 03, 2023 at 12:05:04PM +0000, Bruce Richardson wrote: > > On Thu, Feb 02, 2023 at 10:26:48PM +0100, Morten Br=F8rup wrote: > > > > From: Thomas Monjalon [mailto:thomas@monjalon.net] > > > > Sent: Thursday, 2 February 2023 21.45 > > > > > > > > 02/02/2023 21:26, Tyler Retzlaff: > > > > > On Thu, Feb 02, 2023 at 02:23:39PM -0500, Ben Magistro wrote: > > > > > > Hello, > > > > > > > > > > > > While making some updates to our code base for 22.11.1 that > were > > > > missed in > > > > > > our first pass through, we hit the numa node change[1]. In > the > > > > process of > > > > > > updating our code, we noticed that a couple functions > > > > (rx/tx_queue_setup, > > > > > > maybe more that we aren't using) state they accept > `SOCKET_ID_ANY` > > > > but the > > > > > > function signature then asks for an unsigned integer while > > > > `SOCKET_ID_ANY` > > > > > > is `-1`. Following it through the redirect to the "real" > function > > > > it also > > > > > > asks for an unsigned integer which is then passed on to one > or more > > > > > > functions asking for an integer. As an example using the = the > i40e > > > > driver > > > > > > -- we would call `rte_eth_tx_queue_setup` [2] which > ultimately > > > > calls > > > > > > `i40e_dev_tx_queue_setup`[3] which finally calls > > > > `rte_zmalloc_socket`[4] > > > > > > and `rte_eth_dma_zone_reserve`[5]. > > > > > > > > > > > > I guess what I am looking for is clarification on if this is > > > > intentional or > > > > > > if this is additional cleanup that may need to be > completed/be > > > > desirable so > > > > > > that signs are maintained through the call paths and avoid > > > > potentially > > > > > > producing sign-conversion warnings. From the very quick > glance I > > > > took at > > > > > > the i40e driver, it seems these are just passed through to > other > > > > functions > > > > > > and no direct use/manipulation occurs (at least in the > mentioned > > > > functions). > > > > > > > > > > i believe this is just sloppyness with sign in our api = surface. > i too > > > > > find it frustrating that use of these api force either = explicit > > > > > casts or suffer having to suppress warnings. > > > > > > > > > > in the past examples of this have been cleaned up without full > > > > deprecation > > > > > notices but there are a lot of instances. i also feel > (unpopular > > > > opinion) > > > > > that for some integer types like this that have constrained > range / > > > > number > > > > > spaces it would be of value to introduce a typedef that can be > used > > > > > consistently. > > > > > > > > > > for now you'll just have to add the casts and hopefully in the > future > > > > we > > > > > will fix the api making them unnecessary. of course feel free > to > > > > submit > > > > > patches too, it would be great to have these cleaned up. > > > > > > > > I agree it should be cleaned up. > > > > Those IDs should accept negative values. > > > > Not sure which type we should choose (int, int32_t, or a > typedef). > > > > > > Why would we use a signed socket ID? We don't use signed port IDs. > To me, unsigned seems the way to go. (A minor detail: With unsigned we > can use the entire range of values minus one (for the magic "any" > value), whereas with signed we can only use the positive range of > values. This detail is completely irrelevant when using 32 bit for > socket ID, but could be relevant if using fewer bits.) > > > > > > Also, we don't need 32 bit for socket ID. 8 or 16 bit should > suffice, like port ID. But reducing from 32 bit would probably cause > major ABI breakage. > > > > > > > > > > > Another thing to check is the name of the variable. > > > > It should be a socket ID when talking about CPU, > > > > and a NUMA node ID when talking about memory. > > > > > > > > And last but not the least, > > > > how can we keep ABI compatibility? > > > > I hope we can use function versioning to avoid deprecation and > > > > breaking. > > > > > > > > Trials and suggestions are welcome. > > > > > > Signedness is not the only problem with the socket ID. The meaning > of SOCKET_ID_ANY is excessively overloaded. If we want to clean this > up, we should consider the need for another magic value SOCKET_ID_NONE > for devices connected to the chipset, as discussed in this other email > thread [1]. And as discussed there, there are also size problems, > because some device structures use 8 bit to hold the socket ID. > > > > > > And functions should always return -1, never SOCKET_ID_ANY, to > indicate error. > > > > > > [1]: > = http://inbox.dpdk.org/dev/98CBD80474FA8B44BF855DF32C47DC35D87684@smarts > erver.smartshare.dk/ > > > > > > I only bring warnings and complications to the discussion here, no > solutions. Sorry! :-( > > > > > > > Personally, I think if we are going to change things, we should do > things > > properly, especially/even if we are going to have to break ABI or = use > ABI > > compatibility. > > > > I would suggest rather than a typedef, we should actually wrap the > int > > value in a struct - for two reasons: >=20 > > > > * it means the compiler will actually error out for us if an int or > > unsigned int is used instead. This allow easier fixing at compile- > time > > rather than hoping things are correctly specified in existing = code. > > > > * it allows us to do things like explicitly calling out flags, = rather > than > > just using magic values. While still keeping the size 32 bits, we > can > > have the actual socket value as 16-bits and have flags to = indicate: > > - ANY socket, NO socket, INVALID value socket. This could end up > being > > useful in many cases, for example, when allocating memory we could > > specify a socket number with the ANY flag, indicating that any > socket is > > ok, but we'd ideally prefer the number specified. >=20 > i'm a fan of this where it makes sense. i did this with rte_thread_t > for > exactly your first reason. but i did receive resistance from other > members of the community. personally i like compilation to fail when i > make a mistake. >=20 > it's definitely way easier to make the argument to do this when the > actual valued is opaque. if it isn't i think then we need to provide > macro/inline accessors to allow applications do whatever it is they do > with the value they carry. >=20 > i'll also note that this allows you a cheap way to sprinkle extra > integrity checking when running functional tests. if you have low > performance inline accessors you can do things like enforce the range > of > values or or that enumerations are part of a set for debug builds. >=20 > as a side i would also caution while i suggested a typedef i don't = mean > that everything should be typedef'd especially actual structs that are > used like structs. typedefs for things like socket id would > unquestionably convey more information and implied semantics to the > user > of an api than just a standard `int' or whatever. consequently i have > found > that this lowers mistakes with the use of the api. Hiding the socket_id in a typedef'd structure seems like shooting = sparrows with cannons. DPDK is using a C coding style, where there is a convention for not = using typedefs: https://www.kernel.org/doc/html/v4.10/process/coding-style.html#typedefs In the tread case, a typedef made sense, because the underlying type can = differ across O/S'es, and thus should be opaque. Which is in line with = the coding style. But I don't think this is the case for socket_id. The socket_id is an = enumeration type, and all we need is a magic number for the "chipset" = pseudo-socket. And with that, perhaps some iterator macros to = include/omit this pseudo-socket, like the lcore_id iterators with and = without the main lcore. The mix of signed and unsigned in function signatures (and in the = definition of SOCKET_ID_ANY) is pure sloppyness. This problem may also = be present in other function signatures; we just happened to run into it = for the socket_id. The compiler has flags to warn about mixing signed and unsigned types, = so we could use that flag to reveal and fix those bugs. >=20 > > > > As for socket id, and numa id, I'm not sure we should have different > > names/types for the two. For example, for PCI devices, do they need = a > third > > type or are they associated with cores or with memory? The socket id > for > > the core only matters in terms of data locality, i.e. what memory or > cache > > location it is in. Therefore, for me, I'd pick one name and stick > with it. >=20 > i think the choice for more than one type vs one type is whether or = not > they are "the same" number space as opposed to just coincidentally > overlapping number spaces. >=20 > > > > /Bruce