On Wed, Sep 27, 2023, 16:09 Ferruh Yigit wrote:
> On 9/27/2023 2:48 PM, Stanisław Kardach wrote:
> > On Wed, Sep 27, 2023 at 1:55 PM Ferruh Yigit wrote:
> >>
> >> On 9/21/2023 3:49 PM, Stanisław Kardach wrote:
> >>> On Thu, Sep 21, 2023, 15:18 Tummala, Sivaprasad wrote:
> >>>
> >>> > -----Original Message-----
> >>> > From: David Marchand
> >>> > Sent: Wednesday, September 20, 2023 1:05 PM
> >>> > To: Stanisław Kardach; Tummala, Sivaprasad
> >>> > Cc: Ruifeng Wang; Min Zhou; David Christensen;
> >>> > Bruce Richardson; Konstantin Ananyev; dev; Yigit, Ferruh;
> >>> > Thomas Monjalon
> >>> > Subject: Re: [PATCH v2 2/2] eal: remove NUMFLAGS enumeration
> >>> >
> >>> > On Wed, Sep 20, 2023 at 8:01 AM Stanisław Kardach wrote:
> >>> > >
> >>> > > On Tue, Sep 19, 2023 at 4:47 PM David Marchand wrote:
> >>> > >
> >>> > > > > Also I see you're still removing the
> >>> > > > > RTE_CPUFLAG_NUMFLAGS (what I call a last element
> >>> > > > > canary). Why? If you're concerned with ABI, then we're
> >>> > > > > talking about an application linking dynamically with
> >>> > > > > DPDK or talking via some RPC channel with another DPDK
> >>> > > > > application. So clashing with this definition does not
> >>> > > > > come into question. One should rather use
> >>> > > > > rte_cpu_get_flag_enabled().
> >>> > > > > Also if you want to introduce new features, one would
> >>> > > > > add them to the rte_cpuflags headers, unless you'd like
> >>> > > > > to not add those and keep an undocumented list "above"
> >>> > > > > the last defined element.
> >>> > > > > Could you explain a bit more your use-case?
> >>> > > > >
> >>> > > > Hey Stanislaw,
> >>> > > >
> >>> > > > Talking generically, one problem with such a pattern
> >>> > > > (having a LAST, or MAX enum) is when an array sized with
> >>> > > > such a symbol is exposed.
> >>> > > > As I mentioned in the past, this can have unwanted effects:
> >>> > > > https://patchwork.dpdk.org/project/dpdk/patch/20230919140430.3251493-1-david.marchand@redhat.com/
> >>> >
> >>> > Argh... who broke copy/paste in my browser ?!
> >>> > Wrt MAX and arrays, I wanted to point at:
> >>> > http://inbox.dpdk.org/dev/CAJFAV8xs5CVdE2xwRtaxk5vE_PiQMV5LY5tKStk3R1gOuRTsUw@mail.gmail.com/
> >>> >
> >>> > > I agree, though I'd argue "LAST" and "MAX" semantics are a bit
> >>> > > different. "LAST" delimits the known enumeration territory
> >>> > > while "MAX" is more of a `constexpr` value type.
> >>> > > >
> >>> > > > Another issue is when an existing enum meaning changes: from
> >>> > > > the application pov, the (old) MAX value is incorrect, but
> >>> > > > from the library pov, a new meaning has been associated.
> >>> > > > This may trigger bugs in the application when calling a
> >>> > > > function that returns such an enum and that never returned
> >>> > > > this MAX value in the past.
> >>> > > >
> >>> > > > For at least those two reasons, removing those canary
> >>> > > > elements is being done in DPDK.
> >>> > > >
> >>> > > > This specific removal has been announced:
> >>> > > > https://patchwork.dpdk.org/project/dpdk/patch/20230919140430.3251493-1-david.marchand@redhat.com/
> >>> >
> >>> > > Thanks for pointing this out but did you mean to link to the
> >>> > > patch again here?
> >>> >
> >>> > Sorry, same here, bad copy/paste :-(.
> >>> >
> >>> > The intended link is:
> >>> > https://git.dpdk.org/dpdk/commit/?id=5da7c13521
> >>> >
> >>> > The deprecation notice was badly formulated and this patch here
> >>> > is consistent with it.
> >>> >
> >>> > > >
> >>> > > > Now, practically, when I look at the cpuflags API, I don't
> >>> > > > see us exposed to those two issues wrt rte_cpu_flag_t, so
> >>> > > > maybe this change is unneeded.
> >>> > > > But on the other hand, is it really an issue for an
> >>> > > > application to lose this (internal) information?
> >>> > > I doubt it, maybe it could be used as a sanity check for
> >>> > > choosing proper functors in the application. Though the
> >>> > > initial description of the reason behind this patch was to not
> >>> > > break the ABI and I don't think it does that. What it does is
> >>> > > force users to use explicit cpu flag values, which is a good
> >>> > > thing. Though if so, then it should be stated in the commit
> >>> > > description.
> >>> >
> >>> > I agree.
> >>> > Siva, can you work on a new revision?
> >>> >
> >>> David, Stanislaw,
> >>>
> >>> The original motivation of this patch was to avoid ABI breakage
> >>> with the introduction of the new CPU flag "RTE_CPUFLAG_MONITORX"
> >>> (http://mails.dpdk.org/archives/test-report/2023-April/382489.html).
> >>>
> >>> Because of ABI breakage, the feature was postponed to this release.
> >>> https://patchwork.dpdk.org/project/dpdk/patch/20230413115334.43172-3-sivaprasad.tummala@amd.com/
> >>>
> >>> This test is flawed, the reason being that NUMFLAGS should not be
> >>> treated as a flag value but as a canary, and this test is not
> >>> taking that into account.
> >>>
> >>
> >> Hi Stanislaw,
> >>
> >> Why is the test flawed?
> >>
> >> The enum is in the public header, and so is the
> >> 'RTE_CPUFLAG_NUMFLAGS' enum item; there are APIs using the enum, so
> >> the enum is exchanged between the shared library and the
> >> application.
> > In a similar way lots of Linux uapi headers contain bits that should
> > not be used directly, even though they are defined there. The reason
> > for that is the C language syntax, not necessarily the intent of a
> > developer.
> > Since NUMFLAGS was a canary to make the flag handling code easier, it
> > should not be treated as a "real" value and hence my suggestion of a
> > flawed test. That said, NUMFLAGS does not bring enough value to not
> > remove it. :)
> >
>
> Both: it doesn't have enough value to hang on to, and we don't have
> control over how it is used by the application once it is exposed by
> the library.
>
> >>
> >> A similar thing was discussed before: when an enum is exchanged
> >> between the application and a shared library, there is an ABI
> >> breakage risk when the enum is extended, and the general tendency is
> >> to eliminate the MAX value to reduce the risk.
> > Agreed, though as I have mentioned before, "MAX" has different
> > semantics than "NUM". Then again, since we have
> > rte_cpu_feature_table, we can use RTE_DIM to check the user input.
> >
>
> Their usage and the intention behind them are the same I think, can you
> please elaborate on what the difference is between MAX and NUM enum
> items that are added as the last item in an enum?
>

MAX specifies a semantic numerical value, such as MTU. NUM counts
elements in an enumeration where elements describe some items and their
value is just an implementation detail.

> >>
> >>
> >> When an enum value is sent from the library to the application, it
> >> is clearer that this can cause an ABI breakage, because the
> >> application can receive a value that it is not aware of at build
> >> time, which can cause unexpected behavior.
> >> Simply think about a case where the application allocated an array
> >> of 'RTE_CPUFLAG_NUMFLAGS' size and directly accesses the array index
> >> based on the returned enum item value: if the enum is extended in
> >> the new version of the shared library, this can cause an invalid
> >> memory access in the application.
> > Using the NUM enum element (which serves as a last item canary) to
> > size an array is not a good idea unless it's returned from a runtime
> > call. Otherwise one hits the issues that you've described.
> >
>
> I agree :), but that is a way to describe how it can be a problem.
> Also, last time I argued something similar to what you said, that the
> application should check against the MAX value before using it, but I
> was told not to assume what the application does. My take from it is:
> expect the worst from the application as a library side developer.
>
> >>
> >> When an enum value is sent from the application to the library, I am
> >> not quite sure how problematic it is, to be honest. That is the case
> >> for the 'rte_cpu_get_flag_enabled()' & 'rte_cpu_get_flag_name()' in
> >> question.
> >> Only when the application sends 'RTE_CPUFLAG_NUMFLAGS' to
> >> 'rte_cpu_get_flag_name()' does it expect NULL to be returned, but
> >> this won't happen in the new version of the shared library; not sure
> >> if this can cause any problem for the application.
> >> But as I mentioned, the general guidance is to eliminate this kind
> >> of MAX enum value usage.
> >>
> >>
> >> And for this specific issue, although it is not clear whether usage
> >> of the enum in the 'rte_cpu_get_flag_enabled()' &
> >> 'rte_cpu_get_flag_name()' APIs causes ABI breakage,
> >> the enum being embedded into the 'struct rte_bbdev_driver_info'
> >> struct doesn't leave a question, since this struct is returned from
> >> the library to the application and a change in the enum causes an
> >> ABI breakage.
> > Enum size does not change irrespective of changing its values. So
> > size-wise it's not an ABI breakage. Re-ordering values is an ABI
> > breakage.
>
> Agree it is not a size-wise issue.
> But still an issue.
>
> >>
> >>
> >> Briefly, I think even appending to the end of 'enum rte_cpu_flag_t'
> >> causes ABI breakage, and removing 'RTE_CPUFLAG_NUMFLAGS' helps to
> >> extend this enum in the future.
> >> And an outstanding deprecation notice already exists for this:
> >> https://git.dpdk.org/dpdk/tree/doc/guides/rel_notes/deprecation.rst?h=v23.07#n63
> >>
> >>
> >>> Your change did not break the ABI because you have properly added
> >>> the new flag at the end.
> >>> So I would ask to change the commit description to mention that
> >>> NUMFLAGS is removed to:
> >>> 1. Prevent users from treating it as a usable value or an array
> >>>    size.
> >>> 2. Prevent false-positive failures in the ABI test.
> >>>
> >>> Also it would be good to link to the aforementioned ABI test
> >>> failure to give readers some context when inspecting the git tree.
> >>>
> >>>
> >>> Can you please add what exactly needs to be reworked in the new
> >>> version.
> >>>
> >>> >
> >>> > Thanks.
> >>> >
> >>> > --
> >>> > David Marchand
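
To make the two cases from the thread concrete (an application array sized by a last-element canary, and a library that validates input against its own table size rather than a public canary), here is a minimal C sketch. It is only an illustration under stated assumptions: every demo_*, app_* and DEMO_* name is hypothetical and not DPDK API, the three-entry table merely stands in for rte_cpu_feature_table, and DEMO_DIM mirrors the idea of RTE_DIM.

```c
#include <stddef.h>

/* Hypothetical enum as seen by an application built against the OLD
 * library: two flags plus the canary (value 2 at build time). */
enum demo_cpu_flag_old {
    DEMO_CPUFLAG_A = 0,
    DEMO_CPUFLAG_B,
    DEMO_CPUFLAG_NUMFLAGS_OLD
};

/* Hypothetical library-private table in the NEW library: a third flag
 * was appended, so index 2 is now a real flag, not the canary. */
static const char *const demo_feature_table[] = { "A", "B", "MONITORX" };

/* Element count of an array; same idea as DPDK's RTE_DIM. */
#define DEMO_DIM(a) (sizeof(a) / sizeof((a)[0]))

/* Library side: validate against the table's own size, so no public
 * canary is needed. Returns NULL for an out-of-range flag. */
const char *demo_get_flag_name(unsigned int flag)
{
    if (flag >= DEMO_DIM(demo_feature_table))
        return NULL;
    return demo_feature_table[flag];
}

/* Application side: array sized at compile time by the OLD canary. */
static int app_counters[DEMO_CPUFLAG_NUMFLAGS_OLD];

/* Returns 0 on success, -1 if the flag is unknown to this build.
 * Without the bounds check, a flag added by a newer library would
 * index past the end of app_counters. */
int app_count_flag(unsigned int flag_from_library)
{
    if (flag_from_library >= DEMO_DIM(app_counters))
        return -1;
    app_counters[flag_from_library]++;
    return 0;
}
```

With these definitions, passing the old canary value to demo_get_flag_name() yields the name of the newly appended flag instead of NULL, which is the behavior change described above for rte_cpu_get_flag_name(), and app_count_flag(2) is rejected only because the application checks the bound itself.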