* RFC acceptable handling of VLAs across toolchains
@ 2023-11-07 19:32 Tyler Retzlaff
2023-11-08 2:31 ` Stephen Hemminger
` (2 more replies)
0 siblings, 3 replies; 34+ messages in thread
From: Tyler Retzlaff @ 2023-11-07 19:32 UTC (permalink / raw)
To: dev
hi folks,
i'm seeking advice. we have use of VLAs which are now optional in
standard C. some toolchains provide a conformant implementation and msvc
does not (and never will).
we seem to have a few options, just curious about what people would
prefer.
* use alloca
* use dynamically allocated storage
* conditional compiled code where the msvc leg uses one of the previous
two options
i'll leave it simple for now, i'd like to hear input rather than provide
a recommendation for now.
thanks!
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFC acceptable handling of VLAs across toolchains
2023-11-07 19:32 RFC acceptable handling of VLAs across toolchains Tyler Retzlaff
@ 2023-11-08 2:31 ` Stephen Hemminger
2023-11-08 3:25 ` Tyler Retzlaff
2023-11-08 16:51 ` Stephen Hemminger
2024-04-04 17:15 ` [PATCH 0/4] RFC samples converting VLA to alloca Tyler Retzlaff
2 siblings, 1 reply; 34+ messages in thread
From: Stephen Hemminger @ 2023-11-08 2:31 UTC (permalink / raw)
To: Tyler Retzlaff; +Cc: dev
On Tue, 7 Nov 2023 11:32:20 -0800
Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> hi folks,
>
> i'm seeking advice. we have use of VLAs which are now optional in
> standard C. some toolchains provide a conformant implementation and msvc
> does not (and never will).
>
> we seem to have a few options, just curious about what people would
> prefer.
>
> * use alloca
>
> * use dynamically allocated storage
>
> * conditional compiled code where the msvc leg uses one of the previous
> two options
>
> i'll leave it simple for now, i'd like to hear input rather than provide
> a recommendation for now.
>
VLAs are a bug magnet. Best to avoid them, most code doesn't need them.
The one common use case is code that accepts a burst of packets.
But such code could easily have an upper bound if necessary.
Please don't add more to the maze of #ifdef's
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFC acceptable handling of VLAs across toolchains
2023-11-08 2:31 ` Stephen Hemminger
@ 2023-11-08 3:25 ` Tyler Retzlaff
2023-11-08 8:19 ` Morten Brørup
0 siblings, 1 reply; 34+ messages in thread
From: Tyler Retzlaff @ 2023-11-08 3:25 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev
On Tue, Nov 07, 2023 at 06:31:14PM -0800, Stephen Hemminger wrote:
> On Tue, 7 Nov 2023 11:32:20 -0800
> Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
>
> > hi folks,
> >
> > i'm seeking advice. we have use of VLAs which are now optional in
> > standard C. some toolchains provide a conformant implementation and msvc
> > does not (and never will).
> >
> > we seem to have a few options, just curious about what people would
> > prefer.
> >
> > * use alloca
> >
> > * use dynamically allocated storage
> >
> > * conditional compiled code where the msvc leg uses one of the previous
> > two options
> >
> > i'll leave it simple for now, i'd like to hear input rather than provide
> > a recommendation for now.
> >
>
> VLAs are a bug magnet. Best to avoid them, most code doesn't need them.
just in case i didn't clarify properly earlier: when i said they were
optional i meant they used to be non-optional. the intent of the RFC
here isn't that i want to add more but i'm looking for the best approach
to getting rid of the ones we already have.
> The one common use case is code that accepts a burst of packets.
> But such code could easily have an upper bound if necessary.
>
> Please don't add more to the maze of #ifdef's
thanks! i'll keep this in mind.
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: RFC acceptable handling of VLAs across toolchains
2023-11-08 3:25 ` Tyler Retzlaff
@ 2023-11-08 8:19 ` Morten Brørup
0 siblings, 0 replies; 34+ messages in thread
From: Morten Brørup @ 2023-11-08 8:19 UTC (permalink / raw)
To: Tyler Retzlaff, Stephen Hemminger; +Cc: dev
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Wednesday, 8 November 2023 04.25
>
> On Tue, Nov 07, 2023 at 06:31:14PM -0800, Stephen Hemminger wrote:
> > On Tue, 7 Nov 2023 11:32:20 -0800
> > Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> >
> > > hi folks,
> > >
> > > i'm seeking advice. we have use of VLAs which are now optional in
> > > standard C. some toolchains provide a conformant implementation and
> msvc
> > > does not (and never will).
Just so everyone is on the same page... this is a VLA (Variable Length Array):
void f(int n) {
    int v[n]; // VLA: its size is determined at run-time.
}
> > >
> > > we seem to have a few options, just curious about what people would
> > > prefer.
> > >
> > > * use alloca
VLAs have the advantage that they are allocated on the stack, which usually means that the memory is already present in the CPU's L1 cache (or L2 cache if using a larger block of memory).
It seems alloca() also allocates on the stack, so alloca() should provide similar performance.
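For illustration, a minimal sketch of what an alloca() based scratch buffer could look like. It assumes a glibc-style <alloca.h> (MSVC spells the call _alloca() and declares it in <malloc.h>); the function name sum_squares is hypothetical and not taken from DPDK.

```c
#include <stddef.h>
#ifdef _MSC_VER
#include <malloc.h>   /* MSVC declares _alloca() here */
#define alloca _alloca
#else
#include <alloca.h>   /* glibc and most Unix-like systems */
#endif

/* Hypothetical helper: sums 0^2 .. (n-1)^2 through a stack scratch buffer. */
long sum_squares(size_t n)
{
    long total = 0;
    size_t i;
    /* Stack allocation sized at run time; released automatically on return.
     * Like a VLA, it cannot report failure, so n must be known to be small. */
    long *v = alloca(sizeof(*v) * n);

    for (i = 0; i < n; i++)
        v[i] = (long)(i * i);
    for (i = 0; i < n; i++)
        total += v[i];
    return total;
}
```

The buffer lives in the caller's stack frame exactly as the VLA did, so the cache behaviour should be comparable, but the size now has to be computed by hand.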
> > >
> > > * use dynamically allocated storage
This would probably have lower performance than alloca() due to using "cold" memory, as opposed to memory on the stack.
And it needs to be explicitly freed again, which is somewhat annoying, compared to automatically freed memory.
> > >
> > > * conditional compiled code where the msvc leg uses one of the
> previous
> > > two options
I agree with Stephen on this: Whatever VLA alternative we choose for MSVC, other compilers can use that too. There is no need for #ifdefs to keep VLAs for other compilers.
> > >
> > > i'll leave it simple for now, i'd like to hear input rather than
> provide
> > > a recommendation for now.
> > >
> >
> > VLAs are a bug magnet. Best to avoid them, most code doesn't need
> them.
>
> just in case i didn't clarify properly early when i said they were
> optional i meant they used to be non-optional.
VLAs were standard in C99, and became optional in C11.
> the intent of the RFC
> here isn't that i want to add more but i'm looking for the best
> approach
> to getting rid of the ones we already have.
> > The one common use case is code that accepts a burst of packets.
> > But such code could easily have an upper bound if necessary.
Exactly!
I suggest that we forbid the use of VLAs.
For fast path code, constant-size arrays should be strongly recommended.
For non-fast path code, use alloca() or whatever VLA alternative is convenient on a case-by-case basis.
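A minimal sketch of the constant-size approach for burst code, assuming a hypothetical upper bound MAX_BURST and a hypothetical per-chunk worker lookup_chunk(); none of these names are DPDK APIs. A burst of any size is processed in chunks that fit a fixed-size stack array:

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BURST 32  /* hypothetical compile-time upper bound */

/* Hypothetical stage: writes one result per input item. */
static void lookup_chunk(const uint32_t *keys, int32_t *positions, size_t n)
{
    for (size_t i = 0; i < n; i++)
        positions[i] = (int32_t)(keys[i] & 0xff);
}

/* Processes nb_keys keys without a VLA; returns the sum of the results. */
int64_t process_burst(const uint32_t *keys, size_t nb_keys)
{
    int32_t positions[MAX_BURST];  /* constant size, not a VLA */
    int64_t sum = 0;

    while (nb_keys > 0) {
        size_t n = nb_keys < MAX_BURST ? nb_keys : MAX_BURST;

        lookup_chunk(keys, positions, n);
        for (size_t i = 0; i < n; i++)
            sum += positions[i];
        keys += n;
        nb_keys -= n;
    }
    return sum;
}
```

Chunking keeps the stack usage bounded regardless of what the caller passes, at the cost of an extra loop around the original body.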
Perhaps checkpatches can detect the use of VLAs? Or it could be updated to check for them.
For reference, VLAs are forbidden in the Linux Kernel [1]. A good excuse for also forbidding them in DPDK. ;-)
[1]: https://www.phoronix.com/news/Linux-Kills-The-VLA
> >
> > Please don't add more to the maze of #ifdef's
>
> thanks! i'll keep this in mind.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFC acceptable handling of VLAs across toolchains
2023-11-07 19:32 RFC acceptable handling of VLAs across toolchains Tyler Retzlaff
2023-11-08 2:31 ` Stephen Hemminger
@ 2023-11-08 16:51 ` Stephen Hemminger
2023-11-08 17:48 ` Morten Brørup
2023-11-09 20:26 ` RFC acceptable handling of VLAs across toolchains Tyler Retzlaff
2024-04-04 17:15 ` [PATCH 0/4] RFC samples converting VLA to alloca Tyler Retzlaff
2 siblings, 2 replies; 34+ messages in thread
From: Stephen Hemminger @ 2023-11-08 16:51 UTC (permalink / raw)
To: Tyler Retzlaff; +Cc: dev
On Tue, 7 Nov 2023 11:32:20 -0800
Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
> hi folks,
>
> i'm seeking advice. we have use of VLAs which are now optional in
> standard C. some toolchains provide a conformant implementation and msvc
> does not (and never will).
>
> we seem to have a few options, just curious about what people would
> prefer.
>
> * use alloca
>
> * use dynamically allocated storage
>
> * conditional compiled code where the msvc leg uses one of the previous
> two options
>
> i'll leave it simple for now, i'd like to hear input rather than provide
> a recommendation for now.
>
> thanks!
As an experiment, I did a build of current DPDK with the -Wvla option.
Lots of warnings; some have obvious solutions, like:
../drivers/net/failsafe/failsafe_intr.c: In function ‘fs_rx_event_proxy_service_install’:
../drivers/net/failsafe/failsafe_intr.c:142:17: warning: ISO C90 forbids variable length array ‘service_core_list’ [-Wvla]
142 | uint32_t service_core_list[num_service_cores];
| ^~~~~~~~
This array could just be sized by RTE_MAX_LCORE.
Others, like rte_metrics, should just use malloc(), which is already used in
that function.
../lib/metrics/rte_metrics_telemetry.c: In function ‘rte_metrics_tel_update_metrics_ethdev’:
../lib/metrics/rte_metrics_telemetry.c:140:9: warning: ISO C90 forbids variable length array ‘xstats_values’ [-Wvla]
140 | uint64_t xstats_values[num_xstats];
| ^~~~~~~~
../lib/metrics/rte_metrics_telemetry.c: In function ‘rte_metrics_tel_extract_data’:
../lib/metrics/rte_metrics_telemetry.c:384:9: warning: ISO C90 forbids variable length array ‘stat_names’ [-Wvla]
384 | const char *stat_names[num_stat_names];
| ^~~~~
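A sketch of what a malloc() conversion for a non-fast-path function might look like. The names collect_stats and fill_stats are hypothetical stand-ins, not the actual rte_metrics code. Unlike a VLA, the allocation can fail visibly and must be freed explicitly:

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a call that fills n statistics values. */
static void fill_stats(uint64_t *values, size_t n)
{
    for (size_t i = 0; i < n; i++)
        values[i] = i + 1;
}

/* Returns 0 and stores the total in *out, or -ENOMEM on allocation failure. */
int collect_stats(size_t num_xstats, uint64_t *out)
{
    /* was: uint64_t xstats_values[num_xstats]; */
    uint64_t *xstats_values = malloc(sizeof(*xstats_values) * num_xstats);
    uint64_t total = 0;

    /* malloc(0) may legitimately return NULL, so only treat NULL as an
     * error when a non-empty buffer was requested. */
    if (xstats_values == NULL && num_xstats != 0)
        return -ENOMEM;

    fill_stats(xstats_values, num_xstats);
    for (size_t i = 0; i < num_xstats; i++)
        total += xstats_values[i];

    free(xstats_values);  /* explicit free replaces automatic VLA cleanup */
    *out = total;
    return 0;
}
```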
Others already have an implicit upper bound.
An example is in rte_cuckoo_hash, where some fields use RTE_HASH_LOOKUP_BULK_MAX
and some use a VLA.
[170/2868] Compiling C object lib/librte_hash.a.p/hash_rte_cuckoo_hash.c.o
../lib/hash/rte_cuckoo_hash.c: In function ‘rte_hash_lookup_bulk_data’:
../lib/hash/rte_cuckoo_hash.c:2355:9: warning: ISO C90 forbids variable length array ‘positions’ [-Wvla]
2355 | int32_t positions[num_keys];
| ^~~~~~~
../lib/hash/rte_cuckoo_hash.c: In function ‘rte_hash_lookup_with_hash_bulk_data’:
../lib/hash/rte_cuckoo_hash.c:2471:9: warning: ISO C90 forbids variable length array ‘positions’ [-Wvla]
2471 | int32_t positions[num_keys];
| ^~~~~~~
Would it make sense to have an rte_config.h value for maximum burst size?
Lots of code is using nb_pkts.
There are also some confusing ones like:
../lib/mempool/rte_mempool.c: In function ‘mempool_cache_init’:
../lib/mempool/rte_mempool.c:751:50: warning: ISO C90 forbids array whose size cannot be evaluated [-Wvla]
751 | RTE_SIZEOF_FIELD(struct rte_mempool_cache, objs[0]));
| ^~~~~~~~~~~~~~~~~
../lib/eal/include/rte_common.h:498:65: note: in definition of macro ‘RTE_BUILD_BUG_ON’
498 | #define RTE_BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
| ^~~~~~~~~
../lib/mempool/rte_mempool.c:751:26: note: in expansion of macro ‘RTE_SIZEOF_FIELD’
751 | RTE_SIZEOF_FIELD(struct rte_mempool_cache, objs[0]));
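One possible reading of this warning, offered as an assumption rather than a diagnosis of rte_mempool.c: the negative-array-size trick only yields a fixed-size array when the condition is an integer constant expression; otherwise char[...] is itself a variable length array type. The sketch below contrasts the trick with C11 static_assert. The struct and macro names are hypothetical.

```c
#include <assert.h>  /* static_assert macro (C11) */
#include <stddef.h>

struct cache {
    unsigned int len;
    void *objs[512];
};

/* Array-size trick: fails to compile only when cond is a non-zero integer
 * constant expression; a non-constant cond makes char[...] a VLA type. */
#define BUILD_BUG_ON_ARRAY(cond) ((void)sizeof(char[1 - 2 * !!(cond)]))

/* C11 alternative: rejects non-constant conditions instead of creating a VLA. */
#define BUILD_BUG_ON_ASSERT(cond) static_assert(!(cond), #cond)

#define CACHE_OBJS_COUNT \
    (sizeof(((struct cache *)0)->objs) / sizeof(((struct cache *)0)->objs[0]))

size_t cache_capacity(void)
{
    BUILD_BUG_ON_ARRAY(CACHE_OBJS_COUNT < 512);
    BUILD_BUG_ON_ASSERT(CACHE_OBJS_COUNT < 512);
    return CACHE_OBJS_COUNT;
}
```

With static_assert, a condition that is not a constant expression becomes a hard compile error, which is easier to spot than a silently generated VLA.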
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: RFC acceptable handling of VLAs across toolchains
2023-11-08 16:51 ` Stephen Hemminger
@ 2023-11-08 17:48 ` Morten Brørup
2023-11-09 10:25 ` RFC: default burst sizes in rte_config Morten Brørup
2023-11-09 20:26 ` RFC acceptable handling of VLAs across toolchains Tyler Retzlaff
1 sibling, 1 reply; 34+ messages in thread
From: Morten Brørup @ 2023-11-08 17:48 UTC (permalink / raw)
To: Stephen Hemminger, Tyler Retzlaff; +Cc: dev
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Wednesday, 8 November 2023 17.52
>
> On Tue, 7 Nov 2023 11:32:20 -0800
> Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
>
> > hi folks,
> >
> > i'm seeking advice. we have use of VLAs which are now optional in
> > standard C. some toolchains provide a conformant implementation and
> msvc
> > does not (and never will).
> >
> > we seem to have a few options, just curious about what people would
> > prefer.
> >
> > * use alloca
> >
> > * use dynamically allocated storage
> >
> > * conditional compiled code where the msvc leg uses one of the
> previous
> > two options
> >
> > i'll leave it simple for now, i'd like to hear input rather than
> provide
> > a recommendation for now.
> >
> > thanks!
>
> As an experiment did a build of current DPDK with -Wvla option.
>
> Lots of errors, some have obvious solutions like:
>
> ../drivers/net/failsafe/failsafe_intr.c: In function
> ‘fs_rx_event_proxy_service_install’:
> ../drivers/net/failsafe/failsafe_intr.c:142:17: warning: ISO C90
> forbids variable length array ‘service_core_list’ [-Wvla]
> 142 | uint32_t service_core_list[num_service_cores];
> | ^~~~~~~~
>
> This could just be RTE_MAX_LCORES.
>
> others like rte_metrics should just use malloc() as is used already in
> that function.
>
> ../lib/metrics/rte_metrics_telemetry.c: In function
> ‘rte_metrics_tel_update_metrics_ethdev’:
> ../lib/metrics/rte_metrics_telemetry.c:140:9: warning: ISO C90 forbids
> variable length array ‘xstats_values’ [-Wvla]
> 140 | uint64_t xstats_values[num_xstats];
> | ^~~~~~~~
> ../lib/metrics/rte_metrics_telemetry.c: In function
> ‘rte_metrics_tel_extract_data’:
> ../lib/metrics/rte_metrics_telemetry.c:384:9: warning: ISO C90 forbids
> variable length array ‘stat_names’ [-Wvla]
> 384 | const char *stat_names[num_stat_names];
> | ^~~~~
>
> Others already have an implicit upper bound.
> Example is in rte_cuckoo_hash where some fields us
> RTE_HASH_LOOKUP_BULK_MAX
> and some use VLA.
>
> [170/2868] Compiling C object
> lib/librte_hash.a.p/hash_rte_cuckoo_hash.c.o
> ../lib/hash/rte_cuckoo_hash.c: In function ‘rte_hash_lookup_bulk_data’:
> ../lib/hash/rte_cuckoo_hash.c:2355:9: warning: ISO C90 forbids variable
> length array ‘positions’ [-Wvla]
> 2355 | int32_t positions[num_keys];
> | ^~~~~~~
> ../lib/hash/rte_cuckoo_hash.c: In function
> ‘rte_hash_lookup_with_hash_bulk_data’:
> ../lib/hash/rte_cuckoo_hash.c:2471:9: warning: ISO C90 forbids variable
> length array ‘positions’ [-Wvla]
> 2471 | int32_t positions[num_keys];
> | ^~~~~~~
>
> Would it make sense to have an rte_config.h value for maximum burst
> size?
I would support that! There could be a few burst size defines, e.g.
- SMALL: used for small bursts (I think some drivers use bursts of 8)
- NORMAL: used for typical bursts
- LARGE: used for large bursts, e.g. mempool cache flush
Having these available at build time would also allow more optimizations in DPDK libs and drivers for those specific burst sizes.
> Lots of code is using nb_pkts.
>
> There is also some confusing ones like:
> ../lib/mempool/rte_mempool.c: In function ‘mempool_cache_init’:
> ../lib/mempool/rte_mempool.c:751:50: warning: ISO C90 forbids array
> whose size cannot be evaluated [-Wvla]
> 751 | RTE_SIZEOF_FIELD(struct
> rte_mempool_cache, objs[0]));
> |
> ^~~~~~~~~~~~~~~~~
> ../lib/eal/include/rte_common.h:498:65: note: in definition of macro
> ‘RTE_BUILD_BUG_ON’
> 498 | #define RTE_BUILD_BUG_ON(condition) ((void)sizeof(char[1 -
> 2*!!(condition)]))
> |
> ^~~~~~~~~
> ../lib/mempool/rte_mempool.c:751:26: note: in expansion of macro
> ‘RTE_SIZEOF_FIELD’
> 751 | RTE_SIZEOF_FIELD(struct
> rte_mempool_cache, objs[0]));
^ permalink raw reply [flat|nested] 34+ messages in thread
* RFC: default burst sizes in rte_config
2023-11-08 17:48 ` Morten Brørup
@ 2023-11-09 10:25 ` Morten Brørup
0 siblings, 0 replies; 34+ messages in thread
From: Morten Brørup @ 2023-11-09 10:25 UTC (permalink / raw)
To: Stephen Hemminger, bruce.richardson, thomas; +Cc: dev, Tyler Retzlaff
> From: Morten Brørup [mailto:mb@smartsharesystems.com]
> Sent: Wednesday, 8 November 2023 18.49
>
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Wednesday, 8 November 2023 17.52
> >
[...]
> >
> > Would it make sense to have an rte_config.h value for maximum burst
> > size?
>
> I would support that!
It would also be a good place to document the reasoning behind the choice of burst size, so application developers can better understand how to fine tune the values according to available hardware and application specific requirements.
Those build-time configurable values should also be used by DPDK libraries, instead of more or less randomly chosen hardcoded burst sizes.
E.g. when I implemented rte_pktmbuf_free_bulk(), I considered 64 plenty of burst capacity, because it was double the size of the traditional burst size of 32. But it is probably sub-optimal for applications using a default burst size of 128.
> There could be a few burst size defines, e.g.
>
> - SMALL: used for small bursts (I think some drivers use bursts of 8)
The reason for choosing 8 is probably rooted in cache alignment:
Eight 64-bit pointers cover one 64 B cache line.
I wonder if those drivers would perform better using bursts of 16 mbufs on 32-bit architectures, or on 64-bit architectures with 128 B cache line size?
> - NORMAL: used for typical bursts
This is usually a balance between latency and throughput:
Using shorter bursts can reduce the latency (if the application is designed with this in mind).
Using larger bursts improves processing performance, and thus increases throughput.
There is also some upper limit:
If the burst is too large, the amount of memory touched by a pipeline stage might not fit into the CPU data cache size, and performance drops like a rock.
E.g. a CPU with 64 B cache line size and 32 KB L1 data cache per lcore can hold 512 cache lines in its L1 data cache, so a burst of 32 mbufs allows touching an average of 512/32 = 16 cache lines per packet.
The mbuf structure itself uses 2 cache lines, so the max theoretical burst would be 512/2 = 256 if no other memory was touched.
However, the array holding the mbuf pointers is also touched, so I would put 128 as the largest good burst size on such a CPU.
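The arithmetic above as a small sketch. The helper names are hypothetical, and the inputs in the usage note are the example figures from this message (64 B cache lines, 32 KB L1 data cache, 2 cache lines per mbuf):

```c
#include <stddef.h>

/* Number of cache lines that fit in the L1 data cache. */
size_t l1_cache_lines(size_t l1_bytes, size_t line_bytes)
{
    return l1_bytes / line_bytes;
}

/* Average number of cache lines available per packet for a burst size. */
size_t lines_per_packet(size_t l1_bytes, size_t line_bytes, size_t burst)
{
    return l1_cache_lines(l1_bytes, line_bytes) / burst;
}

/* Theoretical maximum burst if only the mbuf structures were touched. */
size_t max_burst(size_t l1_bytes, size_t line_bytes, size_t lines_per_mbuf)
{
    return l1_cache_lines(l1_bytes, line_bytes) / lines_per_mbuf;
}
```

For example, l1_cache_lines(32 * 1024, 64) gives 512, lines_per_packet(32 * 1024, 64, 32) gives 16, and max_burst(32 * 1024, 64, 2) gives 256.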
> - LARGE: used for large bursts, e.g. mempool cache flush
If kept at 512, like the magnitude of the mempool cache flushes/refills, it should only be used for moving mbuf pointers around, without touching the mbufs themselves, or the CPU's L1 data cache will overflow.
>
> Having these available at build time would also allow more
> optimizations in DPDK libs and drivers for those specific burst sizes.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFC acceptable handling of VLAs across toolchains
2023-11-08 16:51 ` Stephen Hemminger
2023-11-08 17:48 ` Morten Brørup
@ 2023-11-09 20:26 ` Tyler Retzlaff
2024-03-21 0:12 ` Tyler Retzlaff
1 sibling, 1 reply; 34+ messages in thread
From: Tyler Retzlaff @ 2023-11-09 20:26 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev
On Wed, Nov 08, 2023 at 08:51:54AM -0800, Stephen Hemminger wrote:
> On Tue, 7 Nov 2023 11:32:20 -0800
> Tyler Retzlaff <roretzla@linux.microsoft.com> wrote:
>
> > hi folks,
> >
> > i'm seeking advice. we have use of VLAs which are now optional in
> > standard C. some toolchains provide a conformant implementation and msvc
> > does not (and never will).
> >
> > we seem to have a few options, just curious about what people would
> > prefer.
> >
> > * use alloca
> >
> > * use dynamically allocated storage
> >
> > * conditional compiled code where the msvc leg uses one of the previous
> > two options
> >
> > i'll leave it simple for now, i'd like to hear input rather than provide
> > a recommendation for now.
> >
> > thanks!
>
> As an experiment did a build of current DPDK with -Wvla option.
so maybe what i will do here is put a series up that converts to alloca()
for libs and enables -Wvla. as a part of the review we can discuss, on a
case-by-case basis, keeping alloca or converting to regular C arrays?
for the items identified below i'll make the conversions as you have
suggested in v1 of the series and seek further comment.
>
> Lots of errors, some have obvious solutions like:
>
> ../drivers/net/failsafe/failsafe_intr.c: In function ‘fs_rx_event_proxy_service_install’:
> ../drivers/net/failsafe/failsafe_intr.c:142:17: warning: ISO C90 forbids variable length array ‘service_core_list’ [-Wvla]
> 142 | uint32_t service_core_list[num_service_cores];
> | ^~~~~~~~
>
> This could just be RTE_MAX_LCORES.
>
> others like rte_metrics should just use malloc() as is used already in
> that function.
>
> ../lib/metrics/rte_metrics_telemetry.c: In function ‘rte_metrics_tel_update_metrics_ethdev’:
> ../lib/metrics/rte_metrics_telemetry.c:140:9: warning: ISO C90 forbids variable length array ‘xstats_values’ [-Wvla]
> 140 | uint64_t xstats_values[num_xstats];
> | ^~~~~~~~
> ../lib/metrics/rte_metrics_telemetry.c: In function ‘rte_metrics_tel_extract_data’:
> ../lib/metrics/rte_metrics_telemetry.c:384:9: warning: ISO C90 forbids variable length array ‘stat_names’ [-Wvla]
> 384 | const char *stat_names[num_stat_names];
> | ^~~~~
>
> Others already have an implicit upper bound.
> Example is in rte_cuckoo_hash where some fields us RTE_HASH_LOOKUP_BULK_MAX
> and some use VLA.
>
> [170/2868] Compiling C object lib/librte_hash.a.p/hash_rte_cuckoo_hash.c.o
> ../lib/hash/rte_cuckoo_hash.c: In function ‘rte_hash_lookup_bulk_data’:
> ../lib/hash/rte_cuckoo_hash.c:2355:9: warning: ISO C90 forbids variable length array ‘positions’ [-Wvla]
> 2355 | int32_t positions[num_keys];
> | ^~~~~~~
> ../lib/hash/rte_cuckoo_hash.c: In function ‘rte_hash_lookup_with_hash_bulk_data’:
> ../lib/hash/rte_cuckoo_hash.c:2471:9: warning: ISO C90 forbids variable length array ‘positions’ [-Wvla]
> 2471 | int32_t positions[num_keys];
> | ^~~~~~~
>
> Would it make sense to have an rte_config.h value for maximum burst size?
> Lots of code is using nb_pkts.
>
> There is also some confusing ones like:
> ../lib/mempool/rte_mempool.c: In function ‘mempool_cache_init’:
> ../lib/mempool/rte_mempool.c:751:50: warning: ISO C90 forbids array whose size cannot be evaluated [-Wvla]
> 751 | RTE_SIZEOF_FIELD(struct rte_mempool_cache, objs[0]));
> | ^~~~~~~~~~~~~~~~~
> ../lib/eal/include/rte_common.h:498:65: note: in definition of macro ‘RTE_BUILD_BUG_ON’
> 498 | #define RTE_BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
> | ^~~~~~~~~
> ../lib/mempool/rte_mempool.c:751:26: note: in expansion of macro ‘RTE_SIZEOF_FIELD’
> 751 | RTE_SIZEOF_FIELD(struct rte_mempool_cache, objs[0]));
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: RFC acceptable handling of VLAs across toolchains
2023-11-09 20:26 ` RFC acceptable handling of VLAs across toolchains Tyler Retzlaff
@ 2024-03-21 0:12 ` Tyler Retzlaff
0 siblings, 0 replies; 34+ messages in thread
From: Tyler Retzlaff @ 2024-03-21 0:12 UTC (permalink / raw)
To: Stephen Hemminger; +Cc: dev
So just top posting to revive this discussion.
i spent some time going through lib and drivers and the use of VLAs is
very extensive. additionally, i have learned that there is some syntax
improvement value in using them over alloca() in spite of neither being
able to report allocation failure.
i would like to propose that for msvc we only adapt code that targets
windows, converting it to use alloca(), and that for all of the non-windows
built code we leave existing VLA use, perhaps adding -Wvla and suppressing
the warning on existing uses.
what are the advantages of VLA over alloca()?
* sizeof(array) works as expected.
* multi-dimensional arrays are still arrays instead of pointers to
dynamically allocated space. this means multiple subscript syntax
works (unlike on a pointer) and addresses of elements in the
multi-dimensional array are in ascending order. you can approximate
a multi-dimensional array with alloca() but it isn't one and you
are burdened with loops for initialization and non-subscripted
syntax to access elements making code harder to maintain.
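A small sketch of the two differences listed above, assuming a GCC or Clang toolchain with VLA support and a glibc-style <alloca.h>. The function names are hypothetical, and both functions expect rows and cols to be at least 1:

```c
#include <alloca.h>  /* assumption: glibc; MSVC uses _alloca() from <malloc.h> */
#include <stddef.h>

/* VLA version: sizeof sees the whole array and v[r][c] indexing works. */
size_t vla_bytes(size_t rows, size_t cols)
{
    int v[rows][cols];

    v[rows - 1][cols - 1] = 1;  /* two subscripts, compiler computes the offset */
    (void)v;
    return sizeof(v);           /* rows * cols * sizeof(int), evaluated at run time */
}

/* alloca() version: only a pointer, so size and offsets are tracked by hand. */
size_t alloca_bytes(size_t rows, size_t cols)
{
    const size_t sz = sizeof(int) * rows * cols;
    int *v = alloca(sz);

    v[(rows - 1) * cols + (cols - 1)] = 1;  /* manual row-major indexing */
    /* sizeof(v) here would be the size of the pointer, not of the buffer. */
    return sz;
}
```

Any code that previously relied on sizeof(array) has to be changed to carry the computed size in a separate variable, which is what makes the conversion more than a one-line change.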
for the above reasons i'd recommend only converting to alloca() where
necessary (msvc has to compile it) and for the other instances leave
them as they are. the alternative is that i could just conditionally compile
and duplicate code at usage sites, which is pretty ugly.
appreciate input/feedback here.
thanks!
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 0/4] RFC samples converting VLA to alloca
2023-11-07 19:32 RFC acceptable handling of VLAs across toolchains Tyler Retzlaff
2023-11-08 2:31 ` Stephen Hemminger
2023-11-08 16:51 ` Stephen Hemminger
@ 2024-04-04 17:15 ` Tyler Retzlaff
2024-04-04 17:15 ` [PATCH 1/4] latencystats: use alloca instead of vla trivial Tyler Retzlaff
` (4 more replies)
2 siblings, 5 replies; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-04 17:15 UTC (permalink / raw)
To: dev
Cc: Bruce Richardson, Stephen Hemminger, Thomas Monjalon,
Morten Brørup, Tyler Retzlaff
This series is not intended for merge. It instead provides examples of
what converting use of VLAs to alloca() would look like.
what are the advantages of VLA over alloca()?
* sizeof(array) works as expected.
* multi-dimensional arrays are still arrays instead of pointers to
dynamically allocated space. this means multiple subscript syntax
works (unlike on a pointer) and calculation of addresses into allocated
space in ascending order is performed by the compiler instead of manually.
what are the disadvantages of VLA over alloca()?
* VLA generation is subtle/implicit. there do appear to be places where
a VLA is being used where it perhaps was not intended but it is hard
to spot, e.g. hotpath rte_mbuf *array[burst_size]; where burst_size
is not a constant expression, or in other syntax positions
that are not intuitive, see patchwork link.
https://patchwork.dpdk.org/project/dpdk/patch/1699896038-28106-1-git-send-email-roretzla@linux.microsoft.com/
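A sketch of how a VLA can appear unintentionally in C (all names here are hypothetical). A const-qualified variable is not an integer constant expression in C, so an array sized by it is a VLA even though the size looks fixed, while an enum constant or a macro gives an ordinary array:

```c
#include <stddef.h>

enum { BURST_SIZE = 32 };  /* hypothetical constant: a true constant expression */

size_t implicit_vla_bytes(void)
{
    const size_t burst_size = 32;  /* const object, NOT a constant expression in C */
    void *array[burst_size];       /* a VLA; e.g. GCC with -Wvla reports it */

    array[0] = NULL;
    (void)array;
    return sizeof(array);
}

size_t fixed_array_bytes(void)
{
    void *array[BURST_SIZE];       /* ordinary fixed-size array */

    array[0] = NULL;
    (void)array;
    return sizeof(array);
}
```

Both functions return the same size, which is part of why the first form is hard to spot in review without a compiler warning enabled.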
for the above reasons i'd recommend only converting to alloca() where
necessary (msvc has to compile it) and for the other instances leave
them as they are.
Tyler Retzlaff (4):
latencystats: use alloca instead of vla trivial
hash: use alloca instead of vla trivial
vhost: use alloca instead of vla sizeof
dispatcher: use alloca instead of vla multi dimensional
lib/dispatcher/rte_dispatcher.c | 6 +++---
lib/hash/rte_thash.c | 2 +-
lib/latencystats/rte_latencystats.c | 2 +-
lib/vhost/socket.c | 5 +++--
4 files changed, 8 insertions(+), 7 deletions(-)
--
1.8.3.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 1/4] latencystats: use alloca instead of vla trivial
2024-04-04 17:15 ` [PATCH 0/4] RFC samples converting VLA to alloca Tyler Retzlaff
@ 2024-04-04 17:15 ` Tyler Retzlaff
2024-04-06 15:28 ` Morten Brørup
2024-04-04 17:15 ` [PATCH 2/4] hash: " Tyler Retzlaff
` (3 subsequent siblings)
4 siblings, 1 reply; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-04 17:15 UTC (permalink / raw)
To: dev
Cc: Bruce Richardson, Stephen Hemminger, Thomas Monjalon,
Morten Brørup, Tyler Retzlaff
RFC sample illustrating simple conversion of VLA to alloca().
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
---
lib/latencystats/rte_latencystats.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/latencystats/rte_latencystats.c b/lib/latencystats/rte_latencystats.c
index 4ea9b0d..f59a9eb 100644
--- a/lib/latencystats/rte_latencystats.c
+++ b/lib/latencystats/rte_latencystats.c
@@ -159,7 +159,7 @@ struct latency_stats_nameoff {
{
unsigned int i, cnt = 0;
uint64_t now;
- float latency[nb_pkts];
+ float *latency = alloca(sizeof(float) * nb_pkts);
static float prev_latency;
/*
* Alpha represents degree of weighting decrease in EWMA,
--
1.8.3.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 2/4] hash: use alloca instead of vla trivial
2024-04-04 17:15 ` [PATCH 0/4] RFC samples converting VLA to alloca Tyler Retzlaff
2024-04-04 17:15 ` [PATCH 1/4] latencystats: use alloca instead of vla trivial Tyler Retzlaff
@ 2024-04-04 17:15 ` Tyler Retzlaff
2024-04-06 16:01 ` Morten Brørup
2024-04-04 17:15 ` [PATCH 3/4] vhost: use alloca instead of vla sizeof Tyler Retzlaff
` (2 subsequent siblings)
4 siblings, 1 reply; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-04 17:15 UTC (permalink / raw)
To: dev
Cc: Bruce Richardson, Stephen Hemminger, Thomas Monjalon,
Morten Brørup, Tyler Retzlaff
RFC sample illustrating simple conversion of VLA to alloca() where
dimension multiplier removed.
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
---
lib/hash/rte_thash.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/hash/rte_thash.c b/lib/hash/rte_thash.c
index 68f653f..633e211 100644
--- a/lib/hash/rte_thash.c
+++ b/lib/hash/rte_thash.c
@@ -771,7 +771,7 @@ struct rte_thash_subtuple_helper *
uint32_t desired_value, unsigned int attempts,
rte_thash_check_tuple_t fn, void *userdata)
{
- uint32_t tmp_tuple[tuple_len / sizeof(uint32_t)];
+ uint32_t *tmp_tuple = alloca(tuple_len);
unsigned int i, j, ret = 0;
uint32_t hash, adj_bits;
const uint8_t *hash_key;
--
1.8.3.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 3/4] vhost: use alloca instead of vla sizeof
2024-04-04 17:15 ` [PATCH 0/4] RFC samples converting VLA to alloca Tyler Retzlaff
2024-04-04 17:15 ` [PATCH 1/4] latencystats: use alloca instead of vla trivial Tyler Retzlaff
2024-04-04 17:15 ` [PATCH 2/4] hash: " Tyler Retzlaff
@ 2024-04-04 17:15 ` Tyler Retzlaff
2024-04-06 22:30 ` Morten Brørup
2024-04-04 17:15 ` [PATCH 4/4] dispatcher: use alloca instead of vla multi dimensional Tyler Retzlaff
2024-04-07 9:31 ` [PATCH 0/4] RFC samples converting VLA to alloca Mattias Rönnblom
4 siblings, 1 reply; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-04 17:15 UTC (permalink / raw)
To: dev
Cc: Bruce Richardson, Stephen Hemminger, Thomas Monjalon,
Morten Brørup, Tyler Retzlaff
RFC sample illustrating conversion of VLA to alloca() where
sizeof(array) was in use.
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
---
lib/vhost/socket.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 96b3ab5..cedcfb2 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -110,7 +110,8 @@ struct vhost_user {
{
struct iovec iov;
struct msghdr msgh;
- char control[CMSG_SPACE(max_fds * sizeof(int))];
+ const size_t control_sz = sizeof(char) * CMSG_SPACE(max_fds * sizeof(int));
+ char *control = alloca(control_sz);
struct cmsghdr *cmsg;
int got_fds = 0;
int ret;
@@ -124,7 +125,7 @@ struct vhost_user {
msgh.msg_iov = &iov;
msgh.msg_iovlen = 1;
msgh.msg_control = control;
- msgh.msg_controllen = sizeof(control);
+ msgh.msg_controllen = control_sz;
ret = recvmsg(sockfd, &msgh, 0);
if (ret <= 0) {
--
1.8.3.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* [PATCH 4/4] dispatcher: use alloca instead of vla multi dimensional
2024-04-04 17:15 ` [PATCH 0/4] RFC samples converting VLA to alloca Tyler Retzlaff
` (2 preceding siblings ...)
2024-04-04 17:15 ` [PATCH 3/4] vhost: use alloca instead of vla sizeof Tyler Retzlaff
@ 2024-04-04 17:15 ` Tyler Retzlaff
2024-04-06 15:49 ` Morten Brørup
2024-04-07 9:31 ` [PATCH 0/4] RFC samples converting VLA to alloca Mattias Rönnblom
4 siblings, 1 reply; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-04 17:15 UTC (permalink / raw)
To: dev
Cc: Bruce Richardson, Stephen Hemminger, Thomas Monjalon,
Morten Brørup, Tyler Retzlaff
RFC sample illustrating conversion of multi-dimensional VLA to use
alloca().
Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
---
lib/dispatcher/rte_dispatcher.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/lib/dispatcher/rte_dispatcher.c b/lib/dispatcher/rte_dispatcher.c
index 7934917..f154c26 100644
--- a/lib/dispatcher/rte_dispatcher.c
+++ b/lib/dispatcher/rte_dispatcher.c
@@ -119,7 +119,7 @@ struct rte_dispatcher {
struct rte_event *events, uint16_t num_events)
{
int i;
- struct rte_event bursts[EVD_MAX_HANDLERS][num_events];
+ struct rte_event *bursts = alloca(sizeof(struct rte_event) * EVD_MAX_HANDLERS * num_events);
uint16_t burst_lens[EVD_MAX_HANDLERS] = { 0 };
uint16_t drop_count = 0;
uint16_t dispatch_count;
@@ -136,7 +136,7 @@ struct rte_dispatcher {
continue;
}
- bursts[handler_idx][burst_lens[handler_idx]] = *event;
+ bursts[handler_idx * num_events + burst_lens[handler_idx]] = *event;
burst_lens[handler_idx]++;
}
@@ -152,7 +152,7 @@ struct rte_dispatcher {
continue;
handler->process_fun(dispatcher->event_dev_id, port->port_id,
- bursts[i], len, handler->process_data);
+ &bursts[i * num_events], len, handler->process_data);
dispatched += len;
--
1.8.3.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 1/4] latencystats: use alloca instead of vla trivial
2024-04-04 17:15 ` [PATCH 1/4] latencystats: use alloca instead of vla trivial Tyler Retzlaff
@ 2024-04-06 15:28 ` Morten Brørup
2024-04-07 9:36 ` Mattias Rönnblom
0 siblings, 1 reply; 34+ messages in thread
From: Morten Brørup @ 2024-04-06 15:28 UTC (permalink / raw)
To: Tyler Retzlaff, Stephen Hemminger, Thomas Monjalon, Bruce Richardson; +Cc: dev
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Thursday, 4 April 2024 19.15
>
> RFC sample illustrating simple conversion of VLA to alloca().
>
> Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> ---
[...]
> --- a/lib/latencystats/rte_latencystats.c
> +++ b/lib/latencystats/rte_latencystats.c
> @@ -159,7 +159,7 @@ struct latency_stats_nameoff {
> {
> unsigned int i, cnt = 0;
> uint64_t now;
> - float latency[nb_pkts];
> + float *latency = alloca(sizeof(float) * nb_pkts);
In cases where we are processing packet bursts, I would prefer introducing a global #define RTE_MAX_PKT_BURST_SIZE, indicating the max packet burst size supported by libraries and drivers.
For reference, rte_config.h already has #define RTE_GRAPH_BURST_SIZE 256.
Such a common define should also be used by functions such as rte_pktmbuf_free_bulk(); although it also supports segmented packets, so it must still be able to handle more mbufs.
https://elixir.bootlin.com/dpdk/v24.03/source/lib/mbuf/rte_mbuf.c#L486
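A minimal sketch of how a constant-size scratch array can still serve bursts larger than the bound, by working through the input in chunks. SKETCH_MAX_BURST and sum_latency_chunked() are hypothetical names chosen for illustration; neither exists in DPDK, and the real latencystats code differs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bound for illustration; not an existing DPDK define. */
#define SKETCH_MAX_BURST 64

/*
 * Sum (now - timestamp) over nb_pkts entries. The scratch array has a
 * constant size, so no VLA is needed; larger inputs are handled in
 * chunks of at most SKETCH_MAX_BURST entries.
 */
static uint64_t
sum_latency_chunked(const uint64_t *timestamps, size_t nb_pkts, uint64_t now)
{
	uint64_t latency[SKETCH_MAX_BURST];
	uint64_t total = 0;
	size_t done = 0;

	assert(timestamps != NULL || nb_pkts == 0);

	while (done < nb_pkts) {
		size_t n = nb_pkts - done;
		size_t i;

		if (n > SKETCH_MAX_BURST)
			n = SKETCH_MAX_BURST;

		/* Fill the scratch array for this chunk only. */
		for (i = 0; i < n; i++)
			latency[i] = now - timestamps[done + i];

		for (i = 0; i < n; i++)
			total += latency[i];

		done += n;
	}

	return total;
}
```

The cost of this shape is an extra loop level; the benefit is that stack usage is known at build time regardless of what the caller passes.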
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 4/4] dispatcher: use alloca instead of vla multi dimensional
2024-04-04 17:15 ` [PATCH 4/4] dispatcher: use alloca instead of vla multi dimensional Tyler Retzlaff
@ 2024-04-06 15:49 ` Morten Brørup
0 siblings, 0 replies; 34+ messages in thread
From: Morten Brørup @ 2024-04-06 15:49 UTC (permalink / raw)
To: Tyler Retzlaff, dev; +Cc: Bruce Richardson, Stephen Hemminger, Thomas Monjalon
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Thursday, 4 April 2024 19.15
>
> RFC sample illustrating conversion of multi-dimensional VLA to use
> alloca().
>
> Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> ---
> lib/dispatcher/rte_dispatcher.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/lib/dispatcher/rte_dispatcher.c
> b/lib/dispatcher/rte_dispatcher.c
> index 7934917..f154c26 100644
> --- a/lib/dispatcher/rte_dispatcher.c
> +++ b/lib/dispatcher/rte_dispatcher.c
> @@ -119,7 +119,7 @@ struct rte_dispatcher {
> struct rte_event *events, uint16_t num_events)
> {
> int i;
> - struct rte_event bursts[EVD_MAX_HANDLERS][num_events];
> + struct rte_event *bursts = alloca(sizeof(struct rte_event) *
> EVD_MAX_HANDLERS * num_events);
This is an interesting example, because keeping the allocated memory tight probably has better cache performance than a max sized multi dimensional array, such as bursts[EVD_MAX_HANDLERS][RTE_MAX_EVENT_BURST_SIZE].
And multiplication on modern CPUs is not much slower than left shifting, as it was on CPUs ages ago.
So this suggested solution for multi dimensional arrays seems preferable.
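The index arithmetic that the compiler used to do for the two-dimensional VLA can be sketched as below. The helper names are hypothetical and int stands in for struct rte_event to keep the example self-contained.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major offset of logical element [row][col] in a flat buffer. */
static size_t
flat_index(size_t row, size_t col, size_t n_cols)
{
	assert(col < n_cols);
	return row * n_cols + col;
}

/* Equivalent of "buf[row][col] = value" on a two-dimensional array. */
static void
flat_store(int *buf, size_t n_cols, size_t row, size_t col, int value)
{
	buf[flat_index(row, col, n_cols)] = value;
}

/* Equivalent of passing "buf[row]" where a pointer to a row is expected. */
static int *
flat_row(int *buf, size_t n_cols, size_t row)
{
	return &buf[row * n_cols];
}
```

With n_cols set to num_events, flat_store() corresponds to the rewritten assignment in the patch and flat_row() to the &bursts[i * num_events] argument.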
> uint16_t burst_lens[EVD_MAX_HANDLERS] = { 0 };
> uint16_t drop_count = 0;
> uint16_t dispatch_count;
> @@ -136,7 +136,7 @@ struct rte_dispatcher {
> continue;
> }
>
> - bursts[handler_idx][burst_lens[handler_idx]] = *event;
> + bursts[handler_idx * num_events + burst_lens[handler_idx]]
> = *event;
> burst_lens[handler_idx]++;
> }
>
> @@ -152,7 +152,7 @@ struct rte_dispatcher {
> continue;
>
> handler->process_fun(dispatcher->event_dev_id, port-
> >port_id,
> - bursts[i], len, handler->process_data);
> + &bursts[i * num_events], len, handler-
> >process_data);
>
> dispatched += len;
>
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 2/4] hash: use alloca instead of vla trivial
2024-04-04 17:15 ` [PATCH 2/4] hash: " Tyler Retzlaff
@ 2024-04-06 16:01 ` Morten Brørup
0 siblings, 0 replies; 34+ messages in thread
From: Morten Brørup @ 2024-04-06 16:01 UTC (permalink / raw)
To: Tyler Retzlaff, Bruce Richardson, Stephen Hemminger, Thomas Monjalon; +Cc: dev
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Thursday, 4 April 2024 19.15
>
> RFC sample illustrating simple conversion of VLA to alloca() where
> dimension multiplier removed.
>
> Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> ---
[...]
> {
> - uint32_t tmp_tuple[tuple_len / sizeof(uint32_t)];
> + uint32_t *tmp_tuple = alloca(tuple_len);
This code is in the rte_thash_adjust_tuple() function [1].
I think we could use a constant size array here, making it large enough for what we think would suffice for any realistic purpose.
The function could check the tuple_len parameter at runtime and return an error value if too big for the array.
It could also check the tuple_len parameter, if constant, at build time.
[1]: https://elixir.bootlin.com/dpdk/v24.03/source/lib/hash/rte_thash.c#L768
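A minimal sketch of the constant-size alternative with both checks, under stated assumptions: SKETCH_MAX_TUPLE_LEN, SKETCH_IPV4_TUPLE_LEN and copy_tuple_checked() are hypothetical, the bound of 64 bytes is arbitrary, and the function body is a stand-in rather than the logic of rte_thash_adjust_tuple().

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical upper bound in bytes, chosen for illustration only. */
#define SKETCH_MAX_TUPLE_LEN 64

/*
 * Copy the tuple into a constant-size scratch array and return its first
 * 32-bit word. Lengths that do not fit, or that are not a multiple of
 * four bytes, are rejected at runtime with -EINVAL.
 */
static int
copy_tuple_checked(const uint8_t *tuple, unsigned int tuple_len,
		   uint32_t *first_word)
{
	uint32_t tmp_tuple[SKETCH_MAX_TUPLE_LEN / sizeof(uint32_t)];

	if (tuple_len == 0 || tuple_len > sizeof(tmp_tuple) ||
	    tuple_len % sizeof(uint32_t) != 0)
		return -EINVAL;

	memcpy(tmp_tuple, tuple, tuple_len);
	*first_word = tmp_tuple[0];
	return 0;
}

/* Build-time check for a caller whose length is a constant expression. */
#define SKETCH_IPV4_TUPLE_LEN 12
static_assert(SKETCH_IPV4_TUPLE_LEN <= SKETCH_MAX_TUPLE_LEN,
	      "tuple does not fit in the scratch buffer");
```

Because tmp_tuple is a true array again, sizeof(tmp_tuple) gives the capacity directly and the runtime check needs no separately tracked size.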
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 3/4] vhost: use alloca instead of vla sizeof
2024-04-04 17:15 ` [PATCH 3/4] vhost: use alloca instead of vla sizeof Tyler Retzlaff
@ 2024-04-06 22:30 ` Morten Brørup
0 siblings, 0 replies; 34+ messages in thread
From: Morten Brørup @ 2024-04-06 22:30 UTC (permalink / raw)
To: Tyler Retzlaff, dev; +Cc: Bruce Richardson, Stephen Hemminger, Thomas Monjalon
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Thursday, 4 April 2024 19.15
>
> RFC sample illustrating conversion of VLA to alloca() where
> sizeof(array) was in use.
>
> Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> ---
> lib/vhost/socket.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> index 96b3ab5..cedcfb2 100644
> --- a/lib/vhost/socket.c
> +++ b/lib/vhost/socket.c
> @@ -110,7 +110,8 @@ struct vhost_user {
> {
> struct iovec iov;
> struct msghdr msgh;
> - char control[CMSG_SPACE(max_fds * sizeof(int))];
> + const size_t control_sz = sizeof(char) * CMSG_SPACE(max_fds *
> sizeof(int));
I get the point, but think multiplying with sizeof(char) is overkill.
It's a matter of personal taste; maybe it's just me.
If it was an array of a different type, e.g. int, multiplying with sizeof(int) would be required, like in the latencystats example.
Anyway, I agree with the approach here.
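The reason the size has to be carried in a separate variable can be shown with a small sketch. malloc() is used here instead of alloca() only to keep the example portable; the sizeof behaviour is the same for any pointer. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* sizeof applied to an array yields the size of the whole array. */
static size_t
array_size_demo(void)
{
	char control[64];

	return sizeof(control);
}

/* sizeof applied to a pointer yields the size of the pointer itself. */
static size_t
pointer_size_demo(void)
{
	char *control = malloc(64);
	size_t sz = sizeof(control); /* not 64 */

	free(control);
	return sz;
}
```

This is why the patch assigns control_sz to msgh.msg_controllen rather than keeping sizeof(control).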
> + char *control = alloca(control_sz);
> struct cmsghdr *cmsg;
> int got_fds = 0;
> int ret;
> @@ -124,7 +125,7 @@ struct vhost_user {
> msgh.msg_iov = &iov;
> msgh.msg_iovlen = 1;
> msgh.msg_control = control;
> - msgh.msg_controllen = sizeof(control);
> + msgh.msg_controllen = control_sz;
>
> ret = recvmsg(sockfd, &msgh, 0);
> if (ret <= 0) {
> --
> 1.8.3.1
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-04 17:15 ` [PATCH 0/4] RFC samples converting VLA to alloca Tyler Retzlaff
` (3 preceding siblings ...)
2024-04-04 17:15 ` [PATCH 4/4] dispatcher: use alloca instead of vla multi dimensional Tyler Retzlaff
@ 2024-04-07 9:31 ` Mattias Rönnblom
2024-04-07 11:07 ` Morten Brørup
4 siblings, 1 reply; 34+ messages in thread
From: Mattias Rönnblom @ 2024-04-07 9:31 UTC (permalink / raw)
To: Tyler Retzlaff, dev
Cc: Bruce Richardson, Stephen Hemminger, Thomas Monjalon, Morten Brørup
On 2024-04-04 19:15, Tyler Retzlaff wrote:
> This series is not intended for merge. It insteat provides examples of
> converting use of VLAs to alloca() would look like.
>
> what's the advantages of VLA over alloca()?
>
> * sizeof(array) works as expected.
>
> * multi-dimensional arrays are still arrays instead of pointers to
> dynamically allocated space. this means multiple subscript syntax
> works (unlike on a pointer) and calculation of addresses into allocated
> space in ascending order is performed by the compiler instead of manually.
>
alloca() is a pretty obscure mechanism, and also not a part of the C
standard. VLAs are C99, and well-known and understood, and very efficient.
> what's the disadvantage of VLA over alloca()?
>
> * VLA generation is subtl/implicit, there do appear to be places where
> a VLA is being used where it perhaps was not intended but it is hard
> to spot. e.g. hotpath rte_mbuf *array[burst_size]; where burst_size
> is not a constant expression, e.g. unintended in other syntax positions
> that are not intuitive, see patchwork link.
>
> https://patchwork.dpdk.org/project/dpdk/patch/1699896038-28106-1-git-send-email-roretzla@linux.microsoft.com/
>
> for the above reasons i'd recommend only converting to alloca() where
> necessary (msvc has to compile it) and for the other instances leave
> them as they are.
>
> Tyler Retzlaff (4):
> latencystats: use alloca instead of vla trivial
> hash: use alloca instead of vla trivial
> vhost: use alloca instead of vla sizeof
> dispatcher: use alloca instead of vla multi dimensional
>
> lib/dispatcher/rte_dispatcher.c | 6 +++---
> lib/hash/rte_thash.c | 2 +-
> lib/latencystats/rte_latencystats.c | 2 +-
> lib/vhost/socket.c | 5 +++--
> 4 files changed, 8 insertions(+), 7 deletions(-)
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 1/4] latencystats: use alloca instead of vla trivial
2024-04-06 15:28 ` Morten Brørup
@ 2024-04-07 9:36 ` Mattias Rönnblom
2024-04-07 17:00 ` Stephen Hemminger
0 siblings, 1 reply; 34+ messages in thread
From: Mattias Rönnblom @ 2024-04-07 9:36 UTC (permalink / raw)
To: Morten Brørup, Tyler Retzlaff, Stephen Hemminger,
Thomas Monjalon, Bruce Richardson
Cc: dev
On 2024-04-06 17:28, Morten Brørup wrote:
>> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
>> Sent: Thursday, 4 April 2024 19.15
>>
>> RFC sample illustrating simple conversion of VLA to alloca().
>>
>> Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
>> ---
>
> [...]
>
>> --- a/lib/latencystats/rte_latencystats.c
>> +++ b/lib/latencystats/rte_latencystats.c
>> @@ -159,7 +159,7 @@ struct latency_stats_nameoff {
>> {
>> unsigned int i, cnt = 0;
>> uint64_t now;
>> - float latency[nb_pkts];
>> + float *latency = alloca(sizeof(float) * nb_pkts);
>
> In cases where we are processing packet bursts, I would prefer introducing a global #define RTE_MAX_PKT_BURST_SIZE, indicating the max packet burst size supported by libraries and drivers.
First question: what is meant by a "packet" here? An mbuf? A
network-layer PDU? Something that in some way relates to zero or more
packets, like an rte_event? Or just any object that is sent to or received
from some DPDK API in batches or bursts?
Second question: is RTE_MAX_PKT_BURST_SIZE meant as an upper bound, so
no API can consume or produce a burst larger than this, or does it mean all
APIs literally have to support that burst size?
Third question: why not just keep VLAs?
> For reference, rte_config.h already has #define RTE_GRAPH_BURST_SIZE 256.
>
> Such a common define should also be used by functions such as rte_pktmbuf_free_bulk(); although it also supports segmented packets, so it must still be able to handle more mbufs.
> https://elixir.bootlin.com/dpdk/v24.03/source/lib/mbuf/rte_mbuf.c#L486
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-07 9:31 ` [PATCH 0/4] RFC samples converting VLA to alloca Mattias Rönnblom
@ 2024-04-07 11:07 ` Morten Brørup
2024-04-07 17:03 ` Stephen Hemminger
0 siblings, 1 reply; 34+ messages in thread
From: Morten Brørup @ 2024-04-07 11:07 UTC (permalink / raw)
To: Mattias Rönnblom, Tyler Retzlaff, dev
Cc: Bruce Richardson, Stephen Hemminger, Thomas Monjalon
> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Sunday, 7 April 2024 11.32
>
> On 2024-04-04 19:15, Tyler Retzlaff wrote:
> > This series is not intended for merge. It insteat provides examples
> of
> > converting use of VLAs to alloca() would look like.
> >
> > what's the advantages of VLA over alloca()?
> >
> > * sizeof(array) works as expected.
> >
> > * multi-dimensional arrays are still arrays instead of pointers to
> > dynamically allocated space. this means multiple subscript syntax
> > works (unlike on a pointer) and calculation of addresses into
> allocated
> > space in ascending order is performed by the compiler instead of
> manually.
> >
>
> alloca() is a pretty obscure mechanism, and also not a part of the C
> standard. VLAs are C99, and well-known and understood, and very
> efficient.
The RFC fails to mention why we need to replace VLAs with something else:
VLAs are C99, but not C++; VLAs were made optional in C11.
MSVC doesn't support VLAs, and is not going to:
https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-arriving-in-msvc/#variable-length-arrays
I dislike alloca() too, and the notes section in the alloca(3) man page even discourages the use of alloca():
https://man7.org/linux/man-pages/man3/alloca.3.html
But I guess alloca() is the simplest replacement for VLAs.
This RFC patch series opens the discussion for alternatives in different use cases.
>
> > what's the disadvantage of VLA over alloca()?
> >
> > * VLA generation is subtl/implicit, there do appear to be places where
> > a VLA is being used where it perhaps was not intended but it is
> hard
> > to spot. e.g. hotpath rte_mbuf *array[burst_size]; where burst_size
> > is not a constant expression, e.g. unintended in other syntax
> positions
> > that are not intuitive, see patchwork link.
> >
> > https://patchwork.dpdk.org/project/dpdk/patch/1699896038-28106-1-
> git-send-email-roretzla@linux.microsoft.com/
> >
> > for the above reasons i'd recommend only converting to alloca() where
> > necessary (msvc has to compile it) and for the other instances leave
> > them as they are.
> >
> > Tyler Retzlaff (4):
> > latencystats: use alloca instead of vla trivial
> > hash: use alloca instead of vla trivial
> > vhost: use alloca instead of vla sizeof
> > dispatcher: use alloca instead of vla multi dimensional
> >
> > lib/dispatcher/rte_dispatcher.c | 6 +++---
> > lib/hash/rte_thash.c | 2 +-
> > lib/latencystats/rte_latencystats.c | 2 +-
> > lib/vhost/socket.c | 5 +++--
> > 4 files changed, 8 insertions(+), 7 deletions(-)
> >
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 1/4] latencystats: use alloca instead of vla trivial
2024-04-07 9:36 ` Mattias Rönnblom
@ 2024-04-07 17:00 ` Stephen Hemminger
0 siblings, 0 replies; 34+ messages in thread
From: Stephen Hemminger @ 2024-04-07 17:00 UTC (permalink / raw)
To: Mattias Rönnblom
Cc: Morten Brørup, Tyler Retzlaff, Thomas Monjalon,
Bruce Richardson, dev
On Sun, 7 Apr 2024 11:36:59 +0200
Mattias Rönnblom <hofors@lysator.liu.se> wrote:
> On 2024-04-06 17:28, Morten Brørup wrote:
> >> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> >> Sent: Thursday, 4 April 2024 19.15
> >>
> >> RFC sample illustrating simple conversion of VLA to alloca().
> >>
> >> Signed-off-by: Tyler Retzlaff <roretzla@linux.microsoft.com>
> >> ---
> >
> > [...]
> >
> >> --- a/lib/latencystats/rte_latencystats.c
> >> +++ b/lib/latencystats/rte_latencystats.c
> >> @@ -159,7 +159,7 @@ struct latency_stats_nameoff {
> >> {
> >> unsigned int i, cnt = 0;
> >> uint64_t now;
> >> - float latency[nb_pkts];
> >> + float *latency = alloca(sizeof(float) * nb_pkts);
> >
> > In cases where we are processing packet bursts, I would prefer introducing a global #define RTE_MAX_PKT_BURST_SIZE, indicating the max packet burst size supported by libraries and drivers.
>
> First question: what is meant by a "packet" here? An mbuf? A
> network-layer PDU? Something that in some way relates to zero or more
> packets, like an rte_event? Or just any object that are sent or receive
> of some DPDK API in batches or bursts?
>
> Second question: is RTE_MAX_PKT_BURST_SIZE meant as an upper bound, so
> no API can consumer or produce a burst larger than this, it does all
> APIs literally have to support that burst size.
>
> Third question: why not just keep VLAs?
>
> > For reference, rte_config.h already has #define RTE_GRAPH_BURST_SIZE 256.
> >
> > Such a common define should also be used by functions such as rte_pktmbuf_free_bulk(); although it also supports segmented packets, so it must still be able to handle more mbufs.
> > https://elixir.bootlin.com/dpdk/v24.03/source/lib/mbuf/rte_mbuf.c#L486
> >
Looking at the maths here, calc_latency can be seriously improved:
- calc_latency is in the fast path for transmit.
- it is doing floating point math; floating point is much slower than
fixed point.
- the latency[] array is a temporary; it should be possible to compute
total latency without it.
- it acquires a lock; to achieve DPDK-level performance of 40 Mpps, it is
necessary to do the absolute minimum of locking.
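A minimal sketch of two of the points above: updating the statistics sample by sample, so no temporary array is needed, and using integer arithmetic for the moving average. The struct, the function names, and the weight of 1/4 are assumptions made for illustration and are not taken from rte_latencystats.c; locking is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed weight: alpha = 1 / 2^2 = 0.25, applied with a shift. */
#define SKETCH_EWMA_SHIFT 2

struct sketch_lat_stats {
	uint64_t min; /* 0 means "no sample seen yet" in this sketch */
	uint64_t max;
	uint64_t avg; /* moving average, same unit as the samples */
};

/* Fold one latency sample into the running statistics. */
static void
lat_update(struct sketch_lat_stats *s, uint64_t sample)
{
	if (s->min == 0 || sample < s->min)
		s->min = sample;
	if (sample > s->max)
		s->max = sample;

	/* avg += alpha * (sample - avg), without floating point. */
	if (sample >= s->avg)
		s->avg += (sample - s->avg) >> SKETCH_EWMA_SHIFT;
	else
		s->avg -= (s->avg - sample) >> SKETCH_EWMA_SHIFT;
}

/* Process a burst directly; no per-burst latency[] array is allocated. */
static void
lat_update_burst(struct sketch_lat_stats *s, const uint64_t *timestamps,
		 size_t nb_pkts, uint64_t now)
{
	size_t i;

	assert(s != NULL);
	for (i = 0; i < nb_pkts; i++)
		lat_update(s, now - timestamps[i]);
}
```

The shift truncates, so the integer average can lag the floating point result slightly; whether that is acceptable depends on how the statistic is consumed.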
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-07 11:07 ` Morten Brørup
@ 2024-04-07 17:03 ` Stephen Hemminger
2024-04-08 15:27 ` Tyler Retzlaff
0 siblings, 1 reply; 34+ messages in thread
From: Stephen Hemminger @ 2024-04-07 17:03 UTC (permalink / raw)
To: Morten Brørup
Cc: Mattias Rönnblom, Tyler Retzlaff, dev, Bruce Richardson,
Thomas Monjalon
On Sun, 7 Apr 2024 13:07:06 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:
> > From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> > Sent: Sunday, 7 April 2024 11.32
> >
> > On 2024-04-04 19:15, Tyler Retzlaff wrote:
> > > This series is not intended for merge. It insteat provides examples
> > of
> > > converting use of VLAs to alloca() would look like.
> > >
> > > what's the advantages of VLA over alloca()?
> > >
> > > * sizeof(array) works as expected.
> > >
> > > * multi-dimensional arrays are still arrays instead of pointers to
> > > dynamically allocated space. this means multiple subscript syntax
> > > works (unlike on a pointer) and calculation of addresses into
> > allocated
> > > space in ascending order is performed by the compiler instead of
> > manually.
> > >
> >
> > alloca() is a pretty obscure mechanism, and also not a part of the C
> > standard. VLAs are C99, and well-known and understood, and very
> > efficient.
>
> The RFC fails to mention why we need to replace VLAs with something else:
>
> VLAs are C99, but not C++; VLAs were made optional in C11.
>
> MSVC doesn't support VLAs, and is not going to:
> https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-arriving-in-msvc/#variable-length-arrays
>
>
> I dislike alloca() too, and the notes section in the alloca(3) man page even discourages the use of alloca():
> https://man7.org/linux/man-pages/man3/alloca.3.html
>
> But I guess alloca() is the simplest replacement for VLAs.
> This RFC patch series opens the discussion for alternatives in different use cases.
>
The other issue with VLA's is that if the number is something that can be externally
input, then it can be a source of stack overflow bugs. That is why the Linux kernel
has stopped using them; for security reasons. DPDK has much less of a security
trust domain. Mostly need to make sure that no data from network is being
used to compute VLA size.
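One way to keep an externally influenced count away from the stack can be sketched as follows: a small constant-size buffer covers the common case and the heap is used beyond it. SKETCH_STACK_ELEMS and sum_untrusted() are hypothetical, and a heap fallback like this would normally not be acceptable in a DPDK fast path; the point is only that the stack footprint no longer depends on the input.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical small-case capacity, chosen for illustration. */
#define SKETCH_STACK_ELEMS 32

/*
 * Copy n values into scratch space and sum them. Returns 0 on success
 * and -1 if the size would overflow or the allocation fails.
 */
static int
sum_untrusted(const uint32_t *vals, size_t n, uint64_t *sum)
{
	uint32_t stack_buf[SKETCH_STACK_ELEMS];
	uint32_t *buf = stack_buf;
	size_t i;

	assert(vals != NULL && sum != NULL);

	if (n > SKETCH_STACK_ELEMS) {
		/* Reject counts whose byte size would wrap around. */
		if (n > SIZE_MAX / sizeof(*buf))
			return -1;
		buf = malloc(n * sizeof(*buf));
		if (buf == NULL)
			return -1;
	}

	memcpy(buf, vals, n * sizeof(*buf));

	*sum = 0;
	for (i = 0; i < n; i++)
		*sum += buf[i];

	if (buf != stack_buf)
		free(buf);
	return 0;
}
```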
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-07 17:03 ` Stephen Hemminger
@ 2024-04-08 15:27 ` Tyler Retzlaff
2024-04-08 15:53 ` Morten Brørup
2024-04-10 7:27 ` Mattias Rönnblom
0 siblings, 2 replies; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-08 15:27 UTC (permalink / raw)
To: Stephen Hemminger, techboard
Cc: Morten Brørup, Mattias Rönnblom, dev, Bruce Richardson,
Thomas Monjalon
For the next techboard meeting.
On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
> On Sun, 7 Apr 2024 13:07:06 +0200
> Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > > From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> > > Sent: Sunday, 7 April 2024 11.32
> > >
> > > On 2024-04-04 19:15, Tyler Retzlaff wrote:
> > > > This series is not intended for merge. It insteat provides examples
> > > of
> > > > converting use of VLAs to alloca() would look like.
> > > >
> > > > what's the advantages of VLA over alloca()?
> > > >
> > > > * sizeof(array) works as expected.
> > > >
> > > > * multi-dimensional arrays are still arrays instead of pointers to
> > > > dynamically allocated space. this means multiple subscript syntax
> > > > works (unlike on a pointer) and calculation of addresses into
> > > allocated
> > > > space in ascending order is performed by the compiler instead of
> > > manually.
> > > >
> > >
> > > alloca() is a pretty obscure mechanism, and also not a part of the C
> > > standard. VLAs are C99, and well-known and understood, and very
> > > efficient.
> >
> > The RFC fails to mention why we need to replace VLAs with something else:
> >
> > VLAs are C99, but not C++; VLAs were made optional in C11.
> >
> > MSVC doesn't support VLAs, and is not going to:
> > https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-arriving-in-msvc/#variable-length-arrays
> >
> >
> > I dislike alloca() too, and the notes section in the alloca(3) man page even discourages the use of alloca():
> > https://man7.org/linux/man-pages/man3/alloca.3.html
> >
> > But I guess alloca() is the simplest replacement for VLAs.
> > This RFC patch series opens the discussion for alternatives in different use cases.
> >
>
> The other issue with VLA's is that if the number is something that can be externally
> input, then it can be a source of stack overflow bugs. That is why the Linux kernel
> has stopped using them; for security reasons. DPDK has much less of a security
> trust domain. Mostly need to make sure that no data from network is being
> used to compute VLA size.
>
Looks like we need to discuss this at the next techboard meeting.
* MSVC doesn't support C11 optional VLAs (and never will).
* alloca() is an alternative that is available on all platforms/toolchain
combinations.
* it's reasonable for some VLAs to be turned into regular arrays but it
would be unsatisfactory to be stuck waiting on discussions of defining new
constant expression macros on a per-use basis.
* there is resistance to using alloca() vs VLA so my proposal is to
change only the code that is built to target windows.
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-08 15:27 ` Tyler Retzlaff
@ 2024-04-08 15:53 ` Morten Brørup
2024-04-09 8:28 ` Konstantin Ananyev
2024-04-10 7:32 ` Mattias Rönnblom
2024-04-10 7:27 ` Mattias Rönnblom
1 sibling, 2 replies; 34+ messages in thread
From: Morten Brørup @ 2024-04-08 15:53 UTC (permalink / raw)
To: Tyler Retzlaff, Stephen Hemminger, techboard
Cc: Mattias Rönnblom, dev, Bruce Richardson, Thomas Monjalon
> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> Sent: Monday, 8 April 2024 17.27
>
> For next technboard meeting.
>
> On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
> > On Sun, 7 Apr 2024 13:07:06 +0200
> > Morten Brørup <mb@smartsharesystems.com> wrote:
> >
> > > > From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> > > > Sent: Sunday, 7 April 2024 11.32
> > > >
> > > > On 2024-04-04 19:15, Tyler Retzlaff wrote:
> > > > > This series is not intended for merge. It insteat provides examples
> > > > of
> > > > > converting use of VLAs to alloca() would look like.
> > > > >
> > > > > what's the advantages of VLA over alloca()?
> > > > >
> > > > > * sizeof(array) works as expected.
> > > > >
> > > > > * multi-dimensional arrays are still arrays instead of pointers to
> > > > > dynamically allocated space. this means multiple subscript syntax
> > > > > works (unlike on a pointer) and calculation of addresses into
> > > > allocated
> > > > > space in ascending order is performed by the compiler instead of
> > > > manually.
> > > > >
> > > >
> > > > alloca() is a pretty obscure mechanism, and also not a part of the C
> > > > standard. VLAs are C99, and well-known and understood, and very
> > > > efficient.
> > >
> > > The RFC fails to mention why we need to replace VLAs with something else:
> > >
> > > VLAs are C99, but not C++; VLAs were made optional in C11.
> > >
> > > MSVC doesn't support VLAs, and is not going to:
> > > https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-
> arriving-in-msvc/#variable-length-arrays
> > >
> > >
> > > I dislike alloca() too, and the notes section in the alloca(3) man page
> even discourages the use of alloca():
> > > https://man7.org/linux/man-pages/man3/alloca.3.html
> > >
> > > But I guess alloca() is the simplest replacement for VLAs.
> > > This RFC patch series opens the discussion for alternatives in different
> use cases.
> > >
> >
> > The other issue with VLA's is that if the number is something that can be
> externally
> > input, then it can be a source of stack overflow bugs. That is why the Linux
> kernel
> > has stopped using them; for security reasons. DPDK has much less of a
> security
> > trust domain. Mostly need to make sure that no data from network is being
> > used to compute VLA size.
> >
>
> Looks like we need to discuss this at the next techboard meeting.
>
> * MSVC doesn't support C11 optional VLAs (and never will).
> * alloca() is an alternative that is available on all platforms/toolchain
> combinations.
> * it's reasonable for some VLAs to be turned into regular arrays but it
> would be unsatisfactory to be stuck waiting discussions of defining new
> constant expression macros on a per-use basis.
We must generally stop using VLAs, for many reasons.
The only available 1:1 replacement is alloca(), so we have to accept that.
If anyone still cares about improvements, we can turn alloca()'d arrays into regular arrays after this patch series.
Alternatives to VLAs are very interesting discussions, but let's not stall MSVC progress because of it!
> * there is resistance to using alloca() vs VLA so my proposal is to
> change only the code that is built to target windows.
I would prefer to get rid of them all, so the CI can build with -Wvla to prevent them from being introduced again.
Not a strong preference.
On the other hand, the CI's MSVC builds will catch them if used for a Windows target.
And limiting to Windows code reduces the amount of work, so that's probably the most realistic solution.
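As an illustration of what a -Wvla build is meant to catch, the sketch below shows a declaration that looks constant but is not, together with a form that is. SKETCH_BURST and fixed_array_bytes() are hypothetical; the statement about the warning reflects GCC's documented -Wvla behaviour as I understand it and is not verified against the DPDK CI.

```c
#include <assert.h>
#include <stddef.h>

/* An enumerator is an integer constant expression in C. */
enum { SKETCH_BURST = 32 };

static size_t
fixed_array_bytes(void)
{
	/*
	 * Writing
	 *     const size_t burst = 32;
	 *     int a[burst];
	 * here would declare a VLA in C, because a const-qualified object
	 * is not a constant expression. GCC with -Wvla is expected to
	 * report that form; the enumerator below avoids it.
	 */
	int a[SKETCH_BURST];

	return sizeof(a);
}
```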
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-08 15:53 ` Morten Brørup
@ 2024-04-09 8:28 ` Konstantin Ananyev
2024-04-09 15:08 ` Tyler Retzlaff
2024-04-10 7:32 ` Mattias Rönnblom
1 sibling, 1 reply; 34+ messages in thread
From: Konstantin Ananyev @ 2024-04-09 8:28 UTC (permalink / raw)
To: Morten Brørup, Tyler Retzlaff, Stephen Hemminger, techboard
Cc: Mattias Rönnblom, dev, Bruce Richardson, Thomas Monjalon
> > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > Sent: Monday, 8 April 2024 17.27
> >
> > For next technboard meeting.
> >
> > On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
> > > On Sun, 7 Apr 2024 13:07:06 +0200
> > > Morten Brørup <mb@smartsharesystems.com> wrote:
> > >
> > > > > From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> > > > > Sent: Sunday, 7 April 2024 11.32
> > > > >
> > > > > On 2024-04-04 19:15, Tyler Retzlaff wrote:
> > > > > > This series is not intended for merge. It insteat provides examples
> > > > > of
> > > > > > converting use of VLAs to alloca() would look like.
> > > > > >
> > > > > > what's the advantages of VLA over alloca()?
> > > > > >
> > > > > > * sizeof(array) works as expected.
> > > > > >
> > > > > > * multi-dimensional arrays are still arrays instead of pointers to
> > > > > > dynamically allocated space. this means multiple subscript syntax
> > > > > > works (unlike on a pointer) and calculation of addresses into
> > > > > allocated
> > > > > > space in ascending order is performed by the compiler instead of
> > > > > manually.
> > > > > >
> > > > >
> > > > > alloca() is a pretty obscure mechanism, and also not a part of the C
> > > > > standard. VLAs are C99, and well-known and understood, and very
> > > > > efficient.
> > > >
> > > > The RFC fails to mention why we need to replace VLAs with something else:
> > > >
> > > > VLAs are C99, but not C++; VLAs were made optional in C11.
> > > >
> > > > MSVC doesn't support VLAs, and is not going to:
> > > > https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-
> > arriving-in-msvc/#variable-length-arrays
> > > >
> > > >
> > > > I dislike alloca() too, and the notes section in the alloca(3) man page
> > even discourages the use of alloca():
> > > > https://man7.org/linux/man-pages/man3/alloca.3.html
> > > >
> > > > But I guess alloca() is the simplest replacement for VLAs.
> > > > This RFC patch series opens the discussion for alternatives in different
> > use cases.
> > > >
> > >
> > > The other issue with VLA's is that if the number is something that can be
> > externally
> > > input, then it can be a source of stack overflow bugs. That is why the Linux
> > kernel
> > > has stopped using them; for security reasons. DPDK has much less of a
> > security
> > > trust domain. Mostly need to make sure that no data from network is being
> > > used to compute VLA size.
> > >
> >
> > Looks like we need to discuss this at the next techboard meeting.
> >
> > * MSVC doesn't support C11 optional VLAs (and never will).
> > * alloca() is an alternative that is available on all platforms/toolchain
> > combinations.
> > * it's reasonable for some VLAs to be turned into regular arrays but it
> > would be unsatisfactory to be stuck waiting discussions of defining new
> > constant expression macros on a per-use basis.
>
> We must generally stop using VLAs, for many reasons.
> The only available 1:1 replacement is alloca(), so we have to accept that.
>
> If anyone still cares about improvements, we can turn alloca()'d arrays into regular arrays after this patch series.
>
> Alternatives to VLAs are very interesting discussions, but let's not stall MSVC progress because of it!
Ok, but why do we have to rush into the 'alloca()' solution if none of us is really fond of it?
As you already noted, the majority of these cases can be replaced with statically sized arrays.
Let's try to compile a list of what needs to be changed, split it by priorities and work
progressively through it.
Konstantin
>
> > * there is resistance to using alloca() vs VLA so my proposal is to
> > change only the code that is built to target windows.
>
> I would prefer to get rid of them all, so the CI can build with -Wvla to prevent them from being introduced again.
> Not a strong preference.
> On the other hand, the CI's MSVC builds will catch them if used for a Windows target.
> And limiting to Windows code reduces the amount of work, so that's probably the most realistic solution.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-09 8:28 ` Konstantin Ananyev
@ 2024-04-09 15:08 ` Tyler Retzlaff
2024-04-10 9:58 ` Konstantin Ananyev
0 siblings, 1 reply; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-09 15:08 UTC (permalink / raw)
To: Konstantin Ananyev
Cc: Morten Brørup, Stephen Hemminger, techboard,
Mattias Rönnblom, dev, Bruce Richardson, Thomas Monjalon
On Tue, Apr 09, 2024 at 08:28:48AM +0000, Konstantin Ananyev wrote:
>
>
> > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > Sent: Monday, 8 April 2024 17.27
> > >
> > > For next techboard meeting.
> > >
> > > On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
> > > > On Sun, 7 Apr 2024 13:07:06 +0200
> > > > Morten Brørup <mb@smartsharesystems.com> wrote:
> > > >
> > > > > > From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> > > > > > Sent: Sunday, 7 April 2024 11.32
> > > > > >
> > > > > > On 2024-04-04 19:15, Tyler Retzlaff wrote:
> > > > > > > > This series is not intended for merge. It instead provides examples
> > > > > > of
> > > > > > > converting use of VLAs to alloca() would look like.
> > > > > > >
> > > > > > > what's the advantages of VLA over alloca()?
> > > > > > >
> > > > > > > * sizeof(array) works as expected.
> > > > > > >
> > > > > > > * multi-dimensional arrays are still arrays instead of pointers to
> > > > > > > dynamically allocated space. this means multiple subscript syntax
> > > > > > > works (unlike on a pointer) and calculation of addresses into
> > > > > > allocated
> > > > > > > space in ascending order is performed by the compiler instead of
> > > > > > manually.
> > > > > > >
> > > > > >
> > > > > > alloca() is a pretty obscure mechanism, and also not a part of the C
> > > > > > standard. VLAs are C99, and well-known and understood, and very
> > > > > > efficient.
> > > > >
> > > > > The RFC fails to mention why we need to replace VLAs with something else:
> > > > >
> > > > > VLAs are C99, but not C++; VLAs were made optional in C11.
> > > > >
> > > > > MSVC doesn't support VLAs, and is not going to:
> > > > > https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-
> > > arriving-in-msvc/#variable-length-arrays
> > > > >
> > > > >
> > > > > I dislike alloca() too, and the notes section in the alloca(3) man page
> > > even discourages the use of alloca():
> > > > > https://man7.org/linux/man-pages/man3/alloca.3.html
> > > > >
> > > > > But I guess alloca() is the simplest replacement for VLAs.
> > > > > This RFC patch series opens the discussion for alternatives in different
> > > use cases.
> > > > >
> > > >
> > > > The other issue with VLA's is that if the number is something that can be
> > > externally
> > > > input, then it can be a source of stack overflow bugs. That is why the Linux
> > > kernel
> > > > has stopped using them; for security reasons. DPDK has much less of a
> > > security
> > > > trust domain. Mostly need to make sure that no data from network is being
> > > > used to compute VLA size.
> > > >
> > >
> > > Looks like we need to discuss this at the next techboard meeting.
> > >
> > > * MSVC doesn't support C11 optional VLAs (and never will).
> > > * alloca() is an alternative that is available on all platforms/toolchain
> > > combinations.
> > > * it's reasonable for some VLAs to be turned into regular arrays but it
> > > would be unsatisfactory to be stuck waiting discussions of defining new
> > > constant expression macros on a per-use basis.
> >
> > We must generally stop using VLAs, for many reasons.
> > The only available 1:1 replacement is alloca(), so we have to accept that.
> >
> > If anyone still cares about improvements, we can turn alloca()'d arrays into regular arrays after this patch series.
> >
> > Alternatives to VLAs are very interesting discussions, but let's not stall MSVC progress because of it!
>
> Ok, but why we have to rush into 'alloca()' solution if none of us really fond of it?
for the trivial case it is no worse than a VLA. while it isn't
standardized it is available for all platform/toolchains unlike VLA.
most of the code needed to be changed for windows falls into the trivial
case when converted.
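
to make "the trivial case" concrete, here is a minimal sketch of such a
conversion. the helper names (reverse_vla, reverse_alloca) are made up
for illustration and do not come from the patch series; the _alloca
mapping for msvc is an assumption about how a portable wrapper might look.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#ifdef _MSC_VER
#include <malloc.h>          /* msvc provides _alloca() here */
#define alloca _alloca
#else
#include <alloca.h>
#endif

#ifndef _MSC_VER
/* before: scratch buffer declared as a VLA (not accepted by msvc) */
void reverse_vla(uint32_t *vals, size_t n)
{
	assert(n > 0);               /* a zero-length VLA is undefined */
	uint32_t tmp[n];
	size_t i;

	memcpy(tmp, vals, n * sizeof(tmp[0]));
	for (i = 0; i < n; i++)
		vals[i] = tmp[n - 1 - i];
}
#endif

/* after: same stack storage, obtained through alloca() */
void reverse_alloca(uint32_t *vals, size_t n)
{
	assert(n > 0);
	uint32_t *tmp = alloca(n * sizeof(*tmp));
	size_t i;

	memcpy(tmp, vals, n * sizeof(*tmp));
	for (i = 0; i < n; i++)
		vals[i] = tmp[n - 1 - i];
}
```

in both versions the caller is still responsible for keeping n small
enough for the stack; the conversion does not change that.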
there do appear to be cases where VLAs have just been unintentional.
i previously linked a patch where i fixed a case where they were
instantiated inside a cast and there are other cases i'm aware of in the
mlx5 driver where i believe they are unintended. at least with alloca
it is obvious but with a VLA if the expression used to determine the
size is wrapped up in something non-trivial and the author doesn't check
that it is truly a constant expression you get one by surprise.
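
a minimal sketch of how a VLA can appear by surprise. this is not the
mlx5 code or the patch i linked; MAX_BURST and max_burst are hypothetical
names. in C a const-qualified object is not an integer constant
expression, so the first array below is a VLA even though its size never
changes.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_BURST 32                    /* integer constant expression */
static const size_t max_burst = 32;     /* NOT a constant expression in C */

/* compiles because MAX_BURST is a constant expression; the same check
 * written against max_burst would be rejected by the compiler. */
static_assert(MAX_BURST > 0, "MAX_BURST must be a constant expression");

size_t accidental_vla_bytes(void)
{
	int burst[max_burst];   /* VLA: -Wvla reports it, msvc rejects it */

	return sizeof(burst);   /* evaluated at run time */
}

size_t fixed_array_bytes(void)
{
	int burst[MAX_BURST];   /* ordinary array on every toolchain */

	return sizeof(burst);   /* evaluated at compile time */
}
```

both functions return the same value, which is why this kind of VLA tends
to go unnoticed until a compiler without VLA support is used.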
> As you already noted majority of these cases can be replaced with static sized arrays.
unfortunately i don't think this is the case if we are talking about the
entire source tree.
> Let's try to compile a list of what needs to be changed, split it by priorities and work
> progressively through it.
i agree that working progressively is the way forward. my suggested
partitioning has been to submit a smaller series that unblocks windows
using alloca as a starting point. this represents only a fraction of the
uses but can also serve for evaluation purposes.
if maintainers can identify a reasonable conversion to static array for
any of the converted instances i can incorporate the prescribed changes.
i would also suggest that in parallel we might introduce a series that
enables -Wvla but suppresses the warning at the existing sites of use.
the purpose of this suggestion is to stop new introductions but also
annotate the uses we would like maintainers to evaluate. perhaps some
could also be trivially eliminated with the series.
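
a rough sketch of what per-site suppression could look like with gcc and
clang diagnostic pragmas. the macro names KNOWN_VLA_BEGIN/KNOWN_VLA_END
and the function count_even are placeholders for illustration, not
existing DPDK macros.

```c
#include <assert.h>
#include <stddef.h>

#if defined(__GNUC__) || defined(__clang__)
/* silence -Wvla only between BEGIN and END, then restore the state */
#define KNOWN_VLA_BEGIN \
	_Pragma("GCC diagnostic push") \
	_Pragma("GCC diagnostic ignored \"-Wvla\"")
#define KNOWN_VLA_END _Pragma("GCC diagnostic pop")
#else
#define KNOWN_VLA_BEGIN
#define KNOWN_VLA_END
#endif

size_t count_even(const unsigned int *vals, size_t n)
{
	size_t i, even = 0;

	assert(n > 0);
	KNOWN_VLA_BEGIN
	unsigned char flags[n];     /* annotated, still to be evaluated */
	KNOWN_VLA_END

	for (i = 0; i < n; i++)
		flags[i] = (vals[i] % 2 == 0);
	for (i = 0; i < n; i++)
		even += flags[i];
	return even;
}
```

with something like this in place a new, unannotated VLA would trigger
the warning while the annotated ones remain easy to grep for.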
> Konstantin
>
> >
> > > * there is resistance to using alloca() vs VLA so my proposal is to
> > > change only the code that is built to target windows.
> >
> > I would prefer to get rid of them all, so the CI can build with -Wvla to prevent them from being introduced again.
> > Not a strong preference.
> > On the other hand, the CI's MSVC builds will catch them if used for a Windows target.
> > And limiting to Windows code reduces the amount of work, so that's probably the most realistic solution.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-08 15:27 ` Tyler Retzlaff
2024-04-08 15:53 ` Morten Brørup
@ 2024-04-10 7:27 ` Mattias Rönnblom
2024-04-10 17:10 ` Tyler Retzlaff
1 sibling, 1 reply; 34+ messages in thread
From: Mattias Rönnblom @ 2024-04-10 7:27 UTC (permalink / raw)
To: Tyler Retzlaff, Stephen Hemminger, techboard
Cc: Morten Brørup, dev, Bruce Richardson, Thomas Monjalon
On 2024-04-08 17:27, Tyler Retzlaff wrote:
> For next techboard meeting.
>
> On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
>> On Sun, 7 Apr 2024 13:07:06 +0200
>> Morten Brørup <mb@smartsharesystems.com> wrote:
>>
>>>> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
>>>> Sent: Sunday, 7 April 2024 11.32
>>>>
>>>> On 2024-04-04 19:15, Tyler Retzlaff wrote:
>>>>> This series is not intended for merge. It instead provides examples
>>>> of
>>>>> converting use of VLAs to alloca() would look like.
>>>>>
>>>>> what's the advantages of VLA over alloca()?
>>>>>
>>>>> * sizeof(array) works as expected.
>>>>>
>>>>> * multi-dimensional arrays are still arrays instead of pointers to
>>>>> dynamically allocated space. this means multiple subscript syntax
>>>>> works (unlike on a pointer) and calculation of addresses into
>>>> allocated
>>>>> space in ascending order is performed by the compiler instead of
>>>> manually.
>>>>>
>>>>
>>>> alloca() is a pretty obscure mechanism, and also not a part of the C
>>>> standard. VLAs are C99, and well-known and understood, and very
>>>> efficient.
>>>
>>> The RFC fails to mention why we need to replace VLAs with something else:
>>>
>>> VLAs are C99, but not C++; VLAs were made optional in C11.
>>>
>>> MSVC doesn't support VLAs, and is not going to:
>>> https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-arriving-in-msvc/#variable-length-arrays
>>>
>>>
>>> I dislike alloca() too, and the notes section in the alloca(3) man page even discourages the use of alloca():
>>> https://man7.org/linux/man-pages/man3/alloca.3.html
>>>
>>> But I guess alloca() is the simplest replacement for VLAs.
>>> This RFC patch series opens the discussion for alternatives in different use cases.
>>>
>>
>> The other issue with VLA's is that if the number is something that can be externally
>> input, then it can be a source of stack overflow bugs. That is why the Linux kernel
>> has stopped using them; for security reasons. DPDK has much less of a security
>> trust domain. Mostly need to make sure that no data from network is being
>> used to compute VLA size.
>>
>
> Looks like we need to discuss this at the next techboard meeting.
>
> * MSVC doesn't support C11 optional VLAs (and never will).
This is due to dogmatism, or what? Surely, a lot of Open Source projects
written for C99 will use VLAs.
> * alloca() is an alternative that is available on all platforms/toolchain
> combinations.
alloca() is a poor alternative. The use of alloca() should be restricted
to situations where statically sized arrays can't do the job.
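
As an illustration of the statically sized alternative, here is a minimal
sketch that assumes a compile-time upper bound exists. MAX_BURST and
sum_doubled are hypothetical names, not DPDK identifiers; input larger
than the bound is processed in chunks.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_BURST 32    /* hypothetical compile-time upper bound */

uint64_t sum_doubled(const uint32_t *in, size_t n)
{
	uint32_t scratch[MAX_BURST];    /* fixed size, no VLA, no alloca() */
	uint64_t total = 0;
	size_t off, i, chunk;

	assert(in != NULL || n == 0);
	for (off = 0; off < n; off += chunk) {
		/* never touch more than MAX_BURST scratch entries at once */
		chunk = n - off;
		if (chunk > MAX_BURST)
			chunk = MAX_BURST;
		for (i = 0; i < chunk; i++)
			scratch[i] = in[off + i] * 2;
		for (i = 0; i < chunk; i++)
			total += scratch[i];
	}
	return total;
}
```

Stack usage is bounded regardless of n, at the price of choosing a
suitable bound for each use.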
> * it's reasonable for some VLAs to be turned into regular arrays but it
> would be unsatisfactory to be stuck waiting discussions of defining new
> constant expression macros on a per-use basis.
> * there is resistance to using alloca() vs VLA so my proposal is to
> change only the code that is built to target windows.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-08 15:53 ` Morten Brørup
2024-04-09 8:28 ` Konstantin Ananyev
@ 2024-04-10 7:32 ` Mattias Rönnblom
2024-04-10 7:52 ` Morten Brørup
2024-04-10 17:04 ` Tyler Retzlaff
1 sibling, 2 replies; 34+ messages in thread
From: Mattias Rönnblom @ 2024-04-10 7:32 UTC (permalink / raw)
To: Morten Brørup, Tyler Retzlaff, Stephen Hemminger, techboard
Cc: dev, Bruce Richardson, Thomas Monjalon
On 2024-04-08 17:53, Morten Brørup wrote:
>> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
>> Sent: Monday, 8 April 2024 17.27
>>
>> For next techboard meeting.
>>
>> On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
>>> On Sun, 7 Apr 2024 13:07:06 +0200
>>> Morten Brørup <mb@smartsharesystems.com> wrote:
>>>
>>>>> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
>>>>> Sent: Sunday, 7 April 2024 11.32
>>>>>
>>>>> On 2024-04-04 19:15, Tyler Retzlaff wrote:
>>>>>> This series is not intended for merge. It instead provides examples
>>>>> of
>>>>>> converting use of VLAs to alloca() would look like.
>>>>>>
>>>>>> what's the advantages of VLA over alloca()?
>>>>>>
>>>>>> * sizeof(array) works as expected.
>>>>>>
>>>>>> * multi-dimensional arrays are still arrays instead of pointers to
>>>>>> dynamically allocated space. this means multiple subscript syntax
>>>>>> works (unlike on a pointer) and calculation of addresses into
>>>>> allocated
>>>>>> space in ascending order is performed by the compiler instead of
>>>>> manually.
>>>>>>
>>>>>
>>>>> alloca() is a pretty obscure mechanism, and also not a part of the C
>>>>> standard. VLAs are C99, and well-known and understood, and very
>>>>> efficient.
>>>>
>>>> The RFC fails to mention why we need to replace VLAs with something else:
>>>>
>>>> VLAs are C99, but not C++; VLAs were made optional in C11.
>>>>
>>>> MSVC doesn't support VLAs, and is not going to:
>>>> https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-
>> arriving-in-msvc/#variable-length-arrays
>>>>
>>>>
>>>> I dislike alloca() too, and the notes section in the alloca(3) man page
>> even discourages the use of alloca():
>>>> https://man7.org/linux/man-pages/man3/alloca.3.html
>>>>
>>>> But I guess alloca() is the simplest replacement for VLAs.
>>>> This RFC patch series opens the discussion for alternatives in different
>> use cases.
>>>>
>>>
>>> The other issue with VLA's is that if the number is something that can be
>> externally
>>> input, then it can be a source of stack overflow bugs. That is why the Linux
>> kernel
>>> has stopped using them; for security reasons. DPDK has much less of a
>> security
>>> trust domain. Mostly need to make sure that no data from network is being
>>> used to compute VLA size.
>>>
>>
>> Looks like we need to discuss this at the next techboard meeting.
>>
>> * MSVC doesn't support C11 optional VLAs (and never will).
>> * alloca() is an alternative that is available on all platforms/toolchain
>> combinations.
>> * it's reasonable for some VLAs to be turned into regular arrays but it
>> would be unsatisfactory to be stuck waiting discussions of defining new
>> constant expression macros on a per-use basis.
>
> We must generally stop using VLAs, for many reasons.
What reasons would those be? And which of those reasons are not also
reasons to stop using alloca()?
> The only available 1:1 replacement is alloca(), so we have to accept that.
>
> If anyone still cares about improvements, we can turn alloca()'d arrays into regular arrays after this patch series.
>
> Alternatives to VLAs are very interesting discussions, but let's not stall MSVC progress because of it!
>
What is this supposed to mean? Finding alternatives to VLAs is required
to make progress on MSVC support in DPDK.
>> * there is resistance to using alloca() vs VLA so my proposal is to
>> change only the code that is built to target windows.
>
> I would prefer to get rid of them all, so the CI can build with -Wvla to prevent them from being introduced again.
> Not a strong preference.
> On the other hand, the CI's MSVC builds will catch them if used for a Windows target.
> And limiting to Windows code reduces the amount of work, so that's probably the most realistic solution.
>
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-10 7:32 ` Mattias Rönnblom
@ 2024-04-10 7:52 ` Morten Brørup
2024-04-10 17:04 ` Tyler Retzlaff
1 sibling, 0 replies; 34+ messages in thread
From: Morten Brørup @ 2024-04-10 7:52 UTC (permalink / raw)
To: Mattias Rönnblom, Tyler Retzlaff, Stephen Hemminger, techboard
Cc: dev, Bruce Richardson, Thomas Monjalon
> From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> Sent: Wednesday, 10 April 2024 09.32
>
> On 2024-04-08 17:53, Morten Brørup wrote:
> >> From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> >> Sent: Monday, 8 April 2024 17.27
> >>
[...]
> >> Looks like we need to discuss this at the next techboard meeting.
> >>
> >> * MSVC doesn't support C11 optional VLAs (and never will).
> >> * alloca() is an alternative that is available on all platforms/toolchain
> >> combinations.
> >> * it's reasonable for some VLAs to be turned into regular arrays but it
> >> would be unsatisfactory to be stuck waiting discussions of defining new
> >> constant expression macros on a per-use basis.
> >
> > We must generally stop using VLAs, for many reasons.
>
> What reasons would that be? And which of those reasons are not also
> reasons to stop using alloca().
The reasons against VLAs are the same as why MSVC doesn’t support them; primarily that they are insecure.
The reasons against VLAs and alloca() are the same, except MSVC supports alloca().
>
> > The only available 1:1 replacement is alloca(), so we have to accept that.
> >
> > If anyone still cares about improvements, we can turn alloca()'d arrays into
> regular arrays after this patch series.
> >
> > Alternatives to VLAs are very interesting discussions, but let's not stall
> MSVC progress because of it!
> >
>
> What is this supposed to mean? Finding alternatives to VLAs are required
> to make progress of MSVC support in DPDK.
It means that not enough people contribute to discussing and implementing alternatives, so we have to use the 1:1 replacement alternative, alloca(), to avoid stalling DPDK support for MSVC.
We can discuss and implement alternatives at any time, if anybody cares.
>
> >> * there is resistance to using alloca() vs VLA so my proposal is to
> >> change only the code that is built to target windows.
> >
> > I would prefer to get rid of them all, so the CI can build with -Wvla to
> prevent them from being introduced again.
> > Not a strong preference.
> > On the other hand, the CI's MSVC builds will catch them if used for a
> Windows target.
> > And limiting to Windows code reduces the amount of work, so that's probably
> the most realistic solution.
> >
^ permalink raw reply [flat|nested] 34+ messages in thread
* RE: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-09 15:08 ` Tyler Retzlaff
@ 2024-04-10 9:58 ` Konstantin Ananyev
2024-04-10 17:03 ` Tyler Retzlaff
0 siblings, 1 reply; 34+ messages in thread
From: Konstantin Ananyev @ 2024-04-10 9:58 UTC (permalink / raw)
To: Tyler Retzlaff
Cc: Morten Brørup, Stephen Hemminger, techboard,
Mattias Rönnblom, dev, Bruce Richardson, Thomas Monjalon
> >
> > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > Sent: Monday, 8 April 2024 17.27
> > > >
> > > > For next techboard meeting.
> > > >
> > > > On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
> > > > > On Sun, 7 Apr 2024 13:07:06 +0200
> > > > > Morten Brørup <mb@smartsharesystems.com> wrote:
> > > > >
> > > > > > > From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> > > > > > > Sent: Sunday, 7 April 2024 11.32
> > > > > > >
> > > > > > > On 2024-04-04 19:15, Tyler Retzlaff wrote:
> > > > > > > > > This series is not intended for merge. It instead provides examples
> > > > > > > of
> > > > > > > > converting use of VLAs to alloca() would look like.
> > > > > > > >
> > > > > > > > what's the advantages of VLA over alloca()?
> > > > > > > >
> > > > > > > > * sizeof(array) works as expected.
> > > > > > > >
> > > > > > > > * multi-dimensional arrays are still arrays instead of pointers to
> > > > > > > > dynamically allocated space. this means multiple subscript syntax
> > > > > > > > works (unlike on a pointer) and calculation of addresses into
> > > > > > > allocated
> > > > > > > > space in ascending order is performed by the compiler instead of
> > > > > > > manually.
> > > > > > > >
> > > > > > >
> > > > > > > alloca() is a pretty obscure mechanism, and also not a part of the C
> > > > > > > standard. VLAs are C99, and well-known and understood, and very
> > > > > > > efficient.
> > > > > >
> > > > > > The RFC fails to mention why we need to replace VLAs with something else:
> > > > > >
> > > > > > VLAs are C99, but not C++; VLAs were made optional in C11.
> > > > > >
> > > > > > MSVC doesn't support VLAs, and is not going to:
> > > > > > https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-
> > > > arriving-in-msvc/#variable-length-arrays
> > > > > >
> > > > > >
> > > > > > I dislike alloca() too, and the notes section in the alloca(3) man page
> > > > even discourages the use of alloca():
> > > > > > https://man7.org/linux/man-pages/man3/alloca.3.html
> > > > > >
> > > > > > But I guess alloca() is the simplest replacement for VLAs.
> > > > > > This RFC patch series opens the discussion for alternatives in different
> > > > use cases.
> > > > > >
> > > > >
> > > > > The other issue with VLA's is that if the number is something that can be
> > > > externally
> > > > > input, then it can be a source of stack overflow bugs. That is why the Linux
> > > > kernel
> > > > > has stopped using them; for security reasons. DPDK has much less of a
> > > > security
> > > > > trust domain. Mostly need to make sure that no data from network is being
> > > > > used to compute VLA size.
> > > > >
> > > >
> > > > Looks like we need to discuss this at the next techboard meeting.
> > > >
> > > > * MSVC doesn't support C11 optional VLAs (and never will).
> > > > * alloca() is an alternative that is available on all platforms/toolchain
> > > > combinations.
> > > > * it's reasonable for some VLAs to be turned into regular arrays but it
> > > > would be unsatisfactory to be stuck waiting discussions of defining new
> > > > constant expression macros on a per-use basis.
> > >
> > > We must generally stop using VLAs, for many reasons.
> > > The only available 1:1 replacement is alloca(), so we have to accept that.
> > >
> > > If anyone still cares about improvements, we can turn alloca()'d arrays into regular arrays after this patch series.
> > >
> > > Alternatives to VLAs are very interesting discussions, but let's not stall MSVC progress because of it!
> >
> > Ok, but why we have to rush into 'alloca()' solution if none of us really fond of it?
>
> for the trivial case it is no worse than a VLA. while it isn't
> standardized it is available for all platform/toolchains unlike VLA.
> most of the code needed to be changed for windows falls into the trivial
> case when converted.
Personally, I think VLA is much more convenient than alloca().
At least you can do sizeof(vla_array) without a problem.
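
A small sketch of the sizeof() difference, with made-up function names
(vla_sizeof, alloca_sizeof). The _alloca mapping for MSVC is an
assumption, and the VLA half would of course not build there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#ifdef _MSC_VER
#include <malloc.h>
#define alloca _alloca
#else
#include <alloca.h>
#endif

size_t vla_sizeof(size_t n)
{
	assert(n > 0);
	uint32_t ring[n];

	return sizeof(ring);    /* n * sizeof(uint32_t), computed at run time */
}

size_t alloca_sizeof(size_t n)
{
	assert(n > 0);
	uint32_t *ring = alloca(n * sizeof(*ring));

	ring[0] = 0;            /* storage is usable ... */
	return sizeof(ring);    /* ... but this is only the size of a pointer */
}
```

So after a conversion to alloca() every sizeof(array) has to be rewritten
as an explicit n * sizeof(element), which is easy to get wrong.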
>
> there do appear to be cases where VLAs have just been unintentional.
> i previously linked a patch where i fixed a case where they were
> instantiated inside a cast and there are other cases i'm aware of in the
> mlx5 driver where i believe they are unintended. at least with alloca
> it is obvious but with a VLA if the expression used to determine the
> size is wrapped up in something non-trivial and the author doesn't check
> that it is truly a constant expression you get one by surprise.
>
> > As you already noted majority of these cases can be replaced with static sized arrays.
>
> unfortunately i don't think this is the case if we are talking about the
> entire source tree.
Ok, probably I misunderstood the intention of this RFC:
My first thought was that it covered all you need to make some minimalistic DPDK build with MSVC.
If that's not the case, then what would be the full list of changes that are necessary?
> > Let's try to compile a list of what needs to be changed, split it by priorities and work
> > progressively through it.
>
> i agree that working progressively is the way forward, my suggestion
> partitioning has been to submit a smaller series that unblocks windows
> using alloca as a starting point. this represents only a fraction of the
> uses but can also serve for evaluation purposes.
My concern here is that we are replacing something that is probably not ideal with
something that is even worse.
I do understand that it is supposed to be a temporary measure, but as you said
alloca() is supported nearly everywhere, so in theory there would be no strong
reason for maintainers to spend their time on further code rearrangements to replace
alloca() with static arrays.
>
> if maintainers can identify a reasonable conversion to static array for
> any of the converted instances i can incorporate the prescribed changes.
Ok, that's why I suggested starting with the list of required changes,
and then deciding on a component-by-component basis.
From my side, I am ok to spend some time on the libs I am responsible for,
to do such code changes.
> i would also suggest that in parallel we might introduce a series that
> enables -Wvla but suppresses warning about -Wvla at the sites of use.
> the purpose of this suggestion is to stop new introductions but also
> annotate the uses we would like maintainers to evaluate. perhaps some
> could also be trivially eliminated with the series.
>
> > Konstantin
> >
> > >
> > > > * there is resistance to using alloca() vs VLA so my proposal is to
> > > > change only the code that is built to target windows.
> > >
> > > I would prefer to get rid of them all, so the CI can build with -Wvla to prevent them from being introduced again.
> > > Not a strong preference.
> > > On the other hand, the CI's MSVC builds will catch them if used for a Windows target.
> > > And limiting to Windows code reduces the amount of work, so that's probably the most realistic solution.
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-10 9:58 ` Konstantin Ananyev
@ 2024-04-10 17:03 ` Tyler Retzlaff
0 siblings, 0 replies; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-10 17:03 UTC (permalink / raw)
To: Konstantin Ananyev
Cc: Morten Brørup, Stephen Hemminger, techboard,
Mattias Rönnblom, dev, Bruce Richardson, Thomas Monjalon
On Wed, Apr 10, 2024 at 09:58:34AM +0000, Konstantin Ananyev wrote:
>
>
> > >
> > > > > From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> > > > > Sent: Monday, 8 April 2024 17.27
> > > > >
> > > > > For next techboard meeting.
> > > > >
> > > > > On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
> > > > > > On Sun, 7 Apr 2024 13:07:06 +0200
> > > > > > Morten Brørup <mb@smartsharesystems.com> wrote:
> > > > > >
> > > > > > > > From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> > > > > > > > Sent: Sunday, 7 April 2024 11.32
> > > > > > > >
> > > > > > > > On 2024-04-04 19:15, Tyler Retzlaff wrote:
> > > > > > > > > > This series is not intended for merge. It instead provides examples
> > > > > > > > of
> > > > > > > > > converting use of VLAs to alloca() would look like.
> > > > > > > > >
> > > > > > > > > what's the advantages of VLA over alloca()?
> > > > > > > > >
> > > > > > > > > * sizeof(array) works as expected.
> > > > > > > > >
> > > > > > > > > * multi-dimensional arrays are still arrays instead of pointers to
> > > > > > > > > dynamically allocated space. this means multiple subscript syntax
> > > > > > > > > works (unlike on a pointer) and calculation of addresses into
> > > > > > > > allocated
> > > > > > > > > space in ascending order is performed by the compiler instead of
> > > > > > > > manually.
> > > > > > > > >
> > > > > > > >
> > > > > > > > alloca() is a pretty obscure mechanism, and also not a part of the C
> > > > > > > > standard. VLAs are C99, and well-known and understood, and very
> > > > > > > > efficient.
> > > > > > >
> > > > > > > The RFC fails to mention why we need to replace VLAs with something else:
> > > > > > >
> > > > > > > VLAs are C99, but not C++; VLAs were made optional in C11.
> > > > > > >
> > > > > > > MSVC doesn't support VLAs, and is not going to:
> > > > > > > https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-
> > > > > arriving-in-msvc/#variable-length-arrays
> > > > > > >
> > > > > > >
> > > > > > > I dislike alloca() too, and the notes section in the alloca(3) man page
> > > > > even discourages the use of alloca():
> > > > > > > https://man7.org/linux/man-pages/man3/alloca.3.html
> > > > > > >
> > > > > > > But I guess alloca() is the simplest replacement for VLAs.
> > > > > > > This RFC patch series opens the discussion for alternatives in different
> > > > > use cases.
> > > > > > >
> > > > > >
> > > > > > The other issue with VLA's is that if the number is something that can be
> > > > > externally
> > > > > > input, then it can be a source of stack overflow bugs. That is why the Linux
> > > > > kernel
> > > > > > has stopped using them; for security reasons. DPDK has much less of a
> > > > > security
> > > > > > trust domain. Mostly need to make sure that no data from network is being
> > > > > > used to compute VLA size.
> > > > > >
> > > > >
> > > > > Looks like we need to discuss this at the next techboard meeting.
> > > > >
> > > > > * MSVC doesn't support C11 optional VLAs (and never will).
> > > > > * alloca() is an alternative that is available on all platforms/toolchain
> > > > > combinations.
> > > > > * it's reasonable for some VLAs to be turned into regular arrays but it
> > > > > would be unsatisfactory to be stuck waiting discussions of defining new
> > > > > constant expression macros on a per-use basis.
> > > >
> > > > We must generally stop using VLAs, for many reasons.
> > > > The only available 1:1 replacement is alloca(), so we have to accept that.
> > > >
> > > > If anyone still cares about improvements, we can turn alloca()'d arrays into regular arrays after this patch series.
> > > >
> > > > Alternatives to VLAs are very interesting discussions, but let's not stall MSVC progress because of it!
> > >
> > > Ok, but why we have to rush into 'alloca()' solution if none of us really fond of it?
> >
> > for the trivial case it is no worse than a VLA. while it isn't
> > standardized it is available for all platform/toolchains unlike VLA.
> > most of the code needed to be changed for windows falls into the trivial
> > case when converted.
>
> Personally, I think VLA is much more convenient then alloca().
> At least you can do sizeof(vla_array) without a problem.
>
> >
> > there do appear to be cases where VLAs have just been unintentional.
> > i previously linked a patch where i fixed a case where they were
> > instantiated inside a cast and there are other cases i'm aware of in the
> > mlx5 driver where i believe they are unintended. at least with alloca
> > it is obvious but with a VLA if the expression used to determine the
> > size is wrapped up in something non-trivial and the author doesn't check
> > that it is truly a constant expression you get one by surprise.
> >
> > > As you already noted majority of these cases can be replaced with static sized arrays.
> >
> > unfortunately i don't think this is the case if we are talking about the
> > entire source tree.
>
> Ok, probably I misunderstood this RFC intention:
> My first thought that it was all you need to make some minimalistic DPDK build with MSVC.
> If that's not the case, then what would be the full list of changes that are necessary?
just to clarify expectations around scope.
MSVC is intended to be the primary toolchain for DPDK on Windows so the
scope of what is covered is any drivers or libraries that build for
Windows.
clang build for Windows is being maintained at high priority but lacks
capabilities Windows users require.
> > > Let's try to compile a list of what needs to be changed, split it by priorities and work
> > > progressively through it.
> >
> > i agree that working progressively is the way forward, my suggested
> > partitioning has been to submit a smaller series that unblocks windows
> > using alloca as a starting point. this represents only a fraction of the
> > uses but can also serve for evaluation purposes.
>
> My concern here is that we are replacing something that is probably not ideal with
> something that is even worse.
> I do understand that it is supposed to be a temporary measure, but as you said
> alloca() is supported nearly everywhere, so in theory there would be no strong
> reason for maintainers to spend their time on further code rearrangements to replace
> alloca() with static arrays.
>
> >
> > if maintainers can identify a reasonable conversion to static array for
> > any of the converted instances i can incorporate the prescribed changes.
>
> Ok, that's why I suggested starting with the list of required changes.
> And then decide on a component-by-component basis.
The list is what is produced with -Wvla enabled on a clang build
targeting Windows.
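For illustration only (a sketch, not code from the DPDK tree; every name in it is hypothetical), this is the kind of pair that -Wvla is expected to tell apart. Both arrays look fixed-size at a glance, but in C99/C11 a const-qualified object is not an integer constant expression, so the second array is a VLA:

```c
#include <assert.h>
#include <stddef.h>

enum { BURST_MAX = 32 };            /* integer constant expression */
static const size_t burst_max = 32; /* const object, not a constant expression in C99/C11 */

/* Fixed-size scratch array: nothing for -Wvla to report. */
unsigned sum_fixed(const unsigned *src, size_t n)
{
    unsigned tmp[BURST_MAX];
    unsigned sum = 0;

    assert(n <= BURST_MAX);
    for (size_t i = 0; i < n; i++)
        tmp[i] = src[i];
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];
    return sum;
}

/* Same shape, but tmp is a VLA because its size is an object, not a constant. */
unsigned sum_surprise(const unsigned *src, size_t n)
{
    unsigned tmp[burst_max];
    unsigned sum = 0;

    assert(n <= burst_max);
    for (size_t i = 0; i < n; i++)
        tmp[i] = src[i];
    for (size_t i = 0; i < n; i++)
        sum += tmp[i];
    return sum;
}
```

Both functions return the same result; only the second should show up in the -Wvla list, and switching its size to the enum constant is the trivial conversion to a regular array.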
> From my side, I am ok to spend some time on the libs I am responsible for,
> to do such code changes.
I appreciate it!
* Re: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-10 7:32 ` Mattias Rönnblom
2024-04-10 7:52 ` Morten Brørup
@ 2024-04-10 17:04 ` Tyler Retzlaff
1 sibling, 0 replies; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-10 17:04 UTC (permalink / raw)
To: Mattias Rönnblom
Cc: Morten Brørup, Stephen Hemminger, techboard, dev,
Bruce Richardson, Thomas Monjalon
On Wed, Apr 10, 2024 at 09:32:10AM +0200, Mattias Rönnblom wrote:
> On 2024-04-08 17:53, Morten Brørup wrote:
> >>From: Tyler Retzlaff [mailto:roretzla@linux.microsoft.com]
> >>Sent: Monday, 8 April 2024 17.27
> >>
> >>For next techboard meeting.
> >>
> >>On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
> >>>On Sun, 7 Apr 2024 13:07:06 +0200
> >>>Morten Brørup <mb@smartsharesystems.com> wrote:
> >>>
> >>>>>From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> >>>>>Sent: Sunday, 7 April 2024 11.32
> >>>>>
> >>>>>On 2024-04-04 19:15, Tyler Retzlaff wrote:
> >>>>>>This series is not intended for merge. It instead provides examples
> >>>>>>of what converting use of VLAs to alloca() would look like.
> >>>>>>
> >>>>>>what are the advantages of VLA over alloca()?
> >>>>>>
> >>>>>>* sizeof(array) works as expected.
> >>>>>>
> >>>>>>* multi-dimensional arrays are still arrays instead of pointers to
> >>>>>> dynamically allocated space. this means multiple subscript syntax
> >>>>>> works (unlike on a pointer) and calculation of addresses into
> >>>>>> allocated space in ascending order is performed by the compiler
> >>>>>> instead of manually.
> >>>>>>
> >>>>>
> >>>>>alloca() is a pretty obscure mechanism, and also not a part of the C
> >>>>>standard. VLAs are C99, and well-known and understood, and very
> >>>>>efficient.
> >>>>
> >>>>The RFC fails to mention why we need to replace VLAs with something else:
> >>>>
> >>>>VLAs are C99, but not C++; VLAs were made optional in C11.
> >>>>
> >>>>MSVC doesn't support VLAs, and is not going to:
> >>>>https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-arriving-in-msvc/#variable-length-arrays
> >>>>
> >>>>
> >>>>I dislike alloca() too, and the notes section in the alloca(3) man page even discourages the use of alloca():
> >>>>https://man7.org/linux/man-pages/man3/alloca.3.html
> >>>>
> >>>>But I guess alloca() is the simplest replacement for VLAs.
> >>>>This RFC patch series opens the discussion for alternatives in different use cases.
> >>>>
> >>>
> >>>The other issue with VLA's is that if the number is something that can be externally
> >>>input, then it can be a source of stack overflow bugs. That is why the Linux kernel
> >>>has stopped using them; for security reasons. DPDK has much less of a security
> >>>trust domain. Mostly need to make sure that no data from network is being
> >>>used to compute VLA size.
> >>>
> >>
> >>Looks like we need to discuss this at the next techboard meeting.
> >>
> >>* MSVC doesn't support C11 optional VLAs (and never will).
> >>* alloca() is an alternative that is available on all platforms/toolchain
> >> combinations.
> >>* it's reasonable for some VLAs to be turned into regular arrays but it
> >> would be unsatisfactory to be stuck waiting on discussions of defining new
> >> constant expression macros on a per-use basis.
> >
> >We must generally stop using VLAs, for many reasons.
>
> What reasons would that be? And which of those reasons are not also
> reasons to stop using alloca().
truncated the sentence, probably should have said where static array is
not practical.
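For reference, a minimal sketch of the two differences quoted above (sizeof and multi-dimensional indexing) when the size really is only known at run time. It is not taken from the patch series; the function names are hypothetical, and it assumes a toolchain that provides alloca() through <alloca.h> (on Windows it is declared in <malloc.h> instead):

```c
#include <alloca.h>
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* VLA version: sizeof and two-dimensional subscripts work directly. */
size_t vla_bytes(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);   /* a zero-length VLA is undefined behaviour */

    int grid[rows][cols];
    memset(grid, 0, sizeof(grid));  /* sizeof is evaluated at run time */
    grid[rows - 1][cols - 1] = 1;   /* compiler computes the offset */
    return sizeof(grid);
}

/* alloca version: the size must be tracked and the offset computed by hand. */
size_t alloca_bytes(size_t rows, size_t cols)
{
    assert(rows > 0 && cols > 0);

    size_t bytes = rows * cols * sizeof(int);
    int *grid = alloca(bytes);      /* released when the function returns */
    memset(grid, 0, bytes);         /* sizeof(grid) would be sizeof(int *) */
    grid[(rows - 1) * cols + (cols - 1)] = 1;
    return bytes;
}
```

Both versions consume the same amount of stack, so neither is safer than the other when rows or cols can come from untrusted input.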
* Re: [PATCH 0/4] RFC samples converting VLA to alloca
2024-04-10 7:27 ` Mattias Rönnblom
@ 2024-04-10 17:10 ` Tyler Retzlaff
0 siblings, 0 replies; 34+ messages in thread
From: Tyler Retzlaff @ 2024-04-10 17:10 UTC (permalink / raw)
To: Mattias Rönnblom
Cc: Stephen Hemminger, techboard, Morten Brørup, dev,
Bruce Richardson, Thomas Monjalon
On Wed, Apr 10, 2024 at 09:27:10AM +0200, Mattias Rönnblom wrote:
> On 2024-04-08 17:27, Tyler Retzlaff wrote:
> >For next techboard meeting.
> >
> >On Sun, Apr 07, 2024 at 10:03:06AM -0700, Stephen Hemminger wrote:
> >>On Sun, 7 Apr 2024 13:07:06 +0200
> >>Morten Brørup <mb@smartsharesystems.com> wrote:
> >>
> >>>>From: Mattias Rönnblom [mailto:hofors@lysator.liu.se]
> >>>>Sent: Sunday, 7 April 2024 11.32
> >>>>
> >>>>On 2024-04-04 19:15, Tyler Retzlaff wrote:
> >>>>>This series is not intended for merge. It instead provides examples
> >>>>>of what converting use of VLAs to alloca() would look like.
> >>>>>
> >>>>>what are the advantages of VLA over alloca()?
> >>>>>
> >>>>>* sizeof(array) works as expected.
> >>>>>
> >>>>>* multi-dimensional arrays are still arrays instead of pointers to
> >>>>> dynamically allocated space. this means multiple subscript syntax
> >>>>> works (unlike on a pointer) and calculation of addresses into
> >>>>> allocated space in ascending order is performed by the compiler
> >>>>> instead of manually.
> >>>>
> >>>>alloca() is a pretty obscure mechanism, and also not a part of the C
> >>>>standard. VLAs are C99, and well-known and understood, and very
> >>>>efficient.
> >>>
> >>>The RFC fails to mention why we need to replace VLAs with something else:
> >>>
> >>>VLAs are C99, but not C++; VLAs were made optional in C11.
> >>>
> >>>MSVC doesn't support VLAs, and is not going to:
> >>>https://devblogs.microsoft.com/cppblog/c11-and-c17-standard-support-arriving-in-msvc/#variable-length-arrays
> >>>
> >>>
> >>>I dislike alloca() too, and the notes section in the alloca(3) man page even discourages the use of alloca():
> >>>https://man7.org/linux/man-pages/man3/alloca.3.html
> >>>
> >>>But I guess alloca() is the simplest replacement for VLAs.
> >>>This RFC patch series opens the discussion for alternatives in different use cases.
> >>>
> >>
> >>The other issue with VLA's is that if the number is something that can be externally
> >>input, then it can be a source of stack overflow bugs. That is why the Linux kernel
> >>has stopped using them; for security reasons. DPDK has much less of a security
> >>trust domain. Mostly need to make sure that no data from network is being
> >>used to compute VLA size.
> >>
> >
> >Looks like we need to discuss this at the next techboard meeting.
> >
> >* MSVC doesn't support C11 optional VLAs (and never will).
>
> This is due to dogmatism, or what? Surely, a lot of Open Source
> projects written for C99 will use VLAs.
well the statement from the MSVC team was
"VLAs provide attack vectors comparable to those of the infamous
gets() -- deprecated and destined to removal -- for opportunities of
“shifting the stack” and other exploits.
For these reasons we intend not to support VLAs as an optional
feature in C11"
i'm only communicating that they will never be supported, not debating the
reasons why. it's simply a statement of fact.
>
> >* alloca() is an alternative that is available on all platforms/toolchain
> > combinations.
>
> alloca() is a poor alternative. The use of alloca() should be
> restricted to situations where statically sized arrays can't do the
> job.
agree completely.
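As a sketch of what a statically sized array can look like when the caller's count is not bounded: the work is split into chunks of a compile-time maximum, so stack use no longer depends on the input. CHUNK_MAX and process_burst() are hypothetical names for illustration, not existing DPDK definitions, and the doubling step only stands in for real per-item work:

```c
#include <assert.h>
#include <stddef.h>

#define CHUNK_MAX 32 /* hypothetical compile-time bound on scratch space */

/* Process n items using a fixed-size scratch array, CHUNK_MAX at a time. */
unsigned long process_burst(const unsigned *items, size_t n)
{
    unsigned scratch[CHUNK_MAX];
    unsigned long total = 0;

    assert(items != NULL || n == 0);

    while (n > 0) {
        size_t chunk = n < CHUNK_MAX ? n : CHUNK_MAX;

        /* stage the chunk in the scratch array */
        for (size_t i = 0; i < chunk; i++)
            scratch[i] = items[i] * 2;
        /* consume the staged values */
        for (size_t i = 0; i < chunk; i++)
            total += scratch[i];

        items += chunk;
        n -= chunk;
    }
    return total;
}
```

Whether chunking is acceptable depends on the call site; where the whole burst must be visible at once, a fixed upper bound on the burst size is needed instead.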
Thread overview: 34+ messages
2023-11-07 19:32 RFC acceptable handling of VLAs across toolchains Tyler Retzlaff
2023-11-08 2:31 ` Stephen Hemminger
2023-11-08 3:25 ` Tyler Retzlaff
2023-11-08 8:19 ` Morten Brørup
2023-11-08 16:51 ` Stephen Hemminger
2023-11-08 17:48 ` Morten Brørup
2023-11-09 10:25 ` RFC: default burst sizes in rte_config Morten Brørup
2023-11-09 20:26 ` RFC acceptable handling of VLAs across toolchains Tyler Retzlaff
2024-03-21 0:12 ` Tyler Retzlaff
2024-04-04 17:15 ` [PATCH 0/4] RFC samples converting VLA to alloca Tyler Retzlaff
2024-04-04 17:15 ` [PATCH 1/4] latencystats: use alloca instead of vla trivial Tyler Retzlaff
2024-04-06 15:28 ` Morten Brørup
2024-04-07 9:36 ` Mattias Rönnblom
2024-04-07 17:00 ` Stephen Hemminger
2024-04-04 17:15 ` [PATCH 2/4] hash: " Tyler Retzlaff
2024-04-06 16:01 ` Morten Brørup
2024-04-04 17:15 ` [PATCH 3/4] vhost: use alloca instead of vla sizeof Tyler Retzlaff
2024-04-06 22:30 ` Morten Brørup
2024-04-04 17:15 ` [PATCH 4/4] dispatcher: use alloca instead of vla multi dimensional Tyler Retzlaff
2024-04-06 15:49 ` Morten Brørup
2024-04-07 9:31 ` [PATCH 0/4] RFC samples converting VLA to alloca Mattias Rönnblom
2024-04-07 11:07 ` Morten Brørup
2024-04-07 17:03 ` Stephen Hemminger
2024-04-08 15:27 ` Tyler Retzlaff
2024-04-08 15:53 ` Morten Brørup
2024-04-09 8:28 ` Konstantin Ananyev
2024-04-09 15:08 ` Tyler Retzlaff
2024-04-10 9:58 ` Konstantin Ananyev
2024-04-10 17:03 ` Tyler Retzlaff
2024-04-10 7:32 ` Mattias Rönnblom
2024-04-10 7:52 ` Morten Brørup
2024-04-10 17:04 ` Tyler Retzlaff
2024-04-10 7:27 ` Mattias Rönnblom
2024-04-10 17:10 ` Tyler Retzlaff