Hi,

I've noticed that the size of rte_mbuf differs between a Unix build and
a Windows build.

The requirement is for rte_mbuf to be RTE_CACHE_LINE_MIN_SIZE * 2 bytes;
however, when I build it on Windows the size is
RTE_CACHE_LINE_MIN_SIZE * 3.

The difference appears to come from the bit fields inside rte_mbuf: from
my testing, using 2 different bit-field types inside rte_mbuf causes
additional padding on Windows.

For example, the following unions from rte_mbuf have the same size on
Windows and Linux:

union {
    uint32_t packet_type;
    // bit fields of type uint32_t will follow
    ...
};...

4 bytes on both Unix and Windows.

union {
    uint64_t tx_offload;
    // bit fields of type uint64_t will follow
    ...
};

8 bytes on both Unix and Windows.

However, when I create a struct containing both unions, sizeof gives
16 bytes on Unix and 24 bytes on Windows.

Has anyone faced this issue before? Is it the result of different
alignment between gcc and clang when bit fields are used?

Thanks,

Tal
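For anyone who wants to reproduce the comparison, here is a minimal
sketch of a layout probe. It is not the real rte_mbuf definition: the
struct names, the field widths and the PROBE_CACHE_LINE_MIN_SIZE macro
(a stand-in for RTE_CACHE_LINE_MIN_SIZE, assumed to be 64) are
hypothetical and only mirror the shape of the two unions described
above. Building it with each toolchain and comparing the reported sizes
should show whether the padding appears in the individual unions or only
in the combined struct.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for RTE_CACHE_LINE_MIN_SIZE so the sketch builds without DPDK. */
#define PROBE_CACHE_LINE_MIN_SIZE 64

/* First union on its own: every bit field uses uint32_t. */
struct probe_ptype {
    union {
        uint32_t packet_type;
        struct {
            uint32_t l2_type : 4; /* widths are illustrative only */
            uint32_t l3_type : 4;
        };
    };
};

/* Second union on its own: every bit field uses uint64_t. */
struct probe_offload {
    union {
        uint64_t tx_offload;
        struct {
            uint64_t l2_len : 7;
            uint64_t l3_len : 9;
        };
    };
};

/* Both unions side by side, as in the struct described in the mail. */
struct probe_both {
    union {
        uint32_t packet_type;
        struct {
            uint32_t l2_type : 4;
            uint32_t l3_type : 4;
        };
    };
    union {
        uint64_t tx_offload;
        struct {
            uint64_t l2_len : 7;
            uint64_t l3_len : 9;
        };
    };
};

/* Compile-time guard in the spirit of the two-cache-line requirement. */
static_assert(sizeof(struct probe_both) <= 2 * PROBE_CACHE_LINE_MIN_SIZE,
              "probe struct exceeds two minimum cache lines");

/* Small accessors so the numbers can be printed or compared at run time. */
size_t probe_ptype_size(void)     { return sizeof(struct probe_ptype); }
size_t probe_offload_size(void)   { return sizeof(struct probe_offload); }
size_t probe_both_size(void)      { return sizeof(struct probe_both); }
size_t probe_offload_offset(void) { return offsetof(struct probe_both, tx_offload); }
```

Printing the four values with printf("%zu") from a small main() on each
platform gives a direct side-by-side comparison; the offset of
tx_offload shows where any extra padding was inserted.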
IIRC, it's this issue.
https://bugs.llvm.org/show_bug.cgi?id=24383

-----Original Message-----
From: Tal Shnaiderman <talshn@mellanox.com>
Sent: Wednesday, May 13, 2020 12:55 AM
To: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>; Thomas Monjalon <thomas@monjalon.net>; pallavi.kadam@intel.com; navasile@linux.microsoft.com; ranjit.menon@intel.com; Harini Ramakrishnan <Harini.Ramakrishnan@microsoft.com>; Omar Cardona <ocardona@microsoft.com>; Dmitry Malloy (MESHCHANINOV) <dmitrym@microsoft.com>; Yohad Tor <yohadt@mellanox.com>
Cc: dev@dpdk.org
Subject: [EXTERNAL] rte_mbuf structure size in Windows
Tal,
See attached compiler bug section for details.

-----Original Message-----
From: Omar Cardona
Sent: Wednesday, May 13, 2020 1:04 AM
To: Tal Shnaiderman <talshn@mellanox.com>; Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>; Thomas Monjalon <thomas@monjalon.net>; pallavi.kadam@intel.com; navasile@linux.microsoft.com; ranjit.menon@intel.com; Harini Ramakrishnan <Harini.Ramakrishnan@microsoft.com>; Dmitry Malloy (MESHCHANINOV) <dmitrym@microsoft.com>; Yohad Tor <yohadt@mellanox.com>; Jie Zhou <jizh@microsoft.com>
Cc: dev@dpdk.org
Subject: RE: rte_mbuf structure size in Windows
On Wed, 13 May 2020 07:55:07 +0000
Tal Shnaiderman <talshn@mellanox.com> wrote:
> I've noticed that there is a difference between the size of rte_mbuf
> in a Unix build comparing to Windows.
>
> The requirements is for rte_mbuf is to be RTE_CACHE_LINE_MIN_SIZE * 2
> bytes however when I'm building it in Windows the size is
> RTE_CACHE_LINE_MIN_SIZE * 3.
>
> Looks like the diff results from the usage of bit fields inside
> rte_mbuf, from my testing it looks to me like the usage of 2
> different bit fielded types inside rte_mbuf causes additional padding
> in Windows.
>
> For example from rte_mbuf, the following unions have the same size in
> Windows and Linux:
>
> union {
> uint32_t packet_type;
> // bit fields of type uint32_t will follow
> ...
> };...
>
> 4 bytes both in Unix and Windows.
>
> union {
> uint64_t tx_offload;
> // bit fields of type uint64_t will follow
> ...
> };
>
> 8 bytes both in Unix and Windows.
>
> However when creating a struct containing both unions I'm getting
> sizeof 16 bytes in Unix and 24 bytes in Windows.
>
> Did someone faced this issue before? Is this a result of different
> alignment between gcc and clang when bit fields are used?

Hi,

This is the issue we have been discussing since the beginning of the
year. Microsoft was supposed to track the bug and allocate resources to
fix it if possible. On the last community call, Naty and Omar claimed
there is no noticeable performance impact with l2fwd if mbuf spans
3 cache lines, but DmitryM commented this may depend on cache
utilization.

For GCC, the following workaround exists:

https://github.com/PlushBeaver/dpdk/commit/37f052cb18d1d5d425818196d5e1d15a7ada0de0

No workaround for Clang is known, bug URL:

https://bugs.llvm.org/show_bug.cgi?id=24383

--
Dmitry Kozlyuk
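The linked commit is not reproduced here. As a rough illustration of the
kind of mechanism GCC offers, and assuming (this is an assumption, not a
statement about what the commit does) that the relevant knob is GCC's
x86 choice between its native bit-field layout and the Microsoft one,
the sketch below declares the same mixed-type bit fields twice:
once with __attribute__((gcc_struct)) and once with
__attribute__((ms_struct)). The struct and macro names are hypothetical.
The gcc_struct attribute is skipped under Clang, which may not recognize
it.

```c
#include <stddef.h>
#include <stdint.h>

/* The layout attributes are x86-specific; expand to nothing elsewhere. */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#  define LAYOUT_MS __attribute__((ms_struct))
#  if !defined(__clang__)
#    define LAYOUT_GCC __attribute__((gcc_struct))
#  else
#    define LAYOUT_GCC /* assumed unsupported by Clang: use the default */
#  endif
#else
#  define LAYOUT_MS
#  define LAYOUT_GCC
#endif

/* Adjacent bit fields of different base types, native GCC layout. */
struct mixed_gcc {
    uint32_t a : 4;
    uint64_t b : 4;
} LAYOUT_GCC;

/* Same members, Microsoft-compatible layout requested. */
struct mixed_ms {
    uint32_t a : 4;
    uint64_t b : 4;
} LAYOUT_MS;

/* Accessors so both sizes can be printed or compared at run time. */
size_t mixed_gcc_size(void) { return sizeof(struct mixed_gcc); }
size_t mixed_ms_size(void)  { return sizeof(struct mixed_ms); }
```

Where the attributes take effect, the Microsoft layout is expected to be
the larger of the two, because it does not pack adjacent bit fields
whose base types differ in size into one storage unit. GCC also has the
-mms-bitfields and -mno-ms-bitfields options to select the default for a
whole build; whether either approach is acceptable for DPDK is a
separate question.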
Thank you Omar, this is indeed the same issue.
> -----Original Message-----
> From: Omar Cardona <ocardona@microsoft.com>
> Sent: Wednesday, May 13, 2020 11:08 AM
> To: Tal Shnaiderman <talshn@mellanox.com>; Dmitry Kozlyuk
> <dmitry.kozliuk@gmail.com>; Thomas Monjalon <thomas@monjalon.net>;
> pallavi.kadam@intel.com; navasile@linux.microsoft.com;
> ranjit.menon@intel.com; Harini Ramakrishnan
> <Harini.Ramakrishnan@microsoft.com>; Dmitry Malloy (MESHCHANINOV)
> <dmitrym@microsoft.com>; Yohad Tor <yohadt@mellanox.com>; Jie Zhou
> <jizh@microsoft.com>
> Cc: dev@dpdk.org
> Subject: RE: rte_mbuf structure size in Windows
>
> Tal,
> See attached compiler bug section for details.
> Subject: Re: rte_mbuf structure size in Windows
>
> Hi,
>
> This is the issue we were talking about from the beginning of year. Microsoft
> was supposed to track the bug and allocate resources to fix it if possible. On
> the last community call, Naty and Omar claimed there is no noticeable
> performance impact with l2fwd if mbuf spans 3 cache lines, but DmitryM
> commented this may depend on cache utilization.
>
> For GCC, the following workaround exists:
>
> https://github.com/PlushBeaver/dpdk/commit/37f052cb18d1d5d425818196d5e1d15a7ada0de0

Thank you Dmitry, do we plan to push this workaround (WO) or stay with
3 cache lines uniformly on Windows builds until the clang bug is
resolved? (I'm out of the loop regarding this issue).

> No workaround for Clang is known, bug URL:
>
> https://bugs.llvm.org/show_bug.cgi?id=24383
>
> --
> Dmitry Kozlyuk
On Wed, 13 May 2020 08:55:11 +0000
Tal Shnaiderman <talshn@mellanox.com> wrote:
> Thank you Dmitry, do we plan to push this WO or stay with 3 cache
> lines uniformly on Windows builds until the clang bug is resolved?
> (I'm out of the loop regarding this issue).
Let's bring this to today's call. IMO, a uniform layout would allow
building DPDK with MinGW and apps with Clang, which is good, and
performance is not a current priority, but I may be missing something.
--
Dmitry Kozlyuk