DPDK patches and discussions
 help / color / mirror / Atom feed
From: Stephen Hemminger <stephen@networkplumber.org>
To: "Morten Brørup" <mb@smartsharesystems.com>
Cc: "Robin Jarry" <rjarry@redhat.com>,
	"Vladimir Medvedkin" <medvedkinv@gmail.com>,
	<bruce.richardson@intel.com>,
	"Konstantin Ananyev" <konstantin.ananyev@huawei.com>,
	<dev@dpdk.org>, "Sunil Kumar Kori" <skori@marvell.com>,
	"Rakesh Kudurumalla" <rkudurumalla@marvell.com>,
	"Wisam Jaddo" <wisamm@nvidia.com>,
	"Cristian Dumitrescu" <cristian.dumitrescu@intel.com>,
	"Konstantin Ananyev" <konstantin.v.ananyev@yandex.ru>,
	"Akhil Goyal" <gakhil@marvell.com>,
	"Fan Zhang" <fanzhang.oss@gmail.com>,
	"Yipeng Wang" <yipeng1.wang@intel.com>,
	"Sameh Gobriel" <sameh.gobriel@intel.com>,
	"Nithin Dabilpuram" <ndabilpuram@marvell.com>,
	"Kiran Kumar K" <kirankumark@marvell.com>,
	"Satha Rao" <skoteshwar@marvell.com>,
	"Harman Kalra" <hkalra@marvell.com>,
	"Ankur Dwivedi" <adwivedi@marvell.com>,
	"Anoob Joseph" <anoobj@marvell.com>,
	"Tejasree Kondoj" <ktejasree@marvell.com>,
	"Gagandeep Singh" <g.singh@nxp.com>,
	"Hemant Agrawal" <hemant.agrawal@nxp.com>,
	"Ajit Khaparde" <ajit.khaparde@broadcom.com>,
	"Somnath Kotur" <somnath.kotur@broadcom.com>,
	"Chas Williams" <chas3@att.com>,
	"Min Hu (Connor)" <humin29@huawei.com>,
	"Potnuri Bharat Teja" <bharat@chelsio.com>,
	"Sachin Saxena" <sachin.saxena@nxp.com>,
	"Xiaoyun Wang" <cloud.wangxiaoyun@huawei.com>,
	"Jie Hai" <haijie1@huawei.com>,
	"Yisen Zhuang" <yisen.zhuang@huawei.com>,
	"Jingjing Wu" <jingjing.wu@intel.com>,
	"Dariusz Sosnowski" <dsosnowski@nvidia.com>,
	"Viacheslav Ovsiienko" <viacheslavo@nvidia.com>,
	"Bing Zhao" <bingz@nvidia.com>, "Ori Kam" <orika@nvidia.com>,
	"Suanming Mou" <suanmingm@nvidia.com>,
	"Matan Azrad" <matan@nvidia.com>,
	"Chaoyong He" <chaoyong.he@corigine.com>,
	"Devendra Singh Rawat" <dsinghrawat@marvell.com>,
	"Alok Prasad" <palok@marvell.com>,
	"Andrew Rybchenko" <andrew.rybchenko@oktetlabs.ru>,
	"Jiawen Wu" <jiawenwu@trustnetic.com>,
	"Jian Wang" <jianwang@trustnetic.com>,
	"Thomas Monjalon" <thomas@monjalon.net>,
	"Ferruh Yigit" <ferruh.yigit@amd.com>,
	"Jiayu Hu" <hujiayu.hu@foxmail.com>,
	"Pavan Nikhilesh" <pbhagavatula@marvell.com>,
	"Maxime Coquelin" <maxime.coquelin@redhat.com>,
	"Chenbo Xia" <chenbox@nvidia.com>
Subject: Re: IPv6 APIs rework
Date: Fri, 19 Jul 2024 10:07:01 -0700	[thread overview]
Message-ID: <20240719100701.7e7dfc7f@hermes.local> (raw)
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35E9F5AF@smartserver.smartshare.dk>

On Fri, 19 Jul 2024 17:47:47 +0200
Morten Brørup <mb@smartsharesystems.com> wrote:

> > From: Robin Jarry [mailto:rjarry@redhat.com]
> > 
> > Morten Brørup, Jul 19, 2024 at 12:46:  
> > > When passing an IPv4 address as a value, alignment does matter; it
> > > must be 4 byte aligned.  
> > 
> > I was expecting the compiler to do what is necessary to copy the data to
> > an aligned register before jumping to the function.  
> 
> Yes, and hereby you have achieved 4-byte alignment of the parameter.
> 
> What I meant was: If the parameter's type makes the parameter explicitly unaligned, e.g. an unaligned array of 4 bytes or an unaligned_uint32_t, the code inside the function must also treat the parameter as unaligned, and cannot assume it has magically become 4-byte aligned.
> 
> Our functions taking an IPv4 address parameter (by value) passes the value as aligned.
> Functions taking an IPv6 address parameter (by value) should behave exactly the same way: The compiler should do what is necessary to copy the data to an aligned register *before* jumping to the function. (Note: In 64 bit architectures, 128 bits requires two 64 bit registers.) The point remains: If conversion from unaligned to aligned is required, it is the responsibility of the code calling the function, not the function itself.
> 
> >   
> > > On a CPU with 128 bit registers, I would probably also pass an IPv6
> > > address as a value. With such a CPU, the parameter type should be
> > > uint128_t or rte_be128_t, depending on byte order.  
> > 
> > I don't think there is a portable/standard uint128_t yet. Everything
> > I could find is either GCC or linux specific.  
> 
> Agree. I am using uint128_t conceptually.
> 
> >   
> > > There's a 3rd option:
> > > Have an IPv6 type that is simply an array of 16 bytes with no  
> > explicitly specified alignment:  
> > >
> > > struct rte_ipv6_addr {
> > > 	unsigned char addr_bytes[RTE_IPV6_ADDR_LEN];
> > > };
> > >
> > > Or:
> > >
> > > typedef struct rte_ipv6_addr {
> > > 	unsigned char addr_bytes[RTE_IPV6_ADDR_LEN];
> > > } rte_ipv6_addr_t;
> > >
> > > If used as is, it will be unaligned.
> > >
> > > And if alignment offers improved performance for some use cases,
> > > explicit alignment attributes can be added to the type in those use
> > > cases.
> > >
> > > Not using an uint128_t type (or a union of other types than unsigned
> > > char) will also avoid byte order issues.
> > >
> > > I guess Stephen was right to begin with. :-)  
> > 
> > Having the type as a union (as is the POSIX type) makes casting to
> > integers a lot less tedious and makes the structure overall more
> > flexible.  
> 
> Maybe (probably?). However, if you explicitly make the type unaligned, how can the same type be used in an optimized way where the developer knows that it is 16 byte aligned?
> 
> NB: There's something in the C standard about type casting from char (and unsigned char) being less restricted than typecasting from uint8_t, so perhaps using unsigned char instead of uint8_t could solve the recasting issue your union is trying to solve. (Unfortunately, I cannot remember the source of this information.)
> 
> Generally I don't think that we should introduce complex types/structures/unions only to simplify type casting, if it is at the expense of performance or code readability.
> 
> > 
> > We could completely add an unaligned be128 member to the union by the
> > way. I don't see what is wrong with having sub union members.  
> 
> (Not that I agree to using a union, but..)
> Agree. If it's a union, and alignment is explicitly set, adding 64 bit and 128 bit sub union members should be perfectly acceptable, as it does not modify the alignment or anything else.
> 
> > 
> > About your concern with byte order, since the union members have
> > explicit rte_be*_t types, I don't think confusion can happen. I have
> > also renamed the members, replacing the "u" prefix with "a" so that it
> > does not indicate that it should be used as a host integer.
> > 
> >         struct __rte_aligned(1) rte_ipv6_addr {
> >                 union {
> >                         unsigned char a[16];
> >                         unaligned_be16_t a16[8];
> >                         unaligned_be32_t a32[4];
> >                         unaligned_be64_t a64[2];
> >                         unaligned_be128_t a128[1];
> >                 };
> >         } __rte_packed;  

Don't use packed, it makes the compiler generate very slow access since
it has to assume worst case alignment.

The intermediate forms are not actually big endian. Don't do that.
It also would cause checkers to think swaps are needed.

> 
> (Again, not that I'm accepting the structure, but...)
> Yes, this would solve the byte order concern.
> 
> How do you write efficient code with this forcefully unaligned type?
> Let's say an application has some structure with an IPv6 address field, which the developer has designed to be 16 byte aligned in the structure.
> The compiler would need to always access this 16 byte aligned field as unaligned, because the rte_ipv6_addr type makes it explicitly unaligned.

The force align is a bad idea.



  reply	other threads:[~2024-07-19 17:07 UTC|newest]

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-07-18 15:03 Robin Jarry
2024-07-18 20:27 ` Morten Brørup
2024-07-18 21:15   ` Stephen Hemminger
2024-07-18 21:40     ` Robin Jarry
2024-07-18 21:25   ` Vladimir Medvedkin
2024-07-18 21:34     ` Robin Jarry
2024-07-19  8:25       ` Konstantin Ananyev
2024-07-19  9:12       ` Morten Brørup
2024-07-19 10:02         ` Robin Jarry
2024-07-19 10:09           ` Bruce Richardson
2024-07-19 10:46           ` Morten Brørup
2024-07-19 11:09             ` Robin Jarry
2024-07-19 15:47               ` Morten Brørup
2024-07-19 17:07                 ` Stephen Hemminger [this message]
2024-07-20 17:43                   ` Robin Jarry
2024-07-20 20:26                     ` Stephen Hemminger
2024-07-20 20:33                       ` Robin Jarry
2024-07-21 16:12                         ` Morten Brørup
2024-07-21 21:51                           ` Robin Jarry
2024-07-22  9:31                             ` Morten Brørup
2024-07-19 10:41         ` Medvedkin, Vladimir

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240719100701.7e7dfc7f@hermes.local \
    --to=stephen@networkplumber.org \
    --cc=adwivedi@marvell.com \
    --cc=ajit.khaparde@broadcom.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=anoobj@marvell.com \
    --cc=bharat@chelsio.com \
    --cc=bingz@nvidia.com \
    --cc=bruce.richardson@intel.com \
    --cc=chaoyong.he@corigine.com \
    --cc=chas3@att.com \
    --cc=chenbox@nvidia.com \
    --cc=cloud.wangxiaoyun@huawei.com \
    --cc=cristian.dumitrescu@intel.com \
    --cc=dev@dpdk.org \
    --cc=dsinghrawat@marvell.com \
    --cc=dsosnowski@nvidia.com \
    --cc=fanzhang.oss@gmail.com \
    --cc=ferruh.yigit@amd.com \
    --cc=g.singh@nxp.com \
    --cc=gakhil@marvell.com \
    --cc=haijie1@huawei.com \
    --cc=hemant.agrawal@nxp.com \
    --cc=hkalra@marvell.com \
    --cc=hujiayu.hu@foxmail.com \
    --cc=humin29@huawei.com \
    --cc=jianwang@trustnetic.com \
    --cc=jiawenwu@trustnetic.com \
    --cc=jingjing.wu@intel.com \
    --cc=kirankumark@marvell.com \
    --cc=konstantin.ananyev@huawei.com \
    --cc=konstantin.v.ananyev@yandex.ru \
    --cc=ktejasree@marvell.com \
    --cc=matan@nvidia.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=mb@smartsharesystems.com \
    --cc=medvedkinv@gmail.com \
    --cc=ndabilpuram@marvell.com \
    --cc=orika@nvidia.com \
    --cc=palok@marvell.com \
    --cc=pbhagavatula@marvell.com \
    --cc=rjarry@redhat.com \
    --cc=rkudurumalla@marvell.com \
    --cc=sachin.saxena@nxp.com \
    --cc=sameh.gobriel@intel.com \
    --cc=skori@marvell.com \
    --cc=skoteshwar@marvell.com \
    --cc=somnath.kotur@broadcom.com \
    --cc=suanmingm@nvidia.com \
    --cc=thomas@monjalon.net \
    --cc=viacheslavo@nvidia.com \
    --cc=wisamm@nvidia.com \
    --cc=yipeng1.wang@intel.com \
    --cc=yisen.zhuang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).