From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Ananyev, Konstantin"
To: Olivier Matz
Cc: "dev@dpdk.org", "akhil.goyal@nxp.com"
Date: Sat, 30 Mar 2019 14:20:31 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725801365622BB@irsmsx105.ger.corp.intel.com>
In-Reply-To: <20190329125427.hdwevmm4wwl73tlj@platinum>
References: <20190326154320.29913-1-konstantin.ananyev@intel.com>
 <20190329102726.27716-1-konstantin.ananyev@intel.com>
 <20190329102726.27716-2-konstantin.ananyev@intel.com>
 <20190329125427.hdwevmm4wwl73tlj@platinum>
Subject: Re: [dpdk-dev] [PATCH v4 1/9] mbuf: new function to generate raw
 Tx offload value

Hi Olivier,

> > Operations to set/update bit-fields often cause compilers
> > to generate suboptimal code.
> > To help avoid such situations for the tx_offload fields:
> > introduce a new enum for the tx_offload bit-field lengths and offsets,
> > and a new function to generate a raw tx_offload value.
> >
> > Signed-off-by: Konstantin Ananyev
> > Acked-by: Akhil Goyal
> 
> I understand the need. Out of curiosity, do you have any performance
> numbers to share?

On my board (SKX), for a micro-benchmark (doing nothing but setting
tx_offload for 1M mbufs in a loop) the difference is more than 150% -
from ~55 cycles down to ~20 cycles per iteration.
For ipsec-secgw it gives a ~3% improvement for tunneled outbound packets.
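(For reference, the micro-benchmark is essentially a loop of the shape
sketched below - a hypothetical reconstruction, not the exact test code.
It assumes a pre-initialized array mb[] of NB_MBUF mbuf pointers and uses
rte_rdtsc() from rte_cycles.h for cycle counting:

	uint64_t tsc;
	uint32_t i;

	tsc = rte_rdtsc();
	for (i = 0; i != NB_MBUF; i++)
		mb[i]->tx_offload = rte_mbuf_tx_offload(14, 20, 8, 0, 0, 0, 0);
	tsc = rte_rdtsc() - tsc;
	printf("%.2f cycles/iteration\n", (double)tsc / NB_MBUF);

versus the same loop assigning l2_len, l3_len, l4_len, etc. through the
individual bit-fields.)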
> 
> Few cosmetic questions below.

> 
> > ---
> >  lib/librte_mbuf/rte_mbuf.h | 79 ++++++++++++++++++++++++++++++++++----
> >  1 file changed, 72 insertions(+), 7 deletions(-)
> >
> > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > index d961ccaf6..0b197e8ce 100644
> > --- a/lib/librte_mbuf/rte_mbuf.h
> > +++ b/lib/librte_mbuf/rte_mbuf.h
> > @@ -479,6 +479,31 @@ struct rte_mbuf_sched {
> >  	uint16_t reserved;   /**< Reserved. */
> >  }; /**< Hierarchical scheduler */
> >
> > +/**
> > + * enum for the tx_offload bit-field lengths and offsets.
> > + * defines the layout of the rte_mbuf tx_offload field.
> > + */
> > +enum {
> > +	RTE_MBUF_L2_LEN_BITS = 7,
> > +	RTE_MBUF_L3_LEN_BITS = 9,
> > +	RTE_MBUF_L4_LEN_BITS = 8,
> > +	RTE_MBUF_TSO_SEGSZ_BITS = 16,
> > +	RTE_MBUF_OUTL3_LEN_BITS = 9,
> > +	RTE_MBUF_OUTL2_LEN_BITS = 7,
> > +	RTE_MBUF_L2_LEN_OFS = 0,
> > +	RTE_MBUF_L3_LEN_OFS = RTE_MBUF_L2_LEN_OFS + RTE_MBUF_L2_LEN_BITS,
> > +	RTE_MBUF_L4_LEN_OFS = RTE_MBUF_L3_LEN_OFS + RTE_MBUF_L3_LEN_BITS,
> > +	RTE_MBUF_TSO_SEGSZ_OFS = RTE_MBUF_L4_LEN_OFS + RTE_MBUF_L4_LEN_BITS,
> > +	RTE_MBUF_OUTL3_LEN_OFS =
> > +		RTE_MBUF_TSO_SEGSZ_OFS + RTE_MBUF_TSO_SEGSZ_BITS,
> > +	RTE_MBUF_OUTL2_LEN_OFS =
> > +		RTE_MBUF_OUTL3_LEN_OFS + RTE_MBUF_OUTL3_LEN_BITS,
> > +	RTE_MBUF_TXOFLD_UNUSED_OFS =
> > +		RTE_MBUF_OUTL2_LEN_OFS + RTE_MBUF_OUTL2_LEN_BITS,
> > +	RTE_MBUF_TXOFLD_UNUSED_BITS =
> > +		sizeof(uint64_t) * CHAR_BIT - RTE_MBUF_TXOFLD_UNUSED_OFS,
> > +};
> > +
> 
> What is the advantage of defining an enum instead of #defines?

No big difference here, it just looks nicer to me.

> 
> In any case, I wonder if it wouldn't be clearer to change the order like
> this:
> 
> enum {
> 	RTE_MBUF_L2_LEN_OFS = 0,
> 	RTE_MBUF_L2_LEN_BITS = 7,
> 	RTE_MBUF_L3_LEN_OFS = RTE_MBUF_L2_LEN_OFS + RTE_MBUF_L2_LEN_BITS,
> 	RTE_MBUF_L3_LEN_BITS = 9,
> 	RTE_MBUF_L4_LEN_OFS = RTE_MBUF_L3_LEN_OFS + RTE_MBUF_L3_LEN_BITS,
> 	RTE_MBUF_L4_LEN_BITS = 8,
> 	...

NP, can do it this way.

> 
> 
> >  /**
> >   * The generic rte_mbuf, containing a packet mbuf.
> >   */
> > @@ -640,19 +665,24 @@ struct rte_mbuf {
> >  		uint64_t tx_offload;       /**< combined for easy fetch */
> >  		__extension__
> >  		struct {
> > -			uint64_t l2_len:7;
> > +			uint64_t l2_len:RTE_MBUF_L2_LEN_BITS;
> >  			/**< L2 (MAC) Header Length for non-tunneling pkt.
> >  			 * Outer_L4_len + ... + Inner_L2_len for tunneling pkt.
> >  			 */
> > -			uint64_t l3_len:9; /**< L3 (IP) Header Length. */
> > -			uint64_t l4_len:8; /**< L4 (TCP/UDP) Header Length. */
> > -			uint64_t tso_segsz:16; /**< TCP TSO segment size */
> > +			uint64_t l3_len:RTE_MBUF_L3_LEN_BITS;
> > +			/**< L3 (IP) Header Length. */
> > +			uint64_t l4_len:RTE_MBUF_L4_LEN_BITS;
> > +			/**< L4 (TCP/UDP) Header Length. */
> > +			uint64_t tso_segsz:RTE_MBUF_TSO_SEGSZ_BITS;
> > +			/**< TCP TSO segment size */
> >
> >  			/* fields for TX offloading of tunnels */
> > -			uint64_t outer_l3_len:9; /**< Outer L3 (IP) Hdr Length. */
> > -			uint64_t outer_l2_len:7; /**< Outer L2 (MAC) Hdr Length. */
> > +			uint64_t outer_l3_len:RTE_MBUF_OUTL3_LEN_BITS;
> > +			/**< Outer L3 (IP) Hdr Length. */
> > +			uint64_t outer_l2_len:RTE_MBUF_OUTL2_LEN_BITS;
> > +			/**< Outer L2 (MAC) Hdr Length. */
> >
> > -			/* uint64_t unused:8; */
> > +			/* uint64_t unused:RTE_MBUF_TXOFLD_UNUSED_BITS; */
> >  		};
> >  	};
> >
> > @@ -2243,6 +2273,41 @@ static inline int rte_pktmbuf_chain(struct rte_mbuf *head, struct rte_mbuf *tail
> >  		return 0;
> >  }
> >
> > +/**
> > + * @warning
> > + * @b EXPERIMENTAL: This API may change without prior notice.
> > + *
> > + * For given input values generate a raw tx_offload value.
> > + * @param il2
> > + *   l2_len value.
> > + * @param il3
> > + *   l3_len value.
> > + * @param il4
> > + *   l4_len value.
> > + * @param tso
> > + *   tso_segsz value.
> > + * @param ol3
> > + *   outer_l3_len value.
> > + * @param ol2
> > + *   outer_l2_len value.
> > + * @param unused
> > + *   unused value.
> > + * @return
> > + *   raw tx_offload value.
> > + */
> > +static __rte_always_inline uint64_t
> > +rte_mbuf_tx_offload(uint64_t il2, uint64_t il3, uint64_t il4, uint64_t tso,
> > +	uint64_t ol3, uint64_t ol2, uint64_t unused)
> > +{
> > +	return il2 << RTE_MBUF_L2_LEN_OFS |
> > +		il3 << RTE_MBUF_L3_LEN_OFS |
> > +		il4 << RTE_MBUF_L4_LEN_OFS |
> > +		tso << RTE_MBUF_TSO_SEGSZ_OFS |
> > +		ol3 << RTE_MBUF_OUTL3_LEN_OFS |
> > +		ol2 << RTE_MBUF_OUTL2_LEN_OFS |
> > +		unused << RTE_MBUF_TXOFLD_UNUSED_OFS;
> > +}
> > +
> >  /**
> 
> 
> From what I see, the problem is quite similar to what was done with
> rte_mbuf_sched_set() recently. So I wondered if it was possible to
> declare a structure like this:
> 
> struct rte_mbuf_ol_len {
> 	uint64_t l2_len:7;
> 	uint64_t l3_len:9; /**< L3 (IP) Header Length. */
> 	uint64_t l4_len:8; /**< L4 (TCP/UDP) Header Length. */
> 	...
> }
> 
> And have the set function like this:
> 
> m->l = (struct rte_mbuf_ol_len) {
> 	.l2_len = l2_len,
> 	.l3_len = l3_len,
> 	.l4_len = l4_len,
> 	...
> 
> This would avoid the definition of the offsets and bits, but I didn't
> find any way to declare these fields as anonymous in the mbuf structure.
> Did you try that way too?

I thought about such an approach, but as you said above it would change
the field from an unnamed struct to a named one, which, as I understand
it, means API breakage. So I don't think the hassle would be worth the
benefit.
Also the code wouldn't be totally identical - that approach generates a
few extra 'AND' instructions, since the compiler still has to mask each
input value to its bit-field width before merging it in.
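As a side note for anyone reading along - a hypothetical caller (header
lengths made up for illustration, not taken from the patch) would fill
tx_offload for, say, an IPv4-in-IPv4 tunneled TCP packet with a single
64-bit store:

	/* inner l2/l3/l4 = 14/20/20, no TSO, outer l3/l2 = 20/14 */
	m->tx_offload = rte_mbuf_tx_offload(14, 20, 20, 0, 20, 14, 0);

instead of six separate read-modify-write updates of the bit-fields.
Note the function does no masking, so the caller is expected to pass
values that fit their bit-field widths.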
Konstantin