DPDK patches and discussions
 help / color / mirror / Atom feed
From: Nithin Dabilpuram <ndabilpuram@marvell.com>
To: Slava Ovsiienko <viacheslavo@mellanox.com>
Cc: Olivier Matz <olivier.matz@6wind.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	Jerin Jacob <jerinjacobk@gmail.com>,
	"Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>,
	Nithin Dabilpuram <nithind1988@gmail.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	Andrew Rybchenko <arybchenko@solarflare.com>,
	Ori Kam <orika@mellanox.com>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>,
	"Mcnamara, John" <john.mcnamara@intel.com>,
	"Kovacevic, Marko" <marko.kovacevic@intel.com>,
	dpdk-dev <dev@dpdk.org>, Jerin Jacob <jerinj@marvell.com>,
	Krzysztof Kanas <kkanas@marvell.com>,
	"techboard@dpdk.org" <techboard@dpdk.org>
Subject: Re: [dpdk-dev] [EXT] Re: [PATCH 1/3] mbuf: add Tx offloads for packet marking
Date: Mon, 8 Jun 2020 15:09:48 +0530	[thread overview]
Message-ID: <20200608093948.GA11124@outlook.office365.com> (raw)
In-Reply-To: <AM4PR05MB32657915251B4BCBEAC703A8D2880@AM4PR05MB3265.eurprd05.prod.outlook.com>

On Wed, Jun 03, 2020 at 04:14:13PM +0000, Slava Ovsiienko wrote:
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Nithin Dabilpuram
> > Sent: Wednesday, June 3, 2020 15:52
> > To: Olivier Matz <olivier.matz@6wind.com>
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Jerin Jacob
> > <jerinjacobk@gmail.com>; Dumitrescu, Cristian
> > <cristian.dumitrescu@intel.com>; Nithin Dabilpuram
> > <nithind1988@gmail.com>; Thomas Monjalon <thomas@monjalon.net>;
> > Yigit, Ferruh <ferruh.yigit@intel.com>; Andrew Rybchenko
> > <arybchenko@solarflare.com>; Ori Kam <orika@mellanox.com>; Burakov,
> > Anatoly <anatoly.burakov@intel.com>; Mcnamara, John
> > <john.mcnamara@intel.com>; Kovacevic, Marko
> > <marko.kovacevic@intel.com>; dpdk-dev <dev@dpdk.org>; Jerin Jacob
> > <jerinj@marvell.com>; Krzysztof Kanas <kkanas@marvell.com>;
> > techboard@dpdk.org
> > Subject: Re: [dpdk-dev] [EXT] Re: [PATCH 1/3] mbuf: add Tx offloads for
> > packet marking
> > 
> > On Wed, Jun 03, 2020 at 01:38:22PM +0200, Olivier Matz wrote:
> > > Hi Nithin,
> > >
> > > On Wed, Jun 03, 2020 at 04:14:14PM +0530, Nithin Dabilpuram wrote:
> > > > On Wed, Jun 03, 2020 at 10:28:44AM +0200, Olivier Matz wrote:
> > > > > Hi,
> > > > >
> > > > > On Tue, Jun 02, 2020 at 07:55:37PM +0530, Nithin Dabilpuram wrote:
> > > > > > On Tue, Jun 02, 2020 at 10:53:08AM +0000, Ananyev, Konstantin
> > wrote:
> > > > > > > Hi Jerin,
> > > > > > >
> > > > > > > > > > > I also share Olivier's concern about consuming 3 bits in
> > ol_flags for that feature.
> > > > > > > > > > > Can it probably be squeezed somehow?
> > > > > > > > > > > Let say we reserve one flag that this information is
> > > > > > > > > > > present or not, and re-use one of rx-only fields for store
> > additional information (packet_type, or so).
> > > > > > > > > > > Or might be some other approach.
> > > > > > > > > >
> > > > > > > > > > We are fine with this approach where we define one bit
> > > > > > > > > > in Tx offloads for pkt marking and and 3 bits reused from Rx
> > offload flags area.
> > > > > > > > > >
> > > > > > > > > > For example:
> > > > > > > > > >
> > > > > > > > > > @@ -186,10 +186,16 @@ extern "C" {
> > > > > > > > > >
> > > > > > > > > >  /* add new RX flags here, don't forget to update
> > > > > > > > > > PKT_FIRST_FREE */
> > > > > > > > > >
> > > > > > > > > > +/* Reused Rx offload bits for Tx offloads */
> > > > > > > > > > +#define PKT_X_TX_MARK_VLAN_DEI         (1ULL << 0)
> > > > > > > > > > +#define PKT_X_TX_MARK_IP_DSCP          (1ULL << 1)
> > > > > > > > > > +#define PKT_X_TX_MARK_IP_ECN           (1ULL << 2)
> > > > > > > > > > +
> > > > > > > > > >  #define PKT_FIRST_FREE (1ULL << 23) -#define
> > > > > > > > > > PKT_LAST_FREE (1ULL << 40)
> > > > > > > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > > > > > > >
> > > > > > > > > >  /* add new TX flags here, don't forget to update
> > > > > > > > > > PKT_LAST_FREE  */
> > > > > > > > > > +#define PKT_TX_MARK_EN         (1ULL << 40)
> > > > > > > > > >
> > > > > > > > > > Is this fine ?
> > > > > > > > >
> > > > > > > > > Any thoughts on this approach which uses only 1 bit in Tx
> > > > > > > > > flags out of 18 and reuse unused Rx flag bits ?
> > > > > > >
> > > > > > > My thought was not about re-defining the flags (I think it is
> > > > > > > better to keep them intact), but adding a union for one of rx-only
> > fields (packet_type/rss/timestamp).
> > > > > >
> > > > > > Ok. Adding a union field at packet_type field is also fine like below.
> > > > > >
> > > > > > @@ -187,9 +187,10 @@ extern "C" {
> > > > > >  /* add new RX flags here, don't forget to update PKT_FIRST_FREE
> > > > > > */
> > > > > >
> > > > > >  #define PKT_FIRST_FREE (1ULL << 23) -#define PKT_LAST_FREE
> > > > > > (1ULL << 40)
> > > > > > +#define PKT_LAST_FREE (1ULL << 39)
> > > > > >
> > > > > >  /* add new TX flags here, don't forget to update PKT_LAST_FREE
> > > > > > */
> > > > > > +#define PKT_TX_MARK_EN		(1ULL << 40)
> > > > > >
> > > > > >  /**
> > > > > >   * Outer UDP checksum offload flag. This flag is used for
> > > > > > enabling @@ -461,6 +462,14 @@ enum {  #endif  };
> > > > > >
> > > > > > +/* Tx packet marking flags in rte_mbuf::tx_mark.
> > > > > > + * Valid only when PKT_TX_MARK_EN is set in
> > > > > > + * rte_mbuf::ol_flags.
> > > > > > + */
> > > > > > +#define TX_MARK_VLAN_DEI	(1ULL << 0)
> > > > > > +#define TX_MARK_IP_DSCP	(1ULL << 1)
> > > > > > +#define TX_MARK_IP_ECN		(1ULL << 2)
> > > > > > +
> > > > > >  /**
> > > > > >   * The generic rte_mbuf, containing a packet mbuf.
> > > > > >   */
> > > > > > @@ -543,6 +552,10 @@ struct rte_mbuf {
> > > > > >  			};
> > > > > >  			uint32_t inner_l4_type:4; /**< Inner L4 type. */
> > > > > >  		};
> > > > > > +		struct {
> > > > > > +			uint32_t reserved:29;
> > > > > > +			uint32_t tx_mark:3;
> > > > > > +		};
> > > > > >  	};
> > > > > >
> > > > > >
> > > > > >
> > > > > > Please correct me if this is not what you mean.
> > > > >
> > > > > I'm not a big fan of reusing Rx fields or flags for Tx.
> > > > > It's not obvious for an application than adding a tx_mark will
> > > > > overwrite the packet_type. I understand that the risk is limited
> > > > > because packet_type is Rx and the marks are Tx, but there is still one.
> > > >
> > > > I'm also not a big fan but just wanted to take this approach so
> > > > that, it can both conserve space and also help fast path.
> > > > Reusing Rx area is however not a new thing as is already followed
> > > > for
> > > > mbuf->txadapter field.
> > >
> > > Yes, and in my opinion this is something we should avoid when
> > > possible, because it makes some features exclusive (ex: the big union
> > > with sched/rss/adapter/usr/...).
> > >
> > > > Apart from documentation issue, Is there any other issue or future
> > > > ramification with using Rx field's for Tx ?
> > >
> > > No, I don't see any other issue except the ones we already mentioned (doc,
> > code clarity, ).
> > >
> > > > If it is only about documentation, then we can add more documentation
> > to make things clear.
> > > > >
> > > > > To summarize the different proposed approaches (please correct me if
> > I'm wrong):
> > > > >
> > > > > a- add 3 Tx mbuf flags
> > > > >    (-) consumes limited resource
> > > > >
> > > > > b- add 3 dynamic flags
> > > > >    (-) slower
> > > >
> > > > - Tx burst Vector implementation can't be done for this tx offload as
> > > >   offset keeps changing.
> > >
> > > A vector implementation can be done. But yes, it would be slower than
> > > with a static flag.
> > 
> > Very slow atleast in our HW as, we try to translate ol_flags to HW descriptor
> > flags in addition to extra operations to be done like offset calculations etc.
> 
> The dynamic flag offset is not subject to be changed after registration.
> So, flag offset can be converted once into appropriate mask and stored locally by PMD  for further using in vector instructions.
> The only difference - loaded variable mask instead of constant one, should not affect performance too much.
> With the cmpeq/shifts the and results can be converted to any desired predefined mask (like static one).
> 
> > 
> > So if we have fixed offsets, then it is easy to have static constant
> 
> BTW, how this constant is loaded?
> I suppose on x86 it would be rather mm_load?_si128(*constant array) - no difference 
> between dynamic and static masks at all. BTW, there is the hint - to make local copy of the
> mask/offset in order to avoid cache-line concurrency in global variable storage.
> 
> > 128/256 bit words with offsets and use things like shuffle/table lookup to
> > reorganize multiple mbuf flags to descriptor fields in a single instruction.
> 
> Offsets in the HW descriptor remain the fixed ones, so shuffle would still work OK.
> Do you already have some vectorized implementation? It would be curious to have a look at.

Agree that the constants need to be loaded from memory, but there is also a
logic created onto transforming that data to descriptor which is not always straight forward.

For example, see otx2_tx.c, 

// mbuf has tx offload field L2_LEN of 7 bits, L3_LEN of 9 bits which are not on byte boundary. 
// Below are the transformations done to get them to byte boundary.

/* Get tx_offload for ol2, ol3, l2, l3 lengths from mbuf*/
/*
 * Operation result:
 * E(8):OL2_LEN(7):OL3_LEN(9):E(24):L3_LEN(9):L2_LEN(7)
 * E(8):OL2_LEN(7):OL3_LEN(9):E(24):L3_LEN(9):L2_LEN(7)
 */

  asm volatile ("LD1 {%[a].D}[0],[%[in]]\n\t" :
                [a]"+w"(senddesc01_w1) :
                [in]"r"(mbuf0 + 2) : "memory");


 /*
  * Operation Result:
  * E(47):L3_LEN(9):L2_LEN(7+z)
  * E(47):L3_LEN(9):L2_LEN(7+z)
  */
 senddesc01_w1 = vshlq_n_u64(senddesc01_w1, 1);
 senddesc23_w1 = vshlq_n_u64(senddesc23_w1, 1);

/*
 * Result: 
 * E(48):L3_LEN(8):L2_LEN(z+7)
 * E(48):L3_LEN(8):L2_LEN(z+7)
 */
const int8x16_t tshft3 = {
        -1, 0, 8, 8, 8, 8, 8, 8,
        -1, 0, 8, 8, 8, 8, 8, 8,
};

senddesc01_w1 = vshlq_u8(senddesc01_w1, tshft3);
senddesc23_w1 = vshlq_u8(senddesc23_w1, tshft3);


We cannot get all the above logic done runtime for every position of mbuf->l2_len if
the field position keeps changing for every application run.


> 
> I strongly share the concern about defining the static mbuf flags,
> we should consider all ways to avoid doing this.
> 
> WBR, Slava
> > 
> > >
> > > > >
> > > > > c- add 1 Tx flag and union with Rx field
> > > > >    (-) exclusive with Rx field
> > > > >    (-) still consumes one flag
> > > > >
> > > > > My preference is still b-, for these reasons:
> > > > >
> > > > > - There are many different DPDK use cases, and resources in mbuf is
> > tight.
> > > > >   Recent contributions (rte_flow and ice driver) already made use of
> > dynamic
> > > > >   fields/flags.
> > > > - Since RTE_FLOW metadata is 32-bit field, it is a clear candidate
> > > > for dynamic flags.
> > >
> > > I'm not sure to get why it is a better candidate than packet marking.
> > > You mean because it requires more room in mbuf?
> > 
> > Yes, I feel space consumption is one way to decide whether it should be a
> > dynfield or static field.
> > 
> > IMO, other parameter to judge could be whether the field definition/usage
> > itself is well know standard and is a part of RTE spec or its definition is
> > vendor specific.
> > 
> > >
> > > > - ICE PMD's dynamic field is however a vendor specific field and
> > > > only for ICE PMD users.
> > >
> > > Yes, but ICE PMD users may be as important as packet marking users.
> > 
> > Agree, I only meant that the flag ICE PMD registered cannot be used for
> > other PMD's so by using dynamic field, we are avoiding wastage of a static
> > field that is needed only by one specific PMD irrespective of whether that
> > PMD is probed or not.
> > 
> > >
> > > > In this case, it is just 1 bit out of 18 free bits available in ol_flags.
> > > >
> > > > >
> > > > > - When I implemented the dynamic fields/flags feature, I did a test
> > which
> > > > >   showed that the cost of having a dynamic offset was few cycles (on
> > my test
> > > > >   platform, it was~3 cycles for reading a field and ~2 cycles for writing a
> > > > >   field).
> > > >
> > > > I think this cost is of the case where the address where the
> > > > dyn_offset is stored is already in cache as it needs to be read first.
> > >
> > > This fetch of the value (in case it is not in cache) can be done once
> > > per bulk, so I'm not sure the impact would be high.
> > 
> > Agreed, for bulk case offset loading should have less impact.
> > 
> > >
> > >
> > > Regards,
> > > Olivier
> > >
> > >
> > > >
> > > >
> > > > >
> > > > > Regards,
> > > > > Olivier
> > > > >
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > + Techboard
> > > > > > > >
> > > > > > > > There is a related thread going on
> > > > > > > > https://urldefense.proofpoint.com/v2/url?u=https-3A__eur03.safelinks.protection.outlook.com_-3Furl-3Dhttps-253A&d=DwIFAg&c=nKjWec2b6R0mOyPaz7xtfQ&r=FZ_tPCbgFOh18zwRPO9H0yDx8VW38vuapifdDfc8SFQ&m=5IzeBXssneRLE7EeAyOUnytITHfcmhQu-eRLr18e5U0&s=aCn4On5jrJAfrGkQm_ftPodlBQo3QiozQM76MU9S8j8&e= 
> > > > > > > > %2F%2Furldefense.proofpoint.com%2Fv2%2Furl%3Fu%3Dhttp-
> > 3A__ma
> > > > > > > > ils.dpdk.org_archives_dev_2020-
> > 2DMay_168810.html%26d%3DDwIGa
> > > > > > > >
> > Q%26c%3DnKjWec2b6R0mOyPaz7xtfQ%26r%3DFZ_tPCbgFOh18zwRPO9H0y
> > D
> > > > > > > >
> > x8VW38vuapifdDfc8SFQ%26m%3DnyV4Rud03HW6DbWMpyvOCulQNkagmf
> > o0w
> > > > > > > >
> > KtrwQ7zmmg%26s%3DVuktoUb_xoLsHKdB9mV87x67cP9tXk3DqVXptt9nF_s
> > > > > > > >
> > %26e%3D&amp;data=02%7C01%7Cviacheslavo%40mellanox.com%7C5e7a
> > > > > > > >
> > 9c93fd684e09267108d807bd0160%7Ca652971c7d2e4d9ba6a4d149256f4
> > > > > > > >
> > 61b%7C0%7C0%7C637267855650797843&amp;sdata=r%2B0JIDapZocl6DD
> > > > > > > > A8wbNd4LV0CX6zEdDoQJhBpMoELw%3D&amp;reserved=0
> > > > > > > >
> > > > > > > > If there is no consensus on email, then I would like to add
> > > > > > > > this item to the next TB meeting.
> > > > > > >
> > > > > > > Ok, I'll add that to tomorrow meeting agenda.
> > > > > > > Konstantin
> > > > > > >
> > > > >
> > > > >

  reply	other threads:[~2020-06-08  9:40 UTC|newest]

Thread overview: 36+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-04-17  7:22 [dpdk-dev] " Nithin Dabilpuram
2020-04-17  7:22 ` [dpdk-dev] [PATCH 2/3] net/octeontx2: add tm packet marking cb Nithin Dabilpuram
2020-04-17  7:22 ` [dpdk-dev] [PATCH 3/3] net/octeontx2: add Tx packet marking offload support Nithin Dabilpuram
2020-05-01 11:18 ` [dpdk-dev] [PATCH 1/3] mbuf: add Tx offloads for packet marking Jerin Jacob
2020-05-04  8:06   ` Olivier Matz
2020-05-04  8:27     ` [dpdk-dev] [EXT] " Nithin Dabilpuram
2020-05-04  9:16       ` Olivier Matz
2020-05-04 10:04         ` Nithin Dabilpuram
2020-05-04 12:27           ` Olivier Matz
2020-05-05  6:19             ` Nithin Dabilpuram
2020-05-13 12:28               ` Nithin Dabilpuram
2020-05-14 20:29               ` Olivier Matz
2020-05-15 10:08                 ` Nithin Dabilpuram
2020-05-15 10:30                   ` Ananyev, Konstantin
2020-05-15 13:57                     ` Nithin Dabilpuram
2020-05-28 15:43                       ` Nithin Dabilpuram
2020-05-30 15:12                         ` Jerin Jacob
2020-06-02 10:53                           ` Ananyev, Konstantin
2020-06-02 14:25                             ` Nithin Dabilpuram
2020-06-03  8:28                               ` Olivier Matz
2020-06-03 10:44                                 ` Nithin Dabilpuram
2020-06-03 11:38                                   ` Olivier Matz
2020-06-03 12:52                                     ` Nithin Dabilpuram
2020-06-03 16:14                                       ` Slava Ovsiienko
2020-06-08  9:39                                         ` Nithin Dabilpuram [this message]
2020-06-03 14:56                                     ` Thomas Monjalon
2020-06-03 10:31                               ` Ananyev, Konstantin
2020-05-15 13:12                   ` Thomas Monjalon
2020-05-15 13:44                     ` Nithin Dabilpuram
2020-05-15 15:10                       ` Thomas Monjalon
2020-05-15 16:26                         ` Jerin Jacob
2020-05-15 16:52                           ` Thomas Monjalon
2020-05-15 17:00                             ` Jerin Jacob
2020-05-15 18:07                             ` Nithin Dabilpuram
2023-07-31 12:54 ` [dpdk-dev] " Thomas Monjalon
2023-08-14  8:11   ` Nithin Dabilpuram

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200608093948.GA11124@outlook.office365.com \
    --to=ndabilpuram@marvell.com \
    --cc=anatoly.burakov@intel.com \
    --cc=arybchenko@solarflare.com \
    --cc=cristian.dumitrescu@intel.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=jerinj@marvell.com \
    --cc=jerinjacobk@gmail.com \
    --cc=john.mcnamara@intel.com \
    --cc=kkanas@marvell.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=marko.kovacevic@intel.com \
    --cc=nithind1988@gmail.com \
    --cc=olivier.matz@6wind.com \
    --cc=orika@mellanox.com \
    --cc=techboard@dpdk.org \
    --cc=thomas@monjalon.net \
    --cc=viacheslavo@mellanox.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).