DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	"Konstantin Ananyev" <konstantin.v.ananyev@yandex.ru>,
	"Feifei Wang" <Feifei.Wang2@arm.com>
Cc: "nd" <nd@arm.com>, <dev@dpdk.org>,
	"Ruifeng Wang" <Ruifeng.Wang@arm.com>,
	<honnappanagarahalli@gmail.com>, "nd" <nd@arm.com>
Subject: RE: [PATCH v1 0/5] Direct re-arming of buffers on receive side
Date: Fri, 1 Jul 2022 22:28:50 +0200	[thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D8719F@smartserver.smartshare.dk> (raw)
In-Reply-To: <DBAPR08MB5814A63234E86623E628B36C98BD9@DBAPR08MB5814.eurprd08.prod.outlook.com>

> From: Honnappa Nagarahalli [mailto:Honnappa.Nagarahalli@arm.com]
> Sent: Friday, 1 July 2022 21.31
> 
> <snip>
> 
> > > > > >>>>
> > > > > >>>> 16/05/2022 07:10, Feifei Wang пишет:
> > > > > >>>>>
> > > > > >>>>>>> Currently, the transmit side frees the buffers into the
> > > lcore
> > > > > >>>>>>> cache and the receive side allocates buffers from the
> > > > > >>>>>>> lcore
> > > > > cache.
> > > > > >>>>>>> The transmit side typically frees 32 buffers resulting
> in
> > > > > >>>>>>> 32*8=256B of stores to lcore cache. The receive side
> > > allocates
> > > > > 32
> > > > > >>>>>>> buffers and stores them in the receive side software
> ring,
> > > > > >>>>>>> resulting in 32*8=256B of stores and 256B of load from
> the
> > > > > lcore cache.
> > > > > >>>>>>>
> > > > > >>>>>>> This patch proposes a mechanism to avoid freeing
> > > to/allocating
> > > > > >>>>>>> from the lcore cache. i.e. the receive side will free
> the
> > > > > buffers
> > > > > >>>>>>> from transmit side directly into it's software ring.
> This
> > > will
> > > > > >>>>>>> avoid the 256B of loads and stores introduced by the
> lcore
> > > > > cache.
> > > > > >>>>>>> It also frees up the cache lines used by the lcore
> cache.
> > > > > >>>>>>>
> > > > > >>>>>>> However, this solution poses several constraints:
> > > > > >>>>>>>
> > > > > >>>>>>> 1)The receive queue needs to know which transmit queue
> it
> > > > > should
> > > > > >>>>>>> take the buffers from. The application logic decides
> which
> > > > > >>>>>>> transmit port to use to send out the packets. In many
> use
> > > > > >>>>>>> cases the NIC might have a single port ([1], [2], [3]),
> in
> > > > > >>>>>>> which case
> > > > > a
> > > > > >>>>>>> given transmit queue is always mapped to a single
> receive
> > > > > >>>>>>> queue
> > > > > >>>>>>> (1:1 Rx queue: Tx queue). This is easy to configure.
> > > > > >>>>>>>
> > > > > >>>>>>> If the NIC has 2 ports (there are several references),
> > > > > >>>>>>> then
> > > we
> > > > > >>>>>>> will have
> > > > > >>>>>>> 1:2 (RX queue: TX queue) mapping which is still easy to
> > > > > configure.
> > > > > >>>>>>> However, if this is generalized to 'N' ports, the
> > > > > >>>>>>> configuration can be long. More over the PMD would have
> to
> > > > > >>>>>>> scan a list of transmit queues to pull the buffers
> from.
> > > > > >>>>>
> > > > > >>>>>> Just to re-iterate some generic concerns about this
> > > proposal:
> > > > > >>>>>>     - We effectively link RX and TX queues - when this
> > > feature
> > > > > is enabled,
> > > > > >>>>>>       user can't stop TX queue without stopping linked
> RX
> > > queue
> > > > > first.
> > > > > >>>>>>       Right now user is free to start/stop any queues at
> > > > > >>>>>> his
> > > > > will.
> > > > > >>>>>>       If that feature will allow to link queues from
> > > different
> > > > > ports,
> > > > > >>>>>>       then even ports will become dependent and user
> will
> > > have
> > > > > to pay extra
> > > > > >>>>>>       care when managing such ports.
> > > > > >>>>>
> > > > > >>>>> [Feifei] When direct rearm enabled, there are two path
> for
> > > > > >>>>> thread
> > > > > to
> > > > > >>>>> choose. If there are enough Tx freed buffers, Rx can put
> > > buffers
> > > > > >>>>> from Tx.
> > > > > >>>>> Otherwise, Rx will put buffers from mempool as usual.
> Thus,
> > > > > >>>>> users
> > > > > do
> > > > > >>>>> not need to pay much attention managing ports.
> > > > > >>>>
> > > > > >>>> What I am talking about: right now different port or
> > > > > >>>> different
> > > > > queues
> > > > > >>>> of the same port can be treated as independent entities:
> > > > > >>>> in general user is free to start/stop (and even
> reconfigure
> > > > > >>>> in
> > > > > some
> > > > > >>>> cases) one entity without need to stop other entity.
> > > > > >>>> I.E user can stop and re-configure TX queue while keep
> > > receiving
> > > > > >>>> packets from RX queue.
> > > > > >>>> With direct re-arm enabled, I think it wouldn't be
> possible
> > > any
> > > > > more:
> > > > > >>>> before stopping/reconfiguring TX queue user would have
> make
> > > sure
> > > > > that
> > > > > >>>> corresponding RX queue wouldn't be used by datapath.
> > > > > >>> I am trying to understand the problem better. For the TX
> queue
> > > to
> > > > > be stopped,
> > > > > >> the user must have blocked the data plane from accessing the
> TX
> > > > > queue.
> > > > > >>
> > > > > >> Surely it is user responsibility tnot to call tx_burst() for
> > > > > stopped/released queue.
> > > > > >> The problem is that while TX for that queue is stopped, RX
> for
> > > > > related queue still
> > > > > >> can continue.
> > > > > >> So rx_burst() will try to read/modify TX queue data, that
> might
> > > be
> > > > > already freed,
> > > > > >> or simultaneously modified by control path.
> > > > > > Understood, agree on the issue
> > > > > >
> > > > > >>
> > > > > >> Again, it all can be mitigated by carefully re-designing and
> > > > > modifying control and
> > > > > >> data-path inside user app - by doing extra checks and
> > > > > synchronizations, etc.
> > > > > >> But from practical point - I presume most of users simply
> would
> > > > > avoid using this
> > > > > >> feature due all potential problems it might cause.
> > > > > > That is subjective, it all depends on the performance
> > > improvements
> > > > > users see in their application.
> > > > > > IMO, the performance improvement seen with this patch is
> worth
> > > few
> > > > > changes.
> > > > >
> > > > > Yes, it is subjective till some extent, though my feeling that
> it
> > > > > might end-up being sort of synthetic improvement used only by
> some
> > > > > show-case benchmarks.
> > > >
> > > > I believe that one specific important use case has already been
> > > mentioned, so I
> > > > don't think this is a benchmark only feature.
> > > +1
> > >
> > > >
> > > > >  From my perspective, it would be much more plausible, if we
> can
> > > > > introduce some sort of generic improvement, that doesn't impose
> > > > > all these extra constraints and implications.
> > > > > Like one, discussed below in that thread with ZC mempool
> approach.
> > > > >
> > > >
> > > > Considering this feature from a high level perspective, I agree
> with
> > > Konstantin's
> > > > concerns, so I'll also support his views.
> > > We did hack the ZC mempool approach [1], level of improvement is
> > > pretty small compared with this patch.
> > >
> > > [1]
> http://patches.dpdk.org/project/dpdk/patch/20220613055136.1949784-
> > > 1-feifei.wang2@arm.com/
> > >
> > > >
> > > > If this patch is supposed to be a generic feature, please add
> > > > support
> > > for it in all
> > > > NIC PMDs, not just one. (Regardless if the feature is defined as
> 1:1
> > > mapping or
> > > > N:M mapping.) It is purely software, so it should be available
> for
> > > all PMDs, not
> > > > just your favorite hardware! Consider the "fast mbuf free"
> feature,
> > > which is
> > > > pure software; why is that feature not implemented in all PMDs?
> > > Agree, it is good to have it supported in all the drivers. We do
> not
> > > have a favorite hardware, just picked a PMD which we are more
> familiar
> > > with. We do plan to implement in other prominent PMDs.
> > >
> > > >
> > > > A secondary point I'm making here is that this specific feature
> will
> > > lead to an
> > > > enormous amount of copy-paste code, instead of a generic library
> > > function
> > > > easily available for all PMDs.
> > > Are you talking about the i40e driver code in specific? If yes,
> agree
> > > we should avoid copy-paste and we will look to reduce that.
> >
> > Yes, I am talking about the code that needs to be copied into all
> prominent
> > PMDs. Perhaps you can move the majority of it into a common
> directory, if not
> > in a generic library, so the modification per PMD becomes smaller. (I
> see the
> > same copy-paste issue with the "fast mbuf free" feature, if to be
> supported by
> > other than the i40e PMD.)
> The current abstraction does not allow for common code at this (lower)
> level across all the PMDs. If we look at "fast free", it is accessing
> the device private structure for the list of buffers to free. If it
> needs to be common code, this needs to be lifted up along with other
> dependent configuration thresholds etc.

Exactly. The "direct re-arm" feature has some of the same design properties as "fast free": They are both purely software features, and they are both embedded deeply in the PMD code.

I don't know if it is possible, but perhaps we could redesign the private buffer structures of the PMDs, so the first part of it is common across PMDs, and the following part is truly private. The common part would obviously hold the mbuf pointer. That way, a common library could manipulate the public part.

Or perhaps some other means could be provided for a common library to manipulate the common parts of a private structure, e.g. a common fast_free function could take a few parameters to work on private buffer structures: void * buffer_array, size_t element_size, size_t mbuf_offset_in_element, unsigned int num_elements.

> 
> >
> > Please note that I do not expect you to implement this feature in
> other PMDs
> > than you need. I was trying to make the point that implementing a
> software
> > feature in a PMD requires copy-pasting to other PMDs, which can
> require a big
> > effort; while implementing it in a library and calling the library
> from the PMDs
> > require a smaller effort per PMD. I intentionally phrased it somewhat
> > provokingly, and was lucky not to offend anyone. :-)

Unless we find a better solution, copy-paste code across PMDs seems to be the only way to implement such features. And I agree that they should not be blocked due to "code complexity" or "copy-paste" arguments, if they are impossible to implement in a more "correct" way.

The DPDK community should accept such contributions to the common code, and be grateful that they don't just go into private forks of DPDK!


  reply	other threads:[~2022-07-01 20:28 UTC|newest]

Thread overview: 145+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-20  8:16 Feifei Wang
2022-04-20  8:16 ` [PATCH v1 1/5] net/i40e: remove redundant Dtype initialization Feifei Wang
2022-04-20  8:16 ` [PATCH v1 2/5] net/i40e: enable direct rearm mode Feifei Wang
2022-05-11 22:28   ` Konstantin Ananyev
2022-04-20  8:16 ` [PATCH v1 3/5] ethdev: add API for " Feifei Wang
2022-04-20  9:59   ` Morten Brørup
2022-04-29  2:42     ` 回复: " Feifei Wang
2022-04-20 10:41   ` Andrew Rybchenko
2022-04-29  6:28     ` 回复: " Feifei Wang
2022-05-10 22:49       ` Honnappa Nagarahalli
2022-06-03 10:19         ` Andrew Rybchenko
2022-04-20 10:50   ` Jerin Jacob
2022-05-02  3:09     ` 回复: " Feifei Wang
2022-04-21 14:57   ` Stephen Hemminger
2022-04-29  6:35     ` 回复: " Feifei Wang
2022-04-20  8:16 ` [PATCH v1 4/5] net/i40e: add direct rearm mode internal API Feifei Wang
2022-05-11 22:31   ` Konstantin Ananyev
2022-04-20  8:16 ` [PATCH v1 5/5] examples/l3fwd: enable direct rearm mode Feifei Wang
2022-04-20 10:10   ` Morten Brørup
2022-04-21  2:35     ` Honnappa Nagarahalli
2022-04-21  6:40       ` Morten Brørup
2022-05-10 22:01         ` Honnappa Nagarahalli
2022-05-11  7:17           ` Morten Brørup
2022-05-11 22:33   ` Konstantin Ananyev
2022-05-27 11:28     ` Konstantin Ananyev
2022-05-31 17:14       ` Honnappa Nagarahalli
2022-06-03 10:32         ` Andrew Rybchenko
2022-06-06 11:27         ` Konstantin Ananyev
2022-06-29 21:25           ` Honnappa Nagarahalli
2022-05-11 23:00 ` [PATCH v1 0/5] Direct re-arming of buffers on receive side Konstantin Ananyev
     [not found] ` <20220516061012.618787-1-feifei.wang2@arm.com>
2022-05-24  1:25   ` Konstantin Ananyev
2022-05-24 12:40     ` Morten Brørup
2022-05-24 20:14     ` Honnappa Nagarahalli
2022-05-28 12:22       ` Konstantin Ananyev
2022-06-01  1:00         ` Honnappa Nagarahalli
2022-06-03 23:32           ` Konstantin Ananyev
2022-06-04  8:07             ` Morten Brørup
2022-06-29 21:58               ` Honnappa Nagarahalli
2022-06-30 15:21                 ` Morten Brørup
2022-07-01 19:30                   ` Honnappa Nagarahalli
2022-07-01 20:28                     ` Morten Brørup [this message]
2022-06-13  5:55     ` 回复: " Feifei Wang
2023-01-04  7:30 ` [PATCH v3 0/3] " Feifei Wang
2023-01-04  7:30   ` [PATCH v3 1/3] ethdev: enable direct rearm with separate API Feifei Wang
2023-01-04  8:21     ` Morten Brørup
2023-01-04  8:51       ` 回复: " Feifei Wang
2023-01-04 10:11         ` Morten Brørup
2023-02-24  8:55           ` 回复: " Feifei Wang
2023-03-06 12:49       ` Ferruh Yigit
2023-03-06 13:26         ` Morten Brørup
2023-03-06 14:53           ` 回复: " Feifei Wang
2023-03-06 15:02           ` Ferruh Yigit
2023-03-07  6:12             ` Honnappa Nagarahalli
2023-03-07 10:52               ` Konstantin Ananyev
2023-03-07 20:41               ` Ferruh Yigit
2023-03-22 14:43                 ` Honnappa Nagarahalli
2023-02-02 14:33     ` Konstantin Ananyev
2023-02-24  9:45       ` 回复: " Feifei Wang
2023-02-27 19:31         ` Konstantin Ananyev
2023-02-28  2:16           ` 回复: " Feifei Wang
2023-02-28  8:09           ` Morten Brørup
2023-03-01  7:34             ` 回复: " Feifei Wang
2023-01-04  7:30   ` [PATCH v3 2/3] net/i40e: " Feifei Wang
2023-02-02 14:37     ` Konstantin Ananyev
2023-02-24  9:50       ` 回复: " Feifei Wang
2023-02-27 19:35         ` Konstantin Ananyev
2023-02-28  2:15           ` 回复: " Feifei Wang
2023-03-07 11:01             ` Konstantin Ananyev
2023-03-14  6:07               ` 回复: " Feifei Wang
2023-03-19 16:11                 ` Konstantin Ananyev
2023-03-23 10:49                   ` Feifei Wang
2023-01-04  7:30   ` [PATCH v3 3/3] net/ixgbe: " Feifei Wang
2023-01-31  6:13   ` 回复: [PATCH v3 0/3] Direct re-arming of buffers on receive side Feifei Wang
2023-02-01  1:10     ` Konstantin Ananyev
2023-02-01  2:24       ` 回复: " Feifei Wang
2023-03-22 12:56   ` Morten Brørup
2023-03-22 13:41     ` Honnappa Nagarahalli
2023-03-22 14:04       ` Morten Brørup
2023-08-02  7:38 ` [PATCH v8 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-02  7:38   ` [PATCH v8 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-02  7:38   ` [PATCH v8 2/4] net/i40e: implement " Feifei Wang
2023-08-02  7:38   ` [PATCH v8 3/4] net/ixgbe: " Feifei Wang
2023-08-02  7:38   ` [PATCH v8 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-08-02  8:08 ` [PATCH v9 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-02  8:08   ` [PATCH v9 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-02  8:08   ` [PATCH v9 2/4] net/i40e: implement " Feifei Wang
2023-08-02  8:08   ` [PATCH v9 3/4] net/ixgbe: " Feifei Wang
2023-08-02  8:08   ` [PATCH v9 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-08-04  9:24 ` [PATCH v10 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-04  9:24   ` [PATCH v10 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-04  9:24   ` [PATCH v10 2/4] net/i40e: implement " Feifei Wang
2023-08-04  9:24   ` [PATCH v10 3/4] net/ixgbe: " Feifei Wang
2023-08-04  9:24   ` [PATCH v10 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-08-22  7:27 ` [PATCH v11 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-22  7:27   ` [PATCH v11 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-22 14:02     ` Stephen Hemminger
2023-08-24  3:16       ` Feifei Wang
2023-08-22 23:33     ` Konstantin Ananyev
2023-08-24  3:38     ` Feifei Wang
2023-08-22  7:27   ` [PATCH v11 2/4] net/i40e: implement " Feifei Wang
2023-08-22 23:43     ` Konstantin Ananyev
2023-08-24  6:10     ` Feifei Wang
2023-08-31 17:24       ` Konstantin Ananyev
2023-08-31 23:49         ` Konstantin Ananyev
2023-09-01 12:22         ` Feifei Wang
2023-09-01 14:22           ` Konstantin Ananyev
2023-09-04  6:59             ` Feifei Wang
2023-09-04  7:49               ` Konstantin Ananyev
2023-09-04  9:24                 ` Feifei Wang
2023-09-04 10:21                   ` Konstantin Ananyev
2023-09-05  3:11                     ` Feifei Wang
2023-09-22 14:58                       ` Feifei Wang
2023-09-22 15:46                         ` Feifei Wang
2023-09-22 16:40                           ` Konstantin Ananyev
2023-09-23  5:52                             ` Feifei Wang
2023-09-23 20:40                               ` Konstantin Ananyev
2023-09-25  3:26                               ` Feifei Wang
2023-08-22  7:27   ` [PATCH v11 3/4] net/ixgbe: " Feifei Wang
2023-08-22  7:27   ` [PATCH v11 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-08-22  7:33   ` [PATCH v11 0/4] Recycle mbufs from Tx queue into Rx queue Feifei Wang
2023-08-22 13:59   ` Stephen Hemminger
2023-08-24  3:11     ` Feifei Wang
2023-08-24  7:36 ` [PATCH v12 " Feifei Wang
2023-08-24  7:36   ` [PATCH v12 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-08-31  9:16     ` Feifei Wang
2023-09-20 13:10     ` Ferruh Yigit
2023-08-24  7:36   ` [PATCH v12 2/4] net/i40e: implement " Feifei Wang
2023-08-24  7:36   ` [PATCH v12 3/4] net/ixgbe: " Feifei Wang
2023-08-24  7:36   ` [PATCH v12 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-09-20 13:11     ` Ferruh Yigit
2023-09-20 13:12   ` [PATCH v12 0/4] Recycle mbufs from Tx queue into Rx queue Ferruh Yigit
2023-09-22 15:30     ` Ferruh Yigit
2023-09-25  3:19 ` [PATCH v13 " Feifei Wang
2023-09-25  3:19   ` [PATCH v13 1/4] ethdev: add API for mbufs recycle mode Feifei Wang
2023-09-25  4:40     ` Ajit Khaparde
2023-09-25  3:19   ` [PATCH v13 2/4] net/i40e: implement " Feifei Wang
2023-09-26  8:26     ` Ferruh Yigit
2023-09-26  8:56       ` Konstantin Ananyev
2023-09-26 13:34     ` Konstantin Ananyev
2023-09-25  3:19   ` [PATCH v13 3/4] net/ixgbe: " Feifei Wang
2023-09-26 13:30     ` Konstantin Ananyev
2023-09-25  3:19   ` [PATCH v13 4/4] app/testpmd: add recycle mbufs engine Feifei Wang
2023-09-26 13:30     ` Konstantin Ananyev
2023-09-26 16:38       ` Ajit Khaparde
2023-09-27 17:24   ` [PATCH v13 0/4] Recycle mbufs from Tx queue into Rx queue Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98CBD80474FA8B44BF855DF32C47DC35D8719F@smartserver.smartshare.dk \
    --to=mb@smartsharesystems.com \
    --cc=Feifei.Wang2@arm.com \
    --cc=Honnappa.Nagarahalli@arm.com \
    --cc=Ruifeng.Wang@arm.com \
    --cc=dev@dpdk.org \
    --cc=honnappanagarahalli@gmail.com \
    --cc=konstantin.v.ananyev@yandex.ru \
    --cc=nd@arm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).