DPDK patches and discussions
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	Stephen Hemminger <stephen@networkplumber.org>,
	Jerin Jacob <jerinjacobk@gmail.com>
Cc: Kathleen Capella <Kathleen.Capella@arm.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"dev@dpdk.org" <dev@dpdk.org>,
	Dharmik Thakkar <Dharmik.Thakkar@arm.com>,
	Ruifeng Wang <Ruifeng.Wang@arm.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
	Stephen Hemminger <sthemmin@microsoft.com>, nd <nd@arm.com>,
	nd <nd@arm.com>
Subject: Re: [dpdk-dev] L3fwd mode in testpmd
Date: Wed, 28 Apr 2021 10:59:40 +0000
Message-ID: <DM6PR11MB44919BA0829F3B36C10419389A409@DM6PR11MB4491.namprd11.prod.outlook.com>
In-Reply-To: <DBAPR08MB5814C1CF33A897A0538DE72498419@DBAPR08MB5814.eurprd08.prod.outlook.com>


 
> > > >>>>>>>>>>>> On Thu, Mar 11, 2021 at 12:01 AM Honnappa Nagarahalli
> > > >>>>>>>>>>>> <Honnappa.Nagarahalli@arm.com> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Hello,
> > > >>>>>>>>>>>>> Performance of the L3fwd example application is one of the
> > > >>>>>>>>>>>>> key benchmarks in DPDK. However, the application does not
> > > >>>>>>>>>>>>> have many debugging statistics to understand the
> > > >>>>>>>>>>>>> performance issues. We have added L3fwd as another
> > > >>>>>>>>>>>>> mode/stream to testpmd, which provides enough statistics
> > > >>>>>>>>>>>>> at various levels. This has allowed us to debug the
> > > >>>>>>>>>>>>> performance issues effectively.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> There is more work to be done to get it to an upstreamable
> > > >>>>>>>>>>>>> state. I am wondering if such a patch is helpful for
> > > >>>>>>>>>>>>> others and if the community would be interested in taking
> > > >>>>>>>>>>>>> a look. Please let me know what you think.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> We are using app/proc-info/ to attach and analyze the
> > > >>>>>>>>>>>> performance. That helps to analyze the unmodified
> > > >>>>>>>>>>>> application. I think, if something is missing in the
> > > >>>>>>>>>>>> proc-info app, in my opinion it is better to enhance
> > > >>>>>>>>>>>> proc-info so that it can help other third-party
> > > >>>>>>>>>>>> applications.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Just my 2c.
> > > >>>>>>>>>>> Thanks Jerin. We will explore that.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I agree it is dangerous to rely too much on testpmd for
> > > >>>>>>>>>> everything.
> > > >>>>>>>>>> Please tell us what in testpmd could be useful out of it.
> > > >>>>>>>>>>
> > > >>>>>>>>> Things that are very helpful in testpmd are:
> > > >>>>>>>>> 1) HW statistics from the NIC
> > > >>>>>>>>> 2) Forwarding stats
> > > >>>>>>>>> 3) Burst stats (indication of headroom availability)
> > > >>>>>>>>> 4) Easy to set parameters like RX and TX queue depths (among
> > > >>>>>>>>>    others) without having to recompile.
> > > >>>>>>>>
> > > >>>>>>>> [Kathleen Capella]
> > > >>>>>>>> Thank you for the suggestion of app/proc-info. I've tried it
> > > >>>>>>>> out with l3fwd and see that it does have the HW stats from
> > > >>>>>>>> the NIC and the forwarding stats.
> > > >>>>>>>> However, it does not have the burst stats testpmd offers, nor
> > > >>>>>>>> the
> > > >>>>>>>
> > > >>>>>>> One option to get this level of debugging would be to:
> > > >>>>>>> - Create a memzone in the primary process
> > > >>>>>>> - Have the application under test update the stats in the
> > > >>>>>>>   memzone based on the code flow
> > > >>>>>>> - Have proc-info read the counters updated by the application
> > > >>>>>>>   under test, using the memzone object obtained through
> > > >>>>>>>   rte_memzone_lookup()
> > > >>>>>> Agreed. Currently, using app/proc-info does not provide this
> > > >>>>>> ability. We cannot add this capability to app/proc-info as
> > > >>>>>> these stats would be specific to the L3fwd application.
> > > >>>>>
> > > >>>>> I meant creating generic counter read/write infra via memzone,
> > > >>>>> so that it is not l3fwd specific.
> > > >>>> Currently, app/proc-info is able to print the stats as they are
> > > >>>> standardized via the API. But for statistics that are generated
> > > >>>> in the application, they are very specific to that application.
> > > >>>> For ex: burst stats in testpmd are very specific to it and
> > > >>>> another application might implement the same in a very different
> > > >>>> manner.
> > > >>>>
> > > >>>> It needs to be something like this: app/proc-info is just a dumb
> > > >>>> display utility, and the application does all the heavy lifting
> > > >>>> of copying the exact display strings to the memory.
> > > >>>
> > > >>> Yes.
> > > >>>
> > > >>>>
> > > >>>>>>>
> > > >>>>>>> Another approach would be to use rte_trace()[1] for
> > > >>>>>>> debugging/tracing by adding tracepoints in l3fwd for such events.
> > > >>>>>>> It has a timestamp, and the trace format is the open source
> > > >>>>>>> Common Trace Format (CTF), so that we can use post-processing
> > > >>>>>>> tools to analyze it.
> > > >>>>>>> [1]
> > > >>>>>>> https://doc.dpdk.org/guides/prog_guide/trace_lib.html
> > > >>>>>> This is good for analyzing an incident. I think it is an
> > > >>>>>> overhead for development purposes.
> > > >>>>>
> > > >>>>> Consider the case where one wants to add burst stats: one can
> > > >>>>> add a stats increment under RTE_TRACE_POINT_FP, and it will be
> > > >>>>> emitted whenever the code flows through that path. The set of
> > > >>>>> events can be viewed in a trace viewer[1]. Would that be enough?
> > > >>>>> Adding traces to l3fwd can be upstreamed as it is useful for
> > > >>>>> others for debugging.
> > > >>>>>
> > > >>>>> [1]
> > > >>>>> https://github.com/jerinjacobk/share/blob/master/dpdk_trace.JPG
> > > >>>> This needs post processing of the trace info to derive the
> > > >>>> information, is it correct? For ex: for burst stats, there will
> > > >>>> be several traces generated collecting the number of packets
> > > >>>> returned by rte_eth_rx_burst which needs to be post processed.
> > > >>>
> > > >>> Or you can have an additional variable to accumulate it.
> > > >>>
> > > >>>> Also, adding traces is equivalent to adding statistics in L3fwd.
> > > >>>
> > > >>> Yes.
> > > >>>
> > > >>> If the sole purpose is stats, then it is better to add stats in
> > > >>> l3fwd without performance impact. I thought it was something else.
> > > >>>
> > > >>>>
> > > >>>>>>>
> > > >>>>>>>> ability to easily change parameters without having to
> > > >>>>>>>> recompile, which helps reduce debugging time significantly.
> > > >>>> We will not be able to fix this above issue.
> > > >>>
> > > >>> It depends on what you want to debug. Trace can be disabled at
> > > >>> runtime.
> > > >>
> > > >>
> > > >> DPDK has existing APIs for application metrics but they are rarely used.
> > > >>
> > > >> Why not implement rte_metrics in l3fwd and proc-info?
> > > > This discussion has ended up as a stats discussion. But, we also
> > > > need to be able to change the configurable parameters easily.
> > > > If we implement the stats and ability to change the configurable
> > > > parameters, then it is essentially bringing in some of the
> > > > capabilities from testpmd to the sample application. I think that
> > > > will result in lot more code in the sample application and will
> > > > make it complicated.
> > > >
> > > > Instead, our proposal is to take L3fwd to testpmd and use all the
> > > > infra code that testpmd provides. We see that this approach results
> > > > in less code added to DPDK overall.
> > > >
> > >
> > > Agree that it may help testing to have l3fwd support on the testpmd.
> > >
> > > Two concerns,
> > > 1) Testpmd is already too complex.
> > > 2) Code duplication.
> > >
> > > For 1), if l3fwd can be implemented in testpmd as a new, independent
> > > forwarding mode, without touching the rest of testpmd, I think it can
> > > be OK.
> Yes, this is what we have done. It is a new forwarding mode.
> We could remove some forwarding modes from testpmd. For ex: macfwd, macswap seem very similar to iofwd mode.

Not really, iofwd doesn't touch packet data at all, while macfwd and macswap change L2 headers.
In fact I found all of them quite helpful (just for different cases), so please keep them.
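
To make the distinction concrete, here is a minimal, self-contained sketch of
what each mode does to a packet. It is not the testpmd code: struct eth_hdr is
a simplified stand-in for DPDK's Ethernet header, and the function names
fwd_io, fwd_macswap and fwd_mac are hypothetical, chosen only for this
illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN 6

/* Simplified Ethernet header (assumption: stand-in for the DPDK type). */
struct eth_hdr {
	uint8_t dst[MAC_LEN];
	uint8_t src[MAC_LEN];
	uint16_t ether_type;
};

/* io-style forwarding: the packet data is neither read nor written. */
static void fwd_io(struct eth_hdr *hdr)
{
	(void)hdr;
}

/* macswap-style forwarding: exchange source and destination MAC. */
static void fwd_macswap(struct eth_hdr *hdr)
{
	uint8_t tmp[MAC_LEN];

	assert(hdr != NULL);
	memcpy(tmp, hdr->dst, MAC_LEN);
	memcpy(hdr->dst, hdr->src, MAC_LEN);
	memcpy(hdr->src, tmp, MAC_LEN);
}

/*
 * mac-style forwarding: overwrite both addresses, destination with the
 * peer's MAC and source with the MAC of the transmitting port.
 */
static void fwd_mac(struct eth_hdr *hdr, const uint8_t *peer_mac,
		    const uint8_t *port_mac)
{
	assert(hdr != NULL && peer_mac != NULL && port_mac != NULL);
	memcpy(hdr->dst, peer_mac, MAC_LEN);
	memcpy(hdr->src, port_mac, MAC_LEN);
}
```

The io variant never loads the packet's cache line, while the two MAC variants
both read and write the first bytes of the frame, which is why they exercise
different paths.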

> 
> >
> > In fact, l3fwd is also quite big and complex:
> > $ wc -l examples/l3fwd/*.[h,c] |grep total
> >   6969 total
> >
> > Plus it will introduce extra dependencies (fib, lpm, hash, might-be acl?) I am
> > not sure it is a good idea to pull all these complexities into test-pmd.
> I do not suggest pulling all these in. In our case, I see that the ask is only on LPM. I am open to hearing what others see as the requirement.

Ok, but the l3fwd forwarding model is quite different from the current testpmd one
(egress queue selection, TX packet buffering, etc.).
I suppose you'll need to pull all that in from l3fwd too?
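
As a rough illustration of what "TX packet buffering" means here, the sketch
below accumulates packets per egress port and only hands them to the driver
once a full burst is available. It is a simplified model, not the l3fwd code:
TX_BURST, struct tx_buffer, tx_enqueue, tx_flush and the stub_tx_burst
callback (a stand-in for the real transmit call) are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TX_BURST 32 /* assumed burst size for this sketch */

/* Transmit callback type; returns the number of packets accepted. */
typedef uint16_t (*tx_burst_fn)(uint16_t port, void **pkts, uint16_t n);

/* Per-port staging buffer for packets waiting to be sent. */
struct tx_buffer {
	uint16_t len;
	void *pkts[TX_BURST];
};

/* Stub driver used only for illustration: counts packets, accepts all. */
static uint32_t stub_tx_total;

static uint16_t stub_tx_burst(uint16_t port, void **pkts, uint16_t n)
{
	(void)port;
	(void)pkts;
	stub_tx_total += n;
	return n;
}

/* Send whatever is buffered. Returns the number of packets transmitted. */
static uint16_t tx_flush(struct tx_buffer *buf, uint16_t port, tx_burst_fn tx)
{
	uint16_t sent;

	assert(buf != NULL && tx != NULL);
	if (buf->len == 0)
		return 0;
	sent = tx(port, buf->pkts, buf->len);
	/* A real application would free the (len - sent) unsent packets. */
	buf->len = 0;
	return sent;
}

/* Buffer one packet; flush only when a full burst has accumulated. */
static uint16_t tx_enqueue(struct tx_buffer *buf, uint16_t port, void *pkt,
			   tx_burst_fn tx)
{
	assert(buf != NULL && buf->len < TX_BURST);
	buf->pkts[buf->len++] = pkt;
	if (buf->len < TX_BURST)
		return 0;
	return tx_flush(buf, port, tx);
}
```

A forwarding loop built this way also needs a periodic drain (calling tx_flush
on a timer) so that packets do not sit in a partially filled buffer under low
load.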

> 
> > I can't imagine that the l3fwd app needs the ability to configure each
> > and every PMD parameter.
> > From my experience, in l3fwd most of the cycles are spent not in the PMD
> > itself, but in actual packet processing: header parsing and checking,
> > classification, routing table lookup, etc.
> During our work, we had to experiment with burst size, rx/tx queue depths along with other PMD specific configuration parameters. The
> packet processing code remains the same and there is not much to optimize.

I think burst-size and rx/tx queue size can be added into l3fwd as new config parameters.
Doesn't look like a major issue to me.
PMD specific parameters could be a problem... anything particular you plan to use?
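
For what it's worth, a minimal sketch of how such parameters could be parsed is
below. The option names (--burst, --rx-queue-size, --tx-queue-size), struct
fwd_config and parse_fwd_args are assumptions made for this example, not
existing l3fwd options at the time of this mail; only getopt_long() and
strtoul() are real library calls.

```c
#include <assert.h>
#include <errno.h>
#include <getopt.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical run-time configuration for the forwarding loop. */
struct fwd_config {
	uint16_t burst_size; /* packets per RX/TX burst */
	uint16_t nb_rxd;     /* RX descriptors per queue */
	uint16_t nb_txd;     /* TX descriptors per queue */
};

/* Parse a decimal value in [1, UINT16_MAX]; *out is untouched on error. */
static int parse_u16(const char *s, uint16_t *out)
{
	char *end = NULL;
	unsigned long v;

	errno = 0;
	v = strtoul(s, &end, 10);
	if (errno != 0 || end == s || *end != '\0' || v == 0 || v > UINT16_MAX)
		return -1;
	*out = (uint16_t)v;
	return 0;
}

/* Returns 0 on success, -1 on an unknown option or an invalid value. */
static int parse_fwd_args(int argc, char **argv, struct fwd_config *cfg)
{
	static const struct option opts[] = {
		{"burst",         required_argument, NULL, 'b'},
		{"rx-queue-size", required_argument, NULL, 'r'},
		{"tx-queue-size", required_argument, NULL, 't'},
		{NULL, 0, NULL, 0}
	};
	int opt;

	assert(cfg != NULL);
	optind = 1; /* restart scanning so the function can be called again */
	while ((opt = getopt_long(argc, argv, "", opts, NULL)) != -1) {
		uint16_t *dst;

		switch (opt) {
		case 'b': dst = &cfg->burst_size; break;
		case 'r': dst = &cfg->nb_rxd; break;
		case 't': dst = &cfg->nb_txd; break;
		default:  return -1;
		}
		if (parse_u16(optarg, dst) != 0)
			return -1;
	}
	return 0;
}
```

The caller fills struct fwd_config with its compile-time defaults first, so any
option not given on the command line keeps its default value.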
 
> >
> > > Not sure how to address 2), also lets say we want to add new feature
> > > to l3fwd, where it should go, to the sample or to the testpmd?
> L3fwd example will remain as the example. We have to duplicate the code into testpmd. If L3fwd example is changed, it needs to be
> changed in testpmd as well.

Usually code duplication is not a good sign.
I understand that sometimes it is unavoidable, but why do we have to do it here?

