From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>,
"Yigit, Ferruh" <ferruh.yigit@intel.com>,
Stephen Hemminger <stephen@networkplumber.org>,
Jerin Jacob <jerinjacobk@gmail.com>
Cc: Kathleen Capella <Kathleen.Capella@arm.com>,
"thomas@monjalon.net" <thomas@monjalon.net>,
"dev@dpdk.org" <dev@dpdk.org>,
Dharmik Thakkar <Dharmik.Thakkar@arm.com>,
Ruifeng Wang <Ruifeng.Wang@arm.com>,
"david.marchand@redhat.com" <david.marchand@redhat.com>,
"Richardson, Bruce" <bruce.richardson@intel.com>,
"jerinj@marvell.com" <jerinj@marvell.com>,
"hemant.agrawal@nxp.com" <hemant.agrawal@nxp.com>,
Stephen Hemminger <sthemmin@microsoft.com>, nd <nd@arm.com>
Subject: Re: [dpdk-dev] L3fwd mode in testpmd
Date: Wed, 28 Apr 2021 10:59:40 +0000
Message-ID: <DM6PR11MB44919BA0829F3B36C10419389A409@DM6PR11MB4491.namprd11.prod.outlook.com>
In-Reply-To: <DBAPR08MB5814C1CF33A897A0538DE72498419@DBAPR08MB5814.eurprd08.prod.outlook.com>
> > > >>>>>>>>>>>> On Thu, Mar 11, 2021 at 12:01 AM Honnappa Nagarahalli
> > > >>>>>>>>>>>> <Honnappa.Nagarahalli@arm.com> wrote:
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> Hello,
> > > >>>>>>>>>>>>> Performance of L3fwd example application is one of the key
> > > >>>>>>>>>>>>> benchmarks in DPDK. However, the application does not have
> > > >>>>>>>>>>>>> many debugging statistics to understand the performance
> > > >>>>>>>>>>>>> issues. We have added L3fwd as another mode/stream to
> > > >>>>>>>>>>>>> testpmd which provides enough statistics at various levels.
> > > >>>>>>>>>>>>> This has allowed us to debug the performance issues
> > > >>>>>>>>>>>>> effectively.
> > > >>>>>>>>>>>>>
> > > >>>>>>>>>>>>> There is more work to be done to get it to upstreamable
> > > >>>>>>>>>>>>> state. I am wondering if such a patch is helpful for others
> > > >>>>>>>>>>>>> and if the community would be interested in taking a look.
> > > >>>>>>>>>>>>> Please let me know what you think.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> We are using app/proc-info/ to attach and analyze the
> > > >>>>>>>>>>>> performance. That helps to analyze the unmodified
> > > >>>>>>>>>>>> application. I think, if something is missing in proc-info
> > > >>>>>>>>>>>> app, in my opinion it is better to enhance proc-info so that
> > > >>>>>>>>>>>> it can help other third-party applications.
> > > >>>>>>>>>>>>
> > > >>>>>>>>>>>> Just my 2c.
> > > >>>>>>>>>>> Thanks Jerin. We will explore that.
> > > >>>>>>>>>>
> > > >>>>>>>>>> I agree it is dangerous to rely too much on testpmd for
> > > >>>>>>>>>> everything.
> > > >>>>>>>>>> Please tell us what in testpmd could be useful out of it.
> > > >>>>>>>>>>
> > > >>>>>>>>> Things that are very helpful in testpmd are:
> > > >>>>>>>>> 1) HW statistics from the NIC
> > > >>>>>>>>> 2) Forwarding stats
> > > >>>>>>>>> 3) Burst stats (indication of headroom availability)
> > > >>>>>>>>> 4) Easy to set parameters like RX and TX queue depths (among
> > > >>>>>>>>> others) without having to recompile.
> > > >>>>>>>>
> > > >>>>>>>> [Kathleen Capella]
> > > >>>>>>>> Thank you for the suggestion of app/proc-info. I've tried it
> > > >>>>>>>> out with l3fwd and see that it does have the HW stats from
> > > >>>>>>>> the NIC and the forwarding stats.
> > > >>>>>>>> However, it does not have the burst stats testpmd offers, nor
> > > >>>>>>>> the
> > > >>>>>>>
> > > >>>>>>> One option to get this level of debugging would be to:
> > > >>>>>>> - Create a memzone in the primary process
> > > >>>>>>> - Have the application under test update the stats in the
> > > >>>>>>> memzone based on the code flow
> > > >>>>>>> - Have proc-info read the counters updated by the application
> > > >>>>>>> under test, using the memzone object obtained through
> > > >>>>>>> rte_memzone_lookup()
> > > >>>>>> Agreed. Currently, using app/proc-info does not provide this
> > > >>>>>> ability. We cannot add this capability to app/proc-info as
> > > >>>>>> these stats would be specific to L3fwd application.
> > > >>>>>
> > > >>>>> I meant creating generic counter-read/write infra via memzone,
> > > >>>>> so that it is not l3fwd-specific.
> > > >>>> Currently, app/proc-info is able to print the stats as they are
> > > >>>> standardized via the API. But for statistics that are generated
> > > >>>> in the application, they are very specific to that application.
> > > >>>> For ex: burst stats in testpmd are very specific to it and
> > > >>>> another application might implement the same in a very different
> > > >>>> manner.
> > > >>>>
> > > >>>> It needs to be something like this: app/proc-info is just a dumb
> > > >>>> display utility, and the application has to do all the heavy
> > > >>>> lifting of copying the exact display strings to the memory.
> > > >>>
> > > >>> Yes.
> > > >>>
> > > >>>>
> > > >>>>>>>
> > > >>>>>>> Another approach will be using rte_trace()[1] for
> > > >>>>>>> debugging/tracing by adding tracepoints in l3fwd for such events.
> > > >>>>>>> It has a timestamp and the trace format is an open-source trace
> > > >>>>>>> format (CTF, Common Trace Format), so that we can use
> > > >>>>>>> post-processing tools to analyze.
> > > >>>>>>> [1]
> > > >>>>>>> https://doc.dpdk.org/guides/prog_guide/trace_lib.html
> > > >>>>>> This is good for analyzing an incident. I think it is an
> > > >>>>>> overhead for development purposes.
> > > >>>>>
> > > >>>>> Consider if one wants to add burst stats, one can add stats
> > > >>>>> increment under RTE_TRACE_POINT_FP, it will be emitted whenever
> > > >>>>> code flows through that path. The set of events can be viewed in
> > > >>>>> trace viewer[1]. Would that be enough?
> > > >>>>> Adding traces to l3fwd can be upstreamed as it is useful for
> > > >>>>> others for debugging.
> > > >>>>>
> > > >>>>> [1]
> > > >>>>> https://github.com/jerinjacobk/share/blob/master/dpdk_trace.JPG
> > > >>>> This needs post processing of the trace info to derive the
> > > >>>> information, is it correct? For ex: for burst stats, there will
> > > >>>> be several traces generated collecting the number of packets
> > > >>>> returned by rte_eth_rx_burst which needs to be post processed.
> > > >>>
> > > >>> Or you can have an additional variable to accumulate it.
> > > >>>
> > > >>>> Also, adding traces is equivalent to adding statistics in L3fwd.
> > > >>>
> > > >>> Yes.
> > > >>>
> > > >>> If the sole purpose is only stats, then it is better to add stats in
> > > >>> l3fwd without performance impact. I thought it was something else.
> > > >>>
> > > >>>>
> > > >>>>>>>
> > > >>>>>>>> ability to easily change parameters without having to
> > > >>>>>>>> recompile, which helps reduce debugging time significantly.
> > > >>>> We will not be able to fix this above issue.
> > > >>>
> > > >>> It depends on what you want to debug. Trace can be disabled at
> > > >>> runtime.
> > > >>
> > > >>
> > > >> DPDK has existing APIs for application metrics but they are rarely used.
> > > >>
> > > >> Why not implement rte_metrics in l3fwd and proc-info?
> > > > This discussion has ended up as a stats discussion. But, we also need
> > > > to be able to change the configurable parameters easily.
> > > > If we implement the stats and ability to change the configurable
> > > > parameters, then it is essentially bringing in some of the
> > > > capabilities from testpmd to the sample application. I think that will
> > > > result in lot more code in the sample application and will make it
> > > > complicated.
> > > >
> > > > Instead our proposal is to take L3fwd to testpmd and use all the
> > > > infra code that testpmd provides. We see that this approach results
> > > > in less amount of code added to DPDK overall.
> > > >
> > >
> > > Agree that it may help testing to have l3fwd support on the testpmd.
> > >
> > > Two concerns,
> > > 1) Testpmd is already too complex.
> > > 2) Code duplication.
> > >
> > > For 1), if the l3fwd can be implemented in testpmd as new, independent
> > > forwarding mode, without touching rest of the testpmd, I think it can be
> > > OK.
> Yes, this is what we have done. It is a new forwarding mode.
> We could remove some forwarding modes from testpmd. For ex: macfwd, macswap seem very similar to iofwd mode.
Not really: iofwd doesn't touch packet data at all, while macfwd and macswap change the L2 headers.
In fact, I have found all of them quite helpful (just for different cases), so please keep them.
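
To illustrate the difference described above, here is a minimal standalone sketch. It is not testpmd code: struct eth_hdr, io_forward() and mac_swap() are hypothetical names, and the real forwarding engines operate on mbufs rather than on a bare header. The only point is that an iofwd-style path never dereferences packet data, while a macswap-style path rewrites the L2 addresses.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETH_ALEN 6

/* Hypothetical stand-in for an Ethernet header (not DPDK's own type). */
struct eth_hdr {
	uint8_t dst[SKETCH_ETH_ALEN];
	uint8_t src[SKETCH_ETH_ALEN];
	uint16_t ether_type;
};

/* iofwd-style handling: the packet is passed on without reading or
 * writing any packet data. */
static inline void io_forward(struct eth_hdr *hdr)
{
	(void)hdr; /* intentionally untouched */
}

/* macswap-style handling: the L2 header is modified by exchanging the
 * destination and source MAC addresses. */
static inline void mac_swap(struct eth_hdr *hdr)
{
	uint8_t tmp[SKETCH_ETH_ALEN];

	assert(hdr != NULL);
	memcpy(tmp, hdr->dst, SKETCH_ETH_ALEN);
	memcpy(hdr->dst, hdr->src, SKETCH_ETH_ALEN);
	memcpy(hdr->src, tmp, SKETCH_ETH_ALEN);
}
```

Because the second path pulls the header into cache and writes to it, the two modes exercise different parts of the system, which is one reason they can be useful for different test cases.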
>
> >
> > In fact, l3fwd is also quite big and complex:
> > $ wc -l examples/l3fwd/*.[h,c] |grep total
> > 6969 total
> >
> > Plus it will introduce extra dependencies (fib, lpm, hash, might-be acl?) I am
> > not sure it is a good idea to pull all these complexities into test-pmd.
> I do not suggest pulling all these in. In our case, I see that the ask is only on LPM. I am open to hearing what others see as the requirement.
OK, but the l3fwd forwarding model is quite different from the current testpmd one
(egress queue selection, TX packet buffering, etc.).
I suppose you'll need to pull all that in from l3fwd too?
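
As a rough illustration of the TX packet buffering mentioned above, the following standalone sketch accumulates packets for one egress port and transmits them only once a full burst is collected. All names (struct tx_buffer, tx_buffer_add(), tx_buffer_flush(), count_send()) and the burst size of 32 are assumptions made for the example; it is not the l3fwd implementation, and the transmit callback merely stands in for a real TX burst call.

```c
#include <assert.h>
#include <stddef.h>

#define TX_BURST 32 /* assumed burst size for this sketch */

/* Transmit callback: returns how many of the n packets were sent. */
typedef unsigned (*tx_fn)(void **pkts, unsigned n, void *ctx);

/* Per-egress-port buffer of packets waiting to be transmitted. */
struct tx_buffer {
	void *pkts[TX_BURST];
	unsigned len;
};

/* Stand-in transmit function: only counts packets into *ctx. */
static unsigned count_send(void **pkts, unsigned n, void *ctx)
{
	(void)pkts;
	*(unsigned *)ctx += n;
	return n;
}

/* Send whatever is buffered; a real application would also have to
 * free or retry packets that the device did not accept. */
static unsigned tx_buffer_flush(struct tx_buffer *b, tx_fn send, void *ctx)
{
	unsigned sent;

	assert(b != NULL && send != NULL);
	if (b->len == 0)
		return 0;
	sent = send(b->pkts, b->len, ctx);
	b->len = 0;
	return sent;
}

/* Queue one packet; transmit only when a full burst has accumulated. */
static unsigned tx_buffer_add(struct tx_buffer *b, void *pkt,
			      tx_fn send, void *ctx)
{
	assert(b != NULL);
	b->pkts[b->len++] = pkt;
	if (b->len < TX_BURST)
		return 0;
	return tx_buffer_flush(b, send, ctx);
}
```

A forwarding loop built this way also needs a periodic drain (calling the flush function on a timer) so that packets do not sit in a partially filled buffer under low load.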
>
> > I can't imagine that l3fwd app need ability to configure each and every PMD
> > parameter.
> > From my experience in l3fwd most of cycles are spent not in PMD itself, but
> > in actual packet processing: header parsing and checking, classification,
> > routing table lookup, etc.
> During our work, we had to experiment with burst size, rx/tx queue depths along with other PMD specific configuration parameters. The
> packet processing code remains the same and there is not much to optimize.
I think burst size and rx/tx queue size can be added to l3fwd as new config parameters.
That doesn't look like a major issue to me.
PMD-specific parameters could be a problem... anything in particular you plan to use?
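
A minimal sketch of what such config parameters could look like, using getopt_long(). The option names (--burst, --rxd, --txd), the struct fwd_config type and the upper limits are all assumptions chosen for the example, not existing l3fwd options.

```c
#include <assert.h>
#include <getopt.h>
#include <stdlib.h>

/* Hypothetical runtime configuration for a forwarding application. */
struct fwd_config {
	unsigned burst;  /* packets per RX/TX burst */
	unsigned nb_rxd; /* RX descriptors per queue */
	unsigned nb_txd; /* TX descriptors per queue */
};

/* Parse a decimal value in the range [1, max]; returns 0 on success. */
static int parse_uint(const char *s, unsigned max, unsigned *out)
{
	char *end;
	unsigned long v = strtoul(s, &end, 10);

	if (*s == '\0' || *end != '\0' || v == 0 || v > max)
		return -1;
	*out = (unsigned)v;
	return 0;
}

/* Fill cfg from the command line; fields keep their current values
 * when the matching option is absent. Returns 0 on success. */
static int parse_fwd_args(int argc, char **argv, struct fwd_config *cfg)
{
	static const struct option opts[] = {
		{"burst", required_argument, NULL, 'b'},
		{"rxd",   required_argument, NULL, 'r'},
		{"txd",   required_argument, NULL, 't'},
		{NULL, 0, NULL, 0}
	};
	int opt;

	assert(cfg != NULL);
	optind = 1; /* restart scanning so the function can be called again */
	while ((opt = getopt_long(argc, argv, "", opts, NULL)) != -1) {
		switch (opt) {
		case 'b': /* assumed limit of 512 packets per burst */
			if (parse_uint(optarg, 512, &cfg->burst) != 0)
				return -1;
			break;
		case 'r': /* assumed limit of 65535 descriptors */
			if (parse_uint(optarg, 65535, &cfg->nb_rxd) != 0)
				return -1;
			break;
		case 't':
			if (parse_uint(optarg, 65535, &cfg->nb_txd) != 0)
				return -1;
			break;
		default:
			return -1;
		}
	}
	return 0;
}
```

With something like this, an invocation such as "--burst 64 --rxd=2048" would change two values and leave the TX queue size at its default, with no recompilation needed.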
> >
> > > Not sure how to address 2), also lets say we want to add new feature
> > > to l3fwd, where it should go, to the sample or to the testpmd?
> L3fwd example will remain as the example. We have to duplicate the code into testpmd. If L3fwd example is changed, it needs to be
> changed in testpmd as well.
Code duplication is usually not a good sign.
I understand that sometimes it is unavoidable, but why do we have to do it here?