DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com>
To: Matan Azrad <matan@nvidia.com>, Li Zhang <lizh@nvidia.com>,
	Dekel Peled <dekelp@nvidia.com>, Ori Kam <orika@nvidia.com>,
	Slava Ovsiienko <viacheslavo@nvidia.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	NBU-Contact-Thomas Monjalon <thomas@monjalon.net>,
	Raslan Darawsheh <rasland@nvidia.com>,
	"mb@smartsharesystems.com" <mb@smartsharesystems.com>,
	"ajit.khaparde@broadcom.com" <ajit.khaparde@broadcom.com>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	"Singh, Jasvinder" <jasvinder.singh@intel.com>
Subject: Re: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile
Date: Fri, 5 Mar 2021 18:44:47 +0000	[thread overview]
Message-ID: <MWHPR11MB2032ED54BB21B361328BBE2BEB969@MWHPR11MB2032.namprd11.prod.outlook.com> (raw)
In-Reply-To: <MW2PR12MB249253C22244068A2B30E2FADF979@MW2PR12MB2492.namprd12.prod.outlook.com>



> -----Original Message-----
> From: Matan Azrad <matan@nvidia.com>
> Sent: Thursday, March 4, 2021 6:34 AM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Li Zhang
> <lizh@nvidia.com>; Dekel Peled <dekelp@nvidia.com>; Ori Kam
> <orika@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>
> Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>; Raslan Darawsheh <rasland@nvidia.com>;
> mb@smartsharesystems.com; ajit.khaparde@broadcom.com; Yigit, Ferruh
> <ferruh.yigit@intel.com>; Singh, Jasvinder <jasvinder.singh@intel.com>
> Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile
> 
> Hi Cristian
> 
> From: Dumitrescu, Cristian
> > Hi Matan,
> >
> > > -----Original Message-----
> > > From: Matan Azrad <matan@nvidia.com>
> > > Sent: Tuesday, March 2, 2021 6:10 PM
> > > To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Li Zhang
> > > <lizh@nvidia.com>; Dekel Peled <dekelp@nvidia.com>; Ori Kam
> > > <orika@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>
> > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> <thomas@monjalon.net>;
> > > Raslan Darawsheh <rasland@nvidia.com>; mb@smartsharesystems.com;
> > > ajit.khaparde@broadcom.com; Yigit, Ferruh <ferruh.yigit@intel.com>;
> > > Singh, Jasvinder <jasvinder.singh@intel.com>
> > > Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile
> > >
> > > Hi Cristian
> > >
> > > Good discussion, thank you for that!
> > >
> > > From: Dumitrescu, Cristian
> > > > Hi Matan,
> > > >
> > > > > -----Original Message-----
> > > > > From: Matan Azrad <matan@nvidia.com>
> > > > > Sent: Tuesday, March 2, 2021 12:37 PM
> > > > > To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Li Zhang
> > > > > <lizh@nvidia.com>; Dekel Peled <dekelp@nvidia.com>; Ori Kam
> > > > > <orika@nvidia.com>; Slava Ovsiienko <viacheslavo@nvidia.com>
> > > > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> > > <thomas@monjalon.net>;
> > > > > Raslan Darawsheh <rasland@nvidia.com>;
> mb@smartsharesystems.com;
> > > > > ajit.khaparde@broadcom.com; Yigit, Ferruh
> > > > > <ferruh.yigit@intel.com>; Singh, Jasvinder
> > > > > <jasvinder.singh@intel.com>
> > > > > Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile
> > > > >
> > > > > HI Cristian
> > > > >
> > > > > From: Dumitrescu, Cristian
> > > > > > Hi Matan,
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Matan Azrad <matan@nvidia.com>
> > > > > > > Sent: Tuesday, March 2, 2021 7:02 AM
> > > > > > > To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Li
> > > > > > > Zhang <lizh@nvidia.com>; Dekel Peled <dekelp@nvidia.com>; Ori
> > > > > > > Kam <orika@nvidia.com>; Slava Ovsiienko
> > > > > > > <viacheslavo@nvidia.com>
> > > > > > > Cc: dev@dpdk.org; NBU-Contact-Thomas Monjalon
> > > > > <thomas@monjalon.net>;
> > > > > > > Raslan Darawsheh <rasland@nvidia.com>;
> > > mb@smartsharesystems.com;
> > > > > > > ajit.khaparde@broadcom.com; Yigit, Ferruh
> > > > > > > <ferruh.yigit@intel.com>; Singh, Jasvinder
> > > > > > > <jasvinder.singh@intel.com>
> > > > > > > Subject: RE: [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS
> > > > > > > profile
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > Hi Cristian
> > > > > > >
> > > > > > > Thank you for review, please see inline.
> > > > > > >
> > > > > > > From: Dumitrescu, Cristian
> > > > > > > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Li Zhang
> > > > > > > <snip>
> > > > > > > > We had this same problem earlier for the rte_tm.h API, where
> > > > > > > > people
> > > > > > > asked to
> > > > > > > > add support for WRED and shaper rates specified in packets
> > > > > > > > to the existing
> > > > > > > byte
> > > > > > > > rate support. I am more than happy to support adding the
> > > > > > > > same here, but please let's adopt the same solution here
> > > > > > > > rather than invent a different approach.
> > > > > > > >
> > > > > > > > Please refer to struct rte_tm_wred_params and struct
> > > > > > > rte_tm_shaper_params
> > > > > > > > from rte_tm.h: the packets vs. bytes mode is explicitly
> > > > > > > > specified through
> > > > > > > the use
> > > > > > > > of a flag called packet_mode that is added to the WRED and
> > > > > > > > shaper
> > > > > profile.
> > > > > > > > When packet_mode is 0, the profile rates and bucket sizes
> > > > > > > > are specified in bytes per second and bytes, respectively;
> > > > > > > > when packet_mode is not 0, the profile rates and bucket
> > > > > > > > sizes are specified in packets and packets per
> > > > > > > second,
> > > > > > > > respectively. The same profile parameters are used, no need
> > > > > > > > to invent additional algorithms (such as srTCM - packet
> > > > > > > > mode) or profile data
> > > > > > > structures.
> > > > > > > > Can we do the same here, please?
> > > > > > >
> > > > > > > This flag approach is very intuitive suggestion and it has
> advantages.
> > > > > > >
> > > > > > > The main problem with the flag approach is that it breaks ABI and
> API.
> > > > > > > The profile structure size is changed due to a new field - ABI
> > > breakage.
> > > > > > > The user must initialize the flag with zero to get old
> > > > > > > behavior - API
> > > > > breakage.
> > > > > > >
> > > > > >
> > > > > > The rte_mtr API is experimental, all the API functions are
> > > > > > correctly marked with __rte_experimental in rte_mtr.h file, so
> > > > > > we can safely change the API
> > > > > and
> > > > > > the ABI breakage is not applicable here. Therefore, this problem
> > > > > > does not
> > > > > exist,
> > > > > > correct?
> > > > >
> > > > > Yes, but still meter is not new API and I know that a lot of user
> > > > > uses it for a long time.
> > > > > Forcing them to change while we have good solution that don't
> > > > > force it, looks me problematic.
> > > > >
> > > >
> > > > Not really, only 3 drivers are currently implementing this API.
> > >
> > > The user is not the PMD, the PMDs are the providers.
> > > I'm talking about all our customers, all the current DPDK based
> > > applications like OVS and others (I familiar with at least 4 ConnectX
> > > customer applications) which use the meter API and I'm sure there are
> more
> > around the world.
> > >
> > > > Even to these drivers, the required changes are none or extremely
> small:
> > > as Ajit
> > > > was also noting, as the default value of 0 continues to represent
> > > > the
> > > existing
> > > > byte mode, all you have to do is make sure the new flag is set to
> > > > zero in the profile params structure, which is already done
> > > > implicitly in most places as
> > > this
> > > > structure is initialized to all-zeros.
> > >
> > > Are you sure all the world initialize the struct to 0? and also in
> > > this case, without new compilation, not all the struct will be
> > > zeroes(the old size is smaller).
> > >
> > > > A simple search exercise for struct rte_mtr_meter_profile is all
> > > > that is
> > > needed.
> > > > You also agreed the flag approach is very intuitive, hence better
> > > > and nicer,
> > > with
> > > > no additional work needed for you, so why not do it?
> > >
> > > Do you understand that any current application that use the meter API
> > > must recompile the code of the application? Part of them also probably
> > > need to set the flag to 0....
> > > Do you understand also the potential issues for the applications which
> > > are not aware to the change? Debug time, etc....
> > >
> > > > > > > I don't see issues with Li suggestion, Do you think Li
> > > > > > > suggestion has critical issues?
> > > > > >
> > > > > > It is probably better to keep the rte_mtr and the rte_tm APIs
> > > > > > aligned, it simplifies the code maintenance and improves the
> > > > > > user experience, which always pays off in the long run. Both
> > > > > > APIs configure token buckets in either packet mode or byte mode,
> > > > > > and it is desirable to have them work in the
> > > > > same
> > > > > > way. Also, I think we should avoid duplicating configuration
> > > > > > data structures
> > > > > for
> > > > > > to support essentially the same algorithms (such as srTCM or
> > > > > > trTCM) if we
> > > > > can.
> > > > > >
> > > > >
> > > > > Yes, but I don't think this motivation is critical.
> > > >
> > > > I really disagree. As API maintainer, making every effort to keep
> > > > the APIs
> > > clear
> > > > and consistent is a critical task for me.
> > >
> > > New pps profile is also clear and simple.
> > >
> > > > We don't want to proliferate the API data structures and parameters
> > > > if there is a good way to avoid it. Especially
> > > in
> > > > cases like this, when the drivers are just beginning to pick up this
> > > > (still
> > > > experimental) API,  we have the rare chance to make things right and
> > > therefore
> > > > we should do it. Please also keep in mind that, as more feature are
> > > > added
> > > to
> > > > the API, small corner cuts like this one that might not look like a
> > > > big deal
> > > now,
> > > > eventually come back as unnecessary complexity in the drivers
> themselves.
> > >
> > > I don't see a complexity in the current suggestion.
> > >
> > > > So, please, let's try to keep the quality of the APIs high.
> > >
> > > Also by this way is high.
> > >
> > >
> > > Look, the flag approach is also good and makes the job.
> > > The two approaches are clear, simple and in high quality.
> > > I don't care which one from the 2 to take but I want to be sure we are
> > > all understand the pros and cons.
> > >
> > > If you understand my concern on flag approach and still insist to take
> > > the flag approach we will align.
> >
> > Yes, thanks for summarizing the pros and cons, I confirm that I do
> understand
> > your concerns.
> >
> > Yes, sorry to disappoint you, I still think the packet_mode based approach
> is
> > better for the long run, as it keeps the APIs clean and consistent. We are
> not
> > adding new algorithms here, we're just adding a new mode to an existing
> > algorithm, so IMO we should not duplicate configuration data structures
> and
> > proliferate the number of algorithms artificially.
> 
> Actually, PPS meter is a new algorithm - you can see that current algorithms
> RFCs don't talk about PPS.
> 

Yes, I know, I implemented it in librte_meter, but still, it is the same algorithm with just a different measurement unit (packet instead of byte), that's why many people (and you included :) ) still refer to it as srTCM  - RFC 2697.

> > Yes, I do realize that in some limited cases the users will have to explicitly
> set
> > the new packet_mode flag to zero or one, in case they need to enable the
> > packet mode, but I think this is an acceptable cost because: (A) This API is
> > clearly marked as experimental; (B) It is better to take a small incremental
> hit
> > now to keep the APIs in good order rather than taking a bit hit in a few
> years as
> > more features are added in the wrong way and the APIs become
> > unmanageable.
> 
> I don't think that the current suggestion is in wrong way.
> In any case, you insist, we will align.
> 

Thank you.

> > > And if we so, and we are going to break the API\ABI, we are going to
> > > introduce new meter policy API soon and there, breaking API can help,
> > > lets see in other discussion later.
> > >
> >
> > Yes, as you point out API changes are unavoidable as new features are
> added,
> > we have to manage the API evolution correctly.
> >
> > > One more point:
> > > Currently, the meter_id is managed by the user, I think it is better
> > > to let the PMDs to manage the meter_id.
> > >
> > > Searching the PMD meter handler inside the PMD is very expensive for
> > > the API call rate when the meter_id is managed by the user.
> > >
> > > Same for profile_id.
> > >
> > > Also all the rte_flow API including the shared action API taking the
> > > PMD management approach.
> > >
> > > What do you think?
> > >
> >
> > Yes, we have carefully considered and discussed both approaches a few
> years
> > back when the API was introduced, this is not done by accident :), there are
> > pros and cons for each of them.
> >
> > If the object IDs are generated by the driver (outputs of the API), then it is
> the
> > user application that needs to keep track of them, which can be very
> painful.
> > Basically, for each API object the user application needs to create its own
> > wrapper to store this ID. We basically transfer this problem to the user app.
> 
> No exactly\not for all, the app gets the meter_id in the same time it decides
> it now.
> 
> > If the object IDs are generated by the user application (inputs into the API),
> > then we simplify the application by removing and indirection layer. Yes, it is
> > true that this indirection layer now moves into the driver, but we should try
> to
> > make the life easier for the appl developers as opposed to us, the driver
> > developers. This indirection layer in the driver can be made a bit smarter
> than
> > just a slow "for" loop; the search operation can be made faster with a small
> bit
> > of effort, such as keeping this list sorted based on the object ID, splitting
> this list
> > into buckets (similar to a hash table), etc, right?
> 
> Yes, there are even better solution than hash table from "rate" perspective.
> 

I'd be very interested to hear your proposals here.

> But any solution costs a lot of memory just for this mapping...
> When we talked about 4M meters supported(in mlx5 next release) it
> becomes an issue.
> 

I thought your concern was about the speed/rate of API calls, you are saying it is not speed but memory footprint??

I would imagine that a system that enables all the 4M meters is a big beast with the most powerful CPU on the planet and many dozens of gigabytes of RAM, so a few extra megabytes for some API layers is not a concern?

> > Having the user app provide the object ID is especially important in the case
> of
> > rte_tm API, where we have to deal with a tree of nodes, with thousands of
> > nodes for each level. Having the app to store and manages this tree of IDs
> is a
> > really bad idea, as the user app needs to mirror the tree of nodes on its
> side for
> > no real benefit. As an added benefit, the user can generate these IDs using
> a
> > rule, such as: given the specific path through the tree, the value of the ID
> can
> > be computed.
> 
> rte_tm is not rte_mtr - I think meter is different and used differently.
> For example, as I know, no one from our dpdk meter customers(at least 5)
> use TREEs for meter management. OVS, for example, just randomize some
> meter_id and don't care about it.
> 

What kinds of trees? I'd be very interested to hear some proposals to make this handle mapping faster.

> Also, all the rte_flow API basics works with PMD ID\handle management
> approach.
> 

Yes, I am not saying it is wrong, none of the approaches is wrong IMO.

> > But again, as you also mention above, there is a list of pros and cons for
> every
> > approach, no approach is perfect. We took this approach for the good
> reasons
> > listed above.
> 
> If you familiar with TREE usage with meter, maybe we can combined easily
> the two approaches in this topic,
> 
> meter_id argument can be by reference, if it 0 - PMD set it, if not PMD use it.
> 

It would be good if you could elaborate here a bit, just to make sure we are on the same page.

> > > > > > The flag proposal is actually reducing the amount of work that
> > > > > > you guys
> > > > > need to
> > > > > > do to implement your proposal. There is no negative impact to
> > > > > > your
> > > > > proposal
> > > > > > and no big change, right?
> > > > >
> > > > > Yes you right, but the implementation effect is not our concern.
> > > > >
> > > > >
> > > > > > > > This is a quick summary of the required API changes to add
> > > > > > > > support for the packet mode, they are minimal:
> > > > > > > > a) Introduce the packet_mode flag in the profile parameters
> > > > > > > > data
> > > > > > > structure.
> > > > > > > > b) Change the description (comment) of the rate and bucket
> > > > > > > > size
> > > > > > > parameters in
> > > > > > > > the meter profile parameters data structures to reflect that
> > > > > > > > their values represents either bytes or packets, depending
> > > > > > > > on the value of the new flag packet_mode from the same
> structure.
> > > > > > > > c) Add the relevant capabilities: just search for "packet"
> > > > > > > > in the rte_tm.h capabilities data structures and apply the
> > > > > > > > same to the rte_mtr.h
> > > > > > > capabilities,
> > > > > > > > when applicable.
> > > > > > >
> > > > > > > > Regards,
> > > > > > > > Cristian
> > > > > >
> > > > > > Regards,
> > > > > > Cristian
> > > >
> > > > Regards,
> > > > Cristian
> >
> > Regards,
> > Cristian

Regards,
Cristian

  reply	other threads:[~2021-03-05 18:44 UTC|newest]

Thread overview: 49+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-01-25  1:02 [dpdk-dev] [RFC 0/1] lib/librte_ethdev: Meter algorithms support packet per second Li Zhang
2021-01-25  1:02 ` [dpdk-dev] [PATCH] [RFC]: adds support PPS(packet per second) on meter Li Zhang
2021-01-25  1:20   ` [dpdk-dev] [RFC 0/1] lib/librte_ethdev: Meter algorithms support packet per second Li Zhang
2021-01-25  1:20     ` [dpdk-dev] [PATCH] [RFC, v2]: adds support PPS(packet per second) on meter Li Zhang
2021-01-28 18:27       ` Ferruh Yigit
2021-02-12  7:40         ` Morten Brørup
2021-02-23  2:07           ` Li Zhang
2021-02-23  8:24             ` Morten Brørup
2021-03-01  3:16               ` Li Zhang
2021-03-01  3:31                 ` Ajit Khaparde
2021-03-01  7:20                 ` Morten Brørup
2021-03-01 13:08         ` Dumitrescu, Cristian
2021-03-01  9:39       ` [dpdk-dev] [RFC v3 0/4] " Li Zhang
2021-03-01  9:39         ` [dpdk-dev] [RFC v3 1/4] ethdev: add meter PPS profile Li Zhang
2021-03-01  9:39         ` [dpdk-dev] [RFC v3 2/4] common/mlx5: add meter mode definition in PRM file Li Zhang
2021-03-01  9:39         ` [dpdk-dev] [RFC v3 3/4] net/mlx5: support meter PPS profile Li Zhang
2021-03-01  9:40         ` [dpdk-dev] [RFC v3 4/4] app/testpmd: add meter pps mode cmd Li Zhang
2021-03-01  9:43       ` [dpdk-dev] [RFC v3 0/4] adds support PPS(packet per second) on meter Li Zhang
2021-03-01  9:43         ` [dpdk-dev] [RFC v3 1/4] ethdev: add meter PPS profile Li Zhang
2021-03-01  9:43         ` [dpdk-dev] [RFC v3 2/4] common/mlx5: add meter mode definition in PRM file Li Zhang
2021-03-01  9:43         ` [dpdk-dev] [RFC v3 3/4] net/mlx5: support meter PPS profile Li Zhang
2021-03-01  9:43         ` [dpdk-dev] [RFC v3 4/4] app/testpmd: add meter pps mode cmd Li Zhang
2021-03-01  9:53       ` [dpdk-dev] [RFC v3 0/4] adds support PPS(packet per second) on meter Li Zhang
2021-03-01  9:53         ` [dpdk-dev] [RFC v3 1/4] ethdev: add meter PPS profile Li Zhang
2021-03-01  9:53         ` [dpdk-dev] [RFC v3 2/4] common/mlx5: add meter mode definition in PRM file Li Zhang
2021-03-01  9:53         ` [dpdk-dev] [RFC v3 3/4] net/mlx5: support meter PPS profile Li Zhang
2021-03-01  9:53         ` [dpdk-dev] [RFC v3 4/4] app/testpmd: add meter pps mode cmd Li Zhang
2021-02-12 21:35   ` [dpdk-dev] [PATCH] [RFC]: adds support PPS(packet per second) on meter Ajit Khaparde
2021-02-23  2:11     ` Li Zhang
2021-03-01 10:35 ` [dpdk-dev] [RFC v4 0/4] " Li Zhang
2021-03-01 10:35   ` [dpdk-dev] [RFC v4 1/4] ethdev: add meter PPS profile Li Zhang
2021-03-01 13:20     ` Dumitrescu, Cristian
2021-03-01 15:53       ` Thomas Monjalon
2021-03-02  1:27         ` Li Zhang
2021-03-02  1:46       ` Ajit Khaparde
2021-03-02 12:13         ` Dumitrescu, Cristian
2021-03-02  7:02       ` Matan Azrad
2021-03-02 12:29         ` Dumitrescu, Cristian
2021-03-02 12:37           ` Matan Azrad
2021-03-02 14:33             ` Dumitrescu, Cristian
2021-03-02 18:10               ` Matan Azrad
2021-03-03 20:35                 ` Dumitrescu, Cristian
2021-03-04  6:34                   ` Matan Azrad
2021-03-05 18:44                     ` Dumitrescu, Cristian [this message]
2021-03-01 10:35   ` [dpdk-dev] [RFC v4 2/4] common/mlx5: add meter mode definition in PRM file Li Zhang
2021-03-01 10:35   ` [dpdk-dev] [RFC v4 3/4] net/mlx5: support meter PPS profile Li Zhang
2021-03-01 10:35   ` [dpdk-dev] [RFC v4 4/4] app/testpmd: add meter pps mode cmd Li Zhang
2021-03-02  1:48     ` Li, Xiaoyun
2021-03-02  3:04       ` Li Zhang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=MWHPR11MB2032ED54BB21B361328BBE2BEB969@MWHPR11MB2032.namprd11.prod.outlook.com \
    --to=cristian.dumitrescu@intel.com \
    --cc=ajit.khaparde@broadcom.com \
    --cc=dekelp@nvidia.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=jasvinder.singh@intel.com \
    --cc=lizh@nvidia.com \
    --cc=matan@nvidia.com \
    --cc=mb@smartsharesystems.com \
    --cc=orika@nvidia.com \
    --cc=rasland@nvidia.com \
    --cc=thomas@monjalon.net \
    --cc=viacheslavo@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).