Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching behavior

DPDK patches and discussions
 help / color / mirror / Atom feed

From: "Yang, Zhiyong" <zhiyong.yang@intel.com>
To: Adrien Mazarguil <adrien.mazarguil@6wind.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>
Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	Andrew Rybchenko <arybchenko@solarflare.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"thomas.monjalon@6wind.com" <thomas.monjalon@6wind.com>
Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching behavior
Date: Tue, 7 Feb 2017 07:50:21 +0000	[thread overview]
Message-ID: <E182254E98A5DA4EB1E657AC7CB9BD2A3EB6EB5B@BGSMSX101.gar.corp.intel.com> (raw)
In-Reply-To: <20170123163607.GB3779@6wind.com>

Hi, Adrien:

	Sorry for the late reply  due to Chinese new year.

> -----Original Message-----
> From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> Sent: Tuesday, January 24, 2017 12:36 AM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Andrew
> Rybchenko <arybchenko@solarflare.com>; Yang, Zhiyong
> <zhiyong.yang@intel.com>; dev@dpdk.org; thomas.monjalon@6wind.com
> Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD batching
> behavior
> 
> On Fri, Jan 20, 2017 at 11:48:22AM +0000, Bruce Richardson wrote:
> > On Fri, Jan 20, 2017 at 11:24:40AM +0000, Ananyev, Konstantin wrote:
> > > >
> > > > From: Andrew Rybchenko [mailto:arybchenko@solarflare.com]
> > > > Sent: Friday, January 20, 2017 10:26 AM
> > > > To: Yang, Zhiyong <zhiyong.yang@intel.com>; dev@dpdk.org
> > > > Cc: thomas.monjalon@6wind.com; Richardson, Bruce
> > > > <bruce.richardson@intel.com>; Ananyev, Konstantin
> > > > <konstantin.ananyev@intel.com>
> > > > Subject: Re: [dpdk-dev] [RFC] lib/librte_ether: consistent PMD
> > > > batching behavior
> > > >
> > > > On 01/20/2017 12:51 PM, Zhiyong Yang wrote:
> > > > The rte_eth_tx_burst() function in the file Rte_ethdev.h is
> > > > invoked to transmit output packets on the output queue for DPDK
> > > > applications as follows.
> > > >
> > > > static inline uint16_t
> > > > rte_eth_tx_burst(uint8_t port_id, uint16_t queue_id,
> > > >                  struct rte_mbuf **tx_pkts, uint16_t nb_pkts);
> > > >
> > > > Note: The fourth parameter nb_pkts: The number of packets to
> transmit.
> > > > The rte_eth_tx_burst() function returns the number of packets it
> > > > actually sent. The return value equal to *nb_pkts* means that all
> > > > packets have been sent, and this is likely to signify that other
> > > > output packets could be immediately transmitted again.
> > > > Applications that implement a "send as many packets to transmit as
> > > > possible" policy can check this specific case and keep invoking
> > > > the rte_eth_tx_burst() function until a value less than
> > > > *nb_pkts* is returned.
> > > >
> > > > When you call TX only once in rte_eth_tx_burst, you may get
> > > > different behaviors from different PMDs. One problem that every
> > > > DPDK user has to face is that they need to take the policy into
> > > > consideration at the app- lication level when using any specific
> > > > PMD to send the packets whether or not it is necessary, which
> > > > brings usage complexities and makes DPDK users easily confused
> > > > since they have to learn the details on TX function limit of
> > > > specific PMDs and have to handle the different return value: the
> > > > number of packets transmitted successfully for various PMDs. Some
> > > > PMDs Tx func- tions have a limit of sending at most 32 packets for
> > > > every invoking, some PMDs have another limit of at most 64 packets
> > > > once, another ones have imp- lemented to send as many packets to
> transmit as possible, etc. This will easily cause wrong usage for DPDK users.
> > > >
> > > > This patch proposes to implement the above policy in DPDK lib in
> > > > order to simplify the application implementation and avoid the
> > > > incorrect invoking as well. So, DPDK Users don't need to consider
> > > > the implementation policy and to write duplicated code at the
> > > > application level again when sending packets. In addition to it,
> > > > the users don't need to know the difference of specific PMD TX and
> > > > can transmit the arbitrary number of packets as they expect when
> > > > invoking TX API rte_eth_tx_burst, then check the return value to get
> the number of packets actually sent.
> > > >
> > > > How to implement the policy in DPDK lib? Two solutions are proposed
> below.
> > > >
> > > > Solution 1:
> > > > Implement the wrapper functions to remove some limits for each
> > > > specific PMDs as i40e_xmit_pkts_simple and ixgbe_xmit_pkts_simple
> do like that.
> > > >
> > > > > IMHO, the solution is a bit better since it:
> > > > > 1. Does not affect other PMDs at all
> > > > > 2. Could be a bit faster for the PMDs which require it since has
> > > > >no indirect
> > > > >    function call on each iteration
> > > > > 3. No ABI change
> > >
> > > I also would prefer solution number 1 for the reasons outlined by Andrew
> above.
> > > Also, IMO current limitation for number of packets to TX in some
> > > Intel PMD TX routines are sort of artificial:
> > > - they are not caused by any real HW limitations
> > > - avoiding them at PMD level shouldn't cause any performance or
> functional degradation.
> > > So I don't see any good reason why instead of fixing these
> > > limitations in our own PMDs we are trying to push them to the upper
> (rte_ethdev) layer.
> 
> For what it's worth, I agree with Konstantin. Wrappers should be as thin as
> possible on top of PMD functions, they are not helpers. We could define a
> set of higher level functions for this purpose though.
> 
> In the meantime, exposing and documenting PMD limitations seems safe
> enough.
> 
> We could assert that RX/TX burst requests larger than the size of the target
> queue are unlikely to be fully met (i.e. PMDs usually do not check for
> completions in the middle of a TX burst).

As a tool,  it is very important for its users to easily consume it and make it work
in a right way.  Sort of artificial limits will make things look like a little confused  and
make some users probably get into trouble when writing drivers. 
Why do we correct it and make it easier?  :)

Zhiyong

next prev parent reply	other threads:[~2017-02-07  7:51 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-01-20  9:51 Zhiyong Yang
2017-01-20 10:26 ` Andrew Rybchenko
     [not found]   ` <2601191342CEEE43887BDE71AB9772583F108924@irsmsx105.ger.corp.intel.com>
2017-01-20 11:24     ` Ananyev, Konstantin
2017-01-20 11:48       ` Bruce Richardson
2017-01-23 16:36         ` Adrien Mazarguil
2017-02-07  7:50           ` Yang, Zhiyong [this message]
2017-01-21  4:07       ` Yang, Zhiyong
2017-01-21  4:13   ` Yang, Zhiyong

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E182254E98A5DA4EB1E657AC7CB9BD2A3EB6EB5B@BGSMSX101.gar.corp.intel.com \
    --to=zhiyong.yang@intel.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=arybchenko@solarflare.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    --cc=thomas.monjalon@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).