From: Matan Azrad <matan@mellanox.com>
To: Chas Williams <3chas3@gmail.com>
Cc: Eric Kinzie <ehkinzie@gmail.com>,
"bluca@debian.org" <bluca@debian.org>,
"dev@dpdk.org" <dev@dpdk.org>,
Declan Doherty <declan.doherty@intel.com>,
Chas Williams <chas3@att.com>,
"stable@dpdk.org" <stable@dpdk.org>
Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH v4] net/bonding: per-slave intermediate rx ring
Date: Sun, 26 Aug 2018 07:40:44 +0000
Message-ID: <AM0PR0502MB40194FAA439D78BB044D6BDAD2340@AM0PR0502MB4019.eurprd05.prod.outlook.com>
In-Reply-To: <CAG2-GkmeEXR_M8W=Ky5PNn57=ji_wBSXJxgQ+oZVYRttq=P37Q@mail.gmail.com>
From: Chas Williams <3chas3@gmail.com>
>On Thu, Aug 23, 2018 at 3:28 AM Matan Azrad <matan@mellanox.com> wrote:
>Hi
>
>From: Eric Kinzie
>> On Wed Aug 22 11:42:37 +0000 2018, Matan Azrad wrote:
>> > Hi Luca
>> >
>> > From: Luca Boccassi
>> > > On Wed, 2018-08-22 at 07:09 +0000, Matan Azrad wrote:
>> > > > Hi Chas
>> > > >
>> > > > From: Chas Williams
>> > > > > On Tue, Aug 21, 2018 at 11:43 AM Matan Azrad
>> > > > > <matan@mellanox.com> wrote:
>> > > > > Hi Chas
>> > > > >
>> > > > > From: Chas Williams
>> > > > > > On Tue, Aug 21, 2018 at 6:56 AM Matan Azrad
>> > > > > > <matan@mellanox.com> wrote:
>> > > > > > Hi
>> > > > > >
>> > > > > > From: Chas Williams
>> > > > > > > This will need to be implemented for some of the other RX
>> > > > > > > burst methods at some point for other modes to see this
>> > > > > > > performance improvement (with the exception of active-backup).
>> > > > > >
>> > > > > > Yes, I think it should be done at least to
>> > > > > > bond_ethdev_rx_burst_8023ad_fast_queue (should be easy) for now.
>> > > > > >
>> > > > > > There is some duplicated code between the various RX paths.
>> > > > > > I would like to eliminate that as much as possible, so I was
>> > > > > > going to give that some thought first.
>> > > > >
>> > > > > There is no reason to leave this function as is while its twin is
>> > > > > changed.
>> > > > >
>> > > > > Unfortunately, this is all the patch I have at this time.
>> > > > >
>> > > > >
>> > > > > >
>> > > > > >
>> > > > > > > On Thu, Aug 16, 2018 at 9:32 AM Luca Boccassi
>> > > > > > > <bluca@debian.org> wrote:
>> > > > > > >
>> > > > > > > > During bond 802.3ad receive, a burst of packets is fetched
>> > > > > > > > from each slave into a local array and appended to
>> > > > > > > > per-slave ring buffer.
>> > > > > > > > Packets are taken from the head of the ring buffer and
>> > > > > > > > returned to the caller. The number of mbufs provided to
>> > > > > > > > each slave is sufficient to meet the requirements of the
>> > > > > > > > ixgbe vector receive.
>> > > > > >
>> > > > > > Luca,
>> > > > > >
>> > > > > > Can you explain these requirements of ixgbe?
>> > > > > >
>> > > > > > The ixgbe (and some other Intel PMDs) have vectorized RX
>> > > > > > routines that are more efficient (if not faster) taking
>> > > > > > advantage of some advanced CPU instructions. I think you need
>> > > > > > to be receiving at least 32 packets or more.
>> > > > >
>> > > > > So why do it in bonding, which is a generic driver for all
>> > > > > vendors' PMDs? If it is better for ixgbe and other Intel NICs, you
>> > > > > can force those PMDs to always receive 32 packets and to manage
>> > > > > a ring by themselves.
>> > > > >
>> > > > > The drawback of the ring is some additional latency on the
>> > > > > receive path.
>> > > > > In testing, the additional latency hasn't been an issue for bonding.
>> > > >
>> > > > When bonding processes packets more slowly, it may become a
>> > > > bottleneck for packet processing in some applications.
>> > > >
>> > > > > The bonding PMD has a fair bit of overhead associated with the
>> > > > > RX and TX path calculations. Most applications can just arrange
>> > > > > to call the RX path with a sufficiently large receive. Bonding
>> > > > > can't do this.
>> > > >
>> > > > I wasn't talking about the application; I was talking about the
>> > > > slave PMDs. A slave PMD can manage a ring by itself if that helps
>> > > > its own performance. Bonding should not be oriented to specific PMDs.
>> > >
>> > > The issue though is that the performance problem is not with the
>> > > individual PMDs - it's with bonding. There were no reports regarding
>> > > the individual PMDs.
>> > > This comes from reports from customers from real world production
>> > > deployments - the issue of bonding being too slow was raised multiple
>> > > times.
>> > > This patch addresses those issues, again in production deployments,
>> > > where it's been used for years, to users' and customers' satisfaction.
>> >
>> > From Chas I understood that using a burst of 32 helps the performance
>> > of some slave PMDs, which makes sense.
>> > I can't understand how the extra copy phases improve the performance
>> > of bonding itself:
>> >
>> > You added 2 copy phases in the bonding RX function:
>> > 1. Get packets from the slave to a local array.
>> > 2. Copy packet pointers from a local array to the ring array.
>> > 3. Copy packet pointers from the ring array to the application array.
>> >
>> > Each packet arriving at the application must pass through the above
>> > 3 phases (in a specific call or in previous calls).
>> >
>> > Without this patch we have only -
>> > Get packets from the slave to the application array.
>> >
>> > Can you explain how the extra copies improve the bonding performance?
>> >
>> > It looks like it improves the slave PMDs, and because of that the
>> > bonding PMD performance becomes better.
>>
>> I'm not sure that adding more buffer management to the vector PMDs will
>> improve the drivers' performance; it's just that calling the rx function in such
>> a way that it returns no data wastes time.
>
>Sorry, I don't fully understand what you said here, please rephrase.
>
>> The bonding driver is already an exercise in buffer management so adding this layer of indirection here makes
>> sense in my opinion, as does hiding the details of the constituent interfaces where possible.
>
>Can you explain how this new buffer management with the extra pointer copies improves the performance of bonding itself?
>Looks really strange to me.
>
>Because rings are generally quite efficient.
But you are using a ring in addition to the regular array management, so it must hurt the performance of the bonding PMD
(meaning bonding itself, not the slave PMDs that are called from bonding).
>
>Bonding is in a middle ground between application and PMD.
Yes.
>What bonding is doing, may not improve all applications.
Yes, but it can be solved using some bonding modes.
> If using a ring to buffer the vectorized receive routines, improves your particular application,
>that's great.
It may not be great, and may even be bad, for other PMDs that are not vectorized.
> However, I don't think I can say that it would help all
>applications. As you point out, there is overhead associated with
>a ring.
Yes.
>Bonding's receive burst isn't especially efficient (in mode 4).
Why?
> Bonding benefits from being able to read as much as possible (within limits of
>course, large reads would blow out caches) from each slave.
The slave PMDs can benefit in the same way.
>It can't return all that data though because applications tend to use the
>burst size that would be efficient for a typical PMD.
What is the preferred burst size for bonding? Maybe applications should use it when they are using bonding.
>An alternative might be to ask bonding applications to simply issue larger reads for
>certain modes. That's probably not as easy as it sounds given the
>way that the burst length affects multiplexing.
Can you explain it more?
>Another solution might be to just alternately poll the individual
>slaves on each rx burst. But that means you need to poll at a
>faster rate. Depending on your application, you might not be
>able to do that.
Again, can you be more precise in the above explanation?
> We can avoid this scheduling overhead by just
>doing the extra reads in bonding and buffering in a ring.
>
>Since bonding is going to be buffering everything in a ring,
? I'm not familiar with that. For now I don't think we need a ring.
>it makes sense to just read as much as is efficiently possible. For the
>Intel adapters, this means using a read big enough to trigger the vectorized
>versions.
>> > > So I'd like to share this improvement rather than keeping it private
>> > > - because I'm nice that way :-P
>> > >
>> > > > > > Did you check for other vendor PMDs? It may hurt performance
>> > > > > > there..
>> > > > > >
>> > > > > > I don't know, but I suspect probably not. For the most part
>> > > > > > you are typically reading almost up to the vector requirement.
>> > > > > > But if one slave has just a single packet, then you can't
>> > > > > > vectorize on the next slave.
>> > > > > >
>> > > > >
>> > > > > I don't think that the ring overhead is better for PMDs which
>> > > > > are not using the vectorized instructions.
>> > > > >
>> > > > > The non-vectorized PMDs are usually quite slow. The additional
>> > > > > overhead doesn't make a difference in their performance.
>> > > >
>> > > > We should not do things worse than they are.
>> > >
>> > > There were no reports that this made things worse. The feedback from
>> > > production was that it made things better.
>> >
>> > Yes, it may be good for specific slave drivers but hurt other slave
>> > drivers, so maybe it should stay private to specific customers using
>> > specific NICs.
>> >
>> > Again, I can understand how this patch improves the performance of some
>> > PMDs; therefore I think bonding is not the place to add it, but maybe
>> > some PMDs are.
>
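To make the three phases discussed above concrete (slave to local array, local array to ring, ring to application array), here is a minimal sketch in plain C. It is not the code from the patch and does not use the DPDK ring API: `pkt_ring`, `buffered_rx`, `fake_slave_rx`, `RING_SIZE` and `SLAVE_BURST` are hypothetical names chosen for illustration, and polling the slave only when the ring has room for a full burst is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RING_SIZE   128u /* hypothetical ring capacity, must be a power of two */
#define SLAVE_BURST 32u  /* hypothetical burst size requested from the slave */

/* Hypothetical per-slave ring of packet pointers (free-running indices). */
struct pkt_ring {
    void *slots[RING_SIZE];
    uint32_t head; /* next slot to dequeue */
    uint32_t tail; /* next slot to enqueue */
};

static uint32_t ring_count(const struct pkt_ring *r) { return r->tail - r->head; }
static uint32_t ring_free(const struct pkt_ring *r) { return RING_SIZE - ring_count(r); }

/* Stand-in for a slave PMD receive call: fills pkts with up to n pointers. */
typedef uint16_t (*slave_rx_fn)(void *ctx, void **pkts, uint16_t n);

/* Buffered receive: the three phases described in the thread. */
static uint16_t buffered_rx(struct pkt_ring *r, slave_rx_fn rx, void *ctx,
                            void **app_pkts, uint16_t nb_pkts)
{
    void *local[SLAVE_BURST];
    uint16_t got = 0, out = 0, i;

    assert(r != NULL && rx != NULL && app_pkts != NULL);

    /* Phase 1: fetch a full-size burst from the slave into a local array
     * (assumption: only when the ring can hold the whole burst). */
    if (ring_free(r) >= SLAVE_BURST)
        got = rx(ctx, local, SLAVE_BURST);

    /* Phase 2: copy packet pointers from the local array to the ring. */
    for (i = 0; i < got; i++)
        r->slots[r->tail++ % RING_SIZE] = local[i];

    /* Phase 3: copy packet pointers from the ring to the application array. */
    while (out < nb_pkts && ring_count(r) > 0)
        app_pkts[out++] = r->slots[r->head++ % RING_SIZE];

    return out;
}

/* Fake slave for demonstration: ctx points to an int holding the number of
 * packets still pending on the "wire". */
static uint16_t fake_slave_rx(void *ctx, void **pkts, uint16_t n)
{
    static int dummy_pkt;
    int *remaining = ctx;
    uint16_t i, cnt = (uint16_t)(*remaining < (int)n ? *remaining : (int)n);

    for (i = 0; i < cnt; i++)
        pkts[i] = &dummy_pkt;
    *remaining -= cnt;
    return cnt;
}
```

With this shape, an application asking for 8 packets still causes the slave to be polled for 32, and the surplus waits in the ring for later calls. Without the ring, the slave would be called directly with the application's array and burst size, which is the single-phase path described earlier in the thread.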
Thread overview: 28+ messages
2018-08-15 15:46 [dpdk-dev] [PATCH] " Luca Boccassi
2018-08-15 16:06 ` [dpdk-dev] [PATCH v2] " Luca Boccassi
2018-08-16 12:52 ` [dpdk-dev] [PATCH v3] " Luca Boccassi
2018-08-16 13:32 ` [dpdk-dev] [PATCH v4] " Luca Boccassi
2018-08-20 14:11 ` Chas Williams
2018-08-21 10:56 ` Matan Azrad
2018-08-21 11:13 ` Luca Boccassi
2018-08-21 14:58 ` Chas Williams
2018-08-21 15:43 ` Matan Azrad
2018-08-21 18:19 ` Chas Williams
2018-08-22 7:09 ` Matan Azrad
2018-08-22 10:19 ` [dpdk-dev] [dpdk-stable] " Luca Boccassi
2018-08-22 11:42 ` Matan Azrad
2018-08-22 17:43 ` Eric Kinzie
2018-08-23 7:28 ` Matan Azrad
2018-08-23 15:51 ` Chas Williams
2018-08-26 7:40 ` Matan Azrad [this message]
2018-08-27 13:22 ` Chas Williams
2018-08-27 15:30 ` Matan Azrad
2018-08-27 15:51 ` Chas Williams
2018-08-28 9:51 ` Matan Azrad
2018-08-29 14:30 ` Chas Williams
2018-08-29 15:20 ` Matan Azrad
2018-08-31 16:01 ` Luca Boccassi
2018-09-02 11:34 ` Matan Azrad
2018-09-09 20:57 ` Chas Williams
2018-09-12 5:38 ` Matan Azrad
2018-09-19 18:09 ` [dpdk-dev] " Luca Boccassi