DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Adrien Mazarguil" <adrien.mazarguil@6wind.com>,
	"Morten Brørup" <mb@smartsharesystems.com>
Cc: "Richardson, Bruce" <bruce.richardson@intel.com>,
	"Wiles, Keith" <keith.wiles@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	Olivier Matz <olivier.matz@6wind.com>,
	Oleg Kuporosov <olegk@mellanox.com>
Subject: Re: [dpdk-dev] mbuf changes
Date: Tue, 25 Oct 2016 13:58:28 +0000	[thread overview]
Message-ID: <2601191342CEEE43887BDE71AB9772583F0CC835@irsmsx105.ger.corp.intel.com> (raw)
In-Reply-To: <20161025134817.GL5733@6wind.com>



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Tuesday, October 25, 2016 2:48 PM
> To: Morten Brørup <mb@smartsharesystems.com>
> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Wiles, Keith <keith.wiles@intel.com>; dev@dpdk.org; Olivier Matz
> <olivier.matz@6wind.com>; Oleg Kuporosov <olegk@mellanox.com>
> Subject: Re: [dpdk-dev] mbuf changes
> 
> On Tue, Oct 25, 2016 at 02:16:29PM +0200, Morten Brørup wrote:
> > Comments inline.
> 
> I'm only replying to the nb_segs bits here.
> 
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> > > Sent: Tuesday, October 25, 2016 1:14 PM
> > > To: Adrien Mazarguil
> > > Cc: Morten Brørup; Wiles, Keith; dev@dpdk.org; Olivier Matz; Oleg
> > > Kuporosov
> > > Subject: Re: [dpdk-dev] mbuf changes
> > >
> > > On Tue, Oct 25, 2016 at 01:04:44PM +0200, Adrien Mazarguil wrote:
> > > > On Tue, Oct 25, 2016 at 12:11:04PM +0200, Morten Brørup wrote:
> > > > > Comments inline.
> > > > >
> > > > > Med venlig hilsen / kind regards
> > > > > - Morten Brørup
> > > > >
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com]
> > > > > > Sent: Tuesday, October 25, 2016 11:39 AM
> > > > > > To: Bruce Richardson
> > > > > > Cc: Wiles, Keith; Morten Brørup; dev@dpdk.org; Olivier Matz; Oleg
> > > > > > Kuporosov
> > > > > > Subject: Re: [dpdk-dev] mbuf changes
> > > > > >
> > > > > > On Mon, Oct 24, 2016 at 05:25:38PM +0100, Bruce Richardson wrote:
> > > > > > > On Mon, Oct 24, 2016 at 04:11:33PM +0000, Wiles, Keith wrote:
> > > > > > [...]
> > > > > > > > > On Oct 24, 2016, at 10:49 AM, Morten Brørup
> > > > > > <mb@smartsharesystems.com> wrote:
> > > > > > [...]
> > > > > > > > > 5.
> > > > > > > > >
> > > > > > > > > And here’s something new to think about:
> > > > > > > > >
> > > > > > > > > m->next already reveals if there are more segments to a
> > > packet.
> > > > > > Which purpose does m->nb_segs serve that is not already covered
> > > by
> > > > > > m-
> > > > > > >next?
> > > > > > >
> > > > > > > It is duplicate info, but nb_segs can be used to check the
> > > > > > > validity
> > > > > > of
> > > > > > > the next pointer without having to read the second mbuf
> > > cacheline.
> > > > > > >
> > > > > > > Whether it's worth having is something I'm happy enough to
> > > > > > > discuss, though.
> > > > > >
> > > > > > Although slower in some cases than a full blown "next packet"
> > > > > > pointer, nb_segs can also be conveniently abused to link several
> > > > > > packets and their segments in the same list without wasting
> > > space.
> > > > >
> > > > > I don’t understand that; can you please elaborate? Are you abusing
> > > m->nb_segs as an index into an array in your application? If that is
> > > the case, and it is endorsed by the community, we should get rid of m-
> > > >nb_segs and add a member for application specific use instead.
> > > >
> > > > Well, that's just an idea, I'm not aware of any application using
> > > > this, however the ability to link several packets with segments seems
> > > > useful to me (e.g. buffering packets). Here's a diagram:
> > > >
> > > >  .-----------.   .-----------.   .-----------.   .-----------.   .---
> > > ---
> > > >  | pkt 0     |   | seg 1     |   | seg 2     |   | pkt 1     |   |
> > > pkt 2
> > > >  |      next --->|      next --->|      next --->|      next --->|
> > > ...
> > > >  | nb_segs 3 |   | nb_segs 1 |   | nb_segs 1 |   | nb_segs 1 |   |
> > > >  `-----------'   `-----------'   `-----------'   `-----------'   `---
> > > ---
> >
> > I see. It makes it possible to refer to a burst of packets (with segments or not) by a single mbuf reference, as an alternative to the current
> design pattern of using an array and length (struct rte_mbuf **mbufs, unsigned count).
> >
> > This would require implementation in the PMDs etc.
> >
> > And even in this case, m->nb_segs does not need to be an integer, but could be replaced by a single bit indicating if the segment is a
> continuation of a packet or the beginning (alternatively the end) of a packet, i.e. the bit can be set for either the first or the last segment in
> the packet.

We do need nb_segs -  at least for TX.
That's how TX function calculates how many TXDs it needs to allocate(and fill).
Of-course it can re-scan whole chain of segments to count them, but I think
it would slowdown things even more.
Though yes, I suppose it can be moved to the second cahe-line.
Konstantin

> 
> Sure however if we keep the current definition, a single bit would not be
> enough as it must be nonzero for the buffer to be valid. I think a 8 bit
> field is not that expensive for a counter.
> 
> > It is an almost equivalent alternative to the fundamental design pattern of using an array of mbuf with count, which is widely implemented
> in DPDK. And m->next still lives in the second cache line, so I don't see any gain by this.
> 
> That's right, it does not have to live in the first cache line, my only
> concern was its entire removal.
> 
> > I still don't get how m->nb_segs can be abused without m->next.
> 
> By "abused" I mean that applications are not supposed to pass this kind of
> mbuf lists directly to existing mbuf-handling functions (TX burst,
> rte_pktmbuf_free() and so on), however these same applications (even PMDs)
> can do so internally temporarily because it's so simple.
> 
> The next pointer of the last segment of a packet must still be set to NULL
> every time a packet is retrieved from such a list to be processed.
> 
> > > However, nb_segs may be a good candidate for demotion, along with
> > > possibly the port value, or the reference count.
> 
> Yes, I think that's fine as long as it's kept somewhere.
> 
> --
> Adrien Mazarguil
> 6WIND

  reply	other threads:[~2016-10-25 13:58 UTC|newest]

Thread overview: 46+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-10-24 15:49 Morten Brørup
2016-10-24 16:11 ` Wiles, Keith
2016-10-24 16:25   ` Bruce Richardson
2016-10-24 21:47     ` Morten Brørup
2016-10-25  8:53       ` Bruce Richardson
2016-10-25 10:02         ` Ananyev, Konstantin
2016-10-25 10:22           ` Morten Brørup
2016-10-25 13:00             ` Olivier Matz
2016-10-25 13:04               ` Ramia, Kannan Babu
2016-10-25 13:24                 ` Thomas Monjalon
2016-10-25 14:32                   ` Ramia, Kannan Babu
2016-10-25 14:54                     ` Thomas Monjalon
2016-10-25 13:15               ` Bruce Richardson
2016-10-25 13:20               ` Thomas Monjalon
2016-10-25  9:39     ` Adrien Mazarguil
2016-10-25 10:11       ` Morten Brørup
2016-10-25 11:04         ` Adrien Mazarguil
2016-10-25 11:13           ` Bruce Richardson
2016-10-25 12:16             ` Morten Brørup
2016-10-25 12:20               ` Bruce Richardson
2016-10-25 12:33                 ` Morten Brørup
2016-10-25 12:45                   ` Bruce Richardson
2016-10-25 12:48                     ` Olivier Matz
2016-10-25 13:13                       ` Morten Brørup
2016-10-25 13:38                       ` Ananyev, Konstantin
2016-10-25 14:25                         ` Morten Brørup
2016-10-25 14:45                           ` Olivier Matz
2016-10-28 13:34                       ` Pattan, Reshma
2016-10-28 14:11                         ` Morten Brørup
2016-10-28 15:25                           ` Pattan, Reshma
2016-10-28 16:50                           ` Adrien Mazarguil
2016-10-28 17:00                             ` Richardson, Bruce
2016-10-28 20:27                               ` Morten Brørup
2016-10-31 10:55                               ` Morten Brørup
2016-10-31 16:05                       ` Morten Brørup
2016-10-25 13:48               ` Adrien Mazarguil
2016-10-25 13:58                 ` Ananyev, Konstantin [this message]
2016-10-25 11:54     ` Shreyansh Jain
2016-10-25 12:05       ` Bruce Richardson
2016-10-26  8:28         ` Alejandro Lucero
2016-10-26  9:01           ` Morten Brørup
2016-11-09 11:42           ` Alejandro Lucero
2016-11-10  9:19             ` [dpdk-dev] Fwd: " Alejandro Lucero
2016-10-25 12:49       ` [dpdk-dev] " Morten Brørup
2016-10-25 13:14 ` Olivier Matz
2016-10-25 13:18   ` Morten Brørup

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2601191342CEEE43887BDE71AB9772583F0CC835@irsmsx105.ger.corp.intel.com \
    --to=konstantin.ananyev@intel.com \
    --cc=adrien.mazarguil@6wind.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=keith.wiles@intel.com \
    --cc=mb@smartsharesystems.com \
    --cc=olegk@mellanox.com \
    --cc=olivier.matz@6wind.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).