DPDK patches and discussions
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"Richardson, Bruce" <bruce.richardson@intel.com>
Cc: Yongseok Koh <yskoh@mellanox.com>,
	Olivier Matz <olivier.matz@6wind.com>,
	 "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
Date: Wed, 14 Feb 2018 14:16:12 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725890572F6D@irsmsx105.ger.corp.intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB97725890572EF7@irsmsx105.ger.corp.intel.com>



> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> Sent: Wednesday, February 14, 2018 12:35 PM
> To: Richardson, Bruce <bruce.richardson@intel.com>
> Cc: Yongseok Koh <yskoh@mellanox.com>; Olivier Matz <olivier.matz@6wind.com>; dev@dpdk.org
> Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> 
> 
> 
> > -----Original Message-----
> > From: Richardson, Bruce
> > Sent: Wednesday, February 14, 2018 12:12 PM
> > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Cc: Yongseok Koh <yskoh@mellanox.com>; Olivier Matz <olivier.matz@6wind.com>; dev@dpdk.org
> > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> >
> > On Wed, Feb 14, 2018 at 12:03:55PM +0000, Ananyev, Konstantin wrote:
> > >
> > >
> > > > -----Original Message-----
> > > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ananyev, Konstantin
> > > > Sent: Wednesday, February 14, 2018 11:48 AM
> > > > To: Yongseok Koh <yskoh@mellanox.com>; Olivier Matz <olivier.matz@6wind.com>
> > > > Cc: dev@dpdk.org
> > > > Subject: Re: [dpdk-dev] Accessing 2nd cacheline in rte_pktmbuf_prefree_seg()
> > > >
> > > > Hi Yongseok,
> > > >
> > > > > > On Feb 13, 2018, at 2:45 PM, Yongseok Koh <yskoh@mellanox.com> wrote:
> > > > > >
> > > > > > Hi Olivier
> > > > > >
> > > > > > I'm wondering why rte_pktmbuf_prefree_seg() checks m->next instead of
> > > > > > m->nb_segs? As 'next' is in the 2nd cacheline, checking nb_segs seems beneficial
> > > > > > to the cases where almost all mbufs have a single segment.
> > > > > >
> > > > > > A customer reported high rate of cache misses in the code and I thought the
> > > > > > following patch could be helpful. I haven't had them try it yet but just wanted
> > > > > > to hear from you.
> > > > > >
> > > > > > I'd appreciate if you can review this idea.
> > > > > >
> > > > > > diff --git a/lib/librte_mbuf/rte_mbuf.h b/lib/librte_mbuf/rte_mbuf.h
> > > > > > index 62740254d..96edbcb9e 100644
> > > > > > --- a/lib/librte_mbuf/rte_mbuf.h
> > > > > > +++ b/lib/librte_mbuf/rte_mbuf.h
> > > > > > @@ -1398,7 +1398,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > > >                if (RTE_MBUF_INDIRECT(m))
> > > > > >                        rte_pktmbuf_detach(m);
> > > > > >
> > > > > > -               if (m->next != NULL) {
> > > > > > +               if (m->nb_segs > 1) {
> > > > > >                        m->next = NULL;
> > > > > >                        m->nb_segs = 1;
> > > > > >                }
> > > > > > @@ -1410,7 +1410,7 @@ rte_pktmbuf_prefree_seg(struct rte_mbuf *m)
> > > > > >                if (RTE_MBUF_INDIRECT(m))
> > > > > >                        rte_pktmbuf_detach(m);
> > > > > >
> > > > > > -               if (m->next != NULL) {
> > > > > > +               if (m->nb_segs > 1) {
> > > > > >                        m->next = NULL;
> > > > > >                        m->nb_segs = 1;
> > > > > >                }
> > > > >
> > > > > Well, m->pool in the 2nd cacheline has to be accessed anyway in order to put it back to the mempool.
> > > > > It looks like the cache miss is unavoidable.
> > > >
> > > > As a thought: in theory PMD can store pool pointer together with each mbuf it has to free,
> > > > then it could be something like:
> > > >
> > > > if (rte_pktmbuf_prefree_seg(m[x]) != NULL)
> > > >    rte_mempool_put(pool[x], m[x]);
> > > >
> > > > Then what you suggested above might help.
> > >
> > > After another thought - we have to check m->next, not m->nb_segs.
> > > There could be situations where nb_segs == 1 but m->next != NULL
> > > (the 2nd segment of a 3-segment packet, for example).
> > > So probably we have to keep it as it is.
> > > Sorry for the noise
> > > Konstantin
> >
> > It's still worth considering as an option. We could check nb_segs for
> > the first segment of a packet and thereafter iterate using the next
> > pointer.
> 
> In the multi-seg case the PMD frees segments (not packets).
> It could happen that the first segment is already freed while the second
> is not yet.
> 
> > It means that your idea of storing the pool pointer for each
> > mbuf becomes useful for single-segment packets.
> 
> But then we'll have to support 2 different flavors of prefree_seg().
> An alternative would be to change multi-seg TX in all PMDs so that when the first
> segment is going to be freed we update nb_segs for the second, and so on.
> Both options seem like too much hassle.
> 
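
To make the objection quoted above concrete (nb_segs == 1 while m->next != NULL
on a middle segment), here is a minimal sketch. It assumes a hypothetical
stand-in struct, seg_model, that carries only the two fields under discussion;
it is not the real struct rte_mbuf, and the helper names are made up for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct rte_mbuf: only the two fields discussed. */
struct seg_model {
        struct seg_model *next;
        uint16_t nb_segs;
};

/*
 * Lay out segs[0] -> segs[1] -> segs[2] as the thread describes a
 * 3-segment packet: the head carries the total, later segments carry 1.
 */
static void
seg_model_build_chain3(struct seg_model segs[3])
{
        assert(segs != NULL);
        segs[0].next = &segs[1];
        segs[0].nb_segs = 3;
        segs[1].next = &segs[2];
        segs[1].nb_segs = 1;
        segs[2].next = NULL;
        segs[2].nb_segs = 1;
}

/* The check proposed in the patch. */
static int
seg_check_nb_segs(const struct seg_model *m)
{
        return m->nb_segs > 1;
}

/* The check rte_pktmbuf_prefree_seg() uses today. */
static int
seg_check_next(const struct seg_model *m)
{
        return m->next != NULL;
}
```

On the middle segment the two checks disagree: seg_check_nb_segs() reports
nothing to reset while seg_check_next() still sees a dangling next pointer.
That is the case that would be missed if that segment were freed on its own.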

As a side thought, here is what could probably be done to minimize access
to the mbuf's 2nd cache line at PMD TX free.
Introduce something like this:
static __rte_always_inline struct rte_mempool *
xxx_prefree_seg(struct rte_mbuf *m)
{
        if (rte_mbuf_refcnt_read(m) == 1 && RTE_MBUF_DIRECT(m)) {
                if (m->next != NULL) {
                        m->next = NULL;
                        m->nb_segs = 1;
                }
                return m->pool;
        }
        return NULL;
}

Then in tx_burst(), before doing the actual TX, the PMD can call that function
and store its return value along with the mbuf:
...
m[x] = pkt;
pool[x] = xxx_prefree_seg(m[x]);

Then at free time, we can do something like:
if (pool[x] != NULL)
        rte_mempool_put(pool[x], m[x]);
else
        rte_pktmbuf_free_seg(m[x]);

We still access m->next, but we do so before the actual TX is done,
when there is hopefully a better chance that m->next is still in the cache.
In theory, that might help for the most common case, where
we have direct mbufs with refcnt == 1.
For indirect mbufs or mbufs with refcnt > 1 there would be extra overhead, though.
Konstantin
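
The flow described above can be modelled end to end with a small toy, given
here only as a sketch under stated assumptions: toy_mbuf, toy_pool and
toy_txq_entry are hypothetical stand-ins (not DPDK types), a counter
increment stands in for rte_mempool_put(), and a second counter stands in for
the rte_pktmbuf_free_seg() slow path.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical pool: just counts how many mbufs came back on the fast path. */
struct toy_pool {
        unsigned int nb_put;
};

/* Hypothetical mbuf carrying only the fields the proposal touches. */
struct toy_mbuf {
        struct toy_mbuf *next;
        struct toy_pool *pool;
        uint16_t refcnt;
        uint16_t nb_segs;
        int direct;             /* 1 = direct mbuf, 0 = indirect */
};

/* One TX ring slot: the mbuf plus the pool pointer cached at tx_burst() time. */
struct toy_txq_entry {
        struct toy_mbuf *m;
        struct toy_pool *pool;
};

/* Mirrors xxx_prefree_seg() from the mail, on the toy types. */
static inline struct toy_pool *
toy_prefree_seg(struct toy_mbuf *m)
{
        if (m->refcnt == 1 && m->direct) {
                if (m->next != NULL) {
                        m->next = NULL;
                        m->nb_segs = 1;
                }
                return m->pool;
        }
        return NULL;
}

/*
 * tx_burst() side. Note that the helper clears m->next, so a caller walking
 * a multi-segment chain must read m->next before storing the segment.
 */
static void
toy_tx_store(struct toy_txq_entry *e, struct toy_mbuf *m)
{
        assert(e != NULL && m != NULL);
        e->m = m;
        e->pool = toy_prefree_seg(m);
}

/* Free side: returns 1 on the fast path, 0 when falling back to the slow path. */
static int
toy_tx_free(struct toy_txq_entry *e, unsigned int *slow_frees)
{
        if (e->pool != NULL) {
                e->pool->nb_put++;      /* stands in for rte_mempool_put() */
                return 1;
        }
        (*slow_frees)++;                /* stands in for rte_pktmbuf_free_seg() */
        return 0;
}
```

In this model the free side never dereferences the mbuf on the fast path; it
only reads the ring slot. A shared mbuf (refcnt > 1) or an indirect one takes
the slow path, which is the extra overhead mentioned above.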


Thread overview: 7+ messages
2018-02-13 22:45 Yongseok Koh
2018-02-14  3:16 ` Yongseok Koh
2018-02-14 11:48   ` Ananyev, Konstantin
2018-02-14 12:03     ` Ananyev, Konstantin
2018-02-14 12:11       ` Bruce Richardson
2018-02-14 12:35         ` Ananyev, Konstantin
2018-02-14 14:16           ` Ananyev, Konstantin [this message]
