DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Richardson, Bruce" <bruce.richardson@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v2] ixgbe: don't override mbuf buffer length
Date: Fri, 5 Dec 2014 11:19:11 +0000	[thread overview]
Message-ID: <2601191342CEEE43887BDE71AB977258213BCEE2@IRSMSX105.ger.corp.intel.com> (raw)
In-Reply-To: <20141205103857.GA8184@bricha3-MOBL3>



> -----Original Message-----
> From: Richardson, Bruce
> Sent: Friday, December 05, 2014 10:39 AM
> To: Ananyev, Konstantin
> Cc: Jean-Mickael Guerin; dev@dpdk.org
> Subject: Re: [PATCH v2] ixgbe: don't override mbuf buffer length
> 
> On Fri, Dec 05, 2014 at 01:10:28AM +0000, Ananyev, Konstantin wrote:
> >
> >
> > > -----Original Message-----
> > > From: Jean-Mickael Guerin [mailto:jean-mickael.guerin@6wind.com]
> > > Sent: Thursday, December 04, 2014 6:09 PM
> > > To: dev@dpdk.org
> > > Cc: Richardson, Bruce; Ananyev, Konstantin
> > > Subject: [PATCH v2] ixgbe: don't override mbuf buffer length
> > >
> > > The template mbuf_initializer is hard coded with a buflen which
> > > might have been set differently by the application at the time of
> > > mbuf pool creation.
> > >
> > > Switch to a mbuf allocation, to fetch the correct default values.
> > > There is no performance impact because this is not a data-plane API.
> > >
> > > Signed-off-by: Jean-Mickael Guerin <jean-mickael.guerin@6wind.com>
> > > Acked-by: David Marchand <david.marchand@6wind.com>
> > > Fixes: 0ff3324da2 ("ixgbe: rework vector pmd following mbuf changes")
> > > ---
> > >
> > >  v2: check returned value of ixgbe_rxq_vec_setup
> > >
> > >  lib/librte_pmd_ixgbe/ixgbe_rxtx.c     |  5 ++++-
> > >  lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c | 19 ++++++++++++-------
> > >  2 files changed, 16 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > index 5c36bff..7994da1 100644
> > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> > > @@ -2244,7 +2244,10 @@ ixgbe_dev_rx_queue_setup(struct rte_eth_dev *dev,
> > >  	use_def_burst_func = check_rx_burst_bulk_alloc_preconditions(rxq);
> > >
> > >  #ifdef RTE_IXGBE_INC_VECTOR
> > > -	ixgbe_rxq_vec_setup(rxq);
> > > +	if (ixgbe_rxq_vec_setup(rxq) < 0) {
> > > +		ixgbe_rx_queue_release(rxq);
> > > +		return (-ENOMEM);
> > > +	}
> > >  #endif
> > >  	/* Check if pre-conditions are satisfied, and no Scattered Rx */
> > >  	if (!use_def_burst_func && !dev->data->scattered_rx) {
> > > diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > index c1b5a78..f7b02f5 100644
> > > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > > @@ -732,17 +732,22 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> > >  int
> > >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> > >  {
> > > -	struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> > > +	struct rte_mbuf *mb_def;
> > >
> > > -	mb_def.nb_segs = 1;
> > > -	mb_def.data_off = RTE_PKTMBUF_HEADROOM;
> > > -	mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> > > -	mb_def.port = rxq->port_id;
> > > -	rte_mbuf_refcnt_set(&mb_def, 1);
> > > +	mb_def = rte_pktmbuf_alloc(rxq->mb_pool);
> > > +	if (mb_def == NULL) {
> > > +		PMD_INIT_LOG(ERR, "ixgbe_rxq_vec_setup: could not allocate one mbuf");
> > > +		return -1;
> > > +	}
> > > +	/* nb_segs, refcnt, data_off and buf_len are already set */
> > > +	mb_def->port = rxq->port_id;
> > >
> > >  	/* prevent compiler reordering: rearm_data covers previous fields */
> > >  	rte_compiler_barrier();
> > > -	rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> > > +	rxq->mbuf_initializer = *((uint64_t *)&mb_def->rearm_data);
> > > +
> > > +	rte_pktmbuf_free(mb_def);
> > > +
> > >  	return 0;
> > >  }
> > >
> > > --
> > > 2.1.3
> >
> > As I said in another mail, I don't think it is a proper fix.
> > What we did here - just changed one assumption to another.
> > Current assumption - all mbuf obj_init() would setup buf_len in exactly the same manner as  rte_pktmbuf_init() does:
> > buf_len = mp->elt_size - sizeof(struct rte_mbuf);
> > New assumption - all mbuf obj_init() would setup buf_len for all mbufs in the pool to the same value.
> > Both assumptions, I believe, are not always correct.
> > Though, probably the new one would be true more often.
> >
> > I still think the proper fix is not to update mbuf's buf_len field at ixgbe_rxq_rearm() at all.
> > We should just leave the original value unmodified.
> > Actually, while looking at ixgbe_rxq_rearm(), I don't see any reason why we need to update buf_len field.
> > It is not the data that need to be rearmed.
> > The fields that need to be rearmed are:
> > uint16_t data_off;
> > uint16_t refcnt
> > uint8_t nb_segs;
> > uint8_t port;
> >
> > 6B in total.
> > We probably would like to keep rearming as one 64bit load/store.
> > Though straight below them we have:
> > uint64_t ol_flags;
> >
> > As RX fully override ol_flags anyway, we can safely overwrite first 2B of it.
> > That would allow us to still read/write whole 64bits and avoid overwriting buf_len.
> > I am talking about something like patch  below.
> > I admit that it looks not so pretty, but I think it is much safer and correct.
> > Konstantin
> 
> I just don't see this as worthwhile doing. We are looking here at an mbuf pool
> which is to be used for packet buffers for RX. If the packet buffers in that
> pool are all of different sizes then we need to go back and look at other places
> throughout the code too. For instance, we query the buffer length of the mbuf
> pool when initializing the RX queues to determine if we need to enable scattered
> RX. If the mbufs in the pool can potentially be of different sizes, we need to
> turn off the no-scattered-packets optimization and always use the scatter
> packets code path - because the assumption that buffers don't get resized could
> also be false. Similarly here, if the mbufs are going to be of different
> sizes then the user should disable the vector PMD, and use RX code that doesn't
> override the buf_len each time.

Yes, all buffers in the pool supposed to be not smaller than some threshold.
But it doesn't mean that buffer can't be bigger.
Let say our usual case - buf_len = 2K.
Why it should be prohibited to use for RX mbuf with buf_len == 4K?
There is absolutely nothing wrong with it. 

> 
> Furthermore, we still don't have an actual use-case where the user would want to
> have different size mbufs in an mbuf pool used for RX. They can still have
> variable-sized mbufs in other pools, but having all buffers the same size in the
> pool used to receive packets seems a perfectly fair restriction to have. If
> someone has an app that they are creating that needs this functionality, I'll
> reconsider this opinion, but right now this is all theoretical.

Well, the question is why do we need to override buf_len at all?
At first place, it doesn't look correct.
Second - what advantage we are gaining from it?
Performance?
I tried with the changes below and didn't see any performance difference at all. 
So what is the point in keeping that potential point of failure?

About use cases - I think I gave two examples in my previous mail.
Not sure why you consider them artificial.
>From other side, do we have right now any actual use-case where people setup buf_len to something
different then: mp->elt_size - sizeof(struct rte_mbuf)?

Konstantin

> 
> Regards,
> /Bruce
> 
> >
> > --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx_vec.c
> > @@ -79,13 +79,19 @@ ixgbe_rxq_rearm(struct igb_rx_queue *rxq)
> >         /* Initialize the mbufs in vector, process 2 mbufs in one loop */
> >         for (i = 0; i < RTE_IXGBE_RXQ_REARM_THRESH; i += 2, rxep += 2) {
> >                 __m128i vaddr0, vaddr1;
> > +               uintptr_t p0, p1;
> >
> >                 mb0 = rxep[0].mbuf;
> >                 mb1 = rxep[1].mbuf;
> >
> >                 /* flush mbuf with pkt template */
> > -               mb0->rearm_data[0] = rxq->mbuf_initializer;
> > -               mb1->rearm_data[0] = rxq->mbuf_initializer;
> > +               p0 = (uintptr_t)&mb0->data_off;
> > +               *(uint64_t *)p0 = rxq->mbuf_initializer;
> > +               p1 = (uintptr_t)&mb1->data_off;
> > +               *(uint64_t *)p1 = rxq->mbuf_initializer;
> > +
> > +               //mb0->rearm_data[0] = rxq->mbuf_initializer;
> > +               //mb1->rearm_data[0] = rxq->mbuf_initializer;
> >
> >                 /* load buf_addr(lo 64bit) and buf_physaddr(hi 64bit) */
> >                 vaddr0 = _mm_loadu_si128((__m128i *)&(mb0->buf_addr));
> > @@ -732,6 +738,7 @@ static struct ixgbe_txq_ops vec_txq_ops = {
> >  int
> >  ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> >  {
> > +       uintptr_t p;
> >         struct rte_mbuf mb_def = { .buf_addr = 0 }; /* zeroed mbuf */
> >
> >         mb_def.nb_segs = 1;
> > @@ -739,7 +746,8 @@ ixgbe_rxq_vec_setup(struct igb_rx_queue *rxq)
> >         mb_def.buf_len = rxq->mb_pool->elt_size - sizeof(struct rte_mbuf);
> >         mb_def.port = rxq->port_id;
> >         rte_mbuf_refcnt_set(&mb_def, 1);
> > -       rxq->mbuf_initializer = *((uint64_t *)&mb_def.rearm_data);
> > +       p = (uintptr_t)&mb_def.data_off;
> > +       rxq->mbuf_initializer = *(uint64_t *)p;
> >         return 0;
> >  }
> >
> >

  reply	other threads:[~2014-12-05 11:19 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-12-04 18:09 Jean-Mickael Guerin
2014-12-05  1:10 ` Ananyev, Konstantin
2014-12-05 10:38   ` Bruce Richardson
2014-12-05 11:19     ` Ananyev, Konstantin [this message]
2014-12-05 11:28       ` Bruce Richardson
2014-12-05 11:59         ` Ananyev, Konstantin
2014-12-05 12:10           ` Bruce Richardson
2014-12-05 15:23             ` Ananyev, Konstantin
2014-12-05 10:40 ` Bruce Richardson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2601191342CEEE43887BDE71AB977258213BCEE2@IRSMSX105.ger.corp.intel.com \
    --to=konstantin.ananyev@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).