DPDK patches and discussions
 help / color / mirror / Atom feed
From: Bruce Richardson <bruce.richardson@intel.com>
To: Jay Rolette <rolette@infinite.io>
Cc: DPDK <dev@dpdk.org>, Andriy Berestovskyy <aber@semihalf.com>
Subject: Re: [dpdk-dev] Couple of PMD questions
Date: Thu, 21 Apr 2016 11:54:04 +0100	[thread overview]
Message-ID: <20160421105404.GA11224@bricha3-MOBL3> (raw)
In-Reply-To: <CADNuJVqZScK038LibZkWLQNj8FvdjqLjVZ4bv3oY4=mYTeOF+w@mail.gmail.com>

On Wed, Apr 20, 2016 at 10:47:59AM -0500, Jay Rolette wrote:
> On Wed, Apr 20, 2016 at 9:05 AM, Bruce Richardson <
> bruce.richardson@intel.com> wrote:
> 
> > On Wed, Apr 20, 2016 at 07:22:57AM -0500, Jay Rolette wrote:
> > > On Wed, Apr 20, 2016 at 4:10 AM, Bruce Richardson <
> > > bruce.richardson@intel.com> wrote:
> > >
> > > > On Tue, Apr 19, 2016 at 03:16:37PM -0500, Jay Rolette wrote:
> > > > > In ixgbe_dev_rx_init(), there is this bit of code:
> > > > >
> > > > >       /*
> > > > >        * Configure the RX buffer size in the BSIZEPACKET field of
> > > > >        * the SRRCTL register of the queue.
> > > > >        * The value is in 1 KB resolution. Valid values can be from
> > > > >        * 1 KB to 16 KB.
> > > > >        */
> > > > >       buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool)
> > -
> > > > >               RTE_PKTMBUF_HEADROOM);
> > > > >       srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
> > > > >                  IXGBE_SRRCTL_BSIZEPKT_MASK);
> > > > >
> > > > >       IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);
> > > > >
> > > > >       buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK) <<
> > > > >                              IXGBE_SRRCTL_BSIZEPKT_SHIFT);
> > > > >
> > > > >       /* It adds dual VLAN length for supporting dual VLAN */
> > > > >       if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
> > > > >                                   2 * IXGBE_VLAN_TAG_SIZE > buf_size)
> > > > >               dev->data->scattered_rx = 1;
> > > > >
> > > > >
> > > > > Couple of questions I have about it:
> > > > >
> > > > > 1) If the caller configured the MBUF memory pool data room size to be
> > > > > something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then
> > the
> > > > > driver ends up silently programming the NIC to use a smaller max RX
> > size
> > > > > than the caller specified.
> > > > >
> > > > > Should the driver error out in that case instead of only "sort of"
> > > > working?
> > > > > If not, it seems like it should be logging an error message.
> > > >
> > > > Well, it does log at the end of the whole rx init process what RX
> > function
> > > > is
> > > > being used, so there is that.
> > > > However, looking at it now, given that there is an explicit setting in
> > the
> > > > config
> > > > to request scattered mode, I think you may be right and that we should
> > > > error out
> > > > here if scattered mode is needed and not set. It could help prevent
> > > > unexpected
> > > > problems with these drivers.
> > > >
> > >
> > > +1. A hard fail at init if I've misconfigured things is much preferred to
> > > something that mostly works for typical test cases and only fails on
> > > corner/limit testing.
> > >
> > >
> > > > > 2) Second question is about the "/* It adds dual VLAN length for
> > > > supporting
> > > > > dual VLAN */" bit...
> > > > >
> > > > > As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be
> > set
> > > > to
> > > > > the max frame size you support (although it says it's only used if
> > jumbo
> > > > > frame is enabled). That same value is generally used when
> > calculating the
> > > > > size that mbuf elements should be created at.
> > > > >
> > > > > The description for the data_room_size parameter of
> > > > > rte_pktmbuf_pool_create():
> > > > >
> > > > > "Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."
> > > > >
> > > > >
> > > > > If I support a max frame size of 9216 bytes (exactly a 1K multiple to
> > > > make
> > > > > the NIC happy), then max_rx_pkt_len is going to be 9216 and
> > > > data_room_size
> > > > > will be 9216 + RTE_PKTMBUF_HEADROOM.
> > > > >
> > > > > ixgbe_dev_rx_init() will calculate normalize that back to 9216, which
> > > > will
> > > > > fail the dual VLAN length check. The really nasty part about that is
> > it
> > > > has
> > > > > a side-effect of enabling scattered RX regardless of the fact that I
> > > > didn't
> > > > > enable scattered RX in dev_conf.rxmode.
> > > > >
> > > > > The code in the e1000 PMD is similar, so nothing unique to ixgbe.
> > > > >
> > > > > Is that check correct? It seems wrong to be adding space for q-in-q
> > on
> > > > top
> > > > > of your specified max frame size...
> > > >
> > > > The issue here is what the correct behaviour needs to be. If we have
> > the
> > > > user
> > > > specify the maximum frame size including all vlan tags, then we hit the
> > > > problem
> > > > where we have to subtract the VLAN tag sizes when writing the value to
> > the
> > > > NIC.
> > > > In that case, we will hit a problem where we get a e.g. 9210 byte
> > frame -
> > > > to
> > > > reuse your example - without any VLAN tags, which will be rejected by
> > the
> > > > hardware as being oversized. If we don't do the subtraction, and we
> > get the
> > > > same 9210 byte packet with 2 VLAN tags on it, the hardware will accept
> > it
> > > > and
> > > > then split it across multiple descriptors because the actual DMA size
> > is
> > > > 9218 bytes.
> > > >
> > >
> > > As an app developer, I didn't realize the max frame size didn't include
> > > VLAN tags. I expected max frame size to be the size of the ethernet frame
> > > on the wire, which I would expect to include space used by any VLAN or
> > MPLS
> > > tags.
> > >
> > > Is there anything in the docs or example apps about that? I did some
> > > digging as I was debugging this and didn't notice it, but entirely
> > possible
> > > I just missed it.
> > >
> > >
> > > > I'm not sure there is a works-in-all-cases solution here.
> > > >
> > >
> > > Andriy's suggestion seems like it points in the right direction.
> > >
> > > From an app developer point of view, I'd expect to have a single max
> > frame
> > > size value to track and the APIs should take care of any adjustments
> > > required internally. Maybe have rte_pktmbuf_pool_create() add the
> > > additional bytes when it calls rte_mempool_create() under the covers?
> > Then
> > > it's nice and clean for the API without unexpected side-effects.
> > >
> >
> > It will still have unintended side-effects I think, depending on the
> > resolution
> > of the NIC buffer length paramters. For drivers like ixgbe or e1000, the
> > mempool
> > create call could potentially have to add an additional 1k to each buffer
> > just
> > to be able to store the extra eight bytes.
> 
> 
> The comments in the ixgbe driver say that the value programmed into SRRCTL
> must be on a 1K boundary. Based on your previous response, it sounded like
> the NIC ignores that limit for VLAN tags, hence the check for the extra 8
> bytes on the mbuf element size. Are you worried about the size resolution
> on mempool elements?
> 
> Sounds like I've got to go spend some quality time in the NIC data
> sheets... Maybe I should back up and just ask the higher level question:
> 
> What's the right incantation in both the dev_conf structure and in creating
> the mbuf pool to support jumbo frames of some particular size on the wire,
> with or without VLAN tags, without requiring scattered_rx support in an app?
> 
I think that the answer to that may be that it's hard to do, or impossible,
depending on the NIC. From what I can see from the datasheet for some of the
ixgbe NICs, the max-frame-size (MAXFRS) field that you specify always excludes
VLAN tags. Therefore you can't tell the NIC to accept a max size of X irrespective
of VLAN tags. The max size will always vary depending on the tags.

/Bruce

  parent reply	other threads:[~2016-04-21 10:54 UTC|newest]

Thread overview: 9+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-04-19 20:16 Jay Rolette
2016-04-20  9:10 ` Bruce Richardson
2016-04-20 12:22   ` Jay Rolette
2016-04-20 14:05     ` Bruce Richardson
2016-04-20 15:47       ` Jay Rolette
2016-04-21  9:13         ` Andriy Berestovskyy
2016-04-21 10:54         ` Bruce Richardson [this message]
2016-04-20  9:35 ` Andriy Berestovskyy
2016-04-20 12:45   ` Jay Rolette

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160421105404.GA11224@bricha3-MOBL3 \
    --to=bruce.richardson@intel.com \
    --cc=aber@semihalf.com \
    --cc=dev@dpdk.org \
    --cc=rolette@infinite.io \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).