From: Bruce Richardson <bruce.richardson@intel.com>
To: Jay Rolette <rolette@infinite.io>
Cc: DPDK <dev@dpdk.org>
Subject: Re: [dpdk-dev] Couple of PMD questions
Date: Wed, 20 Apr 2016 15:05:54 +0100 [thread overview]
Message-ID: <20160420140554.GA11780@bricha3-MOBL3> (raw)
In-Reply-To: <CADNuJVr_9+u6_F5ALMnm+8eC=sSRd06+L2P=U+UPPM_42FGAnQ@mail.gmail.com>
On Wed, Apr 20, 2016 at 07:22:57AM -0500, Jay Rolette wrote:
> On Wed, Apr 20, 2016 at 4:10 AM, Bruce Richardson <
> bruce.richardson@intel.com> wrote:
>
> > On Tue, Apr 19, 2016 at 03:16:37PM -0500, Jay Rolette wrote:
> > > In ixgbe_dev_rx_init(), there is this bit of code:
> > >
> > > /*
> > > * Configure the RX buffer size in the BSIZEPACKET field of
> > > * the SRRCTL register of the queue.
> > > * The value is in 1 KB resolution. Valid values can be from
> > > * 1 KB to 16 KB.
> > > */
> > > buf_size = (uint16_t)(rte_pktmbuf_data_room_size(rxq->mb_pool) -
> > > RTE_PKTMBUF_HEADROOM);
> > > srrctl |= ((buf_size >> IXGBE_SRRCTL_BSIZEPKT_SHIFT) &
> > > IXGBE_SRRCTL_BSIZEPKT_MASK);
> > >
> > > IXGBE_WRITE_REG(hw, IXGBE_SRRCTL(rxq->reg_idx), srrctl);
> > >
> > > buf_size = (uint16_t) ((srrctl & IXGBE_SRRCTL_BSIZEPKT_MASK) <<
> > > IXGBE_SRRCTL_BSIZEPKT_SHIFT);
> > >
> > > /* It adds dual VLAN length for supporting dual VLAN */
> > > if (dev->data->dev_conf.rxmode.max_rx_pkt_len +
> > > 2 * IXGBE_VLAN_TAG_SIZE > buf_size)
> > > dev->data->scattered_rx = 1;
> > >
> > >
> > > Couple of questions I have about it:
> > >
> > > 1) If the caller configured the MBUF memory pool data room size to be
> > > something other than a multiple of 1K (+ RTE_PKTMBUF_HEADROOM), then the
> > > driver ends up silently programming the NIC to use a smaller max RX size
> > > than the caller specified.
> > >
> > > Should the driver error out in that case instead of only "sort of"
> > working?
> > > If not, it seems like it should be logging an error message.
> >
> > Well, it does log at the end of the whole rx init process what RX function
> > is
> > being used, so there is that.
> > However, looking at it now, given that there is an explicit setting in the
> > config
> > to request scattered mode, I think you may be right and that we should
> > error out
> > here if scattered mode is needed and not set. It could help prevent
> > unexpected
> > problems with these drivers.
> >
>
> +1. A hard fail at init if I've misconfigured things is much preferred to
> something that mostly works for typical test cases and only fails on
> corner/limit testing.
>
>
> > > 2) Second question is about the "/* It adds dual VLAN length for
> > supporting
> > > dual VLAN */" bit...
> > >
> > > As I understand it, dev_conf.rxmode.max_rx_pkt_len is supposed to be set
> > to
> > > the max frame size you support (although it says it's only used if jumbo
> > > frame is enabled). That same value is generally used when calculating the
> > > size that mbuf elements should be created at.
> > >
> > > The description for the data_room_size parameter of
> > > rte_pktmbuf_pool_create():
> > >
> > > "Size of data buffer in each mbuf, including RTE_PKTMBUF_HEADROOM."
> > >
> > >
> > > If I support a max frame size of 9216 bytes (exactly a 1K multiple to
> > make
> > > the NIC happy), then max_rx_pkt_len is going to be 9216 and
> > data_room_size
> > > will be 9216 + RTE_PKTMBUF_HEADROOM.
> > >
> > > ixgbe_dev_rx_init() will calculate normalize that back to 9216, which
> > will
> > > fail the dual VLAN length check. The really nasty part about that is it
> > has
> > > a side-effect of enabling scattered RX regardless of the fact that I
> > didn't
> > > enable scattered RX in dev_conf.rxmode.
> > >
> > > The code in the e1000 PMD is similar, so nothing unique to ixgbe.
> > >
> > > Is that check correct? It seems wrong to be adding space for q-in-q on
> > top
> > > of your specified max frame size...
> >
> > The issue here is what the correct behaviour needs to be. If we have the
> > user
> > specify the maximum frame size including all vlan tags, then we hit the
> > problem
> > where we have to subtract the VLAN tag sizes when writing the value to the
> > NIC.
> > In that case, we will hit a problem where we get a e.g. 9210 byte frame -
> > to
> > reuse your example - without any VLAN tags, which will be rejected by the
> > hardware as being oversized. If we don't do the subtraction, and we get the
> > same 9210 byte packet with 2 VLAN tags on it, the hardware will accept it
> > and
> > then split it across multiple descriptors because the actual DMA size is
> > 9218 bytes.
> >
>
> As an app developer, I didn't realize the max frame size didn't include
> VLAN tags. I expected max frame size to be the size of the ethernet frame
> on the wire, which I would expect to include space used by any VLAN or MPLS
> tags.
>
> Is there anything in the docs or example apps about that? I did some
> digging as I was debugging this and didn't notice it, but entirely possible
> I just missed it.
>
>
> > I'm not sure there is a works-in-all-cases solution here.
> >
>
> Andriy's suggestion seems like it points in the right direction.
>
> From an app developer point of view, I'd expect to have a single max frame
> size value to track and the APIs should take care of any adjustments
> required internally. Maybe have rte_pktmbuf_pool_create() add the
> additional bytes when it calls rte_mempool_create() under the covers? Then
> it's nice and clean for the API without unexpected side-effects.
>
It will still have unintended side-effects I think, depending on the resolution
of the NIC buffer length paramters. For drivers like ixgbe or e1000, the mempool
create call could potentially have to add an additional 1k to each buffer just
to be able to store the extra eight bytes.
/Bruce
next prev parent reply other threads:[~2016-04-20 14:07 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-04-19 20:16 Jay Rolette
2016-04-20 9:10 ` Bruce Richardson
2016-04-20 12:22 ` Jay Rolette
2016-04-20 14:05 ` Bruce Richardson [this message]
2016-04-20 15:47 ` Jay Rolette
2016-04-21 9:13 ` Andriy Berestovskyy
2016-04-21 10:54 ` Bruce Richardson
2016-04-20 9:35 ` Andriy Berestovskyy
2016-04-20 12:45 ` Jay Rolette
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20160420140554.GA11780@bricha3-MOBL3 \
--to=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=rolette@infinite.io \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).