From: Stephen Hemminger <stephen@networkplumber.org>
To: "Ouyang, Changchun" <changchun.ouyang@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
Date: Mon, 20 Jul 2015 08:47:17 -0700 [thread overview]
Message-ID: <20150720084717.7c02ce8f@urahara> (raw)
In-Reply-To: <F52918179C57134FAEC9EA62FA2F962511C0803B@shsmsx102.ccr.corp.intel.com>
On Sat, 18 Jul 2015 12:11:11 +0000
"Ouyang, Changchun" <changchun.ouyang@intel.com> wrote:
> Hi Stephen,
>
> > -----Original Message-----
> > From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> > Sent: Saturday, July 18, 2015 12:28 AM
> > To: Ouyang, Changchun
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] virtio: fix the vq size issue
> >
> > On Wed, 1 Jul 2015 15:48:50 +0800
> > Ouyang Changchun <changchun.ouyang@intel.com> wrote:
> >
> > > This commit breaks virtio basic packets rx functionality:
> > > d78deadae4dca240e85054bf2d604a801676becc
> > >
> > > The QEMU use 256 as default vring size, also use this default value to
> > > calculate the virtio avail ring base address and used ring base
> > > address, and vhost in the backend use the ring base address to do packet
> > IO.
> > >
> > > Virtio spec also says the queue size in PCI configuration is
> > > read-only, so virtio front end can't change it. just need use the
> > > read-only value to allocate space for vring and calculate the avail
> > > and used ring base address. Otherwise, the avail and used ring base
> > address will be different between host and guest, accordingly, packet IO
> > can't work normally.
> > >
> > > Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
> > > ---
> > > drivers/net/virtio/virtio_ethdev.c | 14 +++-----------
> > > 1 file changed, 3 insertions(+), 11 deletions(-)
> > >
> > > diff --git a/drivers/net/virtio/virtio_ethdev.c
> > > b/drivers/net/virtio/virtio_ethdev.c
> > > index fe5f9a1..d84de13 100644
> > > --- a/drivers/net/virtio/virtio_ethdev.c
> > > +++ b/drivers/net/virtio/virtio_ethdev.c
> > > @@ -263,8 +263,6 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> > *dev,
> > > */
> > > vq_size = VIRTIO_READ_REG_2(hw, VIRTIO_PCI_QUEUE_NUM);
> > > PMD_INIT_LOG(DEBUG, "vq_size: %d nb_desc:%d", vq_size,
> > nb_desc);
> > > - if (nb_desc == 0)
> > > - nb_desc = vq_size;
> >
> > command queue is setup with nb_desc = 0
>
> nb_desc is not used in the rest of the function, then why we need such an assignment here?
> Why command queues is setup whit nb_desc = 0?
> Even if it is the case, what the code change break?
>
> >
> > > if (vq_size == 0) {
> > > PMD_INIT_LOG(ERR, "%s: virtqueue does not exist",
> > __func__);
> > > return -EINVAL;
> > > @@ -275,15 +273,9 @@ int virtio_dev_queue_setup(struct rte_eth_dev
> > *dev,
> > > return -EINVAL;
> > > }
> > >
> > > - if (nb_desc < vq_size) {
> > > - if (!rte_is_power_of_2(nb_desc)) {
> > > - PMD_INIT_LOG(ERR,
> > > - "nb_desc(%u) size is not powerof 2",
> > > - nb_desc);
> > > - return -EINVAL;
> > > - }
> > > - vq_size = nb_desc;
> > > - }
> > > + if (nb_desc != vq_size)
> > > + PMD_INIT_LOG(ERR, "Warning: nb_desc(%d) is not equal to
> > vq size (%d), fall to vq size",
> > > + nb_desc, vq_size);
> >
> > Nack. This breaks onn Google Compute Engine the vring size is 16K.
>
>
> As I mentioned before, the commit d78deadae4dca240e85054bf2d604a801676becc break the basic functionality of virtio pmd,
> I don't think keeping it broken is good way for us.
> We have to revert it firstly to recover its functionality on qemu!
> Why we need break current functionality to just meet a new thing's requirement?
>
> >
> > An application that wants to work on both QEMU and GCE will want to pass a
> > reasonable size and have the negotiation resolve to best value.
>
> Do you have already a patch to revert the mistaken and support both qemu and gce?
> If you have, then pls send out it, and let's review.
>
> >
> > For example, vRouter passes 512 as Rx ring size.
> > On QEMU this gets rounded down to 256 and on GCE only 512 elements are
> > used.
> >
> > This is what the Linux kernel virtio does.
The part in dev_queue_setup is correct, but there is a different problem
if the user has requested smaller number of descriptors. What happens is that
the receive start process runs the mbuf pool out of space getting more packets
than the application expected. Imagine application expects 512 packets in rx ring
but full 16K are allocated.
Working on a fix to the rx initialization logic to take that into account.
next prev parent reply other threads:[~2015-07-20 15:47 UTC|newest]
Thread overview: 21+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-07-01 7:48 Ouyang Changchun
2015-07-01 8:55 ` Xu, Qian Q
2015-07-01 15:53 ` Xie, Huawei
2015-07-02 0:29 ` Ouyang, Changchun
2015-07-02 2:01 ` Xie, Huawei
2015-07-02 2:16 ` Ouyang, Changchun
2015-07-02 9:15 ` Xie, Huawei
2015-07-03 1:53 ` Ouyang, Changchun
2015-07-10 14:05 ` Xie, Huawei
2015-07-10 14:11 ` Thomas Monjalon
2015-07-13 1:40 ` Ouyang, Changchun
2015-07-07 2:32 ` Ouyang, Changchun
2015-07-17 10:42 ` Thomas Monjalon
2015-07-17 16:27 ` Stephen Hemminger
2015-07-18 12:11 ` Ouyang, Changchun
2015-07-20 15:47 ` Stephen Hemminger [this message]
2015-07-20 3:40 ` Xu, Qian Q
2015-07-20 6:18 ` Ouyang, Changchun
2015-07-20 10:42 ` Thomas Monjalon
2015-07-21 5:23 ` Ouyang, Changchun
2015-07-20 15:30 ` Stephen Hemminger
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20150720084717.7c02ce8f@urahara \
--to=stephen@networkplumber.org \
--cc=changchun.ouyang@intel.com \
--cc=dev@dpdk.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).