From: "Ouyang, Changchun" <changchun.ouyang@intel.com>
To: "Xie, Huawei" <huawei.xie@intel.com>,
	Thomas Monjalon <thomas.monjalon@6wind.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx
Date: Thu, 30 Oct 2014 00:58:41 +0000
Message-ID: <F52918179C57134FAEC9EA62FA2F96251187ECD5@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <C37D651A908B024F974696C65296B57B0F2D86AF@SHSMSX101.ccr.corp.intel.com>

Hi,

> -----Original Message-----
> From: Xie, Huawei
> Sent: Thursday, October 30, 2014 7:37 AM
> To: Ouyang, Changchun; Thomas Monjalon
> Cc: dev@dpdk.org
> Subject: RE: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector
> Rx
> 
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Ouyang,
> Changchun
> > Sent: Monday, October 27, 2014 6:56 AM
> > To: Thomas Monjalon
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH] vhost: Check descriptor number for
> > vector Rx
> >
> > Hi Thomas,
> >
> > > -----Original Message-----
> > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > Sent: Monday, October 27, 2014 4:46 PM
> > > To: Ouyang, Changchun
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] [PATCH] vhost: Check descriptor number for
> > > vector Rx
> > >
> > > 2014-10-25 00:48, Ouyang, Changchun:
> > > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> > > > > 2014-10-24 16:38, Ouyang Changchun:
> > > > > > > For zero copy, it needs to check whether the RX descriptor
> > > > > > > number meets the minimum requirement when using the vector PMD
> > > > > > > Rx function, and give the user more hints if it does not.
> > > > > [...]
> > > > > > --- a/examples/vhost/main.c
> > > > > > +++ b/examples/vhost/main.c
> > > > > > @@ -131,6 +131,10 @@
> > > > > > >  #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32   /* legacy: 32, DPDK virt FE: 128. */
> > > > > > >  #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64   /* legacy: 64, DPDK virt FE: 64.  */
> > > > > >
> > > > > > +#ifdef RTE_IXGBE_INC_VECTOR
> > > > > > +#define VPMD_RX_BURST         32
> > > > > > +#endif
> > > > > > +
> > > > > > >  /* Get first 4 bytes in mbuf headroom. */
> > > > > > >  #define MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \
> > > > > > >  		+ sizeof(struct rte_mbuf)))
> > > > > > > @@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv)
> > > > > >  		return -1;
> > > > > >  	}
> > > > > >
> > > > > > +#ifdef RTE_IXGBE_INC_VECTOR
> > > > > > +	if ((zero_copy == 1) && (num_rx_descriptor <=
> > VPMD_RX_BURST)) {
> > > > > > +		RTE_LOG(INFO, VHOST_PORT,
> > > > > > +			"The RX desc num: %d is too small for PMD to
> > > work\n"
> > > > > > +			"properly, please enlarge it to bigger than %d
> > if\n"
> > > > > > +			"possible by the option: '--rx-desc-num
> > > <number>'\n"
> > > > > > +			"One alternative is disabling
> > > RTE_IXGBE_INC_VECTOR\n"
> > > > > > +			"in config file and rebuild the libraries.\n",
> > > > > > +			num_rx_descriptor, VPMD_RX_BURST);
> > > > > > +		return -1;
> > > > > > +	}
> > > > > > +#endif
> > > > > > +
> > > > > >  	return 0;
> > > > > >  }
> > > > >
> > > > > I feel there is a design problem here.
> > > > > An application shouldn't have to care about the underlying driver.
> > > >
> > > > > For most other applications, the descriptor number is set
> > > > > big enough (1024 or so), so there is no need to check the
> > > > > descriptor number at the early stage of running.
> > > >
> > > > > But vhost zero copy has a default descriptor number of 32
> > > > > (note that vhost one copy also uses 1024 descriptors).
> > > > > Why use 32?
> > > > > Because the vhost zero copy implementation (working as backend)
> > > > > needs to support DPDK-based apps which use the PMD virtio-net
> > > > > driver, and also needs to support Linux legacy virtio-net based
> > > > > applications.
> > > > > In the Linux legacy virtio-net case, on one side QEMU has hard
> > > > > code to confine the total virtio descriptor size to 256; on the
> > > > > other side, legacy virtio uses half of them for the virtio header,
> > > > > so only the other half, i.e. 128 descriptors, are available to
> > > > > use as real buffers.
> > > >
> > > > > In PMD mode, all HW descriptors need to be filled with a DMA
> > > > > address in the rx initial stage, otherwise there will probably be
> > > > > exceptions in the rx process.
> > > > > Based on that, we need to use the really limited virtio buffers to
> > > > > fully fill all HW descriptor DMA addresses. In other words, the
> > > > > available virtio descriptor size determines the total mbuf
> > > > > size and HW descriptor size in the case of zero copy.
> > > >
> > > > > Tuning showed that 32 is the suitable value for vhost zero copy
> > > > > to work properly in the legacy Linux virtio case.
> > > > > Another factor in reducing the value to 32 is that the mempool
> > > > > uses a ring to accommodate the mbufs, which costs one entry to
> > > > > flag the ring head/tail, and there are some other overheads like
> > > > > temporary mbufs (sized as RX_BURST) during rx.
> > > > > Note that the descriptor number needs to be a power of 2.
> > > >
> > > > > Why does the change occur at this moment?
> > > > > Recently the default rx function was changed to the vector RX
> > > > > function, while it used non-vector mode (scalar mode) Rx
> > > > > previously. The vector RX function needs more than 32 descriptors
> > > > > to work properly, but scalar mode RX doesn't have this limitation.
> > > >
> > > > > As the RX function is changeable (you can use vector mode or
> > > > > non-vector), and the descriptor number can also be changed, the
> > > > > vhost app checks here whether they match to make sure everything
> > > > > works normally, and gives some hints if they don't match.
> > > >
> > > > Hope the above could make it a bit clearer. :-)
> > >
> > > Thank you for your explanation.
> > > Your fix shows that driver and application are tightly linked.
> > > It's a design issue. As I said:
> > > "An application shouldn't have to care about the underlying driver."
> > > I didn't dig enough in vhost to suggest a good fix but I'm sure
> > > someone could have an idea.
> > >
> > Agree with you, there is some linkage between app and driver, but
> > that's due to a few things:
> > 1. QEMU has hard code to confine the total vring size;
> > 2. The PMD driver needs to fully fill the DMA addresses of the
> > descriptors at setup stage;
> 
> Can the PMD driver fill its desc ring until either all descriptors have
> been filled or there is no mbuf left in the pool?
> 
Currently not.
If you mean whether the PMD can be modified to provide that functionality, possibly yes.
It should be another topic about how to improve our PMD.
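
A minimal sketch of the "fill until the ring is full or the pool is empty" behaviour asked about above. It does not use any DPDK API: `toy_pool`, `toy_rx_ring` and `toy_rx_ring_fill` are hypothetical stand-ins invented for illustration, and a real PMD would also have to write DMA addresses into the descriptors and handle a partially filled ring in its Rx path.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a buffer pool and a driver's Rx ring. */
struct toy_pool {
	size_t free_bufs;        /* buffers still available in the pool */
};

struct toy_rx_ring {
	size_t nb_desc;          /* total descriptors in the ring */
	size_t nb_filled;        /* descriptors that already hold a buffer */
};

/*
 * Fill descriptors until the ring is full or the pool runs dry.
 * Returns the number of descriptors filled by this call.
 */
static size_t
toy_rx_ring_fill(struct toy_rx_ring *ring, struct toy_pool *pool)
{
	size_t filled = 0;

	assert(ring != NULL && pool != NULL);

	while (ring->nb_filled < ring->nb_desc && pool->free_bufs > 0) {
		pool->free_bufs--;      /* take one buffer from the pool */
		ring->nb_filled++;      /* attach it to the next free descriptor */
		filled++;
	}
	return filled;
}
```

With a 128-entry ring and only 32 buffers in the pool, the first call would fill 32 descriptors and stop; a later call, once more buffers are available, would fill the rest.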

> > 3. The PMD driver uses vector PMD as the default path, which requires
> > more than 32 descriptors in the burst RX;
> > 4. Zero copy needs to use the external buffer directly and set it as
> > the HW descriptor DMA address;
> >
> > Except for item 3, none of these can be removed or ignored easily.
> > As for item 3, I don't know why we set vector PMD as the default path,
> > pursuing performance at the cost of flexibility. Here is an
> > example: vhost zero copy needs to check it against the descriptor
> > number. I am not sure whether the default vector PMD path will affect
> > other apps or not.
> >
> > On the other hand, without this check and hint, the user will
> > suffer from this: vhost can't receive packets any more after
> > forwarding 31 packets or so, and the user doesn't know what really
> > happens behind the scenes, and can't easily find out that their
> > descriptor number is not enough for vector PMD RX.
> > This is then followed by painful debugging to figure out the issue.
> >
> > With this check and hint to the user, things go smoothly.
> >
> > Meanwhile we can wait for other people's viewpoints.
> >
> > Thanks
> > Changchun
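
The condition the patch adds can be read in isolation as a small predicate. The sketch below is only an illustration under stated assumptions: `rx_desc_num_ok` and the runtime flag `vector_rx` are hypothetical (the patch itself uses the compile-time `RTE_IXGBE_INC_VECTOR` test), while `VPMD_RX_BURST` and its value 32 come from the quoted diff.

```c
#include <stdbool.h>
#include <stdint.h>

#define VPMD_RX_BURST 32   /* value taken from the quoted patch */

/*
 * Returns true when the Rx descriptor count is acceptable.
 * `vector_rx` is a hypothetical runtime flag standing in for the
 * compile-time RTE_IXGBE_INC_VECTOR test used in the patch.
 */
static bool
rx_desc_num_ok(bool zero_copy, bool vector_rx, uint32_t num_rx_descriptor)
{
	/* Only zero copy combined with vector Rx needs more than one burst. */
	if (zero_copy && vector_rx && num_rx_descriptor <= VPMD_RX_BURST)
		return false;
	return true;
}
```

Under these assumptions the zero-copy default of 32 descriptors is rejected when vector Rx is enabled, while 64 descriptors, or scalar Rx with 32, are accepted.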

Thread overview: 7+ messages
2014-10-24  8:38 Ouyang Changchun
2014-10-24  9:27 ` Thomas Monjalon
2014-10-25  0:48   ` Ouyang, Changchun
2014-10-27  8:46     ` Thomas Monjalon
2014-10-27 13:55       ` Ouyang, Changchun
2014-10-29 23:37         ` Xie, Huawei
2014-10-30  0:58           ` Ouyang, Changchun [this message]
