From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <changchun.ouyang@intel.com>
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
 by dpdk.org (Postfix) with ESMTP id 2AD505951
 for <dev@dpdk.org>; Sat, 25 Oct 2014 02:40:25 +0200 (CEST)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
 by orsmga101.jf.intel.com with ESMTP; 24 Oct 2014 17:48:56 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.04,783,1406617200"; d="scan'208";a="595640897"
Received: from pgsmsx103.gar.corp.intel.com ([10.221.44.82])
 by orsmga001.jf.intel.com with ESMTP; 24 Oct 2014 17:48:55 -0700
Received: from shsmsx104.ccr.corp.intel.com (10.239.4.70) by
 PGSMSX103.gar.corp.intel.com (10.221.44.82) with Microsoft SMTP Server (TLS)
 id 14.3.195.1; Sat, 25 Oct 2014 08:48:54 +0800
Received: from shsmsx102.ccr.corp.intel.com ([169.254.2.156]) by
 SHSMSX104.ccr.corp.intel.com ([169.254.5.174]) with mapi id 14.03.0195.001;
 Sat, 25 Oct 2014 08:48:47 +0800
From: "Ouyang, Changchun" <changchun.ouyang@intel.com>
To: Thomas Monjalon <thomas.monjalon@6wind.com>
Thread-Topic: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx
Thread-Index: AQHP72XgqLzVtIlfu0ypahTULjdsBZw+dJCAgAF2x4A=
Date: Sat, 25 Oct 2014 00:48:46 +0000
Message-ID: <F52918179C57134FAEC9EA62FA2F96251187D03F@shsmsx102.ccr.corp.intel.com>
References: <1414139898-26562-1-git-send-email-changchun.ouyang@intel.com>
 <1684003.hLxV2SOth0@xps13>
In-Reply-To: <1684003.hLxV2SOth0@xps13>
Accept-Language: zh-CN, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.239.127.40]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Sat, 25 Oct 2014 00:40:27 -0000

Hi Thomas,

Thanks for your comments; my response is below.

> -----Original Message-----
> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> Sent: Friday, October 24, 2014 5:28 PM
> To: Ouyang, Changchun
> Cc: dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector
> Rx
>
> Hi Changchun,
>
> 2014-10-24 16:38, Ouyang Changchun:
> > For zero copy, it need check whether RX descriptor num meets the least
> > requirement when using vector PMD Rx function, and give user more
> > hints if it fails to meet the least requirement.
> [...]
> > --- a/examples/vhost/main.c
> > +++ b/examples/vhost/main.c
> > @@ -131,6 +131,10 @@
> >  #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32   /* legacy: 32, DPDK virt FE:
> 128. */
> >  #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64   /* legacy: 64, DPDK virt FE:
> 64.  */
> >
> > +#ifdef RTE_IXGBE_INC_VECTOR
> > +#define VPMD_RX_BURST         32
> > +#endif
> > +
> >  /* Get first 4 bytes in mbuf headroom. */  #define
> > MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \
> >  		+ sizeof(struct rte_mbuf)))
> > @@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv)
> >  		return -1;
> >  	}
> >
> > +#ifdef RTE_IXGBE_INC_VECTOR
> > +	if ((zero_copy == 1) && (num_rx_descriptor <= VPMD_RX_BURST)) {
> > +		RTE_LOG(INFO, VHOST_PORT,
> > +			"The RX desc num: %d is too small for PMD to
> work\n"
> > +			"properly, please enlarge it to bigger than %d if\n"
> > +			"possible by the option: '--rx-desc-num
> <number>'\n"
> > +			"One alternative is disabling
> RTE_IXGBE_INC_VECTOR\n"
> > +			"in config file and rebuild the libraries.\n",
> > +			num_rx_descriptor, VPMD_RX_BURST);
> > +		return -1;
> > +	}
> > +#endif
> > +
> >  	return 0;
> >  }
>
> I feel there is a design problem here.
> An application shouldn't have to care about the underlying driver.
>

Most other applications set their descriptor numbers large enough (1024
or so), so they have no need to check the descriptor number at an early
stage of running.

Vhost zero copy, however, has a default descriptor number of 32 (note
that vhost one copy also uses 1024 descriptors).
Why use 32?
The vhost zero copy implementation (working as the backend) needs to
support DPDK-based applications that use the virtio-net PMD, and it also
needs to support applications based on the legacy Linux virtio-net driver.
In the legacy Linux virtio-net case, on one side QEMU hard-codes the total
virtio descriptor ring size to 256; on the other side, legacy virtio uses
half of the descriptors for virtio headers, so only the other half, i.e.
128 descriptors, is available for real buffers.
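
As a rough sketch of the arithmetic above (the macro and function names
here are hypothetical and do not appear in examples/vhost/main.c; the
value 256 is the QEMU limit mentioned above):

```c
#include <stdint.h>

/* Ring size that QEMU hard-codes for virtio, as described above.
 * The macro name is an assumption made for this illustration. */
#define QEMU_VIRTIO_RING_SIZE 256u

/*
 * Legacy virtio-net spends half of the ring on virtio headers, so only
 * the other half of the descriptors carries real packet buffers.
 */
static inline uint32_t
legacy_virtio_usable_buffers(uint32_t ring_size)
{
	return ring_size / 2;
}
```

Calling legacy_virtio_usable_buffers(QEMU_VIRTIO_RING_SIZE) gives the 128
buffers that the zero copy backend has to work with.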

In PMD mode, all HW descriptors need to be filled with DMA addresses in
the Rx initialization stage; otherwise exceptions are likely in the Rx
process.
Based on that, we need to use the very limited virtio buffers to fully
fill the DMA addresses of all HW descriptors.
In other words, the number of available virtio descriptors determines the
total number of mbufs and HW descriptors in the zero copy case.

Tuning showed that 32 is a suitable value for vhost zero copy to work
properly in the legacy Linux virtio case.
Another factor in reducing the value to 32 is that the mempool uses a ring
to hold the mbufs, which costs one entry to flag the ring head/tail,
and there are some other overheads such as temporary mbufs (RX_BURST in
number) during Rx.
Note that the descriptor number must be a power of 2.
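
To illustrate only the power-of-2 constraint (this is not how 32 was
chosen; that value came from tuning), here is a minimal sketch with
hypothetical helper names:

```c
#include <stdint.h>

/* Returns 1 if n is a non-zero power of two, 0 otherwise. */
static inline int
is_power_of_two(uint32_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Largest power of two that is <= n (0 for n == 0). Useful when some
 * entries of a budget are lost to overheads and the descriptor number
 * must still be a power of two.
 */
static inline uint32_t
round_down_pow2(uint32_t n)
{
	uint32_t p = 1;

	if (n == 0)
		return 0;
	while (p <= n / 2)
		p *= 2;
	return p;
}
```

For example, if one entry of a 128-entry budget is lost to the ring
head/tail flag, round_down_pow2(127) gives 64 as the largest usable
power of two.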

Why does the change come at this moment?
Recently the default Rx function was changed to the vector Rx function,
whereas previously it used non-vector (scalar) Rx.
The vector Rx function needs more than 32 descriptors to work properly,
but scalar Rx does not have this limitation.

As the Rx function is changeable (you can use vector or non-vector mode)
and the descriptor number can also be changed, the vhost app checks here
whether they match, to make sure everything works normally, and gives
some hints if they don't.
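
A standalone sketch of that matching check, outside of DPDK, could look
like the following. The function name and the vector_rx runtime flag are
assumptions for illustration; the patch itself uses the build-time option
RTE_IXGBE_INC_VECTOR and RTE_LOG rather than fprintf:

```c
#include <stdint.h>
#include <stdio.h>

#define VPMD_RX_BURST 32	/* same value as in the patch */

/*
 * Returns 0 if the Rx mode and descriptor number match, -1 otherwise.
 * vector_rx is a hypothetical flag standing in for the build-time
 * RTE_IXGBE_INC_VECTOR option.
 */
static int
check_rx_desc_num(int zero_copy, int vector_rx, uint32_t num_rx_desc)
{
	if (zero_copy && vector_rx && num_rx_desc <= VPMD_RX_BURST) {
		fprintf(stderr,
			"RX desc num %u is too small for vector Rx; "
			"use more than %d or disable vector Rx\n",
			(unsigned int)num_rx_desc, VPMD_RX_BURST);
		return -1;
	}
	return 0;
}
```

With zero copy and vector Rx both enabled, 32 descriptors are rejected
and 64 are accepted; scalar Rx or one copy mode passes with either value.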

Hope the above makes it a bit clearer. :-)
Thanks again,
Best regards,
Changchun