From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id 3A5FB5938 for ; Mon, 27 Oct 2014 14:47:15 +0100 (CET) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga102.fm.intel.com with ESMTP; 27 Oct 2014 06:55:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.04,796,1406617200"; d="scan'208";a="621369867" Received: from pgsmsx103.gar.corp.intel.com ([10.221.44.82]) by fmsmga002.fm.intel.com with ESMTP; 27 Oct 2014 06:55:55 -0700 Received: from pgsmsx105.gar.corp.intel.com (10.221.44.96) by PGSMSX103.gar.corp.intel.com (10.221.44.82) with Microsoft SMTP Server (TLS) id 14.3.195.1; Mon, 27 Oct 2014 21:55:54 +0800 Received: from shsmsx151.ccr.corp.intel.com (10.239.6.50) by pgsmsx105.gar.corp.intel.com (10.221.44.96) with Microsoft SMTP Server (TLS) id 14.3.195.1; Mon, 27 Oct 2014 21:55:53 +0800 Received: from shsmsx102.ccr.corp.intel.com ([169.254.2.156]) by SHSMSX151.ccr.corp.intel.com ([169.254.3.44]) with mapi id 14.03.0195.001; Mon, 27 Oct 2014 21:55:53 +0800 From: "Ouyang, Changchun" To: Thomas Monjalon Thread-Topic: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx Thread-Index: AQHP72XgqLzVtIlfu0ypahTULjdsBZw+dJCAgAF2x4CAAzSwgIAA1GTg Date: Mon, 27 Oct 2014 13:55:52 +0000 Message-ID: References: <1414139898-26562-1-git-send-email-changchun.ouyang@intel.com> <1684003.hLxV2SOth0@xps13> <1433643.g0MTQjq1Jf@xps13> In-Reply-To: <1433643.g0MTQjq1Jf@xps13> Accept-Language: zh-CN, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Cc: "dev@dpdk.org" Subject: Re: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector Rx X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 27 Oct 2014 13:47:16 -0000 Hi Thomas, > -----Original Message----- > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > Sent: Monday, October 27, 2014 4:46 PM > To: Ouyang, Changchun > Cc: dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH] vhost: Check descriptor number for vector > Rx >=20 > 2014-10-25 00:48, Ouyang, Changchun: > > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > > > 2014-10-24 16:38, Ouyang Changchun: > > > > For zero copy, it need check whether RX descriptor num meets the > > > > least requirement when using vector PMD Rx function, and give user > > > > more hints if it fails to meet the least requirement. > > > [...] > > > > --- a/examples/vhost/main.c > > > > +++ b/examples/vhost/main.c > > > > @@ -131,6 +131,10 @@ > > > > #define RTE_TEST_RX_DESC_DEFAULT_ZCP 32 /* legacy: 32, DPDK virt > FE: 128. */ > > > > #define RTE_TEST_TX_DESC_DEFAULT_ZCP 64 /* legacy: 64, DPDK virt > FE: 64. */ > > > > > > > > +#ifdef RTE_IXGBE_INC_VECTOR > > > > +#define VPMD_RX_BURST 32 > > > > +#endif > > > > + > > > > /* Get first 4 bytes in mbuf headroom. */ #define > > > > MBUF_HEADROOM_UINT32(mbuf) (*(uint32_t *)((uint8_t *)(mbuf) \ > > > > + sizeof(struct rte_mbuf))) > > > > @@ -792,6 +796,19 @@ us_vhost_parse_args(int argc, char **argv) > > > > return -1; > > > > } > > > > > > > > +#ifdef RTE_IXGBE_INC_VECTOR > > > > + if ((zero_copy =3D=3D 1) && (num_rx_descriptor <=3D VPMD_RX_BURST= )) { > > > > + RTE_LOG(INFO, VHOST_PORT, > > > > + "The RX desc num: %d is too small for PMD to > work\n" > > > > + "properly, please enlarge it to bigger than %d if\n" > > > > + "possible by the option: '--rx-desc-num > '\n" > > > > + "One alternative is disabling > RTE_IXGBE_INC_VECTOR\n" > > > > + "in config file and rebuild the libraries.\n", > > > > + num_rx_descriptor, VPMD_RX_BURST); > > > > + return -1; > > > > + } > > > > +#endif > > > > + > > > > return 0; > > > > } > > > > > > I feel there is a design problem here. > > > An application shouldn't have to care about the underlying driver. > > > > For most of other applications, as their descriptor numbers are set as > > big enough(1024 or so) , So there is no need to check the descriptor > number at the early stage of running. > > > > But for vhost zero copy(note vhost one copy also has 1024 descriptor > > number) has the default descriptor number of 32. > > Why use 32? > > because vhost zero copy implementation (working as backend) need > > support dpdk based app which use pmd virtio-net driver, And also need > support linux legacy virtio-net based application. > > When it is the linux legacy virtio-net case, on one side the qemu has > > hard code to confine the total virtio descriptor size to 256, On other = side, > legacy virtio use half of them as virtio header, and then only another ha= lf i.e. > 128 descriptors are available to use as real buffer. > > > > In PMD mode, all HW descriptors need to be filled DMA address in the rx > initial stage, otherwise there is probably exceptional in rx process. > > Based on that, we need use really limited virtio buffer to fully fill > > all hw descriptor DMA address, Or in other word, the available virtio > > descriptor size will determine the total mbuf size and hw descriptor > > size in the case of zero copy, > > > > Tune and find that 32 is the suitable value for vhost zero copy to work > properly when it legacy linux virtio case. > > Another factor to reduce the value to 32, is that mempool use ring to > > accommodate the mbuf, it cost one to flag the ring head/tail, And there= are > some other overheads like temporary mbufs(size as RX_BURST) when rx. > > Note that number descriptor should need power 2. > > > > Why the change occur at this moment? > > Recently the default rx function is modified into vector RX function, > > while it use non-vector mode (scalar mode) Rx previously, Vector RX > function need more than 32 descriptor to work properly, but scalar mode = RX > hasn't this limitation. > > > > As the RX function is changeable(you can use vector mode or non-vector)= , > and descriptor number can also be changed. > > So here in the vhost app, check if they match to make sure all things c= ould > work normally, and give some hints if they don't match. > > > > Hope the above could make it a bit clearer. :-) >=20 > Thank you for your explanation. > Your fix shows that driver and application are tightly linked. > It's a design issue. As I said: > "An application shouldn't have to care about the underlying driver." > I didn't dig enough in vhost to suggest a good fix but I'm sure someone c= ould > have an idea. > Agree with you, there is something linked between app and driver, but that'= s due to a few things: 1.Qume has hard code to confine the total vring size; 2.PMD driver need fully fill the dma address of descriptor at setup stage; 3.PMD driver use vector PMD as default path which require more than 32 in t= he burst RX; 4.Zero copy need use external buffer directly and set to HW descriptor dma = address; Except for item 3, everything could not be removed or ignored easily. As for item 3, I don't know why we set vector PMD as default path, while pe= rusing performance, but at the cost of flexibility, Here is an example, vhost zero copy need check it and descriptor number. I = am not sure if the default vector pmd path will affect other app or not. On the other hand, without this check, and hint to user, User will suffer from this, vhost can't receive packets any more after forw= arding 31 packets or so, and user don't know what really happen behind, also can't easily get the in= formation of=20 their descriptor is not enough for vector pmd RX. Then this is followed by painfully debugging and figuring out the issue. While with this check and hint to user, things go smoothly. Meanwhile we can waiting for other people's viewpoints. Thanks Changchun