From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id 2372F19F5 for ; Fri, 26 Dec 2014 03:08:08 +0100 (CET) Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga102.fm.intel.com with ESMTP; 25 Dec 2014 18:08:07 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="4.97,862,1389772800"; d="scan'208";a="433496933" Received: from kmsmsx153.gar.corp.intel.com ([172.21.73.88]) by FMSMGA003.fm.intel.com with ESMTP; 25 Dec 2014 17:56:26 -0800 Received: from pgsmsx108.gar.corp.intel.com (10.221.44.103) by KMSMSX153.gar.corp.intel.com (172.21.73.88) with Microsoft SMTP Server (TLS) id 14.3.195.1; Fri, 26 Dec 2014 10:07:58 +0800 Received: from shsmsx104.ccr.corp.intel.com (10.239.4.70) by PGSMSX108.gar.corp.intel.com (10.221.44.103) with Microsoft SMTP Server (TLS) id 14.3.195.1; Fri, 26 Dec 2014 10:07:57 +0800 Received: from shsmsx102.ccr.corp.intel.com ([169.254.2.216]) by SHSMSX104.ccr.corp.intel.com ([169.254.5.182]) with mapi id 14.03.0195.001; Fri, 26 Dec 2014 10:07:56 +0800 From: "Ouyang, Changchun" To: Vlad Zolotarov , "dev@dpdk.org" Thread-Topic: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS Thread-Index: AQHQHznC9fbr7m6mLEynsGIYrmMT5JyeB1MAgAGDnaCAADmzAIABWjYw Date: Fri, 26 Dec 2014 02:07:55 +0000 Message-ID: References: <1419389808-9559-1-git-send-email-changchun.ouyang@intel.com> <1419398584-19520-1-git-send-email-changchun.ouyang@intel.com> <1419398584-19520-6-git-send-email-changchun.ouyang@intel.com> <549A97F6.30901@cloudius-systems.com> <549C0D84.60305@cloudius-systems.com> In-Reply-To: <549C0D84.60305@cloudius-systems.com> Accept-Language: zh-CN, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Dec 2014 02:08:10 -0000 > -----Original Message----- > From: Vlad Zolotarov [mailto:vladz@cloudius-systems.com] > Sent: Thursday, December 25, 2014 9:14 PM > To: Ouyang, Changchun; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS >=20 >=20 > On 12/25/14 04:14, Ouyang, Changchun wrote: > > Hi, > > > >> -----Original Message----- > >> From: Vlad Zolotarov [mailto:vladz@cloudius-systems.com] > >> Sent: Wednesday, December 24, 2014 6:40 PM > >> To: Ouyang, Changchun; dev@dpdk.org > >> Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS > >> > >> > >> On 12/24/14 07:23, Ouyang Changchun wrote: > >>> It needs config RSS and IXGBE_MRQC and IXGBE_VFPSRTYPE to enable > VF > >> RSS. > >>> The psrtype will determine how many queues the received packets will > >>> distribute to, and the value of psrtype should depends on both facet: > >>> max VF rxq number which has been negotiated with PF, and the number > >>> of > >> rxq specified in config on guest. > >>> Signed-off-by: Changchun Ouyang > >>> --- > >>> lib/librte_pmd_ixgbe/ixgbe_pf.c | 15 +++++++ > >>> lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 92 > >> ++++++++++++++++++++++++++++++++++----- > >>> 2 files changed, 97 insertions(+), 10 deletions(-) > >>> > >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c > >>> b/lib/librte_pmd_ixgbe/ixgbe_pf.c index cbb0145..9c9dad8 100644 > >>> --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c > >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c > >>> @@ -187,6 +187,21 @@ int ixgbe_pf_host_configure(struct rte_eth_dev > >> *eth_dev) > >>> IXGBE_WRITE_REG(hw, IXGBE_MPSAR_LO(hw- > mac.num_rar_entries), 0); > >>> IXGBE_WRITE_REG(hw, IXGBE_MPSAR_HI(hw- > mac.num_rar_entries), 0); > >>> > >>> + /* > >>> + * VF RSS can support at most 4 queues for each VF, even if > >>> + * 8 queues are available for each VF, it need refine to 4 > >>> + * queues here due to this limitation, otherwise no queue > >>> + * will receive any packet even RSS is enabled. > >> According to Table 7-3 in the 82599 spec RSS is not available when > >> port is configured to have 8 queues per pool. This means that if u > >> see this configuration u may immediately disable RSS flow in your code= . > >> > > 8 queues here means the available number queue per vf, it is > > calculated according to max vfs, e.g. if max vfs is 16(or less than), > > then each vf 'COULD' have 8 queues evenly, pf early init stage estimate= this > value, but that is not precise, so need refine this. > > User don't know this estimated value, it is internal value, not come fr= om > user's input/configure. > > Hope it is clear to you. > >>> + */ > >>> + if (eth_dev->data->dev_conf.rxmode.mq_mode =3D=3D > >> ETH_MQ_RX_VMDQ_RSS) { > >>> + if (RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool =3D=3D 8) { > >>> + RTE_ETH_DEV_SRIOV(eth_dev).active =3D > >> ETH_32_POOLS; > >>> + RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool =3D 4; > >>> + RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx =3D > >>> + dev_num_vf(eth_dev) * 4; > >> According to 82599 spec u can't do that since RSS is not allowed when > >> port is configured to have 8 function per-VF. Have u verified that > >> this works? If yes, then spec should be updated. > >> > > Response as above, > > Of course I have validated this. It works well. > > > >>> + } > >>> + } > >>> + > >>> /* set VMDq map to default PF pool */ > >>> hw->mac.ops.set_vmdq(hw, 0, > >>> RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx); > >>> > >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>> index f69abda..a7c17a4 100644 > >>> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>> @@ -3327,6 +3327,39 @@ ixgbe_alloc_rx_queue_mbufs(struct > >> igb_rx_queue *rxq) > >>> } > >>> > >>> static int > >>> +ixgbe_config_vf_rss(struct rte_eth_dev *dev) { > >>> + struct ixgbe_hw *hw; > >>> + uint32_t mrqc; > >>> + > >>> + ixgbe_rss_configure(dev); > >>> + > >>> + hw =3D IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > >>> + > >>> + /* MRQC: enable VF RSS */ > >>> + mrqc =3D IXGBE_READ_REG(hw, IXGBE_MRQC); > >>> + mrqc &=3D ~IXGBE_MRQC_MRQE_MASK; > >>> + switch (RTE_ETH_DEV_SRIOV(dev).active) { > >>> + case ETH_64_POOLS: > >>> + mrqc |=3D IXGBE_MRQC_VMDQRSS64EN; > >>> + break; > >>> + > >>> + case ETH_32_POOLS: > >>> + case ETH_16_POOLS: > >>> + mrqc |=3D IXGBE_MRQC_VMDQRSS32EN; > >> Again, this contradicts with the spec. > >> > >>> + break; > >>> + > >>> + default: > >>> + PMD_INIT_LOG(ERR, "Invalid pool number in IOV mode"); > >>> + return -EINVAL; > >>> + } > >>> + > >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc); > >>> + > >>> + return 0; > >>> +} > >>> + > >>> +static int > >>> ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev) > >>> { > >>> struct ixgbe_hw *hw =3D > >>> @@ -3358,24 +3391,38 @@ ixgbe_dev_mq_rx_configure(struct > >> rte_eth_dev *dev) > >>> default: ixgbe_rss_disable(dev); > >>> } > >>> } else { > >>> - switch (RTE_ETH_DEV_SRIOV(dev).active) { > >>> /* > >>> * SRIOV active scheme > >>> * FIXME if support DCB/RSS together with VMDq & SRIOV > >>> */ > >>> - case ETH_64_POOLS: > >>> - IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >> IXGBE_MRQC_VMDQEN); > >>> + switch (dev->data->dev_conf.rxmode.mq_mode) { > >>> + case ETH_MQ_RX_RSS: > >>> + case ETH_MQ_RX_VMDQ_RSS: > >>> + ixgbe_config_vf_rss(dev); > >>> break; > >>> > >>> - case ETH_32_POOLS: > >>> - IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >> IXGBE_MRQC_VMDQRT4TCEN); > >>> - break; > >>> + default: > >>> + switch (RTE_ETH_DEV_SRIOV(dev).active) { > >> Sorry for nitpicking but have u considered taking this encapsulated > >> "switch- case" block into a separate function? This could make the > >> code look a lot nicer. ;) > > Only one place use it, so don't need make it a function, And I prefer > > to the current code. >=20 > Functions may be used not only to have a repeatedly called code but also = to > make a caller code more readable. Encapsulated switch-case is one of the > examples of a *not* readable code constructs which should be avoided. >=20 > > > >>> + case ETH_64_POOLS: > >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >>> + IXGBE_MRQC_VMDQEN); > >>> + break; > >>> > >>> - case ETH_16_POOLS: > >>> - IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >> IXGBE_MRQC_VMDQRT8TCEN); > >>> + case ETH_32_POOLS: > >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >>> + IXGBE_MRQC_VMDQRT4TCEN); > >>> + break; > >>> + > >>> + case ETH_16_POOLS: > >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >>> + IXGBE_MRQC_VMDQRT8TCEN); > >>> + break; > >>> + default: > >>> + PMD_INIT_LOG(ERR, > >>> + "invalid pool number in IOV mode"); > >>> + break; > >>> + } > >>> break; > >>> - default: > >>> - PMD_INIT_LOG(ERR, "invalid pool number in IOV > >> mode"); > >>> } > >>> } > >>> > >>> @@ -3989,10 +4036,32 @@ ixgbevf_dev_rx_init(struct rte_eth_dev > *dev) > >>> uint16_t buf_size; > >>> uint16_t i; > >>> int ret; > >>> + uint16_t valid_rxq_num; > >>> > >>> PMD_INIT_FUNC_TRACE(); > >>> hw =3D IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > >>> > >>> + valid_rxq_num =3D RTE_MIN(dev->data->nb_rx_queues, > >>> +hw->mac.max_rx_queues); > >>> + > >>> + /* > >>> + * VMDq RSS can't support 3 queues, so config it into 4 queues, > >>> + * and give user a hint that some packets may loss if it doesn't > >>> + * poll the queue where those packets are distributed to. > >>> + */ > >>> + if (valid_rxq_num =3D=3D 3) > >>> + valid_rxq_num =3D 4; > >> Why to configure more queues that requested and not less (2)? Why to > >> configure anything at all and not return an error? > >> > >>> + > >>> + if (dev->data->nb_rx_queues > valid_rxq_num) { > >>> + PMD_INIT_LOG(ERR, "The number of Rx queue invalid, " > >>> + "it should be equal to or less than %d", > >>> + valid_rxq_num); > >>> + return -1; > >>> + } else if (dev->data->nb_rx_queues < valid_rxq_num) > >>> + PMD_INIT_LOG(ERR, "The number of Rx queue is less " > >>> + "than the number of available Rx queues:%d, " > >>> + "packets in Rx queues(q_id >=3D %d) may loss.", > >>> + valid_rxq_num, dev->data->nb_rx_queues); > >> Who ever looks in the "INIT_LOG" if everything "work well" and u make = it > >> look so by allowing this call to succeed. And then some packets will j= ust > >> silently not arrive?! And what the used should somehow guess to do? > >> - Look in the "INIT_LOG"?! This is a nightmare! > > Sorry, I don't think so again, if user find any packets loss, he will c= are for log, > > Then he can find that log there, then user can refine its rxq number du= e > the wrong rxq number, > > Why is it a nightmare? >=20 > Because usually u expect that if the function call returns with a > success it means a success. Why a user has to learn that a device > configuration function was provided with wrong parameters from the > packet loss? If parameters are not allowed u expect to get an error as a > return value. Since when errors are returned in a form of a log message? > Why do u think there is a living person running a DPDK based > application? How do u expect somebody build an automated environment > when part of errors are returned in some log? Should he/she add a log > parser? If it is automated environment, the log parsing should also be automated, At least it should report to living person. But this is another topic we c= ould stop here. Let's focus on vf rss > On the other hand, why do u think 4 queues is a better option for a user > than 2 queue when he asked for 3 queues? What kind of heuristics is that? >=20 > To summarize - it would be much better if u just returned an EINVAL > error in that case. As I said, it seems you have strong reason to use 2 queues here, and return= error If user try to use 3 queues, I will consider it in v4, but still need insig= hts from more guys here. >=20 > > > > I don't agree with you about "silently not arrive", because we have hin= t/log > there. > > > > Return error here is also possible way, >=20 > It's the only possible way! ;) >=20 > > Again need other guys' insight here. > > > >>> + > >>> /* > >>> * When the VF driver issues a IXGBE_VF_RESET request, the PF > >> driver > >>> * disables the VF receipt of packets if the PF MTU is > 1500. > >>> @@ -4094,6 +4163,9 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev) > >>> IXGBE_PSRTYPE_IPV6HDR; > >>> #endif > >>> > >>> + /* Set RQPL for VF RSS according to max Rx queue */ > >>> + psrtype |=3D (valid_rxq_num >> 1) << > >>> + IXGBE_PSRTYPE_RQPL_SHIFT; > >>> IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype); > >>> > >>> if (dev->data->dev_conf.rxmode.enable_scatter) {