From: "Ouyang, Changchun"
To: Vlad Zolotarov, "dev@dpdk.org"
Date: Fri, 26 Dec 2014 01:52:25 +0000
Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS
In-Reply-To: <549C0F10.8050402@cloudius-systems.com>
References: <1419389808-9559-1-git-send-email-changchun.ouyang@intel.com>
 <1419398584-19520-1-git-send-email-changchun.ouyang@intel.com>
 <1419398584-19520-6-git-send-email-changchun.ouyang@intel.com>
 <549A97F6.30901@cloudius-systems.com>
 <549C0F10.8050402@cloudius-systems.com>

> -----Original Message-----
> From: Vlad Zolotarov [mailto:vladz@cloudius-systems.com]
> Sent: Thursday, December 25, 2014 9:20 PM
> To: Ouyang, Changchun; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS
>
>
> On 12/25/14 04:43, Ouyang, Changchun wrote:
> > Hi,
> > Sorry, I missed some comments, so I continue my response below.
> >
> >> -----Original Message-----
> >> From: Vlad Zolotarov [mailto:vladz@cloudius-systems.com]
> >> Sent: Wednesday, December 24, 2014 6:40 PM
> >> To: Ouyang, Changchun; dev@dpdk.org
> >> Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS
> >>
> >>
> >> On 12/24/14 07:23, Ouyang Changchun wrote:
> >>> RSS, IXGBE_MRQC and IXGBE_VFPSRTYPE need to be configured to enable
> >>> VF RSS.
> >>> The psrtype determines how many queues the received packets will be
> >>> distributed to, and its value should depend on two facets: the max
> >>> VF rxq number that has been negotiated with the PF, and the number
> >>> of rxq specified in the config on the guest.
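For illustration only, here is a minimal sketch of what that psrtype/RQPL relation amounts to. It is not part of the patch; the helper name is made up, and it merely restates the shift the patch applies later, assuming the 82599 encoding where RQPL 0/1/2 selects 1/2/4 RSS queues per pool.

#include <stdint.h>

#ifndef IXGBE_PSRTYPE_RQPL_SHIFT
#define IXGBE_PSRTYPE_RQPL_SHIFT 29	/* normally provided by the ixgbe headers */
#endif

/* Hypothetical helper: derive the RQPL bits of VFPSRTYPE from the number
 * of Rx queues a VF actually uses (expected to be 1, 2 or 4 for VF RSS). */
static inline uint32_t
vf_psrtype_rqpl(uint16_t nb_rx_queues)
{
	/* 1 queue -> RQPL 0, 2 queues -> RQPL 1, 4 queues -> RQPL 2 */
	return (uint32_t)(nb_rx_queues >> 1) << IXGBE_PSRTYPE_RQPL_SHIFT;
}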
> >>> Signed-off-by: Changchun Ouyang
> >>> ---
> >>>  lib/librte_pmd_ixgbe/ixgbe_pf.c   | 15 +++++++
> >>>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 92 ++++++++++++++++++++++++++++++++++-----
> >>>  2 files changed, 97 insertions(+), 10 deletions(-)
> >>>
> >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c b/lib/librte_pmd_ixgbe/ixgbe_pf.c
> >>> index cbb0145..9c9dad8 100644
> >>> --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c
> >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c
> >>> @@ -187,6 +187,21 @@ int ixgbe_pf_host_configure(struct rte_eth_dev *eth_dev)
> >>>  	IXGBE_WRITE_REG(hw, IXGBE_MPSAR_LO(hw->mac.num_rar_entries), 0);
> >>>  	IXGBE_WRITE_REG(hw, IXGBE_MPSAR_HI(hw->mac.num_rar_entries), 0);
> >>>
> >>> +	/*
> >>> +	 * VF RSS can support at most 4 queues for each VF. Even if
> >>> +	 * 8 queues are available for each VF, it needs to be refined
> >>> +	 * to 4 queues here due to this limitation, otherwise no queue
> >>> +	 * will receive any packet even if RSS is enabled.
> >> According to Table 7-3 in the 82599 spec RSS is not available when
> >> the port is configured to have 8 queues per pool. This means that if u
> >> see this configuration u may immediately disable the RSS flow in your code.
> >>
> >>> +	 */
> >>> +	if (eth_dev->data->dev_conf.rxmode.mq_mode == ETH_MQ_RX_VMDQ_RSS) {
> >>> +		if (RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool == 8) {
> >>> +			RTE_ETH_DEV_SRIOV(eth_dev).active = ETH_32_POOLS;
> >>> +			RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = 4;
> >>> +			RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx =
> >>> +				dev_num_vf(eth_dev) * 4;
> >> According to the 82599 spec u can't do that since RSS is not allowed when
> >> the port is configured to have 8 queues per VF. Have u verified that
> >> this works? If yes, then the spec should be updated.
> >>
> >>> +		}
> >>> +	}
> >>> +
> >>>  	/* set VMDq map to default PF pool */
> >>>  	hw->mac.ops.set_vmdq(hw, 0, RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx);
> >>>
> >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> index f69abda..a7c17a4 100644
> >>> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c
> >>> @@ -3327,6 +3327,39 @@ ixgbe_alloc_rx_queue_mbufs(struct igb_rx_queue *rxq)
> >>>  }
> >>>
> >>>  static int
> >>> +ixgbe_config_vf_rss(struct rte_eth_dev *dev) {
> >>> +	struct ixgbe_hw *hw;
> >>> +	uint32_t mrqc;
> >>> +
> >>> +	ixgbe_rss_configure(dev);
> >>> +
> >>> +	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >>> +
> >>> +	/* MRQC: enable VF RSS */
> >>> +	mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC);
> >>> +	mrqc &= ~IXGBE_MRQC_MRQE_MASK;
> >>> +	switch (RTE_ETH_DEV_SRIOV(dev).active) {
> >>> +	case ETH_64_POOLS:
> >>> +		mrqc |= IXGBE_MRQC_VMDQRSS64EN;
> >>> +		break;
> >>> +
> >>> +	case ETH_32_POOLS:
> >>> +	case ETH_16_POOLS:
> >>> +		mrqc |= IXGBE_MRQC_VMDQRSS32EN;
> >> Again, this contradicts the spec.
> > Yes, the spec says the hw can't support VF RSS at all, but experiments
> > find that it can be done.
>
> The spec explicitly says that VF RSS *is* supported, in particular in the
> table mentioned above.

But the spec (January 2014, revision 2.9) on my hand says: "in IOV mode, VMDq+RSS mode is not available", in the note of section 4.6.10.2.1.

> What your code is doing is that in case of 16 VFs u set up a 32-pool
> configuration and use only 16 out of them.
But I don't see any big issue here. In this case, each VF could have 8 queues,
like I said before, but that is an estimated value; actually only 4 queues are
really available for one VF. You can refer to the spec for the correctness here.
>
> > We can focus on discussing the implementation first.
>
> >
> >>> +		break;
> >>> +
> >>> +	default:
> >>> +		PMD_INIT_LOG(ERR, "Invalid pool number in IOV mode");
> >>> +		return -EINVAL;
> >>> +	}
> >>> +
> >>> +	IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc);
> >>> +
> >>> +	return 0;
> >>> +}
> >>> +
> >>> +static int
> >>>  ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
> >>>  {
> >>>  	struct ixgbe_hw *hw =
> >>> @@ -3358,24 +3391,38 @@ ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev)
> >>>  		default: ixgbe_rss_disable(dev);
> >>>  		}
> >>>  	} else {
> >>> -		switch (RTE_ETH_DEV_SRIOV(dev).active) {
> >>>  		/*
> >>>  		 * SRIOV active scheme
> >>>  		 * FIXME if support DCB/RSS together with VMDq & SRIOV
> >>>  		 */
> >>> -		case ETH_64_POOLS:
> >>> -			IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQEN);
> >>> +		switch (dev->data->dev_conf.rxmode.mq_mode) {
> >>> +		case ETH_MQ_RX_RSS:
> >>> +		case ETH_MQ_RX_VMDQ_RSS:
> >>> +			ixgbe_config_vf_rss(dev);
> >>>  			break;
> >>>
> >>> -		case ETH_32_POOLS:
> >>> -			IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQRT4TCEN);
> >>> -			break;
> >>> +		default:
> >>> +			switch (RTE_ETH_DEV_SRIOV(dev).active) {
> >> Sorry for nitpicking, but have u considered taking this encapsulated
> >> "switch-case" block into a separate function? This could make the
> >> code look a lot nicer. ;)
> >>
> >>> +			case ETH_64_POOLS:
> >>> +				IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >>> +					IXGBE_MRQC_VMDQEN);
> >>> +				break;
> >>>
> >>> -		case ETH_16_POOLS:
> >>> -			IXGBE_WRITE_REG(hw, IXGBE_MRQC, IXGBE_MRQC_VMDQRT8TCEN);
> >>> +			case ETH_32_POOLS:
> >>> +				IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >>> +					IXGBE_MRQC_VMDQRT4TCEN);
> >>> +				break;
> >>> +
> >>> +			case ETH_16_POOLS:
> >>> +				IXGBE_WRITE_REG(hw, IXGBE_MRQC,
> >>> +					IXGBE_MRQC_VMDQRT8TCEN);
> >>> +				break;
> >>> +			default:
> >>> +				PMD_INIT_LOG(ERR,
> >>> +					"invalid pool number in IOV mode");
> >>> +				break;
> >>> +			}
> >>>  			break;
> >>> -		default:
> >>> -			PMD_INIT_LOG(ERR, "invalid pool number in IOV mode");
> >>>  		}
> >>>  	}
> >>>
> >>> @@ -3989,10 +4036,32 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
> >>>  	uint16_t buf_size;
> >>>  	uint16_t i;
> >>>  	int ret;
> >>> +	uint16_t valid_rxq_num;
> >>>
> >>>  	PMD_INIT_FUNC_TRACE();
> >>>  	hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private);
> >>>
> >>> +	valid_rxq_num = RTE_MIN(dev->data->nb_rx_queues,
> >>> +		hw->mac.max_rx_queues);
> >>> +
> >>> +	/*
> >>> +	 * VMDq RSS can't support 3 queues, so config it into 4 queues,
> >>> +	 * and give the user a hint that some packets may be lost if it
> >>> +	 * doesn't poll the queue where those packets are distributed to.
> >>> +	 */
> >>> +	if (valid_rxq_num == 3)
> >>> +		valid_rxq_num = 4;
> >> Why configure more queues than requested and not fewer (2)? Why
> >> configure anything at all and not return an error?
> > Sorry, I don't agree this is "anything" as you say, because I don't use
> > 5, 6, 7, 8, ..., 16, 2014, 2015, ... etc.
> > Considering 2 or 4, I prefer 4. The reason is that if the user needs more
> > than 3 queues per VF to do something, and the PF also has the capability
> > to set up 4 queues per VF, then confining it to 2 queues is not a good
> > thing either. So here we try to enable 4 queues and give the user a hint.
> > Btw, changing it into 2 is another way; that depends on other guys' insight here.
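To make the options concrete, here is a rough sketch of the strictest variant, rejecting a queue count that VMDq RSS cannot hash to instead of silently adjusting it. The helper name and messages are made up and this is not part of the patch; PMD_INIT_LOG is assumed to come from the PMD's logging header.

#include <stdint.h>
#include <errno.h>

/* Hypothetical check: VF RSS on the 82599 can only spread packets over
 * 1, 2 or 4 queues, so refuse anything else rather than rounding it. */
static int
vf_check_rxq_num(uint16_t requested, uint16_t negotiated_max)
{
	if (requested != 1 && requested != 2 && requested != 4) {
		PMD_INIT_LOG(ERR, "VF RSS supports 1, 2 or 4 Rx queues, got %u",
			requested);
		return -EINVAL;
	}
	if (requested > negotiated_max) {
		PMD_INIT_LOG(ERR, "%u Rx queues requested, only %u negotiated "
			"with the PF", requested, negotiated_max);
		return -EINVAL;
	}
	return 0;
}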
>
> Like I said before, trying to guess what the user wants is a way to make code
> that is very hard to use and to maintain. Pls., just return an error and let the
> user code deal with it the way he/she really wants and not the way u *think*
> he/she wants.
>

I didn't disagree on this, either :-)
If you have a strong reason for this way and more guys agree with it, I will
probably modify it in v4.

> >
> >>> +
> >>> +	if (dev->data->nb_rx_queues > valid_rxq_num) {
> >>> +		PMD_INIT_LOG(ERR, "The number of Rx queues is invalid, "
> >>> +			"it should be equal to or less than %d",
> >>> +			valid_rxq_num);
> >>> +		return -1;
> >>> +	} else if (dev->data->nb_rx_queues < valid_rxq_num)
> >>> +		PMD_INIT_LOG(ERR, "The number of Rx queues is less "
> >>> +			"than the number of available Rx queues: %d, "
> >>> +			"packets in Rx queues (q_id >= %d) may be lost.",
> >>> +			valid_rxq_num, dev->data->nb_rx_queues);
> >> Who ever looks in the "INIT_LOG" if everything "works well"? And u make
> >> it look so by allowing this call to succeed. And then some packets
> >> will just silently not arrive?! And what should the user somehow guess to
> >> do? - Look in the "INIT_LOG"?! This is a nightmare!
> >>
> >>> +
> >>>  	/*
> >>>  	 * When the VF driver issues a IXGBE_VF_RESET request, the PF driver
> >>>  	 * disables the VF receipt of packets if the PF MTU is > 1500.
> >>> @@ -4094,6 +4163,9 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev)
> >>>  		IXGBE_PSRTYPE_IPV6HDR;
> >>>  #endif
> >>>
> >>> +	/* Set RQPL for VF RSS according to max Rx queue */
> >>> +	psrtype |= (valid_rxq_num >> 1) <<
> >>> +		IXGBE_PSRTYPE_RQPL_SHIFT;
> >>>  	IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype);
> >>>
> >>>  	if (dev->data->dev_conf.rxmode.enable_scatter) {
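Finally, for reference, a rough sketch of how a host application might configure the PF port so that the ETH_MQ_RX_VMDQ_RSS path in ixgbe_pf_host_configure above is taken. The queue and descriptor counts, the rss_hf choice and the helper name are illustrative assumptions, and error handling is trimmed to the minimum.

#include <rte_ethdev.h>
#include <rte_lcore.h>

static const struct rte_eth_conf vf_rss_conf = {
	.rxmode = {
		.mq_mode = ETH_MQ_RX_VMDQ_RSS,
	},
	.rx_adv_conf = {
		.rss_conf = {
			.rss_hf = ETH_RSS_IP,	/* hash on IP headers */
		},
	},
};

/* Hypothetical helper: configure a port with 4 Rx/Tx queues so that RSS
 * can spread received packets over all of them. */
static int
setup_vf_rss_port(uint8_t port_id, struct rte_mempool *mb_pool)
{
	uint16_t q;
	int ret = rte_eth_dev_configure(port_id, 4, 4, &vf_rss_conf);

	if (ret < 0)
		return ret;
	for (q = 0; q < 4; q++) {
		ret = rte_eth_rx_queue_setup(port_id, q, 128, rte_socket_id(),
			NULL, mb_pool);
		if (ret < 0)
			return ret;
		ret = rte_eth_tx_queue_setup(port_id, q, 128, rte_socket_id(),
			NULL);
		if (ret < 0)
			return ret;
	}
	return rte_eth_dev_start(port_id);
}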