From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga09.intel.com (mga09.intel.com [134.134.136.24]) by dpdk.org (Postfix) with ESMTP id 4277B689B for ; Fri, 26 Dec 2014 02:26:29 +0100 (CET) Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by orsmga102.jf.intel.com with ESMTP; 25 Dec 2014 17:24:18 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.07,645,1413270000"; d="scan'208";a="653260738" Received: from pgsmsx102.gar.corp.intel.com ([10.221.44.80]) by fmsmga002.fm.intel.com with ESMTP; 25 Dec 2014 17:26:15 -0800 Received: from pgsmsx108.gar.corp.intel.com (10.221.44.103) by PGSMSX102.gar.corp.intel.com (10.221.44.80) with Microsoft SMTP Server (TLS) id 14.3.195.1; Fri, 26 Dec 2014 09:26:14 +0800 Received: from shsmsx101.ccr.corp.intel.com (10.239.4.153) by PGSMSX108.gar.corp.intel.com (10.221.44.103) with Microsoft SMTP Server (TLS) id 14.3.195.1; Fri, 26 Dec 2014 09:26:13 +0800 Received: from shsmsx102.ccr.corp.intel.com ([169.254.2.216]) by SHSMSX101.ccr.corp.intel.com ([169.254.1.110]) with mapi id 14.03.0195.001; Fri, 26 Dec 2014 09:26:12 +0800 From: "Ouyang, Changchun" To: Vlad Zolotarov , "dev@dpdk.org" Thread-Topic: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS Thread-Index: AQHQHznC9fbr7m6mLEynsGIYrmMT5JyeB1MAgAGQNdCAADQPgIABQoyQ Date: Fri, 26 Dec 2014 01:26:11 +0000 Message-ID: References: <1419389808-9559-1-git-send-email-changchun.ouyang@intel.com> <1419398584-19520-1-git-send-email-changchun.ouyang@intel.com> <1419398584-19520-6-git-send-email-changchun.ouyang@intel.com> <549A97F6.30901@cloudius-systems.com> <549C1359.7080107@cloudius-systems.com> In-Reply-To: <549C1359.7080107@cloudius-systems.com> Accept-Language: zh-CN, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.239.127.40] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 26 Dec 2014 01:26:30 -0000 > -----Original Message----- > From: Vlad Zolotarov [mailto:vladz@cloudius-systems.com] > Sent: Thursday, December 25, 2014 9:39 PM > To: Ouyang, Changchun; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS >=20 >=20 > On 12/25/14 04:43, Ouyang, Changchun wrote: > > Hi, > > Sorry miss some comments, so continue my response below, > > > >> -----Original Message----- > >> From: Vlad Zolotarov [mailto:vladz@cloudius-systems.com] > >> Sent: Wednesday, December 24, 2014 6:40 PM > >> To: Ouyang, Changchun; dev@dpdk.org > >> Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS > >> > >> > >> On 12/24/14 07:23, Ouyang Changchun wrote: > >>> It needs config RSS and IXGBE_MRQC and IXGBE_VFPSRTYPE to enable > VF > >> RSS. > >>> The psrtype will determine how many queues the received packets will > >>> distribute to, and the value of psrtype should depends on both facet: > >>> max VF rxq number which has been negotiated with PF, and the number > >>> of > >> rxq specified in config on guest. > >>> Signed-off-by: Changchun Ouyang > >>> --- > >>> lib/librte_pmd_ixgbe/ixgbe_pf.c | 15 +++++++ > >>> lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 92 > >> ++++++++++++++++++++++++++++++++++----- > >>> 2 files changed, 97 insertions(+), 10 deletions(-) > >>> > >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c > >>> b/lib/librte_pmd_ixgbe/ixgbe_pf.c index cbb0145..9c9dad8 100644 > >>> --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c > >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c > >>> @@ -187,6 +187,21 @@ int ixgbe_pf_host_configure(struct rte_eth_dev > >> *eth_dev) > >>> IXGBE_WRITE_REG(hw, IXGBE_MPSAR_LO(hw- > mac.num_rar_entries), 0); > >>> IXGBE_WRITE_REG(hw, IXGBE_MPSAR_HI(hw- > mac.num_rar_entries), 0); > >>> > >>> + /* > >>> + * VF RSS can support at most 4 queues for each VF, even if > >>> + * 8 queues are available for each VF, it need refine to 4 > >>> + * queues here due to this limitation, otherwise no queue > >>> + * will receive any packet even RSS is enabled. > >> According to Table 7-3 in the 82599 spec RSS is not available when > >> port is configured to have 8 queues per pool. This means that if u > >> see this configuration u may immediately disable RSS flow in your code= . > >> > >>> + */ > >>> + if (eth_dev->data->dev_conf.rxmode.mq_mode =3D=3D > >> ETH_MQ_RX_VMDQ_RSS) { > >>> + if (RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool =3D=3D 8) { > >>> + RTE_ETH_DEV_SRIOV(eth_dev).active =3D > >> ETH_32_POOLS; > >>> + RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool =3D 4; > >>> + RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx =3D > >>> + dev_num_vf(eth_dev) * 4; > >> According to 82599 spec u can't do that since RSS is not allowed when > >> port is configured to have 8 function per-VF. Have u verified that > >> this works? If yes, then spec should be updated. > >> > >>> + } > >>> + } > >>> + > >>> /* set VMDq map to default PF pool */ > >>> hw->mac.ops.set_vmdq(hw, 0, > >>> RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx); > >>> > >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>> index f69abda..a7c17a4 100644 > >>> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c > >>> @@ -3327,6 +3327,39 @@ ixgbe_alloc_rx_queue_mbufs(struct > >> igb_rx_queue *rxq) > >>> } > >>> > >>> static int > >>> +ixgbe_config_vf_rss(struct rte_eth_dev *dev) { > >>> + struct ixgbe_hw *hw; > >>> + uint32_t mrqc; > >>> + > >>> + ixgbe_rss_configure(dev); > >>> + > >>> + hw =3D IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > >>> + > >>> + /* MRQC: enable VF RSS */ > >>> + mrqc =3D IXGBE_READ_REG(hw, IXGBE_MRQC); > >>> + mrqc &=3D ~IXGBE_MRQC_MRQE_MASK; > >>> + switch (RTE_ETH_DEV_SRIOV(dev).active) { > >>> + case ETH_64_POOLS: > >>> + mrqc |=3D IXGBE_MRQC_VMDQRSS64EN; > >>> + break; > >>> + > >>> + case ETH_32_POOLS: > >>> + case ETH_16_POOLS: > >>> + mrqc |=3D IXGBE_MRQC_VMDQRSS32EN; > >> Again, this contradicts with the spec. > > Yes, the spec say the hw can't support vf rss at all, but experiment fi= nd that > could be done. >=20 > I have just realized something - why did u have to experiment at all? U w= ork > at Intel, don't u? Can't u just ask a HW engineer that have designed this= NIC? > What do u mean by an "experiment" here? HW and dpdk team are cross-geo and cross division, experiment is often need= ed Because no spec, doc, discussion, meeting minutes is perfect and can includ= e everything absolutely correct, And don't miss one thing, are you agree? =20 Anyway Let's focus on technical question, please.=20 =20 > From my experience u can't just write some random values in the register= s I don't think it is random data at all, do you see I use a random seed to g= enerate a random data here? > and conclude that if it worked for like 5 minutes it will continue to wor= k for > the next minute... There is always a clear procedure of how HW should be > initialized and used and that's the only way it may be used since this wa= s the > way the HW has been tested. U can't assume anything in regards to reliabi= lity > if u don't follow specs and programmer manuals of HW provider. > Sorry I don't think it doesn't follow spec and programmer manuals. and if you have a bit bigger picture here(just mean the overall process for= the vf rss enabling, and nothing else, Don't think too much, hope it is clear), I am sure you won't get confused h= ere. Or my observation is you probably have another perfect solution to support = vf rss in dpdk, Right? If you have it, you could also submit your patch here, this is open = source, I think you are free to do it. > Could u clarify, please? >=20 > > We can focus on discussing the implementation firstly. > > > >>> + break; > >>> + > >>> + default: > >>> + PMD_INIT_LOG(ERR, "Invalid pool number in IOV mode"); > >>> + return -EINVAL; > >>> + } > >>> + > >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc); > >>> + > >>> + return 0; > >>> +} > >>> + > >>> +static int > >>> ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev) > >>> { > >>> struct ixgbe_hw *hw =3D > >>> @@ -3358,24 +3391,38 @@ ixgbe_dev_mq_rx_configure(struct > >> rte_eth_dev *dev) > >>> default: ixgbe_rss_disable(dev); > >>> } > >>> } else { > >>> - switch (RTE_ETH_DEV_SRIOV(dev).active) { > >>> /* > >>> * SRIOV active scheme > >>> * FIXME if support DCB/RSS together with VMDq & SRIOV > >>> */ > >>> - case ETH_64_POOLS: > >>> - IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >> IXGBE_MRQC_VMDQEN); > >>> + switch (dev->data->dev_conf.rxmode.mq_mode) { > >>> + case ETH_MQ_RX_RSS: > >>> + case ETH_MQ_RX_VMDQ_RSS: > >>> + ixgbe_config_vf_rss(dev); > >>> break; > >>> > >>> - case ETH_32_POOLS: > >>> - IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >> IXGBE_MRQC_VMDQRT4TCEN); > >>> - break; > >>> + default: > >>> + switch (RTE_ETH_DEV_SRIOV(dev).active) { > >> Sorry for nitpicking but have u considered taking this encapsulated > >> "switch- case" block into a separate function? This could make the > >> code look a lot nicer. ;) > >> > >>> + case ETH_64_POOLS: > >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >>> + IXGBE_MRQC_VMDQEN); > >>> + break; > >>> > >>> - case ETH_16_POOLS: > >>> - IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >> IXGBE_MRQC_VMDQRT8TCEN); > >>> + case ETH_32_POOLS: > >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >>> + IXGBE_MRQC_VMDQRT4TCEN); > >>> + break; > >>> + > >>> + case ETH_16_POOLS: > >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, > >>> + IXGBE_MRQC_VMDQRT8TCEN); > >>> + break; > >>> + default: > >>> + PMD_INIT_LOG(ERR, > >>> + "invalid pool number in IOV mode"); > >>> + break; > >>> + } > >>> break; > >>> - default: > >>> - PMD_INIT_LOG(ERR, "invalid pool number in IOV > >> mode"); > >>> } > >>> } > >>> > >>> @@ -3989,10 +4036,32 @@ ixgbevf_dev_rx_init(struct rte_eth_dev > *dev) > >>> uint16_t buf_size; > >>> uint16_t i; > >>> int ret; > >>> + uint16_t valid_rxq_num; > >>> > >>> PMD_INIT_FUNC_TRACE(); > >>> hw =3D IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); > >>> > >>> + valid_rxq_num =3D RTE_MIN(dev->data->nb_rx_queues, > >>> +hw->mac.max_rx_queues); > >>> + > >>> + /* > >>> + * VMDq RSS can't support 3 queues, so config it into 4 queues, > >>> + * and give user a hint that some packets may loss if it doesn't > >>> + * poll the queue where those packets are distributed to. > >>> + */ > >>> + if (valid_rxq_num =3D=3D 3) > >>> + valid_rxq_num =3D 4; > >> Why to configure more queues that requested and not less (2)? Why to > >> configure anything at all and not return an error? > > Sorry, I don't agree this is "anything" you say, because I don't use 5,= 6,7, > 8, ..., 16, 2014, 2015,... etc. > > By considering 2 or 4, > > I prefer 4, the reason is if user need more than 3 queues per vf to do > > something, And pf has also the capability to setup 4 queues per vf, > > confining to 2 queues is also not good thing, So here try to enable 4 q= ueues, > and give user hints here. > > Btw, change it into 2 is another way, depends on other guys' more insig= ht > here. > > > >>> + > >>> + if (dev->data->nb_rx_queues > valid_rxq_num) { > >>> + PMD_INIT_LOG(ERR, "The number of Rx queue invalid, " > >>> + "it should be equal to or less than %d", > >>> + valid_rxq_num); > >>> + return -1; > >>> + } else if (dev->data->nb_rx_queues < valid_rxq_num) > >>> + PMD_INIT_LOG(ERR, "The number of Rx queue is less " > >>> + "than the number of available Rx queues:%d, " > >>> + "packets in Rx queues(q_id >=3D %d) may loss.", > >>> + valid_rxq_num, dev->data->nb_rx_queues); > >> Who ever looks in the "INIT_LOG" if everything "work well" and u make > >> it look so by allowing this call to succeed. And then some packets > >> will just silently not arrive?! And what the used should somehow guess= to > do? > >> - Look in the "INIT_LOG"?! This is a nightmare! > >> > >>> + > >>> /* > >>> * When the VF driver issues a IXGBE_VF_RESET request, the PF > >> driver > >>> * disables the VF receipt of packets if the PF MTU is > 1500. > >>> @@ -4094,6 +4163,9 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev) > >>> IXGBE_PSRTYPE_IPV6HDR; > >>> #endif > >>> > >>> + /* Set RQPL for VF RSS according to max Rx queue */ > >>> + psrtype |=3D (valid_rxq_num >> 1) << > >>> + IXGBE_PSRTYPE_RQPL_SHIFT; > >>> IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype); > >>> > >>> if (dev->data->dev_conf.rxmode.enable_scatter) {