From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wg0-f46.google.com (mail-wg0-f46.google.com [74.125.82.46]) by dpdk.org (Postfix) with ESMTP id 17F2A282 for ; Thu, 25 Dec 2014 14:13:43 +0100 (CET) Received: by mail-wg0-f46.google.com with SMTP id x13so13123134wgg.33 for ; Thu, 25 Dec 2014 05:13:43 -0800 (PST) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:message-id:date:from:user-agent:mime-version:to :subject:references:in-reply-to:content-type :content-transfer-encoding; bh=cHPyzvQq9aB/keeybnHxOCkl6Voii4o+/dKil1u0v4s=; b=dwP7dvppLjy640uZ0aWXsOUWrYX8VCVzN2ta2nwYPU+0GgjpIelozWUEyInBKwwtmY P8DGVnQ2O9tJRiXNM3WCZdw78j/EEhfqhB2ZCnumFse2r7KWDZ3wKrq6uYPEPsJo/4uT pbZrEOdFI/s/3toKa9K7J+77G0mh62R7x5j3nN3uurOikWTRG0kuZ7hNTYli8gMY53P5 LW/z3pdPXw4HqPBaX/OtTwn/h2NeRSybpMqisYC5OlQQveZF8E442PDEBl86kTM+SwX/ nxuvchY8sQRWTTrUnc1YJyXjSQ541SFWIj1DYAONZp2bIZxNJoMCOa0tirv9eKsiFvJ1 jeOQ== X-Gm-Message-State: ALoCoQntEipYXYnclSpg9PSe1co5qE7jRInPgBn3D09aasobq2XqfpmLflrPkdwjJv3Pn3O5dWEW X-Received: by 10.194.110.69 with SMTP id hy5mr73973302wjb.121.1419513222903; Thu, 25 Dec 2014 05:13:42 -0800 (PST) Received: from [10.0.0.165] (system.cloudius-systems.com. [84.94.198.183]) by mx.google.com with ESMTPSA id nj9sm24707383wic.10.2014.12.25.05.13.41 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 25 Dec 2014 05:13:42 -0800 (PST) Message-ID: <549C0D84.60305@cloudius-systems.com> Date: Thu, 25 Dec 2014 15:13:40 +0200 From: Vlad Zolotarov User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.3.0 MIME-Version: 1.0 To: "Ouyang, Changchun" , "dev@dpdk.org" References: <1419389808-9559-1-git-send-email-changchun.ouyang@intel.com> <1419398584-19520-1-git-send-email-changchun.ouyang@intel.com> <1419398584-19520-6-git-send-email-changchun.ouyang@intel.com> <549A97F6.30901@cloudius-systems.com> In-Reply-To: Content-Type: text/plain; charset=windows-1252; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Dec 2014 13:13:43 -0000 On 12/25/14 04:14, Ouyang, Changchun wrote: > Hi, > >> -----Original Message----- >> From: Vlad Zolotarov [mailto:vladz@cloudius-systems.com] >> Sent: Wednesday, December 24, 2014 6:40 PM >> To: Ouyang, Changchun; dev@dpdk.org >> Subject: Re: [dpdk-dev] [PATCH v3 5/6] ixgbe: Config VF RSS >> >> >> On 12/24/14 07:23, Ouyang Changchun wrote: >>> It needs config RSS and IXGBE_MRQC and IXGBE_VFPSRTYPE to enable VF >> RSS. >>> The psrtype will determine how many queues the received packets will >>> distribute to, and the value of psrtype should depends on both facet: >>> max VF rxq number which has been negotiated with PF, and the number of >> rxq specified in config on guest. >>> Signed-off-by: Changchun Ouyang >>> --- >>> lib/librte_pmd_ixgbe/ixgbe_pf.c | 15 +++++++ >>> lib/librte_pmd_ixgbe/ixgbe_rxtx.c | 92 >> ++++++++++++++++++++++++++++++++++----- >>> 2 files changed, 97 insertions(+), 10 deletions(-) >>> >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_pf.c >>> b/lib/librte_pmd_ixgbe/ixgbe_pf.c index cbb0145..9c9dad8 100644 >>> --- a/lib/librte_pmd_ixgbe/ixgbe_pf.c >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_pf.c >>> @@ -187,6 +187,21 @@ int ixgbe_pf_host_configure(struct rte_eth_dev >> *eth_dev) >>> IXGBE_WRITE_REG(hw, IXGBE_MPSAR_LO(hw- >>> mac.num_rar_entries), 0); >>> IXGBE_WRITE_REG(hw, IXGBE_MPSAR_HI(hw- >>> mac.num_rar_entries), 0); >>> >>> + /* >>> + * VF RSS can support at most 4 queues for each VF, even if >>> + * 8 queues are available for each VF, it need refine to 4 >>> + * queues here due to this limitation, otherwise no queue >>> + * will receive any packet even RSS is enabled. >> According to Table 7-3 in the 82599 spec RSS is not available when port is >> configured to have 8 queues per pool. This means that if u see this >> configuration u may immediately disable RSS flow in your code. >> > 8 queues here means the available number queue per vf, it is calculated according to max vfs, > e.g. if max vfs is 16(or less than), then each vf 'COULD' have 8 queues evenly, pf early init stage estimate this value, > but that is not precise, so need refine this. > User don't know this estimated value, it is internal value, not come from user's input/configure. > Hope it is clear to you. >>> + */ >>> + if (eth_dev->data->dev_conf.rxmode.mq_mode == >> ETH_MQ_RX_VMDQ_RSS) { >>> + if (RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool == 8) { >>> + RTE_ETH_DEV_SRIOV(eth_dev).active = >> ETH_32_POOLS; >>> + RTE_ETH_DEV_SRIOV(eth_dev).nb_q_per_pool = 4; >>> + RTE_ETH_DEV_SRIOV(eth_dev).def_pool_q_idx = >>> + dev_num_vf(eth_dev) * 4; >> According to 82599 spec u can't do that since RSS is not allowed when port is >> configured to have 8 function per-VF. Have u verified that this works? If yes, >> then spec should be updated. >> > Response as above, > Of course I have validated this. It works well. > >>> + } >>> + } >>> + >>> /* set VMDq map to default PF pool */ >>> hw->mac.ops.set_vmdq(hw, 0, >>> RTE_ETH_DEV_SRIOV(eth_dev).def_vmdq_idx); >>> >>> diff --git a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c >>> b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c >>> index f69abda..a7c17a4 100644 >>> --- a/lib/librte_pmd_ixgbe/ixgbe_rxtx.c >>> +++ b/lib/librte_pmd_ixgbe/ixgbe_rxtx.c >>> @@ -3327,6 +3327,39 @@ ixgbe_alloc_rx_queue_mbufs(struct >> igb_rx_queue *rxq) >>> } >>> >>> static int >>> +ixgbe_config_vf_rss(struct rte_eth_dev *dev) { >>> + struct ixgbe_hw *hw; >>> + uint32_t mrqc; >>> + >>> + ixgbe_rss_configure(dev); >>> + >>> + hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); >>> + >>> + /* MRQC: enable VF RSS */ >>> + mrqc = IXGBE_READ_REG(hw, IXGBE_MRQC); >>> + mrqc &= ~IXGBE_MRQC_MRQE_MASK; >>> + switch (RTE_ETH_DEV_SRIOV(dev).active) { >>> + case ETH_64_POOLS: >>> + mrqc |= IXGBE_MRQC_VMDQRSS64EN; >>> + break; >>> + >>> + case ETH_32_POOLS: >>> + case ETH_16_POOLS: >>> + mrqc |= IXGBE_MRQC_VMDQRSS32EN; >> Again, this contradicts with the spec. >> >>> + break; >>> + >>> + default: >>> + PMD_INIT_LOG(ERR, "Invalid pool number in IOV mode"); >>> + return -EINVAL; >>> + } >>> + >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, mrqc); >>> + >>> + return 0; >>> +} >>> + >>> +static int >>> ixgbe_dev_mq_rx_configure(struct rte_eth_dev *dev) >>> { >>> struct ixgbe_hw *hw = >>> @@ -3358,24 +3391,38 @@ ixgbe_dev_mq_rx_configure(struct >> rte_eth_dev *dev) >>> default: ixgbe_rss_disable(dev); >>> } >>> } else { >>> - switch (RTE_ETH_DEV_SRIOV(dev).active) { >>> /* >>> * SRIOV active scheme >>> * FIXME if support DCB/RSS together with VMDq & SRIOV >>> */ >>> - case ETH_64_POOLS: >>> - IXGBE_WRITE_REG(hw, IXGBE_MRQC, >> IXGBE_MRQC_VMDQEN); >>> + switch (dev->data->dev_conf.rxmode.mq_mode) { >>> + case ETH_MQ_RX_RSS: >>> + case ETH_MQ_RX_VMDQ_RSS: >>> + ixgbe_config_vf_rss(dev); >>> break; >>> >>> - case ETH_32_POOLS: >>> - IXGBE_WRITE_REG(hw, IXGBE_MRQC, >> IXGBE_MRQC_VMDQRT4TCEN); >>> - break; >>> + default: >>> + switch (RTE_ETH_DEV_SRIOV(dev).active) { >> Sorry for nitpicking but have u considered taking this encapsulated "switch- >> case" block into a separate function? This could make the code look a lot >> nicer. ;) > Only one place use it, so don't need make it a function, > And I prefer to the current code. Functions may be used not only to have a repeatedly called code but also to make a caller code more readable. Encapsulated switch-case is one of the examples of a *not* readable code constructs which should be avoided. > >>> + case ETH_64_POOLS: >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, >>> + IXGBE_MRQC_VMDQEN); >>> + break; >>> >>> - case ETH_16_POOLS: >>> - IXGBE_WRITE_REG(hw, IXGBE_MRQC, >> IXGBE_MRQC_VMDQRT8TCEN); >>> + case ETH_32_POOLS: >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, >>> + IXGBE_MRQC_VMDQRT4TCEN); >>> + break; >>> + >>> + case ETH_16_POOLS: >>> + IXGBE_WRITE_REG(hw, IXGBE_MRQC, >>> + IXGBE_MRQC_VMDQRT8TCEN); >>> + break; >>> + default: >>> + PMD_INIT_LOG(ERR, >>> + "invalid pool number in IOV mode"); >>> + break; >>> + } >>> break; >>> - default: >>> - PMD_INIT_LOG(ERR, "invalid pool number in IOV >> mode"); >>> } >>> } >>> >>> @@ -3989,10 +4036,32 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev) >>> uint16_t buf_size; >>> uint16_t i; >>> int ret; >>> + uint16_t valid_rxq_num; >>> >>> PMD_INIT_FUNC_TRACE(); >>> hw = IXGBE_DEV_PRIVATE_TO_HW(dev->data->dev_private); >>> >>> + valid_rxq_num = RTE_MIN(dev->data->nb_rx_queues, >>> +hw->mac.max_rx_queues); >>> + >>> + /* >>> + * VMDq RSS can't support 3 queues, so config it into 4 queues, >>> + * and give user a hint that some packets may loss if it doesn't >>> + * poll the queue where those packets are distributed to. >>> + */ >>> + if (valid_rxq_num == 3) >>> + valid_rxq_num = 4; >> Why to configure more queues that requested and not less (2)? Why to >> configure anything at all and not return an error? >> >>> + >>> + if (dev->data->nb_rx_queues > valid_rxq_num) { >>> + PMD_INIT_LOG(ERR, "The number of Rx queue invalid, " >>> + "it should be equal to or less than %d", >>> + valid_rxq_num); >>> + return -1; >>> + } else if (dev->data->nb_rx_queues < valid_rxq_num) >>> + PMD_INIT_LOG(ERR, "The number of Rx queue is less " >>> + "than the number of available Rx queues:%d, " >>> + "packets in Rx queues(q_id >= %d) may loss.", >>> + valid_rxq_num, dev->data->nb_rx_queues); >> Who ever looks in the "INIT_LOG" if everything "work well" and u make it >> look so by allowing this call to succeed. And then some packets will just >> silently not arrive?! And what the used should somehow guess to do? >> - Look in the "INIT_LOG"?! This is a nightmare! > Sorry, I don't think so again, if user find any packets loss, he will care for log, > Then he can find that log there, then user can refine its rxq number due the wrong rxq number, > Why is it a nightmare? Because usually u expect that if the function call returns with a success it means a success. Why a user has to learn that a device configuration function was provided with wrong parameters from the packet loss? If parameters are not allowed u expect to get an error as a return value. Since when errors are returned in a form of a log message? Why do u think there is a living person running a DPDK based application? How do u expect somebody build an automated environment when part of errors are returned in some log? Should he/she add a log parser? On the other hand, why do u think 4 queues is a better option for a user than 2 queue when he asked for 3 queues? What kind of heuristics is that? To summarize - it would be much better if u just returned an EINVAL error in that case. > > I don't agree with you about "silently not arrive", because we have hint/log there. > > Return error here is also possible way, It's the only possible way! ;) > Again need other guys' insight here. > >>> + >>> /* >>> * When the VF driver issues a IXGBE_VF_RESET request, the PF >> driver >>> * disables the VF receipt of packets if the PF MTU is > 1500. >>> @@ -4094,6 +4163,9 @@ ixgbevf_dev_rx_init(struct rte_eth_dev *dev) >>> IXGBE_PSRTYPE_IPV6HDR; >>> #endif >>> >>> + /* Set RQPL for VF RSS according to max Rx queue */ >>> + psrtype |= (valid_rxq_num >> 1) << >>> + IXGBE_PSRTYPE_RQPL_SHIFT; >>> IXGBE_WRITE_REG(hw, IXGBE_VFPSRTYPE, psrtype); >>> >>> if (dev->data->dev_conf.rxmode.enable_scatter) {