From: Ferruh Yigit
Date: Mon, 19 Oct 2020 14:07:56 +0100
To: "Ananyev, Konstantin", "Yang, SteveX", "Zhang, Qi Z", "dev@dpdk.org"
Cc: "Zhao1, Wei", "Guo, Jia", "Yang, Qiming", "Wu, Jingjing", "Xing, Beilei", "Stokes, Ian"
Subject: Re: [dpdk-dev] [PATCH v4 3/5] net/ice: fix max mtu size packets with vlan tag cannot be received by default
Message-ID: <483bd509-82b9-9724-d28c-c517ef091e0c@intel.com>
References: <20200923040909.73418-1-stevex.yang@intel.com>
 <20200928065541.7520-1-stevex.yang@intel.com>
 <20200928065541.7520-4-stevex.yang@intel.com>
 <8459e979b76c43cdbd5a9fbd809f9b00@intel.com>
 <6ad9e3ec00194e31891d97849135655c@intel.com>
 <7704b7ce95fd4db2a9c6a8a33c3f0805@intel.com>
 <77ac2293-e532-e702-2370-c07cdd957c57@intel.com>

On 10/19/2020 11:49 AM, Ananyev, Konstantin wrote:
>
>>> -----Original Message-----
>>> From: Ferruh Yigit
>>> Sent: Wednesday, October 14, 2020 11:38 PM
>>> To: Zhang, Qi Z; Yang, SteveX; Ananyev, Konstantin; dev@dpdk.org
>>> Cc: Zhao1, Wei; Guo, Jia; Yang, Qiming; Wu, Jingjing; Xing, Beilei;
>>> Stokes, Ian
>>> Subject: Re: [dpdk-dev] [PATCH v4 3/5] net/ice: fix max mtu size packets
>>> with vlan tag cannot be received by default
>>>
>>> On 9/30/2020 3:32 AM, Zhang, Qi Z wrote:
>>>>
>>>>> -----Original Message-----
>>>>> From: Yang, SteveX
>>>>> Sent: Wednesday, September 30, 2020 9:32 AM
>>>>> To: Zhang, Qi Z; Ananyev, Konstantin; dev@dpdk.org
>>>>> Cc: Zhao1, Wei; Guo, Jia; Yang, Qiming; Wu, Jingjing; Xing, Beilei
>>>>> Subject: RE: [PATCH v4 3/5] net/ice: fix max mtu size packets with
>>>>> vlan tag cannot be received by default
>>>>>
>>>>>> -----Original Message-----
>>>>>> From: Zhang, Qi Z
>>>>>> Sent: Wednesday, September 30, 2020 8:35 AM
>>>>>> To: Ananyev, Konstantin; Yang, SteveX; dev@dpdk.org
>>>>>> Cc: Zhao1, Wei; Guo, Jia; Yang, Qiming; Wu, Jingjing; Xing, Beilei
>>>>>> Subject: RE: [PATCH v4 3/5] net/ice: fix max mtu size packets with
>>>>>> vlan tag cannot be received by default
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Ananyev, Konstantin
>>>>>>> Sent: Wednesday, September 30, 2020 7:02 AM
>>>>>>> To: Zhang, Qi Z; Yang, SteveX; dev@dpdk.org
>>>>>>> Cc: Zhao1, Wei; Guo, Jia; Yang, Qiming; Wu, Jingjing; Xing, Beilei
>>>>>>> Subject: RE: [PATCH v4 3/5] net/ice: fix max mtu size packets with
>>>>>>> vlan tag cannot be received by default
>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Yang, SteveX
>>>>>>>>> Sent: Monday, September 28, 2020 2:56 PM
>>>>>>>>> To: dev@dpdk.org
>>>>>>>>> Cc: Zhao1, Wei; Guo, Jia; Yang, Qiming; Zhang, Qi Z;
>>>>>>>>> Wu, Jingjing; Xing, Beilei; Ananyev, Konstantin; Yang, SteveX
>>>>>>>>> Subject: [PATCH v4 3/5] net/ice: fix max mtu size packets with
>>>>>>>>> vlan tag cannot be received by default
>>>>>>>>>
>>>>>>>>> testpmd will initialize default max packet length to 1518 which
>>>>>>>>> doesn't include vlan tag size in ether overheader. Once, send the
>>>>>>>>> max mtu length packet with vlan tag, the max packet length will
>>>>>>>>> exceed 1518 that will cause packets dropped directly from NIC hw
>>>>>>>>> side.
>>>>>>>>>
>>>>>>>>> ice can support dual vlan tags that need more 8 bytes for max
>>>>>>>>> packet size, so, configures the correct max packet size in
>>>>>>>>> dev_config ops.
>>>>>>>>>
>>>>>>>>> Fixes: 50cc9d2a6e9d ("net/ice: fix max frame size")
>>>>>>>>>
>>>>>>>>> Signed-off-by: SteveX Yang
>>>>>>>>> ---
>>>>>>>>>  drivers/net/ice/ice_ethdev.c | 11 +++++++++++
>>>>>>>>>  1 file changed, 11 insertions(+)
>>>>>>>>>
>>>>>>>>> diff --git a/drivers/net/ice/ice_ethdev.c b/drivers/net/ice/ice_ethdev.c
>>>>>>>>> index cfd357b05..6b7098444 100644
>>>>>>>>> --- a/drivers/net/ice/ice_ethdev.c
>>>>>>>>> +++ b/drivers/net/ice/ice_ethdev.c
>>>>>>>>> @@ -3146,6 +3146,7 @@ ice_dev_configure(struct rte_eth_dev *dev)
>>>>>>>>>  struct ice_adapter *ad =
>>>>>>>>>  ICE_DEV_PRIVATE_TO_ADAPTER(dev->data->dev_private);
>>>>>>>>>  struct ice_pf *pf =
>>>>>>>>>  ICE_DEV_PRIVATE_TO_PF(dev->data->dev_private);
>>>>>>>>> +uint32_t frame_size = dev->data->mtu + ICE_ETH_OVERHEAD;
>>>>>>>>>  int ret;
>>>>>>>>>
>>>>>>>>>  /* Initialize to TRUE. If any of Rx queues doesn't meet the
>>>>>>>>> @@ -3157,6 +3158,16 @@ ice_dev_configure(struct rte_eth_dev *dev)
>>>>>>>>>  if (dev->data->dev_conf.rxmode.mq_mode & ETH_MQ_RX_RSS_FLAG)
>>>>>>>>>  dev->data->dev_conf.rxmode.offloads |= DEV_RX_OFFLOAD_RSS_HASH;
>>>>>>>>>
>>>>>>>>> +/**
>>>>>>>>> + * Considering QinQ packet, max frame size should be equal or
>>>>>>>>> + * larger than total size of MTU and Ether overhead.
>>>>>>>>> + */
>>>>>>>>
>>>>>>>>> +if (frame_size > dev->data->dev_conf.rxmode.max_rx_pkt_len) {
>>>>>>>>
>>>>>>>>
>>>>>>>> Why do we need this check?
>>>>>>>> Can we just call ice_mtu_set directly?
>>>>>>>
>>>>>>> I think that without that check we can silently overwrite the
>>>>>>> dev_conf.rxmode.max_rx_pkt_len value provided by the user.
>>>>>>
>>>>>> OK, I see
>>>>>>
>>>>>> But I still have one question:
>>>>>> dev->data->mtu is initialized to 1518 as default, but if the
>>>>>> application sets
>>>>>> dev_conf.rxmode.max_rx_pkt_len = 1000 in dev_configure,
>>>>>> does that mean we will still set mtu to 1518? Is this expected?
>>>>>>
>>>>>
>>>>> max_rx_pkt_len should be larger than mtu at least, so we should raise
>>>>> the max_rx_pkt_len (e.g.: 1518) to hold expected mtu value (e.g.: 1500).
>>>>
>>>> Ok, this describes the problem more generally; better to use it to
>>>> replace the existing code comment and commit log for easy understanding.
>>>> Please send a new version with the rewording.
>>>>
>>>
>>> I didn't really get this set.
>>>
>>> Application explicitly sets 'max_rx_pkt_len' to '1518', and a frame bigger than
>>> this size is dropped.
>>
>> Sure, dropping oversize data is the normal case.
>>
>>> Isn't this what should be, why we are trying to overwrite user configuration
>>> in PMD to prevent this?
>>>
>>
>> But it is a conflict when the application/user sets mtu & max_rx_pkt_len at the same time.
>> This fix will make a decision when the conflict occurs.
>> MTU value will come from user operation (e.g.: port config mtu 0 1500) directly,
>> so the max_rx_pkt_len will resize itself to fit the expected MTU value if its
>> size is smaller than MTU + Ether overhead.
>>
>>> During eth_dev allocation, mtu set to default '1500', by ethdev layer.
>>> And testpmd sets 'max_rx_pkt_len' by default to '1518'.
>>> I think Qi's concern above is valid, what if user set 'max_rx_pkt_len' to '1000'
>>> and mean it? PMD will not honor the user config.
>>
>> I'm not sure: when 'mtu' is set to '1500' and 'max_rx_pkt_len' to '1000', what is the expected behavior?
>> If we still keep the 'max_rx_pkt_len' value, that means the larger 'mtu' will be invalid.
>>
>>>
>>> Why not simply increase the default 'max_rx_pkt_len' in testpmd?
>>>
>> The default 'max_rx_pkt_len' has been initialized to a generic value (1518) and the default 'mtu' is '1500' in testpmd,
>> but that isn't suitable for those NIC drivers whose Ether overhead is larger than 18 (e.g.: ice, i40e) if the 'mtu' value is preferred.
>>
>>> And I guess even better what we need is to tell to the application what the
>>> frame overhead PMD accepts.
>>> So the application can set proper 'max_rx_pkt_len' value per port for a
>>> given/requested MTU value.
>>> @Ian, cc'ed, was complaining almost same thing years ago, these PMD
>>> overhead macros and 'max_mtu'/'min_mtu' added because of that, perhaps
>>> he has a solution now?
>
> From my perspective the main problem here:
> We have 2 different variables for nearly the same thing:
> rte_eth_dev_data.mtu and rte_eth_dev_data.dev_conf.max_rx_pkt_len.
> and 2 different API to update them: dev_mtu_set() and dev_configure().

According to the API, 'max_rx_pkt_len' is 'Only used if JUMBO_FRAME enabled'.
Although I am not sure that is practically what is done for all drivers.

> And inside majority of Intel PMDs we don't keep these 2 variables in sync:
> - mtu_set() will update both variables.
> - dev_configure() will update only max_rx_pkt_len, but will keep mtu intact.
>
> This patch fixes this inconsistency, which I think is a good thing.
> Though yes, it introduces change in behaviour.
>
> Let say the code:
> rte_eth_dev_set_mtu(port, 1500);
> dev_conf.max_rx_pkt_len = 1000;
> rte_eth_dev_configure(port, 1, 1, &dev_conf);
>

'rte_eth_dev_configure()' is one of the first APIs called; it is called before
'rte_eth_dev_set_mtu()'.

When 'rte_eth_dev_configure()' is called, MTU is set to '1500' by default by
the ethdev layer, so it is not user configuration, but 'max_rx_pkt_len' is.

And later, when 'rte_eth_dev_set_mtu()' is called, both MTU and
'max_rx_pkt_len' are updated (mostly).

> Before the patch will result:
> mtu==1500, max_rx_pkt_len=1000; //out of sync looks wrong to me
>
> After the patch:
> mtu=1500, max_rx_pkt_len=1518; // in sync, change in behaviour.
>
> If you think we need to preserve current behaviour,
> then I suppose the easiest thing would be to change dev_config() code
> to update mtu value based on max_rx_pkt_len.
> I.E: dev_configure {...; mtu_set(max_rx_pkt_len - OVERHEAD); ...}
> So the code snippet above will result:
> mtu=982,max_rx_pkt_len=1000;
>

The 'max_rx_pkt_len' has been an annoyance for a long time; what do you think
about just dropping it?

By default the device will be up with the default MTU (1500); later
'rte_eth_dev_set_mtu' can be used to set the MTU, with no frame size setting
at all.

Will this work?

And for the short term, for the above Intel PMDs, there must be a place where
this 'max_rx_pkt_len' value is taken into account (mostly 'start()' dev_ops).
That function can be updated to take 'max_rx_pkt_len' only if JUMBO_FRAME is
set, otherwise use the 'MTU' value.

Without 'start()' updated, the current logic won't work after stop & start
anyway.

> Konstantin
>
>>
>>>
>>> And why this same thing can't happen to other PMDs? If this is a problem for
>>> all PMDs, we should solve in other level, not for only some PMDs.
>>>
>> No, all PMDs have the same issue. Another proposal:
>> - rte_ethdev provides a single place to resize 'max_rx_pkt_len' in rte_eth_dev_configure();
>> - provide a uniform API for fetching the NIC's supported Ether overhead size;
>> Is it feasible?
>>
>>>>
>>>>> Generally, the mtu value can be adjusted by the user (e.g.: ip link
>>>>> set ens801f0 mtu 1400), hence, we just adjust the max_rx_pkt_len to
>>>>> satisfy the mtu requirement.
>>>>>
>>>>>> Should we just call ice_mtu_set(dev, dev_conf.rxmode.max_rx_pkt_len)
>>>>>> here?
>>>>> ice_mtu_set(dev, mtu) will add the ether overhead to
>>>>> frame_size/max_rx_pkt_len, so we need to pass the mtu value as the 2nd
>>>>> parameter, not the max_rx_pkt_len.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>> And please remove above comment, since ether overhead is already
>>>>>>>> considered in ice_mtu_set.
>>>>> Ether overhead is already considered in ice_mtu_set, but it also
>>>>> should be considered in the condition that decides whether ice_mtu_set
>>>>> needs to be invoked.
>>>>> So, it perhaps should remain this comment before this if() condition.
>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> +ret = ice_mtu_set(dev, dev->data->mtu);
>>>>>>>>> +if (ret != 0)
>>>>>>>>> +return ret;
>>>>>>>>> +}
>>>>>>>>> +
>>>>>>>>>  ret = ice_init_rss(pf);
>>>>>>>>>  if (ret) {
>>>>>>>>>  PMD_DRV_LOG(ERR, "Failed to enable rss for PF");
>>>>>>>>> --
>>>>>>>>> 2.17.1
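
For reference, the arithmetic argued over in this thread can be sketched in plain C. This is only an illustration under stated assumptions, not DPDK code: every `sketch_`/`SKETCH_` name is hypothetical, the 18-byte overhead is the generic value implied by the 1500/1518 defaults quoted above, and the 26-byte value assumes the two VLAN tags (8 more bytes) described in the commit log.

```c
#include <stdint.h>

/* Hypothetical constants for illustration; these are not DPDK macros. */
#define SKETCH_ETHER_HDR_LEN 14u /* destination MAC + source MAC + EtherType */
#define SKETCH_ETHER_CRC_LEN 4u  /* frame check sequence */
#define SKETCH_VLAN_TAG_LEN  4u  /* one 802.1Q tag */

/* Generic overhead behind the 1518 default: 14 + 4 = 18 bytes. */
#define SKETCH_OVERHEAD_BASIC (SKETCH_ETHER_HDR_LEN + SKETCH_ETHER_CRC_LEN)
/* Assumed overhead with two VLAN tags (QinQ): 18 + 8 = 26 bytes. */
#define SKETCH_OVERHEAD_QINQ (SKETCH_OVERHEAD_BASIC + 2u * SKETCH_VLAN_TAG_LEN)

/* Frame size needed on the wire to carry a payload of 'mtu' bytes. */
static uint32_t
sketch_frame_size(uint32_t mtu, uint32_t overhead)
{
	return mtu + overhead;
}

/*
 * Behaviour proposed by the patch: keep the MTU and raise max_rx_pkt_len
 * only when it is too small to hold MTU + overhead.
 */
static uint32_t
sketch_resize_max_rx_pkt_len(uint32_t mtu, uint32_t max_rx_pkt_len,
			     uint32_t overhead)
{
	uint32_t needed = sketch_frame_size(mtu, overhead);

	return max_rx_pkt_len < needed ? needed : max_rx_pkt_len;
}

/*
 * Alternative raised in the thread: keep max_rx_pkt_len and derive the
 * MTU from it. Returns 0 if the frame cannot even hold the overhead.
 */
static uint32_t
sketch_mtu_from_max_rx_pkt_len(uint32_t max_rx_pkt_len, uint32_t overhead)
{
	return max_rx_pkt_len > overhead ? max_rx_pkt_len - overhead : 0;
}
```

With an MTU of 1500 and the 18-byte overhead, `sketch_resize_max_rx_pkt_len(1500, 1000, SKETCH_OVERHEAD_BASIC)` gives 1518, which is the "after the patch" case, while `sketch_mtu_from_max_rx_pkt_len(1000, SKETCH_OVERHEAD_BASIC)` gives 982, which is the alternative that preserves the user's 'max_rx_pkt_len'. With the assumed QinQ overhead, a 1500-byte MTU needs a 1526-byte frame, which is why the 1518 default drops such packets.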