From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 00649A09FF; Mon, 11 Jan 2021 15:04:01 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6E0FD140D10; Mon, 11 Jan 2021 15:04:01 +0100 (CET) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by mails.dpdk.org (Postfix) with ESMTP id 7BB40140CF9 for ; Mon, 11 Jan 2021 15:03:59 +0100 (CET) IronPort-SDR: mX15WZcffz0gG7W6UZ9Qvc03aCXJnFRFsm3X9OwLmZFiw0/VUU3nUCfweqCY3ZDVZ9klQBOtEn dBgaSHwXMUFw== X-IronPort-AV: E=McAfee;i="6000,8403,9860"; a="177086496" X-IronPort-AV: E=Sophos;i="5.79,338,1602572400"; d="scan'208";a="177086496" Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga103.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Jan 2021 06:02:46 -0800 IronPort-SDR: r8YIxmPPGFKb9m5Q/18Ji4wyXRx5ZnL1tNA13Md5qshZDhvTZDEs4W+DlSqjziCi7v7/KvN/Ed JD3AJdc4bOKQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.79,338,1602572400"; d="scan'208";a="352611828" Received: from fmsmsx602.amr.corp.intel.com ([10.18.126.82]) by fmsmga008.fm.intel.com with ESMTP; 11 Jan 2021 06:02:46 -0800 Received: from shsmsx606.ccr.corp.intel.com (10.109.6.216) by fmsmsx602.amr.corp.intel.com (10.18.126.82) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5; Mon, 11 Jan 2021 06:02:45 -0800 Received: from shsmsx601.ccr.corp.intel.com (10.109.6.141) by SHSMSX606.ccr.corp.intel.com (10.109.6.216) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5; Mon, 11 Jan 2021 22:02:43 +0800 Received: from shsmsx601.ccr.corp.intel.com ([10.109.6.141]) by SHSMSX601.ccr.corp.intel.com ([10.109.6.141]) with mapi id 15.01.1713.004; Mon, 11 Jan 2021 22:02:43 +0800 From: "Zhang, Qi Z" To: Thomas Monjalon , "Yigit, Ferruh" , "Guo, Jia" CC: Andrew Rybchenko , Ori Kam , "Wu, Jingjing" , "Yang, Qiming" , "Wang, Haiyue" , "dev@dpdk.org" , Gregory Etelson , "maxime.coquelin@redhat.com" , "jerinj@marvell.com" , "ajit.khaparde@broadcom.com" Thread-Topic: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri Thread-Index: AQHW2cLnLDO52W/MnUi6XmgEhPhIWqoauFcAgAC+FICAAAryAIAAqDkQ//+QKICAAIuhsP//rcCAgAEHCvCAAATsgIAACNMAgAASpwCAAXQdcIABs1eAgAF7OwCAAI+HcP//lfKAABD1cDA= Date: Mon, 11 Jan 2021 14:02:43 +0000 Message-ID: References: <20201216085854.7842-1-jia.guo@intel.com> <8947440.8VpXZ41EjM@thomas> <60e2772d01654b9c8757867f0632811e@intel.com> <9702007.LWHD5a8EkP@thomas> In-Reply-To: <9702007.LWHD5a8EkP@thomas> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-reaction: no-action dlp-version: 11.5.1.3 dlp-product: dlpe-windows x-originating-ip: [10.239.127.36] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for ecpri X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > -----Original Message----- > From: Thomas Monjalon > Sent: Monday, January 11, 2021 7:38 PM > To: Yigit, Ferruh ; Guo, Jia ;= Zhang, > Qi Z > Cc: Andrew Rybchenko ; Ori Kam > ; Wu, Jingjing ; Yang, Qiming > ; Wang, Haiyue ; > dev@dpdk.org; Gregory Etelson ; > maxime.coquelin@redhat.com; jerinj@marvell.com; > ajit.khaparde@broadcom.com > Subject: Re: [dpdk-dev] [dpdk-dev v2 1/2] ethdev: add new tunnel type for= ecpri >=20 > 11/01/2021 12:26, Zhang, Qi Z: > > From: Thomas Monjalon > > > 10/01/2021 11:46, Ori Kam: > > > > From: Zhang, Qi Z > > > > > From: Thomas Monjalon > > > > > > 08/01/2021 10:29, Andrew Rybchenko: > > > > > > > On 1/8/21 11:57 AM, Ferruh Yigit wrote: > > > > > > > > On 1/8/2021 1:41 AM, Zhang, Qi Z wrote: > > > > > > > >> From: Thomas Monjalon > > > > > > > >>> 07/01/2021 16:24, Zhang, Qi Z: > > > > > > > >>>> From: Thomas Monjalon > > > > > > > >>>>> 07/01/2021 13:47, Zhang, Qi Z: > > > > > > > >>>>>> From: Thomas Monjalon > > > > > > > >>>>>>> 07/01/2021 10:32, Guo, Jia: > > > > > > > >>>>>>>> From: Thomas Monjalon > > > > > > > >>>>>>>>> 24/12/2020 07:59, Jeff Guo: > > > > > > > >>>>>>>>>> --- a/lib/librte_ethdev/rte_ethdev.h > > > > > > > >>>>>>>>>> +++ b/lib/librte_ethdev/rte_ethdev.h > > > > > > > >>>>>>>>>> @@ -1219,6 +1219,7 @@ enum rte_eth_tunnel_type { > > > > > > > >>>>>>>>>> RTE_TUNNEL_TYPE_IP_IN_GRE, > > > > > > > >>>>>>>>>> RTE_L2_TUNNEL_TYPE_E_TAG, > > > > > > > >>>>>>>>>> RTE_TUNNEL_TYPE_VXLAN_GPE, > > > > > > > >>>>>>>>>> + RTE_TUNNEL_TYPE_ECPRI, > > > > > > > >>>>>>>>>> RTE_TUNNEL_TYPE_MAX, > > > > > > > >>>>>>>>>> }; > > > > > > > >>>>>>>>> > > > > > > > >>>>>>>>> We tried to remove all these legacy API in DPDK 20.= 11. > > > > > > > >>>>>>>>> Andrew decided to not remove this one because it > > > > > > > >>>>>>>>> is not yet completely replaced by rte_flow in all d= rivers. > > > > > > > >>>>>>>>> However, I am against continuing to update this API= . > > > > > > > >>>>>>>>> The opposite work should be done: migrate to rte_fl= ow. > > > > > > > >>>>>>>> > > > > > > > >>>>>>>> Agree but seems that the legacy api and driver > > > > > > > >>>>>>>> legacy implementation still keep in this release, > > > > > > > >>>>>>>> and there is no a general way to replace the legacy = by > rte_flow right now. > > > > > > > >>>>>>> > > > > > > > >>>>>>> I think rte_flow is a complete replacement with more > features. > > > > > > > >>>>>> > > > > > > > >>>>>> Thomas, I may not agree with this. > > > > > > > >>>>>> > > > > > > > >>>>>> Actually the "enum rte_eth_tunnel_type" is used by > > > > > > > >>>>>> rte_eth_dev_udp_tunnel_port_add A packet with > > > > > > > >>>>>> specific dst udp port will be recognized as a > > > > > > > >>>>>> specific tunnel packet type > > > (e.g. > > > > > > > >>>>>> vxlan, vxlan-gpe, > > > > > > > >>>>> ecpri...) In Intel NIC, the API actually changes the > > > > > > > >>>>> configuration of the packet parser in HW but not add a > > > > > > > >>>>> filter rule and I guess all other devices may enable > > > > > > > >>>>> it in a similar > > > way. > > > > > > > >>>>>> so naturally it should be a device (port) level > > > > > > > >>>>>> configuration but not a rte_flow > > > > > > > >>>>> rule for match, encap, decap... > > > > > > > >>>>> > > > > > > > >>>>> I don't understand how it helps to identify an UDP > > > > > > > >>>>> port if there is no rule for this tunnel. > > > > > > > >>>>> What is the usage? > > > > > > > >>>> > > > > > > > >>>> Yes, in general It is a rule, it matches a udp packet's > > > > > > > >>>> dst port and the action is > > > > > > > >>> "now the packet is identified as vxlan packet" then all > > > > > > > >>> other rte_flow rules that match for a vlxan as pattern > > > > > > > >>> will take > > > effect. > > > > > > > >>> but somehow, I think they are not rules in the same > > > > > > > >>> domain, just like we have dedicate API for mac/vlan > > > > > > > >>> filter, we'd better have a dedicate API for this also. ( > > > > > > > >>> RFC for Vxlan explains why we need this. > > > > > https://nam11.safelinks.protection.outlook.com/?url=3Dhttps%3A%2F= % > > > > > 2Fto > > > > > ols.ietf > > > > > .org%2Fhtml%2Frfc7348&data=3D04%7C01%7Corika%40nvidia.com > %7C > > > 46b2 > > > > > > > > > d8f48944422f0d9008d8b43a2293%7C43083d15727340c1b7db39efd9ccc17a > > > %7 > > > > > > > > > C0%7C0%7C637457509081543237%7CUnknown%7CTWFpbGZsb3d8eyJWIjoi > > > MC > > > > > > > > > 4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&a > > > > > > > > > mp;sdata=3DRYWFMjuxkcUZ982kK2s44tBAjf%2FTkDyaa7VEybCtxOo%3D&r > > > es > > > > > erved=3D0). > > > > > > > >>>> > > > > > > > >>>> "Destination Port: IANA has assigned the value 4789 for > > > > > > > >>>> the VXLAN UDP port, and this value SHOULD be used by > > > > > > > >>>> default as the destination UDP port. Some early > > > > > > > >>>> implementations of VXLAN have used other values for the > > > > > > > >>>> destination port. To enable interoperability with > > > > > > > >>>> these implementations, the destination port > > > > > > SHOULD be configurable." > > > > > > > >>> > > > > > > > >>> Yes the port number is free. > > > > > > > >>> But isn't it more natural to specify this port number as > > > > > > > >>> part of the rte_flow rule? > > > > > > > >> > > > > > > > >> I think if we have a rte_flow action type that can be > > > > > > > >> used to set a packet's tunnel type xxx, like below #flow > > > > > > > >> create eth/ipv4/udp port is 4789/... action > > > > > > > >> set_tunnel_type VxLAN / end then we may replace it with > > > > > > > >> rte_flow, but I'm not sure if it's necessary, please share= if you have > a better idea. > > > > > > > > > > > > Of course we can specify the UDP port in rte_flow rule. > > > > > > Please check rte_flow_item_udp. > > > > > > That's a basic of rte_flow. > > > > > > > > > > Its not about the pattern match, it's about the action, what we > > > > > need is a rte_flow action to "define a packet's tunnel type", but= we don't > have. > > > > > > A packet type alone is meaningless. > > > It is always associated to an action, this is what rte_flow does. > > > > As I mentioned in previous, this is a device (port) level configuration= , so it can > only be configured by a PF driver or a privileged VF base on our security= model. > > A typical usage in a NFV environment could be: > > > > 1. A privileged VF (e.g. ice_dcf PMD) use rte_eth_dev_udp_tunnel_port_a= dd > to create tunnel port for eCPRI, them this will impact on all VFs in the = same PF. > > 2. A normal VF driver can create rte_flow rule that match specific patc= h for > queue steering or apply RSS for eCPRI packets, but it DON'T have the > permission to define the tunnel port. >=20 > Whaooh! A normal Intel VF is not allowed to match the tunnel it wants if = not > enabled by a priviledged VF? > I would say it is a HW design flaw, but that's not the question.=09 Why you think this is a design flaw? in real case, is it a typical requirem= ent that different VF need different tunnel port for eCPRI (or VxLan) on th= e same PF?=20 I believe it's not necessary to make it as a per VF resource in most cases,= and I will be surprise if a driver that allow any VF to change the share r= esource without any privilege control. Btw I guess mlx NIC has more flexible way to handle ecpri tunnel, just curi= ous how it works, what's the expected result of below rules? 1. create flow eth / ipv4 / udp dst is 1234 / ecpri msgtype is 0 / ... to q= ueue 0 2. create flow eth / ipv4 / udp dst is 5678 / ecrpi msgtype is 1 / ... to q= ueue 1. So both 1234 and 5678 will be regarded as an ECPRI packet? Or only the firs= t one will work? does dst udp port is always needed if an ecpri pattern is involved? >=20 > > So it does not help to have a rte_flow that match dst udp port as > > tunnel while have an action in the same rule. >=20 > Now I understand you are truly looking for a device configuration. > But as it looks more as a HW limitation, I don't think it should be part = of ethdev. > The fact is that it is already part of ethdev... >=20 > I don't know what to say. I need to digest how Intel limitation is "still= " trying to > drive some parts of the "generic" API. >=20 >=20 > > > > > #flow create eth/ipv4/udp port is 4789/... action > > > > > set_tunnel_type VxLAN > > > > > > > > > > I think rte_eth_dev_udp_tunnel_port_add does this job well > > > > > already, if we plan to move it to rte_flow, at least we need a re= placement > solution. > > > > > > The documentation does not say why it is useful. > > > With rte_flow you don't need it because a flow is specified with its = action. > > > > > > > > > > Let me see if I understand it correctly. > > > > In your case, the issue is that you need to configure the HW to > > > > parse the > > > packet correctly right? > > > > It is not about the matching it is about the configuration of the > > > > HW, you wish to tell the HW that the packet should be parsed by > > > > different means > > > correct? > > > > > > > > If this is the case it sounds to me that you should use rte_flow > > > > and if the user adds the following rule: > > > > #flow create pattern eth / ivp4 / udp port is 4789 / .. action ....= . > > > > You simply need to configure your HW to support the ecpri configura= tion. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Isn't this more a device configuration than filtering, not > > > > > > > > sure about using rte_flow for this. > > > > > > > > > > > > > > +1 > > > > > > > > > > > > A device configuration? No, setting an UDP port is a stack > configuration. > > > > > > > > > > > > > > > > > > > >> BTW, are we going to move all other filter like mac , > > > > > > > >> VLAN filter/strip/insert into rte_flow finally? > > > > > > > > > > > > Yes I think it should be the direction. > > > > > > All of this can be achieved with rte_flow. > > > > > > > > > > > > > > > > > > > >> if that's the plan, though I don't have much inputs for > > > > > > > >> this right now, but I think we may not need to prevent > > > > > > > >> new features be added based on current API if it does not > > > > > > > >> introduce more complexity and not break anything. > > > > > > > > > > > > If we continue updating old API, we are just forking ourself: > > > > > > having 2 APIs for the same feature is a non-sense. > > > > > > We must remove APIs which are superseded by rte_flow. >=20 >=20