From: "Walukiewicz, Miroslaw"
To: "Liu, Jijiang" , "dev@dpdk.org"
Thread-Topic: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
Date: Mon, 4 Jan 2016 10:48:32 +0000
Message-ID: <7C4248CAE043B144B1CD242D275626534557026F@IRSMSX104.ger.corp.intel.com>
References: <1450860592-12673-1-git-send-email-jijiang.liu@intel.com> <7C4248CAE043B144B1CD242D2756265345564C3B@IRSMSX104.ger.corp.intel.com> <1ED644BD7E0A5F4091CF203DAFB8E4CC23F3DDDA@SHSMSX101.ccr.corp.intel.com>
In-Reply-To: <1ED644BD7E0A5F4091CF203DAFB8E4CC23F3DDDA@SHSMSX101.ccr.corp.intel.com>
Subject: Re: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
List-Id: patches and discussions about DPDK

Hi Jijiang,

My comments below, marked MW>.

> -----Original Message-----
> From: Liu, Jijiang
> Sent: Monday, December 28, 2015 6:55 AM
> To: Walukiewicz, Miroslaw; dev@dpdk.org
> Subject: RE: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
>
> Hi Miroslaw,
>
> A partial answer is below.
>
> > -----Original Message-----
> > From: Walukiewicz, Miroslaw
> > Sent: Wednesday, December 23, 2015 7:18 PM
> > To: Liu, Jijiang; dev@dpdk.org
> > Subject: RE: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
> >
> > Hi Jijiang,
> >
> > I like the idea of a tunnel API very much.
> >
> > I have a few questions.
> >
> > 1. I see that you have only i40e support, due to the lack of HW
> > tunneling support in other NICs.
> > I don't see how you want to handle tunneling requests for NICs
> > without HW offload.
>
> The flow director offload mechanism is used here; flow director is a
> common feature in current NICs.
> I don't use tunneling-specific HW offload features here; the goal is
> to support all NICs.
>
> > I think that we should have one common function for sending tunneled
> > packets, but the initialization should check the NIC capabilities and
> > call some registered function that does the tunneling in SW when HW
> > support is missing.
> Yes, we should check NIC capabilities.
>
> > I know that building a tunnel is a very time-consuming process, but it
> > makes the API more generic. Similarly, only 3 protocols are supported
> > in HW by i40e, and we can imagine 40 or more different tunnels working
> > with this NIC.
> >
> > With a SW implementation we could support the missing tunnels even for
> > i40e.
>
> In this patch set, I just use the VXLAN protocol to demonstrate the
> framework. If the framework is accepted, other tunneling protocols
> will be added one by one in future.
>
> > 2. I understand that we need the RX HW queue defined in struct
> > rte_eth_tunnel_conf, but why is tx_queue necessary?
> > As far as I know i40e HW, we can set tunneled packet descriptors in
> > any HW queue and receive only on one specific queue.
>
> As for adding tx_queue here, I have already explained at [1].
>
> [1] http://dpdk.org/ml/archives/dev/2015-December/030509.html
>
> Do you think it makes sense?

MW> Unfortunately I do not see any explanation for using the tx_queue
parameter in this thread.
For me this parameter is not necessary. The tunnels will work without it
anyway, as they are set in the packet descriptor.

>
> > 4. In your implementation you are assuming there is one tunnel
> > configured per DPDK interface:
> >
> > rte_eth_dev_tunnel_configure(uint8_t port_id,
> > + struct rte_eth_tunnel_conf *tunnel_conf)
> >
> No, in terms of i40e, there will be up to 8K tunnels on one DPDK
> interface. It depends on the number of flow rules on a pair of queues.
>
> struct rte_eth_tunnel_conf {
>     uint16_t rx_queue;
>     uint16_t tx_queue;
>     uint16_t udp_tunnel_port;
>     uint16_t nb_flow;
>     uint16_t filter_type;
>     struct rte_eth_tunnel_flow *tunnel_flow;
> };
>
> If 'nb_flow' is set to 2000, you can configure 2000 flow rules on one
> pair of queues on a port.

MW> So in your design tunnel_flow is a table of rte_eth_tunnel_flow
structures. I did not catch that.
I hope that you will add a possibility of dynamically adding/removing
tunnels on an interface.
What is the sense of the udp_tunnel_port parameter, as the tunnel_flow
structure also provides the same parameter?
Similarly, the tunnel_type should be part of tunnel_flow too, as we
assume support for different tunnels on a single interface (not just
VXLAN only).

>
> > The point of tunnels is the lack of interfaces in the system, because
> > the number of possible VLANs is too small (4095).
> > In the DPDK we have only one tunnel per physical port, which is
> > useless even with such big acceleration provided by i40e.
>
> > In normal use cases there is a need for 10,000s of tunnels per
> > interface. Even for VXLAN we have 24 bits for tunnel definition.
>
>
> We use flow director HW offload here; in terms of i40e, it supports up
> to 8K exact-match flow rules.
> This is a HW limitation; 10,000s of tunnels per interface is not
> supported by the HW.
>
>
> > 5. I see that you have implementations for VXLAN, TEREDO, and GENEVE
> > tunnels in the i40e drivers. I could not find the implementation for
> > VXLAN encap/decap. Are all files in the patch present?
> No, I have not finished all of the code, just VXLAN here.
> Other tunneling protocols will be added one by one in future.
>
> > Regards,
> >
> > Mirek
> >
> >
> >
> >
> > > -----Original Message-----
> > > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jijiang Liu
> > > Sent: Wednesday, December 23, 2015 9:50 AM
> > > To: dev@dpdk.org
> > > Subject: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
> > >
> > > I want to define a set of general tunneling APIs, which are used to
> > > accelerate tunneling packet processing in DPDK. In this RFC patch
> > > set, I will explain my idea using some code.
> > >
> > > 1. Using flow director offload to define a tunnel flow on a pair of
> > > queues.
> > >
> > > flow rule: src IP + dst IP + src port + dst port + tunnel ID (for
> > > VXLAN)
> > >
> > > For example:
> > > struct rte_eth_tunnel_conf {
> > >     .tunnel_type = VXLAN,
> > >     .rx_queue = 1,
> > >     .tx_queue = 1,
> > >     .filter_type = 'src ip + dst ip + src port + dst port + tunnel id',
> > >     .flow_tnl = {
> > >         .tunnel_type = VXLAN,
> > >         .tunnel_id = 100,
> > >         .remote_mac = 11.22.33.44.55.66,
> > >         .ip_type = ipv4,
> > >         .outer_ipv4.src_ip = 192.168.10.1,
> > >         .outer_ipv4.dst_ip = 10.239.129.11,
> > >         .src_port = 1000,
> > >         .dst_port = 2000
> > >     }
> > > };
> > >
> > > 2. Configure a tunnel flow for a device and for a pair of queues.
> > >
> > > rte_eth_dev_tunnel_configure(0, &rte_eth_tunnel_conf);
> > >
> > > This API will register the RX decapsulation and TX encapsulation
> > > callback functions if HW doesn't support encap/decap. Space will
> > > be allocated for the tunnel configuration, and a pointer to this
> > > newly allocated space is stored as dev->post_rx/tx_burst_cbs[].param.
> > >
> > > rte_eth_add_rx_callback(port_id, tunnel_conf.rx_queue,
> > >     rte_eth_tunnel_decap, (void *)tunnel_conf);
> > > rte_eth_add_tx_callback(port_id, tunnel_conf.tx_queue,
> > >     rte_eth_tunnel_encap, (void *)tunnel_conf);
> > >
> > > 3. Using rte_vxlan_decap_burst() to do decapsulation of tunneled
> > > packets.
> > >
> > > 4. Using rte_vxlan_encap_burst() to do encapsulation of tunneled
> > > packets.
> > > The 'src ip, dst ip, src port, dst port and tunnel ID' can be taken
> > > from the tunnel configuration.
> > > SIMD is used to accelerate the operation.
> > >
> > > How to use these APIs; there is an example below:
> > >
> > > 1) at the config phase
> > >
> > > dev_config(port, ...);
> > > tunnel_config(port, ...);
> > > ...
> > > dev_start(port);
> > > ...
> > > rx_burst(port, rxq, ...
);
> > > tx_burst(port, txq, ...);
> > >
> > >
> > > 2) at the packet transmission phase
> > > Only the outer src/dst MAC addresses need to be set for the TX
> > > tunnel configuration in dev->post_tx_burst_cbs[].param.
> > >
> > > In this patch set, I have not finished all of the code; the purpose
> > > of sending the patch set is that I would like to collect more
> > > comments and suggestions on this idea.
> > >
> > >
> > > Jijiang Liu (6):
> > >   extend rte_eth_tunnel_flow
> > >   define tunnel flow structure and APIs
> > >   implement tunnel flow APIs
> > >   define rte_vxlan_decap/encap
> > >   implement rte_vxlan_decap/encap
> > >   i40e tunnel configure
> > >
> > >  drivers/net/i40e/i40e_ethdev.c             |  41 +++++
> > >  lib/librte_ether/libtunnel/rte_vxlan_opt.c | 251 ++++++++++++++++++++++++++++
> > >  lib/librte_ether/libtunnel/rte_vxlan_opt.h |  49 ++++++
> > >  lib/librte_ether/rte_eth_ctrl.h            |  14 ++-
> > >  lib/librte_ether/rte_ethdev.h              |  28 +++
> > >  lib/librte_ether/rte_ethdev.c              |  60 ++
> > >  5 files changed, 440 insertions(+), 3 deletions(-)
> > >  create mode 100644 lib/librte_ether/libtunnel/rte_vxlan_opt.c
> > >  create mode 100644 lib/librte_ether/libtunnel/rte_vxlan_opt.h
> > >
> > > --
> > > 1.7.7.6