From: "Walukiewicz, Miroslaw"
To: "Liu, Jijiang", "dev@dpdk.org"
Date: Wed, 23 Dec 2015 11:17:33 +0000
Subject: Re: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
Message-ID: <7C4248CAE043B144B1CD242D2756265345564C3B@IRSMSX104.ger.corp.intel.com>
In-Reply-To: <1450860592-12673-1-git-send-email-jijiang.liu@intel.com>

Hi Jijiang,

I like the idea of a tunnel API very much.

I have a few questions.

1. I see that you have only i40e support, due to the lack of HW tunneling
support in other NICs. I don't see how you want to handle tunneling requests
for NICs without HW offload. I think that we should have one common function
for sending tunneled packets, but the initialization should check the NIC
capabilities and call some registered function that does the tunneling in SW
when HW support is missing. I know that building tunnels is a very
time-consuming process, but it would make the API more generic. Similarly,
only 3 protocols are supported in HW by i40e, and we can imagine 40 or more
different tunnels working with this NIC. With a SW implementation we could
support the missing tunnels even for i40e.

2. I understand that we need the RX HW queue defined in
struct rte_eth_tunnel_conf, but why is tx_queue necessary? As far as I know
the i40e HW, we can place tunneled packet descriptors in any HW queue while
receiving only on one specific queue.

3. I see a similar problem with receiving tunneled packets on a single queue
only. I know that some NICs like fm10k can hash packets and push the same
tunnel to many queues. Maybe we should support such an RSS-like feature in
the design too. I know that it is not supported by i40e, but it is good to
have a more flexible API design.

4. In your implementation you are assuming that there is one tunnel
configured per DPDK interface:

   rte_eth_dev_tunnel_configure(uint8_t port_id,
                                struct rte_eth_tunnel_conf *tunnel_conf)

The whole point of tunnels is the lack of interfaces in the system, because
the number of possible VLANs is too small (4095). In DPDK we would have only
one tunnel per physical port, which is useless even with the big
acceleration provided by i40e.
In normal use cases there is a need for 10,000s of tunnels per interface.
Even for VXLAN we have 24 bits for the tunnel ID.

I think that we need a special API for sending, like
rte_eth_dev_tunnel_send_burst, where we would provide some tunnel number
allocated by rte_eth_dev_tunnel_configure, to avoid setting the
tunnel-specific information separately in each descriptor.

Similarly, on RX we should provide in struct rte_eth_tunnel_conf the
callback functions that perform some tunnel-specific action on a received
packet; that could be pushing the packet to a user ring, setting the tunnel
information in the RX descriptor, or something else.

5. I see that you have implementations for VXLAN, TEREDO, and GENEVE tunnels
in the i40e drivers. I could only find the implementation for VXLAN
encap/decap. Are all files present in the patch?

6. What about QinQ HW tunneling, which is also supported by the i40e HW? I
know that the implementation is present in a different place, but why not
include QinQ as an additional tunnel? It would be a very nice feature to
have all the tunnel APIs in a single place.

Regards,

Mirek

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Jijiang Liu
> Sent: Wednesday, December 23, 2015 9:50 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [RFC PATCH 0/6] General tunneling APIs
>
> I want to define a set of general tunneling APIs, which are used to
> accelerate tunneling packet processing in DPDK.
> In this RFC patch set, I will explain my idea using some code.
>
> 1. Using flow director offload to define a tunnel flow in a pair of queues.
>
> flow rule: src IP + dst IP + src port + dst port + tunnel ID (for VXLAN)
>
> For example:
> struct rte_eth_tunnel_conf {
>     .tunnel_type = VXLAN,
>     .rx_queue = 1,
>     .tx_queue = 1,
>     .filter_type = 'src ip + dst ip + src port + dst port + tunnel id',
>     .flow_tnl {
>         .tunnel_type = VXLAN,
>         .tunnel_id = 100,
>         .remote_mac = 11.22.33.44.55.66,
>         .ip_type = ipv4,
>         .outer_ipv4.src_ip = 192.168.10.1,
>         .outer_ipv4.dst_ip = 10.239.129.11,
>         .src_port = 1000,
>         .dst_port = 2000
>     },
> };
>
> 2. Configure the tunnel flow for a device and for a pair of queues.
>
> rte_eth_dev_tunnel_configure(0, &rte_eth_tunnel_conf);
>
> This API will install RX decapsulation and TX encapsulation callback
> functions if the HW doesn't support encap/decap, and a space will be
> allocated for the tunnel configuration, storing a pointer to this newly
> allocated space as dev->post_rx/tx_burst_cbs[].param:
>
> rte_eth_add_rx_callback(port_id, tunnel_conf.rx_queue,
>                         rte_eth_tunnel_decap, (void *)tunnel_conf);
> rte_eth_add_tx_callback(port_id, tunnel_conf.tx_queue,
>                         rte_eth_tunnel_encap, (void *)tunnel_conf);
>
> 3. Using rte_vxlan_decap_burst() to do decapsulation of tunneling packets.
>
> 4. Using rte_vxlan_encap_burst() to do encapsulation of tunneling packets.
>    The 'src ip, dst ip, src port, dst port and tunnel ID' can be taken
>    from the tunnel configuration.
>    And SIMD is used to accelerate the operation.
>
> There is an example below of how to use these APIs:
>
> 1) at the configuration phase
>
> dev_config(port, ...);
> tunnel_config(port, ...);
> ...
> dev_start(port);
> ...
> rx_burst(port, rxq, ...);
> tx_burst(port, txq, ...);
>
> 2) at the packet transmission phase
> Only the outer src/dst MAC addresses need to be set for the TX tunnel
> configuration in dev->post_tx_burst_cbs[].param.
>
> In this patch set, I have not finished all of the code; the purpose of
> sending the patch set is that I would like to collect more comments and
> suggestions on this idea.
>
>
> Jijiang Liu (6):
>   extend rte_eth_tunnel_flow
>   define tunnel flow structure and APIs
>   implement tunnel flow APIs
>   define rte_vxlan_decap/encap
>   implement rte_vxlan_decap/encap
>   i40e tunnel configure
>
>  drivers/net/i40e/i40e_ethdev.c             |   41 +++++
>  lib/librte_ether/libtunnel/rte_vxlan_opt.c |  251 ++++++++++++++++++++++++++++
>  lib/librte_ether/libtunnel/rte_vxlan_opt.h |   49 ++++++
>  lib/librte_ether/rte_eth_ctrl.h            |   14 ++-
>  lib/librte_ether/rte_ethdev.h              |   28 +++
>  lib/librte_ether/rte_ethdev.c              |   60 ++
>  5 files changed, 440 insertions(+), 3 deletions(-)
>  create mode 100644 lib/librte_ether/libtunnel/rte_vxlan_opt.c
>  create mode 100644 lib/librte_ether/libtunnel/rte_vxlan_opt.h
>
> --
> 1.7.7.6