From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id D6CE29ACF for ; Tue, 1 Mar 2016 14:51:35 +0100 (CET) Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga103.fm.intel.com with ESMTP; 01 Mar 2016 05:51:34 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.22,523,1449561600"; d="scan'208";a="57597090" Received: from irsmsx104.ger.corp.intel.com ([163.33.3.159]) by fmsmga004.fm.intel.com with ESMTP; 01 Mar 2016 05:51:33 -0800 Received: from irsmsx155.ger.corp.intel.com (163.33.192.3) by IRSMSX104.ger.corp.intel.com (163.33.3.159) with Microsoft SMTP Server (TLS) id 14.3.248.2; Tue, 1 Mar 2016 13:51:32 +0000 Received: from irsmsx105.ger.corp.intel.com ([169.254.7.237]) by irsmsx155.ger.corp.intel.com ([169.254.14.201]) with mapi id 14.03.0248.002; Tue, 1 Mar 2016 13:51:32 +0000 From: "Ananyev, Konstantin" To: "Tan, Jianfeng" , "dev@dpdk.org" Thread-Topic: [PATCH] examples/l3fwd: fix using packet type blindly Thread-Index: AQHRc5PH5XOiXCU2xkqlshX+tOUuFp9ElSlg Date: Tue, 1 Mar 2016 13:51:31 +0000 Message-ID: <2601191342CEEE43887BDE71AB97725836B0C99C@irsmsx105.ger.corp.intel.com> References: <1451544799-70776-1-git-send-email-jianfeng.tan@intel.com> <1456795416-118189-1-git-send-email-jianfeng.tan@intel.com> In-Reply-To: <1456795416-118189-1-git-send-email-jianfeng.tan@intel.com> Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiOTJiMTYzMjgtYTE1OS00YmYyLTkyY2UtMzI3Mjk1ZTRiMjUzIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE1LjkuNi42IiwiVHJ1c3RlZExhYmVsSGFzaCI6IklHc3BLNUp1emJENURVVGRTNFVqeGl6SlplaFdQTGNkbnZVZTFHWFpsSHc9In0= x-ctpclassification: CTP_IC x-originating-ip: [163.33.239.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH] examples/l3fwd: fix using packet type blindly X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 Mar 2016 13:51:36 -0000 Hi Jianfeng, > -----Original Message----- > From: Tan, Jianfeng > Sent: Tuesday, March 01, 2016 1:24 AM > To: dev@dpdk.org > Cc: Zhang, Helin; Ananyev, Konstantin; nelio.laranjeiro@6wind.com; adrien= .mazarguil@6wind.com; rahul.lakkireddy@chelsio.com; > pmatilai@redhat.com; Tan, Jianfeng > Subject: [PATCH] examples/l3fwd: fix using packet type blindly >=20 > As a example to use ptype info, l3fwd needs firstly to use > rte_eth_dev_get_ptype_info() API to check if device and/or > its PMD driver will parse and fill the needed packet type; > if not, use the newly added option, --parse-ptype, to > analyze it in the callback softly. >=20 > As the mode of EXACT_MATCH uses the 5 tuples to caculate > hash, so we narrow down its scope to: > a. ip packets with no extensions, and > b. L4 payload should be either tcp or udp. >=20 > Note: this patch does not completely solve the issue, "cannot > run l3fwd on virtio or other devices", because hw_ip_checksum > may be not supported by the devices. Currently we can: option > 1, remove this requirements; option 2, wait for virtio front > end (pmd) to support it. >=20 > Signed-off-by: Jianfeng Tan > --- > doc/guides/rel_notes/release_16_04.rst | 5 ++ > doc/guides/sample_app_ug/l3_forward.rst | 6 ++- > examples/l3fwd/l3fwd.h | 12 +++++ > examples/l3fwd/l3fwd_em.c | 94 +++++++++++++++++++++++++++= ++++++ > examples/l3fwd/l3fwd_lpm.c | 57 ++++++++++++++++++++ > examples/l3fwd/main.c | 50 ++++++++++++++++++ > 6 files changed, 223 insertions(+), 1 deletion(-) >=20 > diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_note= s/release_16_04.rst > index 64e913d..4d6260e 100644 > --- a/doc/guides/rel_notes/release_16_04.rst > +++ b/doc/guides/rel_notes/release_16_04.rst > @@ -68,6 +68,11 @@ This section should contain bug fixes added to the rel= evant sections. Sample for > vhost-switch often fails to allocate mbuf when dequeue from vring beca= use it > wrongly calculates the number of mbufs needed. >=20 > +* **examples/l3fwd: Fixed using packet type blindly.** > + > + l3fwd makes use of packet type information without even query if devic= es or PMDs > + really set it. For those don't set ptypes, add an option to parse it s= oftly. > + >=20 > EAL > ~~~ > diff --git a/doc/guides/sample_app_ug/l3_forward.rst b/doc/guides/sample_= app_ug/l3_forward.rst > index 4ce734b..e0c22e3 100644 > --- a/doc/guides/sample_app_ug/l3_forward.rst > +++ b/doc/guides/sample_app_ug/l3_forward.rst > @@ -93,7 +93,7 @@ The application has a number of command line options: >=20 > .. code-block:: console >=20 > - ./build/l3fwd [EAL options] -- -p PORTMASK [-P] --config(port,queue= ,lcore)[,(port,queue,lcore)] [--enable-jumbo [--max-pkt-len > PKTLEN]] [--no-numa][--hash-entry-num][--ipv6] > + ./build/l3fwd [EAL options] -- -p PORTMASK [-P] --config(port,queue= ,lcore)[,(port,queue,lcore)] [--enable-jumbo [--max-pkt-len > PKTLEN]] [--no-numa][--hash-entry-num][--ipv6] [--parse-ptype] >=20 > where, >=20 > @@ -114,6 +114,8 @@ where, >=20 > * --ipv6: optional, set it if running ipv6 packets >=20 > +* --parse-ptype: optional, set it if use software way to analyze packe= t type > + > For example, consider a dual processor socket platform where cores 0-7 a= nd 16-23 appear on socket 0, while cores 8-15 and 24-31 > appear on socket 1. > Let's say that the programmer wants to use memory from both NUMA nodes, = the platform has only two ports, one connected to > each NUMA node, > and the programmer wants to use two cores from each processor socket to = do the packet processing. > @@ -334,6 +336,8 @@ The key code snippet of simple_ipv4_fwd_4pkts() is sh= own below: >=20 > The simple_ipv6_fwd_4pkts() function is similar to the simple_ipv4_fwd_4= pkts() function. >=20 > +Known issue: IP packets with extensions or IP packets which are not TCP/= UDP cannot work well with this mode. > + > Packet Forwarding for LPM-based Lookups > ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ >=20 > diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h > index da6d369..966c83b 100644 > --- a/examples/l3fwd/l3fwd.h > +++ b/examples/l3fwd/l3fwd.h > @@ -198,6 +198,18 @@ void > setup_hash(const int socketid); >=20 > int > +em_check_ptype(int portid); > + > +int > +lpm_check_ptype(int portid); > + > +void > +em_parse_ptype(struct rte_mbuf *); > + > +void > +lpm_parse_ptype(struct rte_mbuf *); > + > +int > em_main_loop(__attribute__((unused)) void *dummy); >=20 > int > diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c > index f6a65d8..1061baf 100644 > --- a/examples/l3fwd/l3fwd_em.c > +++ b/examples/l3fwd/l3fwd_em.c > @@ -42,6 +42,7 @@ > #include > #include > #include > +#include >=20 > #include > #include > @@ -508,6 +509,99 @@ populate_ipv6_many_flow_into_table(const struct rte_= hash *h, > printf("Hash: Adding 0x%x keys\n", nr_flow); > } >=20 > +/* Requirements: > + * 1. IP packets without extension; > + * 2. L4 payload should be either TCP or UDP. > + */ > +int > +em_check_ptype(int portid) > +{ > + int i, ret; > + int ptype_l3_ipv4_ext =3D 0; > + int ptype_l3_ipv6_ext =3D 0; > + int ptype_l4_tcp =3D 0; > + int ptype_l4_udp =3D 0; > + > + ret =3D rte_eth_dev_get_ptype_info(portid, > + RTE_PTYPE_L3_MASK | RTE_PTYPE_L4_MASK, > + NULL, 0); > + if (ret <=3D 0) > + return 0; > + > + uint32_t ptypes[ret]; > + > + ret =3D rte_eth_dev_get_ptype_info(portid, > + RTE_PTYPE_L3_MASK | RTE_PTYPE_L4_MASK, > + ptypes, ret); > + for (i =3D 0; i < ret; ++i) { > + switch (ptypes[i]) { > + case RTE_PTYPE_L3_IPV4_EXT: > + ptype_l3_ipv4_ext =3D 1; > + break; > + case RTE_PTYPE_L3_IPV6_EXT: > + ptype_l3_ipv6_ext =3D 1; > + break; > + case RTE_PTYPE_L4_TCP: > + ptype_l4_tcp =3D 1; > + break; > + case RTE_PTYPE_L4_UDP: > + ptype_l4_udp =3D 1; > + break; > + } > + } > + > + if (ptype_l3_ipv4_ext =3D=3D 0) > + printf("port %d cannot parse RTE_PTYPE_L3_IPV4_EXT\n", portid); > + if (ptype_l3_ipv6_ext =3D=3D 0) > + printf("port %d cannot parse RTE_PTYPE_L3_IPV6_EXT\n", portid); > + if (ptype_l3_ipv4_ext && ptype_l3_ipv6_ext) > + return 1; Why return here? You'll miss L4 ptype checks below. > + > + if (ptype_l4_tcp =3D=3D 0) > + printf("port %d cannot parse RTE_PTYPE_L4_TCP\n", portid); > + if (ptype_l4_udp =3D=3D 0) > + printf("port %d cannot parse RTE_PTYPE_L4_UDP\n", portid); > + if (ptype_l4_tcp || ptype_l4_udp) > + return 1; > + > + return 0; > +} > + > +void > +em_parse_ptype(struct rte_mbuf *m) > +{ > + struct ether_hdr *eth_hdr; > + uint32_t packet_type =3D 0; > + uint16_t ethertype; > + void *l4; > + int hdr_len; > + struct ipv4_hdr *ipv4_hdr; > + struct ipv6_hdr *ipv6_hdr; > + > + eth_hdr =3D rte_pktmbuf_mtod(m, struct ether_hdr *); > + ethertype =3D rte_be_to_cpu_16(eth_hdr->ether_type); > + l4 =3D (uint8_t *)eth_hdr + sizeof(struct ether_hdr); Just curious why l4? It looks like l3 :) > + switch (ethertype) { > + case ETHER_TYPE_IPv4: > + ipv4_hdr =3D (struct ipv4_hdr *)l4; > + hdr_len =3D (ipv4_hdr->version_ihl & IPV4_HDR_IHL_MASK) * > + IPV4_IHL_MULTIPLIER; > + if (hdr_len =3D=3D sizeof(struct ipv4_hdr) && > + (ipv4_hdr->next_proto_id =3D=3D IPPROTO_TCP || > + ipv4_hdr->next_proto_id =3D=3D IPPROTO_UDP)) > + packet_type |=3D RTE_PTYPE_L3_IPV4; I think it needs to be something like: If (hdr_len =3D=3D sizeof(struct ipv4_hdr)) {=20 packet_type =3D RTE_PTYPE_L3_IPV4; if (ipv4_hdr->next_proto_id =3D=3D IPPROTO_TCP) packet_type |=3D RTE_PTYPE_L4_TCP; else if ipv4_hdr->next_proto_id =3D=3D IPPROTO_UDP) packet_type |=3D RTE_PTYPE_L4_TCP; } And then inside em forward check ptype to be sure that is IPV4 with no opti= ons and UDP/TCP packet. Same for IPv6. > + break; > + case ETHER_TYPE_IPv6: > + ipv6_hdr =3D (struct ipv6_hdr *)l4; > + if (ipv6_hdr->proto =3D=3D IPPROTO_TCP || > + ipv6_hdr->proto =3D=3D IPPROTO_UDP) > + packet_type |=3D RTE_PTYPE_L3_IPV6; > + break; > + } > + > + m->packet_type |=3D packet_type; > +} > + > /* main processing loop */ > int > em_main_loop(__attribute__((unused)) void *dummy) > diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c > index e0ed3c4..981227a 100644 > --- a/examples/l3fwd/l3fwd_lpm.c > +++ b/examples/l3fwd/l3fwd_lpm.c > @@ -280,6 +280,63 @@ setup_lpm(const int socketid) > } > } >=20 > +int > +lpm_check_ptype(int portid) > +{ > + int i, ret; > + int ptype_l3_ipv4 =3D 0, ptype_l3_ipv6 =3D 0; > + > + ret =3D rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK, NULL, 0); > + if (ret <=3D 0) > + return 0; > + > + uint32_t ptypes[ret]; > + > + ret =3D rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK, > + ptypes, ret); > + for (i =3D 0; i < ret; ++i) { > + if (ptypes[i] & RTE_PTYPE_L3_IPV4) > + ptype_l3_ipv4 =3D 1; > + if (ptypes[i] & RTE_PTYPE_L3_IPV6) > + ptype_l3_ipv6 =3D 1; > + } > + > + if (ptype_l3_ipv4 =3D=3D 0) > + printf("port %d cannot parse RTE_PTYPE_L3_IPV4\n", portid); > + > + if (ptype_l3_ipv6 =3D=3D 0) > + printf("port %d cannot parse RTE_PTYPE_L3_IPV6\n", portid); > + > + if (ptype_l3_ipv4 && ptype_l3_ipv6) > + return 1; > + > + return 0; > + > +} > + > +void > +lpm_parse_ptype(struct rte_mbuf *m) > +{ > + struct ether_hdr *eth_hdr; > + uint32_t packet_type =3D 0; > + uint16_t ethertype; > + > + eth_hdr =3D rte_pktmbuf_mtod(m, struct ether_hdr *); > + ethertype =3D rte_be_to_cpu_16(eth_hdr->ether_type); > + switch (ethertype) { > + case ETHER_TYPE_IPv4: > + packet_type |=3D RTE_PTYPE_L3_IPV4_EXT_UNKNOWN; > + break; > + case ETHER_TYPE_IPv6: > + packet_type |=3D RTE_PTYPE_L3_IPV6_EXT_UNKNOWN; > + break; > + default: > + break; > + } > + > + m->packet_type |=3D packet_type; Might be safer: m->packet_type =3D packet_type; > +} > + > /* Return ipv4/ipv6 lpm fwd lookup struct. */ > void * > lpm_get_ipv4_l3fwd_lookup_struct(const int socketid) > diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c > index 0e33039..8889828 100644 > --- a/examples/l3fwd/main.c > +++ b/examples/l3fwd/main.c > @@ -103,6 +103,8 @@ static int l3fwd_lpm_on; > static int l3fwd_em_on; >=20 > static int numa_on =3D 1; /**< NUMA is enabled by default. */ > +static int parse_ptype; /**< Parse packet type using rx callback, and */ > + /**< disabled by default */ >=20 > /* Global variables. */ >=20 > @@ -172,6 +174,8 @@ static struct rte_mempool * pktmbuf_pool[NB_SOCKETS]; >=20 > struct l3fwd_lkp_mode { > void (*setup)(int); > + int (*check_ptype)(int); > + void (*parse_ptype)(struct rte_mbuf *); > int (*main_loop)(void *); > void* (*get_ipv4_lookup_struct)(int); > void* (*get_ipv6_lookup_struct)(int); > @@ -181,6 +185,8 @@ static struct l3fwd_lkp_mode l3fwd_lkp; >=20 > static struct l3fwd_lkp_mode l3fwd_em_lkp =3D { > .setup =3D setup_hash, > + .check_ptype =3D em_check_ptype, > + .parse_ptype =3D em_parse_ptype, > .main_loop =3D em_main_loop, > .get_ipv4_lookup_struct =3D em_get_ipv4_l3fwd_lookup_struct, > .get_ipv6_lookup_struct =3D em_get_ipv6_l3fwd_lookup_struct, > @@ -188,6 +194,8 @@ static struct l3fwd_lkp_mode l3fwd_em_lkp =3D { >=20 > static struct l3fwd_lkp_mode l3fwd_lpm_lkp =3D { > .setup =3D setup_lpm, > + .check_ptype =3D lpm_check_ptype, > + .parse_ptype =3D lpm_parse_ptype, > .main_loop =3D lpm_main_loop, > .get_ipv4_lookup_struct =3D lpm_get_ipv4_l3fwd_lookup_struct, > .get_ipv6_lookup_struct =3D lpm_get_ipv6_l3fwd_lookup_struct, > @@ -209,6 +217,22 @@ setup_l3fwd_lookup_tables(void) > l3fwd_lkp =3D l3fwd_lpm_lkp; > } >=20 > +static uint16_t > +cb_parse_packet_type(uint8_t port __rte_unused, > + uint16_t queue __rte_unused, > + struct rte_mbuf *pkts[], > + uint16_t nb_pkts, > + uint16_t max_pkts __rte_unused, > + void *user_param __rte_unused) > +{ > + unsigned i; > + > + for (i =3D 0; i < nb_pkts; ++i) > + l3fwd_lkp.parse_ptype(pkts[i]); No need to create callback chains. That way you have extra call per packet. Just 2 callbaclks: cb_lpm_parse_ptype & cb_em_parse_ptype, and then register one depending on which mode are we in. Would be simpler and faster, I believe. =20 Konstantin > + > + return nb_pkts; > +} > + > static int > check_lcore_params(void) > { > @@ -456,6 +480,7 @@ parse_eth_dest(const char *optarg) > #define CMD_LINE_OPT_IPV6 "ipv6" > #define CMD_LINE_OPT_ENABLE_JUMBO "enable-jumbo" > #define CMD_LINE_OPT_HASH_ENTRY_NUM "hash-entry-num" > +#define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype" >=20 > /* > * This expression is used to calculate the number of mbufs needed > @@ -486,6 +511,7 @@ parse_args(int argc, char **argv) > {CMD_LINE_OPT_IPV6, 0, 0, 0}, > {CMD_LINE_OPT_ENABLE_JUMBO, 0, 0, 0}, > {CMD_LINE_OPT_HASH_ENTRY_NUM, 1, 0, 0}, > + {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0}, > {NULL, 0, 0, 0} > }; >=20 > @@ -612,6 +638,14 @@ parse_args(int argc, char **argv) > return -1; > } > } > + > + if (!strncmp(lgopts[option_index].name, > + CMD_LINE_OPT_PARSE_PTYPE, > + sizeof(CMD_LINE_OPT_PARSE_PTYPE))) { > + printf("soft parse-ptype is enabled\n"); > + parse_ptype =3D 1; > + } > + > break; >=20 > default: > @@ -938,6 +972,22 @@ main(int argc, char **argv) > rte_exit(EXIT_FAILURE, > "rte_eth_rx_queue_setup: err=3D%d, port=3D%d\n", > ret, portid); > + > + ret =3D l3fwd_lkp.check_ptype(portid); > + if (ret) > + continue; > + if (!parse_ptype) > + rte_exit(EXIT_FAILURE, > + "port %d cannot parse packet type, please add --%s\n", > + portid, CMD_LINE_OPT_PARSE_PTYPE); > + > + if (rte_eth_add_rx_callback(portid, queueid, > + cb_parse_packet_type, > + NULL)) > + continue; > + rte_exit(EXIT_FAILURE, > + "Failed to add rx callback: port=3D%d\n", > + portid); > } > } >=20 > -- > 2.1.4