From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id 717CE2904 for ; Tue, 1 Mar 2016 09:24:25 +0100 (CET) Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga103.jf.intel.com with ESMTP; 01 Mar 2016 00:24:13 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.22,522,1449561600"; d="scan'208";a="898252110" Received: from dpdk06.sh.intel.com ([10.239.128.225]) by orsmga001.jf.intel.com with ESMTP; 01 Mar 2016 00:24:11 -0800 From: Jianfeng Tan To: dev@dpdk.org Date: Tue, 1 Mar 2016 09:23:36 +0800 Message-Id: <1456795416-118189-1-git-send-email-jianfeng.tan@intel.com> X-Mailer: git-send-email 2.1.4 In-Reply-To: <1451544799-70776-1-git-send-email-jianfeng.tan@intel.com> References: <1451544799-70776-1-git-send-email-jianfeng.tan@intel.com> Subject: [dpdk-dev] [PATCH] examples/l3fwd: fix using packet type blindly X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: patches and discussions about DPDK List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 01 Mar 2016 08:24:26 -0000 As a example to use ptype info, l3fwd needs firstly to use rte_eth_dev_get_ptype_info() API to check if device and/or its PMD driver will parse and fill the needed packet type; if not, use the newly added option, --parse-ptype, to analyze it in the callback softly. As the mode of EXACT_MATCH uses the 5 tuples to caculate hash, so we narrow down its scope to: a. ip packets with no extensions, and b. L4 payload should be either tcp or udp. Note: this patch does not completely solve the issue, "cannot run l3fwd on virtio or other devices", because hw_ip_checksum may be not supported by the devices. Currently we can: option 1, remove this requirements; option 2, wait for virtio front end (pmd) to support it. Signed-off-by: Jianfeng Tan --- doc/guides/rel_notes/release_16_04.rst | 5 ++ doc/guides/sample_app_ug/l3_forward.rst | 6 ++- examples/l3fwd/l3fwd.h | 12 +++++ examples/l3fwd/l3fwd_em.c | 94 +++++++++++++++++++++++++++++++++ examples/l3fwd/l3fwd_lpm.c | 57 ++++++++++++++++++++ examples/l3fwd/main.c | 50 ++++++++++++++++++ 6 files changed, 223 insertions(+), 1 deletion(-) diff --git a/doc/guides/rel_notes/release_16_04.rst b/doc/guides/rel_notes/release_16_04.rst index 64e913d..4d6260e 100644 --- a/doc/guides/rel_notes/release_16_04.rst +++ b/doc/guides/rel_notes/release_16_04.rst @@ -68,6 +68,11 @@ This section should contain bug fixes added to the relevant sections. Sample for vhost-switch often fails to allocate mbuf when dequeue from vring because it wrongly calculates the number of mbufs needed. +* **examples/l3fwd: Fixed using packet type blindly.** + + l3fwd makes use of packet type information without even query if devices or PMDs + really set it. For those don't set ptypes, add an option to parse it softly. + EAL ~~~ diff --git a/doc/guides/sample_app_ug/l3_forward.rst b/doc/guides/sample_app_ug/l3_forward.rst index 4ce734b..e0c22e3 100644 --- a/doc/guides/sample_app_ug/l3_forward.rst +++ b/doc/guides/sample_app_ug/l3_forward.rst @@ -93,7 +93,7 @@ The application has a number of command line options: .. code-block:: console - ./build/l3fwd [EAL options] -- -p PORTMASK [-P] --config(port,queue,lcore)[,(port,queue,lcore)] [--enable-jumbo [--max-pkt-len PKTLEN]] [--no-numa][--hash-entry-num][--ipv6] + ./build/l3fwd [EAL options] -- -p PORTMASK [-P] --config(port,queue,lcore)[,(port,queue,lcore)] [--enable-jumbo [--max-pkt-len PKTLEN]] [--no-numa][--hash-entry-num][--ipv6] [--parse-ptype] where, @@ -114,6 +114,8 @@ where, * --ipv6: optional, set it if running ipv6 packets +* --parse-ptype: optional, set it if use software way to analyze packet type + For example, consider a dual processor socket platform where cores 0-7 and 16-23 appear on socket 0, while cores 8-15 and 24-31 appear on socket 1. Let's say that the programmer wants to use memory from both NUMA nodes, the platform has only two ports, one connected to each NUMA node, and the programmer wants to use two cores from each processor socket to do the packet processing. @@ -334,6 +336,8 @@ The key code snippet of simple_ipv4_fwd_4pkts() is shown below: The simple_ipv6_fwd_4pkts() function is similar to the simple_ipv4_fwd_4pkts() function. +Known issue: IP packets with extensions or IP packets which are not TCP/UDP cannot work well with this mode. + Packet Forwarding for LPM-based Lookups ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ diff --git a/examples/l3fwd/l3fwd.h b/examples/l3fwd/l3fwd.h index da6d369..966c83b 100644 --- a/examples/l3fwd/l3fwd.h +++ b/examples/l3fwd/l3fwd.h @@ -198,6 +198,18 @@ void setup_hash(const int socketid); int +em_check_ptype(int portid); + +int +lpm_check_ptype(int portid); + +void +em_parse_ptype(struct rte_mbuf *); + +void +lpm_parse_ptype(struct rte_mbuf *); + +int em_main_loop(__attribute__((unused)) void *dummy); int diff --git a/examples/l3fwd/l3fwd_em.c b/examples/l3fwd/l3fwd_em.c index f6a65d8..1061baf 100644 --- a/examples/l3fwd/l3fwd_em.c +++ b/examples/l3fwd/l3fwd_em.c @@ -42,6 +42,7 @@ #include #include #include +#include #include #include @@ -508,6 +509,99 @@ populate_ipv6_many_flow_into_table(const struct rte_hash *h, printf("Hash: Adding 0x%x keys\n", nr_flow); } +/* Requirements: + * 1. IP packets without extension; + * 2. L4 payload should be either TCP or UDP. + */ +int +em_check_ptype(int portid) +{ + int i, ret; + int ptype_l3_ipv4_ext = 0; + int ptype_l3_ipv6_ext = 0; + int ptype_l4_tcp = 0; + int ptype_l4_udp = 0; + + ret = rte_eth_dev_get_ptype_info(portid, + RTE_PTYPE_L3_MASK | RTE_PTYPE_L4_MASK, + NULL, 0); + if (ret <= 0) + return 0; + + uint32_t ptypes[ret]; + + ret = rte_eth_dev_get_ptype_info(portid, + RTE_PTYPE_L3_MASK | RTE_PTYPE_L4_MASK, + ptypes, ret); + for (i = 0; i < ret; ++i) { + switch (ptypes[i]) { + case RTE_PTYPE_L3_IPV4_EXT: + ptype_l3_ipv4_ext = 1; + break; + case RTE_PTYPE_L3_IPV6_EXT: + ptype_l3_ipv6_ext = 1; + break; + case RTE_PTYPE_L4_TCP: + ptype_l4_tcp = 1; + break; + case RTE_PTYPE_L4_UDP: + ptype_l4_udp = 1; + break; + } + } + + if (ptype_l3_ipv4_ext == 0) + printf("port %d cannot parse RTE_PTYPE_L3_IPV4_EXT\n", portid); + if (ptype_l3_ipv6_ext == 0) + printf("port %d cannot parse RTE_PTYPE_L3_IPV6_EXT\n", portid); + if (ptype_l3_ipv4_ext && ptype_l3_ipv6_ext) + return 1; + + if (ptype_l4_tcp == 0) + printf("port %d cannot parse RTE_PTYPE_L4_TCP\n", portid); + if (ptype_l4_udp == 0) + printf("port %d cannot parse RTE_PTYPE_L4_UDP\n", portid); + if (ptype_l4_tcp || ptype_l4_udp) + return 1; + + return 0; +} + +void +em_parse_ptype(struct rte_mbuf *m) +{ + struct ether_hdr *eth_hdr; + uint32_t packet_type = 0; + uint16_t ethertype; + void *l4; + int hdr_len; + struct ipv4_hdr *ipv4_hdr; + struct ipv6_hdr *ipv6_hdr; + + eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); + ethertype = rte_be_to_cpu_16(eth_hdr->ether_type); + l4 = (uint8_t *)eth_hdr + sizeof(struct ether_hdr); + switch (ethertype) { + case ETHER_TYPE_IPv4: + ipv4_hdr = (struct ipv4_hdr *)l4; + hdr_len = (ipv4_hdr->version_ihl & IPV4_HDR_IHL_MASK) * + IPV4_IHL_MULTIPLIER; + if (hdr_len == sizeof(struct ipv4_hdr) && + (ipv4_hdr->next_proto_id == IPPROTO_TCP || + ipv4_hdr->next_proto_id == IPPROTO_UDP)) + packet_type |= RTE_PTYPE_L3_IPV4; + break; + case ETHER_TYPE_IPv6: + ipv6_hdr = (struct ipv6_hdr *)l4; + if (ipv6_hdr->proto == IPPROTO_TCP || + ipv6_hdr->proto == IPPROTO_UDP) + packet_type |= RTE_PTYPE_L3_IPV6; + break; + } + + m->packet_type |= packet_type; +} + /* main processing loop */ int em_main_loop(__attribute__((unused)) void *dummy) diff --git a/examples/l3fwd/l3fwd_lpm.c b/examples/l3fwd/l3fwd_lpm.c index e0ed3c4..981227a 100644 --- a/examples/l3fwd/l3fwd_lpm.c +++ b/examples/l3fwd/l3fwd_lpm.c @@ -280,6 +280,63 @@ setup_lpm(const int socketid) } } +int +lpm_check_ptype(int portid) +{ + int i, ret; + int ptype_l3_ipv4 = 0, ptype_l3_ipv6 = 0; + + ret = rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK, NULL, 0); + if (ret <= 0) + return 0; + + uint32_t ptypes[ret]; + + ret = rte_eth_dev_get_ptype_info(portid, RTE_PTYPE_L3_MASK, + ptypes, ret); + for (i = 0; i < ret; ++i) { + if (ptypes[i] & RTE_PTYPE_L3_IPV4) + ptype_l3_ipv4 = 1; + if (ptypes[i] & RTE_PTYPE_L3_IPV6) + ptype_l3_ipv6 = 1; + } + + if (ptype_l3_ipv4 == 0) + printf("port %d cannot parse RTE_PTYPE_L3_IPV4\n", portid); + + if (ptype_l3_ipv6 == 0) + printf("port %d cannot parse RTE_PTYPE_L3_IPV6\n", portid); + + if (ptype_l3_ipv4 && ptype_l3_ipv6) + return 1; + + return 0; + +} + +void +lpm_parse_ptype(struct rte_mbuf *m) +{ + struct ether_hdr *eth_hdr; + uint32_t packet_type = 0; + uint16_t ethertype; + + eth_hdr = rte_pktmbuf_mtod(m, struct ether_hdr *); + ethertype = rte_be_to_cpu_16(eth_hdr->ether_type); + switch (ethertype) { + case ETHER_TYPE_IPv4: + packet_type |= RTE_PTYPE_L3_IPV4_EXT_UNKNOWN; + break; + case ETHER_TYPE_IPv6: + packet_type |= RTE_PTYPE_L3_IPV6_EXT_UNKNOWN; + break; + default: + break; + } + + m->packet_type |= packet_type; +} + /* Return ipv4/ipv6 lpm fwd lookup struct. */ void * lpm_get_ipv4_l3fwd_lookup_struct(const int socketid) diff --git a/examples/l3fwd/main.c b/examples/l3fwd/main.c index 0e33039..8889828 100644 --- a/examples/l3fwd/main.c +++ b/examples/l3fwd/main.c @@ -103,6 +103,8 @@ static int l3fwd_lpm_on; static int l3fwd_em_on; static int numa_on = 1; /**< NUMA is enabled by default. */ +static int parse_ptype; /**< Parse packet type using rx callback, and */ + /**< disabled by default */ /* Global variables. */ @@ -172,6 +174,8 @@ static struct rte_mempool * pktmbuf_pool[NB_SOCKETS]; struct l3fwd_lkp_mode { void (*setup)(int); + int (*check_ptype)(int); + void (*parse_ptype)(struct rte_mbuf *); int (*main_loop)(void *); void* (*get_ipv4_lookup_struct)(int); void* (*get_ipv6_lookup_struct)(int); @@ -181,6 +185,8 @@ static struct l3fwd_lkp_mode l3fwd_lkp; static struct l3fwd_lkp_mode l3fwd_em_lkp = { .setup = setup_hash, + .check_ptype = em_check_ptype, + .parse_ptype = em_parse_ptype, .main_loop = em_main_loop, .get_ipv4_lookup_struct = em_get_ipv4_l3fwd_lookup_struct, .get_ipv6_lookup_struct = em_get_ipv6_l3fwd_lookup_struct, @@ -188,6 +194,8 @@ static struct l3fwd_lkp_mode l3fwd_em_lkp = { static struct l3fwd_lkp_mode l3fwd_lpm_lkp = { .setup = setup_lpm, + .check_ptype = lpm_check_ptype, + .parse_ptype = lpm_parse_ptype, .main_loop = lpm_main_loop, .get_ipv4_lookup_struct = lpm_get_ipv4_l3fwd_lookup_struct, .get_ipv6_lookup_struct = lpm_get_ipv6_l3fwd_lookup_struct, @@ -209,6 +217,22 @@ setup_l3fwd_lookup_tables(void) l3fwd_lkp = l3fwd_lpm_lkp; } +static uint16_t +cb_parse_packet_type(uint8_t port __rte_unused, + uint16_t queue __rte_unused, + struct rte_mbuf *pkts[], + uint16_t nb_pkts, + uint16_t max_pkts __rte_unused, + void *user_param __rte_unused) +{ + unsigned i; + + for (i = 0; i < nb_pkts; ++i) + l3fwd_lkp.parse_ptype(pkts[i]); + + return nb_pkts; +} + static int check_lcore_params(void) { @@ -456,6 +480,7 @@ parse_eth_dest(const char *optarg) #define CMD_LINE_OPT_IPV6 "ipv6" #define CMD_LINE_OPT_ENABLE_JUMBO "enable-jumbo" #define CMD_LINE_OPT_HASH_ENTRY_NUM "hash-entry-num" +#define CMD_LINE_OPT_PARSE_PTYPE "parse-ptype" /* * This expression is used to calculate the number of mbufs needed @@ -486,6 +511,7 @@ parse_args(int argc, char **argv) {CMD_LINE_OPT_IPV6, 0, 0, 0}, {CMD_LINE_OPT_ENABLE_JUMBO, 0, 0, 0}, {CMD_LINE_OPT_HASH_ENTRY_NUM, 1, 0, 0}, + {CMD_LINE_OPT_PARSE_PTYPE, 0, 0, 0}, {NULL, 0, 0, 0} }; @@ -612,6 +638,14 @@ parse_args(int argc, char **argv) return -1; } } + + if (!strncmp(lgopts[option_index].name, + CMD_LINE_OPT_PARSE_PTYPE, + sizeof(CMD_LINE_OPT_PARSE_PTYPE))) { + printf("soft parse-ptype is enabled\n"); + parse_ptype = 1; + } + break; default: @@ -938,6 +972,22 @@ main(int argc, char **argv) rte_exit(EXIT_FAILURE, "rte_eth_rx_queue_setup: err=%d, port=%d\n", ret, portid); + + ret = l3fwd_lkp.check_ptype(portid); + if (ret) + continue; + if (!parse_ptype) + rte_exit(EXIT_FAILURE, + "port %d cannot parse packet type, please add --%s\n", + portid, CMD_LINE_OPT_PARSE_PTYPE); + + if (rte_eth_add_rx_callback(portid, queueid, + cb_parse_packet_type, + NULL)) + continue; + rte_exit(EXIT_FAILURE, + "Failed to add rx callback: port=%d\n", + portid); } } -- 2.1.4