Date: Sat, 11 Apr 2020 19:43:59 +0530
Message-ID: <20200411141428.1987768-1-jerinj@marvell.com>
In-Reply-To: <20200405085613.1336841-1-jerinj@marvell.com>
References: <20200405085613.1336841-1-jerinj@marvell.com>
Subject: [dpdk-dev] [PATCH v5 00/29] graph: introduce graph subsystem
List-Id: DPDK patches and discussions

From: Jerin Jacob

Using graph traversal for packet processing is a proven architecture
that has been implemented in various open source libraries.

A graph architecture for packet processing abstracts the data processing
functions as “nodes” and “links” them together into a “graph”, which
yields reusable, modular data processing functions.

This patchset also brings performance enhancements and modularity to
DPDK, as discussed in more detail below.

v5..v4:
-------
Addressed the following review comments from Andrzej Ostruszka.

1) Addressed the comment in
(http://mails.dpdk.org/archives/dev/2020-April/162184.html), improved
the following function prototypes/return types and adjusted the
implementation:
a) rte_graph_node_get
b) rte_graph_max_count
c) rte_graph_export
d) rte_graph_destroy
2) Updated UT and l3fwd-graph for the updated function prototypes
3) Bug fix in edge_update
4) Avoid reading graph_src_nodes_count() twice in rte_graph_create()
5) Fixed the graph_mem_fixup_secondray typo
6) Fixed Doxygen comments for rte_node_next_stream_put
7) Updated the documentation to reflect the same.
8) Removed RTE prefix from rte_node_mbuf_priv[1|2] * as they are
internal defines
9) Limited the next_hop id provided to LPM route add in
librte_node/ip4_lookup.c to 24 bits ()
10) Fixed a pattern array overflow issue in l3fwd-graph/main.c by
splitting the pattern array into default and non-default arrays.
Updated the doc with the same info.
11) Fixed parsing issues in parse_config() in l3fwd-graph/main.c, in line
with the issues reported in l2fwd-event
12) Removed the next_hop field in l3fwd-graph/main.c main()
13) Fixed the graph create error check in l3fwd-graph/main.c main()

v4..v3:
-------
Addressed the following review comments from Wang, Xiao W

1) Remove unnecessary line from rte_graph.h
2) Fix a typo in rte_graph.h
3) Move NODE_ID_CHECK to the 3rd patch, where it is first used.
4) Fixed bug in edge_update()

v3..v2:
-------
1) Refactor the ipv4 node lookup by moving SSE- and NEON-specific code to
lib/librte_node/ip4_lookup_sse.h and lib/librte_node/ip4_lookup_neon.h
2) Add a scalar version of the process() function for ipv4 lookup so that
the node works on machines other than x86 and arm64.

v2..v1:
-------
1) Added programmer guide/implementation documentation and l3fwd-graph doc

RFC..v1:
--------
1) Split the patch into more logical ones for review.
2) Added doxygen comments for the API
3) Code cleanup
4) Additional performance improvements.
Delta between l3fwd and l3fwd-graph is negligible now (~1%) on octeontx2.
5) Added SIMD routines for x86 in addition to arm64.

Hosted in netlify for easy reference:
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Programmer’s Guide:
https://dpdk-graph.netlify.com/doc/html/guides/prog_guide/graph_lib.html

l3fwd-graph doc:
https://dpdk-graph.netlify.com/doc/html/guides/sample_app_ug/l3_forward_graph.html

API doc:
https://dpdk-graph.netlify.com/doc/html/api/rte__graph_8h.html
https://dpdk-graph.netlify.com/doc/html/api/rte__graph__worker_8h.html
https://dpdk-graph.netlify.com/doc/html/api/rte__node__eth__api_8h.html
https://dpdk-graph.netlify.com/doc/html/api/rte__node__ip4__api_8h.html

2) Added the release notes for this feature
3) Fix build issues reported by CI for v1:
http://mails.dpdk.org/archives/test-report/2020-March/121326.html

Additional nodes planned for v20.08
-----------------------------------
1) Packet classification node
2) Support for IPv6 LPM node

This patchset contains
----------------------
1) The API definition to "create" nodes and "link" them together to create
a "graph" for packet processing. See lib/librte_graph/rte_graph.h

2) The fast path API definition for the graph walker and the enqueue
function used by the workers. See lib/librte_graph/rte_graph_worker.h

3) Optimized SW implementation for (1) and (2). See lib/librte_graph/

4) Test case to verify the graph infrastructure functionality.
See app/test/test_graph.c

5) Performance test cases to evaluate the cost of the graph walker and the
node enqueue fast-path function for various combinations.
See app/test/test_graph_perf.c

6) Packet processing nodes (Null, Rx, Tx, Pkt drop, IPv4 rewrite, IPv4
lookup) using the graph infrastructure. See lib/librte_node/*

7) An example application to showcase l3fwd (same functionality as the
existing examples/l3fwd) built on the graph infrastructure and the packet
processing nodes from item (6). See examples/l3fwd-graph/.

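To make items (1) and (2) above concrete, here is a minimal,
self-contained sketch of the idea: nodes with a process callback, edges
that link them, and a walker that visits every node with pending
objects. This is NOT the librte_graph API. Every name below (mini_node,
mini_enqueue, mini_graph_walk, MINI_BURST, and so on) is hypothetical and
exists only for illustration; the real definitions are in
lib/librte_graph/rte_graph.h and lib/librte_graph/rte_graph_worker.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MINI_BURST 8     /* hypothetical burst size, for this sketch only */
#define MINI_MAX_EDGES 2 /* hypothetical limit on edges per node */

struct mini_node;

/* Process callback: handle a burst, enqueue objects to next nodes,
 * and return the number of objects processed. */
typedef uint16_t (*mini_process_t)(struct mini_node *node, void **objs,
				   uint16_t nb_objs);

struct mini_node {
	const char *name;
	mini_process_t process;
	struct mini_node *next[MINI_MAX_EDGES]; /* "links" to downstream nodes */
	void *pending[MINI_BURST];              /* objects waiting for this node */
	uint16_t nb_pending;
	uint64_t total_objs;                    /* simple per-node statistic */
};

/* Enqueue one object to a node's pending burst. */
static void mini_enqueue(struct mini_node *to, void *obj)
{
	assert(to->nb_pending < MINI_BURST);
	to->pending[to->nb_pending++] = obj;
}

/* Example node: even values follow edge 0, odd values follow edge 1. */
static uint16_t classify_process(struct mini_node *node, void **objs,
				 uint16_t nb_objs)
{
	for (uint16_t i = 0; i < nb_objs; i++) {
		uintptr_t v = (uintptr_t)objs[i];
		mini_enqueue(node->next[v & 1], objs[i]);
	}
	return nb_objs;
}

/* Example leaf node: consumes the burst without forwarding it. */
static uint16_t sink_process(struct mini_node *node, void **objs,
			     uint16_t nb_objs)
{
	(void)node;
	(void)objs;
	return nb_objs;
}

/* Walk the nodes in the given (topological) order and run every node
 * that has pending objects. */
static void mini_graph_walk(struct mini_node **nodes, size_t nb_nodes)
{
	for (size_t i = 0; i < nb_nodes; i++) {
		struct mini_node *n = nodes[i];
		uint16_t nb = n->nb_pending;

		if (nb == 0)
			continue;
		n->nb_pending = 0;
		n->total_objs += n->process(n, n->pending, nb);
	}
}

/* Build a three-node graph, feed it the values 1..5, walk it once and
 * report how many objects each sink received. */
static void mini_demo(uint64_t *even_cnt, uint64_t *odd_cnt)
{
	struct mini_node even = { .name = "even_sink", .process = sink_process };
	struct mini_node odd = { .name = "odd_sink", .process = sink_process };
	struct mini_node cls = { .name = "classify",
				 .process = classify_process,
				 .next = { &even, &odd } };
	struct mini_node *order[] = { &cls, &even, &odd };

	for (uintptr_t v = 1; v <= 5; v++)
		mini_enqueue(&cls, (void *)v);
	mini_graph_walk(order, 3);
	*even_cnt = even.total_objs;
	*odd_cnt = odd.total_objs;
}
```

Calling mini_demo() sends the values 2 and 4 to even_sink and 1, 3 and 5
to odd_sink. The point of the sketch is only the separation of concerns:
nodes know their edges, and the walker knows nothing about what a node
does.
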
Performance
-----------
1) Graph walk and node enqueue overhead can be tested with the performance
test case application [1]
# If all packets go from one node to another node (we call it a
# "homerun"), then the move is just a pointer swap for a burst of packets.
# In the worst case, it takes a handful of cycles to move an object from
# one node to another.

2) Performance comparison of the existing l3fwd (completely static code
without any nodes) vs. the modular l3fwd-graph with 5 nodes
(ip4_lookup, ip4_rewrite, ethdev_tx, ethdev_rx, pkt_drop).
Here is a graphical representation of the l3fwd-graph as a Graphviz dot
file:
http://bit.ly/39UPPGm

# l3fwd-graph performance is -1.2% wrt static l3fwd.

# We have simulated a similar test with the existing librte_pipeline
# application [4].
The ip_pipeline application is -48.62% wrt static l3fwd.

The above results are on octeontx2 and may vary on other platforms.
Platforms with larger L1 and L2 caches will perform even better.

Tested architectures:
---------------------
1) AArch64
2) X86

Identified tweaking for better performance on different targets
---------------------------------------------------------------
1) Test with various burst size values (256, 128, 64, 32) using the
CONFIG_RTE_GRAPH_BURST_SIZE config option.
Based on our testing, on x86 and arm64 servers the sweet spot is a burst
size of 256, while on arm64 embedded SoCs it is either 64 or 128.

2) Disable node statistics (use the CONFIG_RTE_LIBRTE_GRAPH_STATS config
option) if not needed.

3) Use the arm64 optimized memory copy on arm64 by selecting
CONFIG_RTE_ARCH_ARM64_MEMCPY.

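The "homerun" case mentioned under Performance can be sketched as
follows. This is an illustration under stated assumptions, not the code
in lib/librte_graph/: the demo_stream structure, DEMO_BURST and the
demo_move*() helpers are hypothetical names. The sketch assumes each
node owns an array of object pointers, so that moving a whole burst to
an empty destination only needs the two array pointers exchanged.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BURST 4 /* hypothetical burst capacity, for this sketch only */

struct demo_stream {
	void **objs;      /* burst storage currently owned by this node */
	uint16_t nb_objs; /* number of valid entries in objs */
};

/* General path: append the source burst to the destination by copying
 * the object pointers one array into the other. */
static void demo_move_copy(struct demo_stream *src, struct demo_stream *dst)
{
	assert(dst->nb_objs + src->nb_objs <= DEMO_BURST);
	memcpy(&dst->objs[dst->nb_objs], src->objs,
	       src->nb_objs * sizeof(void *));
	dst->nb_objs += src->nb_objs;
	src->nb_objs = 0;
}

/* Home run: the destination is empty and the whole burst goes to it,
 * so the two nodes simply exchange their storage arrays. */
static void demo_move_homerun(struct demo_stream *src, struct demo_stream *dst)
{
	void **tmp = dst->objs;

	assert(dst->nb_objs == 0);
	dst->objs = src->objs;
	dst->nb_objs = src->nb_objs;
	src->objs = tmp;
	src->nb_objs = 0;
}

/* Pick the cheap path whenever it is legal. */
static void demo_move(struct demo_stream *src, struct demo_stream *dst)
{
	if (dst->nb_objs == 0)
		demo_move_homerun(src, dst);
	else
		demo_move_copy(src, dst);
}

/* Returns 1 if moving a burst into an empty destination reused the
 * source storage instead of copying it. */
static int demo_homerun_reuses_storage(void)
{
	int pkts[3] = { 10, 20, 30 };
	void *a[DEMO_BURST] = { &pkts[0], &pkts[1], &pkts[2] };
	void *b[DEMO_BURST] = { 0 };
	struct demo_stream src = { a, 3 };
	struct demo_stream dst = { b, 0 };

	demo_move(&src, &dst);
	return dst.objs == a && src.objs == b &&
	       dst.nb_objs == 3 && src.nb_objs == 0;
}
```

The swap costs the same regardless of burst size, which is why larger
bursts amortize the per-node overhead better when most traffic follows
a single edge.
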
Commands to run tests
---------------------

[1]
perf test:
echo "graph_perf_autotest" | sudo ./build/app/test/dpdk-test -c 0x30

[2]
functionality test:
echo "graph_autotest" | sudo ./build/app/test/dpdk-test -c 0x30

[3]
l3fwd-graph:
./l3fwd-graph -c 0x100 -- -p 0x3 --config="(0, 0, 8)" -P

[4]
# ./ip_pipeline --c 0xff0000 -- -s route.cli

Route.cli: (Copy paste to the shell to avoid dos format issues)

https://pastebin.com/raw/B4Ktx7TT

Jerin Jacob (13):
  graph: define the public API for graph support
  graph: implement node registration
  graph: implement node operations
  graph: implement node debug routines
  graph: implement internal graph operation helpers
  graph: populate fastpath memory for graph reel
  graph: implement create and destroy APIs
  graph: implement graph operation APIs
  graph: implement Graphviz export
  graph: implement debug routines
  graph: implement stats support
  graph: implement fastpath API routines
  doc: add graph library programmer's guide guide

Kiran Kumar K (2):
  graph: add unit test case
  node: add ipv4 rewrite node

Nithin Dabilpuram (11):
  node: add log infra and null node
  node: add ethdev Rx node
  node: add ethdev Tx node
  node: add ethdev Rx and Tx node ctrl API
  node: ipv4 lookup for arm64
  node: add ipv4 rewrite and lookup ctrl API
  node: add packet drop node
  l3fwd-graph: add graph based l3fwd skeleton
  l3fwd-graph: add ethdev configuration changes
  l3fwd-graph: add graph config and main loop
  doc: add l3fwd graph application user guide

Pavan Nikhilesh (3):
  graph: add performance testcase
  node: add generic ipv4 lookup node
  node: ipv4 lookup for x86

 MAINTAINERS                                   |   14 +
 app/test/Makefile                             |    7 +
 app/test/meson.build                          |   12 +-
 app/test/test_graph.c                         |  819 ++++
 app/test/test_graph_perf.c                    | 1057 ++++++
 config/common_base                            |   12 +
 config/rte_config.h                           |    4 +
 doc/api/doxy-api-index.md                     |    5 +
 doc/api/doxy-api.conf.in                      |    2 +
 doc/guides/prog_guide/graph_lib.rst           |  397 ++
 .../prog_guide/img/anatomy_of_a_node.svg      | 1078 ++++++
 .../prog_guide/img/graph_mem_layout.svg       |  702 ++++
 doc/guides/prog_guide/img/link_the_nodes.svg  | 3330 +++++++++++++++++
 doc/guides/prog_guide/index.rst               |    1 +
 doc/guides/rel_notes/release_20_05.rst        |   32 +
 doc/guides/sample_app_ug/index.rst            |    1 +
 doc/guides/sample_app_ug/intro.rst            |    4 +
 doc/guides/sample_app_ug/l3_forward_graph.rst |  334 ++
 examples/Makefile                             |    3 +
 examples/l3fwd-graph/Makefile                 |   58 +
 examples/l3fwd-graph/main.c                   | 1126 ++++++
 examples/l3fwd-graph/meson.build              |   13 +
 examples/meson.build                          |    6 +-
 lib/Makefile                                  |    6 +
 lib/librte_graph/Makefile                     |   28 +
 lib/librte_graph/graph.c                      |  587 +++
 lib/librte_graph/graph_debug.c                |   84 +
 lib/librte_graph/graph_ops.c                  |  169 +
 lib/librte_graph/graph_populate.c             |  234 ++
 lib/librte_graph/graph_private.h              |  347 ++
 lib/librte_graph/graph_stats.c                |  406 ++
 lib/librte_graph/meson.build                  |   11 +
 lib/librte_graph/node.c                       |  421 +++
 lib/librte_graph/rte_graph.h                  |  668 ++++
 lib/librte_graph/rte_graph_version.map        |   47 +
 lib/librte_graph/rte_graph_worker.h           |  510 +++
 lib/librte_node/Makefile                      |   32 +
 lib/librte_node/ethdev_ctrl.c                 |  115 +
 lib/librte_node/ethdev_rx.c                   |  221 ++
 lib/librte_node/ethdev_rx_priv.h              |   81 +
 lib/librte_node/ethdev_tx.c                   |   86 +
 lib/librte_node/ethdev_tx_priv.h              |   62 +
 lib/librte_node/ip4_lookup.c                  |  215 ++
 lib/librte_node/ip4_lookup_neon.h             |  238 ++
 lib/librte_node/ip4_lookup_sse.h              |  244 ++
 lib/librte_node/ip4_rewrite.c                 |  326 ++
 lib/librte_node/ip4_rewrite_priv.h            |   77 +
 lib/librte_node/log.c                         |   14 +
 lib/librte_node/meson.build                   |   10 +
 lib/librte_node/node_private.h                |   79 +
 lib/librte_node/null.c                        |   23 +
 lib/librte_node/pkt_drop.c                    |   26 +
 lib/librte_node/rte_node_eth_api.h            |   64 +
 lib/librte_node/rte_node_ip4_api.h            |   78 +
 lib/librte_node/rte_node_version.map          |    9 +
 lib/meson.build                               |    5 +-
 meson.build                                   |    1 +
 mk/rte.app.mk                                 |    2 +
 58 files changed, 14538 insertions(+), 5 deletions(-)
 create mode 100644 app/test/test_graph.c
 create mode 100644 app/test/test_graph_perf.c
 create mode 100644 doc/guides/prog_guide/graph_lib.rst
 create mode 100644 doc/guides/prog_guide/img/anatomy_of_a_node.svg
 create mode 100644 doc/guides/prog_guide/img/graph_mem_layout.svg
 create mode 100644 doc/guides/prog_guide/img/link_the_nodes.svg
 create mode 100644 doc/guides/sample_app_ug/l3_forward_graph.rst
 create mode 100644 examples/l3fwd-graph/Makefile
 create mode 100644 examples/l3fwd-graph/main.c
 create mode 100644 examples/l3fwd-graph/meson.build
 create mode 100644 lib/librte_graph/Makefile
 create mode 100644 lib/librte_graph/graph.c
 create mode 100644 lib/librte_graph/graph_debug.c
 create mode 100644 lib/librte_graph/graph_ops.c
 create mode 100644 lib/librte_graph/graph_populate.c
 create mode 100644 lib/librte_graph/graph_private.h
 create mode 100644 lib/librte_graph/graph_stats.c
 create mode 100644 lib/librte_graph/meson.build
 create mode 100644 lib/librte_graph/node.c
 create mode 100644 lib/librte_graph/rte_graph.h
 create mode 100644 lib/librte_graph/rte_graph_version.map
 create mode 100644 lib/librte_graph/rte_graph_worker.h
 create mode 100644 lib/librte_node/Makefile
 create mode 100644 lib/librte_node/ethdev_ctrl.c
 create mode 100644 lib/librte_node/ethdev_rx.c
 create mode 100644 lib/librte_node/ethdev_rx_priv.h
 create mode 100644 lib/librte_node/ethdev_tx.c
 create mode 100644 lib/librte_node/ethdev_tx_priv.h
 create mode 100644 lib/librte_node/ip4_lookup.c
 create mode 100644 lib/librte_node/ip4_lookup_neon.h
 create mode 100644 lib/librte_node/ip4_lookup_sse.h
 create mode 100644 lib/librte_node/ip4_rewrite.c
 create mode 100644 lib/librte_node/ip4_rewrite_priv.h
 create mode 100644 lib/librte_node/log.c
 create mode 100644 lib/librte_node/meson.build
 create mode 100644 lib/librte_node/node_private.h
 create mode 100644 lib/librte_node/null.c
 create mode 100644 lib/librte_node/pkt_drop.c
 create mode 100644 lib/librte_node/rte_node_eth_api.h
 create mode 100644 lib/librte_node/rte_node_ip4_api.h
 create mode 100644 lib/librte_node/rte_node_version.map

-- 
2.25.1