From: Pavan Nikhilesh
Date: Fri, 26 Mar 2021 19:38:41 +0530
Message-ID: <20210326140850.7332-1-pbhagavatula@marvell.com>
In-Reply-To: <20210325171057.6699-1-pbhagavatula@marvell.com>
References: <20210325171057.6699-1-pbhagavatula@marvell.com>
Subject: [dpdk-dev] [PATCH v8 0/8] Introduce event vectorization

In the traditional event programming model, events are identified by a
flow-id and a uintptr_t. The flow-id uniquely identifies a given event
and determines the order of scheduling based on the schedule type; the
uintptr_t holds a single object.

Event devices also support burst mode with configurable dequeue depth,
i.e. each dequeue call would return multiple events and each event
might be at a different stage of the pipeline.
Having a burst of events belonging to different stages in a dequeue
burst is not only difficult to vectorize but also increases the
scheduler overhead and application overhead of pipelining events
further.
Using event vectors we see a performance gain of ~742.3% as shown in
[1].

By introducing event vectorization, each event will be capable of
holding multiple uintptr_t of the same flow, thereby allowing
applications to vectorize their pipeline and reduce the complexity of
pipelining events across multiple stages.
This also reduces the complexity of handling enqueue and dequeue on an
event device.

Since event devices are transparent to the events they are scheduling,
the event producers such as eth_rx_adapter, crypto_adapter, etc. are
responsible for vectorizing the buffers of the same flow into a single
event.

The series also breaks ABI in patch [8/8], which is targeted at the
v21.11 release.

The dpdk-test-eventdev application has been updated with options to
test multiple vector sizes and timeouts.

[1]
As for performance improvement, with an ARM Cortex-A72 equivalent
processor, software event device (--vdev=event_sw0), single worker
core, single stage and using one service core each for the Rx adapter,
Tx adapter and scheduling.

Without this patchset applied:
	./build/app/dpdk-test-eventdev -l 7-23 -s 0x700 --vdev="event_sw0" -- \
		--prod_type_ethdev --nb_pkts=0 --verbose 2 \
		--test=pipeline_queue --stlist=a --wlcores=20
	Port[0] using Rx adapter[0] configured
	Port[0] using Tx adapter[0] Configured
	5.071 mpps

With the patchset applied and without event vectorization:
	./build/app/dpdk-test-eventdev -l 7-23 -s 0x700 --vdev="event_sw0" -- \
		--prod_type_ethdev --nb_pkts=0 --verbose 2 \
		--test=pipeline_queue --stlist=a --wlcores=20
	Port[0] using Rx adapter[0] configured
	Port[0] using Tx adapter[0] Configured
	5.123 mpps

With event vectorization:
	./build/app/dpdk-test-eventdev -l 7-23 -s 0x700 --vdev="event_sw0" -- \
		--prod_type_ethdev --nb_pkts=0 --verbose 2 \
		--test=pipeline_queue --stlist=a --wlcores=20 \
		--enable_vector --nb_eth_queues 1 --vector_size 256
	Port[0] using Rx adapter[0] configured
	Port[0] using Tx adapter[0] Configured
	42.715 mpps

Having dedicated service cores for each Rx queue and tuning the vector
size and dequeue burst size would further improve performance.
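
As a quick cross-check of the ~742.3% figure, the relative gain can be
recomputed from the throughput numbers in [1]. The helper below is only
an illustrative sketch: gain_percent() is a hypothetical name and is
not part of DPDK or of this series.

```c
#include <assert.h>

/*
 * Relative throughput gain of after_mpps over before_mpps, in percent.
 * Hypothetical helper for illustration only; not a DPDK API.
 */
double gain_percent(double before_mpps, double after_mpps)
{
	/* A zero or negative baseline would make the ratio meaningless. */
	assert(before_mpps > 0.0);
	return (after_mpps - before_mpps) / before_mpps * 100.0;
}
```

With the numbers from [1], gain_percent(5.071, 42.715) evaluates to
about 742.3 (relative to the unpatched baseline) and
gain_percent(5.123, 42.715) to about 733.8 (relative to the patched run
without vectorization).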
API usage is shown below:

Configuration:

	struct rte_event_eth_rx_adapter_event_vector_config vec_conf;

	vector_pool = rte_event_vector_pool_create("vector_pool",
			nb_elem, 0, vector_size, socket_id);

	rte_event_eth_rx_adapter_create(id, event_id, &adptr_conf);
	rte_event_eth_rx_adapter_queue_add(id, eth_id, -1, &queue_conf);
	if (cap & RTE_EVENT_ETH_RX_ADAPTER_CAP_EVENT_VECTOR) {
		vec_conf.vector_sz = vector_size;
		vec_conf.vector_timeout_ns = vector_tmo_nsec;
		vec_conf.vector_mp = vector_pool;
		rte_event_eth_rx_adapter_queue_event_vector_config(id,
				eth_id, -1, &vec_conf);
	}

Fastpath:

	num = rte_event_dequeue_burst(event_id, port_id, &ev, 1, 0);
	if (!num)
		continue;
	if (ev.event_type & RTE_EVENT_TYPE_VECTOR) {
		switch (ev.event_type) {
		case RTE_EVENT_TYPE_ETHDEV_VECTOR:
		case RTE_EVENT_TYPE_ETH_RX_ADAPTER_VECTOR: {
			/* Braces give the declaration its own scope. */
			struct rte_mbuf **mbufs;

			mbufs = ev.vector_ev->mbufs;
			for (i = 0; i < ev.vector_ev->nb_elem; i++) {
				/* Process mbufs[i]. */
			}
			break;
		}
		case ...
		}
	}
	...

v8 Changes:
- Fix incorrect shift for vector timeout interval. (Jay)
- Code reallocation. (Jay)

v7 Changes:
- More doxygen fixes. (Jay)
- Reduce code duplication in 4/8. (Jay)

v6 Changes:
- Make rte_errno sign consistent. (Jay)
- Grammatical and doxygen fixes. (Jay)

v5 Changes:
- Make `rte_event_vector_pool_create` non-inline to ease ABI
  stability. (Ray)
- Move `rte_event_eth_rx_adapter_queue_event_vector_config` and
  `rte_event_eth_rx_adapter_vector_limits_get` implementation to the
  patch where they are initially defined. (Ray)
- Multiple grammatical and style fixes. (Jerin)
- Add missing release notes. (Jerin)

v4 Changes:
- Fix missing event vector structure in event structure. (Jay)

v3 Changes:
- Fix unintended formatting changes.

v2 Changes:
- Multiple grammatical and style fixes. (Jerin)
- Add parameter to define vector size in power of 2. (Jerin)
- Redo patch series w/o breaking ABI till the last patch. (David)
- Add deprecation notice to announce ABI break in 21.11. (David)
- Add vector limits validation to app/test-eventdev.
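
To illustrate how vector_sz and vector_timeout_ns from the
configuration snippet could interact on the producer side, here is a
minimal standalone sketch. It is not the Rx adapter implementation: all
demo_* names are hypothetical, it assumes one aggregator per flow, and
it assumes the timeout is measured from the arrival of the first
element of the pending vector.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_MAX_VECTOR_SZ 8 /* hypothetical cap for this sketch */

/* Hypothetical stand-in for an event vector of same-flow objects. */
struct demo_vector {
	uint16_t nb_elem;
	uint64_t first_ts_ns; /* arrival time of element 0 */
	void *objs[DEMO_MAX_VECTOR_SZ];
};

/* Hypothetical per-flow aggregation state. */
struct demo_aggregator {
	uint16_t vector_sz;         /* flush when this many are pending */
	uint64_t vector_timeout_ns; /* flush when the oldest is this old */
	struct demo_vector cur;     /* vector being filled */
};

void demo_aggregator_init(struct demo_aggregator *a, uint16_t vector_sz,
			  uint64_t vector_timeout_ns)
{
	assert(vector_sz > 0 && vector_sz <= DEMO_MAX_VECTOR_SZ);
	a->vector_sz = vector_sz;
	a->vector_timeout_ns = vector_timeout_ns;
	a->cur.nb_elem = 0;
	a->cur.first_ts_ns = 0;
}

/* Copy the pending vector to *out and reset it; 1 if anything was pending. */
int demo_flush(struct demo_aggregator *a, struct demo_vector *out)
{
	if (a->cur.nb_elem == 0)
		return 0;
	*out = a->cur;
	a->cur.nb_elem = 0;
	return 1;
}

/* Append one object; returns 1 and fills *out once the vector is full. */
int demo_add(struct demo_aggregator *a, void *obj, uint64_t now_ns,
	     struct demo_vector *out)
{
	if (a->cur.nb_elem == 0)
		a->cur.first_ts_ns = now_ns;
	a->cur.objs[a->cur.nb_elem++] = obj;
	if (a->cur.nb_elem == a->vector_sz)
		return demo_flush(a, out);
	return 0;
}

/* Flush a partially filled vector whose first element has timed out. */
int demo_poll_timeout(struct demo_aggregator *a, uint64_t now_ns,
		      struct demo_vector *out)
{
	if (a->cur.nb_elem > 0 &&
	    now_ns - a->cur.first_ts_ns >= a->vector_timeout_ns)
		return demo_flush(a, out);
	return 0;
}
```

In this sketch a larger vector_sz amortizes per-event overhead over
more objects, while the timeout bounds how long a partially filled
vector can sit in the producer when traffic for the flow is sparse.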

Pavan Nikhilesh (8):
  eventdev: introduce event vector capability
  eventdev: introduce event vector Rx capability
  eventdev: introduce event vector Tx capability
  eventdev: add Rx adapter event vector support
  eventdev: add Tx adapter event vector support
  app/eventdev: add event vector mode in pipeline test
  doc: announce event Rx adapter config changes
  eventdev: simplify Rx adapter event vector config

 app/test-eventdev/evt_common.h               |   4 +
 app/test-eventdev/evt_options.c              |  52 +++
 app/test-eventdev/evt_options.h              |   4 +
 app/test-eventdev/test_pipeline_atq.c        | 310 +++++++++++++++--
 app/test-eventdev/test_pipeline_common.c     | 105 +++++-
 app/test-eventdev/test_pipeline_common.h     |  18 +
 app/test-eventdev/test_pipeline_queue.c      | 320 ++++++++++++++++--
 .../prog_guide/event_ethernet_rx_adapter.rst |  38 +++
 .../prog_guide/event_ethernet_tx_adapter.rst |  12 +
 doc/guides/prog_guide/eventdev.rst           |  36 +-
 doc/guides/rel_notes/deprecation.rst         |   9 +
 doc/guides/rel_notes/release_21_05.rst       |   8 +
 doc/guides/tools/testeventdev.rst            |  45 ++-
 lib/librte_eventdev/eventdev_pmd.h           |  31 +-
 .../rte_event_eth_rx_adapter.c               | 318 ++++++++++++++++-
 .../rte_event_eth_rx_adapter.h               |  78 +++++
 .../rte_event_eth_tx_adapter.c               |  66 +++-
 lib/librte_eventdev/rte_eventdev.c           |  53 ++-
 lib/librte_eventdev/rte_eventdev.h           | 114 ++++++-
 lib/librte_eventdev/version.map              |   4 +
 20 files changed, 1534 insertions(+), 91 deletions(-)

--
2.17.1