From: Viacheslav Ovsiienko
To: dev@dpdk.org
Cc: thomasm@monjalon.net, stephen@networkplumber.org,
 ferruh.yigit@intel.com, olivier.matz@6wind.com, jerinjacobk@gmail.com,
 maxime.coquelin@redhat.com, david.marchand@redhat.com,
 arybchenko@solarflare.com
Date: Tue, 13 Oct 2020 19:21:51 +0000
Message-Id: <1602616917-22193-1-git-send-email-viacheslavo@nvidia.com>
Subject: [dpdk-dev] [PATCH v5 0/6] ethdev: introduce Rx buffer split

The DPDK datapath in the transmit direction is very flexible. An
application can build a multi-segment packet and manage almost all
data aspects: the memory pools the segments are allocated from, the
segment lengths, and the memory attributes such as external buffers,
registration for DMA, etc.

In the receive direction the datapath is much less flexible: an
application can only specify the memory pool used to configure the
receive queue and nothing more.

In order to extend the receive datapath capabilities, it is proposed
to add a way to provide extended information on how to split the
packets being received.

The following structure is introduced to specify the Rx packet
segment:

struct rte_eth_rxseg {
    struct rte_mempool *mp; /* memory pool to allocate segment from */
    uint16_t length; /* segment maximal data length,
                        configures "split point" */
    uint16_t offset; /* data offset from beginning
                        of mbuf data buffer */
    uint32_t reserved; /* reserved field */
};

The segment descriptions are added to the rte_eth_rxconf structure:

   rx_seg - pointer to the array of segment descriptions; each element
            describes the memory pool, the maximal data length, and
            the initial data offset from the beginning of the data
            buffer in the mbuf. This array allows specifying different
            settings for each segment individually.
   n_seg - number of elements in the array

If the extended segment descriptions are provided with these new
fields, the mp parameter of rte_eth_rx_queue_setup() must be specified
as NULL to avoid ambiguity.

The new offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is introduced in
the device capabilities so that a PMD can report to the application
that it supports splitting Rx packets into configurable segments.
Prior to invoking the rte_eth_rx_queue_setup() routine, the
application should check the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag.

If the Rx queue is configured with the new segment descriptions, the
packets being received will be split into multiple segments pushed to
mbufs with the specified attributes. The PMD will split the received
packets into multiple segments according to the specification in the
description array:

- the first network buffer will be allocated from the memory pool
  specified in the first segment description element, the second
  network buffer from the pool in the second segment description
  element, and so on.
  If there are not enough elements to describe the buffers for an
  entire packet of maximal length, the pool from the last valid
  element will be used to allocate the buffers for the rest of the
  segments.

- the offsets from the segment description elements provide the data
  offset from the beginning of the buffer, except for the first mbuf:
  for this one the offset is added to RTE_PKTMBUF_HEADROOM to get the
  actual offset from the beginning of the buffer. If there are not
  enough elements to describe the buffers for an entire packet of
  maximal length, the offsets for the rest of the segments are assumed
  to be zero.

- the data length received into each segment is limited by the length
  specified in the segment description element. Receiving starts with
  filling up the first mbuf data buffer; if the specified maximal
  segment length is reached and there is data remaining (the packet is
  longer than the buffer in the first mbuf), the following data will
  be pushed to the next segment up to its own maximal length. If the
  first two segments are not enough to store all the remaining packet
  data, the next (third) segment will be engaged, and so on. If the
  length in the segment description element is zero, the actual buffer
  size will be deduced from the appropriate memory pool properties. If
  there are not enough elements to describe the buffers for an entire
  packet of maximal length, the buffer size for the remaining segments
  will be deduced from the pool of the last valid element.
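
The three rules above can be sketched as a small standalone model. This
is only an illustration, not the ethdev or PMD code: struct rxseg_model,
struct placed_seg and split_packet() are hypothetical names, the mempool
pointer is replaced by a pool index plus an assumed per-pool data room
(pool_room), and HEADROOM stands in for RTE_PKTMBUF_HEADROOM using its
usual default of 128.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEADROOM 128u /* stand-in for RTE_PKTMBUF_HEADROOM (assumed) */

/* Hypothetical simplified form of the proposed struct rte_eth_rxseg. */
struct rxseg_model {
    unsigned int pool;  /* pool identifier instead of a mempool pointer */
    uint16_t pool_room; /* assumed data room of mbufs from that pool */
    uint16_t length;    /* maximal data length, 0 = deduce from pool */
    uint16_t offset;    /* data offset inside the mbuf data buffer */
};

/* Where one received segment ended up (illustration only). */
struct placed_seg {
    unsigned int pool;
    uint32_t offset;
    uint16_t length;
};

/*
 * Split a packet of pkt_len bytes according to cfg[0..n_seg-1].
 * Returns the number of segments written to out[], or -1 if the
 * configuration is unusable or out[] is too small.
 */
static int
split_packet(const struct rxseg_model *cfg, unsigned int n_seg,
             uint32_t pkt_len, struct placed_seg *out, unsigned int max_out)
{
    unsigned int i = 0;

    assert(cfg != NULL && out != NULL);
    if (n_seg == 0)
        return -1;
    while (pkt_len > 0) {
        /* Past the end of the array, reuse the last valid element. */
        const struct rxseg_model *d = &cfg[i < n_seg ? i : n_seg - 1];
        uint16_t cap;
        uint32_t off;

        if (i < n_seg) {
            /* Zero length means "deduce from the pool". */
            cap = d->length != 0 ? d->length : d->pool_room;
            off = d->offset;
        } else {
            /* Extra segments: size from the last pool, offset zero. */
            cap = d->pool_room;
            off = 0;
        }
        if (i == 0)
            off += HEADROOM; /* only the first mbuf adds the headroom */
        if (cap == 0 || i >= max_out)
            return -1;
        out[i].pool = d->pool;
        out[i].offset = off;
        out[i].length = pkt_len < cap ? (uint16_t)pkt_len : cap;
        pkt_len -= out[i].length;
        i++;
    }
    return (int)i;
}
```

The model ignores how a real PMD accounts for headroom and offset when
it deduces the buffer size from a pool; it treats pool_room as the
usable data length.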

For example, suppose we configured the Rx queue with the following
segments:

    seg0 - pool0, len0=14B, off0=2B
    seg1 - pool1, len1=20B, off1=128B
    seg2 - pool2, len2=20B, off2=0B
    seg3 - pool3, len3=512B, off3=0B

A packet 46 bytes long will look like the following:

    seg0 - 14B long @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
    seg1 - 20B long @ 128 in mbuf from pool1
    seg2 - 12B long @ 0 in mbuf from pool2

A packet 1500 bytes long will look like the following:

    seg0 - 14B long @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
    seg1 - 20B long @ 128 in mbuf from pool1
    seg2 - 20B long @ 0 in mbuf from pool2
    seg3 - 512B long @ 0 in mbuf from pool3
    seg4 - 512B long @ 0 in mbuf from pool3
    seg5 - 422B long @ 0 in mbuf from pool3

The offload RTE_ETH_RX_OFFLOAD_SCATTER must be present and configured
to support the new buffer split feature (if n_seg is greater than
one).

The new approach allows splitting the ingress packets into multiple
parts pushed to memory with different attributes. For example, the
packet headers can be pushed to the embedded data buffers within mbufs
and the application data into external buffers attached to mbufs
allocated from different memory pools. The memory attributes for the
split parts may differ as well: for example, the application data may
be pushed into external memory located on a dedicated physical device,
say a GPU or NVMe. This improves the flexibility of the DPDK receive
datapath while preserving compatibility with the existing API.
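
The setup preconditions stated in this letter (NULL mp when segment
descriptions are given, the capability check, and the scatter
requirement when n_seg is greater than one) can be collected into one
check. The sketch below is standalone and hypothetical: struct
rxq_request and rxq_request_is_valid() are illustrative names, the two
flag macros use placeholder bit values rather than the real ones from
rte_ethdev.h, and the rule that the legacy path needs a non-NULL pool
is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder bit values for illustration only; the real flag values
 * come from rte_ethdev.h and are not stated in this letter. */
#define OFFLOAD_SCATTER      (UINT64_C(1) << 0)
#define OFFLOAD_BUFFER_SPLIT (UINT64_C(1) << 1)

/* Hypothetical summary of what an application passes at queue setup. */
struct rxq_request {
    const void *mp;       /* mp argument of rte_eth_rx_queue_setup() */
    uint16_t n_seg;       /* number of segment descriptions in rxconf */
    uint64_t rx_capa;     /* Rx offload capabilities reported by the PMD */
    uint64_t rx_offloads; /* Rx offloads the application enabled */
};

static bool
rxq_request_is_valid(const struct rxq_request *r)
{
    assert(r != NULL);
    if (r->n_seg == 0)
        return r->mp != NULL; /* legacy single-pool setup (assumed) */
    if (r->mp != NULL)
        return false; /* mp must be NULL to avoid ambiguity */
    if ((r->rx_capa & OFFLOAD_BUFFER_SPLIT) == 0)
        return false; /* PMD does not report buffer split support */
    if (r->n_seg > 1 && (r->rx_offloads & OFFLOAD_SCATTER) == 0)
        return false; /* several segments require the scatter offload */
    return true;
}
```

An application could run such a check on its own configuration before
calling rte_eth_rx_queue_setup(), so that an unsupported combination is
caught early.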

Signed-off-by: Viacheslav Ovsiienko

---
v1: http://patches.dpdk.org/patch/79594/

v2: http://patches.dpdk.org/patch/79893/
    - add feature support to mlx5 PMD

v3: http://patches.dpdk.org/patch/80389/
    - rte_eth_rx_queue_setup_ex is renamed to rte_eth_rxseg_queue_setup
    - DEV_RX_OFFLOAD_BUFFER_SPLIT is renamed to
      RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT
    - commit message update
    - documentation provided
    - release notes update
    - minor bug fixes in testpmd related part

v4: http://patches.dpdk.org/patch/80401/
    - common part of rx_queue_setup/rxseg_queue_setup

v5: - refactored to the approach of providing the split configuration
      in the rte_eth_rxconf structure instead of introducing a new API
      routine
    - added support for the rxoffs command to testpmd to provide
      segment offsets for complete testing of split configurations
    - patchset is split into two parts; the PMD part will be presented
      as a separate series

Viacheslav Ovsiienko (6):
  ethdev: introduce Rx buffer split
  app/testpmd: add multiple pools per core creation
  app/testpmd: add buffer split offload configuration
  app/testpmd: add rxpkts commands and parameters
  app/testpmd: add rxoffs commands and parameters
  app/testpmd: add extended Rx queue setup

 app/test-pmd/bpf_cmd.c                      |   4 +-
 app/test-pmd/cmdline.c                      | 151 ++++++++++++++++++++++++----
 app/test-pmd/config.c                       | 107 +++++++++++++++++++-
 app/test-pmd/parameters.c                   |  54 ++++++++--
 app/test-pmd/testpmd.c                      | 120 ++++++++++++++++------
 app/test-pmd/testpmd.h                      |  44 ++++++--
 doc/guides/nics/features.rst                |  15 +++
 doc/guides/rel_notes/release_20_11.rst      |   9 ++
 doc/guides/testpmd_app_ug/run_app.rst       |  22 +++-
 doc/guides/testpmd_app_ug/testpmd_funcs.rst |  36 ++++++-
 lib/librte_ethdev/rte_ethdev.c              |  95 +++++++++++++----
 lib/librte_ethdev/rte_ethdev.h              |  58 ++++++++++-
 lib/librte_ethdev/rte_ethdev_version.map    |   1 +
 13 files changed, 621 insertions(+), 95 deletions(-)

-- 
1.8.3.1