From mboxrd@z Thu Jan 1 00:00:00 1970
From: Andrew Rybchenko
To: Viacheslav Ovsiienko
References: <1602834519-8696-1-git-send-email-viacheslavo@nvidia.com>
 <1602834519-8696-2-git-send-email-viacheslavo@nvidia.com>
Date: Fri, 16 Oct 2020 11:58:16 +0300
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:78.0) Gecko/20100101 Thunderbird/78.3.1
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH v8 1/6] ethdev: introduce Rx buffer split
List-Id: DPDK patches and discussions
Sender: "dev"

On 10/16/20 11:51 AM, Andrew Rybchenko wrote:
> On 10/16/20 10:48 AM, Viacheslav Ovsiienko wrote:
>> The DPDK datapath in the transmit direction is very flexible.
>> An application can build multi-segment packets and manage
>> almost all data aspects - the memory pools where segments
>> are allocated from, the segment lengths, the memory attributes
>> like external buffers, registered for DMA, etc.
>>
>> In the receiving direction, the datapath is much less flexible:
>> an application can only specify the memory pool to configure
>> the receiving queue and nothing more. To extend the receiving
>> datapath capabilities, it is proposed to add a way to provide
>> extended information about how to split the packets being
>> received.
>>
>> The new offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT in device
>> capabilities is introduced to let a PMD report to the
>> application that it supports splitting received packets into
>> configurable segments. Prior to invoking the
>> rte_eth_rx_queue_setup() routine, the application should check
>> the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag.
>>
>> The following structure is introduced to specify the Rx packet
>> segment for the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT offload:
>>
>> struct rte_eth_rxseg_split {
>>     struct rte_mempool *mp; /* memory pool to allocate segment from */
>>     uint16_t length; /* segment maximal data length,
>>                         configures "split point" */
>>     uint16_t offset; /* data offset from beginning
>>                         of mbuf data buffer */
>>     uint32_t reserved; /* reserved field */
>> };
>>
>> The segment descriptions are added to the rte_eth_rxconf structure:
>> rx_seg - pointer to the array of segment descriptions; each element
>>          describes the memory pool, maximal data length, and initial
>>          data offset from the beginning of the data buffer in the
>>          mbuf. This array allows specifying different settings for
>>          each segment individually.
>> rx_nseg - number of elements in the array
>>
>> If the extended segment descriptions are provided via these new
>> fields, the mp parameter of rte_eth_rx_queue_setup() must be
>> specified as NULL to avoid ambiguity.
>>
>> There are two options to specify the Rx buffer configuration:
>> - mp is not NULL, rx_conf.rx_seg is NULL, rx_conf.rx_nseg is zero:
>>   the compatible configuration, which follows the existing
>>   implementation and provides a single pool with no description
>>   of segment sizes and offsets.
>> - mp is NULL, rx_conf.rx_seg is not NULL, rx_conf.rx_nseg is not
>>   zero: the extended configuration, specified individually for
>>   each segment.
>>
>> If the Rx queue is configured with the new settings, the packets
>> being received will be split into multiple segments pushed to
>> mbufs with the specified attributes. The PMD will split the
>> received packets into multiple segments according to the
>> specification in the description array.
>>
>> For example, suppose we configured the Rx queue with the
>> following segments:
>> seg0 - pool0, len0=14B, off0=2
>> seg1 - pool1, len1=20B, off1=128B
>> seg2 - pool2, len2=20B, off2=0B
>> seg3 - pool3, len3=512B, off3=0B
>>
>> A packet 46 bytes long will look like the following:
>> seg0 - 14B long @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>> seg1 - 20B long @ 128 in mbuf from pool1
>> seg2 - 12B long @ 0 in mbuf from pool2
>>
>> A packet 1500 bytes long will look like the following:
>> seg0 - 14B @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>> seg1 - 20B @ 128 in mbuf from pool1
>> seg2 - 20B @ 0 in mbuf from pool2
>> seg3 - 512B @ 0 in mbuf from pool3
>> seg4 - 512B @ 0 in mbuf from pool3
>> seg5 - 422B @ 0 in mbuf from pool3
>>
>> The offload RTE_ETH_RX_OFFLOAD_SCATTER must be present and
>> configured to support the new buffer split feature (if rx_nseg
>> is greater than one).
>>
>> The split limitations imposed by the underlying PMD are reported
>> in the newly introduced rte_eth_dev_info->rx_seg_capa field.
>>
>> The new approach would allow splitting the ingress packets into
>> multiple parts pushed to memory with different attributes.
>> For example, the packet headers can be pushed to the embedded
>> data buffers within mbufs, and the application data into
>> external buffers attached to mbufs allocated from
>> different memory pools. The memory attributes of the split
>> parts may differ as well - for example, the application data
>> may be pushed into external memory located on a dedicated
>> physical device, say a GPU or NVMe. This would improve the DPDK
>> receiving datapath flexibility while preserving compatibility
>> with the existing API.
>>
>> Signed-off-by: Viacheslav Ovsiienko
>> Acked-by: Ajit Khaparde
>> Acked-by: Jerin Jacob

With below review notes processed:

Reviewed-by: Andrew Rybchenko

>> ---
>>  doc/guides/nics/features.rst           |  15 ++++
>>  doc/guides/rel_notes/deprecation.rst   |   5 --
>>  doc/guides/rel_notes/release_20_11.rst |   9 ++
>>  lib/librte_ethdev/rte_ethdev.c         | 152 +++++++++++++++++++++++++++------
>>  lib/librte_ethdev/rte_ethdev.h         |  88 ++++++++++++++++++-
>>  5 files changed, 238 insertions(+), 31 deletions(-)
>>
>> diff --git a/doc/guides/nics/features.rst b/doc/guides/nics/features.rst
>> index dd8c955..832ea3b 100644
>> --- a/doc/guides/nics/features.rst
>> +++ b/doc/guides/nics/features.rst
>> @@ -185,6 +185,21 @@ Supports receiving segmented mbufs.
>>  * **[related] eth_dev_ops**: ``rx_pkt_burst``.
>>
>>
>> +.. _nic_features_buffer_split:
>> +
>> +Buffer Split on Rx
>> +------------------
>> +
>> +Scatters the packets being received on specified boundaries to segmented mbufs.
>> +
>> +* **[uses] rte_eth_rxconf,rte_eth_rxmode**: ``offloads:RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT``.
>> +* **[uses] rte_eth_rxconf**: ``rx_conf.rx_seg, rx_conf.rx_nseg``.
>> +* **[implements] datapath**: ``Buffer Split functionality``.
>> +* **[provides] rte_eth_dev_info**: ``rx_offload_capa:RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT``.
>> +* **[provides] eth_dev_ops**: ``rxq_info_get:buffer_split``.
>> +* **[related] API**: ``rte_eth_rx_queue_setup()``.
>> +
>> +
>>  .. _nic_features_lro:
>>
>>  LRO
>> diff --git a/doc/guides/rel_notes/deprecation.rst b/doc/guides/rel_notes/deprecation.rst
>> index 584e720..232cd54 100644
>> --- a/doc/guides/rel_notes/deprecation.rst
>> +++ b/doc/guides/rel_notes/deprecation.rst
>> @@ -138,11 +138,6 @@ Deprecation Notices
>>    In 19.11 PMDs will still update the field even when the offload is not
>>    enabled.
>>
>> -* ethdev: Add new fields to ``rte_eth_rxconf`` to configure the receiving
>> -  queues to split ingress packets into multiple segments according to the
>> -  specified lengths into the buffers allocated from the specified
>> -  memory pools. The backward compatibility to existing API is preserved.
>> -
>>  * ethdev: ``rx_descriptor_done`` dev_ops and ``rte_eth_rx_descriptor_done``
>>    will be removed in 21.11.
>>    Existing ``rte_eth_rx_descriptor_status`` and ``rte_eth_tx_descriptor_status``
>> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
>> index bcc0fc2..bcc2479 100644
>> --- a/doc/guides/rel_notes/release_20_11.rst
>> +++ b/doc/guides/rel_notes/release_20_11.rst
>> @@ -60,6 +60,12 @@ New Features
>>    Added the FEC API which provides functions for query FEC capabilities and
>>    current FEC mode from device. Also, API for configuring FEC mode is also provided.
>>
>> +* **Introduced extended buffer description for receiving.**
>> +
>> +  Added the extended Rx buffer description for the Rx queue setup routine,
>> +  providing individual settings for each Rx segment: maximal size,
>> +  buffer offset, and the memory pool to allocate data buffers from.
>> +
>>  * **Updated Broadcom bnxt driver.**
>>
>>    Updated the Broadcom bnxt driver with new features and improvements, including:
>> @@ -253,6 +259,9 @@ API Changes
>>    As the data of ``uint8_t`` will be truncated when queue number under
>>    a TC is greater than 256.
>>
>> +* ethdev: Added fields rx_seg and rx_nseg to the rte_eth_rxconf structure
>> +  to provide an extended description of the receiving buffer.
>> +
>>  * vhost: Moved vDPA APIs from experimental to stable.
>>
>>  * rawdev: Added a structure size parameter to the functions
>> diff --git a/lib/librte_ethdev/rte_ethdev.c b/lib/librte_ethdev/rte_ethdev.c
>> index 892c246..0eda3f4 100644
>> --- a/lib/librte_ethdev/rte_ethdev.c
>> +++ b/lib/librte_ethdev/rte_ethdev.c
>> @@ -105,6 +105,9 @@ struct rte_eth_xstats_name_off {
>>  #define RTE_RX_OFFLOAD_BIT2STR(_name) \
>>  	{ DEV_RX_OFFLOAD_##_name, #_name }
>>
>> +#define RTE_ETH_RX_OFFLOAD_BIT2STR(_name) \
>> +	{ RTE_ETH_RX_OFFLOAD_##_name, #_name }
>> +
>>  static const struct {
>>  	uint64_t offload;
>>  	const char *name;
>> @@ -128,9 +131,11 @@ struct rte_eth_xstats_name_off {
>>  	RTE_RX_OFFLOAD_BIT2STR(SCTP_CKSUM),
>>  	RTE_RX_OFFLOAD_BIT2STR(OUTER_UDP_CKSUM),
>>  	RTE_RX_OFFLOAD_BIT2STR(RSS_HASH),
>> +	RTE_ETH_RX_OFFLOAD_BIT2STR(BUFFER_SPLIT),
>>  };
>>
>>  #undef RTE_RX_OFFLOAD_BIT2STR
>> +#undef RTE_ETH_RX_OFFLOAD_BIT2STR
>>
>>  #define RTE_TX_OFFLOAD_BIT2STR(_name) \
>>  	{ DEV_TX_OFFLOAD_##_name, #_name }
>> @@ -1763,6 +1768,77 @@ struct rte_eth_dev *
>>  	return ret;
>>  }
>>
>> +static int
>> +rte_eth_rx_queue_check_split(const struct rte_eth_rxseg *rx_seg,
>> +			     uint16_t n_seg, uint32_t *mbp_buf_size,
>> +			     const struct rte_eth_dev_info *dev_info)
>> +{
>> +	const struct rte_eth_rxseg_capa *seg_capa = &dev_info->rx_seg_capa;
>> +	struct rte_mempool *mp_first;
>> +	uint32_t offset_mask;
>> +	uint16_t seg_idx;
>> +
>> +	if (n_seg > seg_capa->max_seg) {
>> +		RTE_ETHDEV_LOG(ERR,
>> +			       "Requested Rx segments %u exceed supported %u\n",
>> +			       n_seg, seg_capa->max_seg);
>> +		return -EINVAL;
>> +	}
>> +	/*
>> +	 * Check the sizes and offsets against buffer sizes
>> +	 * for each segment specified in extended configuration.
>> +	 */
>> +	mp_first = rx_seg[0].conf.split.mp;
>> +	offset_mask = (1u << seg_capa->offset_align_log2) - 1;
>> +	for (seg_idx = 0; seg_idx < n_seg; seg_idx++) {
>> +		struct rte_mempool *mpl = rx_seg[seg_idx].conf.split.mp;
>> +		uint32_t length = rx_seg[seg_idx].conf.split.length;
>> +		uint32_t offset = rx_seg[seg_idx].conf.split.offset;
>> +
>> +		if (mpl == NULL) {
>> +			RTE_ETHDEV_LOG(ERR, "null mempool pointer\n");
>> +			return -EINVAL;
>> +		}
>> +		if (seg_idx != 0 && mp_first != mpl &&
>> +		    seg_capa->multi_pools == 0) {
>> +			RTE_ETHDEV_LOG(ERR, "Receiving to multiple pools is not supported\n");
>> +			return -ENOTSUP;
>> +		}
>> +		if (offset != 0) {
>> +			if (seg_capa->offset_allowed == 0) {
>> +				RTE_ETHDEV_LOG(ERR, "Rx segmentation with offset is not supported\n");
>> +				return -ENOTSUP;
>> +			}
>> +			if (offset & offset_mask) {
>> +				RTE_ETHDEV_LOG(ERR, "Rx segmentation invalid offset alignment %u, %u\n",
>> +					       offset,
>> +					       seg_capa->offset_align_log2);
>> +				return -EINVAL;
>> +			}
>> +		}
>> +		if (mpl->private_data_size <
>> +		    sizeof(struct rte_pktmbuf_pool_private)) {
>> +			RTE_ETHDEV_LOG(ERR,
>> +				       "%s private_data_size %u < %u\n",
>> +				       mpl->name, mpl->private_data_size,
>> +				       (unsigned int)sizeof
>> +					(struct rte_pktmbuf_pool_private));
>> +			return -ENOSPC;
>> +		}
>> +		offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM;
>> +		*mbp_buf_size = rte_pktmbuf_data_room_size(mpl);
>> +		length = length != 0 ? length : *mbp_buf_size;
>> +		if (*mbp_buf_size < length + offset) {
>> +			RTE_ETHDEV_LOG(ERR,
>> +				       "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n",
>> +				       mpl->name, *mbp_buf_size,
>> +				       length + offset, length, offset);
>> +			return -EINVAL;
>> +		}
>> +	}
>> +	return 0;
>> +}
>> +
>>  int
>>  rte_eth_rx_queue_setup(uint16_t port_id, uint16_t rx_queue_id,
>>  		       uint16_t nb_rx_desc, unsigned int socket_id,
>> @@ -1784,38 +1860,64 @@ struct rte_eth_dev *
>>  		return -EINVAL;
>>  	}
>>
>> -	if (mp == NULL) {
>> -		RTE_ETHDEV_LOG(ERR, "Invalid null mempool pointer\n");
>> -		return -EINVAL;
>> -	}
>> -
>>  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_setup, -ENOTSUP);
>>
>> -	/*
>> -	 * Check the size of the mbuf data buffer.
>> -	 * This value must be provided in the private data of the memory pool.
>> -	 * First check that the memory pool has a valid private data.
>> -	 */
>>  	ret = rte_eth_dev_info_get(port_id, &dev_info);
>>  	if (ret != 0)
>>  		return ret;
>>
>> -	if (mp->private_data_size < sizeof(struct rte_pktmbuf_pool_private)) {
>> -		RTE_ETHDEV_LOG(ERR, "%s private_data_size %d < %d\n",
>> -			mp->name, (int)mp->private_data_size,
>> -			(int)sizeof(struct rte_pktmbuf_pool_private));
>> -		return -ENOSPC;
>> -	}
>> -	mbp_buf_size = rte_pktmbuf_data_room_size(mp);
>> +	if (mp != NULL) {
>> +		/* Single pool configuration check. */
>> +		if (rx_conf->rx_nseg != 0) {
>> +			RTE_ETHDEV_LOG(ERR,
>> +				       "Ambiguous segment configuration\n");
>> +			return -EINVAL;
>> +		}
>> +		/*
>> +		 * Check the size of the mbuf data buffer; this value
>> +		 * must be provided in the private data of the memory pool.
>> +		 * First check that the memory pool has valid private data.
>> +		 */
>> +		if (mp->private_data_size <
>> +		    sizeof(struct rte_pktmbuf_pool_private)) {
>> +			RTE_ETHDEV_LOG(ERR, "%s private_data_size %u < %u\n",
>> +				       mp->name, mp->private_data_size,
>> +				       (unsigned int)
>> +				       sizeof(struct rte_pktmbuf_pool_private));
>> +			return -ENOSPC;
>> +		}
>> +		mbp_buf_size = rte_pktmbuf_data_room_size(mp);
>> +		if (mbp_buf_size < dev_info.min_rx_bufsize +
>> +		    RTE_PKTMBUF_HEADROOM) {
>> +			RTE_ETHDEV_LOG(ERR,
>> +				       "%s mbuf_data_room_size %u < %u (RTE_PKTMBUF_HEADROOM=%u + min_rx_bufsize(dev)=%u)\n",
>> +				       mp->name, mbp_buf_size,
>> +				       RTE_PKTMBUF_HEADROOM +
>> +				       dev_info.min_rx_bufsize,
>> +				       RTE_PKTMBUF_HEADROOM,
>> +				       dev_info.min_rx_bufsize);
>> +			return -EINVAL;
>> +		}
>> +	} else {
>> +		const struct rte_eth_rxseg *rx_seg = rx_conf->rx_seg;
>> +		uint16_t n_seg = rx_conf->rx_nseg;
>>
>> -	if (mbp_buf_size < dev_info.min_rx_bufsize + RTE_PKTMBUF_HEADROOM) {
>> -		RTE_ETHDEV_LOG(ERR,
>> -			"%s mbuf_data_room_size %d < %d (RTE_PKTMBUF_HEADROOM=%d + min_rx_bufsize(dev)=%d)\n",
>> -			mp->name, (int)mbp_buf_size,
>> -			(int)(RTE_PKTMBUF_HEADROOM + dev_info.min_rx_bufsize),
>> -			(int)RTE_PKTMBUF_HEADROOM,
>> -			(int)dev_info.min_rx_bufsize);
>> -		return -EINVAL;
>> +		/* Extended multi-segment configuration check. */
>> +		if (rx_conf->rx_seg == NULL || rx_conf->rx_nseg == 0) {
>> +			RTE_ETHDEV_LOG(ERR,
>> +				       "Memory pool is null and no extended configuration provided\n");
>> +			return -EINVAL;
>> +		}
>> +		if (rx_conf->offloads & RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT) {
>> +			ret = rte_eth_rx_queue_check_split(rx_seg, n_seg,
>> +							   &mbp_buf_size,
>> +							   &dev_info);
>> +			if (ret != 0)
>> +				return ret;
>> +		} else {
>> +			RTE_ETHDEV_LOG(ERR, "No Rx segmentation offload configured\n");
>> +			return -EINVAL;
>> +		}
>>  	}
>>
>>  	/* Use default specified by driver, if nb_rx_desc is zero */
>> diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h
>> index 5bcfbb8..6f18bec 100644
>> --- a/lib/librte_ethdev/rte_ethdev.h
>> +++ b/lib/librte_ethdev/rte_ethdev.h
>> @@ -970,6 +970,27 @@ struct rte_eth_txmode {
>>  };
>>
>>  /**
>> + * A structure used to configure an Rx packet segment to split.
>> + */
>> +struct rte_eth_rxseg_split {
>> +	struct rte_mempool *mp; /**< Memory pool to allocate segment from. */
>> +	uint16_t length; /**< Segment data length, configures split point. */
>> +	uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */
>> +	uint32_t reserved; /**< Reserved field. */
>> +};
>> +
>> +/**
>> + * A common structure used to describe Rx packet segment properties.
>> + */
>> +struct rte_eth_rxseg {
>> +	union {
>
> Why not just 'union rte_eth_rxseg'?
>
>> +		/* The settings for buffer split offload. */
>> +		struct rte_eth_rxseg_split split;
>
> A pointer to the split table must be here, i.e.
>     struct rte_eth_rxseg_split *split;
> Also it must be specified how the array is terminated:
> we need either a count or a defined last-item condition
> (mp == NULL?)
>
>> +		/* The other features settings should be added here. */
>> +	} conf;
>> +};
>
>> +
>> +/**
>>  * A structure used to configure an RX ring of an Ethernet port.
>>  */
>>  struct rte_eth_rxconf {
>> @@ -977,6 +998,46 @@ struct rte_eth_rxconf {
>>  	uint16_t rx_free_thresh; /**< Drives the freeing of RX descriptors. */
>>  	uint8_t rx_drop_en; /**< Drop packets if no descriptors are available. */
>>  	uint8_t rx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
>> +	uint16_t rx_nseg; /**< Number of descriptions in rx_seg array. */
>> +	/**
>> +	 * Points to the array of segment descriptions. Each array element
>> +	 * describes the properties of one segment in the receiving
>> +	 * buffer according to the feature-describing structure.
>> +	 *
>> +	 * The supported capabilities of receiving segmentation are reported
>> +	 * in the rte_eth_dev_info->rx_seg_capa field.
>> +	 *
>> +	 * If the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag is set in the offloads
>> +	 * field, the PMD will split the received packets into multiple
>> +	 * segments according to the specification in the description array:
>> +	 *
>> +	 * - the first network buffer will be allocated from the memory pool
>> +	 *   specified in the first array element, the second buffer from the
>> +	 *   pool in the second element, and so on.
>> +	 *
>> +	 * - the offsets from the segment description elements specify
>> +	 *   the data offset from the buffer beginning, except for the first
>> +	 *   mbuf. For that one, the offset is added to RTE_PKTMBUF_HEADROOM.
>> +	 *
>> +	 * - the lengths in the elements define the maximal data amount
>> +	 *   being received in each segment. Receiving starts by filling
>> +	 *   the first mbuf data buffer up to the specified length. If
>> +	 *   there is data remaining (the packet is longer than the buffer
>> +	 *   in the first mbuf), the following data will be pushed to the
>> +	 *   next segment up to its own length, and so on.
>> +	 *
>> +	 * - if the length in a segment description element is zero,
>> +	 *   the actual buffer size will be deduced from the appropriate
>> +	 *   memory pool properties.
>> +	 *
>> +	 * - if there are not enough elements to describe the buffer for an
>> +	 *   entire packet of maximal length, the following parameters will
>> +	 *   be used for all remaining segments:
>> +	 *   - pool from the last valid element
>> +	 *   - the buffer size from this pool
>> +	 *   - zero offset
>> +	 */
>> +	struct rte_eth_rxseg *rx_seg;
>
> It must not be a pointer. It looks really strange this way,
> taking into account that it is a union in fact.
> Also, why is it put here in the middle of the existing structure?
> IMHO it should be added after offloads.
>
>>  	/**
>>  	 * Per-queue Rx offloads to be set using DEV_RX_OFFLOAD_* flags.
>>  	 * Only offloads set on rx_queue_offload_capa or rx_offload_capa
>> @@ -1260,6 +1321,7 @@ struct rte_eth_conf {
>>  #define DEV_RX_OFFLOAD_SCTP_CKSUM	0x00020000
>>  #define DEV_RX_OFFLOAD_OUTER_UDP_CKSUM	0x00040000
>>  #define DEV_RX_OFFLOAD_RSS_HASH		0x00080000
>> +#define RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT	0x00100000
>>
>>  #define DEV_RX_OFFLOAD_CHECKSUM (DEV_RX_OFFLOAD_IPV4_CKSUM | \
>>  				 DEV_RX_OFFLOAD_UDP_CKSUM | \
>> @@ -1376,6 +1438,17 @@ struct rte_eth_switch_info {
>>  };
>>
>>  /**
>> + * Ethernet device Rx buffer segmentation capabilities.
>> + */
>> +__extension__
>> +struct rte_eth_rxseg_capa {
>> +	uint16_t max_seg; /**< Maximum amount of segments to split. */
>
> Maybe 'max_segs' to avoid confusion vs. maximum segment length.
>
>> +	uint16_t multi_pools:1; /**< Supports receiving to multiple pools.*/
>> +	uint16_t offset_allowed:1; /**< Supports buffer offsets. */
>> +	uint16_t offset_align_log2:4; /**< Required offset alignment. */
>
> 4 bits are even insufficient to specify cache-line alignment.
> IMHO at least 8 bits are required.
>
> Consider putting 32-bit-wide bit-fields at the start of the structure:
> max_segs (16), offset_align_log2 (8), plus reserved (8).
>
>> +};
>> +
>> +/**
>>  * Ethernet device information
>>  */
>>
>> @@ -1403,6 +1476,7 @@ struct rte_eth_dev_info {
>>  	/** Maximum number of hash MAC addresses for MTA and UTA. */
>>  	uint16_t max_vfs; /**< Maximum number of VFs. */
>>  	uint16_t max_vmdq_pools; /**< Maximum number of VMDq pools. */
>> +	struct rte_eth_rxseg_capa rx_seg_capa; /**< Segmentation capability.*/
>
> Why is it put in the middle of the existing structure?
>
>>  	uint64_t rx_offload_capa;
>>  	/**< All RX offload capabilities including all per-queue ones */
>>  	uint64_t tx_offload_capa;
>> @@ -2027,9 +2101,21 @@ int rte_eth_dev_configure(uint16_t port_id, uint16_t nb_rx_queue,
>>  * No need to repeat any bit in rx_conf->offloads which has already been
>>  * enabled in rte_eth_dev_configure() at port level. An offloading enabled
>>  * at port level can't be disabled at queue level.
>> + * The configuration structure also contains the pointer to the array
>> + * of receiving buffer segment descriptions, see the rx_seg and rx_nseg
>> + * fields; this extended configuration might be used by split offloads like
>> + * RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT. If mb_pool is not NULL,
>> + * the extended configuration fields must be set to NULL and zero.
>>  * @param mb_pool
>>  *   The pointer to the memory pool from which to allocate *rte_mbuf* network
>> - *   memory buffers to populate each descriptor of the receive ring.
>> + *   memory buffers to populate each descriptor of the receive ring. There are
>> + *   two options to provide the Rx buffer configuration:
>> + *   - single pool:
>> + *     mb_pool is not NULL, rx_conf.rx_seg is NULL, rx_conf.rx_nseg is 0.
>> + *   - multiple segments description:
>> + *     mb_pool is NULL, rx_conf.rx_seg is not NULL, rx_conf.rx_nseg is not 0.
>> + *     Taken only if flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is set in offloads.
>> + *
>>  * @return
>>  *   - 0: Success, receive queue correctly set up.
>>  *   - -EIO: if device is removed.
>