Subject: Re: [dpdk-dev] [PATCH v8 1/6] ethdev: introduce Rx buffer split
From: Andrew Rybchenko
Organization: OKTET Labs
To: Ferruh Yigit, Viacheslav Ovsiienko, dev@dpdk.org
Cc: thomas@monjalon.net, stephen@networkplumber.org, olivier.matz@6wind.com,
 jerinjacobk@gmail.com, maxime.coquelin@redhat.com, david.marchand@redhat.com,
 arybchenko@solarflare.com
Date: Fri, 16 Oct 2020 12:21:37 +0300
Message-ID: <36127641-1ad6-9d09-b0e8-9ab708ab4e4d@oktetlabs.ru>
In-Reply-To: <3e1da2f4-074d-9d03-c44f-435498eedaa6@intel.com>

On 10/16/20 12:19 PM, Ferruh Yigit wrote:
> On 10/16/2020 8:48 AM, Viacheslav Ovsiienko wrote:
>> The DPDK datapath in the transmit direction is very flexible.
>> An application can build multi-segment packets and manage almost
>> all data aspects - the memory pools the segments are allocated
>> from, the segment lengths, the memory attributes such as external
>> buffers registered for DMA, etc.
>>
>> In the receive direction, the datapath is much less flexible:
>> an application can only specify the memory pool to configure
>> the receive queue, and nothing more. To extend the receive
>> datapath capabilities, it is proposed to add a way to provide
>> extended information on how to split the packets being received.
>>
>> The new offload flag RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT in the device
>> capabilities is introduced so that a PMD can report to the
>> application that it supports splitting received packets into
>> configurable segments. Before invoking the rte_eth_rx_queue_setup()
>> routine, the application should check the
>> RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag.
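
As an illustration (not from the patch itself), a minimal capability
check along these lines could precede the queue setup; the sketch
assumes a valid, already probed port_id and uses the flag name
introduced by this patch:

/* Sketch only: verify that the PMD reports the Rx buffer split
 * capability before requesting it.
 */
struct rte_eth_dev_info dev_info;
int ret;

ret = rte_eth_dev_info_get(port_id, &dev_info);
if (ret != 0)
        rte_exit(EXIT_FAILURE, "Cannot get device info: %d\n", ret);

if ((dev_info.rx_offload_capa & RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT) == 0)
        printf("Port %u: Rx buffer split is not supported\n", port_id);
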
>>
>> The following structure is introduced to specify the Rx packet
>> segments for the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT offload:
>>
>> struct rte_eth_rxseg_split {
>>     struct rte_mempool *mp; /* memory pool to allocate segment from */
>>     uint16_t length; /* segment maximal data length,
>>                         configures the "split point" */
>>     uint16_t offset; /* data offset from the beginning
>>                         of the mbuf data buffer */
>>     uint32_t reserved; /* reserved field */
>> };
>>
>> The segment descriptions are added to the rte_eth_rxconf structure:
>>    rx_seg - pointer to the array of segment descriptions; each
>>             element describes the memory pool, the maximal data
>>             length and the initial data offset from the beginning
>>             of the data buffer in the mbuf. This array allows the
>>             settings to be specified for each segment individually.
>>    rx_nseg - number of elements in the array
>>
>> If the extended segment descriptions are provided via these new
>> fields, the mp parameter of rte_eth_rx_queue_setup() must be
>> specified as NULL to avoid ambiguity.
>>
>> There are two options to specify the Rx buffer configuration:
>> - mp is not NULL, rx_conf.rx_seg is NULL, rx_conf.rx_nseg is zero:
>>   this is the compatible configuration, it follows the existing
>>   implementation and provides a single pool with no description
>>   of segment sizes and offsets.
>> - mp is NULL, rx_conf.rx_seg is not NULL, rx_conf.rx_nseg is not
>>   zero: this provides the extended configuration, individually
>>   for each segment.
>>
>> If the Rx queue is configured with the new settings, the packets
>> being received will be split into multiple segments pushed to
>> mbufs with the specified attributes. The PMD will split the
>> received packets into multiple segments according to the
>> specification in the description array.
>>
>> For example, let's suppose we configured the Rx queue with the
>> following segments:
>>     seg0 - pool0, len0=14B, off0=2B
>>     seg1 - pool1, len1=20B, off1=128B
>>     seg2 - pool2, len2=20B, off2=0B
>>     seg3 - pool3, len3=512B, off3=0B
>>
>> A packet 46 bytes long will look like the following:
>>     seg0 - 14B long @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>>     seg1 - 20B long @ 128 in mbuf from pool1
>>     seg2 - 12B long @ 0 in mbuf from pool2
>>
>> A packet 1500 bytes long will look like the following:
>>     seg0 - 14B @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0
>>     seg1 - 20B @ 128 in mbuf from pool1
>>     seg2 - 20B @ 0 in mbuf from pool2
>>     seg3 - 512B @ 0 in mbuf from pool3
>>     seg4 - 512B @ 0 in mbuf from pool3
>>     seg5 - 422B @ 0 in mbuf from pool3
>>
>> The RTE_ETH_RX_OFFLOAD_SCATTER offload must be present and
>> configured to support the new buffer split feature (if rx_nseg
>> is greater than one).
>>
>> The split limitations imposed by the underlying PMD are reported
>> in the newly introduced rte_eth_dev_info->rx_seg_capa field.
>>
>> The new approach allows splitting the ingress packets into
>> multiple parts pushed to memory with different attributes.
>> For example, the packet headers can be pushed to the embedded
>> data buffers within mbufs, and the application data into
>> external buffers attached to mbufs allocated from different
>> memory pools. The memory attributes of the split parts may
>> differ as well - for example, the application data may be
>> pushed into external memory located on a dedicated physical
>> device, say a GPU or NVMe. This would improve the flexibility
>> of the DPDK receive datapath while preserving compatibility
>> with the existing API.
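
For reference, a queue configuration matching the example above might
look roughly like the sketch below. This is illustrative only: it uses
the structure and field names from this patch revision, assumes that
pool0..pool3 were created beforehand with rte_pktmbuf_pool_create(),
reuses the dev_info filled earlier, and omits error handling:

/* Per-segment descriptions for the buffer split example above. */
struct rte_eth_rxseg_split rx_segs[] = {
        { .mp = pool0, .length = 14,  .offset = 2   },
        { .mp = pool1, .length = 20,  .offset = 128 },
        { .mp = pool2, .length = 20,  .offset = 0   },
        { .mp = pool3, .length = 512, .offset = 0   },
};
struct rte_eth_rxconf rx_conf = dev_info.default_rxconf;

/* Cast to the generic description type used by this patch revision. */
rx_conf.rx_seg = (struct rte_eth_rxseg *)rx_segs;
rx_conf.rx_nseg = RTE_DIM(rx_segs);
rx_conf.offloads |= RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT |
                    RTE_ETH_RX_OFFLOAD_SCATTER;

/* mp must be NULL when the per-segment descriptions are provided. */
ret = rte_eth_rx_queue_setup(port_id, 0 /* queue */, 512 /* descriptors */,
                             rte_eth_dev_socket_id(port_id),
                             &rx_conf, NULL);

Passing NULL as the mempool argument selects the extended per-segment
configuration described above.
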
>>
>> Signed-off-by: Viacheslav Ovsiienko
>> Acked-by: Ajit Khaparde
>> Acked-by: Jerin Jacob
>
> <...>
>
>> +/**
>>   * A structure used to configure an RX ring of an Ethernet port.
>>   */
>>  struct rte_eth_rxconf {
>> @@ -977,6 +998,46 @@ struct rte_eth_rxconf {
>>      uint16_t rx_free_thresh; /**< Drives the freeing of RX descriptors. */
>>      uint8_t rx_drop_en; /**< Drop packets if no descriptors are available. */
>>      uint8_t rx_deferred_start; /**< Do not start queue with rte_eth_dev_start(). */
>> +    uint16_t rx_nseg; /**< Number of descriptions in the rx_seg array. */
>> +    /**
>> +     * Points to the array of segment descriptions. Each array element
>> +     * describes the properties of a segment in the receive buffer
>> +     * according to the feature-specific description structure.
>> +     *
>> +     * The supported capabilities of receive segmentation are reported
>> +     * in the rte_eth_dev_info->rx_seg_capa field.
>> +     *
>> +     * If the RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT flag is set in the
>> +     * offloads field, the PMD will split the received packets into
>> +     * multiple segments according to the specification in the
>> +     * description array:
>> +     *
>> +     * - the first network buffer will be allocated from the memory
>> +     *   pool specified in the first array element, the second buffer
>> +     *   from the pool in the second element, and so on.
>> +     *
>> +     * - the offsets from the segment description elements specify
>> +     *   the data offset from the beginning of the buffer, except for
>> +     *   the first mbuf, where RTE_PKTMBUF_HEADROOM is added to the
>> +     *   offset.
>> +     *
>> +     * - the lengths in the elements define the maximal amount of data
>> +     *   received into each segment. Receiving starts by filling up the
>> +     *   first mbuf data buffer up to the specified length. If there
>> +     *   are data remaining (the packet is longer than the buffer in
>> +     *   the first mbuf), the following data will be pushed to the next
>> +     *   segment up to its own length, and so on.
>> +     *
>> +     * - if the length in a segment description element is zero,
>> +     *   the actual buffer size will be deduced from the corresponding
>> +     *   memory pool properties.
>> +     *
>> +     * - if there are not enough elements to describe the buffer for
>> +     *   an entire packet of maximal length, the following parameters
>> +     *   will be used for all remaining segments:
>> +     *     - the pool from the last valid element
>> +     *     - the buffer size from this pool
>> +     *     - zero offset
>> +     */
>> +    struct rte_eth_rxseg *rx_seg;
>
> "struct rte_eth_rxconf" is very commonly used (I think all applications
> call 'rte_eth_rx_queue_setup()'), but "buffer split" is not a common
> usage.
>
> I am against the "struct rte_eth_rxseg *rx_seg;" field creating this
> much noise in the "struct rte_eth_rxconf" documentation.
> As mentioned before, can you please move the above detailed
> documentation to where "struct rte_eth_rxseg" is defined, and in this
> struct put a single comment for "struct rte_eth_rxseg *rx_seg"?

+1
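
As a worked illustration of the length semantics described in that
comment, the hypothetical helper below (not part of the patch) prints
how a packet of a given length maps onto the configured segments; the
zero-length (pool-deduced size) case is intentionally left out:

/* Illustration only: segments beyond the last configured element
 * reuse the last element's length, as documented above.
 */
static void
show_split(const uint16_t *seg_len, uint16_t nseg, uint32_t pkt_len)
{
        uint32_t remaining = pkt_len;
        uint16_t i;

        for (i = 0; remaining > 0; i++) {
                uint16_t idx = (i < nseg) ? i : (uint16_t)(nseg - 1);
                uint32_t chunk = RTE_MIN(remaining, (uint32_t)seg_len[idx]);

                printf("seg%u - %uB from the pool of element %u\n",
                       i, chunk, idx);
                remaining -= chunk;
        }
}

With seg_len = {14, 20, 20, 512} this reproduces the 46-byte and
1500-byte layouts from the commit message.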