From: Thomas Monjalon <thomas@monjalon.net>
To: Viacheslav Ovsiienko <viacheslavo@nvidia.com>
Cc: dev@dpdk.org, stephen@networkplumber.org, ferruh.yigit@intel.com,
 olivier.matz@6wind.com, jerinjacobk@gmail.com, maxime.coquelin@redhat.com,
 david.marchand@redhat.com, arybchenko@solarflare.com
Date: Mon, 12 Oct 2020 00:17:42 +0200
Message-ID: <5451712.xXqzTqBGid@thomas>
In-Reply-To: <1602083215-22921-2-git-send-email-viacheslavo@nvidia.com>
References: <MWHPR12MB136076E652230CEBD6EE6562DF5F0@MWHPR12MB1360.namprd12.prod.outlook.com>
 <1602083215-22921-1-git-send-email-viacheslavo@nvidia.com>
 <1602083215-22921-2-git-send-email-viacheslavo@nvidia.com>
Subject: Re: [dpdk-dev] [PATCH v2 1/9] ethdev: introduce Rx buffer split

07/10/2020 17:06, Viacheslav Ovsiienko:
> The DPDK datapath in the transmit direction is very flexible.
> An application can build a multi-segment packet and manage
> almost all data aspects - the memory pools where the segments
> are allocated from, the segment lengths, the memory attributes
> like external buffers registered for DMA, etc.
> 
> In the receive direction, the datapath is much less flexible:
> an application can only specify the memory pool to configure the
> receive queue and nothing more. In order to extend the receive
> datapath capabilities, it is proposed to add a way to provide
> extended information on how to split the packets being received.
> 
> The following structure is introduced to specify the Rx packet
> segment:
> 
> struct rte_eth_rxseg {
>     struct rte_mempool *mp; /* memory pool to allocate segment from */
>     uint16_t length; /* segment maximal data length */

The "length" parameter is configuring a split point.
Worth to note in the comment I think.

>     uint16_t offset; /* data offset from beginning of mbuf data buffer */

Is it replacing RTE_PKTMBUF_HEADROOM?

>     uint32_t reserved; /* reserved field */
> };
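
For instance, the field comments could read something like this
(just a sketch, the exact wording is up to you):

struct rte_eth_rxseg {
	struct rte_mempool *mp; /**< Memory pool to allocate the segment from. */
	uint16_t length; /**< Maximal data length, i.e. the split point. */
	uint16_t offset; /**< Data offset from the beginning of the mbuf data buffer. */
	uint32_t reserved; /**< Reserved field, must be 0. */
};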
> 
> The new routine rte_eth_rx_queue_setup_ex() is introduced to
> setup the given Rx queue using the new extended Rx packet segment
> description:
> 
> int
> rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
>                           uint16_t nb_rx_desc, unsigned int socket_id,
>                           const struct rte_eth_rxconf *rx_conf,
> 		          const struct rte_eth_rxseg *rx_seg,
>                           uint16_t n_seg)

An alternative name for this function:
	rte_eth_rxseg_queue_setup

> This routine introduces two new parameters:
>     rx_seg - pointer to the array of segment descriptions; each element
>              describes the memory pool, maximal data length, and initial
>              data offset from the beginning of the data buffer in the mbuf
>     n_seg - number of elements in the array

Not clear why we need an array.
I suggest writing here that each segment of the same packet
can have different properties, the array representing the full packet.
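
To make the intent concrete, here is how I read the proposal from the
application side (just a sketch; hdr_pool, data_pool, rx_conf and
nb_rx_desc are assumed to be prepared by the application):

	struct rte_eth_rxseg seg[2];

	/* Header part: up to 14 bytes into mbufs from hdr_pool. */
	seg[0].mp = hdr_pool;
	seg[0].length = 14;
	seg[0].offset = RTE_PKTMBUF_HEADROOM;
	seg[0].reserved = 0;

	/* Rest of the packet into mbufs from data_pool. */
	seg[1].mp = data_pool;
	seg[1].length = 2048;
	seg[1].offset = 0;
	seg[1].reserved = 0;

	ret = rte_eth_rx_queue_setup_ex(port_id, 0, nb_rx_desc, socket_id,
					&rx_conf, seg, 2);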

> The new offload flag DEV_RX_OFFLOAD_BUFFER_SPLIT in device

The name should start with RTE_ prefix.

> capabilities is introduced to present the way for a PMD to report to
> the application its support for splitting Rx packets into configurable
> segments. Prior to invoking the rte_eth_rx_queue_setup_ex() routine,
> the application should check the DEV_RX_OFFLOAD_BUFFER_SPLIT flag.
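
i.e. something along these lines before calling the new setup routine
(sketch only, using the flag name from this patch):

	struct rte_eth_dev_info dev_info;

	rte_eth_dev_info_get(port_id, &dev_info);
	if ((dev_info.rx_offload_capa & DEV_RX_OFFLOAD_BUFFER_SPLIT) == 0) {
		/* Buffer split not supported, fall back to
		 * the regular rte_eth_rx_queue_setup().
		 */
	}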
> 
> If the Rx queue is configured with the new routine, the packets being
> received will be split into multiple segments pushed to the mbufs
> with the specified attributes. The PMD will allocate the first mbuf
> from the pool specified in the first segment descriptor and put
> the data starting at the specified offset into the allocated mbuf
> data buffer. If the packet length exceeds the specified segment
> length, the next mbuf will be allocated according to the next segment
> descriptor (if any) and data will be put in its data buffer at the
> specified offset, not exceeding the specified length. If there is
> no next descriptor, the next mbuf will be allocated and filled in the
> same way (from the same pool and with the same buffer offset/length)
> as the current one.
> 
> For example, let's suppose we configured the Rx queue with the
> following segments:
>     seg0 - pool0, len0=14B, off0=RTE_PKTMBUF_HEADROOM
>     seg1 - pool1, len1=20B, off1=0B
>     seg2 - pool2, len2=20B, off2=0B
>     seg3 - pool3, len3=512B, off3=0B
> 
> A packet 46 bytes long will look like the following:
>     seg0 - 14B long @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
>     seg1 - 20B long @ 0 in mbuf from pool1
>     seg2 - 12B long @ 0 in mbuf from pool2
> 
> A packet 1500 bytes long will look like the following:
>     seg0 - 14B @ RTE_PKTMBUF_HEADROOM in mbuf from pool0
>     seg1 - 20B @ 0 in mbuf from pool1
>     seg2 - 20B @ 0 in mbuf from pool2
>     seg3 - 512B @ 0 in mbuf from pool3
>     seg4 - 512B @ 0 in mbuf from pool3
>     seg5 - 422B @ 0 in mbuf from pool3
> 
> The offload DEV_RX_OFFLOAD_SCATTER must be present and
> configured to support the new buffer split feature (if n_seg
> is greater than one).
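
If I read it correctly, the application then receives ordinary chained
mbufs and walks the segments as usual, e.g. (sketch only, process() is
a placeholder for the application handler):

	struct rte_mbuf *pkts[32];
	struct rte_mbuf *seg;
	uint16_t nb, i;

	nb = rte_eth_rx_burst(port_id, 0, pkts, 32);
	for (i = 0; i < nb; i++) {
		/* seg0 holds the header, the following segments the data. */
		for (seg = pkts[i]; seg != NULL; seg = seg->next)
			process(rte_pktmbuf_mtod(seg, void *), seg->data_len);
		rte_pktmbuf_free(pkts[i]); /* frees the whole chain */
	}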
> 
> The new approach would allow splitting the ingress packets into
> multiple parts pushed to memory with different attributes.
> For example, the packet headers can be pushed to the embedded
> data buffers within mbufs and the application data into
> external buffers attached to mbufs allocated from
> different memory pools. The memory attributes of the split
> parts may differ as well - for example, the application data
> may be pushed into external memory located on a dedicated
> physical device, say a GPU or NVMe. This would improve the
> flexibility of the DPDK receive datapath while preserving
> compatibility with the existing API.
> 
> Also, the proposed segment description might be used to specify
> Rx packet split for some other features. For example, it could
> provide a way to specify the extra memory pool for the Header Split
> feature of some Intel PMDs.

I don't understand what you are referring to in this last paragraph.
I think the explanation above is enough to demonstrate the flexibility.

> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@nvidia.com>

Thank you, I like this feature.
More minor comments below.

[...]
> +* **Introduced extended buffer description for receiving.**

Rewording:
	Introduced extended setup of Rx queue

> +  * Added extended Rx queue setup routine
> +  * Added description for Rx segment sizes

not only "sizes", but also offset and mempool.

> +  * Added capability to specify the memory pool for each segment

This one can be merged with the above, or offset should be added.

[...]
The doxygen comment is missing here.
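Something like this, for instance (just a sketch, the wording is only
a suggestion):

/**
 * Allocate and set up a receive queue for an Ethernet device,
 * splitting the received packets into the segments described
 * by the rx_seg array.
 *
 * @param port_id the port identifier of the Ethernet device
 * @param rx_queue_id the index of the receive queue
 * @param nb_rx_desc the number of receive descriptors to allocate
 * @param socket_id the NUMA socket to allocate the memory from
 * @param rx_conf the pointer to the configuration data of the queue
 * @param rx_seg the array of segment descriptions
 * @param n_seg the number of elements in the rx_seg array
 * @return 0 on success, negative errno value on failure
 */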

> +int rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
> +		uint16_t nb_rx_desc, unsigned int socket_id,
> +		const struct rte_eth_rxconf *rx_conf,
> +		const struct rte_eth_rxseg *rx_seg, uint16_t n_seg);

This new function should be experimental and it should be added to the .map file.
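i.e. roughly (sketch only; the symbol goes into the existing
EXPERIMENTAL block of rte_ethdev_version.map):

	/* rte_ethdev.h */
	__rte_experimental
	int rte_eth_rx_queue_setup_ex(uint16_t port_id, uint16_t rx_queue_id,
			uint16_t nb_rx_desc, unsigned int socket_id,
			const struct rte_eth_rxconf *rx_conf,
			const struct rte_eth_rxseg *rx_seg, uint16_t n_seg);

	/* rte_ethdev_version.map */
	EXPERIMENTAL {
		global:

		rte_eth_rx_queue_setup_ex;
	};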