DPDK patches and discussions
From: Slava Ovsiienko <viacheslavo@mellanox.com>
To: Andrew Rybchenko <arybchenko@solarflare.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: Matan Azrad <matan@mellanox.com>,
	Raslan Darawsheh <rasland@mellanox.com>,
	 Thomas Monjalon <thomas@monjalon.net>,
	"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
	"jerinjacobk@gmail.com" <jerinjacobk@gmail.com>,
	"stephen@networkplumber.org" <stephen@networkplumber.org>,
	"ajit.khaparde@broadcom.com" <ajit.khaparde@broadcom.com>,
	"maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>
Subject: Re: [dpdk-dev] [PATCH] doc: announce changes to ethdev rxconf structure
Date: Mon, 3 Aug 2020 16:51:00 +0000
Message-ID: <AM4PR05MB32653565BDFF6FF0AD2F9CB3D24D0@AM4PR05MB3265.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <79886244-1390-6c99-287d-1d868bb4090a@solarflare.com>

> -----Original Message-----
> From: Andrew Rybchenko <arybchenko@solarflare.com>
> Sent: Monday, August 3, 2020 18:31
> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> ferruh.yigit@intel.com; jerinjacobk@gmail.com;
> stephen@networkplumber.org; ajit.khaparde@broadcom.com;
> maxime.coquelin@redhat.com; olivier.matz@6wind.com;
> david.marchand@redhat.com
> Subject: Re: [PATCH] doc: announce changes to ethdev rxconf structure
> 
> Hi Slava,
> 
> On 8/3/20 6:18 PM, Slava Ovsiienko wrote:
> > Hi, Andrew
> >
> > Thanks for the comment, please, see below.
> >
> >> -----Original Message-----
> >> From: Andrew Rybchenko <arybchenko@solarflare.com>
> >> Sent: Monday, August 3, 2020 17:31
> >> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
> >> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
> >> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
> >> ferruh.yigit@intel.com; jerinjacobk@gmail.com;
> >> stephen@networkplumber.org; ajit.khaparde@broadcom.com;
> >> maxime.coquelin@redhat.com; olivier.matz@6wind.com;
> >> david.marchand@redhat.com
> >> Subject: Re: ***Spam*** [PATCH] doc: announce changes to ethdev
> >> rxconf structure
> >>
> >> On 8/3/20 1:58 PM, Viacheslav Ovsiienko wrote:
> >>> The DPDK datapath in the transmit direction is very flexible.
> >>> The applications can build multisegment packets and manage almost
> >>> all data aspects - the memory pools where segments are allocated
> >>> from, the segment lengths, the memory attributes like external,
> >>> registered, etc.
> >>>
> >>> In the receiving direction, the datapath is much less flexible: the
> >>> applications can only specify the memory pool to configure the
> >>> receiving queue and nothing more. In order to extend the receiving
> >>> datapath capabilities it is proposed to add new fields to the
> >>> rte_eth_rxconf structure:
> >>>
> >>> struct rte_eth_rxconf {
> >>>     ...
> >>>     uint16_t rx_split_num; /* number of segments to split */
> >>>     uint16_t *rx_split_len; /* array of segment lengths */
> >>>     struct rte_mempool **mp; /* array of segment memory pools */
> >>>     ...
> >>> };
> >>>
> >>> A non-zero value of the rx_split_num field configures the receiving
> >>> queue to split ingress packets into multiple segments in mbufs
> >>> allocated from various memory pools according to the specified
> >>> lengths. A zero value of the rx_split_num field provides backward
> >>> compatibility, and the queue is configured in the regular way (with
> >>> single/multiple mbufs of the same data buffer length allocated from
> >>> a single memory pool).
> >>
> >> From the above description it is not 100% clear how it will coexist with:
> >>  - existing mb_pool argument of the rte_eth_rx_queue_setup()
> >>  - DEV_RX_OFFLOAD_SCATTER
> >
> > DEV_RX_OFFLOAD_SCATTER flag is required to be reported and configured
> > for the new feature to indicate the application is prepared for the
> > multisegment packets.
> 
> I hope it will be mentioned in the feature documentation in the future, but
> I'm not 100% sure that it is required. See below.
I suppose there is a hierarchy:
- the application configures DEV_RX_OFFLOAD_SCATTER on the port and tells in this way:
"Hey, driver, I'm ready to handle multi-segment packets". Readiness in general.
- the application configures BUFFER_SPLIT and tells the PMD _HOW_ it wants to split, in a particular way:
"Hey, driver, please, drop ten bytes here, here and here, and the rest - over there"


> >
> > But SCATTER just tells that the ingress packet length can exceed the
> > mbuf data buffer length and a chain of mbufs must be built to store
> > the entire packet. But there is a limitation - all mbufs are
> > allocated from the same memory pool, and all data buffers have the
> > same length.
> > The new feature provides an opportunity to allocate mbufs from the
> > desired pools and to specify the length of each buffer/part.
> 
> Yes, it is clear, but what happens if a packet does not fit into the provided
> pool chain? Is the last one used many times? Maybe it is logical to use the
> Rx queue setup mb_pool as well for the purpose? I.e. use the pools suggested
> here only once and use mb_pool many times for the rest if SCATTER is
> supported, and only once if SCATTER is not supported.

It could be easily configured without involving the SCATTER flag - just specify the last pool
multiple times. I.e.
pool 0 - 14B
pool 1 - 20B
...
pool N - 512B
pool N - 512B
pool N - 512B, sum of lengths >= max packet size 1518

It was supposed that the sum of lengths in the array covers the maximal packet size.
Currently there is a limitation on packet size; for example, the mlx5 PMD
just drops packets whose length exceeds the one the queue is configured for.

> 
> >
> >>  - DEV_RX_OFFLOAD_HEADER_SPLIT
> > The new feature (let's name it "BUFFER_SPLIT") might be supported
> > in conjunction with HEADER_SPLIT (say, split the rest of the data
> > after the header) or rejected if HEADER_SPLIT is configured on the
> > port, depending on the PMD implementation (return ENOTSUP if both
> > features are requested on the same port).
> 
> OK, consider to make SCATTER and BUFFER_SPLIT independent as suggested
> above.
Sorry, do you mean HEADER_SPLIT and BUFFER_SPLIT?

> 
> >
> >> How will application know that the feature is supported? Limitations?
> > It is a subject for further discussion; I see two options:
> >  - introduce the DEV_RX_OFFLOAD_BUFFER_SPLIT flag
> 
> +1
OK, got it.

> 
> > - return ENOTSUP/EINVAL from rx_queue_setup() if feature is requested
> >   (mp parameter is supposed to be NULL for the case)
> 
> I'd say that it should be used only for corner cases which are hard to
> formalize. It could be important to know the maximum number of buffers to
> split, the total length which could be split from the remaining data, and
> the limitations on split lengths.
Agree, the dedicated OFFLOAD flag seems to be preferable.

With best regards, Slava

> 
> >
> >> Is it always split by specified/fixed length?
> > Yes, it is a simple feature: it splits the data into the buffers with
> > the required memory attributes provided by the specified pools,
> > according to the fixed lengths.
> > It should be OK for protocols like eCPRI or some tunneling.
> 
> I see. Thanks.
> 
> >
> >> What happens if header length is actually different?
> > It is a per-queue configuration; packets might be sorted between the
> > queues with the rte_flow engine.
> > The supposed use case is to filter out specific protocol packets (say
> > eCPRI with a fixed header length) and split them on a specific Rx queue.
> 
> Got it.
> 
> Thanks,
> Andrew.
> 
> >
> >
> > With best regards,
> > Slava
> >
> >>
> >>> The new approach would allow splitting the ingress packets into
> >>> multiple parts pushed to the memory with different attributes.
> >>> For example, the packet headers can be pushed to the embedded data
> >>> buffers within mbufs and the application data into the external
> >>> buffers attached to mbufs allocated from the different memory pools.
> >>> The memory attributes for the split parts may differ as well - for
> >>> example, the application data may be pushed into external memory
> >>> located on a dedicated physical device, say a GPU or NVMe. This
> >>> would improve the DPDK receiving datapath flexibility while
> >>> preserving compatibility with the existing API.
> >>>
> >>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
> >>> ---
> >>>  doc/guides/rel_notes/deprecation.rst | 5 +++++
> >>>  1 file changed, 5 insertions(+)
> >>>
> >>> diff --git a/doc/guides/rel_notes/deprecation.rst
> >>> b/doc/guides/rel_notes/deprecation.rst
> >>> index ea4cfa7..cd700ae 100644
> >>> --- a/doc/guides/rel_notes/deprecation.rst
> >>> +++ b/doc/guides/rel_notes/deprecation.rst
> >>> @@ -99,6 +99,11 @@ Deprecation Notices
> >>>    In 19.11 PMDs will still update the field even when the offload is not
> >>>    enabled.
> >>>
> >>> +* ethdev: add new fields to ``rte_eth_rxconf`` to configure the receiving
> >>> +  queues to split ingress packets into multiple segments according to the
> >>> +  specified lengths into the buffers allocated from the specified
> >>> +  memory pools. The backward compatibility to existing API is preserved.
> >>> +
> >>>  * ethdev: ``rx_descriptor_done`` dev_ops and ``rte_eth_rx_descriptor_done``
> >>>    will be deprecated in 20.11 and will be removed in 21.11.
> >>>    Existing ``rte_eth_rx_descriptor_status`` and
> >>>    ``rte_eth_tx_descriptor_status``
> >


Thread overview: 24+ messages
2020-08-03 15:18 Slava Ovsiienko
2020-08-03 15:31 ` Andrew Rybchenko
2020-08-03 16:51   ` Slava Ovsiienko [this message]
2020-08-30 12:58     ` Andrew Rybchenko
2020-08-30 18:26       ` Stephen Hemminger
2020-08-31  6:35         ` Andrew Rybchenko
2020-08-31 16:59           ` Stephen Hemminger
  -- strict thread matches above, loose matches on Subject: below --
2020-08-03 10:58 Viacheslav Ovsiienko
2020-08-03 11:56 ` Jerin Jacob
2020-08-03 13:06   ` Slava Ovsiienko
2020-08-04 13:32     ` Jerin Jacob
2020-08-05  6:35       ` Slava Ovsiienko
2020-08-06 15:58       ` Ferruh Yigit
2020-08-06 16:25         ` Stephen Hemminger
2020-08-06 16:41           ` Jerin Jacob
2020-08-06 17:03           ` Slava Ovsiienko
2020-08-06 18:10             ` Stephen Hemminger
2020-08-07 11:23               ` Slava Ovsiienko
2020-08-03 14:31 ` [dpdk-dev] ***Spam*** " Andrew Rybchenko
2020-08-06 16:15   ` [dpdk-dev] " Ferruh Yigit
2020-08-06 16:29     ` Slava Ovsiienko
2020-08-06 16:37       ` Ferruh Yigit
2020-08-06 16:39         ` Slava Ovsiienko
2020-08-06 16:43           ` Ferruh Yigit
2020-08-06 16:48             ` Slava Ovsiienko
