DPDK patches and discussions
From: Andrew Rybchenko <arybchenko@solarflare.com>
To: Slava Ovsiienko <viacheslavo@mellanox.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: Matan Azrad <matan@mellanox.com>,
	Raslan Darawsheh <rasland@mellanox.com>,
	 Thomas Monjalon <thomas@monjalon.net>,
	"ferruh.yigit@intel.com" <ferruh.yigit@intel.com>,
	"jerinjacobk@gmail.com" <jerinjacobk@gmail.com>,
	"stephen@networkplumber.org" <stephen@networkplumber.org>,
	"ajit.khaparde@broadcom.com" <ajit.khaparde@broadcom.com>,
	"maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>
Subject: Re: [dpdk-dev] [PATCH] doc: announce changes to ethdev rxconf structure
Date: Sun, 30 Aug 2020 15:58:57 +0300
Message-ID: <abe27587-0ca4-a5c3-d96b-88640837b0e9@solarflare.com> (raw)
In-Reply-To: <AM4PR05MB32653565BDFF6FF0AD2F9CB3D24D0@AM4PR05MB3265.eurprd05.prod.outlook.com>

On 8/3/20 7:51 PM, Slava Ovsiienko wrote:
>> -----Original Message-----
>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>> Sent: Monday, August 3, 2020 18:31
>> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
>> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
>> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
>> ferruh.yigit@intel.com; jerinjacobk@gmail.com;
>> stephen@networkplumber.org; ajit.khaparde@broadcom.com;
>> maxime.coquelin@redhat.com; olivier.matz@6wind.com;
>> david.marchand@redhat.com
>> Subject: Re: [PATCH] doc: announce changes to ethdev rxconf structure
>>
>> Hi Slava,
>>
>> On 8/3/20 6:18 PM, Slava Ovsiienko wrote:
>>> Hi, Andrew
>>>
>>> Thanks for the comment, please, see below.
>>>
>>>> -----Original Message-----
>>>> From: Andrew Rybchenko <arybchenko@solarflare.com>
>>>> Sent: Monday, August 3, 2020 17:31
>>>> To: Slava Ovsiienko <viacheslavo@mellanox.com>; dev@dpdk.org
>>>> Cc: Matan Azrad <matan@mellanox.com>; Raslan Darawsheh
>>>> <rasland@mellanox.com>; Thomas Monjalon <thomas@monjalon.net>;
>>>> ferruh.yigit@intel.com; jerinjacobk@gmail.com;
>>>> stephen@networkplumber.org; ajit.khaparde@broadcom.com;
>>>> maxime.coquelin@redhat.com; olivier.matz@6wind.com;
>>>> david.marchand@redhat.com
>>>> Subject: Re: ***Spam*** [PATCH] doc: announce changes to ethdev
>>>> rxconf structure
>>>>
>>>> On 8/3/20 1:58 PM, Viacheslav Ovsiienko wrote:
>>>>> The DPDK datapath in the transmit direction is very flexible.
>>>>> The applications can build multisegment packets and manage almost
>>>>> all data aspects - the memory pools where segments are allocated
>>>>> from, the segment lengths, the memory attributes like external,
>>>>> registered, etc.
>>>>>
>>>>> In the receiving direction, the datapath is much less flexible: the
>>>>> applications can only specify the memory pool to configure the
>>>>> receiving queue and nothing more. In order to extend the receiving
>>>>> datapath capabilities it is proposed to add new fields to the
>>>>> rte_eth_rxconf structure:
>>>>>
>>>>> struct rte_eth_rxconf {
>>>>>     ...
>>>>>     uint16_t rx_split_num; /* number of segments to split */
>>>>>     uint16_t *rx_split_len; /* array of segment lengths */
>>>>>     struct rte_mempool **mp; /* array of segment memory pools */
>>>>>     ...
>>>>> };
>>>>>
>>>>> The non-zero value of rx_split_num field configures the receiving
>>>>> queue to split ingress packets into multiple segments to the mbufs
>>>>> allocated from various memory pools according to the specified
>>>>> lengths. The zero value of rx_split_num field provides the backward
>>>>> compatibility and queue should be configured in a regular way (with
>>>>> single/multiple mbufs of the same data buffer length allocated from
>>>>> the single memory pool).
>>>>
>>>> From the above description it is not 100% clear how it will coexist with:
>>>>  - existing mb_pool argument of the rte_eth_rx_queue_setup()
>>>>  - DEV_RX_OFFLOAD_SCATTER
>>>
>>> DEV_RX_OFFLOAD_SCATTER flag is required to be reported and configured
>>> for the new feature to indicate the application is prepared for the
>>> multisegment packets.
>>
>> I hope it will be mentioned in the feature documentation in the future, but
>> I'm not 100% sure that it is required. See below.
> I suppose there is a hierarchy:
> - the application configures DEV_RX_OFFLOAD_SCATTER on the port and in this way says:
> "Hey, driver, I'm ready to handle multi-segment packets". Readiness in general.
> - the application configures BUFFER_SPLIT and tells the PMD _HOW_ it wants to split, in a particular way:
> "Hey, driver, please, drop ten bytes here, here and here, and the rest - over there"

My idea is to keep SCATTER and BUFFER_SPLIT independent.
SCATTER is the ability to build multi-segment packets, taking
as many mbufs from the main RxQ mempool as required.
BUFFER_SPLIT is support for multiple mempools and for splitting
received packets as specified.
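
As a rough illustration of that independence, a setup-time check could
treat each offload on its own. This is a sketch only: the SKETCH_* flag
bits and struct sketch_rxconf are hypothetical stand-ins for this example,
not existing ethdev definitions, and the validation rules are an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this sketch only; not the real DPDK values. */
#define SKETCH_RX_OFFLOAD_SCATTER      (UINT64_C(1) << 0)
#define SKETCH_RX_OFFLOAD_BUFFER_SPLIT (UINT64_C(1) << 1)

/* Hypothetical subset of the proposed rte_eth_rxconf extension. */
struct sketch_rxconf {
	uint64_t offloads;
	uint16_t rx_split_num;        /* number of split segments */
	const uint16_t *rx_split_len; /* array of segment lengths */
};

/*
 * Accept a queue configuration if each offload is self-consistent.
 * Neither flag requires the other: SCATTER needs no extra fields, and
 * BUFFER_SPLIT only needs a non-empty array of non-zero lengths.
 */
static bool
sketch_rxconf_is_valid(const struct sketch_rxconf *conf)
{
	bool split;

	assert(conf != NULL);
	split = (conf->offloads & SKETCH_RX_OFFLOAD_BUFFER_SPLIT) != 0;

	if (!split)
		return conf->rx_split_num == 0; /* split fields must stay unused */
	if (conf->rx_split_num == 0 || conf->rx_split_len == NULL)
		return false;
	for (uint16_t i = 0; i < conf->rx_split_num; i++)
		if (conf->rx_split_len[i] == 0)
			return false;
	return true;
}
```

With this check, SCATTER alone, BUFFER_SPLIT alone, both, or neither are
all acceptable; only a BUFFER_SPLIT request without lengths, or lengths
without the BUFFER_SPLIT flag, is rejected.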

>>>
>>> But SCATTER just tells that the ingress packet length can exceed the
>>> mbuf data buffer length and a chain of mbufs must be built to store
>>> the entire packet. But there is a limitation - all mbufs are
>>> allocated from the same memory pool, and all data buffers have the same
>>> length.
>>> The new feature provides an opportunity to allocate mbufs from the
>>> desired pools and specify the length of each buffer/part.
>>
>> Yes, it is clear, but what happens if a packet does not fit into the provided
>> pools chain? Is the last one used many times? Maybe it is logical to use the Rx queue
>> setup mb_pool as well for the purpose? I.e. use the pools suggested here only
>> once and use mb_pool many times for the rest if SCATTER is supported and
>> only once if SCATTER is not supported.
> 
> It could be easily configured w/o involving SCATTER flag - just specify the last pool
> multiple times. I.e.
> pool 0 - 14B
> pool 1 - 20B
> ...
> pool N - 512B
> pool N - 512B
> pool N - 512B, sum of length >= max packet size 1518

I see, but IMHO it is excessive. Pools 0 .. N-1 could be
provided in the buffer split config, and the Nth should be
the main RxQ mempool, with SCATTER enabled.
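
To make the suggestion concrete, here is a minimal sketch of how many
mbufs one packet would occupy under that scheme. The function name and
its parameters are hypothetical, chosen for this example only; it models
the layout arithmetic and is not a DPDK API or a PMD implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Count the mbufs one packet would occupy when split pools 0..N-1 are
 * each used once and the main RxQ mempool holds the remainder
 * (repeatedly if SCATTER is enabled, at most once otherwise).
 * Returns 0 when the packet does not fit and would have to be dropped.
 */
static uint32_t
sketch_count_segments(uint32_t pkt_len,
		      const uint16_t *split_len, uint16_t split_num,
		      uint16_t main_buf_len, bool scatter)
{
	uint32_t segs = 0;
	uint32_t left = pkt_len;

	assert(pkt_len > 0);
	assert(main_buf_len > 0);
	assert(split_num == 0 || split_len != NULL);

	/* Each configured split pool contributes at most one segment. */
	for (uint16_t i = 0; i < split_num && left > 0; i++) {
		uint32_t take = left < split_len[i] ? left : split_len[i];

		left -= take;
		segs++;
	}
	if (left == 0)
		return segs;

	/* The remainder lands in buffers from the main RxQ mempool. */
	if (!scatter)
		return left <= main_buf_len ? segs + 1 : 0;
	return segs + (left + main_buf_len - 1) / main_buf_len;
}
```

For instance, with split lengths of 14 and 20 bytes and a 512-byte main
buffer, a 1518-byte packet leaves 1484 bytes for the main mempool: three
more mbufs (five in total) with SCATTER, and a drop without it.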

> It was supposed that the sum of lengths in the array covers the maximal packet size.
> Currently there is a limitation on packet size; for example, the mlx5 PMD
> just drops packets whose length exceeds the one the queue is configured for.
> 
>>
>>>
>>>>  - DEV_RX_OFFLOAD_HEADER_SPLIT
>>> The new feature (let's name it as "BUFFER_SPLIT") might be supported
>>> in conjunction with HEADER_SPLIT (say, split the rest of the data
>>> after the header) or rejected if HEADER_SPLIT is configured on the
>>> port, depending on PMD implementation (return ENOTSUP if both features
>>> are requested on the same port).
>>
>> OK, consider making SCATTER and BUFFER_SPLIT independent as suggested
>> above.
> Sorry, do you mean HEADER_SPLIT and BUFFER_SPLIT?

See above.

>>
>>>
>>>> How will application know that the feature is supported? Limitations?
>>> It is subject for further discussion, I see two options:
>>>  - introduce the DEV_RX_OFFLOAD_BUFFER_SPLIT flag
>>
>> +1
> OK, got it.
> 
>>
>>> - return ENOTSUP/EINVAL from rx_queue_setup() if feature is requested
>>>   (mp parameter is supposed to be NULL for the case)
>>
>> I'd say that it should be used for corner cases only which are hard to
>> formalize. It could be important to know maximum number of buffers to
>> split, total length which could be split from the remaining, limitations on split
>> lengths.
> Agree, the dedicated OFFLOAD flag seems to be preferable.
> 
> With best regards, Slava
> 
>>
>>>
>>>> Is it always split by specified/fixed length?
>>> Yes, it is a simple feature, it splits the data to the buffers with
>>> required memory attributes provided by specified pools according to the
>>> fixed lengths.
>>> It should be OK for protocols like eCPRI or some tunneling.
>>
>> I see. Thanks.
>>
>>>
>>>> What happens if header length is actually different?
>>> It is a per-queue configuration; packets might be sorted with the rte_flow engine
>>> between the queues.
>>> The supposed use case is to filter out specific protocol packets (say
>>> eCPRI with fixed header length) and split them on a specific Rx queue.
>>
>> Got it.
>>
>> Thanks,
>> Andrew.
>>
>>>
>>>
>>> With best regards,
>>> Slava
>>>
>>>>
>>>>> The new approach would allow splitting the ingress packets into
>>>>> multiple parts pushed to the memory with different attributes.
>>>>> For example, the packet headers can be pushed to the embedded data
>>>>> buffers within mbufs and the application data into the external
>>>>> buffers attached to mbufs allocated from the different memory pools.
>>>>> The memory attributes for the split parts may differ as well - for
>>>>> example the application data may be pushed into the external memory
>>>>> located on the dedicated physical device, say GPU or NVMe. This
>>>>> would improve the DPDK receiving datapath flexibility preserving
>>>>> compatibility with existing API.
>>>>>
>>>>> Signed-off-by: Viacheslav Ovsiienko <viacheslavo@mellanox.com>
>>>>> ---
>>>>>  doc/guides/rel_notes/deprecation.rst | 5 +++++
>>>>>  1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/doc/guides/rel_notes/deprecation.rst
>>>>> b/doc/guides/rel_notes/deprecation.rst
>>>>> index ea4cfa7..cd700ae 100644
>>>>> --- a/doc/guides/rel_notes/deprecation.rst
>>>>> +++ b/doc/guides/rel_notes/deprecation.rst
>>>>> @@ -99,6 +99,11 @@ Deprecation Notices
>>>>>    In 19.11 PMDs will still update the field even when the offload is not
>>>>>    enabled.
>>>>>
>>>>> +* ethdev: add new fields to ``rte_eth_rxconf`` to configure the receiving
>>>>> +  queues to split ingress packets into multiple segments according to the
>>>>> +  specified lengths into the buffers allocated from the specified
>>>>> +  memory pools. The backward compatibility to existing API is preserved.
>>>>> +
>>>>>  * ethdev: ``rx_descriptor_done`` dev_ops and ``rte_eth_rx_descriptor_done``
>>>>>    will be deprecated in 20.11 and will be removed in 21.11.
>>>>>    Existing ``rte_eth_rx_descriptor_status`` and ``rte_eth_tx_descriptor_status``
>>>
> 

