DPDK patches and discussions
From: "Hanoch Haim (hhaim)" <hhaim@cisco.com>
To: "Nélio Laranjeiro" <nelio.laranjeiro@6wind.com>
Cc: Yongseok Koh <yskoh@mellanox.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] mlx5 reta size is dynamic
Date: Thu, 22 Mar 2018 10:59:36 +0000
Message-ID: <7ec498986339404ba89851ef2536ece8@XCH-RTP-017.cisco.com>
In-Reply-To: <20180322104531.ivfs3hdqezobcxjn@laranjeiro-vm.dev.6wind.com>

Hi,

1) Regarding this sentence, 
"Your need is to have a fixed size returned by the rte_eth_dev_info_get(), the PMD can have an internal dynamic size, it won't modify your spreading."

I'm fine with that as long as:

1. rte_eth_dev_info_get() always exposes the same *size*.
2. rte_eth_dev_rss_reta_update() behaves as if there are reta_size entries for *any* arbitrary input (the PMD enlarges the table internally to the maximum size).
In other words, from the user's perspective the reta_size is static.
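
A minimal sketch of why a smaller internal table need not change the spreading, assuming the NIC indexes the table with the low bits of the RSS hash and that the internal size is a power of two no larger than the advertised one. The names reta_expand() and reta_lookup() are hypothetical, not DPDK or mlx5 functions:

```c
#include <stdint.h>

#define RETA_MAX 512u /* advertised size, as reported for mlx5 in this thread */

/* Expand a power-of-two internal table of n entries into a RETA_MAX-entry
 * view by repeating it. Assumes n is a power of two and n <= RETA_MAX. */
static void reta_expand(const uint16_t *small, uint32_t n, uint16_t *full)
{
    for (uint32_t i = 0; i < RETA_MAX; i++)
        full[i] = small[i & (n - 1)];
}

/* Queue lookup. ASSUMPTION: the table index is the low bits of the hash,
 * and size is a power of two. */
static uint16_t reta_lookup(const uint16_t *reta, uint32_t size, uint32_t hash)
{
    return reta[hash & (size - 1)];
}
```

Under these assumptions, a 4-entry table and its 512-entry expansion return the same queue for every hash value, so the application can keep computing against the fixed size.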

2) "In such situation, changing the RETA means stopping the traffic, destroying every single flow, hash Rx queue, indirection table to remake everything with the new configuration.
Until then, we always recommended to any application to restart the port on this device after a RETA update to apply this new configuration."

From an experiment I did, you *can* change it under traffic and it works without issues.
Drivers tested are: igbe/i40e/mlx5

Thanks,
Hanoh


-----Original Message-----
From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com] 
Sent: Thursday, March 22, 2018 12:46 PM
To: Hanoch Haim (hhaim)
Cc: Yongseok Koh; dev@dpdk.org
Subject: Re: [dpdk-dev] mlx5 reta size is dynamic

Hi Hanoch,

On Thu, Mar 22, 2018 at 10:00:45AM +0000, Hanoch Haim (hhaim) wrote:
> Hi Nelio,
> 
> Let me provide more background. 
> The context is TRex running in Advanced Stateful (ASTF) mode using multiple cores.
> In this case the flows are distributed using RSS. New flows (c->s) 
> need to have a tuple that will match the core that generated them. For this 
> calculation we need to know the *RETA table size*.
> 
> 
> Code:
> 
>        /* 1. verify that the driver can support RSS */
>        rte_eth_dev_info_get(m_repid, &dev_info);
>        save_reta_size = dev_info.reta_size;
>        save_hash_key = dev_info.hash_key_size;
>        printf("RETA_SIZE : %d \n", save_reta_size);
>        printf("HASH_SIZE : %d \n", save_hash_key);
> 
>        /* 2. configure queues */
>        ret = rte_eth_dev_configure(m_repid,
>                                    nb_rx_queue,
>                                    nb_tx_queue,
>                                    eth_conf);
>        ..
> 
>        /* 3. read the RETA size again */
>        rte_eth_dev_info_get(m_repid, &dev_info);
>        save_reta_size = dev_info.reta_size;   /* << changes on mlx5, see output below */
>        save_hash_key = dev_info.hash_key_size;
>        printf("RETA_SIZE1 : %d \n", save_reta_size);
> 
> 
>        /* 4. update the RETA table */
>        rte_eth_dev_rss_reta_update(m_repid, &reta_conf[0],
>                                    dev_info.reta_size);
> 
>        
> 2.   /*Output in case of  Intel i40e*/
> 
>        RETA_SIZE : 512
>        HASH_SIZE : 52
> 
>        RETA_SIZE1 : 512
> 
> 3.       /*Output in case of  Mlx5 */
> 
>        RETA_SIZE : 512
>        HASH_SIZE : 0
> 
>        RETA_SIZE1 : 4  << not a multiple of 64; depends on the number of Rx queues
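
To illustrate why the table size matters for the calculation described above, here is a minimal sketch of picking a source port so that a new flow lands on a chosen Rx queue. toy_hash() is a hypothetical stand-in for the NIC's RSS hash (real devices typically use Toeplitz with the configured key), pick_src_port() is a hypothetical helper, and the modulo indexing is an assumption:

```c
#include <stdint.h>

/* HYPOTHETICAL stand-in for the NIC RSS hash; not Toeplitz. */
static uint32_t toy_hash(uint32_t src_ip, uint32_t dst_ip,
                         uint16_t src_port, uint16_t dst_port)
{
    uint32_t h = (src_ip * 2654435761u) ^ (dst_ip * 40503u) ^ dst_port;
    return h + src_port;
}

/* Scan source ports from first_port upward and return the first one whose
 * flow maps to want_queue through the RETA. Returns 0 if none is found.
 * ASSUMPTION: the table index is hash % reta_size. */
static uint16_t pick_src_port(const uint16_t *reta, uint32_t reta_size,
                              uint32_t src_ip, uint32_t dst_ip,
                              uint16_t dst_port, uint16_t first_port,
                              uint16_t want_queue)
{
    for (uint32_t p = first_port; p <= 65535u; p++) {
        uint32_t h = toy_hash(src_ip, dst_ip, (uint16_t)p, dst_port);
        if (reta[h % reta_size] == want_queue)
            return (uint16_t)p;
    }
    return 0;
}
```

If reta_size changes between the first query and the update, the index computed by the application no longer matches the one used by the device, and the flow can land on a different core.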

Your need is to have a fixed size returned by the rte_eth_dev_info_get(), the PMD can have an internal dynamic size, it won't modify your spreading.

For information: you are reading the hash key size, but according to the documentation of struct rte_eth_rss_conf, only i40e can have a key length different from 40 bytes; other drivers should just ignore the field [1].

Regards,

[1] https://dpdk.org/browse/dpdk/tree/lib/librte_ether/rte_ethdev.h#n380

> Hanoh
> 
> -----Original Message-----
> From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> Sent: Thursday, March 22, 2018 11:28 AM
> To: Hanoch Haim (hhaim)
> Cc: Yongseok Koh; dev@dpdk.org
> Subject: Re: [dpdk-dev] mlx5 reta size is dynamic
> 
> Hi Hanoch,
> 
> On Thu, Mar 22, 2018 at 09:02:19AM +0000, Hanoch Haim (hhaim) wrote:
> > Hi Nelio,
> > I think you didn't understand me. I suggest keeping the RETA table 
> > size constant (maximum 512 in your case) and not changing it based 
> > on the number of configured Rx queues.
> 
> It is even simpler, we can return the maximum size or a multiple of 
> RTE_RETA_GROUP_SIZE according to the number of Rx queues being used, 
> in the devop->dev_infos_get() as it is what the
> rte_eth_dev_rss_reta_update() implementation will expect.
>  
> > This will make the DPDK API consistent. As a user I need to do 
> > tricks (allocate an odd/prime number of rx-queues) to get the RETA 
> > size constant at 512
> 
> I understand this issue; what I don't fully understand is your needs.
> 
> > I'm not talking about changing the values in the RETA table which 
> > can be done while there is traffic.
> 
> On MLX5, changing the entries of the RETA table does not affect the current traffic; a port restart is needed for the change to take effect, and then only for "default"
> flows. Flows created through the public flow API are not impacted by the RETA table.
> 
> 
> From my understanding, you wish to have a size returned by
> devop->dev_infos_get() usable directly by rte_eth_dev_rss_reta_update().
> Is this why you are asking for a fixed size?  So, if the PMD internally starts with a smaller RETA table, it does not really matter, as long as the RETA API works without any trick from the application side.  Is this correct?
> 
> Thanks,
> 
> > Thanks,
> > Hanoh
> > 
> > 
> > -----Original Message-----
> > From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> > Sent: Thursday, March 22, 2018 10:55 AM
> > To: Hanoch Haim (hhaim)
> > Cc: Yongseok Koh; dev@dpdk.org
> > Subject: Re: [dpdk-dev] mlx5 reta size is dynamic
> > 
> > On Thu, Mar 22, 2018 at 06:52:53AM +0000, Hanoch Haim (hhaim) wrote:
> > > Hi Yongseok,
> > > 
> > > 
> > > > RSS has a DPDK API: the application can ask for the RETA table size and 
> > > > configure it. In your case you are assuming a specific use case and 
> > > > changing the size dynamically, which solves 90% of the use cases but 
> > > > breaks the other 10%.
> > > > Instead, you could provide the application a consistent API, and 
> > > > with that 100% of the applications can work with no issue. This is 
> > > > what happens with Intel (ixgbe/i40e). Another minor issue: 
> > > > rss_key_size is returned as zero, but internally it is 40 bytes.
> > 
> > Hi Hanoch,
> > 
> > > The legacy DPDK API has always assumed there is only a single indirection table (a.k.a. RETA), whereas this is not true [1][2] on this device.
> > 
> > > On MLX5 there is an indirection table per Hash Rx queue, built from the list of queues that are part of it.
> > > The Hash Rx queue is configured to compute the hash from the following
> > > information:
> >  - Algorithm,
> >  - key
> >  - hash field (Verbs hash field)
> >  - Indirection table
> > > A Hash Rx queue cannot handle multiple RSS configurations; we have a Hash Rx queue per protocol and thus a full configuration per protocol.
> > 
> > In such situation, changing the RETA means stopping the traffic, destroying every single flow, hash Rx queue, indirection table to remake everything with the new configuration.
> > Until then, we always recommended to any application to restart the port on this device after a RETA update to apply this new configuration.
> > 
> > > Since the flow API is the new way to configure flows, applications should move to it instead of using the old API for such behavior.
> > > We should also remove this devop from the PMD to avoid any confusion.
> > 
> > Regards,
> > 
> > > Thanks,
> > > Hanoh
> > > 
> > > -----Original Message-----
> > > From: Yongseok Koh [mailto:yskoh@mellanox.com]
> > > Sent: Wednesday, March 21, 2018 11:48 PM
> > > To: Hanoch Haim (hhaim)
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] mlx5 reta size is dynamic
> > > 
> > > On Wed, Mar 21, 2018 at 06:56:33PM +0000, Hanoch Haim (hhaim) wrote:
> > > > Hi mlx5 driver expert,
> > > > 
> > > > DPDK: 17.11
> > > > > Any reason the mlx5 driver changes the RETA table size dynamically 
> > > > > based on the number of Rx queues?
> > > 
> > > > The device only supports indirection tables of size 2^n. For example, if the number of Rx queues is 6, the device can't have a 1-1 mapping, but the size of the indirection table could be 8, 16, 32 and so on. If we configure it as 8, for example, 2 out of 6 queues will each receive 1/4 of the traffic while the remaining 4 queues each receive 1/8. We thought it was too much disparity and preferred setting the max size in order to mitigate the imbalance.
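
The arithmetic above can be checked with a minimal sketch, assuming the table is filled round-robin over the queues (the fill policy is an assumption; it is not stated in the thread). reta_entry_counts() is a hypothetical helper:

```c
#include <stdint.h>

/* Count how many of reta_size entries each of nb_queues queues receives
 * when the table is filled round-robin (entry i -> queue i % nb_queues).
 * counts[] must hold nb_queues elements. */
static void reta_entry_counts(uint32_t reta_size, uint32_t nb_queues,
                              uint32_t *counts)
{
    for (uint32_t q = 0; q < nb_queues; q++)
        counts[q] = 0;
    for (uint32_t i = 0; i < reta_size; i++)
        counts[i % nb_queues]++;
}
```

For 6 queues, a table of 8 gives per-queue entry counts of 2, 2, 1, 1, 1, 1 (shares of 1/4 and 1/8), while a table of 512 gives 86, 86, 85, 85, 85, 85, so the shares differ by only 1/512 of the traffic.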
> > > 
> > > > There is a hidden assumption that the user wants to distribute 
> > > > the packets evenly which is not always correct.
> > > 
> > > > But it is mostly correct because RSS is used for uniform distribution. The decision wasn't based on our speculation but on many requests from multiple customers.
> > > 
> > > > /* If the requested number of RX queues is not a power of two, use the
> > > >           * maximum indirection table size for better balancing.
> > > >           * The result is always rounded to the next power of two. */
> > > >           reta_idx_n = (1 << log2above((rxqs_n & (rxqs_n - 1)) ?
> > > >                                            priv->ind_table_max_size :
> > > >                                            rxqs_n));
> > > 
> > > Thanks,
> > > Yongseok
> > 
> > [1] https://dpdk.org/ml/archives/dev/2015-October/024668.html
> > [2] https://dpdk.org/ml/archives/dev/2015-October/024669.html
> > 
> > --
> > Nélio Laranjeiro
> > 6WIND
> 
> --
> Nélio Laranjeiro
> 6WIND

--
Nélio Laranjeiro
6WIND

Thread overview: 11+ messages
2018-03-21 18:56 Hanoch Haim (hhaim)
2018-03-21 21:47 ` Yongseok Koh
2018-03-22  6:52   ` Hanoch Haim (hhaim)
2018-03-22  8:54     ` Nélio Laranjeiro
2018-03-22  9:02       ` Hanoch Haim (hhaim)
2018-03-22  9:27         ` Nélio Laranjeiro
2018-03-22 10:00           ` Hanoch Haim (hhaim)
2018-03-22 10:45             ` Nélio Laranjeiro
2018-03-22 10:59               ` Hanoch Haim (hhaim) [this message]
2018-03-22 12:29                 ` Nélio Laranjeiro
2018-03-22 12:33                   ` Hanoch Haim (hhaim)
