From: "Hanoch Haim (hhaim)"
To: Nélio Laranjeiro
CC: Yongseok Koh, "dev@dpdk.org"
Date: Thu, 22 Mar 2018 10:59:36 +0000
Message-ID: <7ec498986339404ba89851ef2536ece8@XCH-RTP-017.cisco.com>
In-Reply-To: <20180322104531.ivfs3hdqezobcxjn@laranjeiro-vm.dev.6wind.com>
Subject: Re: [dpdk-dev] mlx5 reta size is dynamic
List-Id: DPDK patches and discussions

Hi,

1) Regarding this sentence:

"Your need is to have a fixed size returned by the rte_eth_dev_info_get(), the PMD can have an internal dynamic size, it won't modify your spreading."

I'm fine with that as long as:

1. rte_eth_dev_info_get() will expose the same *size*
2. rte_eth_dev_rss_reta_update() will behave as if the table has reta_size entries for *any* random input (it will enlarge the table internally to the maximum size)

In other words, from the user's perspective you will have a static reta_size.

2) "In such situation, changing the RETA means stopping the traffic, destroying every single flow, hash Rx queue, indirection table to remake everything with the new configuration.
Until then, we always recommended to any application to restart the port on this device after a RETA update to apply this new configuration."

From an experiment I did, you *can* change it under traffic and it works without issue.
Drivers tested are: igbe/i40e/mlx5

Thanks,
Hanoh

-----Original Message-----
From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
Sent: Thursday, March 22, 2018 12:46 PM
To: Hanoch Haim (hhaim)
Cc: Yongseok Koh; dev@dpdk.org
Subject: Re: [dpdk-dev] mlx5 reta size is dynamic

Hi Hanoch,

On Thu, Mar 22, 2018 at 10:00:45AM +0000, Hanoch Haim (hhaim) wrote:
> Hi Nelio,
>
> Let me provide more background.
> The context is TRex running in Advanced Stateful (ASTF) mode using multi-core.
> In this case the flows are distributed using RSS. New flows (c->s)
> need to have a tuple that will match the generated core. For this
> calculation there is a need to know the *RETA table size*.
>
>
> Code:
>
> /* 1. verify that driver can support RSS */
> rte_eth_dev_info_get(m_repid, &dev_info);
> save_reta_size = dev_info.reta_size;
> save_hash_key = dev_info.hash_key_size;
> printf("RETA_SIZE : %d \n", save_reta_size);
> printf("HASH_SIZE : %d \n", save_hash_key);
>
> /* 2. configure queues */
> ret = rte_eth_dev_configure(m_repid,
>                             nb_rx_queue,
>                             nb_tx_queue,
>                             eth_conf);
> ..
>
> /* 3. reading the RETA again */
> rte_eth_dev_info_get(m_repid, &dev_info);
> save_reta_size = dev_info.reta_size; /* << */
> save_hash_key = dev_info.hash_key_size;
> printf("RETA_SIZE1 : %d \n", save_reta_size);
>
>
> /* 4. update the RETA table */
> rte_eth_dev_rss_reta_update(m_repid, &reta_conf[0],
>                             dev_info.reta_size);
>
>
> 2. /* Output in case of Intel i40e */
>
> RETA_SIZE : 512
> HASH_SIZE : 52
>
> RETA_SIZE1 : 512
>
> 3. /* Output in case of Mlx5 */
>
> RETA_SIZE : 512
> HASH_SIZE : 0
>
> RETA_SIZE1 : 4 << not a multiple of 64, depends on the number of
> rx queues

Your need is to have a fixed size returned by the rte_eth_dev_info_get(), the PMD can have an internal dynamic size, it won't modify your spreading.

For information: you are reading the hash key size, but according to the documentation of struct rte_eth_rss_conf, only the i40e can have a key length different from 40 bytes; others should just ignore the field [1].

Regards,

[1] https://dpdk.org/browse/dpdk/tree/lib/librte_ether/rte_ethdev.h#n380

> Hanoh
>
> -----Original Message-----
> From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> Sent: Thursday, March 22, 2018 11:28 AM
> To: Hanoch Haim (hhaim)
> Cc: Yongseok Koh; dev@dpdk.org
> Subject: Re: [dpdk-dev] mlx5 reta size is dynamic
>
> Hi Hanoch,
>
> On Thu, Mar 22, 2018 at 09:02:19AM +0000, Hanoch Haim (hhaim) wrote:
> > Hi Nelio,
> > I think you didn't understand me. I suggest keeping the RETA table
> > size constant (maximum 512 in your case) and not changing it based
> > on the number of configured Rx queues.
>
> It is even simpler, we can return the maximum size or a multiple of
> RTE_RETA_GROUP_SIZE according to the number of Rx queues being used,
> in the devop->dev_infos_get() as it is what the
> rte_eth_dev_rss_reta_update() implementation will expect.
>
> > This will make the DPDK API consistent. As a user I need to do
> > tricks (allocate an odd/prime number of rx-queues) to get the RETA
> > size constant at 512.
>
> I understand this issue; what I don't fully understand is your needs.
>
> > I'm not talking about changing the values in the RETA table which
> > can be done while there is traffic.
>
> On MLX5, changing the entries of the RETA table doesn't affect the current traffic; it needs a port restart to take effect, and only for "default"
> flows. Any flow created through the public flow API is not impacted by the RETA table.
>
>
> From my understanding, you wish to have a size returned by
> devop->dev_infos_get() usable directly by rte_eth_dev_rss_reta_update().
> Is this why you are asking for a fixed size? So, if internally the PMD starts with a smaller RETA table, it does not really matter, as long as the RETA API works without any trick from the application side. Is this correct?
>
> Thanks,
>
> > Thanks,
> > Hanoh
> >
> >
> > -----Original Message-----
> > From: Nélio Laranjeiro [mailto:nelio.laranjeiro@6wind.com]
> > Sent: Thursday, March 22, 2018 10:55 AM
> > To: Hanoch Haim (hhaim)
> > Cc: Yongseok Koh; dev@dpdk.org
> > Subject: Re: [dpdk-dev] mlx5 reta size is dynamic
> >
> > On Thu, Mar 22, 2018 at 06:52:53AM +0000, Hanoch Haim (hhaim) wrote:
> > > Hi Yongseok,
> > >
> > >
> > > RSS has a DPDK API: an application can ask for the reta table size and
> > > configure it. In your case you are assuming a specific use case and
> > > changing the size dynamically, which solves 90% of the use cases but
> > > breaks the other 10%.
> > > Instead, you could provide the application a consistent API and
> > > with that 100% of the applications can work with no issue. This is
> > > what happens with Intel (ixgbe/i40e). Another minor issue: the
> > > rss_key_size is returned as zero but internally it is 40 bytes.
> >
> > Hi Hanoch,
> >
> > Legacy DPDK API has always considered there is only a single indirection table, a.k.a. RETA, whereas this is not true [1][2] on this device.
> >
> > On MLX5 there is an indirection table per Hash Rx queue, built from the list of queues that are part of it.
> > The Hash Rx queue is configured to make the hash with configured
> > information:
> > - Algorithm
> > - Key
> > - Hash field (Verbs hash field)
> > - Indirection table
> > A Hash Rx queue cannot handle multiple RSS configurations; we have a Hash Rx queue per protocol and thus a full configuration per protocol.
> >
> > In such situation, changing the RETA means stopping the traffic, destroying every single flow, hash Rx queue, indirection table to remake everything with the new configuration.
> > Until then, we always recommended to any application to restart the port on this device after a RETA update to apply this new configuration.
> >
> > Since the flow API is the new way to configure flows, applications should move to this new one instead of using the old API for such behavior.
> > We should also remove such devop from the PMD to avoid any confusion.
> >
> > Regards,
> >
> > > Thanks,
> > > Hanoh
> > >
> > > -----Original Message-----
> > > From: Yongseok Koh [mailto:yskoh@mellanox.com]
> > > Sent: Wednesday, March 21, 2018 11:48 PM
> > > To: Hanoch Haim (hhaim)
> > > Cc: dev@dpdk.org
> > > Subject: Re: [dpdk-dev] mlx5 reta size is dynamic
> > >
> > > On Wed, Mar 21, 2018 at 06:56:33PM +0000, Hanoch Haim (hhaim) wrote:
> > > > Hi mlx5 driver expert,
> > > >
> > > > DPDK: 17.11
> > > > Any reason the mlx5 driver changes the reta table size dynamically
> > > > based on the rx-queues# ?
> > >
> > > The device only supports 2^n-sized indirection tables. For example, if the number of Rx queues is 6, the device can't have a 1-1 mapping but the size of the ind tbl could be 8, 16, 32 and so on. If we configure it as 8 for example, 2 out of 6 queues will each get 1/4 of the traffic while the remaining 4 queues receive 1/8 each. We thought it was too much disparity and preferred setting the max size in order to mitigate the imbalance.
> > >
> > > > There is a hidden assumption that the user wants to distribute
> > > > the packets evenly which is not always correct.
> > >
> > > But it is mostly correct because RSS is used for uniform distribution. The decision wasn't made based on our speculation but on many requests from multiple customers.
> > >
> > > > /* If the requested number of RX queues is not a power of two, use the
> > > >  * maximum indirection table size for better balancing.
> > > >  * The result is always rounded to the next power of two. */
> > > > reta_idx_n = (1 << log2above((rxqs_n & (rxqs_n - 1)) ?
> > > >                              priv->ind_table_max_size :
> > > >                              rxqs_n));
> > >
> > > Thanks,
> > > Yongseok
> >
> > [1] https://dpdk.org/ml/archives/dev/2015-October/024668.html
> > [2] https://dpdk.org/ml/archives/dev/2015-October/024669.html
> >
> > --
> > Nélio Laranjeiro
> > 6WIND
>
> --
> Nélio Laranjeiro
> 6WIND

--
Nélio Laranjeiro
6WIND
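
Note on the sizing rule quoted in this thread: the expression from the mlx5 snippet and the 6-queue example given by Yongseok can be checked with a small standalone C sketch. This is not the driver code. log2above() is re-implemented here on the assumption that it returns the ceiling of log2, IND_TABLE_MAX_SIZE stands in for priv->ind_table_max_size (512 is the size reported in this thread), reta_size_for() and entries_for_queue() are hypothetical helper names, and the round-robin fill (entry i -> queue i % rxqs_n) is also an assumption.

```c
#include <assert.h>

/* Assumption: stands in for priv->ind_table_max_size; 512 is the
 * size reported for mlx5 in this thread. */
#define IND_TABLE_MAX_SIZE 512u

/* Assumed behaviour of the driver's log2above():
 * the smallest n such that 2^n >= v. */
static unsigned int log2above(unsigned int v)
{
	unsigned int n = 0;

	while ((1u << n) < v)
		n++;
	return n;
}

/* Same expression as the snippet quoted in the thread: a power-of-two
 * queue count is used as-is, anything else falls back to the maximum. */
static unsigned int reta_size_for(unsigned int rxqs_n)
{
	return 1u << log2above((rxqs_n & (rxqs_n - 1)) ?
			       IND_TABLE_MAX_SIZE : rxqs_n);
}

/* Number of table entries that point at `queue` when a table of `size`
 * entries is filled round-robin (assumption: entry i -> i % rxqs_n). */
static unsigned int entries_for_queue(unsigned int size,
				      unsigned int rxqs_n,
				      unsigned int queue)
{
	unsigned int i;
	unsigned int count = 0;

	assert(rxqs_n > 0); /* avoid a modulo by zero */
	for (i = 0; i < size; i++)
		if (i % rxqs_n == queue)
			count++;
	return count;
}
```

Under these assumptions, reta_size_for(6) gives 512 while reta_size_for(4) gives 4. With 6 queues, entries_for_queue(8, 6, q) is 2 for queues 0-1 and 1 for queues 2-5 (the 1/4 versus 1/8 split described in the thread), whereas a 512-entry table gives 86 or 85 entries per queue.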