From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7EC07A0C4B; Sun, 11 Jul 2021 23:45:40 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id DD6DA4067C; Sun, 11 Jul 2021 23:45:39 +0200 (CEST) Received: from office2.cesnet.cz (office2.cesnet.cz [195.113.144.244]) by mails.dpdk.org (Postfix) with ESMTP id 1DB2040040 for ; Sun, 11 Jul 2021 23:45:39 +0200 (CEST) Received: from tanguero.localdomain (unknown [95.82.135.219]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange ECDHE (P-256) server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by office2.cesnet.cz (Postfix) with ESMTPSA id 2309E400052; Sun, 11 Jul 2021 23:45:38 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cesnet.cz; s=office2-2020; t=1626039938; bh=95T+ULCLPb4aYrIqWTDiuLFWZwY8scuGAsdbQdQeJpI=; h=Date:From:To:Cc:Subject:In-Reply-To:References; b=U6nNvbaUVSt4j8CVyBxcy9yYUEHn3xCbpo1NT64W0hnla9m0mj49O0xVa3gSYM8iY 9hKr0re49uF/jEzNUZWrgEsNWGdkCDIVjn7GsVxGiEYu/e+ihQ1r6j97zugMTXLhiS CFNxALpFZ3PUNBcLWhTcH6RDsN06/jM7lMJnU4S7VM6XHf+Uj/M5FMikigg3iQBAJK SWe9P0a8/cVcCEKf98+Ii6KfkvZ3mOyIWv0tOohP7YoY6Q0LL1OHd40R3LaYafEsge AIIWcw9eSZ4rRsAfih1tK/om02VVClZL9rFPL2/Nv6pAPbPJ5U9YvF4USBckHxK3KO UeRCigv496jog== Date: Sun, 11 Jul 2021 23:45:37 +0200 From: Jan Viktorin To: Ori Kam Cc: Matan Azrad , =?UTF-8?B?SGF2bMOtaw==?= Martin , "Min Hu (Connor)" , "chas3@att.com" , "dev@dpdk.org" , Shahaf Shuler , Slava Ovsiienko Message-ID: <20210711234537.6d2b14c4@tanguero.localdomain> In-Reply-To: References: <20210622092531.73112-1-xhavli56@stud.fit.vutbr.cz> <20210622092531.73112-4-xhavli56@stud.fit.vutbr.cz> <5f2cd5e6-1c46-227c-7ab1-5a2a34eb337d@huawei.com> <01bd8d8fcd364584fecdea3bc70abd1c@stud.fit.vutbr.cz> <20210707175420.4d81fdaf@coaster.localdomain> Organization: CESNET MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: quoted-printable Subject: Re: [dpdk-dev] [PATCH 3/3] net/bonding: start ethdev prior to setting 8023ad flow X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Sun, 11 Jul 2021 08:49:18 +0000 Ori Kam wrote: > Hi Jan, Hi Ori, >=20 >=20 > > -----Original Message----- > > From: dev On Behalf Of Jan Viktorin > > Sent: Wednesday, July 7, 2021 6:54 PM > >=20 > > On Sun, 4 Jul 2021 15:18:01 +0000 > > Matan Azrad wrote: > > =20 > > > From: Havl=C3=ADk Martin =20 > > > > Dne 2021-06-23 09:04, Min Hu (Connor) napsal: =20 > > > > > =E5=9C=A8 2021/6/22 17:25, Martin Havlik =E5=86=99=E9=81=93: =20 > > > > >> When dedicated queues are enabled, mlx5 PMD fails to install RTE > > > > >> Flows if the underlying ethdev is not started: > > > > >> bond_ethdev_8023ad_flow_set(267) - =20 > > bond_ethdev_8023ad_flow_set: =20 > > > > port =20 > > > > >> not started (slave_port=3D0 queue_id=3D1) > > > > >> =20 > > > > > Why mlx5 PMD doing flow create relys on port started ? > > > > > I noticed other PMDs did not has that reliance. > > > > > =20 > > > > After looking into it, I really can't answer this mlx5 centered > > > > question. Closest related info we found so far is the 5th point in > > > > https://doc.dpdk.org/guides/prog_guide/rte_flow.html#caveats > > > > but it only specifies it's the application's responsibility and that > > > > flow rules are assumed invalid after port stop/close/restart but > > > > doesn't say anything about vs > > > > where the former is the point of > > > > failure on mlx5. I'm addressing the maintainers for mlx5, who might > > > > know a bit more on the topic. =20 > > > =20 > >=20 > > Hello Matan, > > =20 > > > From rte_ethdev.h > > > > > > * Please note that some configuration is not stored between calls to > > > * rte_eth_dev_stop()/rte_eth_dev_start(). The following configuration > > > will > > > * be retained: > > > * > > > * - MTU > > > * - flow control settings > > > * - receive mode configuration (promiscuous mode, all-multicast > > > mode, > > > * hardware checksum mode, RSS/VMDQ settings etc.) > > > * - VLAN filtering configuration > > > * - default MAC address > > > * - MAC addresses supplied to MAC address array > > > * - flow director filtering mode (but not filtering rules) > > > * - NIC queue statistics mappings =20 > >=20 > > just after this section, you can find the following statement: > >=20 > > * Any other configuration will not be stored and will need to be re-en= tered > > * before a call to rte_eth_dev_start(). > >=20 > > It is not very clear how is this exactly related to flows (and this app= lies for all > > the quoted section, I think) but at least it can be used as a counter a= rgument. > > =20 > I agree the doc is not clear, as I see it flows are not part of configura= tion, at least not > when we are talking about rte_flow. Agree. >=20 > > > > > > > > > Mlx5 assumes flows are allowed to be configured only after > > > rte_eth_dev_start(). Before start \ after stop - no flow is valid > > > anymore. =20 > >=20 > > I believe that this discussion is not about validity of flows. Let the = flows be > > invalid after calling to rte_eth_dev_stop(). This is OK, flows must be > > recreated and the bonding driver works this way. But why not *before > > start*? =20 > Think about it this way by changing the configuration you may create inva= lid flows, > for example, you can only change the number of queues after port stop, so= if > you create a flow with jump to queue 3 and then you remove queue 3 then, > the flow that is cached is not valid anymore. This goes for other configu= ration that > may affect the validity of a flow. This is a valid argument, of course. The thing is whether it is a responsibility of the PMD to take care of those corner cases or if this is up to the application developer. If we respect the fact that calling to stop invalidates all flows then when you do: > port stop 0 > flow create 0 ingress pattern ... / end actions queue 6 / end > port config rxq 3 > port start 0 it's clear that something is really wrong with the caller/user. I would say that this is an application bug. I would expect that you first reconfigure port and after that you modify flows. It seems quite logical and intuitive, doesn't it? Anyway, the PMD can possibly catch this rxq inconsistency but I can imagine that there are more complicated sitations than just changing count of queues. Any idea for a more complex real case? Martin Havlik, can you please check how ixgbe and i40e behave in the case above with "rxq change after flow created"? >=20 > > Does somebody know how other drivers behaves in this situation? (We know > > and can check for Intel, there it does not seem to be an issue.) > >=20 > > By the way, the mlx5 behaviour opens a (probably short) time window > > between starting of a port and configuation of filtering flows. You may= want > > to start the port with thousands of flows that apply just when the port= starts > > (not after, that's late). This may introduce glitches in filtering and = measuring > > of traffic (well, it is a question how serious issue could it be...). > > =20 > Agree but this is always true nothing is done is zero time and even if it= was the insertion > is not in zero time, (assuming that even if the flows are stored by the P= MD until start > they still will not all be inserted in the same time)=20 Sorry, I probably do not understand well. Yes, creation of a flow would never be zero time, agree. But if I create flows before the port starts, than I do not have to care too much about how long the creation takes. Because the card is expected to be configured already before pushing the "start button". Maybe, you assume that the created flows would first be cached inside the PMD and when the port is finally started, it than transparently write the queues to the HW. But this is not what I was talking about, that's maybe even worse because you are hiding such behaviour from users... (I do not know the exact mlx5 implementation details, so my answer can miss something, sorry for that.) >=20 > > This matters for the bonding case as well, doesn't it?. It is not desir= able to > > accidently omit a packet that was received by primary ingress logic ins= tead of > > being redirected into the dedicated queue. > >=20 > > Are there any chances that for mlx5 it would be possible to insert flow= rules > > before calling rte_eth_dev_start? Anyway, the behaviour should be speci= fied > > and documented in DPDK more precisely to avoid such uncertainty in the > > future. > > =20 > I agree the documentation should be fixed. +1 Jan >=20 > Ori > > Jan > > =20 > > > > > > Matan > > > =20 > > > > >> Signed-off-by: Martin Havlik > > > > >> Cc: Jan Viktorin > > > > >> --- > > > > >> drivers/net/bonding/rte_eth_bond_pmd.c | 26 > > > > >> ++++++++++++++++++-------- > > > > >> 1 file changed, 18 insertions(+), 8 deletions(-) > > > > >> > > > > >> diff --git a/drivers/net/bonding/rte_eth_bond_pmd.c > > > > >> b/drivers/net/bonding/rte_eth_bond_pmd.c > > > > >> index a6755661c..fea3bc537 100644 > > > > >> --- a/drivers/net/bonding/rte_eth_bond_pmd.c > > > > >> +++ b/drivers/net/bonding/rte_eth_bond_pmd.c > > > > >> @@ -1818,25 +1818,35 @@ slave_configure(struct rte_eth_dev > > > > >> *bonded_eth_dev, > > > > >> > > > > >> rte_flow_destroy(slave_eth_dev->data->port_id, > > > > >> > > > > >> internals- =20 > > > > >mode4.dedicated_queues.flow[slave_eth_dev->data->port_id], =20 > > > > >> &flow_error); > > > > >> + } > > > > >> + /* Start device */ > > > > >> + errval =3D rte_eth_dev_start(slave_eth_dev->data->port_id); > > > > >> + if (errval !=3D 0) { > > > > >> + RTE_BOND_LOG(ERR, "rte_eth_dev_start: port=3D%u, err > > > > >> (%d)", > > > > >> + slave_eth_dev->data->port_id, > > > > >> errval); > > > > >> + return -1; > > > > >> + } > > > > >> + > > > > >> + if (internals->mode =3D=3D BONDING_MODE_8023AD && > > > > >> + internals->mode4.dedicated_queues.enabled > > > > >> =3D=3D 1) > > > > >> + { > > > > >> errval =3D bond_ethdev_8023ad_flow_set(bonded_eth_d= ev, > > > > >> slave_eth_dev->data->port_id); > > > > >> if (errval !=3D 0) { > > > > >> RTE_BOND_LOG(ERR, > > > > >> "bond_ethdev_8023ad_flow_set: > > > > >> port=3D%d, err (%d)", slave_eth_dev->data->port_id, errval); > > > > >> + > > > > >> + errval =3D > > > > >> rte_eth_dev_stop(slave_eth_dev->data->port_id); > > > > >> + if (errval < 0) { > > > > >> + RTE_BOND_LOG(ERR, > > > > >> + "rte_eth_dev_stop: port=3D%= d, > > > > >> err (%d)", > > > > >> + > > > > >> slave_eth_dev->data->port_id, errval); > > > > >> + } > > > > >> return errval; > > > > >> } > > > > >> } > > > > >> - /* Start device */ > > > > >> - errval =3D rte_eth_dev_start(slave_eth_dev->data->port_id); > > > > >> - if (errval !=3D 0) { > > > > >> - RTE_BOND_LOG(ERR, "rte_eth_dev_start: port=3D%u, err > > > > >> (%d)", > > > > >> - slave_eth_dev->data->port_id, > > > > >> errval); > > > > >> - return -1; > > > > >> - } > > > > >> - > > > > >> /* If RSS is enabled for bonding, synchronize RETA */ > > > > >> if (bonded_eth_dev->data->dev_conf.rxmode.mq_mode & > > > > >> ETH_MQ_RX_RSS) { > > > > >> int i; > > > > >> =20 >=20