DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev]  [PATCH] net/failsafe: fix fd leak
@ 2020-04-27 10:44 wangyunjian
  2020-04-27 11:12 ` Gaëtan Rivet
  0 siblings, 1 reply; 23+ messages in thread
From: wangyunjian @ 2020-04-27 10:44 UTC (permalink / raw)
  To: dev, grive; +Cc: jerry.lilijun, xudingke, Yunjian Wang, stable

From: Yunjian Wang <wangyunjian@huawei.com>

Zero is a valid fd. The fd won't be closed thus leading fd leak,
when it is zero.

Fixes: f234e5bd996d ("net/failsafe: register slaves Rx interrupts")
Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
Cc: stable@dpdk.org

Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
---
 drivers/net/failsafe/failsafe_intr.c | 2 +-
 drivers/net/failsafe/failsafe_ops.c  | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
index d8728fe7e..602c04033 100644
--- a/drivers/net/failsafe/failsafe_intr.c
+++ b/drivers/net/failsafe/failsafe_intr.c
@@ -393,7 +393,7 @@ fs_rx_event_proxy_uninstall(struct fs_priv *priv)
 		free(priv->rxp.evec);
 		priv->rxp.evec = NULL;
 	}
-	if (priv->rxp.efd > 0) {
+	if (priv->rxp.efd >= 0) {
 		close(priv->rxp.efd);
 		priv->rxp.efd = -1;
 	}
diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
index 50f2aca4e..e1d08e46c 100644
--- a/drivers/net/failsafe/failsafe_ops.c
+++ b/drivers/net/failsafe/failsafe_ops.c
@@ -380,7 +380,7 @@ fs_rx_queue_release(void *queue)
 	rxq = queue;
 	dev = &rte_eth_devices[rxq->priv->data->port_id];
 	fs_lock(dev, 0);
-	if (rxq->event_fd > 0)
+	if (rxq->event_fd >= 0)
 		close(rxq->event_fd);
 	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
 		if (ETH(sdev)->data->rx_queues != NULL &&
-- 
2.19.1



^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH] net/failsafe: fix fd leak
  2020-04-27 10:44 [dpdk-dev] [PATCH] net/failsafe: fix fd leak wangyunjian
@ 2020-04-27 11:12 ` Gaëtan Rivet
  2020-04-27 16:55   ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
  0 siblings, 1 reply; 23+ messages in thread
From: Gaëtan Rivet @ 2020-04-27 11:12 UTC (permalink / raw)
  To: wangyunjian; +Cc: dev, jerry.lilijun, xudingke, stable

On 27/04/20 18:44 +0800, wangyunjian wrote:
> From: Yunjian Wang <wangyunjian@huawei.com>
> 
> Zero is a valid fd. The fd won't be closed thus leading fd leak,
> when it is zero.
> 
> Fixes: f234e5bd996d ("net/failsafe: register slaves Rx interrupts")
> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> Cc: stable@dpdk.org
> 

Hello Yunjian,

Nothing prevents a DPDK app from closing 0 and getting it from
another call, good catch.

> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>

Acked-by: Gaetan Rivet <grive@u256.net>

> ---
>  drivers/net/failsafe/failsafe_intr.c | 2 +-
>  drivers/net/failsafe/failsafe_ops.c  | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/failsafe/failsafe_intr.c b/drivers/net/failsafe/failsafe_intr.c
> index d8728fe7e..602c04033 100644
> --- a/drivers/net/failsafe/failsafe_intr.c
> +++ b/drivers/net/failsafe/failsafe_intr.c
> @@ -393,7 +393,7 @@ fs_rx_event_proxy_uninstall(struct fs_priv *priv)
>  		free(priv->rxp.evec);
>  		priv->rxp.evec = NULL;
>  	}
> -	if (priv->rxp.efd > 0) {
> +	if (priv->rxp.efd >= 0) {
>  		close(priv->rxp.efd);
>  		priv->rxp.efd = -1;
>  	}
> diff --git a/drivers/net/failsafe/failsafe_ops.c b/drivers/net/failsafe/failsafe_ops.c
> index 50f2aca4e..e1d08e46c 100644
> --- a/drivers/net/failsafe/failsafe_ops.c
> +++ b/drivers/net/failsafe/failsafe_ops.c
> @@ -380,7 +380,7 @@ fs_rx_queue_release(void *queue)
>  	rxq = queue;
>  	dev = &rte_eth_devices[rxq->priv->data->port_id];
>  	fs_lock(dev, 0);
> -	if (rxq->event_fd > 0)
> +	if (rxq->event_fd >= 0)
>  		close(rxq->event_fd);
>  	FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) {
>  		if (ETH(sdev)->data->rx_queues != NULL &&
> -- 
> 2.19.1
> 
> 

-- 
Gaëtan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [dpdk-stable]  [PATCH] net/failsafe: fix fd leak
  2020-04-27 11:12 ` Gaëtan Rivet
@ 2020-04-27 16:55   ` Ferruh Yigit
  2020-05-03 11:33     ` Ali Alnubani
  0 siblings, 1 reply; 23+ messages in thread
From: Ferruh Yigit @ 2020-04-27 16:55 UTC (permalink / raw)
  To: Gaëtan Rivet, wangyunjian; +Cc: dev, jerry.lilijun, xudingke, stable

On 4/27/2020 12:12 PM, Gaëtan Rivet wrote:
> On 27/04/20 18:44 +0800, wangyunjian wrote:
>> From: Yunjian Wang <wangyunjian@huawei.com>
>>
>> Zero is a valid fd. The fd won't be closed thus leading fd leak,
>> when it is zero.
>>
>> Fixes: f234e5bd996d ("net/failsafe: register slaves Rx interrupts")
>> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
>> Cc: stable@dpdk.org
>>
> 
> Hello Yunjian,
> 
> Nothing prevents a DPDK app from closing 0 and getting it from
> another call, good catch.
> 
>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> 
> Acked-by: Gaetan Rivet <grive@u256.net>

Applied to dpdk-next-net/master, thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [dpdk-stable]  [PATCH] net/failsafe: fix fd leak
  2020-04-27 16:55   ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
@ 2020-05-03 11:33     ` Ali Alnubani
  2020-05-04 16:22       ` Gaëtan Rivet
  2020-05-05 19:10       ` [dpdk-dev] [PATCH v1 0/3] failsafe & ring fixes Gaetan Rivet
  0 siblings, 2 replies; 23+ messages in thread
From: Ali Alnubani @ 2020-05-03 11:33 UTC (permalink / raw)
  To: Ferruh Yigit, Gaëtan Rivet, wangyunjian
  Cc: dev, jerry.lilijun, xudingke, stable, Raslan Darawsheh

Hi,

> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Ferruh Yigit
> Sent: Monday, April 27, 2020 7:56 PM
> To: Gaëtan Rivet <grive@u256.net>; wangyunjian
> <wangyunjian@huawei.com>
> Cc: dev@dpdk.org; jerry.lilijun@huawei.com; xudingke@huawei.com;
> stable@dpdk.org
> Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd leak
> 
> On 4/27/2020 12:12 PM, Gaëtan Rivet wrote:
> > On 27/04/20 18:44 +0800, wangyunjian wrote:
> >> From: Yunjian Wang <wangyunjian@huawei.com>
> >>
> >> Zero is a valid fd. The fd won't be closed thus leading fd leak, when
> >> it is zero.
> >>
> >> Fixes: f234e5bd996d ("net/failsafe: register slaves Rx interrupts")
> >> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> >> Cc: stable@dpdk.org
> >>
> >
> > Hello Yunjian,
> >
> > Nothing prevents a DPDK app from closing 0 and getting it from another
> > call, good catch.
> >
> >> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> >
> > Acked-by: Gaetan Rivet <grive@u256.net>
> 
> Applied to dpdk-next-net/master, thanks.

This patch is causing Testpmd to quit when I issue a "port stop" command. Testpmd log:

"""
x86_64-native-linuxapp-gcc/build/app/test-pmd/testpmd -n 4 -- -i --forward-mode=mac
EAL: Detected 8 lcore(s)
EAL: Detected 1 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'PA'
EAL: No available hugepages reported in hugepages-1048576kB
EAL: Probing VFIO support...
EAL: PCI device 0002:00:02.0 on NUMA socket 0
EAL:   probe driver: 15b3:1004 net_mlx4
Interactive-mode selected
Set mac packet forwarding mode
Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=203456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc

Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.

Configuring Port 1 (socket 0)
Port 1: 00:15:5D:26:2B:00
Checking link statuses...
Done
testpmd> port stop 1
Stopping ports...
Checking link statuses...
Done
testpmd>
Stopping port 1...
Stopping ports...
Done

Shutting down port 1...
Closing ports...
Done

Bye...
"""

My terminal gets broken at this point, and I have to reinitialize it with a "reset".

- Ali

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [dpdk-stable]  [PATCH] net/failsafe: fix fd leak
  2020-05-03 11:33     ` Ali Alnubani
@ 2020-05-04 16:22       ` Gaëtan Rivet
  2020-05-04 16:28         ` Stephen Hemminger
  2020-05-05  9:14         ` Ali Alnubani
  2020-05-05 19:10       ` [dpdk-dev] [PATCH v1 0/3] failsafe & ring fixes Gaetan Rivet
  1 sibling, 2 replies; 23+ messages in thread
From: Gaëtan Rivet @ 2020-05-04 16:22 UTC (permalink / raw)
  To: Ali Alnubani
  Cc: Ferruh Yigit, wangyunjian, dev, jerry.lilijun, xudingke, stable,
	Raslan Darawsheh

On 03/05/20 11:33 +0000, Ali Alnubani wrote:
> Hi,
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of Ferruh Yigit
> > Sent: Monday, April 27, 2020 7:56 PM
> > To: Gaëtan Rivet <grive@u256.net>; wangyunjian
> > <wangyunjian@huawei.com>
> > Cc: dev@dpdk.org; jerry.lilijun@huawei.com; xudingke@huawei.com;
> > stable@dpdk.org
> > Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd leak
> > 
> > On 4/27/2020 12:12 PM, Gaëtan Rivet wrote:
> > > On 27/04/20 18:44 +0800, wangyunjian wrote:
> > >> From: Yunjian Wang <wangyunjian@huawei.com>
> > >>
> > >> Zero is a valid fd. The fd won't be closed thus leading fd leak, when
> > >> it is zero.
> > >>
> > >> Fixes: f234e5bd996d ("net/failsafe: register slaves Rx interrupts")
> > >> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> > >> Cc: stable@dpdk.org
> > >>
> > >
> > > Hello Yunjian,
> > >
> > > Nothing prevents a DPDK app from closing 0 and getting it from another
> > > call, good catch.
> > >
> > >> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > >
> > > Acked-by: Gaetan Rivet <grive@u256.net>
> > 
> > Applied to dpdk-next-net/master, thanks.
> 
> This patch is causing Testpmd to quit when I issue a "port stop" command. Testpmd log:
> 
> """
> x86_64-native-linuxapp-gcc/build/app/test-pmd/testpmd -n 4 -- -i --forward-mode=mac
> EAL: Detected 8 lcore(s)
> EAL: Detected 1 NUMA nodes
> EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> EAL: Selected IOVA mode 'PA'
> EAL: No available hugepages reported in hugepages-1048576kB
> EAL: Probing VFIO support...
> EAL: PCI device 0002:00:02.0 on NUMA socket 0
> EAL:   probe driver: 15b3:1004 net_mlx4
> Interactive-mode selected
> Set mac packet forwarding mode
> Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
> testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=203456, size=2176, socket=0
> testpmd: preferred mempool ops selected: ring_mp_mc
> 
> Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.
> 
> Configuring Port 1 (socket 0)
> Port 1: 00:15:5D:26:2B:00
> Checking link statuses...
> Done
> testpmd> port stop 1
> Stopping ports...
> Checking link statuses...
> Done
> testpmd>
> Stopping port 1...
> Stopping ports...
> Done
> 
> Shutting down port 1...
> Closing ports...
> Done
> 
> Bye...
> """
> 
> My terminal gets broken at this point, and I have to reinitialize it with a "reset".
> 
> - Ali

Hi Ali,

Thanks for the report, I am looking into it.

Are you testing failsafe on Azure?

I see a segfault currently at startup, so in any case there are fixes to be pushed.
I'll see afterward if I need a specific platform to reproduce your bug.

Regards,
-- 
Gaëtan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [dpdk-stable]  [PATCH] net/failsafe: fix fd leak
  2020-05-04 16:22       ` Gaëtan Rivet
@ 2020-05-04 16:28         ` Stephen Hemminger
  2020-05-05  9:47           ` Ali Alnubani
  2020-05-05  9:14         ` Ali Alnubani
  1 sibling, 1 reply; 23+ messages in thread
From: Stephen Hemminger @ 2020-05-04 16:28 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Ali Alnubani, Ferruh Yigit, wangyunjian, dev, jerry.lilijun,
	xudingke, stable, Raslan Darawsheh

On Mon, 4 May 2020 18:22:26 +0200
Gaëtan Rivet <grive@u256.net> wrote:

> On 03/05/20 11:33 +0000, Ali Alnubani wrote:
> > Hi,
> >   
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Ferruh Yigit
> > > Sent: Monday, April 27, 2020 7:56 PM
> > > To: Gaëtan Rivet <grive@u256.net>; wangyunjian
> > > <wangyunjian@huawei.com>
> > > Cc: dev@dpdk.org; jerry.lilijun@huawei.com; xudingke@huawei.com;
> > > stable@dpdk.org
> > > Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd leak
> > > 
> > > On 4/27/2020 12:12 PM, Gaëtan Rivet wrote:  
> > > > On 27/04/20 18:44 +0800, wangyunjian wrote:  
> > > >> From: Yunjian Wang <wangyunjian@huawei.com>
> > > >>
> > > >> Zero is a valid fd. The fd won't be closed thus leading fd leak, when
> > > >> it is zero.
> > > >>
> > > >> Fixes: f234e5bd996d ("net/failsafe: register slaves Rx interrupts")
> > > >> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt mode")
> > > >> Cc: stable@dpdk.org
> > > >>  
> > > >
> > > > Hello Yunjian,
> > > >
> > > > Nothing prevents a DPDK app from closing 0 and getting it from another
> > > > call, good catch.
> > > >  
> > > >> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>  
> > > >
> > > > Acked-by: Gaetan Rivet <grive@u256.net>  
> > > 
> > > Applied to dpdk-next-net/master, thanks.  
> > 
> > This patch is causing Testpmd to quit when I issue a "port stop" command. Testpmd log:
> > 
> > """
> > x86_64-native-linuxapp-gcc/build/app/test-pmd/testpmd -n 4 -- -i --forward-mode=mac
> > EAL: Detected 8 lcore(s)
> > EAL: Detected 1 NUMA nodes
> > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > EAL: Selected IOVA mode 'PA'
> > EAL: No available hugepages reported in hugepages-1048576kB
> > EAL: Probing VFIO support...
> > EAL: PCI device 0002:00:02.0 on NUMA socket 0
> > EAL:   probe driver: 15b3:1004 net_mlx4
> > Interactive-mode selected
> > Set mac packet forwarding mode
> > Warning: NUMA should be configured manually by using --port-numa-config and --ring-numa-config parameters along with --numa.
> > testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=203456, size=2176, socket=0
> > testpmd: preferred mempool ops selected: ring_mp_mc
> > 
> > Warning! port-topology=paired and odd forward ports number, the last port will pair with itself.
> > 
> > Configuring Port 1 (socket 0)
> > Port 1: 00:15:5D:26:2B:00
> > Checking link statuses...
> > Done  
> > testpmd> port stop 1  
> > Stopping ports...
> > Checking link statuses...
> > Done  
> > testpmd>  
> > Stopping port 1...
> > Stopping ports...
> > Done
> > 
> > Shutting down port 1...
> > Closing ports...
> > Done
> > 
> > Bye...
> > """
> > 
> > My terminal gets broken at this point, and I have to reinitialize it with a "reset".

The problem is that you did not blacklist the PCI address of the Mellanox device associated with
your login session (normally this is the PCI device associated with eth0).

By default, DPDK will take over all VF devices it finds as part of the Mellanox device
startup. This means the traffic that was going to the VF associated with eth0 (your ssh)
is now going to DPDK; which is not what you want.

The solution is to either use blacklist (-b option) or whitelist (-w option) to get only
the PCI devices you want to be part of the DPDK.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [dpdk-stable]  [PATCH] net/failsafe: fix fd leak
  2020-05-04 16:22       ` Gaëtan Rivet
  2020-05-04 16:28         ` Stephen Hemminger
@ 2020-05-05  9:14         ` Ali Alnubani
  2020-05-05 18:35           ` Gaëtan Rivet
  1 sibling, 1 reply; 23+ messages in thread
From: Ali Alnubani @ 2020-05-05  9:14 UTC (permalink / raw)
  To: Gaëtan Rivet
  Cc: Ferruh Yigit, wangyunjian, dev, jerry.lilijun, xudingke, stable,
	Raslan Darawsheh

> -----Original Message-----
> From: Gaëtan Rivet <grive@u256.net>
> Sent: Monday, May 4, 2020 7:22 PM
> To: Ali Alnubani <alialnu@mellanox.com>
> Cc: Ferruh Yigit <ferruh.yigit@intel.com>; wangyunjian
> <wangyunjian@huawei.com>; dev@dpdk.org; jerry.lilijun@huawei.com;
> xudingke@huawei.com; stable@dpdk.org; Raslan Darawsheh
> <rasland@mellanox.com>
> Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd leak
> 
> On 03/05/20 11:33 +0000, Ali Alnubani wrote:
> > Hi,
> >
> > > -----Original Message-----
> > > From: dev <dev-bounces@dpdk.org> On Behalf Of Ferruh Yigit
> > > Sent: Monday, April 27, 2020 7:56 PM
> > > To: Gaëtan Rivet <grive@u256.net>; wangyunjian
> > > <wangyunjian@huawei.com>
> > > Cc: dev@dpdk.org; jerry.lilijun@huawei.com; xudingke@huawei.com;
> > > stable@dpdk.org
> > > Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd
> > > leak
> > >
> > > On 4/27/2020 12:12 PM, Gaëtan Rivet wrote:
> > > > On 27/04/20 18:44 +0800, wangyunjian wrote:
> > > >> From: Yunjian Wang <wangyunjian@huawei.com>
> > > >>
> > > >> Zero is a valid fd. The fd won't be closed thus leading fd leak,
> > > >> when it is zero.
> > > >>
> > > >> Fixes: f234e5bd996d ("net/failsafe: register slaves Rx
> > > >> interrupts")
> > > >> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt
> > > >> mode")
> > > >> Cc: stable@dpdk.org
> > > >>
> > > >
> > > > Hello Yunjian,
> > > >
> > > > Nothing prevents a DPDK app from closing 0 and getting it from
> > > > another call, good catch.
> > > >
> > > >> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > > >
> > > > Acked-by: Gaetan Rivet <grive@u256.net>
> > >
> > > Applied to dpdk-next-net/master, thanks.
> >
> > This patch is causing Testpmd to quit when I issue a "port stop" command.
> Testpmd log:
> >
> > """
> > x86_64-native-linuxapp-gcc/build/app/test-pmd/testpmd -n 4 -- -i
> > --forward-mode=mac
> > EAL: Detected 8 lcore(s)
> > EAL: Detected 1 NUMA nodes
> > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > EAL: Selected IOVA mode 'PA'
> > EAL: No available hugepages reported in hugepages-1048576kB
> > EAL: Probing VFIO support...
> > EAL: PCI device 0002:00:02.0 on NUMA socket 0
> > EAL:   probe driver: 15b3:1004 net_mlx4
> > Interactive-mode selected
> > Set mac packet forwarding mode
> > Warning: NUMA should be configured manually by using --port-numa-config
> and --ring-numa-config parameters along with --numa.
> > testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=203456,
> > size=2176, socket=0
> > testpmd: preferred mempool ops selected: ring_mp_mc
> >
> > Warning! port-topology=paired and odd forward ports number, the last port
> will pair with itself.
> >
> > Configuring Port 1 (socket 0)
> > Port 1: 00:15:5D:26:2B:00
> > Checking link statuses...
> > Done
> > testpmd> port stop 1
> > Stopping ports...
> > Checking link statuses...
> > Done
> > testpmd>
> > Stopping port 1...
> > Stopping ports...
> > Done
> >
> > Shutting down port 1...
> > Closing ports...
> > Done
> >
> > Bye...
> > """
> >
> > My terminal gets broken at this point, and I have to reinitialize it with a
> "reset".
> >
> > - Ali
> 
> Hi Ali,
> 
> Thanks for the report, I am looking into it.
> 
> Are you testing failsafe on Azure?

This reproduces with Failsafe, but not necessarily on Azure. You can try to reproduce on any platform if you pass something like '-w 00:00.0 --vdev="net_failsafe0,dev(0000:08:00.0)"'.

> 
> I see a segfault currently at startup, so in any case there are fixes to be
> pushed.
> I'll see afterward if I need a specific platform to reproduce your bug.
> 
> Regards,
> --
> Gaëtan

Regards,
Ali

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [dpdk-stable]  [PATCH] net/failsafe: fix fd leak
  2020-05-04 16:28         ` Stephen Hemminger
@ 2020-05-05  9:47           ` Ali Alnubani
  0 siblings, 0 replies; 23+ messages in thread
From: Ali Alnubani @ 2020-05-05  9:47 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: Ferruh Yigit, wangyunjian, dev, jerry.lilijun, xudingke, stable,
	Raslan Darawsheh, Gaëtan Rivet

Hi Stephen,

> -----Original Message-----
> From: Stephen Hemminger <stephen@networkplumber.org>
> Sent: Monday, May 4, 2020 7:29 PM
> To: Gaëtan Rivet <grive@u256.net>
> Cc: Ali Alnubani <alialnu@mellanox.com>; Ferruh Yigit
> <ferruh.yigit@intel.com>; wangyunjian <wangyunjian@huawei.com>;
> dev@dpdk.org; jerry.lilijun@huawei.com; xudingke@huawei.com;
> stable@dpdk.org; Raslan Darawsheh <rasland@mellanox.com>
> Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd leak
> 
> On Mon, 4 May 2020 18:22:26 +0200
> Gaëtan Rivet <grive@u256.net> wrote:
> 
> > On 03/05/20 11:33 +0000, Ali Alnubani wrote:
> > > Hi,
> > >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Ferruh Yigit
> > > > Sent: Monday, April 27, 2020 7:56 PM
> > > > To: Gaëtan Rivet <grive@u256.net>; wangyunjian
> > > > <wangyunjian@huawei.com>
> > > > Cc: dev@dpdk.org; jerry.lilijun@huawei.com; xudingke@huawei.com;
> > > > stable@dpdk.org
> > > > Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd
> > > > leak
> > > >
> > > > On 4/27/2020 12:12 PM, Gaëtan Rivet wrote:
> > > > > On 27/04/20 18:44 +0800, wangyunjian wrote:
> > > > >> From: Yunjian Wang <wangyunjian@huawei.com>
> > > > >>
> > > > >> Zero is a valid fd. The fd won't be closed thus leading fd
> > > > >> leak, when it is zero.
> > > > >>
> > > > >> Fixes: f234e5bd996d ("net/failsafe: register slaves Rx
> > > > >> interrupts")
> > > > >> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt
> > > > >> mode")
> > > > >> Cc: stable@dpdk.org
> > > > >>
> > > > >
> > > > > Hello Yunjian,
> > > > >
> > > > > Nothing prevents a DPDK app from closing 0 and getting it from
> > > > > another call, good catch.
> > > > >
> > > > >> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > > > >
> > > > > Acked-by: Gaetan Rivet <grive@u256.net>
> > > >
> > > > Applied to dpdk-next-net/master, thanks.
> > >
> > > This patch is causing Testpmd to quit when I issue a "port stop" command.
> Testpmd log:
> > >
> > > """
> > > x86_64-native-linuxapp-gcc/build/app/test-pmd/testpmd -n 4 -- -i
> > > --forward-mode=mac
> > > EAL: Detected 8 lcore(s)
> > > EAL: Detected 1 NUMA nodes
> > > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > > EAL: Selected IOVA mode 'PA'
> > > EAL: No available hugepages reported in hugepages-1048576kB
> > > EAL: Probing VFIO support...
> > > EAL: PCI device 0002:00:02.0 on NUMA socket 0
> > > EAL:   probe driver: 15b3:1004 net_mlx4
> > > Interactive-mode selected
> > > Set mac packet forwarding mode
> > > Warning: NUMA should be configured manually by using --port-numa-config
> and --ring-numa-config parameters along with --numa.
> > > testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=203456,
> > > size=2176, socket=0
> > > testpmd: preferred mempool ops selected: ring_mp_mc
> > >
> > > Warning! port-topology=paired and odd forward ports number, the last
> port will pair with itself.
> > >
> > > Configuring Port 1 (socket 0)
> > > Port 1: 00:15:5D:26:2B:00
> > > Checking link statuses...
> > > Done
> > > testpmd> port stop 1
> > > Stopping ports...
> > > Checking link statuses...
> > > Done
> > > testpmd>
> > > Stopping port 1...
> > > Stopping ports...
> > > Done
> > >
> > > Shutting down port 1...
> > > Closing ports...
> > > Done
> > >
> > > Bye...
> > > """
> > >
> > > My terminal gets broken at this point, and I have to reinitialize it with a
> "reset".
> 
> The problem is that you did not blacklist the PCI address of the Mellanox device
> associated with your login session (normally this is the PCI device associated
> with eth0).
> 
> By default, DPDK will take over all VF devices it finds as part of the Mellanox
> device startup. This means the traffic that was going to the VF associated with
> eth0 (your ssh) is now going to DPDK; which is not what you want.
> 
> The solution is to either use blacklist (-b option) or whitelist (-w option) to get
> only the PCI devices you want to be part of the DPDK.

I'm confused.
I don't think my login interface was being whitelisted in DPDK because my issue isn't that I lose my connection, it's that testpmd quits when I stop a port.

Regards,
Ali

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [dpdk-stable]  [PATCH] net/failsafe: fix fd leak
  2020-05-05  9:14         ` Ali Alnubani
@ 2020-05-05 18:35           ` Gaëtan Rivet
  0 siblings, 0 replies; 23+ messages in thread
From: Gaëtan Rivet @ 2020-05-05 18:35 UTC (permalink / raw)
  To: Ali Alnubani
  Cc: Ferruh Yigit, wangyunjian, dev, jerry.lilijun, xudingke, stable,
	Raslan Darawsheh

On 05/05/20 09:14 +0000, Ali Alnubani wrote:
> > -----Original Message-----
> > From: Gaëtan Rivet <grive@u256.net>
> > Sent: Monday, May 4, 2020 7:22 PM
> > To: Ali Alnubani <alialnu@mellanox.com>
> > Cc: Ferruh Yigit <ferruh.yigit@intel.com>; wangyunjian
> > <wangyunjian@huawei.com>; dev@dpdk.org; jerry.lilijun@huawei.com;
> > xudingke@huawei.com; stable@dpdk.org; Raslan Darawsheh
> > <rasland@mellanox.com>
> > Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd leak
> > 
> > On 03/05/20 11:33 +0000, Ali Alnubani wrote:
> > > Hi,
> > >
> > > > -----Original Message-----
> > > > From: dev <dev-bounces@dpdk.org> On Behalf Of Ferruh Yigit
> > > > Sent: Monday, April 27, 2020 7:56 PM
> > > > To: Gaëtan Rivet <grive@u256.net>; wangyunjian
> > > > <wangyunjian@huawei.com>
> > > > Cc: dev@dpdk.org; jerry.lilijun@huawei.com; xudingke@huawei.com;
> > > > stable@dpdk.org
> > > > Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH] net/failsafe: fix fd
> > > > leak
> > > >
> > > > On 4/27/2020 12:12 PM, Gaëtan Rivet wrote:
> > > > > On 27/04/20 18:44 +0800, wangyunjian wrote:
> > > > >> From: Yunjian Wang <wangyunjian@huawei.com>
> > > > >>
> > > > >> Zero is a valid fd. The fd won't be closed thus leading fd leak,
> > > > >> when it is zero.
> > > > >>
> > > > >> Fixes: f234e5bd996d ("net/failsafe: register slaves Rx
> > > > >> interrupts")
> > > > >> Fixes: 9e0360aebf23 ("net/failsafe: register as Rx interrupt
> > > > >> mode")
> > > > >> Cc: stable@dpdk.org
> > > > >>
> > > > >
> > > > > Hello Yunjian,
> > > > >
> > > > > Nothing prevents a DPDK app from closing 0 and getting it from
> > > > > another call, good catch.
> > > > >
> > > > >> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
> > > > >
> > > > > Acked-by: Gaetan Rivet <grive@u256.net>
> > > >
> > > > Applied to dpdk-next-net/master, thanks.
> > >
> > > This patch is causing Testpmd to quit when I issue a "port stop" command.
> > Testpmd log:
> > >
> > > """
> > > x86_64-native-linuxapp-gcc/build/app/test-pmd/testpmd -n 4 -- -i
> > > --forward-mode=mac
> > > EAL: Detected 8 lcore(s)
> > > EAL: Detected 1 NUMA nodes
> > > EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
> > > EAL: Selected IOVA mode 'PA'
> > > EAL: No available hugepages reported in hugepages-1048576kB
> > > EAL: Probing VFIO support...
> > > EAL: PCI device 0002:00:02.0 on NUMA socket 0
> > > EAL:   probe driver: 15b3:1004 net_mlx4
> > > Interactive-mode selected
> > > Set mac packet forwarding mode
> > > Warning: NUMA should be configured manually by using --port-numa-config
> > and --ring-numa-config parameters along with --numa.
> > > testpmd: create a new mbuf pool <mbuf_pool_socket_0>: n=203456,
> > > size=2176, socket=0
> > > testpmd: preferred mempool ops selected: ring_mp_mc
> > >
> > > Warning! port-topology=paired and odd forward ports number, the last port
> > will pair with itself.
> > >
> > > Configuring Port 1 (socket 0)
> > > Port 1: 00:15:5D:26:2B:00
> > > Checking link statuses...
> > > Done
> > > testpmd> port stop 1
> > > Stopping ports...
> > > Checking link statuses...
> > > Done
> > > testpmd>
> > > Stopping port 1...
> > > Stopping ports...
> > > Done
> > >
> > > Shutting down port 1...
> > > Closing ports...
> > > Done
> > >
> > > Bye...
> > > """
> > >
> > > My terminal gets broken at this point, and I have to reinitialize it with a
> > "reset".
> > >
> > > - Ali
> > 
> > Hi Ali,
> > 
> > Thanks for the report, I am looking into it.
> > 
> > Are you testing failsafe on Azure?
> 
> This reproduces with Failsafe, but not necessarily on Azure. You can try to reproduce on any platform if you pass something like '-w 00:00.0 --vdev="net_failsafe0,dev(0000:08:00.0)"'.
> 

Hi,

Indeed, I am able to reproduce the issue using this command:
   bash> ./build/app/dpdk-testpmd -n4 -m 4096 --no-huge --vdev='net_failsafe0,dev(net_ring0)' -- -i
(no need of PCI bus nor hugepages to validate failsafe sometimes).

I was asking about Azure because you did not give the command line
options for fail-safe, so I assumed it had been probed automagically.

I made a fix, will send soon.

Regards,
-- 
Gaëtan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v1 0/3] failsafe & ring fixes
  2020-05-03 11:33     ` Ali Alnubani
  2020-05-04 16:22       ` Gaëtan Rivet
@ 2020-05-05 19:10       ` Gaetan Rivet
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 1/3] net/failsafe: avoid crash on malformed eth_dev Gaetan Rivet
                           ` (2 more replies)
  1 sibling, 3 replies; 23+ messages in thread
From: Gaetan Rivet @ 2020-05-05 19:10 UTC (permalink / raw)
  To: dev

Some issues seen on next-net.

Gaetan Rivet (3):
  net/failsafe: avoid crash on malformed eth_dev
  net/ring: fix eth_dev device pointer on allocation
  net/failsafe: fix default service proxy state

 drivers/net/failsafe/failsafe.c         |  1 +
 drivers/net/failsafe/failsafe_ether.c   |  5 ++++
 drivers/net/failsafe/failsafe_private.h |  8 ++++++
 drivers/net/ring/rte_eth_ring.c         | 36 ++++++++++++++++++++-----
 4 files changed, 43 insertions(+), 7 deletions(-)

-- 
2.26.2


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v1 1/3] net/failsafe: avoid crash on malformed eth_dev
  2020-05-05 19:10       ` [dpdk-dev] [PATCH v1 0/3] failsafe & ring fixes Gaetan Rivet
@ 2020-05-05 19:10         ` Gaetan Rivet
  2020-05-06 17:16           ` Ferruh Yigit
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 2/3] net/ring: fix eth_dev device pointer on allocation Gaetan Rivet
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 3/3] net/failsafe: fix default service proxy state Gaetan Rivet
  2 siblings, 1 reply; 23+ messages in thread
From: Gaetan Rivet @ 2020-05-05 19:10 UTC (permalink / raw)
  To: dev

Some PMD do not respect the eth_dev API when allocating their
rte_eth_dev. As a result, on device add event resulting from
rte_eth_dev_probing_finish() call, the eth_dev processed is incomplete.

The segfault is a good way to focus the developer on the issue, but does
not inspire confidence. Instead, warn the user of the error repeatedly.

The failsafe PMD can warn of the issue and continue. It will repeatedly
attempt to initialize the failed port and complain about it, which
should result in the same developer focus but with less crashing.

Signed-off-by: Gaetan Rivet <grive@u256.net>
---
 drivers/net/failsafe/failsafe_ether.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/net/failsafe/failsafe_ether.c b/drivers/net/failsafe/failsafe_ether.c
index 93deacd13..2b748bd8b 100644
--- a/drivers/net/failsafe/failsafe_ether.c
+++ b/drivers/net/failsafe/failsafe_ether.c
@@ -623,6 +623,11 @@ failsafe_eth_new_event_callback(uint16_t port_id,
 	FOREACH_SUBDEV_STATE(sdev, i, fs_dev, DEV_PARSED) {
 		if (sdev->state >= DEV_PROBED)
 			continue;
+		if (dev->device == NULL) {
+			WARN("Trying to probe malformed device %s.\n",
+			     sdev->devargs.name);
+			continue;
+		}
 		if (strcmp(sdev->devargs.name, dev->device->name) != 0)
 			continue;
 		rte_eth_dev_owner_set(port_id, &PRIV(fs_dev)->my_owner);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v1 2/3] net/ring: fix eth_dev device pointer on allocation
  2020-05-05 19:10       ` [dpdk-dev] [PATCH v1 0/3] failsafe & ring fixes Gaetan Rivet
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 1/3] net/failsafe: avoid crash on malformed eth_dev Gaetan Rivet
@ 2020-05-05 19:10         ` Gaetan Rivet
  2020-05-06 11:48           ` Ferruh Yigit
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 3/3] net/failsafe: fix default service proxy state Gaetan Rivet
  2 siblings, 1 reply; 23+ messages in thread
From: Gaetan Rivet @ 2020-05-05 19:10 UTC (permalink / raw)
  To: dev; +Cc: stable, ferruh.yigit, thomas

When a net_ring device is allocated, its device pointer is not set
before calling rte_eth_dev_probing_finish, which is incorrect.

The following:
  commit: 96cb19521147 ("net/ring: use EAL APIs in PMD specific API")
  commit: a6992e961050 ("net/ring: set ethernet device field")

already attempted to fix this issue in 17.08, which was fine at the
time. Adding the hook rte_eth_dev_probing_finish() however created this
bug, as the eth_dev exposed when this hook is executed is expected to be
complete.

Remove the prior attempts to fix the issue in rte_pmd_ring_probe() and
write the pointer properly in do_eth_dev_ring_create().

Cc: stable@dpdk.org
Fixes: fbe90cdd776c ("ethdev: add probing finish function")
Cc: ferruh.yigit@intel.com
Cc: thomas@monjalon.net
Signed-off-by: Gaetan Rivet <grive@u256.net>
---
 drivers/net/ring/rte_eth_ring.c | 36 ++++++++++++++++++++++++++-------
 1 file changed, 29 insertions(+), 7 deletions(-)

diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index 41acbc513..ad27f9d89 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -244,6 +244,26 @@ static const struct eth_dev_ops ops = {
 	.mac_addr_add = eth_mac_addr_add,
 };
 
+static int
+dev_name_cmp(const struct rte_device *dev, const void *_name)
+{
+	const char *name = _name;
+
+	return strcmp(dev->name, name);
+}
+
+static struct rte_device *
+ring_device_from_name(const char *name)
+{
+	struct rte_bus *vdev_bus;
+
+	vdev_bus = rte_bus_find_by_name("vdev");
+	if (vdev_bus == NULL)
+		return NULL;
+
+	return vdev_bus->find_device(NULL, dev_name_cmp, name);
+}
+
 static int
 do_eth_dev_ring_create(const char *name,
 		struct rte_ring * const rx_queues[],
@@ -294,6 +314,7 @@ do_eth_dev_ring_create(const char *name,
 	 * - store queue data in internals,
 	 * - store numa_node info in eth_dev_data
 	 * - point eth_dev_data to internals
+	 * - store EAL device in eth_dev,
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
 
@@ -325,10 +346,17 @@ do_eth_dev_ring_create(const char *name,
 	data->kdrv = RTE_KDRV_NONE;
 	data->numa_node = numa_node;
 
-	/* finally assign rx and tx ops */
+	/* assign rx and tx ops */
 	eth_dev->rx_pkt_burst = eth_ring_rx;
 	eth_dev->tx_pkt_burst = eth_ring_tx;
 
+	/* finally set the rte_device pointer in eth_dev. */
+	eth_dev->device = ring_device_from_name(name);
+	if (eth_dev->device == NULL) {
+		rte_errno = ENODEV;
+		goto error;
+	}
+
 	rte_eth_dev_probing_finish(eth_dev);
 	*eth_dev_p = eth_dev;
 
@@ -584,9 +612,6 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 							  DEV_ATTACH, &eth_dev);
 			}
 
-			if (eth_dev)
-				eth_dev->device = &dev->device;
-
 			return ret;
 		}
 
@@ -644,9 +669,6 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 		}
 	}
 
-	if (eth_dev)
-		eth_dev->device = &dev->device;
-
 out_free:
 	rte_kvargs_free(kvlist);
 	rte_free(info);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v1 3/3] net/failsafe: fix default service proxy state
  2020-05-05 19:10       ` [dpdk-dev] [PATCH v1 0/3] failsafe & ring fixes Gaetan Rivet
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 1/3] net/failsafe: avoid crash on malformed eth_dev Gaetan Rivet
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 2/3] net/ring: fix eth_dev device pointer on allocation Gaetan Rivet
@ 2020-05-05 19:10         ` Gaetan Rivet
  2020-05-06  8:58           ` Ali Alnubani
  2020-05-06 17:16           ` Ferruh Yigit
  2 siblings, 2 replies; 23+ messages in thread
From: Gaetan Rivet @ 2020-05-05 19:10 UTC (permalink / raw)
  To: dev; +Cc: wangyunjian, Ali Alnubani

The service proxy is initialized at 0. This is assuming that all of
its fields are invalid at 0. The issue is that a file descriptor at 0 is a
valid one.

The value -1 is used as sentinel during cleanup. Initialize the RX proxy
file descriptor to -1.

Fixes: 366226dd859f ("net/failsafe: fix fd leak")
Signed-off-by: Gaetan Rivet <grive@u256.net>
Cc: wangyunjian@huawei.com
Cc: Ali Alnubani <alialnu@mellanox.com>
---
 drivers/net/failsafe/failsafe.c         | 1 +
 drivers/net/failsafe/failsafe_private.h | 8 ++++++++
 2 files changed, 9 insertions(+)

diff --git a/drivers/net/failsafe/failsafe.c b/drivers/net/failsafe/failsafe.c
index 8af31d71b..72362f35d 100644
--- a/drivers/net/failsafe/failsafe.c
+++ b/drivers/net/failsafe/failsafe.c
@@ -190,6 +190,7 @@ fs_eth_dev_create(struct rte_vdev_device *vdev)
 	}
 	priv = PRIV(dev);
 	priv->data = dev->data;
+	priv->rxp = FS_RX_PROXY_INIT;
 	dev->dev_ops = &failsafe_ops;
 	dev->data->mac_addrs = &PRIV(dev)->mac_addrs[0];
 	dev->data->dev_link = eth_link;
diff --git a/drivers/net/failsafe/failsafe_private.h b/drivers/net/failsafe/failsafe_private.h
index 8e9706aef..651578a12 100644
--- a/drivers/net/failsafe/failsafe_private.h
+++ b/drivers/net/failsafe/failsafe_private.h
@@ -58,6 +58,14 @@ struct rx_proxy {
 	enum rxp_service_state sstate;
 };
 
+#define FS_RX_PROXY_INIT (struct rx_proxy){ \
+	.efd = -1, \
+	.evec = NULL, \
+	.sid = 0, \
+	.scid = 0, \
+	.sstate = SS_NO_SERVICE, \
+}
+
 struct rxq {
 	struct fs_priv *priv;
 	uint16_t qid;
-- 
2.26.2


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v1 3/3] net/failsafe: fix default service proxy state
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 3/3] net/failsafe: fix default service proxy state Gaetan Rivet
@ 2020-05-06  8:58           ` Ali Alnubani
  2020-05-06 17:16           ` Ferruh Yigit
  1 sibling, 0 replies; 23+ messages in thread
From: Ali Alnubani @ 2020-05-06  8:58 UTC (permalink / raw)
  To: Gaetan Rivet, dev; +Cc: wangyunjian

Thanks Gaetan.

> -----Original Message-----
> From: Gaetan Rivet <grive@u256.net>
> Sent: Tuesday, May 5, 2020 10:11 PM
> To: dev@dpdk.org
> Cc: wangyunjian@huawei.com; Ali Alnubani <alialnu@mellanox.com>
> Subject: [PATCH v1 3/3] net/failsafe: fix default service proxy state
> 
> The service proxy is initialized at 0. This is assuming that all of its fields are
> invalid at 0. The issue is that a file descriptor at 0 is a valid one.
> 
> The value -1 is used as sentinel during cleanup. Initialize the RX proxy file
> descriptor to -1.
> 
> Fixes: 366226dd859f ("net/failsafe: fix fd leak")
> Signed-off-by: Gaetan Rivet <grive@u256.net>
> Cc: wangyunjian@huawei.com
> Cc: Ali Alnubani <alialnu@mellanox.com>

Tested-by: Ali Alnubani <alialnu@mellanox.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] net/ring: fix eth_dev device pointer on allocation
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 2/3] net/ring: fix eth_dev device pointer on allocation Gaetan Rivet
@ 2020-05-06 11:48           ` Ferruh Yigit
  2020-05-06 12:33             ` Gaëtan Rivet
  0 siblings, 1 reply; 23+ messages in thread
From: Ferruh Yigit @ 2020-05-06 11:48 UTC (permalink / raw)
  To: Gaetan Rivet, dev; +Cc: stable, thomas

On 5/5/2020 8:10 PM, Gaetan Rivet wrote:
> When a net_ring device is allocated, its device pointer is not set
> before calling rte_eth_dev_probing_finish, which is incorrect.
> 
> The following:
>   commit: 96cb19521147 ("net/ring: use EAL APIs in PMD specific API")
>   commit: a6992e961050 ("net/ring: set ethernet device field")
> 
> already attempted to fix this issue in 17.08, which was fine at the
> time. Adding the hook rte_eth_dev_probing_finish() however created this
> bug, as the eth_dev exposed when this hook is executed is expected to be
> complete.
> 
> Remove the prior attempts to fix the issue in rte_pmd_ring_probe() and
> write the pointer properly in do_eth_dev_ring_create().
> 
> Cc: stable@dpdk.org
> Fixes: fbe90cdd776c ("ethdev: add probing finish function")
> Cc: ferruh.yigit@intel.com
> Cc: thomas@monjalon.net
> Signed-off-by: Gaetan Rivet <grive@u256.net>

<...>

> @@ -325,10 +346,17 @@ do_eth_dev_ring_create(const char *name,
>  	data->kdrv = RTE_KDRV_NONE;
>  	data->numa_node = numa_node;
>  
> -	/* finally assign rx and tx ops */
> +	/* assign rx and tx ops */
>  	eth_dev->rx_pkt_burst = eth_ring_rx;
>  	eth_dev->tx_pkt_burst = eth_ring_tx;
>  
> +	/* finally set the rte_device pointer in eth_dev. */
> +	eth_dev->device = ring_device_from_name(name);
> +	if (eth_dev->device == NULL) {
> +		rte_errno = ENODEV;
> +		goto error;
> +	}
> +
>  	rte_eth_dev_probing_finish(eth_dev);
>  	*eth_dev_p = eth_dev;

Why not move the 'rte_eth_dev_probing_finish()' to the 'rte_pmd_ring_probe()',
below where 'eth_dev->device' set?

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v1 2/3] net/ring: fix eth_dev device pointer on allocation
  2020-05-06 11:48           ` Ferruh Yigit
@ 2020-05-06 12:33             ` Gaëtan Rivet
  2020-05-06 13:43               ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
  0 siblings, 1 reply; 23+ messages in thread
From: Gaëtan Rivet @ 2020-05-06 12:33 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev, stable, thomas

On 06/05/20 12:48 +0100, Ferruh Yigit wrote:
> On 5/5/2020 8:10 PM, Gaetan Rivet wrote:
> > When a net_ring device is allocated, its device pointer is not set
> > before calling rte_eth_dev_probing_finish, which is incorrect.
> > 
> > The following:
> >   commit: 96cb19521147 ("net/ring: use EAL APIs in PMD specific API")
> >   commit: a6992e961050 ("net/ring: set ethernet device field")
> > 
> > already attempted to fix this issue in 17.08, which was fine at the
> > time. Adding the hook rte_eth_dev_probing_finish() however created this
> > bug, as the eth_dev exposed when this hook is executed is expected to be
> > complete.
> > 
> > Remove the prior attempts to fix the issue in rte_pmd_ring_probe() and
> > write the pointer properly in do_eth_dev_ring_create().
> > 
> > Cc: stable@dpdk.org
> > Fixes: fbe90cdd776c ("ethdev: add probing finish function")
> > Cc: ferruh.yigit@intel.com
> > Cc: thomas@monjalon.net
> > Signed-off-by: Gaetan Rivet <grive@u256.net>
> 
> <...>
> 
> > @@ -325,10 +346,17 @@ do_eth_dev_ring_create(const char *name,
> >  	data->kdrv = RTE_KDRV_NONE;
> >  	data->numa_node = numa_node;
> >  
> > -	/* finally assign rx and tx ops */
> > +	/* assign rx and tx ops */
> >  	eth_dev->rx_pkt_burst = eth_ring_rx;
> >  	eth_dev->tx_pkt_burst = eth_ring_tx;
> >  
> > +	/* finally set the rte_device pointer in eth_dev. */
> > +	eth_dev->device = ring_device_from_name(name);
> > +	if (eth_dev->device == NULL) {
> > +		rte_errno = ENODEV;
> > +		goto error;
> > +	}
> > +
> >  	rte_eth_dev_probing_finish(eth_dev);
> >  	*eth_dev_p = eth_dev;
> 
> Why not move the 'rte_eth_dev_probing_finish()' to the 'rte_pmd_ring_probe()',
> below where 'eth_dev->device' set?

Hi Ferruh,

Sure it would work. The reason I did it this way is two-fold:

  * I disliked having two places where eth_dev->device was conditionally
    set. It makes it harder to read rte_pmd_ring_probe.

  * I was actually thinking, doing this patch, that we should modify
    rte_eth_dev_allocate() to take an rte_device as parameter, as all
    eth_dev are meant to be backed by an rte_device. Keeping this in
    mind, I meant to move writing the pointer closer to the
    rte_eth_dev_allocate() call.

But you are right that it is needlessly verbose, using
vdev_bus->find_device() to do this stuff. I'm ok with changing it as you
described if you prefer.

-- 
Gaëtan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v1 2/3] net/ring: fix eth_dev device pointer on allocation
  2020-05-06 12:33             ` Gaëtan Rivet
@ 2020-05-06 13:43               ` Ferruh Yigit
  2020-05-06 17:32                 ` Gaëtan Rivet
  0 siblings, 1 reply; 23+ messages in thread
From: Ferruh Yigit @ 2020-05-06 13:43 UTC (permalink / raw)
  To: Gaëtan Rivet; +Cc: dev, stable, thomas

On 5/6/2020 1:33 PM, Gaëtan Rivet wrote:
> On 06/05/20 12:48 +0100, Ferruh Yigit wrote:
>> On 5/5/2020 8:10 PM, Gaetan Rivet wrote:
>>> When a net_ring device is allocated, its device pointer is not set
>>> before calling rte_eth_dev_probing_finish, which is incorrect.
>>>
>>> The following:
>>>   commit: 96cb19521147 ("net/ring: use EAL APIs in PMD specific API")
>>>   commit: a6992e961050 ("net/ring: set ethernet device field")
>>>
>>> already attempted to fix this issue in 17.08, which was fine at the
>>> time. Adding the hook rte_eth_dev_probing_finish() however created this
>>> bug, as the eth_dev exposed when this hook is executed is expected to be
>>> complete.
>>>
>>> Remove the prior attempts to fix the issue in rte_pmd_ring_probe() and
>>> write the pointer properly in do_eth_dev_ring_create().
>>>
>>> Cc: stable@dpdk.org
>>> Fixes: fbe90cdd776c ("ethdev: add probing finish function")
>>> Cc: ferruh.yigit@intel.com
>>> Cc: thomas@monjalon.net
>>> Signed-off-by: Gaetan Rivet <grive@u256.net>
>>
>> <...>
>>
>>> @@ -325,10 +346,17 @@ do_eth_dev_ring_create(const char *name,
>>>  	data->kdrv = RTE_KDRV_NONE;
>>>  	data->numa_node = numa_node;
>>>  
>>> -	/* finally assign rx and tx ops */
>>> +	/* assign rx and tx ops */
>>>  	eth_dev->rx_pkt_burst = eth_ring_rx;
>>>  	eth_dev->tx_pkt_burst = eth_ring_tx;
>>>  
>>> +	/* finally set the rte_device pointer in eth_dev. */
>>> +	eth_dev->device = ring_device_from_name(name);
>>> +	if (eth_dev->device == NULL) {
>>> +		rte_errno = ENODEV;
>>> +		goto error;
>>> +	}
>>> +
>>>  	rte_eth_dev_probing_finish(eth_dev);
>>>  	*eth_dev_p = eth_dev;
>>
>> Why not move the 'rte_eth_dev_probing_finish()' to the 'rte_pmd_ring_probe()',
>> below where 'eth_dev->device' set?
> 
> Hi Ferruh,
> 
> Sure it would work. The reason I did it this way is two-fold:
> 
>   * I disliked having two places where eth_dev->device was conditionally
>     set. It makes it harder to read rte_pmd_ring_probe.

Agree, what about using a 'goto' to have the assignment and
'rte_eth_dev_probing_finish()' in a single place?
But check seems needed since creation may failed at that stage, if you think
better check can be done on 'ret' instead of 'eth_dev'...

> 
>   * I was actually thinking, doing this patch, that we should modify
>     rte_eth_dev_allocate() to take an rte_device as parameter, as all
>     eth_dev are meant to be backed by an rte_device. Keeping this in
>     mind, I meant to move writing the pointer closer to the
>     rte_eth_dev_allocate() call.

That is a bigger change, may affect many (if not all) PMDs, so I think this can
be considered when that change is available.

And although that change may fix the issues that 'eth_dev->device' is not set,
which we had a few times before, not sure it worth to change all PMDs and ethdev
API directly couple with rte_device, instead of PMD being the glue. Can be
discussed more on its own patch.

> 
> But you are right that it is needlessly verbose, using
> vdev_bus->find_device() to do this stuff. I'm ok with changing it as you
> described if you prefer.
> 

That was the concern, that is too much code to take a value which is already
available a few level up the stack.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v1 3/3] net/failsafe: fix default service proxy state
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 3/3] net/failsafe: fix default service proxy state Gaetan Rivet
  2020-05-06  8:58           ` Ali Alnubani
@ 2020-05-06 17:16           ` Ferruh Yigit
  1 sibling, 0 replies; 23+ messages in thread
From: Ferruh Yigit @ 2020-05-06 17:16 UTC (permalink / raw)
  To: Gaetan Rivet, dev; +Cc: wangyunjian, Ali Alnubani

On 5/5/2020 8:10 PM, Gaetan Rivet wrote:
> The service proxy is initialized at 0. This is assuming that all of
> its fields are invalid at 0. The issue is that a file descriptor at 0 is a
> valid one.
> 
> The value -1 is used as sentinel during cleanup. Initialize the RX proxy
> file descriptor to -1.
> 
> Fixes: 366226dd859f ("net/failsafe: fix fd leak")
> Signed-off-by: Gaetan Rivet <grive@u256.net>

I have squashed this to the commit in the fixes line, to not leave any commit
that can cause crash, and kept both sign-offs and Ali's test tag.

Squashed into relevant commit in next-net, thanks.


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v1 1/3] net/failsafe: avoid crash on malformed eth_dev
  2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 1/3] net/failsafe: avoid crash on malformed eth_dev Gaetan Rivet
@ 2020-05-06 17:16           ` Ferruh Yigit
  0 siblings, 0 replies; 23+ messages in thread
From: Ferruh Yigit @ 2020-05-06 17:16 UTC (permalink / raw)
  To: Gaetan Rivet, dev

On 5/5/2020 8:10 PM, Gaetan Rivet wrote:
> Some PMD do not respect the eth_dev API when allocating their
> rte_eth_dev. As a result, on device add event resulting from
> rte_eth_dev_probing_finish() call, the eth_dev processed is incomplete.
> 
> The segfault is a good way to focus the developer on the issue, but does
> not inspire confidence. Instead, warn the user of the error repeatedly.
> 
> The failsafe PMD can warn of the issue and continue. It will repeatedly
> attempt to initialize the failed port and complain about it, which
> should result in the same developer focus but with less crashing.
> 
> Signed-off-by: Gaetan Rivet <grive@u256.net>

Applied to dpdk-next-net/master, thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [dpdk-stable] [PATCH v1 2/3] net/ring: fix eth_dev device pointer on allocation
  2020-05-06 13:43               ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
@ 2020-05-06 17:32                 ` Gaëtan Rivet
  2020-05-06 18:09                   ` [dpdk-dev] [PATCH v2] " Gaetan Rivet
  0 siblings, 1 reply; 23+ messages in thread
From: Gaëtan Rivet @ 2020-05-06 17:32 UTC (permalink / raw)
  To: Ferruh Yigit; +Cc: dev, stable, thomas

On 06/05/20 14:43 +0100, Ferruh Yigit wrote:
> On 5/6/2020 1:33 PM, Gaëtan Rivet wrote:
> > On 06/05/20 12:48 +0100, Ferruh Yigit wrote:
> >> On 5/5/2020 8:10 PM, Gaetan Rivet wrote:
> >>> When a net_ring device is allocated, its device pointer is not set
> >>> before calling rte_eth_dev_probing_finish, which is incorrect.
> >>>
> >>> The following:
> >>>   commit: 96cb19521147 ("net/ring: use EAL APIs in PMD specific API")
> >>>   commit: a6992e961050 ("net/ring: set ethernet device field")
> >>>
> >>> already attempted to fix this issue in 17.08, which was fine at the
> >>> time. Adding the hook rte_eth_dev_probing_finish() however created this
> >>> bug, as the eth_dev exposed when this hook is executed is expected to be
> >>> complete.
> >>>
> >>> Remove the prior attempts to fix the issue in rte_pmd_ring_probe() and
> >>> write the pointer properly in do_eth_dev_ring_create().
> >>>
> >>> Cc: stable@dpdk.org
> >>> Fixes: fbe90cdd776c ("ethdev: add probing finish function")
> >>> Cc: ferruh.yigit@intel.com
> >>> Cc: thomas@monjalon.net
> >>> Signed-off-by: Gaetan Rivet <grive@u256.net>
> >>
> >> <...>
> >>
> >>> @@ -325,10 +346,17 @@ do_eth_dev_ring_create(const char *name,
> >>>  	data->kdrv = RTE_KDRV_NONE;
> >>>  	data->numa_node = numa_node;
> >>>  
> >>> -	/* finally assign rx and tx ops */
> >>> +	/* assign rx and tx ops */
> >>>  	eth_dev->rx_pkt_burst = eth_ring_rx;
> >>>  	eth_dev->tx_pkt_burst = eth_ring_tx;
> >>>  
> >>> +	/* finally set the rte_device pointer in eth_dev. */
> >>> +	eth_dev->device = ring_device_from_name(name);
> >>> +	if (eth_dev->device == NULL) {
> >>> +		rte_errno = ENODEV;
> >>> +		goto error;
> >>> +	}
> >>> +
> >>>  	rte_eth_dev_probing_finish(eth_dev);
> >>>  	*eth_dev_p = eth_dev;
> >>
> >> Why not move the 'rte_eth_dev_probing_finish()' to the 'rte_pmd_ring_probe()',
> >> below where 'eth_dev->device' set?
> > 
> > Hi Ferruh,
> > 
> > Sure it would work. The reason I did it this way is two-fold:
> > 
> >   * I disliked having two places where eth_dev->device was conditionally
> >     set. It makes it harder to read rte_pmd_ring_probe.
> 
> Agree, what about using a 'goto' to have the assignment and
> 'rte_eth_dev_probing_finish()' in a single place?
> But check seems needed since creation may failed at that stage, if you think
> better check can be done on 'ret' instead of 'eth_dev'...
> 
> > 
> >   * I was actually thinking, doing this patch, that we should modify
> >     rte_eth_dev_allocate() to take an rte_device as parameter, as all
> >     eth_dev are meant to be backed by an rte_device. Keeping this in
> >     mind, I meant to move writing the pointer closer to the
> >     rte_eth_dev_allocate() call.
> 
> That is a bigger change, may affect many (if not all) PMDs, so I think this can
> be considered when that change is available.
> 
> And although that change may fix the issues that 'eth_dev->device' is not set,
> which we had a few times before, not sure it worth to change all PMDs and ethdev
> API directly couple with rte_device, instead of PMD being the glue. Can be
> discussed more on its own patch.
> 
> > 
> > But you are right that it is needlessly verbose, using
> > vdev_bus->find_device() to do this stuff. I'm ok with changing it as you
> > described if you prefer.
> > 
> 
> That was the concern, that is too much code to take a value which is already
> available a few level up the stack.

Ok, future-proofing is a bad habit so let's not do it.

I'm not a fan of the goto for the 'happy path' though. Another simple
way would be to bring the vdev pointer with me as we go down the stack.

I will send a v2 shortly, if it's too ugly to move the pointer down I'll
use the goto, or if you tell me you'd prefer it.

-- 
Gaëtan

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [dpdk-dev] [PATCH v2] net/ring: fix eth_dev device pointer on allocation
  2020-05-06 17:32                 ` Gaëtan Rivet
@ 2020-05-06 18:09                   ` Gaetan Rivet
  2020-05-08 11:00                     ` Ferruh Yigit
  0 siblings, 1 reply; 23+ messages in thread
From: Gaetan Rivet @ 2020-05-06 18:09 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, stable

When a net_ring device is allocated, its device pointer is not set
before calling rte_eth_dev_probing_finish, which is incorrect.

The following:
  commit: 96cb19521147 ("net/ring: use EAL APIs in PMD specific API")
  commit: a6992e961050 ("net/ring: set ethernet device field")

already fixed the same issue in 17.08, which was fine at the time.
Adding the hook rte_eth_dev_probing_finish() however created this bug,
as the eth_dev exposed when this hook is executed is expected to be
complete.

Remove the prior attempts to fix the issue in rte_pmd_ring_probe() and
write the pointer properly in do_eth_dev_ring_create().

Cc: stable@dpdk.org
Fixes: fbe90cdd776c ("ethdev: add probing finish function")
Signed-off-by: Gaetan Rivet <grive@u256.net>
---
 drivers/net/ring/rte_eth_ring.c | 29 +++++++++++++++--------------
 1 file changed, 15 insertions(+), 14 deletions(-)

diff --git a/drivers/net/ring/rte_eth_ring.c b/drivers/net/ring/rte_eth_ring.c
index 41acbc513..f0fafa0c0 100644
--- a/drivers/net/ring/rte_eth_ring.c
+++ b/drivers/net/ring/rte_eth_ring.c
@@ -246,6 +246,7 @@ static const struct eth_dev_ops ops = {
 
 static int
 do_eth_dev_ring_create(const char *name,
+		struct rte_vdev_device *vdev,
 		struct rte_ring * const rx_queues[],
 		const unsigned int nb_rx_queues,
 		struct rte_ring *const tx_queues[],
@@ -291,12 +292,15 @@ do_eth_dev_ring_create(const char *name,
 	}
 
 	/* now put it all together
+	 * - store EAL device in eth_dev,
 	 * - store queue data in internals,
 	 * - store numa_node info in eth_dev_data
 	 * - point eth_dev_data to internals
 	 * - and point eth_dev structure to new eth_dev_data structure
 	 */
 
+	eth_dev->device = &vdev->device;
+
 	data = eth_dev->data;
 	data->rx_queues = rx_queues_local;
 	data->tx_queues = tx_queues_local;
@@ -408,7 +412,9 @@ rte_eth_from_ring(struct rte_ring *r)
 }
 
 static int
-eth_dev_ring_create(const char *name, const unsigned int numa_node,
+eth_dev_ring_create(const char *name,
+		struct rte_vdev_device *vdev,
+		const unsigned int numa_node,
 		enum dev_action action, struct rte_eth_dev **eth_dev)
 {
 	/* rx and tx are so-called from point of view of first port.
@@ -438,7 +444,7 @@ eth_dev_ring_create(const char *name, const unsigned int numa_node,
 			return -1;
 	}
 
-	if (do_eth_dev_ring_create(name, rxtx, num_rings, rxtx, num_rings,
+	if (do_eth_dev_ring_create(name, vdev, rxtx, num_rings, rxtx, num_rings,
 		numa_node, action, eth_dev) < 0)
 		return -1;
 
@@ -560,12 +566,12 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 	PMD_LOG(INFO, "Initializing pmd_ring for %s", name);
 
 	if (params == NULL || params[0] == '\0') {
-		ret = eth_dev_ring_create(name, rte_socket_id(), DEV_CREATE,
+		ret = eth_dev_ring_create(name, dev, rte_socket_id(), DEV_CREATE,
 				&eth_dev);
 		if (ret == -1) {
 			PMD_LOG(INFO,
 				"Attach to pmd_ring for %s", name);
-			ret = eth_dev_ring_create(name, rte_socket_id(),
+			ret = eth_dev_ring_create(name, dev, rte_socket_id(),
 						  DEV_ATTACH, &eth_dev);
 		}
 	} else {
@@ -574,19 +580,16 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 		if (!kvlist) {
 			PMD_LOG(INFO,
 				"Ignoring unsupported parameters when creatingrings-backed ethernet device");
-			ret = eth_dev_ring_create(name, rte_socket_id(),
+			ret = eth_dev_ring_create(name, dev, rte_socket_id(),
 						  DEV_CREATE, &eth_dev);
 			if (ret == -1) {
 				PMD_LOG(INFO,
 					"Attach to pmd_ring for %s",
 					name);
-				ret = eth_dev_ring_create(name, rte_socket_id(),
+				ret = eth_dev_ring_create(name, dev, rte_socket_id(),
 							  DEV_ATTACH, &eth_dev);
 			}
 
-			if (eth_dev)
-				eth_dev->device = &dev->device;
-
 			return ret;
 		}
 
@@ -597,7 +600,7 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 			if (ret < 0)
 				goto out_free;
 
-			ret = do_eth_dev_ring_create(name,
+			ret = do_eth_dev_ring_create(name, dev,
 				internal_args->rx_queues,
 				internal_args->nb_rx_queues,
 				internal_args->tx_queues,
@@ -627,6 +630,7 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 
 			for (info->count = 0; info->count < info->total; info->count++) {
 				ret = eth_dev_ring_create(info->list[info->count].name,
+							  dev,
 							  info->list[info->count].node,
 							  info->list[info->count].action,
 							  &eth_dev);
@@ -635,7 +639,7 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 					PMD_LOG(INFO,
 						"Attach to pmd_ring for %s",
 						name);
-					ret = eth_dev_ring_create(name,
+					ret = eth_dev_ring_create(name, dev,
 							info->list[info->count].node,
 							DEV_ATTACH,
 							&eth_dev);
@@ -644,9 +648,6 @@ rte_pmd_ring_probe(struct rte_vdev_device *dev)
 		}
 	}
 
-	if (eth_dev)
-		eth_dev->device = &dev->device;
-
 out_free:
 	rte_kvargs_free(kvlist);
 	rte_free(info);
-- 
2.26.2


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v2] net/ring: fix eth_dev device pointer on allocation
  2020-05-06 18:09                   ` [dpdk-dev] [PATCH v2] " Gaetan Rivet
@ 2020-05-08 11:00                     ` Ferruh Yigit
  2020-05-11 16:54                       ` Ferruh Yigit
  0 siblings, 1 reply; 23+ messages in thread
From: Ferruh Yigit @ 2020-05-08 11:00 UTC (permalink / raw)
  To: Gaetan Rivet; +Cc: dev, stable, Bruce Richardson

On 5/6/2020 7:09 PM, Gaetan Rivet wrote:
> When a net_ring device is allocated, its device pointer is not set
> before calling rte_eth_dev_probing_finish, which is incorrect.
> 
> The following:
>   commit: 96cb19521147 ("net/ring: use EAL APIs in PMD specific API")
>   commit: a6992e961050 ("net/ring: set ethernet device field")
> 
> already fixed the same issue in 17.08, which was fine at the time.
> Adding the hook rte_eth_dev_probing_finish() however created this bug,
> as the eth_dev exposed when this hook is executed is expected to be
> complete.
> 
> Remove the prior attempts to fix the issue in rte_pmd_ring_probe() and
> write the pointer properly in do_eth_dev_ring_create().
> 
> Cc: stable@dpdk.org
> Fixes: fbe90cdd776c ("ethdev: add probing finish function")
> Signed-off-by: Gaetan Rivet <grive@u256.net>

I would prefer moving the assignment up in the stack where 'device' is
available, instead of moving the variable down in the stack to assign it, but
both does the work ...

Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>


^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [dpdk-dev] [PATCH v2] net/ring: fix eth_dev device pointer on allocation
  2020-05-08 11:00                     ` Ferruh Yigit
@ 2020-05-11 16:54                       ` Ferruh Yigit
  0 siblings, 0 replies; 23+ messages in thread
From: Ferruh Yigit @ 2020-05-11 16:54 UTC (permalink / raw)
  To: Gaetan Rivet; +Cc: dev, stable, Bruce Richardson

On 5/8/2020 12:00 PM, Ferruh Yigit wrote:
> On 5/6/2020 7:09 PM, Gaetan Rivet wrote:
>> When a net_ring device is allocated, its device pointer is not set
>> before calling rte_eth_dev_probing_finish, which is incorrect.
>>
>> The following:
>>   commit: 96cb19521147 ("net/ring: use EAL APIs in PMD specific API")
>>   commit: a6992e961050 ("net/ring: set ethernet device field")
>>
>> already fixed the same issue in 17.08, which was fine at the time.
>> Adding the hook rte_eth_dev_probing_finish() however created this bug,
>> as the eth_dev exposed when this hook is executed is expected to be
>> complete.
>>
>> Remove the prior attempts to fix the issue in rte_pmd_ring_probe() and
>> write the pointer properly in do_eth_dev_ring_create().
>>
>> Cc: stable@dpdk.org
>> Fixes: fbe90cdd776c ("ethdev: add probing finish function")
>> Signed-off-by: Gaetan Rivet <grive@u256.net>
> 
> I would prefer moving the assignment up in the stack where 'device' is
> available, instead of moving the variable down in the stack to assign it, but
> both does the work ...
> 
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
> 

Applied to dpdk-next-net/master, thanks.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2020-05-11 16:54 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-04-27 10:44 [dpdk-dev] [PATCH] net/failsafe: fix fd leak wangyunjian
2020-04-27 11:12 ` Gaëtan Rivet
2020-04-27 16:55   ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
2020-05-03 11:33     ` Ali Alnubani
2020-05-04 16:22       ` Gaëtan Rivet
2020-05-04 16:28         ` Stephen Hemminger
2020-05-05  9:47           ` Ali Alnubani
2020-05-05  9:14         ` Ali Alnubani
2020-05-05 18:35           ` Gaëtan Rivet
2020-05-05 19:10       ` [dpdk-dev] [PATCH v1 0/3] failsafe & ring fixes Gaetan Rivet
2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 1/3] net/failsafe: avoid crash on malformed eth_dev Gaetan Rivet
2020-05-06 17:16           ` Ferruh Yigit
2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 2/3] net/ring: fix eth_dev device pointer on allocation Gaetan Rivet
2020-05-06 11:48           ` Ferruh Yigit
2020-05-06 12:33             ` Gaëtan Rivet
2020-05-06 13:43               ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
2020-05-06 17:32                 ` Gaëtan Rivet
2020-05-06 18:09                   ` [dpdk-dev] [PATCH v2] " Gaetan Rivet
2020-05-08 11:00                     ` Ferruh Yigit
2020-05-11 16:54                       ` Ferruh Yigit
2020-05-05 19:10         ` [dpdk-dev] [PATCH v1 3/3] net/failsafe: fix default service proxy state Gaetan Rivet
2020-05-06  8:58           ` Ali Alnubani
2020-05-06 17:16           ` Ferruh Yigit

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git