DPDK patches and discussions
 help / color / Atom feed
From: Matan Azrad <matan@mellanox.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>,
	"Yigit, Ferruh" <ferruh.yigit@linux.intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	Bernard Iremonger <bernard.iremonger@intel.com>
Cc: Gaetan Rivet <gaetan.rivet@6wind.com>,
	Thomas Monjalon <thomas@monjalon.net>,
	"stable@dpdk.org" <stable@dpdk.org>,
	David Marchand <david.marchand@redhat.com>,
	Jeff Guo <jia.guo@intel.com>, Qi Zhang <qi.z.zhang@intel.com>
Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH 2/2] app/testpmd: fix invalid port detaching
Date: Thu, 23 Jan 2020 15:29:20 +0000
Message-ID: <AM0PR0502MB4019D21A3F45C52C8AFA88C2D20F0@AM0PR0502MB4019.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <19a86d69-9bcc-42c9-b000-98b3860de42f@intel.com>


Hi

From: Ferruh Yigit
> On 1/23/2020 2:05 PM, Matan Azrad wrote:
> > Hi
> >
> > From: Yigit, Ferruh
> >> On 11/12/2019 8:47 AM, Matan Azrad wrote:
> >>> The port was not validated before detaching.
> >>>
> >>> Ignore port detach operation when the port is not valid.
> >>>
> >>> Fixes: f8e5baa2662d ("app/testpmd: check not detaching device
> >>> twice")
> >>> Cc: thomas@monjalon.net
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Matan Azrad <matan@mellanox.com>
> >>> ---
> >>>  app/test-pmd/testpmd.c | 3 +++
> >>>  1 file changed, 3 insertions(+)
> >>>
> >>> diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index
> >>> 4444346..370eefe 100644
> >>> --- a/app/test-pmd/testpmd.c
> >>> +++ b/app/test-pmd/testpmd.c
> >>> @@ -2545,6 +2545,9 @@ struct extmem_param {
> >>>
> >>>  	printf("Removing a device...\n");
> >>>
> >>> +	if (port_id_is_invalid(port_id, ENABLED_WARN))
> >>> +		return;
> >>> +
> >>>  	dev = rte_eth_devices[port_id].device;
> >>>  	if (dev == NULL) {
> >>>  		printf("Device already removed\n");
> >>>
> >>
> >> The patch is already in 19.11 [1] but it is breaking the testpmd
> >> hotplug support.
> >> Before 'detach_port_device()' called, the port has been stopped and
> >> closed [2], which will make port fail from 'port_id_is_invalid()'
> >> check and the device removal path never fully called.
> >> The implication is, since device not detached, vfio request interrupt
> >> keeps triggered continuously and re-starts the detach path, but
> >> because of the half cleaned device it fails and app gets stuck with a
> continuous log [3].
> >>
> >> I wonder if the actual hotplug has been tested with this patch, the
> >> commit log is not clear about the motivation and implication of the
> >> patch, I am not clear why this check is added but I am sending a
> >> patch soon to remove it back.
> >
> > The motivation of this patch was to prevent double detach on same port,
> so the user cannot call detach of invalid port.
> 
> What is the definition of the 'invalid port', if you mean device already
> detached case, in the second call of the function "if (dev == NULL)" check
> should prevent it going forward.

No, ethdev doesn't zero the device pointer when it release a port.
So even if the port is in unused state already - means invalid, the device pointer still may be valid and point to the last port that used the same id.


> But according the 'port_id_is_invalid()' API, a closed port is an invalid port, I
> think that is wrong in this context.

Why?

You are going to look on ethdev portid structure, don't you think we should valid the port before using its structure?


> >
> > I agree this patch is not good and we need a fix but I think the bug is
> conceptual.
> >
> > Testpmd tries to do detach by port_id which is derived by ethdev port id
> while detach work with rte_device.
> >
> > For example:
> > you can see in the line above after +++: dev =
> > rte_eth_devices[port_id].device, Testpmd may access invalid  or
> reallocated ethdev structure to get the device name and may even detach
> unwanted rte_device.
> 
> I thinks whichever function calling 'detach_port_device()' should check the
> port validity.
> 'detach_port_device()' doesn't know if port reallocated or not, it will free the
> given port_id, and when freeing done 'rte_eth_devices[port_id].device' will
> be NULL, this looks to me a valid check.

Please validate me, check ethdev, I don't think so, 'rte_eth_devices[port_id].device still valid after detach.

> The caller of the 'detach_port_device()' should ensure correct port_id
> passed to the function.

What is correct port id, if the port was released , is it correct?

> >
> > So, detach is broken with and without this patch.
> 
> I can't see how it is broken without the check, how the problem you
> mentioned can be reproduced? Or is it a theoretical issue?
> But with this check hotplug support is %100 reproducible broken.
> 
> >
> >
> > I think Testpmd should change the concept of rte_device mapping and put
> attention to next:
> > 1. Don't detach by ethdev port ID.
> > 2. Multiple ethdev port IDs may related to the same rte_device.
> >
> > The Testpmd user should be sure that all the port IDs of the rte_device are
> released before the detach call and Testpmd maybe need to validate it.
> > And like attach, detach should be triggered by PCI address \ rte_device
> name.
> >
> 
> We need to know about port_id too to be able to stop/close it.
> And sure no objection to improve the hotplug support but it is broken now,
> lets fix it first.
> 
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >> Regards,
> >> ferruh
> >>
> >>
> >> [1]
> >> https://eur03.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgit
> >> .dp
> >>
> dk.org%2Fdpdk%2Fcommit%2F%3Fid%3D43d0e304980a1527bcac92dc679057
> >>
> b189e2545a&amp;data=02%7C01%7Cmatan%40mellanox.com%7Cc3f40356d
> >>
> d124e20faf708d7a006e68c%7Ca652971c7d2e4d9ba6a4d149256f461b%7C0%7
> >>
> C0%7C637153823809699996&amp;sdata=dBy9m%2BxCA%2Bme1IpX2LqPARa
> >> 62giznKi8Xbtu220GA%2Bg%3D&amp;reserved=0
> >>
> >> [2]
> >> rmv_port_callback
> >>   stop_port(port_id);
> >>   close_port(port_id);
> >>   detach_port_device(port_id);
> >>
> >> [3]
> >> EAL: can not get port by device 0000:00:05.0!
> >> EAL: can not get port by device 0000:00:05.0!
> >> EAL: can not get port by device 0000:00:05.0!
> >> EAL: can not get port by device 0000:00:05.0!
> >> EAL: can not get port by device 0000:00:05.0!
> >> EAL: can not get port by device 0000:00:05.0!
> >> ...


  reply index

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-12  8:47 [dpdk-dev] [PATCH 1/2] bus/pci: fix driver detach clear Matan Azrad
2019-11-12  8:47 ` [dpdk-dev] [PATCH 2/2] app/testpmd: fix invalid port detaching Matan Azrad
2019-11-12 11:20   ` Iremonger, Bernard
2019-11-20 22:52     ` [dpdk-dev] [dpdk-stable] " David Marchand
2020-01-23 13:19   ` [dpdk-dev] " Yigit, Ferruh
2020-01-23 14:05     ` Matan Azrad
2020-01-23 14:48       ` [dpdk-dev] [dpdk-stable] " Ferruh Yigit
2020-01-23 15:29         ` Matan Azrad [this message]
2020-01-23 18:14           ` Ferruh Yigit
2020-01-23 19:25             ` Matan Azrad
2020-01-24 16:28               ` Ferruh Yigit
2020-01-25 18:56                 ` Matan Azrad
2020-02-03 15:58                   ` Ferruh Yigit
2020-02-03 17:10                     ` Matan Azrad
2020-02-12 13:49                       ` Ferruh Yigit
2020-02-13 12:37                         ` Thomas Monjalon
2020-02-13 13:36                           ` Thomas Monjalon
2020-02-13 14:00                             ` Ferruh Yigit
2019-11-19 22:40 ` [dpdk-dev] [dpdk-stable] [PATCH 1/2] bus/pci: fix driver detach clear Thomas Monjalon
2019-11-20  9:02   ` Matan Azrad
2019-11-20  9:47 ` [dpdk-dev] [PATCH v2] " Matan Azrad
2019-11-20 13:03   ` [dpdk-dev] [dpdk-stable] " David Marchand
2019-11-20 13:44     ` Matan Azrad
2019-11-20 13:51     ` Thomas Monjalon
2019-11-20 17:22       ` David Marchand
2019-11-20 22:52   ` David Marchand

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AM0PR0502MB4019D21A3F45C52C8AFA88C2D20F0@AM0PR0502MB4019.eurprd05.prod.outlook.com \
    --to=matan@mellanox.com \
    --cc=bernard.iremonger@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=ferruh.yigit@linux.intel.com \
    --cc=gaetan.rivet@6wind.com \
    --cc=jia.guo@intel.com \
    --cc=qi.z.zhang@intel.com \
    --cc=stable@dpdk.org \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

Archives are clonable:
	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev


Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox