DPDK patches and discussions
 help / color / mirror / Atom feed
From: Ferruh Yigit <ferruh.yigit@intel.com>
To: Andrew Rybchenko <arybchenko@solarflare.com>,
	Thomas Monjalon <thomas@monjalon.net>
Cc: 方统浩50450 <fangtonghao@sangfor.com.cn>,
	dev@dpdk.org, stable@dpdk.org, jia.guo@intel.com,
	cunming.liang@intel.com, qi.z.zhang@intel.com,
	jungle845943968@outlook.com,
	"Jerin Jacob Kollanukkaran" <jerinj@marvell.com>
Subject: Re: [dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory
Date: Thu, 16 Jan 2020 09:04:09 +0000	[thread overview]
Message-ID: <98e5d683-bd7c-6f4a-e8f9-216a01798ac8@intel.com> (raw)
In-Reply-To: <58542797-ae4c-9882-b777-60ee704de9dc@solarflare.com>

On 1/16/2020 7:43 AM, Andrew Rybchenko wrote:
> On 1/15/20 11:43 PM, Thomas Monjalon wrote:
>> 15/01/2020 19:35, Ferruh Yigit:
>>> On 1/15/2020 6:49 AM, 方统浩50450 wrote:
>>>> Hi Ferruh, thanks for your message.
>>>>
>>>>
>>>> We developed a ethtool-dpdk which is secondary process based dpdk 17.08 version. Our device
>>>> support hotplug detach, but hotplug deatch is failed when we use ethtool-dpdk.We found the
>>>> secondary process will change the shared memory when initializing.Secondary process calls
>>>> "rte_eth_dev_pci_allocate" function and enters "rte_eth_copy_pci_info" function.
>>>> (rte_eth_dev_pci_generic_probe -> rte_eth_dev_pci_allocate -> rte_eth_copy_pci_info)
>>>> Then it sets the value of struct "rte_eth_dev_data.dev_flags" to zero.In our platform, this value
>>>> is equal to 0x0003.(RTE_ETH_DEV_DETACHABLE | RTE_ETH_DEV_INTR_LSC),but after reset
>>>> the "dev_flags", the value changed to 0x0002.(RTE_ETH_DEV_DETACHABLE).So, our device hotplug
>>>> detach is failed.I found the similar problem in other dpdk version, include dpdk 19.11.Even though
>>>> the deivce hotplug detach is discarded,but i think the shared memory changed is unexpected by primary
>>>> process.
> 
> Hold on, just for my understanding. As far as I can see
> RTE_ETH_DEV_DETACHABLE was removed in 17.11. Does it
> change something in above description?

Overall secondary overwrites primary values, I think we should fix it
independent from the flags involved.

> 
>>> I agree this is the problem.
>>> In the driver code, 'rte_eth_copy_pci_info' is called only by primary process,
>>>
>>> but the generic code is faulty.
>>>
>>> And in 19.11 additionally 'eth_dev_pci_specific_init' also seems has same problem.
> 
> Yes, as I understand RTE_ETH_DEV_CLOSE_REMOVE,
> RTE_ETH_DEV_BONDED_SLAVE, RTE_ETH_DEV_REPRESENTOR and
> RTE_ETH_DEV_NOLIVE_MAC_ADDR may be lost because of
> reinit (if not restored in other branches). Bad anyway.
> 
>>>> Our driver is ixgbe, i think this problem has a little relationship with driver, Secondary process
>>>> enters "rte_eth_copy_pci_info" by "rte_eth_dev_pci_allocate".And I agree your opinion, the helper
>>>> function should simple on what it does.I have two ways to fix this problem, one is add an if-statement
>>>>
>>>> in "rte_eth_dev_pci_allocate" function to forbid secondary process enters "rte_eth_copy_pci_info" function,
>>>> another way is add an if-statement in "rte_eth_copy_pci_info" function to forbid secondary process change
>>>> shared memory.And First way need to ensure the "rte_eth_copy_pci_info" function won't be called anywhere else.
>>>> I think the second way is simple and lower risk.
>>>
>>> Yes these are the two options.
>>>
>>> I agree adding check in the 'rte_eth_copy_pci_info' covers all cases and safer.
>>> BUT my concern was adding decision making to simple/leaf function and make it
>>> harder to debug/use, instead of giving what primary/secondary process should
>>> call decision in higher level.
>>>
>>> But I just recognized that some PMDs are calling 'rte_eth_copy_pci_info' on
>>> secondary process, like mlx4 or szedata2, and most probably this is not their
>>> intention.
>>> And 'eth_dev->intr_handle' set in 'rte_eth_copy_pci_info', not calling this
>>> function may have side affect of 'eth_dev->intr_handle' not set in secondary.
>>>
>>> With above considerations I am OK to your proposal to cover all cases, Thomas,
>>> Andrew, any concern?
> 
> I would put if condition in rte_eth_copy_pci_info().
> It is the function which writes shared space from
> secondary process when it should not be done and it
> should be fixed there.

OK

> 
>> Do you mean drivers need to be fixed?
> 
> I'm not sure that I fully understand it. Since copy function
> cares about intr_handle copying I'm afraid that it is not
> 100% correct to skip it in secondary process completely as
> many drivers do right now. Basically it makes eth_dev structure
> in secondary process inconsistent. However, it looks like
> most of these drivers simply obtain handle from pci_dev
> directly and it explains why they are not affected.
> There are exceptions which are potentially bugs, e.g.
> drivers/net/ice/ice_ethdev.c: ice_interrupt_handler at the end.
> 
> I think that it would be better if intr_handle is always
> correct in eth_dev (both primary and secondary cases) and
> drivers use it instead of the same from pci_dev.
> 

OK

So this suggest going on with Fang's patch. I only requested an additional note
in function comment related to this secondary check.

  reply	other threads:[~2020-01-16  9:04 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-01-09 12:27 [dpdk-dev] [PATCH] ethdev: fix secondary process change share memory Fang TongHao
2020-01-10  7:30 ` Jeff Guo
2020-01-10  7:53   ` 方统浩50450
2020-01-13  5:03 ` [dpdk-dev] [PATCH v2] Fixes: ethdev: secondary process change shared memory Fang TongHao
2020-01-14 14:45   ` Ferruh Yigit
2020-01-15  6:49     ` 方统浩50450
2020-01-15 18:35       ` Ferruh Yigit
2020-01-15 20:43         ` Thomas Monjalon
2020-01-16  7:43           ` Andrew Rybchenko
2020-01-16  9:04             ` Ferruh Yigit [this message]
2020-01-16 11:35               ` 方统浩50450
2020-01-16 12:18                 ` Ferruh Yigit
2020-01-17  2:11                   ` 方统浩50450
2020-01-16  9:00           ` Ferruh Yigit
2020-01-17  2:08   ` [dpdk-dev] [PATCH v3] " Fang TongHao
2020-01-17  8:33     ` Andrew Rybchenko
2020-01-17 17:58       ` Ferruh Yigit

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=98e5d683-bd7c-6f4a-e8f9-216a01798ac8@intel.com \
    --to=ferruh.yigit@intel.com \
    --cc=arybchenko@solarflare.com \
    --cc=cunming.liang@intel.com \
    --cc=dev@dpdk.org \
    --cc=fangtonghao@sangfor.com.cn \
    --cc=jerinj@marvell.com \
    --cc=jia.guo@intel.com \
    --cc=jungle845943968@outlook.com \
    --cc=qi.z.zhang@intel.com \
    --cc=stable@dpdk.org \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).