From: Ferruh Yigit <ferruh.yigit@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>
Cc: "Roberts, Lee A." <lee.roberts@hpe.com>,
"Tan, Jianfeng" <jianfeng.tan@intel.com>,
Thomas Monjalon <thomas@monjalon.net>,
"dev@dpdk.org" <dev@dpdk.org>,
"stable@dpdk.org" <stable@dpdk.org>,
"Wu, Jingjing" <jingjing.wu@intel.com>,
Shijith Thotton <shijith.thotton@caviumnetworks.com>,
Gregory Etelson <gregory@weka.io>,
Harish Patil <harish.patil@cavium.com>,
George Prekas <george.prekas@epfl.ch>,
"Gonzalez Monroy, Sergio" <sergio.gonzalez.monroy@intel.com>,
Rasesh Mody <rasesh.mody@cavium.com>
Subject: Re: [dpdk-dev] [PATCH v2] igb_uio: add config option to control reset
Date: Mon, 6 Nov 2017 10:41:15 -0800
Message-ID: <d92e5567-c2f1-0a9f-4688-df417c2ffe26@intel.com>
In-Reply-To: <CAOaVG17r0KKKU90G_R75f0XNYC7tV9yM+Fu1RGK2Jm+UqNRu9Q@mail.gmail.com>
On 11/4/2017 3:08 AM, Stephen Hemminger wrote:
>
>
> On Nov 4, 2017 01:03, "Ferruh Yigit" <ferruh.yigit@intel.com> wrote:
>
> On 11/3/2017 12:42 PM, Roberts, Lee A. wrote:
> >
> >
> >> -----Original Message-----
> >> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Tan, Jianfeng
> >> Sent: Thursday, November 02, 2017 8:57 PM
> >> To: Ferruh Yigit <ferruh.yigit@intel.com>; Thomas Monjalon <thomas@monjalon.net>
> >> Cc: dev@dpdk.org; stable@dpdk.org; Jingjing Wu <jingjing.wu@intel.com>; Shijith Thotton
> >> <shijith.thotton@caviumnetworks.com>; Gregory Etelson <gregory@weka.io>; Harish Patil
> >> <harish.patil@cavium.com>; George Prekas <george.prekas@epfl.ch>; Sergio Gonzalez Monroy
> >> <sergio.gonzalez.monroy@intel.com>; Rasesh Mody <rasesh.mody@cavium.com>
> >> Subject: Re: [dpdk-dev] [PATCH v2] igb_uio: add config option to control reset
> >>
> >>
> >>
> >> On 11/3/2017 8:51 AM, Ferruh Yigit wrote:
> >>> Add a compile-time configuration option to control the device reset done
> >>> during DPDK application exit.
> >>>
> >>> The config option is CONFIG_RTE_EAL_IGB_UIO_RESET and is enabled by default,
> >>> so by default the reset will happen. Keeping this reset is safer, since it
> >>> ensures the device is left in a proper state.
> >>>
> >>> But for special cases [1] it is possible to disable the config option
> >>> to prevent the device reset.
> >>>
> >>> [1]
> >>> http://dpdk.org/ml/archives/dev/2017-November/080927.html
> >>>
> >>> Fixes: b58eedfc7dd5 ("igb_uio: issue FLR during open and release of device file")
> >>> Cc: stable@dpdk.org
> >>>
> >>> Signed-off-by: Ferruh Yigit <ferruh.yigit@intel.com>
> >>
> >> I realize that we already have a pci_clear_master() in release() to
> >> disable DMA from the device until the next open() enables it again.
> >> Here is my:
> >>
> >> Reviewed-by: Jianfeng Tan <jianfeng.tan@intel.com>
> >>
> >> Thanks,
> >> Jianfeng
> >>
> >>> ---
> >>> Cc: Jianfeng Tan <jianfeng.tan@intel.com>
> >>> Cc: Jingjing Wu <jingjing.wu@intel.com>
> >>> Cc: Shijith Thotton <shijith.thotton@caviumnetworks.com>
> >>> Cc: Gregory Etelson <gregory@weka.io>
> >>> Cc: Harish Patil <harish.patil@cavium.com>
> >>> Cc: George Prekas <george.prekas@epfl.ch>
> >>> Cc: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> >>> Cc: Rasesh Mody <rasesh.mody@cavium.com>
> >>>
> >>> v2:
> >>> * fix typo in commit log
> >>> ---
> >>> config/common_base | 1 +
> >>> config/common_linuxapp | 1 +
> >>> lib/librte_eal/linuxapp/igb_uio/igb_uio.c | 2 ++
> >>> 3 files changed, 4 insertions(+)
> >>>
> >>> diff --git a/config/common_base b/config/common_base
> >>> index 82ee75456..2a9947420 100644
> >>> --- a/config/common_base
> >>> +++ b/config/common_base
> >>> @@ -102,6 +102,7 @@ CONFIG_RTE_LIBEAL_USE_HPET=n
> >>> CONFIG_RTE_EAL_ALLOW_INV_SOCKET_ID=n
> >>> CONFIG_RTE_EAL_ALWAYS_PANIC_ON_ERROR=n
> >>> CONFIG_RTE_EAL_IGB_UIO=n
> >>> +CONFIG_RTE_EAL_IGB_UIO_RESET=n
> >>> CONFIG_RTE_EAL_VFIO=n
> >>> CONFIG_RTE_MALLOC_DEBUG=n
> >>> CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=n
> >>> diff --git a/config/common_linuxapp b/config/common_linuxapp
> >>> index 74c7d64ec..b3a602909 100644
> >>> --- a/config/common_linuxapp
> >>> +++ b/config/common_linuxapp
> >>> @@ -37,6 +37,7 @@ CONFIG_RTE_EXEC_ENV_LINUXAPP=y
> >>>
> >>> CONFIG_RTE_EAL_NUMA_AWARE_HUGEPAGES=y
> >>> CONFIG_RTE_EAL_IGB_UIO=y
> >>> +CONFIG_RTE_EAL_IGB_UIO_RESET=y
> >>> CONFIG_RTE_EAL_VFIO=y
> >>> CONFIG_RTE_KNI_KMOD=y
> >>> CONFIG_RTE_LIBRTE_KNI=y
> >>> diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> >>> index fd320d87d..0325722c0 100644
> >>> --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> >>> +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> >>> @@ -360,7 +360,9 @@ igbuio_pci_release(struct uio_info *info, struct inode *inode)
> >>> /* stop the device from further DMA */
> >>> pci_clear_master(dev);
> >>>
> >>> +#ifdef RTE_EAL_IGB_UIO_RESET
> >>> pci_reset_function(dev);
> >>> +#endif
> >>>
> >>> return 0;
> >>> }
> >
> > A compile time configuration option makes life very difficult for application providers.
> >
> > Consider the case where an application such as Open vSwitch with DPDK support is being provided
> > with a Linux distribution. One would want the Open vSwitch binary to support as many vendor NICs
> > as possible---without the need to recompile. With a change such as this, one would need to have
> > different versions of the kernel igb_uio module to support different NICs.
>
> Agreed, I am against adding more compile-time options, although I have ended up
> sending a few of them these days.
>
> > The Linux kernel is already aware of, and provides work-arounds for, various PCI quirks.
> > For example, see linux/drivers/pci/quirks.c
> > (http://elixir.free-electrons.com/linux/latest/source/drivers/pci/quirks.c).
> >
> > At this point in igb_uio.c, one is aware of the struct pci_dev "dev" for the device in question.
> > Access to the vendor and device information should be simple:
> >
> > struct pci_dev {
> >         struct list_head bus_list;      /* node in per-bus list */
> >         struct pci_bus  *bus;           /* bus this device is on */
> >         struct pci_bus  *subordinate;   /* bus this device bridges to */
> >
> >         void            *sysdata;       /* hook for sys-specific extension */
> >         struct proc_dir_entry *procent; /* device entry in /proc/bus/pci */
> >         struct pci_slot *slot;          /* Physical slot this device is in */
> >
> >         unsigned int    devfn;          /* encoded device & function index */
> >         unsigned short  vendor;
> >         unsigned short  device;
> >         unsigned short  subsystem_vendor;
> >         unsigned short  subsystem_device;
> >         ...
> >
> > One could imagine using logic to implement corresponding PCI quirks that can be evaluated
> > at runtime. For example (in pseudocode),
> >
> > if not (vendor = "Cavium" and device = "bnx2x")
> > then pci_reset_function(dev);
>
> It wouldn't be nice to add device-specific checks to the generic igb_uio module,
> but it is also not nice to add a compile-time option. Comparing the two, I would
> be OK with device checks.
>
> What do you think about the following?
> If there is no objection and Rasesh confirms that the patch works, I can send a
> proper patch for it.
>
>
>
> diff --git a/lib/librte_eal/linuxapp/igb_uio/compat.h b/lib/librte_eal/linuxapp/igb_uio/compat.h
> index 30508f35c..264206af3 100644
> --- a/lib/librte_eal/linuxapp/igb_uio/compat.h
> +++ b/lib/librte_eal/linuxapp/igb_uio/compat.h
> @@ -134,3 +134,21 @@ static bool pci_check_and_mask_intx(struct pci_dev *pdev)
> #endif
>
>
> +#define BROADCOM_PCI_VENDOR_ID 0x14E4
> +static const struct pci_device_id no_reset_pci_tbl[] = {
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x168a) }, /* 57800 */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x164f) }, /* 57711 */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x168e) }, /* 57810 */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x163d) }, /* 57811 */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x168d) }, /* 57840_OBS */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x16a1) }, /* 57840_4_10 */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x16a2) }, /* 57840_2_20 */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x16ae) }, /* 57810_MF */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x163e) }, /* 57811_MF */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x16a4) }, /* 57840_MF */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x16a9) }, /* 57800_VF */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x16af) }, /* 57810_VF */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x163f) }, /* 57811_VF */
> + { PCI_DEVICE(BROADCOM_PCI_VENDOR_ID, 0x16ad) }, /* 57840_VF */
> + { 0 },
> +};
> diff --git a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> index fd320d87d..b0d92b51e 100644
> --- a/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> +++ b/lib/librte_eal/linuxapp/igb_uio/igb_uio.c
> @@ -348,6 +348,14 @@ igbuio_pci_open(struct uio_info *info, struct inode *inode)
> return 0;
> }
>
> +static int is_device_excluded_from_reset(struct pci_dev *pdev)
> +{
> + if (pci_match_id(no_reset_pci_tbl, pdev))
> + return 1;
> +
> + return 0;
> +}
> +
>
>
> Personal preference is for more concise:
> static bool is_device_excluded(const struct pci_dev *pdev)
> {
> return pci_match_id(no_reset_pci_tbl, pdev);
>
> }
I will update the function, but I am in favor of keeping the function name,
since it makes clear what the device is excluded from.
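
For illustration only, here is a minimal userspace sketch of how the concise
body could be combined with the longer name. It is not the kernel code:
struct model_pci_id, struct model_pci_dev and model_match_id() are
hypothetical stand-ins for struct pci_device_id, struct pci_dev and
pci_match_id(), the table holds only a subset of the IDs from the diff above,
and the stand-in matches on vendor and device only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-ins for the kernel structures. */
struct model_pci_id {
	uint16_t vendor;
	uint16_t device;
};

struct model_pci_dev {
	uint16_t vendor;
	uint16_t device;
};

#define BROADCOM_PCI_VENDOR_ID 0x14E4

/* Subset of the no-reset table from the diff; an all-zero entry ends it. */
static const struct model_pci_id no_reset_pci_tbl[] = {
	{ BROADCOM_PCI_VENDOR_ID, 0x168a }, /* 57800 */
	{ BROADCOM_PCI_VENDOR_ID, 0x168e }, /* 57810 */
	{ BROADCOM_PCI_VENDOR_ID, 0x16af }, /* 57810_VF */
	{ 0, 0 },
};

/*
 * Assumed model of pci_match_id(): walk the table until the terminator
 * and return the matching entry, or NULL when nothing matches.
 */
static const struct model_pci_id *
model_match_id(const struct model_pci_id *ids, const struct model_pci_dev *pdev)
{
	assert(ids != NULL && pdev != NULL);

	for (; ids->vendor != 0; ids++) {
		if (ids->vendor == pdev->vendor && ids->device == pdev->device)
			return ids;
	}

	return NULL;
}

/* Concise body, descriptive name: true when the reset must be skipped. */
static bool
is_device_excluded_from_reset(const struct model_pci_dev *pdev)
{
	return model_match_id(no_reset_pci_tbl, pdev) != NULL;
}
```

In this model, a device with vendor 0x14E4 and device 0x168e is reported as
excluded, while any ID that is not in the table is not, so the release path
would still reset it.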
>
> static int
> igbuio_pci_release(struct uio_info *info, struct inode *inode)
> {
> @@ -360,7 +368,8 @@ igbuio_pci_release(struct uio_info *info, struct inode *inode)
> /* stop the device from further DMA */
> pci_clear_master(dev);
>
> - pci_reset_function(dev);
> + if (!is_device_excluded_from_reset(dev))
> + pci_reset_function(dev);
>
> return 0;
> }
>
>
> >
> > There are other possible implementations. If there are enough quirks, one might have action
> > functions defined---and a table of function pointers associated with each PMD to select the
> > proper action.
> >
> > - Lee Roberts
> >
> >
> >
> >
>
>
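
The function-pointer table that Lee mentions could look roughly like the
userspace sketch below. This is only a model under stated assumptions: all
names (struct quirk_pci_dev, release_action_t, release_quirks,
select_release_action, model_release) are hypothetical, the table is keyed on
vendor/device IDs rather than on a PMD, and a counter stands in for the call
to pci_reset_function().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct pci_dev. */
struct quirk_pci_dev {
	uint16_t vendor;
	uint16_t device;
	int reset_count; /* number of simulated function resets */
};

typedef void (*release_action_t)(struct quirk_pci_dev *dev);

/* Default action: stands in for calling pci_reset_function(dev). */
static void release_with_reset(struct quirk_pci_dev *dev)
{
	dev->reset_count++;
}

/* Quirk action: leave the device alone on release. */
static void release_without_reset(struct quirk_pci_dev *dev)
{
	(void)dev;
}

struct release_quirk {
	uint16_t vendor;
	uint16_t device;
	release_action_t action;
};

/* Per-device actions; an entry with a NULL action ends the table. */
static const struct release_quirk release_quirks[] = {
	{ 0x14E4, 0x168e, release_without_reset }, /* 57810 */
	{ 0x14E4, 0x16af, release_without_reset }, /* 57810_VF */
	{ 0, 0, NULL },
};

/* Return the quirk action for this device, or the default reset. */
static release_action_t
select_release_action(const struct quirk_pci_dev *dev)
{
	const struct release_quirk *q;

	assert(dev != NULL);

	for (q = release_quirks; q->action != NULL; q++) {
		if (q->vendor == dev->vendor && q->device == dev->device)
			return q->action;
	}

	return release_with_reset;
}

/* Model of the release path: look up the action and run it. */
static void model_release(struct quirk_pci_dev *dev)
{
	select_release_action(dev)(dev);
}
```

Compared with a plain exclusion list, this costs one more indirection but
would let each entry carry its own behaviour if more quirks than "skip the
reset" ever show up.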
Thread overview: 7+ messages
2017-11-03 0:30 [dpdk-dev] [PATCH] " Ferruh Yigit
2017-11-03 0:51 ` [dpdk-dev] [PATCH v2] " Ferruh Yigit
2017-11-03 2:57 ` Tan, Jianfeng
2017-11-03 19:42 ` Roberts, Lee A.
2017-11-03 22:03 ` Ferruh Yigit
2017-11-04 10:08 ` Stephen Hemminger
2017-11-06 18:41 ` Ferruh Yigit [this message]