From: Bruce Richardson <bruce.richardson@intel.com>
To: Ferruh Yigit <ferruh.yigit@intel.com>
Cc: Thomas Monjalon <thomas@monjalon.net>,
dev@dpdk.org, Stephen Hemminger <stephen@networkplumber.org>,
"Chilikin, Andrey" <andrey.chilikin@intel.com>,
"Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
"Wu, Jingjing" <jingjing.wu@intel.com>
Subject: Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features
Date: Fri, 4 Aug 2017 13:56:35 +0100 [thread overview]
Message-ID: <20170804125635.GA23392@bricha3-MOBL3.ger.corp.intel.com> (raw)
In-Reply-To: <12c296b7-0696-6c12-7fb2-e43bcd49d784@intel.com>
On Fri, Aug 04, 2017 at 12:58:01PM +0100, Ferruh Yigit wrote:
> On 8/3/2017 8:53 PM, Thomas Monjalon wrote:
> > 03/08/2017 18:15, Stephen Hemminger:
> >> On Thu, 3 Aug 2017 14:21:38 +0100
> >> Bruce Richardson <bruce.richardson@intel.com> wrote:
> >>
> >>> On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote:
> >>>> To control some device-specific features public device-specific functions
> >>>> rte_pmd_*.h are used.
> >>>>
> >>>> But this solution requires applications to distinguish devices at runtime
> >>>> and, depending on the device type, call corresponding device-specific
> >>>> functions even if functions' parameters are the same.
> >>>>
> >>>> IOCTL-like API can be added to ethdev instead of public device-specific
> >>>> functions to address the following:
> >>>>
> >>>> * allow more usable support of features across a range of NIC from
> >>>> one vendor, but not others
> >>>> * allow features to be implemented by multiple NIC drivers without
> >>>> relying on a critical mass to get the functionality in ethdev
> >>>> * there are a large number of possible device specific functions, and
> >>>> creating individual APIs for each one is not a good solution
> >>>> * IOCTLs are a proven method for solving this problem in other areas,
> >>>> i.e. OS kernels.
> >>>>
> >>>> Control requests for this API will be globally defined at ethdev level, so
> >>>> an application will use single API call to control different devices from
> >>>> one/multiple vendors.
> >>>>
> >>>> API call may look like as a classic ioctl with an extra parameter for
> >>>> argument length for better sanity checks:
> >>>>
> >>>> int
> >>>> rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
> >>>> unsigned arg_length);
> >>>>
> >>>> Regards,
> >>>> Andrey
> >>>
> >>> I think we need to start putting in IOCTLs for ethdevs, much as I hate
> >>> to admit it, since I dislike IOCTLs and other functions with opaque
> >>> arguments! Having driver specific functions I don't think will scale
> >>> well as each vendor tries to expose as much of their driver specific
> >>> functionality as possible.
> >>>
> >>> One other additional example: I discovered just this week another issue
> >>> with driver specific functions and testpmd, when I was working on the
> >>> meson build rework.
> >>>
> >>> * With shared libraries, when we do "ninja install" we want our DPDK
> >>> libs moved to e.g. /usr/local/lib, but the drivers moved to a separate
> >>> driver folder, so that they can be automatically loaded from that
> >>> single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH].
> >>> * However, testpmd, as well as using the drivers as plugins, uses
> >>> driver-specific functions, which means that it explicitly links
> >>> against the pmd .so files.
> >>> * Those driver .so files are not in with the other libraries, so ld.so
> >>> does not find the pmd, and the installed testpmd fails to run due to
> >>> missing library dependencies.
> >>> * The workaround is to add the drivers path to the ld load path, but we
> >>> should not require ld library path changes just to get DPDK apps to
> >>> work.
> >>>
> >>> Using ioctls instead of driver-specific functions would solve this.
> >>>
> >>> My 2c.
> >>
> >> My 2c. No.
> >>
> >> Short answer:
> >> Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now
> >> despised by Linux kernel developers. They provide an unstructured, unsecured,
> >> back door for device driver abuse. Try to get a new driver in Linux with
> >> a unique ioctl, and it will be hard to get accepted.
> >>
> >> Long answer:
> >> So far every device specific feature has fit into ethdev model. Doing ioctl
> >> is admitting "it is too hard to be general, we need need an out". For something
> >> that is a flag, it should fit into existing config model; ignoring silly ABI constraints.
> >> For a real feature (think flow direction), we want a first class API for that.
> >> For a wart, then devargs will do.
> >>
> >> Give a good example of something that should be an ioctl. Don't build the
> >> API first and then let it get cluttered.
> >
> > I agree with Stephen.
> >
> > And please do not forget that ioctl still requires an API:
> > the argument that you put in ioctl is the API of the feature.
> > So it is the same thing as defining a new function.
>
> I am also not fan of the ioctl usage. I believe it hides APIs behind ids
> and prevent argument check by compiler.
>
> BUT, the number of the increasing PMD specific APIs are also worrying,
> it is becoming harder to maintain, and I believe this is something NOT
> sustainable in long run.
>
>
> What about having *eth_dev_extended_ops* ?
>
>
> As a part of the rte_eth_dev. This can be in the librte_ether library
> but in a separated file.
>
> And the APIs for these ops can be less strict on compatibility, and
> easier to add.
>
> Benefits of having this new dev_ops:
>
> * Having an abstraction layer for common checks.
>
> * Even feature is not generic for all NICs, still a few NICs can share
> the ops.
>
> * All APIs are in the same file makes it easy to see PMD specific APIs
> comparing to scattered into various PMDs.
>
> * This is very like ioctl approach, but APIs are more clear and
> arguments can be verified.
>
Sounds like an ethdev-staging library, where features can be put until
such time as they get critical mass for acceptance and promoted to
ethdev? It's sounds better than IOCTL, while giving the same benefits.
I'd be happy enough with any solution that allows NIC features to be
exposed that does not have functions limited to each individual driver,
so that common functionality can be exposed to apps via an API even if
only 2 drivers support it.
> Thanks,
> ferruh
>
>
> >
> > The real debate is to decide if we want to continue adding more
> > control path features in DPDK or focus on Rx/Tx.
I don't see dropping control path as an option. It would severely limit
the usefulness of DPDK.
/Bruce
next prev parent reply other threads:[~2017-08-04 12:56 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-03 12:21 Chilikin, Andrey
2017-08-03 13:21 ` Bruce Richardson
2017-08-03 16:15 ` Stephen Hemminger
2017-08-03 19:53 ` Thomas Monjalon
2017-08-04 9:59 ` Chilikin, Andrey
2017-08-04 10:08 ` Thomas Monjalon
2017-08-04 11:58 ` Ferruh Yigit
2017-08-04 12:56 ` Bruce Richardson [this message]
2017-08-08 8:32 ` Ferruh Yigit
2017-08-08 15:27 ` Stephen Hemminger
2017-08-08 17:23 ` Wiles, Keith
2017-08-08 17:28 ` Wiles, Keith
2017-08-08 18:02 ` Stephen Hemminger
2017-08-08 18:21 ` Wiles, Keith
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20170804125635.GA23392@bricha3-MOBL3.ger.corp.intel.com \
--to=bruce.richardson@intel.com \
--cc=andrey.chilikin@intel.com \
--cc=dev@dpdk.org \
--cc=ferruh.yigit@intel.com \
--cc=jingjing.wu@intel.com \
--cc=konstantin.ananyev@intel.com \
--cc=stephen@networkplumber.org \
--cc=thomas@monjalon.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).