From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <keith.wiles@intel.com>
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
 by dpdk.org (Postfix) with ESMTP id 2E0362BE9
 for <dev@dpdk.org>; Tue,  8 Aug 2017 19:28:22 +0200 (CEST)
Received: from orsmga002.jf.intel.com ([10.7.209.21])
 by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 08 Aug 2017 10:28:21 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.41,344,1498546800"; d="scan'208";a="121341314"
Received: from fmsmsx105.amr.corp.intel.com ([10.18.124.203])
 by orsmga002.jf.intel.com with ESMTP; 08 Aug 2017 10:28:20 -0700
Received: from fmsmsx122.amr.corp.intel.com (10.18.125.37) by
 FMSMSX105.amr.corp.intel.com (10.18.124.203) with Microsoft SMTP Server (TLS)
 id 14.3.319.2; Tue, 8 Aug 2017 10:28:20 -0700
Received: from fmsmsx113.amr.corp.intel.com ([169.254.13.214]) by
 fmsmsx122.amr.corp.intel.com ([169.254.5.215]) with mapi id 14.03.0319.002;
 Tue, 8 Aug 2017 10:28:20 -0700
From: "Wiles, Keith" <keith.wiles@intel.com>
To: "Yigit, Ferruh" <ferruh.yigit@intel.com>
CC: Thomas Monjalon <thomas@monjalon.net>, DPDK <dev@dpdk.org>, "Stephen
 Hemminger" <stephen@networkplumber.org>, "Richardson, Bruce"
 <bruce.richardson@intel.com>, "Chilikin, Andrey" <andrey.chilikin@intel.com>, 
 "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, "Wu, Jingjing"
 <jingjing.wu@intel.com>
Thread-Topic: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device
 specific features
Thread-Index: AQHTDRj4kO/62365mUS+R0MtbMBkT6J7MpQA
Date: Tue, 8 Aug 2017 17:28:19 +0000
Message-ID: <14B94271-5849-47DE-A50C-06A2ABD14176@intel.com>
References: <AAC06825A3B29643AF5372F5E0DDF0538DBC372E@IRSMSX106.ger.corp.intel.com>
 <20170803132138.GA8732@bricha3-MOBL3.ger.corp.intel.com>
 <20170803091531.5902f86c@xeon-e3> <1581480.IJArXVfUmc@xps>
 <12c296b7-0696-6c12-7fb2-e43bcd49d784@intel.com>
In-Reply-To: <12c296b7-0696-6c12-7fb2-e43bcd49d784@intel.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.255.230.19]
Content-Type: text/plain; charset="us-ascii"
Content-ID: <ABE093A19E5CD44FBCB250B40814221B@intel.com>
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device
 specific features
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 08 Aug 2017 17:28:22 -0000

Fix format.

> On Aug 4, 2017, at 6:58 AM, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
>=20
> On 8/3/2017 8:53 PM, Thomas Monjalon wrote:
>> 03/08/2017 18:15, Stephen Hemminger:
>>> On Thu, 3 Aug 2017 14:21:38 +0100
>>> Bruce Richardson <bruce.richardson@intel.com> wrote:
>>>=20
>>>> On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote:
>>>>> To control some device-specific features public device-specific funct=
ions
>>>>> rte_pmd_*.h are used.
>>>>>=20
>>>>> But this solution requires applications to distinguish devices at run=
time
>>>>> and, depending on the device type, call corresponding device-specific
>>>>> functions even if functions' parameters are the same.
>>>>>=20
>>>>> IOCTL-like API can be added to ethdev instead of public device-specif=
ic
>>>>> functions to address the following:
>>>>>=20
>>>>> * allow more usable support of features across a range of NIC from
>>>>>  one vendor, but not others
>>>>> * allow features to be implemented by multiple NIC drivers without
>>>>>  relying on a critical mass to get the functionality in ethdev
>>>>> * there are a large number of possible device specific functions, and
>>>>>  creating individual APIs for each one is not a good solution
>>>>> * IOCTLs are a proven method for solving this problem in other areas,
>>>>>  i.e. OS kernels.
>>>>>=20
>>>>> Control requests for this API will be globally defined at ethdev leve=
l, so
>>>>> an application will use single API call to control different devices =
from
>>>>> one/multiple vendors.
>>>>>=20
>>>>> API call may look like as a classic ioctl with an extra parameter for
>>>>> argument length for better sanity checks:
>>>>>=20
>>>>> int
>>>>> rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
>>>>>        unsigned arg_length);
>>>>>=20
>>>>> Regards,
>>>>> Andrey =20
>>>>=20
>>>> I think we need to start putting in IOCTLs for ethdevs, much as I hate
>>>> to admit it, since I dislike IOCTLs and other functions with opaque
>>>> arguments! Having driver specific functions I don't think will scale
>>>> well as each vendor tries to expose as much of their driver specific
>>>> functionality as possible.
>>>>=20
>>>> One other additional example: I discovered just this week another issu=
e
>>>> with driver specific functions and testpmd, when I was working on the
>>>> meson build rework.
>>>>=20
>>>> * With shared libraries, when we do "ninja install" we want our DPDK
>>>>  libs moved to e.g. /usr/local/lib, but the drivers moved to a separat=
e
>>>>  driver folder, so that they can be automatically loaded from that
>>>>  single location by DPDK apps [=3D=3D CONFIG_RTE_EAL_PMD_PATH].
>>>> * However, testpmd, as well as using the drivers as plugins, uses
>>>>  driver-specific functions, which means that it explicitly links
>>>>  against the pmd .so files.
>>>> * Those driver .so files are not in with the other libraries, so ld.so
>>>>  does not find the pmd, and the installed testpmd fails to run due to
>>>>  missing library dependencies.
>>>> * The workaround is to add the drivers path to the ld load path, but w=
e
>>>>  should not require ld library path changes just to get DPDK apps to
>>>>  work.
>>>>=20
>>>> Using ioctls instead of driver-specific functions would solve this.
>>>>=20
>>>> My 2c.
>>>=20
>>> My 2c. No.
>>>=20
>>> Short answer:
>>> Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now
>>> despised by Linux kernel developers. They provide an unstructured, unse=
cured,
>>> back door for device driver abuse. Try to get a new driver in Linux wit=
h
>>> a unique ioctl, and it will be hard to get accepted.
>>>=20
>>> Long answer:
>>> So far every device specific feature has fit into ethdev model. Doing i=
octl
>>> is admitting "it is too hard to be general, we need need an out". For s=
omething
>>> that is a flag, it should fit into existing config model; ignoring sill=
y ABI constraints.
>>> For a real feature (think flow direction), we want a first class API fo=
r that.
>>> For a wart, then devargs will do.
>>>=20
>>> Give a good example of something that should be an ioctl. Don't build t=
he
>>> API first and then let it get cluttered.
>>=20
>> I agree with Stephen.
>>=20
>> And please do not forget that ioctl still requires an API:
>> the argument that you put in ioctl is the API of the feature.
>> So it is the same thing as defining a new function.
>=20
> I am also not fan of the ioctl usage. I believe it hides APIs behind ids
> and prevent argument check by compiler.
>=20
> BUT, the number of the increasing PMD specific APIs are also worrying,
> it is becoming harder to maintain, and I believe this is something NOT
> sustainable in long run.
>=20
>=20
> What about having *eth_dev_extended_ops* ?


We had talk about adding something like device specific APIs to DPDK in the=
 past, which to me are just IOCTL like APIs. The big problem with IOCTLs is=
 trying to cram a bunch of specific requests into a very generic API and I =
do not like ioctl as defined in Linux/Unix today. The old IOCTLs calls are =
too opaque and difficult for compilers to test args and many other issues.

We talked about having a single API in rte_eth_dev that would allow a user =
to ask for and possible get a list of function pointers in a given structur=
e for the requested type. If a user calls this API to get some feature from=
 a given NIC he would get NULL or a pointer to a set of functions. The gene=
ric API in rte_eth would allow the user to request what structures or types=
 of APIs it supports.

Using a specific API to get the list of APIs or supported features in a NIC=
, will allow the developer to request the set of APIs (in an array or some =
method). Then we have real APIs for specific control or requests and not a =
generic API like ioctl.

Cristian had suggested an API like this to make it easy to add any IOCTL li=
ke needs to a driver. We can define a set of structures that seem generic f=
or some IOCTL like needs or just allow the NIC to define his own structures=
 and APIs. Allowing the developer to define his own structures and APIs is =
not very generic or usable by the users, so I would lean toward defining st=
ructures set need today and expand those structures in the future or add mo=
re structures.

int rte_eth_dev_something(uint16_t port_id, const char *feature, void **obj=
);

Using strings we can define or the NIC vendor can define to ask for a point=
er to a structure he knows via a driver header. Strings are good, because w=
e can read them via the debug or print them out quickly instead of trying t=
o use some lookup table. Plus we can have any length or characters for defi=
ning the structure request.

Just off the top of my head, but it can be changed if needed.


>=20
>=20
> As a part of the rte_eth_dev. This can be in the librte_ether library
> but in a separated file.
>=20
> And the APIs for these ops can be less strict on compatibility, and
> easier to add.
>=20
> Benefits of having this new dev_ops:
>=20
> * Having an abstraction layer for common checks.
>=20
> * Even feature is not generic for all NICs, still a few NICs can share
> the ops.
>=20
> * All APIs are in the same file makes it easy to see PMD specific APIs
> comparing to scattered into various PMDs.
>=20
> * This is very like ioctl approach, but APIs are more clear and
> arguments can be verified.
>=20
> Thanks,
> ferruh
>=20
>=20
>>=20
>> The real debate is to decide if we want to continue adding more
>> control path features in DPDK or focus on Rx/Tx.
>> But this discussion would be better lead with some examples/requests.

Regards,
Keith