From: "Chilikin, Andrey"
To: Thomas Monjalon, "dev@dpdk.org"
CC: Stephen Hemminger, "Richardson, Bruce", "Ananyev, Konstantin", "Wu, Jingjing"
Date: Fri, 4 Aug 2017 09:59:46 +0000
Subject: Re: [dpdk-dev] [RFC] ethdev: add ioctl-like API to control device specific features
References: <20170803132138.GA8732@bricha3-MOBL3.ger.corp.intel.com> <20170803091531.5902f86c@xeon-e3> <1581480.IJArXVfUmc@xps>
In-Reply-To: <1581480.IJArXVfUmc@xps>
List-Id: DPDK patches and discussions

> 03/08/2017 18:15, Stephen Hemminger:
> > On Thu, 3 Aug 2017 14:21:38 +0100
> > Bruce
Richardson wrote:
> >
> > > On Thu, Aug 03, 2017 at 01:21:35PM +0100, Chilikin, Andrey wrote:
> > > > To control some device-specific features public device-specific functions
> > > > rte_pmd_*.h are used.
> > > >
> > > > But this solution requires applications to distinguish devices at runtime
> > > > and, depending on the device type, call corresponding device-specific
> > > > functions even if functions' parameters are the same.
> > > >
> > > > IOCTL-like API can be added to ethdev instead of public device-specific
> > > > functions to address the following:
> > > >
> > > > * allow more usable support of features across a range of NIC from
> > > >   one vendor, but not others
> > > > * allow features to be implemented by multiple NIC drivers without
> > > >   relying on a critical mass to get the functionality in ethdev
> > > > * there are a large number of possible device specific functions, and
> > > >   creating individual APIs for each one is not a good solution
> > > > * IOCTLs are a proven method for solving this problem in other areas,
> > > >   i.e. OS kernels.
> > > >
> > > > Control requests for this API will be globally defined at ethdev level, so
> > > > an application will use single API call to control different devices from
> > > > one/multiple vendors.
> > > >
> > > > API call may look like as a classic ioctl with an extra parameter for
> > > > argument length for better sanity checks:
> > > >
> > > > int
> > > > rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
> > > >         unsigned arg_length);
> > > >
> > > > Regards,
> > > > Andrey
> > >
> > > I think we need to start putting in IOCTLs for ethdevs, much as I hate
> > > to admit it, since I dislike IOCTLs and other functions with opaque
> > > arguments! Having driver specific functions I don't think will scale
> > > well as each vendor tries to expose as much of their driver specific
> > > functionality as possible.
> > >
> > > One other additional example: I discovered just this week another issue
> > > with driver specific functions and testpmd, when I was working on the
> > > meson build rework.
> > >
> > > * With shared libraries, when we do "ninja install" we want our DPDK
> > >   libs moved to e.g. /usr/local/lib, but the drivers moved to a separate
> > >   driver folder, so that they can be automatically loaded from that
> > >   single location by DPDK apps [== CONFIG_RTE_EAL_PMD_PATH].
> > > * However, testpmd, as well as using the drivers as plugins, uses
> > >   driver-specific functions, which means that it explicitly links
> > >   against the pmd .so files.
> > > * Those driver .so files are not in with the other libraries, so ld.so
> > >   does not find the pmd, and the installed testpmd fails to run due to
> > >   missing library dependencies.
> > > * The workaround is to add the drivers path to the ld load path, but we
> > >   should not require ld library path changes just to get DPDK apps to
> > >   work.
> > >
> > > Using ioctls instead of driver-specific functions would solve this.
> > >
> > > My 2c.
> >
> > My 2c. No.
> >
> > Short answer:
> > Ioctl's were a bad idea in Unix (per Dennis Ritchie et al) and are now
> > despised by Linux kernel developers. They provide an unstructured, unsecured,
> > back door for device driver abuse. Try to get a new driver in Linux with
> > a unique ioctl, and it will be hard to get accepted.
> >
> > Long answer:
> > So far every device specific feature has fit into ethdev model. Doing ioctl
> > is admitting "it is too hard to be general, we need need an out". For something
> > that is a flag, it should fit into existing config model; ignoring silly ABI constraints.
> > For a real feature (think flow direction), we want a first class API for that.
> > For a wart, then devargs will do.
> >
> > Give a good example of something that should be an ioctl. Don't build the
> > API first and then let it get cluttered.
>
> I agree with Stephen.
>
> And please do not forget that ioctl still requires an API:
> the argument that you put in ioctl is the API of the feature.
> So it is the same thing as defining a new function.
>
> The real debate is to decide if we want to continue adding more
> control path features in DPDK or focus on Rx/Tx.
> But this discussion would be better lead with some examples/requests.

In addition to what Bruce mentioned above, anything that requires dynamic
re-configuration at run time would be a good example:

* Internal resource partitioning, for example, RX buffer allocation for
  different traffic classes/flow types, depending on the load
* Mapping user priorities from different sources (VLAN's PCP bits, IP DSCP,
  MPLS Exp) to traffic classes
* Dynamic queue region allocation for traffic classes
* Dynamic statistics allocation
* Dynamic flow type configuration depending on the loaded parser profile
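
To make the priority-mapping item concrete, here is a minimal sketch of how
a VLAN PCP to traffic class mapping might go through the call proposed in
the RFC. Only the rte_eth_dev_ioctl() prototype is taken from the RFC. The
request code, the argument struct, the per-port handler table and the demo
driver are all hypothetical names made up for illustration; none of this
exists in DPDK, and a real implementation would hook into the ethdev ops
structure rather than a static table.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request code, defined once at ethdev level (not in DPDK). */
#define ETH_CTL_SET_PCP_TC_MAP UINT64_C(0x0001)

/* Hypothetical argument: index is the VLAN PCP value (0-7), value is the TC. */
struct eth_ctl_pcp_tc_map {
	uint8_t tc[8];
};

/* Hypothetical per-driver handler type, same shape as the proposed call. */
typedef int (*eth_ctl_handler_t)(uint16_t port, uint64_t ctl, void *argp,
		unsigned arg_length);

#define SKETCH_MAX_PORTS 2

/* Stand-in for per-port driver state. */
static struct eth_ctl_pcp_tc_map port_map[SKETCH_MAX_PORTS];

/* Hypothetical driver handler: validates the argument, then applies it. */
static int
demo_pmd_ioctl(uint16_t port, uint64_t ctl, void *argp, unsigned arg_length)
{
	switch (ctl) {
	case ETH_CTL_SET_PCP_TC_MAP:
		/* The extra length parameter allows this sanity check. */
		if (argp == NULL ||
		    arg_length != sizeof(struct eth_ctl_pcp_tc_map))
			return -EINVAL;
		port_map[port] = *(const struct eth_ctl_pcp_tc_map *)argp;
		return 0;
	default:
		/* Request is defined globally but not supported here. */
		return -ENOTSUP;
	}
}

/* Port 0 is bound to the demo driver; port 1 has no handler at all. */
static const eth_ctl_handler_t port_handlers[SKETCH_MAX_PORTS] = {
	demo_pmd_ioctl, NULL
};

/* Prototype as proposed in the RFC; the body is only an illustration. */
int
rte_eth_dev_ioctl(uint16_t port, uint64_t ctl, void *argp,
		unsigned arg_length)
{
	if (port >= SKETCH_MAX_PORTS)
		return -ENODEV;
	if (port_handlers[port] == NULL)
		return -ENOTSUP;
	return port_handlers[port](port, ctl, argp, arg_length);
}
```

With this shape, the application fills one struct eth_ctl_pcp_tc_map and
issues the same call for every port, whatever the vendor. A driver that does
not implement the request returns -ENOTSUP, so no device type check is
needed in the application.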