From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mail-wr0-f194.google.com (mail-wr0-f194.google.com [209.85.128.194]) by dpdk.org (Postfix) with ESMTP id 0CC381B2A9 for ; Thu, 21 Dec 2017 17:19:28 +0100 (CET) Received: by mail-wr0-f194.google.com with SMTP id w68so13175115wrc.10 for ; Thu, 21 Dec 2017 08:19:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=6wind-com.20150623.gappssmtp.com; s=20150623; h=date:from:to:cc:subject:message-id:references:mime-version :content-disposition:in-reply-to; bh=fEf+eEbrIaKTaDeFXhZcgWiyGQ7GTGMpfF+j/zdeeCU=; b=VGNeUuym/lsgUCWnq4M1OWXMm2+Mi3gY4kCE1k7O7mx3H8uZu5/3mdkCcV+gjA47Ra gJx9sT5ttFos8kKQQj4hQAP8jX5wxtlBsaWxI0UKXzuazps0qAl8uS7Ot1Pm+AhZ+0/q phcLRhKp2w71/Vw4Y8wBU1gsJlVMaozhnOPY7Z91BBxbJ0GJKaLDKEvWD8ldZr1pcQsw mE+NgJYq8iEPLtX7TTBSycFRNoxf64cFeS5JtJ2l/ZwDBo21k7c705C1rSxAZCeKiLwV I/NHOD0pnY9fPzQJYAOcbuFbo5/V211MKGc2kdfexSxU6xqdiTRDQ47tOl9r909QKvTn xL7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:date:from:to:cc:subject:message-id:references :mime-version:content-disposition:in-reply-to; bh=fEf+eEbrIaKTaDeFXhZcgWiyGQ7GTGMpfF+j/zdeeCU=; b=JiNKiK80XhPpM2obHutQBp6WE3Hl029ptsCvvhQdg6eUHqP94VV/pYsQrqtwhrC3qW kRU++5vSfDjs855FB79N2Dfx3odoE39P49E+FdbCM1mfW0KFWWvMUvFjefylzmCH4nf0 biRAxvY1G9uiqLlCcoEnoCZ0zEigZaGTD4eRLauQYv/GqP+NQHJTs/pbVIIYHe5GOen0 FsuJsr6zhfCBSfrUggNQDo8KhER5yMWwExC43qJye+56QaY9FlZzRj4Ksod5MeYWscco ymacmTKbM2or1GZ2r8qIZYdVjJKDHJObVpZ11lI/z51WLc6eve1c7+ZRa98lbsZLz3fq CYJA== X-Gm-Message-State: AKGB3mLbSg4+50ssE2yh5MJlUw3FAWUhNg6UbMT/Xn9ufrqCgW6nFWqk XO9bBJg3zXawf4PiY5+HcPg/8SCb X-Google-Smtp-Source: ACJfBosk1YlYQ7dA5SJgPTPj1mzAQaLxYt49s6EG3ZS7gs+Icoj6dp7sQR38aEI7d74sHRx7DoHdbA== X-Received: by 10.223.153.72 with SMTP id x66mr13334774wrb.209.1513873168358; Thu, 21 Dec 2017 08:19:28 -0800 (PST) Received: from 6wind.com (host.78.145.23.62.rev.coltfrance.com. [62.23.145.78]) by smtp.gmail.com with ESMTPSA id n74sm25663529wmi.1.2017.12.21.08.19.26 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 21 Dec 2017 08:19:27 -0800 (PST) Date: Thu, 21 Dec 2017 17:19:15 +0100 From: Adrien Mazarguil To: Ferruh Yigit Cc: dev@dpdk.org, Stephen Hemminger , Thomas Monjalon Message-ID: <20171221161915.GN5377@6wind.com> References: <20171124172132.GW4062@6wind.com> <20171218162443.12971-1-adrien.mazarguil@6wind.com> <20171218162443.12971-3-adrien.mazarguil@6wind.com> <29ee4328-363b-19e2-fcc5-c1af57395277@intel.com> <20171219150605.GG5377@6wind.com> <1f1dbbf9-d1b4-504a-8cbd-e2717ee804d7@intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1f1dbbf9-d1b4-504a-8cbd-e2717ee804d7@intel.com> Subject: Re: [dpdk-dev] [PATCH v1 2/3] net/hyperv: implement core functionality X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 21 Dec 2017 16:19:29 -0000 Disclaimer: I agree with Thomas's suggestions in his reply [1] to your message, I'm replying below as well to provide more details of my own and clarify the motivations behind this approach a bit more. On Tue, Dec 19, 2017 at 12:44:35PM -0800, Ferruh Yigit wrote: > On 12/19/2017 7:06 AM, Adrien Mazarguil wrote: > > On Mon, Dec 18, 2017 at 05:54:45PM -0800, Ferruh Yigit wrote: > >> On 12/18/2017 8:46 AM, Adrien Mazarguil wrote: > >>> As described in more details in the attached documentation (see patch > >>> contents), this virtual device driver manages NetVSC interfaces in virtual > >>> machines hosted by Hyper-V/Azure platforms. > >>> > >>> This driver does not manage traffic nor Ethernet devices directly; it acts > >>> as a thin configuration layer that automatically instantiates and controls > >>> fail-safe PMD instances combining tap and PCI sub-devices, so that each > >>> NetVSC interface is exposed as a single consolidated port to DPDK > >>> applications. > >>> > >>> PCI sub-devices being hot-pluggable (e.g. during VM migration), > >>> applications automatically benefit from increased throughput when present > >>> and automatic fallback on NetVSC otherwise without interruption thanks to > >>> fail-safe's hot-plug handling. > >>> > >>> Once initialized, the sole job of the hyperv driver is to regularly scan > >>> for PCI devices to associate with NetVSC interfaces and feed their > >>> addresses to corresponding fail-safe instances. > >>> > >>> Signed-off-by: Adrien Mazarguil > >> > >> <...> > >> > >>> + RTE_ETH_FOREACH_DEV(port_id) { > >> <..> > >>> + ret = rte_eal_hotplug_remove(bus->name, dev->name); > >> <..> > >>> + ret = rte_eal_hotplug_add("vdev", ctx->devname, ctx->devargs); > >> > >> Overall why this logic implemented as network PMD? > >> Yes technically you can implement *anything* as PMD :), but should we? > >> > >> This code does eal level work (scans bus, add/remove devices), and for control > >> path, and not a generic solution either (specific to netvsc and failsafe). > >> > >> Only device argument part of a PMD seems used, rest is unrelated to being a PMD. > >> Scans netvsc changes in background and reflects them into failsafe PMD... > >> > >> Why this is implemented as PMD, not another entity, like bus driver perhaps? > >> > >> Or indeed why this in DPDK instead of being in application? > > > > I'll address that last question first: the point of this driver is enabling > > existing applications to run within a Hyper-V environment unmodified, > > because they'd otherwise need to manage two driver instances correctly on > > their own in addition to hot-plug events during VM migration. > > > > Some kind of driver generating a front end to what otherwise appears as two > > distinct ethdev to applications is therefore necessary. > > > > Currently without it, users have to manually configure failsafe properly for > > each NetVSC interface on their system. Besides the inconvenience, it's not > > even a possibility with DPDK applications that don't rely on EAL > > command-line arguments. > > > > As such it's more correctly defined as a "platform" driver rather than a > > true PMD. It leaves VF device handling to their respective PMDs while > > automatically managing the platform-specific part itself. There's no simpler > > alternative when running in blacklist mode (i.e. not specifying any device > > parameters on the command line). > > > > Regarding its presence in drivers/net rather than drivers/bus, the end > > result from an application standpoint is that each instance exposes a single > > ethdev, even if not its own (failsafe's). Busses don't do that. It also > > allows passing arguments to individual devices through --vdev if needed. > > > > You're right about putting device detection at the bus level though, and I > > think there's work in progress to do just that, this driver will be updated > > to benefit from it once applied. In the meantime, the code as submitted > > works fine with the current DPDK code base and addresses an existing use > > case for which there is no solution at this point. > > This may be working but this looks like a hack to me. > > If we need a platform driver why not properly work on it. If we need to improve > eal hotplug, this is a good motivation to improve it. Hotplug surely can be improved but I don't think that alone will be enough for what this driver does. Here's how things are sequenced as currently implemented: 1. DPDK application starts. 2. EAL scans for PCI devices, ethdev ports are created for relevant ones. 3. hyperv vdev scans the system for appropriate NetVSC netdevices, instantiates failsafe PMD accordingly to create ethdev ports for each of them. At this stage, rte_eal_hotplug_remove() is also called on physical devices found in 2. that will be given to failsafe (see 4.), since they're not supposed to be seen or owned by the application (keep in mind this happens on Hyper-V platforms only). 4. From this point on, application can use the remaining ports normally. 5. A PCI device gets plugged in, kernel recognizes it and creates a netdevice for it. 6. hyperv's timer callback detects the new netdevice, if its properties match NetVSC's then it proceeds to tell failsafe its location. 7. failsafe probes the given address on the appropriate bus to instantiate another hidden ethdev out of it and primarily uses that device for TX until it gets unplugged. Meanwhile, RX is still performed on both underlying devices. Let's now assume hot-plug is perfectly implemented in DPDK along with Gaetan's netdevice bus [2] (or equivalent) with hotplug properties as well: 1. DPDK application starts. 2. EAL scans for PCI devices, ethdev ports are created for relevant ones. 3. EAL scans for net_bus devices, ethdev ports are created for relevant ones. 4. The piece of code formerly known as the hyperv driver looks at detected net_bus devices, finds relevant ones with NetVSC properties and promptly kicks them out through rte_eal_hotplug_remove() (or equivalent) so that the application doesn't get a chance to "see" them. It then instantiates fail-safe PMD like before, with fail-safe re-discovering devices as its own. 5. From this point on, application can use the remaining ports normally. 6. A PCI device gets plugged in, kernel recognizes it and creates a netdevice for it. 7. EAL's net_bus hotplug handler kicks in, automatically creates a new ethdev port out of it (note: device properties such as MAC addresses are not known before the associated PMD is initialized and an ethdev created). 8. The piece of code formerly known as the hyperv driver that happens to also be listening for hotplug events sees that new ethdev port; if its properties match NetVSC's then it proceeds to hide it before telling failsafe its location. 9. failsafe probes the given address on the appropriate bus to instantiate another hidden ethdev out of it and primarily uses that device for TX until it gets unplugged. Meanwhile, RX is still performed on both underlying devices. Hotplug basically removes the timer callback and some of the probing code. I agree it's perfectly fine to update this PMD once hotplug is implemented that way. Now what about the rest? Without a driver there's no way to orchestrate all the above. A separate layer between applications and PMDs is necessary for that; the handover of ethdev ports to failsafe is mandatory. > And if this logic needs to be in application let it be, your argument is to not > change the existing application but this logic may lead implementing many > unrelated things as PMD to not change application, what is the line here. Well, for this particular case I don't think many applications want to retrieve multicast and some other traffic out of one ethdev and the rest from another only when the latter is present. This complexity must be handled by the framework, not by applications, which ideally are not supposed to know much about the environment they're running in. For this reason, even a specific API is out of the question. > What is the work in progress, exact list, that will replace this solution? If > this hackish solution will prevent that real work, I am against this solution. > Is there a way to ensure this will be a temporary solution and that real work > will happen? I think Thomas answers this question [1], I'll just add that the current approach was developed and submitted in a way that doesn't have any impact on public APIs precisely to avoid conflicts with other work on EAL in the meantime. If the hotplug subsystem evolves, this driver will catch up, particularly since it's small and shouldn't be too complex to adapt. I volunteer for that work once APIs are ready in any case; failing that, the experimental tag (I'll add it for v2) means its pure and simple removal. I'd like your opinion on the current approach to determine the next steps: - Do you agree with the fact hotplug and platform-related functionality are two separate problems, that the approach to implement the former doesn't address the latter? - About implementing the latter in DPDK as a kind of platform driver so that applications don't need to be modified? - If you had to choose between drivers/bus and drivers/net for it? (keep in mind the ability to provide per-device options would be great) [1] http://dpdk.org/ml/archives/dev/2017-December/084558.html [2] http://dpdk.org/ml/archives/dev/2017-June/067546.html -- Adrien Mazarguil 6WIND