From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from wout1-smtp.messagingengine.com (wout1-smtp.messagingengine.com [64.147.123.24]) by dpdk.org (Postfix) with ESMTP id 77D77160 for ; Mon, 15 Oct 2018 10:43:42 +0200 (CEST) Received: from compute1.internal (compute1.nyi.internal [10.202.2.41]) by mailout.west.internal (Postfix) with ESMTP id B1E356B5; Mon, 15 Oct 2018 04:43:40 -0400 (EDT) Received: from mailfrontend1 ([10.202.2.162]) by compute1.internal (MEProxy); Mon, 15 Oct 2018 04:43:41 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=monjalon.net; h= from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding:content-type; s=mesmtp; bh=ugOd/s/Q0DZvXt1kUi4cVYbzgJi36Hat3YtWeRiFAQU=; b=XDJ4RupejueQ vZVOu9S0l+Wriyz2HvXHab4qlLh7VCcwl8Zl3C3S3sxyIEl4T7LbDxV3afecDhs2 x4vcNsgoH3nYfusMHgprrwxzd1ACRDTnQamLCp8/hCGChb4ruBhZlT/lVGziZl60 xrbXRL8mInLv0mQe1+KLG/hnautyEwg= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:content-transfer-encoding:content-type :date:from:in-reply-to:message-id:mime-version:references :subject:to:x-me-proxy:x-me-proxy:x-me-sender:x-me-sender :x-sasl-enc; s=fm1; bh=ugOd/s/Q0DZvXt1kUi4cVYbzgJi36Hat3YtWeRiFA QU=; b=DXjz2s36NfNlI7U7jOBz+vvlsoUFxfp54fsSflAQ+hmM+VfNo82hxOfI5 XdastFzJIdxBiWEkAR8Shlsw2rvY3Iiw1yM5nVWufw0D6qhFC7hjs5hGxn9FEiyL o4hYnNm2BW2kGPPmsdSdU6830ELYmIu4ALJS4DYZ7DWHEln8mrWuUkyvVz2+JldM l8ia0vr5rNWGAohX9jfYlyF58YmZ6LpFe22P1r+D/Dgy3E9O4L3rr61Yi+WRswVC 0GymJgHTWR7Fr9R0h/+yKWs+x9CiIMPo97MZeg+zKC/rqlAu6ki83EFHsLn1rf2I RtjF1Z1C0xV/GCa3HP8BmSYRkSP5A== X-ME-Sender: X-ME-Proxy: Received: from xps.localnet (184.203.134.77.rev.sfr.net [77.134.203.184]) by mail.messagingengine.com (Postfix) with ESMTPA id 79FBEE4623; Mon, 15 Oct 2018 04:43:38 -0400 (EDT) From: Thomas Monjalon To: Qi Zhang Cc: dev@dpdk.org, gaetan.rivet@6wind.com, anatoly.burakov@intel.com, arybchenko@solarflare.com, konstantin.ananyev@intel.com, bruce.richardson@intel.com, ferruh.yigit@intel.com, benjamin.h.shelton@intel.com, narender.vangati@intel.com Date: Mon, 15 Oct 2018 10:43:39 +0200 Message-ID: <2154531.6HW50fN39v@xps> In-Reply-To: <20180928042326.50471-3-qi.z.zhang@intel.com> References: <20180607123849.14439-1-qi.z.zhang@intel.com> <20180928042326.50471-1-qi.z.zhang@intel.com> <20180928042326.50471-3-qi.z.zhang@intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 7Bit Content-Type: text/plain; charset="us-ascii" Subject: Re: [dpdk-dev] [PATCH v16 2/6] eal: enable hotplug on multi-process X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 15 Oct 2018 08:43:42 -0000 Hi Qi, I wanted to push this old series, but I still have some questions and comments. Please let's fix them quickly. Note: I did not review the IPC mechanism and rollback. I trust you :) 28/09/2018 06:23, Qi Zhang: > --- a/doc/guides/rel_notes/release_18_11.rst > +++ b/doc/guides/rel_notes/release_18_11.rst > @@ -67,6 +67,12 @@ New Features > SR-IOV option in Hyper-V and Azure. This is an alternative to the previous > vdev_netvsc, tap, and failsafe drivers combination. > > +* **Support device multi-process hotplug.** > + > + Hotplug and hot-unplug for devices will now be supported in multiprocessing > + scenario. Any ethdev devices created in the primary process will be regarded > + as shared and will be available for all DPDK processes. Synchronization > + between processes will be done using DPDK IPC. > A blank line is missing here. > API Changes > ----------- > @@ -91,6 +97,11 @@ API Changes > flag the MAC can be properly configured in any case. This is particularly > important for bonding. > > +* eal: scope of rte_eal_hotplug_add and rte_eal_hotplug_remove is extended. > + > + In primary-secondary process model, ``rte_eal_hotplug_add`` will guarantee > + that device be attached on all processes, while ``rte_eal_hotplug_remove`` > + will guarantee device be detached on all processes. > Here too, double blank line before next heading. > --- a/lib/librte_eal/common/eal_common_dev.c > +++ b/lib/librte_eal/common/eal_common_dev.c > -int > -rte_eal_hotplug_add(const char *busname, const char *devname, > - const char *drvargs) > +/* help funciton to build devargs, caller should free the memory */ "help funciton" -> "helper function" > +static char * > +build_devargs(const char *busname, const char *devname, > + const char *drvargs) > { > char *devargs = NULL; > int size, length = -1; > @@ -140,19 +143,33 @@ rte_eal_hotplug_add(const char *busname, const char *devname, > if (length >= size) > devargs = malloc(length + 1); > if (devargs == NULL) > - return -ENOMEM; > + break; > } while (size == 0); It is an old code, please rebase on master. > -int __rte_experimental > -rte_dev_remove(struct rte_device *dev) > +/* remove device at local process. */ > +int > +local_dev_remove(struct rte_device *dev) > { > struct rte_bus *bus; > int ret; > @@ -248,7 +268,194 @@ rte_dev_remove(struct rte_device *dev) > if (ret) > RTE_LOG(ERR, EAL, "Driver cannot detach the device (%s)\n", > dev->name); > - rte_devargs_remove(dev->devargs); > + else > + rte_devargs_remove(dev->devargs); It looks you are fixing a bug here. Good catch! > +int __rte_experimental > +rte_dev_probe(const char *devargs) > +{ > + struct eal_dev_mp_req req; > + struct rte_device *dev; > + int ret; > + > + memset(&req, 0, sizeof(req)); > + req.t = EAL_DEV_REQ_TYPE_ATTACH; > + strlcpy(req.devargs, devargs, EAL_DEV_MP_DEV_ARGS_MAX_LEN); > + > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > + /** > + * If in secondary process, just send IPC request to > + * primary process. > + */ > + ret = eal_dev_hotplug_request_to_primary(&req); > + if (ret) { > + RTE_LOG(ERR, EAL, > + "Failed to send hotplug request to primary\n"); > + return -ENOMSG; > + } > + if (req.result) > + RTE_LOG(ERR, EAL, > + "Failed to hotplug add device\n"); > + return req.result; > + } > + > + /* attach a shared device from primary start from here: */ > + > + /* primary attach the new device itself. */ > + ret = local_dev_probe(devargs, &dev); > + > + if (ret) { > + RTE_LOG(ERR, EAL, > + "Failed to attach device on primary process\n"); > + > + /** > + * it is possible that secondary process failed to attached a > + * device that primary process have during initialization, > + * so for -EEXIST case, we still need to sync with secondary > + * process. > + */ > + if (ret != -EEXIST) > + return ret; > + } > + > + /* primary send attach sync request to secondary. */ > + ret = eal_dev_hotplug_request_to_secondary(&req); > + > + /* if any commnunication error, we need to rollback. */ typo: communication > + if (ret) { > + RTE_LOG(ERR, EAL, > + "Failed to send hotplug add request to secondary\n"); > + ret = -ENOMSG; > + goto rollback; > + } > + > + /** > + * if any secondary failed to attach, we need to consider if rollback > + * is necessary. > + */ > + if (req.result) { > + RTE_LOG(ERR, EAL, > + "Failed to attach device on secondary process\n"); > + ret = req.result; > + > + /* for -EEXIST, we don't need to rollback. */ > + if (ret == -EEXIST) > + return ret; > + goto rollback; > + } > + > + return 0; > + > +rollback: > + req.t = EAL_DEV_REQ_TYPE_ATTACH_ROLLBACK; > + > + /* primary send rollback request to secondary. */ > + if (eal_dev_hotplug_request_to_secondary(&req)) For all occurences of "if (function())", the coding style is requesting an explicit check of the return value: if (function() != 0) > + RTE_LOG(WARNING, EAL, > + "Failed to rollback device attach on secondary." > + "Devices in secondary may not sync with primary\n"); > + > + /* primary rollback itself. */ > + if (local_dev_remove(dev)) > + RTE_LOG(WARNING, EAL, > + "Failed to rollback device attach on primary." > + "Devices in secondary may not sync with primary\n"); > + > + return ret; > +} > + > +int __rte_experimental > +rte_dev_remove(struct rte_device *dev) > +{ > + struct eal_dev_mp_req req; > + char *devargs; > + int ret; > + > + devargs = build_devargs(dev->devargs->bus->name, dev->name, ""); > + if (devargs == NULL) > + return -ENOMEM; > + > + memset(&req, 0, sizeof(req)); > + req.t = EAL_DEV_REQ_TYPE_DETACH; > + strlcpy(req.devargs, devargs, EAL_DEV_MP_DEV_ARGS_MAX_LEN); > + free(devargs); > + > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > + /** > + * If in secondary process, just send IPC request to > + * primary process. > + */ > + ret = eal_dev_hotplug_request_to_primary(&req); > + if (ret) { > + RTE_LOG(ERR, EAL, > + "Failed to send hotplug request to primary\n"); > + return -ENOMSG; > + } > + if (req.result) > + RTE_LOG(ERR, EAL, > + "Failed to hotplug remove device\n"); > + return req.result; > + } > + > + /* detach a device from primary start from here: */ > + > + /* primary send detach sync request to secondary */ > + ret = eal_dev_hotplug_request_to_secondary(&req); > + > + /** > + * if communication error, we need to rollback, because it is possible > + * part of the secondary processes still detached it successfully. > + */ > + if (ret) { ret is not a boolean, please do explicit check != 0. > + RTE_LOG(ERR, EAL, > + "Failed to send device detach request to secondary\n"); > + ret = -ENOMSG; > + goto rollback; > + } > + > + /** > + * if any secondary failed to detach, we need to consider if rollback > + * is necessary. > + */ > + if (req.result) { result is not a boolean, please do explicit check != 0. > + RTE_LOG(ERR, EAL, > + "Failed to detach device on secondary process\n"); > + ret = req.result; > + /** > + * if -ENOENT, we don't need to rollback, since devices is > + * already detached on secondary process. > + */ > + if (ret != -ENOENT) > + goto rollback; > + } > + > + /* primary detach the device itself. */ > + ret = local_dev_remove(dev); > + > + /* if primary failed, still need to consider if rollback is necessary */ > + if (ret) { > + RTE_LOG(ERR, EAL, > + "Failed to detach device on primary process\n"); > + /* if -ENOENT, we don't need to rollback */ > + if (ret == -ENOENT) > + return ret; > + goto rollback; > + } > + > + return 0; > + > +rollback: > + req.t = EAL_DEV_REQ_TYPE_DETACH_ROLLBACK; > + > + /* primary send rollback request to secondary. */ > + if (eal_dev_hotplug_request_to_secondary(&req)) > + RTE_LOG(WARNING, EAL, > + "Failed to rollback device detach on secondary." > + "Devices in secondary may not sync with primary\n"); > + > return ret; > } > --- a/lib/librte_eal/common/eal_private.h > +++ b/lib/librte_eal/common/eal_private.h > +/** > + * Register all mp action callbacks for hotplug. > + * > + * @return > + * 0 on success, negative on error. > + */ > +int rte_dev_hotplug_mp_init(void); This function is called by the init, so it should not be private. The app should be free to build its own init routine. > --- /dev/null > +++ b/lib/librte_eal/common/hotplug_mp.h > +#include > +#include I think EAL headers should be included with quotes. > +/** > + * this is a synchronous wrapper for secondary process send Missing uppercase at the beggining of sentence. > + * request to primary process, this is invoked when an attach > + * or detach request issued from primary process. Missing "is" before "issued": request is issued. > --- a/lib/librte_eal/common/include/rte_dev.h > +++ b/lib/librte_eal/common/include/rte_dev.h > @@ -190,6 +190,9 @@ int rte_eal_dev_detach(struct rte_device *dev); > > /** > * Hotplug add a given device to a specific bus. > + * In multi-process, this function will inform all other processes > + * to hotplug add the same device. Any failure on other process rollback > + * the action. Better to leave a blank line before this comment. Small reword: * Hotplug add a given device to a specific bus. * * In multi-process, it will request other processes to add the same device. * A failure, in any process, will rollback the action. You should add the same comment for rte_dev_probe(). > @@ -219,6 +222,9 @@ int __rte_experimental rte_dev_probe(const char *devargs); > > /** > * Hotplug remove a given device from a specific bus. > + * In multi-process, this function will inform all other processes > + * to hotplug remove the same device. Any failure on other process > + * will rollback the action. Same reword: * Hotplug remove a given device from a specific bus. * * In multi-process, it will request other processes to remove the same device. * A failure, in any process, will rollback the action. [...] > --- a/lib/librte_eal/linuxapp/eal/eal.c > +++ b/lib/librte_eal/linuxapp/eal/eal.c > + /* register mp action callbacks for hotplug */ > + if (rte_dev_hotplug_mp_init() < 0) { In the comment, better to say "multi-process" instead of the cryptic "mp".