From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by dpdk.org (Postfix) with ESMTP id E5B3998 for ; Mon, 18 Jun 2018 10:18:23 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Jun 2018 01:18:20 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,238,1526367600"; d="scan'208";a="65005320" Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.252.30.40]) ([10.252.30.40]) by fmsmga001.fm.intel.com with ESMTP; 18 Jun 2018 01:18:18 -0700 To: Qi Zhang , thomas@monjalon.net Cc: konstantin.ananyev@intel.com, dev@dpdk.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, benjamin.h.shelton@intel.com, narender.vangati@intel.com References: <20180607123849.14439-1-qi.z.zhang@intel.com> <20180607123849.14439-5-qi.z.zhang@intel.com> From: "Burakov, Anatoly" Message-ID: <3c273e40-9954-e9a3-2c78-9e245bacc19d@intel.com> Date: Mon, 18 Jun 2018 09:18:17 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 MIME-Version: 1.0 In-Reply-To: <20180607123849.14439-5-qi.z.zhang@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH 04/22] ethdev: enable hotplug on multi-process X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jun 2018 08:18:26 -0000 On 07-Jun-18 1:38 PM, Qi Zhang wrote: > The patch introduce the solution to handle different hotplug cases in > multi-process situation, it include below scenario: > > 1. Attach a share device from primary > 2. Detach a share device from primary > 3. Attach a share device from secondary > 4. Detach a share device from secondary > 5. Attach a private device from secondary > 6. Detach a private device from secondary > 7. Detach a share device from secondary privately > 8. Attach a share device from secondary privately > > In primary-secondary process model, we assume device is shared by default. > that means attach or detach a device on any process will broadcast to > all other processes through mp channel then device information will be > synchronized on all processes. > > Any failure during attaching process will cause inconsistent status > between processes, so proper rollback action should be considered. > Also it is not safe to detach a share device when other process still use > it, so a handshake mechanism is introduced, it will be implemented in > following separate patch. > > Scenario for Case 1, 2: > > attach device > a) primary attach the new device if failed goto h). > b) primary send attach sync request to all secondary. > c) secondary receive request and attach device and send reply. > d) primary check the reply if all success go to i). > e) primary send attach rollback sync request to all secondary. > f) secondary receive the request and detach device and send reply. > g) primary receive the reply and detach device as rollback action. > h) attach fail > i) attach success > > detach device > a) primary perform pre-detach check, if device is locked, goto i). > b) primary send pre-detach sync request to all secondary. > c) secondary perform pre-detach check and send reply. > d) primary check the reply if any fail goto i). > e) primary send detach sync request to all secondary > f) secondary detach the device and send reply (assume no fail) > g) primary detach the device. > h) detach success > i) detach failed > > Case 3, 4: > This will be implemented in following patch. If these will be implemented in following patch, why spend half the commit message talking about it? :) This commit doesn't implement secondary process functionality at all, so the commit message should probably be reworded to only include primary process logic, no? > > Case 5, 6: > Secondary process can attach private device which only visible to itself, > in this case no IPC is involved, primary process is not allowd to have > private device so far. > > Case 7, 8: > Secondary process can also temporally to detach a share device "privately" > then attach it back later, this action also not impact other processes. > > APIs chenages: Multiple typos - "chenages", "temporally", "allowd", etc. > > rte_eth_dev_attach and rte_eth_dev_attach are extended to support > share device attach/detach in primary-secondary process model, it will > be called in case 1,2,3,4. > > New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are > introduced to cover case 5,6,7,8, this API can only be invoked in secondary > process. > > Signed-off-by: Qi Zhang > --- > rte_eal_mcfg_complete(); > > + if (rte_eth_dev_mp_init()) { > + rte_eal_init_alert("rte_eth_dev_mp_init() failed\n"); > + rte_errno = ENOEXEC; > + return -1; > + } > + Why is this done after the end of init? rte_eal_mcfg_complete() makes it so that secondaries can initialize, at that point all initialization should have been finished. I would expect this to be called after (before?) bus probe, since this is device-related. > return fctret; > } > > diff --git a/lib/librte_ethdev/Makefile b/lib/librte_ethdev/Makefile > index c2f2f7d82..04e93f337 100644 > --- a/lib/librte_ethdev/Makefile > +++ b/lib/librte_ethdev/Makefile > @@ -19,6 +19,7 @@ EXPORT_MAP := rte_ethdev_version.map > LIBABIVER := 9 > > + if (rte_eal_process_type() != RTE_PROC_PRIMARY) { > + > + /** > + * If secondary process, we just send request to primray > + * to start the process. > + */ > + req.t = REQ_TYPE_ATTACH; > + strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN); > + > + ret = rte_eth_dev_request_to_primary(&req); > + if (ret) { > + ethdev_log(ERR, "Failed to send device attach request to primary\n"); The log message is a little misleading. It can be that secondary has failed to send request. It can also be that it succeeded, but the attach itself has failed. I think a better message would be "attach request has failed" or something to that effect. > + return ret; > + } > + > + *port_id = req.port_id; > + return req.result; > + } > + > + ret = do_eth_dev_attach(devargs, port_id); > + if (ret) > + return ret; > + > + /* send attach request to seoncary */ > + req.t = REQ_TYPE_ATTACH; > + strlcpy(req.devargs, devargs, MAX_DEV_ARGS_LEN); > + req.port_id = *port_id; > + ret = rte_eth_dev_request_to_secondary(&req); > + if (ret) { > + ethdev_log(ERR, "Failed to send device attach request to secondary\n"); Same as above - log message can/might be misleading. There are a few other places where similar log message is present, those should be corrected too. > + goto rollback; > + } > + > + if (req.result) > + goto rollback; > + > + return 0; > +{ > + uint32_t dev_flags; > + > + if (rte_eal_process_type() == RTE_PROC_PRIMARY) > + return -ENOTSUP; > + > + RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL); > + > + dev_flags = rte_eth_devices[port_id].data->dev_flags; > + if (dev_flags & RTE_ETH_DEV_BONDED_SLAVE) { > + ethdev_log(ERR, > + "Port %" PRIu16 " is bonded, cannot detach", port_id); > + return -ENOTSUP; > + } Do we have to do a similar check for failsafe devices? > + > + return do_eth_dev_detach(port_id); > +} > + > static int > rte_eth_dev_rx_queue_config(struct rte_eth_dev *dev, uint16_t nb_queues) > { > diff --git a/lib/librte_ethdev/rte_ethdev.h b/lib/librte_ethdev/rte_ethdev.h > index 36e3984ea..bb03d613b 100644 > --- a/lib/librte_ethdev/rte_ethdev.h > +++ b/lib/librte_ethdev/rte_ethdev.h > /** > + * Attach a private Ethernet device specified by arguments. > + * A private device is invisible to other process. > + * Can only be invoked in secondary process. > + * > + * @param devargs > + * A pointer to a strings array describing the new device > + * to be attached. The strings should be a pci address like > + * '0000:01:00.0' or virtual device name like 'net_pcap0'. > + * @param port_id > + * A pointer to a port identifier actually attached. > + * @return > + * 0 on success and port_id is filled, negative on error > + */ > +int rte_eth_dev_attach_private(const char *devargs, uint16_t *port_id); New API's should be marked as __rte_experimental. > + > +/** > * Detach a Ethernet device specified by port identifier. > * This function must be called when the device is in the > * closed state. > + * In multi-process mode, it will sync with other process > + * to detach the device. > * > * @param port_id > * The port identifier of the device to detach. > @@ -1490,6 +1511,22 @@ int rte_eth_dev_attach(const char *devargs, uint16_t *port_id); > + * Detach a Ethernet device in current process. > + * > + * @param port_id > + * The port identifier of the device to detach. > + * @param devname > + * A pointer to a buffer that will be filled with the device name. > + * This buffer must be at least RTE_DEV_NAME_MAX_LEN long. > + * @return > + * 0 on success and devname is filled, negative on error > + */ > +int do_eth_dev_detach(uint16_t port_id); > + Why is this made part of an external API? You should have a separate, private header file for these. > #ifdef __cplusplus > } > #endif > diff --git a/lib/librte_ethdev/rte_ethdev_mp.c b/lib/librte_ethdev/rte_ethdev_mp.c > new file mode 100644 > index 000000000..8ede8151d > --- /dev/null > +++ b/lib/librte_ethdev/rte_ethdev_mp.c > @@ -0,0 +1,195 @@ > +/* SPDX-License-Identifier: BSD-3-Clause > + * Copyright(c) 2010-2018 Intel Corporation > + */ > + > +#include "rte_ethdev_driver.h" > +#include "rte_ethdev_mp.h" > + > +static int detach_on_secondary(uint16_t port_id) > + free(da.args); > + return 0; > +} > + > +static int handle_secondary_request(const struct rte_mp_msg *msg, const void *peer) > +{ > + (void)msg; > + (void)(peer); > + return -ENOTSUP; Please either mark arguments as __rte_unused, or use RTE_SET_USED(blah) macro. Same in other similar places. > +} > + > +static int handle_primary_response(const struct rte_mp_msg *msg, const void *peer) > +{ > + (void)msg; > + (void)(peer); > + return -ENOTSUP; > +} > + > +static int handle_primary_request(const struct rte_mp_msg *msg, const void *peer) > +{ > + const struct eth_dev_mp_req *req = > + (const struct eth_dev_mp_req *)msg->param; > + case REQ_TYPE_DETACH: > + case REQ_TYPE_ATTACH_ROLLBACK: > + ret = detach_on_secondary(req->port_id); > + break; > + default: > + ret = -EINVAL; > + } > + > + strcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST); Here and in other places: rte_strlcpy? -- Thanks, Anatoly