From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id CBB5F200 for ; Mon, 18 Jun 2018 10:51:12 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga105.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 18 Jun 2018 01:51:11 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,238,1526367600"; d="scan'208";a="65011502" Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.252.30.40]) ([10.252.30.40]) by fmsmga001.fm.intel.com with ESMTP; 18 Jun 2018 01:51:09 -0700 To: Qi Zhang , thomas@monjalon.net Cc: konstantin.ananyev@intel.com, dev@dpdk.org, bruce.richardson@intel.com, ferruh.yigit@intel.com, benjamin.h.shelton@intel.com, narender.vangati@intel.com References: <20180607123849.14439-1-qi.z.zhang@intel.com> <20180607123849.14439-7-qi.z.zhang@intel.com> From: "Burakov, Anatoly" Message-ID: <3146157e-4bc1-0662-45cb-0ae8d1626e6e@intel.com> Date: Mon, 18 Jun 2018 09:51:09 +0100 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.8.0 MIME-Version: 1.0 In-Reply-To: <20180607123849.14439-7-qi.z.zhang@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH 06/22] ethdev: support attach or detach share device from secondary X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 18 Jun 2018 08:51:13 -0000 On 07-Jun-18 1:38 PM, Qi Zhang wrote: > This patch cover the multi-process hotplug case when a share device > attach/detach request be issued from secondary process, the implementation > references malloc_mp.c. > > device attach on secondary: > a) seconary send asycn request to primary and wait on a condition > which will be released by matched response from primary. > b) primary receive the request and attach the new device if failed > goto i). > c) primary forward attach request to all secondary as async request > (because this in mp thread context, use sync request will deadlock) > d) secondary receive request and attach device and send reply. > e) primary check the reply if all success go to j). > f) primary send attach rollback async request to all secondary. > g) secondary receive the request and detach device and send reply. > h) primary receive the reply and detach device as rollback action. > i) send fail response to secondary, goto k). > j) send success response to secondary. > k) secondary process receive response and return. > > device detach on secondary: > a) secondary send async request to primary and wait on a condition > which will be released by matched response from primary. > b) primary receive the request and perform pre-detach check, if device > is locked, goto j). > c) primary send pre-detach async request to all secondary. > d) secondary perform pre-detach check and send reply. > e) primary check the reply if any fail goto j). > f) primary send detach async request to all secondary > g) secondary detach the device and send reply > h) primary detach the device. > i) send success response to secondary, goto k). > j) send fail response to secondary. > k) secondary process receive response and return. > > Signed-off-by: Qi Zhang > --- > +TAILQ_HEAD(mp_request_list, mp_request); > +static struct { > + struct mp_request_list list; > + pthread_mutex_t lock; > +} mp_request_list = { > + .list = TAILQ_HEAD_INITIALIZER(mp_request_list.list), > + .lock = PTHREAD_MUTEX_INITIALIZER > +}; > + > +#define MP_TIMEOUT_S 5 /**< 5 seconds timeouts */ Patch number 4 should've used this #define to set up its timeout. > + > +static struct mp_request * > +find_request_by_id(uint64_t id) > +{ > + struct mp_request *req; > + > + TAILQ_FOREACH(req, &mp_request_list.list, next) { > + if (req->user_req.id == id) > + break; > + } > + return req; > +} > + > + if (resp->result) > + return resp->result; > + } > + > + return 0; > +} > + > +static int > +send_response_to_secondary(const struct eth_dev_mp_req *req, int result) > +{ > + struct rte_mp_msg resp_msg = {0}; I've been bitten by this in the past - some compilers (*cough* clang *cough*) don't like this kind of zero-initialization depending on which type of parameter comes first in the structure, so i would refrain from using it and used memset(0) instead. > + struct eth_dev_mp_req *resp = > + (struct eth_dev_mp_req *)resp_msg.param; > + int ret = 0; > + > + resp_msg.len_param = sizeof(*resp); > + strcpy(resp_msg.name, ETH_DEV_MP_ACTION_RESPONSE); here and in other places - strlcpy()? > + memcpy(resp, req, sizeof(*req)); > + resp->result = result; > + > + ret = rte_mp_sendmsg(&resp_msg); > + if (ret) > + ethdev_log(ERR, "failed to send response to secondary\n"); > + > + return ret; > +} > + > +static int > +handle_async_attach_response(const struct rte_mp_msg *request, > + const struct rte_mp_reply *reply) > +{ > + else > + return -1; > + do { > + ret = rte_mp_request_async(&mp_req, &ts, clb); > + } while (ret != 0 && rte_errno == EEXIST); > + > + if (ret) > + ethdev_log(ERR, "couldn't send async request\n"); > + entry = find_request_by_id(req->id > + (void)entry; Why did you look up entry and then marked it as used without checking the return value? Leftover? Some code missing? > + return ret; > +} > + > +static int handle_secondary_request(const struct rte_mp_msg *msg, > + const void *peer __rte_unused) > +{ > + const struct eth_dev_mp_req *req = > + (const struct eth_dev_mp_req *)msg->param; > + struct eth_dev_mp_req tmp_req; > @@ -124,10 +490,101 @@ static int handle_primary_request(const struct rte_mp_msg *msg, const void *peer > return 0; > } > > +/** > + * secondary to primary request. > + * > + * device attach: > + * a) seconary send request to primary. > + * b) primary attach the new device if failed goto i). > + * c) primary forward attach request to all secondary. > + * d) secondary receive request and attach device and send reply. > + * e) primary check the reply if all success go to j). > + * f) primary send attach rollback request to all secondary. > + * g) secondary receive the request and detach device and send reply. > + * h) primary receive the reply and detach device as rollback action. > + * i) send fail response to secondary, goto k). > + * j) send success response to secondary. > + * k) end. > + > + * device detach: > + * a) secondary send request to primary. > + * b) primary perform pre-detach check, if device is locked, got j). > + * c) primary send pre-detach check request to all secondary. > + * d) secondary perform pre-detach check and send reply. > + * e) primary check the reply if any fail goto j). > + * f) primary send detach request to all secondary > + * g) secondary detach the device and send reply > + * h) primary detach the device. > + * i) send success response to secondary, goto k). > + * j) send fail response to secondary. > + * k) end. > + */ I think this comment should be at the top of this file. -- Thanks, Anatoly