From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <anatoly.burakov@intel.com>
Received: from mga06.intel.com (mga06.intel.com [134.134.136.31])
 by dpdk.org (Postfix) with ESMTP id C68731B597
 for <dev@dpdk.org>; Tue, 26 Jun 2018 14:49:55 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga001.fm.intel.com ([10.253.24.23])
 by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 26 Jun 2018 05:49:53 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.51,274,1526367600"; d="scan'208";a="67383233"
Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.237.220.28])
 ([10.237.220.28])
 by fmsmga001.fm.intel.com with ESMTP; 26 Jun 2018 05:49:51 -0700
To: "Zhang, Qi Z" <qi.z.zhang@intel.com>,
 "thomas@monjalon.net" <thomas@monjalon.net>
Cc: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
 "dev@dpdk.org" <dev@dpdk.org>, "Richardson, Bruce"
 <bruce.richardson@intel.com>, "Yigit, Ferruh" <ferruh.yigit@intel.com>,
 "Shelton, Benjamin H" <benjamin.h.shelton@intel.com>,
 "Vangati, Narender" <narender.vangati@intel.com>
References: <20180607123849.14439-1-qi.z.zhang@intel.com>
 <20180626070832.3055-1-qi.z.zhang@intel.com>
 <20180626070832.3055-7-qi.z.zhang@intel.com>
 <969f0cb7-4186-d05c-442f-6341f1c44e93@intel.com>
 <039ED4275CED7440929022BC67E706115323E71A@SHSMSX103.ccr.corp.intel.com>
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Message-ID: <30b1c93c-1d7c-07d9-d41a-76e84f63d6a1@intel.com>
Date: Tue, 26 Jun 2018 13:49:50 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.8.0
MIME-Version: 1.0
In-Reply-To: <039ED4275CED7440929022BC67E706115323E71A@SHSMSX103.ccr.corp.intel.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH v4 06/24] ethdev: enable hotplug on
	multi-process
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 26 Jun 2018 12:49:56 -0000

On 26-Jun-18 1:19 PM, Zhang, Qi Z wrote:
> 
> 
>> -----Original Message-----
>> From: Burakov, Anatoly
>> Sent: Tuesday, June 26, 2018 8:09 PM
>> To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
>> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dev@dpdk.org;
>> Richardson, Bruce <bruce.richardson@intel.com>; Yigit, Ferruh
>> <ferruh.yigit@intel.com>; Shelton, Benjamin H
>> <benjamin.h.shelton@intel.com>; Vangati, Narender
>> <narender.vangati@intel.com>
>> Subject: Re: [PATCH v4 06/24] ethdev: enable hotplug on multi-process
>>
>> On 26-Jun-18 8:08 AM, Qi Zhang wrote:
>>> We are going to introduce the solution to handle different hotplug
>>> cases in multi-process situation, it include below scenario:
>>>
>>> 1. Attach a share device from primary
>>> 2. Detach a share device from primary
>>> 3. Attach a share device from secondary 4. Detach a share device from
>>> secondary 5. Attach a private device from secondary 6. Detach a
>>> private device from secondary 7. Detach a share device from secondary
>>> privately 8. Attach a share device from secondary privately
>>>
>>> In primary-secondary process model, we assume device is shared by default.
>>> that means attach or detach a device on any process will broadcast to
>>> all other processes through mp channel then device information will be
>>> synchronized on all processes.
>>>
>>> Any failure during attaching process will cause inconsistent status
>>> between processes, so proper rollback action should be considered.
>>> Also it is not safe to detach a share device when other process still
>>> use it, so a handshake mechanism is introduced.
>>>
>>> This patch covers the implementation of case 1,2,5,6,7,8.
>>> Case 3,4 will be implemented on separate patch as well as handshake
>>> mechanism.
>>>
>>> Scenario for Case 1, 2:
>>>
>>> attach device
>>> a) primary attach the new device if failed goto h).
>>> b) primary send attach sync request to all secondary.
>>> c) secondary receive request and attach device and send reply.
>>> d) primary check the reply if all success go to i).
>>> e) primary send attach rollback sync request to all secondary.
>>> f) secondary receive the request and detach device and send reply.
>>> g) primary receive the reply and detach device as rollback action.
>>> h) attach fail
>>> i) attach success
>>>
>>> detach device
>>> a) primary perform pre-detach check, if device is locked, goto i).
>>> b) primary send pre-detach sync request to all secondary.
>>> c) secondary perform pre-detach check and send reply.
>>> d) primary check the reply if any fail goto i).
>>> e) primary send detach sync request to all secondary
>>> f) secondary detach the device and send reply (assume no fail)
>>> g) primary detach the device.
>>> h) detach success
>>> i) detach failed
>>>
>>> Case 5, 6:
>>> Secondary process can attach private device which only visible to
>>> itself, in this case no IPC is involved, primary process is not
>>> allowed to have private device so far.
>>>
>>> Case 7, 8:
>>> Secondary process can also temporally to detach a share device "privately"
>>> then attach it back later, this action also not impact other processes.
>>>
>>> APIs changes:
>>>
>>> rte_eth_dev_attach and rte_eth_dev_attach are extended to support
>>> share device attach/detach in primary-secondary process model, it will
>>> be called in case 1,2,3,4.
>>>
>>> New API rte_eth_dev_attach_private and rte_eth_dev_detach_private are
>>> introduced to cover case 5,6,7,8, this API can only be invoked in
>>> secondary process.
>>>
>>> Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
>>> ---
>>
>> <snip>
>>
>>> +static int
>>> +handle_primary_request(const struct rte_mp_msg *msg, const void
>>> +*peer) {
>>> +
>>> +	struct rte_mp_msg mp_resp;
>>> +	const struct eth_dev_mp_req *req =
>>> +		(const struct eth_dev_mp_req *)msg->param;
>>> +	struct eth_dev_mp_req *resp =
>>> +		(struct eth_dev_mp_req *)mp_resp.param;
>>> +	struct mp_reply_bundle *bundle;
>>> +	int ret = 0;
>>> +
>>> +	memset(&mp_resp, 0, sizeof(mp_resp));
>>> +	strlcpy(mp_resp.name, ETH_DEV_MP_ACTION_REQUEST,
>> sizeof(mp_resp.name));
>>> +	mp_resp.len_param = sizeof(*req);
>>> +	memcpy(resp, req, sizeof(*resp));
>>> +
>>> +	bundle = calloc(1, sizeof(*bundle));
>>> +	if (bundle == NULL) {
>>> +		resp->result = -ENOMEM;
>>> +		ret = rte_mp_reply(&mp_resp, peer);
>>> +		if (ret) {
>>> +			ethdev_log(ERR, "failed to send reply to primary request\n");
>>> +			return ret;
>>> +		}
>>> +	}
>>> +
>>> +	bundle->msg = *msg;
>>> +	bundle->peer = peer;
>>> +
>>> +	ret = rte_eal_mp_task_add(__handle_primary_request, bundle);
>>> +	if (ret) {
>>> +		resp->result = ret;
>>> +		ret = rte_mp_reply(&mp_resp, peer);
>>> +		if (ret) {
>>> +			ethdev_log(ERR, "failed to send reply to primary request\n");
>>> +			return ret;
>>> +		}
>>> +	}
>>
>> What you're doing here is quite dangerous. The parameter "const void *peer"
>> is only guaranteed to be valid at the time of the callback - not necessarily
>> afterwards. So, if you're handing off sending replies to a separate thread,
>> things might blow up because the pointer may no longer be valid.
> 
> OK, so what about clone the content a buffer, I think the content should be valid before reply is sent, right?

Yes, but even if you clone the content of the buffer, where would you 
send it *to*? You'll need the peer parameter to know where to send your 
response.

> 
> Thanks
> Qi
>>
>> --
>> Thanks,
>> Anatoly


-- 
Thanks,
Anatoly