From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <anatoly.burakov@intel.com>
Received: from mga03.intel.com (mga03.intel.com [134.134.136.65])
 by dpdk.org (Postfix) with ESMTP id 39C4E1B03C
 for <dev@dpdk.org>; Thu, 25 Jan 2018 19:03:01 +0100 (CET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga002.jf.intel.com ([10.7.209.21])
 by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 25 Jan 2018 10:02:59 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.46,412,1511856000"; d="scan'208";a="29470121"
Received: from nstojanx-wtg.amr.corp.intel.com (HELO [10.252.26.38])
 ([10.252.26.38])
 by orsmga002.jf.intel.com with ESMTP; 25 Jan 2018 10:02:58 -0800
To: "Tan, Jianfeng" <jianfeng.tan@intel.com>,
 "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
 "dev@dpdk.org" <dev@dpdk.org>
Cc: "Richardson, Bruce" <bruce.richardson@intel.com>,
 "thomas@monjalon.net" <thomas@monjalon.net>
References: <1512067450-59203-1-git-send-email-jianfeng.tan@intel.com>
 <1516853783-108023-1-git-send-email-jianfeng.tan@intel.com>
 <1516853783-108023-3-git-send-email-jianfeng.tan@intel.com>
 <93ea032e-e2da-f087-567a-2397fad7ff02@intel.com>
 <2601191342CEEE43887BDE71AB977258862836A2@irsmsx105.ger.corp.intel.com>
 <a7f9b0f9-128a-c01c-d745-3a0085deda75@intel.com>
 <2601191342CEEE43887BDE71AB97725886283712@irsmsx105.ger.corp.intel.com>
 <de0c24f3-758f-3b06-654e-7b7975560c09@intel.com>
 <374c05b2-1334-925f-3035-a24b79fe7ff9@intel.com>
 <2601191342CEEE43887BDE71AB9772588628380A@irsmsx105.ger.corp.intel.com>
 <02b00727-e8cc-2041-f8b8-503f75f008f3@intel.com>
 <3178bc08-02d4-9020-853f-a8a71f39bb50@intel.com>
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Message-ID: <b57c9112-2fcb-f629-f562-21ddc3aee6c3@intel.com>
Date: Thu, 25 Jan 2018 18:02:57 +0000
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101
 Thunderbird/52.5.2
MIME-Version: 1.0
In-Reply-To: <3178bc08-02d4-9020-853f-a8a71f39bb50@intel.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 8bit
Subject: Re: [dpdk-dev] [PATCH v3 2/3] eal: add synchronous multi-process
 communication
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 25 Jan 2018 18:03:01 -0000

On 25-Jan-18 5:10 PM, Tan, Jianfeng wrote:
> 
> 
> On 1/26/2018 12:22 AM, Burakov, Anatoly wrote:
>> On 25-Jan-18 3:03 PM, Ananyev, Konstantin wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Burakov, Anatoly
>>>> Sent: Thursday, January 25, 2018 1:10 PM
>>>> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Tan, 
>>>> Jianfeng <jianfeng.tan@intel.com>; dev@dpdk.org
>>>> Cc: Richardson, Bruce <bruce.richardson@intel.com>; thomas@monjalon.net
>>>> Subject: Re: [dpdk-dev] [PATCH v3 2/3] eal: add synchronous 
>>>> multi-process communication
>>>>
>>>> On 25-Jan-18 1:05 PM, Burakov, Anatoly wrote:
>>>>> On 25-Jan-18 1:00 PM, Ananyev, Konstantin wrote:
>>>>>>
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Burakov, Anatoly
>>>>>>> Sent: Thursday, January 25, 2018 12:26 PM
>>>>>>> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Tan, 
>>>>>>> Jianfeng
>>>>>>> <jianfeng.tan@intel.com>; dev@dpdk.org
>>>>>>> Cc: Richardson, Bruce <bruce.richardson@intel.com>; 
>>>>>>> thomas@monjalon.net
>>>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process
>>>>>>> communication
>>>>>>>
>>>>>>> On 25-Jan-18 12:19 PM, Ananyev, Konstantin wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Burakov, Anatoly
>>>>>>>>> Sent: Thursday, January 25, 2018 12:00 PM
>>>>>>>>> To: Tan, Jianfeng <jianfeng.tan@intel.com>; dev@dpdk.org
>>>>>>>>> Cc: Richardson, Bruce <bruce.richardson@intel.com>; Ananyev,
>>>>>>>>> Konstantin <konstantin.ananyev@intel.com>; thomas@monjalon.net
>>>>>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process
>>>>>>>>> communication
>>>>>>>>>
>>>>>>>>> On the overall patch,
>>>>>>>>>
>>>>>>>>> Reviewed-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>>>>>
>>>>>>>>> For request(), returning number of replies received actually makes
>>>>>>>>> sense, because now we get use the value to read our replies, if we
>>>>>>>>> were
>>>>>>>>> a primary process sending messages to secondary processes.
>>>>>>>>
>>>>>>>> Yes, I also think it is good to return number of sends.
>>>>>>>> Then caller can compare number of sended requests with number of
>>>>>>>> received replies and decide should it be considered a failure or 
>>>>>>>> no.
>>>>>>>>
>>>>>>>
>>>>>>> Well, OK, that might make sense. However, i think it would've be 
>>>>>>> of more
>>>>>>> value to make the API consistent (0/-1 on success/failure) and put
>>>>>>> number of sent messages into the reply, like number of received. 
>>>>>>> I.e.
>>>>>>> something like
>>>>>>>
>>>>>>> struct reply {
>>>>>>>       int nb_sent;
>>>>>>>       int nb_received;
>>>>>>> };
>>>>>>>
>>>>>>> We do it for the latter already, so why not the former?
>>>>>>
>>>>>> The question is what treat as success/failure?
>>>>>> Let say we sent 2 requests (of 3 possible), got back 1 response...
>>>>>> Should we consider it as success or failure?
>>>>>>
>>>>>
>>>>> I think "failure" is "something went wrong", not "secondary processes
>>>>> didn't respond". For example, invalid parameters, or our socket 
>>>>> suddenly
>>>>> being closed, or some other error that prevents us from sending 
>>>>> requests
>>>>> to secondaries.
>>>>>
>>>>> As far as i can tell from the code, there's no way to know if the
>>>>> secondary process is running other than by attempting to connect to 
>>>>> it,
>>>>> and get a response. So, failed connection should not be a failure
>>>>> condition, because we can't know if we *can* connect to the process
>>>>> until we do. Process may have ended, but socket files will still be
>>>>> around, and there's nothing we can do about that. So i wouldn't 
>>>>> consider
>>>>> inability to send a message a failure condition.
>>>>>
>>>>
>>>> Just to clarify - i'm suggesting leaving this decision up to the user.
>>>> If a user expects there to be "n" processes running, but only "m"
>>>> responses were received, he could treat it as error. Another user might
>>>> simply send periodical updates/polls to secondaries, for whatever 
>>>> reason
>>>> (say, stats display), and won't really care if one of them just 
>>>> died, so
>>>> there's no error for that user.
>>>>
>>>> However, all of this has nothing to do with API. If we're able to send
>>>> messages - it's not a failure. If we can't - it is. That's the part API
>>>> should be concerned about, and that's what the return value should
>>>> indicate, IMO.
>>>
>>> Ok so to clarify, you are suggesting:
>>> we have N peers - if send_msg() returns success for all N - return 
>>> success
>>> (no matter did we get a reply or not)
>>> Otherwise return a failure.
>>> ?
>>> Konstantin
>>
>> More along the lines of, return -1 if and only if something went 
>> wrong. That might be invalid parameters, or that might be an error 
>> with our own socket,
> 
> To check if the error is caused by our own socket, we check the errno 
> after sendmsg?
> 
> Like for remote socket errors, we check:
> - ECONNRESET
> - ECONNREFUSED
> - ENOBUFS
> 
> Right?
> 
> Thanks,
> Jianfeng

Well, that was only an example. If it doesn't make much sense to do so 
in this case, then don't, and only return -1 on invalid parameters. 
AFAIU we're using connectionless sockets so a bunch of these errors 
won't be applicable to us. Maybe -ENOBUFS, but i'm not sure it's worth 
it to check for that.

> 
> 
>> or something else to that effect. In all other cases, return 0 (that 
>> includes cases where we sent N messages but M replies where N != M). 
>> So, in other words, return 0 if we *could have succeeded* if nothing 
>> went wrong on the other side, and only return -1 if something went 
>> wrong on our side.
>>
>>>
>>>
>>>>
>>>> -- 
>>>> Thanks,
>>>> Anatoly
>>
>>
> 
> 

-- 
Thanks,
Anatoly