From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by dpdk.org (Postfix) with ESMTP id 39C4E1B03C for ; Thu, 25 Jan 2018 19:03:01 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga002.jf.intel.com ([10.7.209.21]) by orsmga103.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 25 Jan 2018 10:02:59 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.46,412,1511856000"; d="scan'208";a="29470121" Received: from nstojanx-wtg.amr.corp.intel.com (HELO [10.252.26.38]) ([10.252.26.38]) by orsmga002.jf.intel.com with ESMTP; 25 Jan 2018 10:02:58 -0800 To: "Tan, Jianfeng" , "Ananyev, Konstantin" , "dev@dpdk.org" Cc: "Richardson, Bruce" , "thomas@monjalon.net" References: <1512067450-59203-1-git-send-email-jianfeng.tan@intel.com> <1516853783-108023-1-git-send-email-jianfeng.tan@intel.com> <1516853783-108023-3-git-send-email-jianfeng.tan@intel.com> <93ea032e-e2da-f087-567a-2397fad7ff02@intel.com> <2601191342CEEE43887BDE71AB977258862836A2@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725886283712@irsmsx105.ger.corp.intel.com> <374c05b2-1334-925f-3035-a24b79fe7ff9@intel.com> <2601191342CEEE43887BDE71AB9772588628380A@irsmsx105.ger.corp.intel.com> <02b00727-e8cc-2041-f8b8-503f75f008f3@intel.com> <3178bc08-02d4-9020-853f-a8a71f39bb50@intel.com> From: "Burakov, Anatoly" Message-ID: Date: Thu, 25 Jan 2018 18:02:57 +0000 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.5.2 MIME-Version: 1.0 In-Reply-To: <3178bc08-02d4-9020-853f-a8a71f39bb50@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [PATCH v3 2/3] eal: add synchronous multi-process communication X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Jan 2018 18:03:01 -0000 On 25-Jan-18 5:10 PM, Tan, Jianfeng wrote: > > > On 1/26/2018 12:22 AM, Burakov, Anatoly wrote: >> On 25-Jan-18 3:03 PM, Ananyev, Konstantin wrote: >>> >>> >>>> -----Original Message----- >>>> From: Burakov, Anatoly >>>> Sent: Thursday, January 25, 2018 1:10 PM >>>> To: Ananyev, Konstantin ; Tan, >>>> Jianfeng ; dev@dpdk.org >>>> Cc: Richardson, Bruce ; thomas@monjalon.net >>>> Subject: Re: [dpdk-dev] [PATCH v3 2/3] eal: add synchronous >>>> multi-process communication >>>> >>>> On 25-Jan-18 1:05 PM, Burakov, Anatoly wrote: >>>>> On 25-Jan-18 1:00 PM, Ananyev, Konstantin wrote: >>>>>> >>>>>> >>>>>>> -----Original Message----- >>>>>>> From: Burakov, Anatoly >>>>>>> Sent: Thursday, January 25, 2018 12:26 PM >>>>>>> To: Ananyev, Konstantin ; Tan, >>>>>>> Jianfeng >>>>>>> ; dev@dpdk.org >>>>>>> Cc: Richardson, Bruce ; >>>>>>> thomas@monjalon.net >>>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process >>>>>>> communication >>>>>>> >>>>>>> On 25-Jan-18 12:19 PM, Ananyev, Konstantin wrote: >>>>>>>> >>>>>>>> >>>>>>>>> -----Original Message----- >>>>>>>>> From: Burakov, Anatoly >>>>>>>>> Sent: Thursday, January 25, 2018 12:00 PM >>>>>>>>> To: Tan, Jianfeng ; dev@dpdk.org >>>>>>>>> Cc: Richardson, Bruce ; Ananyev, >>>>>>>>> Konstantin ; thomas@monjalon.net >>>>>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process >>>>>>>>> communication >>>>>>>>> >>>>>>>>> On the overall patch, >>>>>>>>> >>>>>>>>> Reviewed-by: Anatoly Burakov >>>>>>>>> >>>>>>>>> For request(), returning number of replies received actually makes >>>>>>>>> sense, because now we get use the value to read our replies, if we >>>>>>>>> were >>>>>>>>> a primary process sending messages to secondary processes. >>>>>>>> >>>>>>>> Yes, I also think it is good to return number of sends. >>>>>>>> Then caller can compare number of sended requests with number of >>>>>>>> received replies and decide should it be considered a failure or >>>>>>>> no. >>>>>>>> >>>>>>> >>>>>>> Well, OK, that might make sense. However, i think it would've be >>>>>>> of more >>>>>>> value to make the API consistent (0/-1 on success/failure) and put >>>>>>> number of sent messages into the reply, like number of received. >>>>>>> I.e. >>>>>>> something like >>>>>>> >>>>>>> struct reply { >>>>>>>       int nb_sent; >>>>>>>       int nb_received; >>>>>>> }; >>>>>>> >>>>>>> We do it for the latter already, so why not the former? >>>>>> >>>>>> The question is what treat as success/failure? >>>>>> Let say we sent 2 requests (of 3 possible), got back 1 response... >>>>>> Should we consider it as success or failure? >>>>>> >>>>> >>>>> I think "failure" is "something went wrong", not "secondary processes >>>>> didn't respond". For example, invalid parameters, or our socket >>>>> suddenly >>>>> being closed, or some other error that prevents us from sending >>>>> requests >>>>> to secondaries. >>>>> >>>>> As far as i can tell from the code, there's no way to know if the >>>>> secondary process is running other than by attempting to connect to >>>>> it, >>>>> and get a response. So, failed connection should not be a failure >>>>> condition, because we can't know if we *can* connect to the process >>>>> until we do. Process may have ended, but socket files will still be >>>>> around, and there's nothing we can do about that. So i wouldn't >>>>> consider >>>>> inability to send a message a failure condition. >>>>> >>>> >>>> Just to clarify - i'm suggesting leaving this decision up to the user. >>>> If a user expects there to be "n" processes running, but only "m" >>>> responses were received, he could treat it as error. Another user might >>>> simply send periodical updates/polls to secondaries, for whatever >>>> reason >>>> (say, stats display), and won't really care if one of them just >>>> died, so >>>> there's no error for that user. >>>> >>>> However, all of this has nothing to do with API. If we're able to send >>>> messages - it's not a failure. If we can't - it is. That's the part API >>>> should be concerned about, and that's what the return value should >>>> indicate, IMO. >>> >>> Ok so to clarify, you are suggesting: >>> we have N peers - if send_msg() returns success for all N - return >>> success >>> (no matter did we get a reply or not) >>> Otherwise return a failure. >>> ? >>> Konstantin >> >> More along the lines of, return -1 if and only if something went >> wrong. That might be invalid parameters, or that might be an error >> with our own socket, > > To check if the error is caused by our own socket, we check the errno > after sendmsg? > > Like for remote socket errors, we check: > - ECONNRESET > - ECONNREFUSED > - ENOBUFS > > Right? > > Thanks, > Jianfeng Well, that was only an example. If it doesn't make much sense to do so in this case, then don't, and only return -1 on invalid parameters. AFAIU we're using connectionless sockets so a bunch of these errors won't be applicable to us. Maybe -ENOBUFS, but i'm not sure it's worth it to check for that. > > >> or something else to that effect. In all other cases, return 0 (that >> includes cases where we sent N messages but M replies where N != M). >> So, in other words, return 0 if we *could have succeeded* if nothing >> went wrong on the other side, and only return -1 if something went >> wrong on our side. >> >>> >>> >>>> >>>> -- >>>> Thanks, >>>> Anatoly >> >> > > -- Thanks, Anatoly