From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by dpdk.org (Postfix) with ESMTP id 170D31B1AD for ; Thu, 25 Jan 2018 18:10:37 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga104.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 25 Jan 2018 09:10:37 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.46,412,1511856000"; d="scan'208";a="22538236" Received: from tanjianf-mobl.ccr.corp.intel.com (HELO [10.255.27.46]) ([10.255.27.46]) by FMSMGA003.fm.intel.com with ESMTP; 25 Jan 2018 09:10:35 -0800 To: "Burakov, Anatoly" , "Ananyev, Konstantin" , "dev@dpdk.org" References: <1512067450-59203-1-git-send-email-jianfeng.tan@intel.com> <1516853783-108023-1-git-send-email-jianfeng.tan@intel.com> <1516853783-108023-3-git-send-email-jianfeng.tan@intel.com> <93ea032e-e2da-f087-567a-2397fad7ff02@intel.com> <2601191342CEEE43887BDE71AB977258862836A2@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725886283712@irsmsx105.ger.corp.intel.com> <374c05b2-1334-925f-3035-a24b79fe7ff9@intel.com> <2601191342CEEE43887BDE71AB9772588628380A@irsmsx105.ger.corp.intel.com> <02b00727-e8cc-2041-f8b8-503f75f008f3@intel.com> Cc: "Richardson, Bruce" , "thomas@monjalon.net" From: "Tan, Jianfeng" Message-ID: <3178bc08-02d4-9020-853f-a8a71f39bb50@intel.com> Date: Fri, 26 Jan 2018 01:10:34 +0800 User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:45.0) Gecko/20100101 Thunderbird/45.8.0 MIME-Version: 1.0 In-Reply-To: <02b00727-e8cc-2041-f8b8-503f75f008f3@intel.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [dpdk-dev] [PATCH v3 2/3] eal: add synchronous multi-process communication X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 25 Jan 2018 17:10:38 -0000 On 1/26/2018 12:22 AM, Burakov, Anatoly wrote: > On 25-Jan-18 3:03 PM, Ananyev, Konstantin wrote: >> >> >>> -----Original Message----- >>> From: Burakov, Anatoly >>> Sent: Thursday, January 25, 2018 1:10 PM >>> To: Ananyev, Konstantin ; Tan, >>> Jianfeng ; dev@dpdk.org >>> Cc: Richardson, Bruce ; thomas@monjalon.net >>> Subject: Re: [dpdk-dev] [PATCH v3 2/3] eal: add synchronous >>> multi-process communication >>> >>> On 25-Jan-18 1:05 PM, Burakov, Anatoly wrote: >>>> On 25-Jan-18 1:00 PM, Ananyev, Konstantin wrote: >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Burakov, Anatoly >>>>>> Sent: Thursday, January 25, 2018 12:26 PM >>>>>> To: Ananyev, Konstantin ; Tan, >>>>>> Jianfeng >>>>>> ; dev@dpdk.org >>>>>> Cc: Richardson, Bruce ; >>>>>> thomas@monjalon.net >>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process >>>>>> communication >>>>>> >>>>>> On 25-Jan-18 12:19 PM, Ananyev, Konstantin wrote: >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Burakov, Anatoly >>>>>>>> Sent: Thursday, January 25, 2018 12:00 PM >>>>>>>> To: Tan, Jianfeng ; dev@dpdk.org >>>>>>>> Cc: Richardson, Bruce ; Ananyev, >>>>>>>> Konstantin ; thomas@monjalon.net >>>>>>>> Subject: Re: [PATCH v3 2/3] eal: add synchronous multi-process >>>>>>>> communication >>>>>>>> >>>>>>>> On the overall patch, >>>>>>>> >>>>>>>> Reviewed-by: Anatoly Burakov >>>>>>>> >>>>>>>> For request(), returning number of replies received actually makes >>>>>>>> sense, because now we get use the value to read our replies, if we >>>>>>>> were >>>>>>>> a primary process sending messages to secondary processes. >>>>>>> >>>>>>> Yes, I also think it is good to return number of sends. >>>>>>> Then caller can compare number of sended requests with number of >>>>>>> received replies and decide should it be considered a failure or >>>>>>> no. >>>>>>> >>>>>> >>>>>> Well, OK, that might make sense. However, i think it would've be >>>>>> of more >>>>>> value to make the API consistent (0/-1 on success/failure) and put >>>>>> number of sent messages into the reply, like number of received. >>>>>> I.e. >>>>>> something like >>>>>> >>>>>> struct reply { >>>>>> int nb_sent; >>>>>> int nb_received; >>>>>> }; >>>>>> >>>>>> We do it for the latter already, so why not the former? >>>>> >>>>> The question is what treat as success/failure? >>>>> Let say we sent 2 requests (of 3 possible), got back 1 response... >>>>> Should we consider it as success or failure? >>>>> >>>> >>>> I think "failure" is "something went wrong", not "secondary processes >>>> didn't respond". For example, invalid parameters, or our socket >>>> suddenly >>>> being closed, or some other error that prevents us from sending >>>> requests >>>> to secondaries. >>>> >>>> As far as i can tell from the code, there's no way to know if the >>>> secondary process is running other than by attempting to connect to >>>> it, >>>> and get a response. So, failed connection should not be a failure >>>> condition, because we can't know if we *can* connect to the process >>>> until we do. Process may have ended, but socket files will still be >>>> around, and there's nothing we can do about that. So i wouldn't >>>> consider >>>> inability to send a message a failure condition. >>>> >>> >>> Just to clarify - i'm suggesting leaving this decision up to the user. >>> If a user expects there to be "n" processes running, but only "m" >>> responses were received, he could treat it as error. Another user might >>> simply send periodical updates/polls to secondaries, for whatever >>> reason >>> (say, stats display), and won't really care if one of them just >>> died, so >>> there's no error for that user. >>> >>> However, all of this has nothing to do with API. If we're able to send >>> messages - it's not a failure. If we can't - it is. That's the part API >>> should be concerned about, and that's what the return value should >>> indicate, IMO. >> >> Ok so to clarify, you are suggesting: >> we have N peers - if send_msg() returns success for all N - return >> success >> (no matter did we get a reply or not) >> Otherwise return a failure. >> ? >> Konstantin > > More along the lines of, return -1 if and only if something went > wrong. That might be invalid parameters, or that might be an error > with our own socket, To check if the error is caused by our own socket, we check the errno after sendmsg? Like for remote socket errors, we check: - ECONNRESET - ECONNREFUSED - ENOBUFS Right? Thanks, Jianfeng > or something else to that effect. In all other cases, return 0 (that > includes cases where we sent N messages but M replies where N != M). > So, in other words, return 0 if we *could have succeeded* if nothing > went wrong on the other side, and only return -1 if something went > wrong on our side. > >> >> >>> >>> -- >>> Thanks, >>> Anatoly > >