From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by dpdk.org (Postfix) with ESMTP id 115B41B1A9 for ; Wed, 17 Jan 2018 11:50:25 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga102.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 17 Jan 2018 02:50:25 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.46,372,1511856000"; d="scan'208";a="10920031" Received: from irsmsx107.ger.corp.intel.com ([163.33.3.99]) by fmsmga007.fm.intel.com with ESMTP; 17 Jan 2018 02:50:24 -0800 Received: from irsmsx105.ger.corp.intel.com ([169.254.7.236]) by IRSMSX107.ger.corp.intel.com ([169.254.10.239]) with mapi id 14.03.0319.002; Wed, 17 Jan 2018 10:50:22 +0000 From: "Ananyev, Konstantin" To: "Tan, Jianfeng" , "dev@dpdk.org" , "Burakov, Anatoly" CC: "Richardson, Bruce" , "thomas@monjalon.net" Thread-Topic: [PATCH v2 3/4] eal: add synchronous multi-process communication Thread-Index: AQHTipGGLptO92489k+SILW6rHFPTqN1nPiQgACQ2ICAADIq4IAAXlsAgAEoDUA= Date: Wed, 17 Jan 2018 10:50:22 +0000 Message-ID: <2601191342CEEE43887BDE71AB9772588627EE16@irsmsx105.ger.corp.intel.com> References: <1512067450-59203-1-git-send-email-jianfeng.tan@intel.com> <1515643654-129489-1-git-send-email-jianfeng.tan@intel.com> <1515643654-129489-4-git-send-email-jianfeng.tan@intel.com> <2601191342CEEE43887BDE71AB9772588627E0E5@irsmsx105.ger.corp.intel.com> <5733adcf-ef47-2c07-a39c-7eda01add6e0@intel.com> <2601191342CEEE43887BDE71AB9772588627E2D8@irsmsx105.ger.corp.intel.com> In-Reply-To: Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiMjNjYjZkYTktMDJhMi00NzBiLTg5NzctNWViMWM3ODk3ZjcyIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE2LjUuOS4zIiwiVHJ1c3RlZExhYmVsSGFzaCI6ImFETFdRQmZ5N3FHOWtZTmE2ckFZekVjR3BNemhTeVN1b01oaHc5Nmx1ZlE9In0= x-ctpclassification: CTP_NT dlp-product: dlpe-windows dlp-version: 11.0.0.116 dlp-reaction: no-action x-originating-ip: [163.33.239.180] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v2 3/4] eal: add synchronous multi-process communication X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Wed, 17 Jan 2018 10:50:26 -0000 > > Hi Jianfeng, > > > >> -----Original Message----- > >> From: Tan, Jianfeng > >> Sent: Tuesday, January 16, 2018 8:11 AM > >> To: Ananyev, Konstantin ; dev@dpdk.org; = Burakov, Anatoly > >> Cc: Richardson, Bruce ; thomas@monjalon.ne= t > >> Subject: Re: [PATCH v2 3/4] eal: add synchronous multi-process communi= cation > >> > >> Thank you, Konstantin and Anatoly firstly. Other comments are well > >> received and I'll send out a new version. > >> > >> > >> On 1/16/2018 8:00 AM, Ananyev, Konstantin wrote: > >>>> We need the synchronous way for multi-process communication, > >>>> i.e., blockingly waiting for reply message when we send a request > >>>> to the peer process. > >>>> > >>>> We add two APIs rte_eal_mp_request() and rte_eal_mp_reply() for > >>>> such use case. By invoking rte_eal_mp_request(), a request message > >>>> is sent out, and then it waits there for a reply message. The > >>>> timeout is hard-coded 5 Sec. And the replied message will be copied > >>>> in the parameters of this API so that the caller can decide how > >>>> to translate those information (including params and fds). Note > >>>> if a primary process owns multiple secondary processes, this API > >>>> will fail. > >>>> > >>>> The API rte_eal_mp_reply() is always called by an mp action handler. > >>>> Here we add another parameter for rte_eal_mp_t so that the action > >>>> handler knows which peer address to reply. > >>>> > >>>> We use mutex in rte_eal_mp_request() to guarantee that only one > >>>> request is on the fly for one pair of processes. > >>> You don't need to do things in such strange and restrictive way. > >>> Instead you can do something like that: > >>> 1) Introduce new struct, list for it and mutex > >>> struct sync_request { > >>> int reply_received; > >>> char dst[PATH_MAX]; > >>> char reply[...]; > >>> LIST_ENTRY(sync_request) next; > >>> }; > >>> > >>> static struct > >>> LIST_HEAD(list, sync_request); > >>> pthread_mutex_t lock; > >>> pthead_cond_t cond; > >>> } sync_requests; > >>> > >>> 2) then at request() call: > >>> Grab sync_requests.lock > >>> Check do we already have a pending request for that destination, > >>> If yes - the release the lock and returns with error. > >>> - allocate and init new sync_request struct, set reply_received= =3D0 > >>> - do send_msg() > >>> -then in a cycle: > >>> pthread_cond_timed_wait(&sync_requests.cond, &sync_request.lock, = ×pec); > >>> - at return from it check if sync_request.reply_received =3D=3D 1= , if not > >>> check if timeout expired and either return a failure or go to the sta= rt of the cycle. > >>> > >>> 3) at mp_handler() if REPLY received - grab sync_request.lock, > >>> search through sync_requests.list for dst[] , > >>> if found, then set it's reply_received=3D1, copy the received me= ssage into reply > >>> and call pthread_cond_braodcast((&sync_requests.cond); > >> The only benefit I can see is that now the sender can request to > >> multiple receivers at the same time. And it makes things more > >> complicated. Do we really need this? > > The benefit is that one thread is blocked waiting for response, > > your mp_handler can still receive and handle other messages. >=20 > This can already be done in the original implementation. mp_handler > listens for msg, request from the other peer(s), and replies the > requests, which is not affected. >=20 > > Plus as you said - other threads can keep sending messages. >=20 > For this one, in the original implementation, other threads can still > send msg, but not request. I suppose the request is not in a fast path, > why we care to make it fast? >=20 +int +rte_eal_mp_request(const char *action_name, + void *params, + int len_p, + int fds[], + int fds_in, + int fds_out) +{ + int i, j; + int sockfd; + int nprocs; + int ret =3D 0; + struct mp_msghdr *req; + struct timeval tv; + char buf[MAX_MSG_LENGTH]; + struct mp_msghdr *hdr; + + RTE_LOG(DEBUG, EAL, "request: %s\n", action_name); + + if (fds_in > SCM_MAX_FD || fds_out > SCM_MAX_FD) { + RTE_LOG(ERR, EAL, "Cannot send more than %d FDs\n", SCM_MAX_FD); + rte_errno =3D -E2BIG; + return 0; + } + + req =3D format_msg(action_name, params, len_p, fds_in, MP_REQ); + if (req =3D=3D NULL) + return 0; + + if ((sockfd =3D open_unix_fd(0)) < 0) { + free(req); + return 0; + } + + tv.tv_sec =3D 5; /* 5 Secs Timeout */ + tv.tv_usec =3D 0; + if (setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, + (const void *)&tv, sizeof(struct timeval)) < 0) + RTE_LOG(INFO, EAL, "Failed to set recv timeout\n"); I f you set it just for one call, why do you not restore it? Also I don't think it is a good idea to change it here -=20 if you'll make timeout a parameter value - then it could be overwritten by different threads.=20 + + /* Only allow one req at a time */ + pthread_mutex_lock(&mp_mutex_request); + + if (rte_eal_process_type() =3D=3D RTE_PROC_PRIMARY) { + nprocs =3D 0; + for (i =3D 0; i < MAX_SECONDARY_PROCS; ++i) + if (!mp_sec_sockets[i]) { + j =3D i; + nprocs++; + } + + if (nprocs > 1) { + RTE_LOG(ERR, EAL, + "multi secondary processes not supported\n"); + goto free_and_ret; + } + + ret =3D send_msg(sockfd, mp_sec_sockets[j], req, fds); As I remember - sndmsg() is also blocking call, so under some conditions yo= u can stall there forever. As mp_mutex_requestis still held - next rte_eal_mp_request(0 will also bloc= k forever here. + } else + ret =3D send_msg(sockfd, eal_mp_unix_path(), req, fds); + + if (ret =3D=3D 0) { + RTE_LOG(ERR, EAL, "failed to send request: %s\n", action_name); + ret =3D -1; + goto free_and_ret; + } + + ret =3D read_msg(sockfd, buf, MAX_MSG_LENGTH, fds, fds_out, NULL); if the message you receive is not a reply you are expecting - it will be simply dropped - mp_handler() would never process it. + if (ret > 0) { + hdr =3D (struct mp_msghdr *)buf; + if (hdr->len_params =3D=3D len_p) + memcpy(params, hdr->params, len_p); + else { + RTE_LOG(ERR, EAL, "invalid reply\n"); + ret =3D 0; + } + } + +free_and_ret: + free(req); + close(sockfd); + pthread_mutex_unlock(&mp_mutex_request); + return ret; +} All of the above makes me think that current implementation is erroneous and needs to be reworked. Konstantin