DPDK patches and discussions
From: "Tan, Jianfeng" <jianfeng.tan@intel.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"Burakov, Anatoly" <anatoly.burakov@intel.com>
Cc: "Richardson, Bruce" <bruce.richardson@intel.com>,
	"thomas@monjalon.net" <thomas@monjalon.net>
Subject: Re: [dpdk-dev] [PATCH v2 3/4] eal: add synchronous multi-process communication
Date: Wed, 17 Jan 2018 21:09:22 +0800
Message-ID: <74ccd840-86af-4dba-e5ba-494017052841@intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB9772588627EE16@irsmsx105.ger.corp.intel.com>



On 1/17/2018 6:50 PM, Ananyev, Konstantin wrote:
>
>>> Hi Jianfeng,
>>>
>>>> -----Original Message-----
>>>> From: Tan, Jianfeng
>>>> Sent: Tuesday, January 16, 2018 8:11 AM
>>>> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dev@dpdk.org; Burakov, Anatoly <anatoly.burakov@intel.com>
>>>> Cc: Richardson, Bruce <bruce.richardson@intel.com>; thomas@monjalon.net
>>>> Subject: Re: [PATCH v2 3/4] eal: add synchronous multi-process communication
>>>>
>>>> First of all, thank you, Konstantin and Anatoly. Your other comments
>>>> are well received, and I'll send out a new version.
>>>>
>>>>
>>>> On 1/16/2018 8:00 AM, Ananyev, Konstantin wrote:
>>>>>> We need a synchronous way to do multi-process communication,
>>>>>> i.e., blocking while waiting for the reply message after sending
>>>>>> a request to the peer process.
>>>>>>
>>>>>> We add two APIs rte_eal_mp_request() and rte_eal_mp_reply() for
>>>>>> such use case. By invoking rte_eal_mp_request(), a request message
>>>>>> is sent out, and then it waits there for a reply message. The
>>>>>> timeout is hard-coded to 5 seconds. The reply message is copied
>>>>>> into the parameters of this API so that the caller can decide how
>>>>>> to interpret that information (including params and fds). Note
>>>>>> that if a primary process owns multiple secondary processes, this
>>>>>> API will fail.
>>>>>>
>>>>>> The API rte_eal_mp_reply() is always called by an mp action handler.
>>>>>> Here we add another parameter for rte_eal_mp_t so that the action
>>>>>> handler knows which peer address to reply.
>>>>>>
>>>>>> We use a mutex in rte_eal_mp_request() to guarantee that only one
>>>>>> request is in flight for one pair of processes.
>>>>> You don't need to do things in such a strange and restrictive way.
>>>>> Instead you can do something like this:
>>>>> 1) Introduce new struct, list for it and mutex
>>>>>     struct sync_request {
>>>>>          int reply_received;
>>>>>          char dst[PATH_MAX];
>>>>>          char reply[...];
>>>>>          LIST_ENTRY(sync_request) next;
>>>>> };
>>>>>
>>>>> static struct {
>>>>>        LIST_HEAD(list, sync_request) list;
>>>>>        pthread_mutex_t lock;
>>>>>        pthread_cond_t cond;
>>>>> } sync_requests;
>>>>>
>>>>> 2) then at request() call:
>>>>>      Grab sync_requests.lock
>>>>>      Check whether we already have a pending request for that destination.
>>>>>      If yes - then release the lock and return with an error.
>>>>>      - allocate and init new sync_request struct, set reply_received=0
>>>>>      - do send_msg()
>>>>>      - then in a cycle:
>>>>>      pthread_cond_timedwait(&sync_requests.cond, &sync_requests.lock, &timespec);
>>>>>      - at return from it check if sync_request.reply_received == 1, if not
>>>>> check if timeout expired and either return a failure or go to the start of the cycle.
>>>>>
>>>>> 3) at mp_handler() if REPLY received - grab sync_requests.lock,
>>>>>        search through sync_requests.list for dst[],
>>>>>       if found, then set its reply_received=1, copy the received message into reply
>>>>>       and call pthread_cond_broadcast(&sync_requests.cond);
>>>> The only benefit I can see is that now the sender can request to
>>>> multiple receivers at the same time. And it makes things more
>>>> complicated. Do we really need this?
>>> The benefit is that while one thread is blocked waiting for a response,
>>> your mp_handler can still receive and handle other messages.
>> This can already be done in the original implementation. mp_handler
>> listens for msgs and requests from the other peer(s), and replies to the
>> requests, which is not affected.
>>
>>> Plus as you said - other threads can keep sending messages.
>> For this one, in the original implementation, other threads can still
>> send msgs, but not requests. I suppose the request is not in a fast path,
>> so why do we care about making it fast?
>>
> +int
> +rte_eal_mp_request(const char *action_name,
> +		   void *params,
> +		   int len_p,
> +		   int fds[],
> +		   int fds_in,
> +		   int fds_out)
> +{
> +	int i, j;
> +	int sockfd;
> +	int nprocs;
> +	int ret = 0;
> +	struct mp_msghdr *req;
> +	struct timeval tv;
> +	char buf[MAX_MSG_LENGTH];
> +	struct mp_msghdr *hdr;
> +
> +	RTE_LOG(DEBUG, EAL, "request: %s\n", action_name);
> +
> +	if (fds_in > SCM_MAX_FD || fds_out > SCM_MAX_FD) {
> +		RTE_LOG(ERR, EAL, "Cannot send more than %d FDs\n", SCM_MAX_FD);
> +		rte_errno = -E2BIG;
> +		return 0;
> +	}
> +
> +	req = format_msg(action_name, params, len_p, fds_in, MP_REQ);
> +	if (req == NULL)
> +		return 0;
> +
> +	if ((sockfd = open_unix_fd(0)) < 0) {
> +		free(req);
> +		return 0;
> +	}
> +
> +	tv.tv_sec = 5;  /* 5 Secs Timeout */
> +	tv.tv_usec = 0;
> +	if (setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO,
> +			(const void *)&tv, sizeof(struct timeval)) < 0)
> +		RTE_LOG(INFO, EAL, "Failed to set recv timeout\n");
>
> If you set it just for one call, why do you not restore it?

Yes, the original code is buggy; I should have put it inside the critical section.

Do you mean we create the socket just once and use it forever? If so, we
could move the open and the setsockopt() call into mp_init().

> Also I don't think it is a good idea to change it here -
> if you'll make timeout a parameter value - then it could be overwritten
> by different threads.

For simplicity, I'm not inclined to expose the timeout as a parameter
to the caller. So if you agree, I'll set it in mp_init(), together with
the open.
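
As a rough sketch of that idea (set the receive timeout once, at init time,
rather than on every request): mp_set_recv_timeout() and
demo_recv_times_out() are hypothetical names, and a socketpair stands in for
the real channel socket.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical init-time helper: apply the receive timeout once. */
static int mp_set_recv_timeout(int sockfd, long sec, long usec)
{
	struct timeval tv = { .tv_sec = sec, .tv_usec = usec };

	return setsockopt(sockfd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));
}

/*
 * Demo: a recv() on an idle datagram socket returns after the timeout
 * instead of blocking forever. Returns 0 if that is what happened.
 */
static int demo_recv_times_out(void)
{
	int sv[2];
	char buf[16];
	int ret = -1;

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	if (mp_set_recv_timeout(sv[0], 0, 100000) == 0) { /* 100 ms */
		ssize_t n = recv(sv[0], buf, sizeof(buf), 0);

		if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
			ret = 0;
	}
	close(sv[0]);
	close(sv[1]);
	return ret;
}
```

Because the option is a property of the socket, setting it once on a
long-lived socket also avoids different threads overwriting each other's
value.
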

>
> +
> +	/* Only allow one req at a time */
> +	pthread_mutex_lock(&mp_mutex_request);
> +
> +	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
> +		nprocs = 0;
> +		for (i = 0; i < MAX_SECONDARY_PROCS; ++i)
> +			if (!mp_sec_sockets[i]) {
> +				j = i;
> +				nprocs++;
> +			}
> +
> +		if (nprocs > 1) {
> +			RTE_LOG(ERR, EAL,
> +				"multi secondary processes not supported\n");
> +			goto free_and_ret;
> +		}
> +
> +		ret = send_msg(sockfd, mp_sec_sockets[j], req, fds);
>
> As I remember - sendmsg() is also a blocking call, so under some conditions you can stall
> there forever.

From Linux's unix_dgram_sendmsg(), we see:
    timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);

I assume it will not block for a datagram Unix socket on Linux, but I'm
not sure how it behaves on FreeBSD.

Anyway, it is better to add an explicit setsockopt() so that it does not block.
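
One possible reading of "an explicit setsockopt()" is a send timeout via
SO_SNDTIMEO; passing MSG_DONTWAIT per call would be another option. The
sketch below assumes the former and Linux behaviour. mp_set_send_timeout()
and demo_send_gives_up() are hypothetical names, and a socketpair that nobody
reads from stands in for a stalled peer.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper: bound how long a send on this socket may block. */
static int mp_set_send_timeout(int sockfd, long sec, long usec)
{
	struct timeval tv = { .tv_sec = sec, .tv_usec = usec };

	return setsockopt(sockfd, SOL_SOCKET, SO_SNDTIMEO, &tv, sizeof(tv));
}

/*
 * Demo: keep sending to a peer that never reads. Returns the errno of the
 * first failed send(), 0 if no send failed, or -1 on a setup error.
 */
static int demo_send_gives_up(void)
{
	int sv[2], i, err = 0;
	char payload[1024];

	if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
		return -1;
	memset(payload, 0, sizeof(payload));
	if (mp_set_send_timeout(sv[0], 0, 100000) < 0) /* 100 ms */
		err = -1;
	for (i = 0; err == 0 && i < 100000; i++)
		if (send(sv[0], payload, sizeof(payload), 0) < 0)
			err = errno;
	close(sv[0]);
	close(sv[1]);
	return err;
}
```

With the timeout in place, the send fails once the socket buffer is full
instead of stalling while mp_mutex_request is held.
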

> As mp_mutex_request is still held - next rte_eal_mp_request() will also block forever here.
>
> +	} else
> +		ret = send_msg(sockfd, eal_mp_unix_path(), req, fds);
> +
> +	if (ret == 0) {
> +		RTE_LOG(ERR, EAL, "failed to send request: %s\n", action_name);
> +		ret = -1;
> +		goto free_and_ret;
> +	}
> +
> +	ret = read_msg(sockfd, buf, MAX_MSG_LENGTH, fds, fds_out, NULL);
>
> if the message you receive is not a reply you are expecting -
> it will be simply dropped - mp_handler() would never process it.

We cannot tell with certainty whether it is the right reply; we can only
check the action_name, which means we could still get a wrong reply if
several requests share the same action_name.

Is just comparing the action_name acceptable?
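
To make the question concrete, such a check could look like the sketch
below. The header layout is an assumption: struct mp_hdr_sketch, MP_NAME_MAX,
and the MP_MSG and MP_REP enumerators are hypothetical (only MP_REQ appears
in the patch), and the real struct mp_msghdr may differ.

```c
#include <string.h>

#define MP_NAME_MAX 64 /* hypothetical limit on the action name length */

enum mp_type_sketch { MP_MSG, MP_REQ, MP_REP }; /* MP_MSG, MP_REP assumed */

/* Hypothetical stand-in for the fields of struct mp_msghdr used here. */
struct mp_hdr_sketch {
	enum mp_type_sketch type;
	char action_name[MP_NAME_MAX];
	int len_params;
};

/*
 * Accept a message as the awaited reply only if it is a reply, carries
 * the same action name as the request, and has the expected length.
 */
static int mp_reply_matches(const struct mp_hdr_sketch *hdr,
			    const char *action_name, int len_p)
{
	return hdr->type == MP_REP &&
	       strncmp(hdr->action_name, action_name, MP_NAME_MAX) == 0 &&
	       hdr->len_params == len_p;
}
```

This still cannot tell apart two outstanding requests that use the same
action name, which is the limitation described above.
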

>
> +	if (ret > 0) {
> +		hdr = (struct mp_msghdr *)buf;
> +		if (hdr->len_params == len_p)
> +			memcpy(params, hdr->params, len_p);
> +		else {
> +			RTE_LOG(ERR, EAL, "invalid reply\n");
> +			ret = 0;
> +		}
> +	}
> +
> +free_and_ret:
> +	free(req);
> +	close(sockfd);
> +	pthread_mutex_unlock(&mp_mutex_request);
> +	return ret;
> +}
>
> All of the above makes me think that current implementation is erroneous
> and needs to be reworked.

Thank you for your review. I'll work on a new version.

Thanks,
Jianfeng

> Konstantin
>
>
