DPDK patches and discussions
 help / color / mirror / Atom feed
From: Jan Wickbom <jan.wickbom@ericsson.com>
To: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	Patrik Andersson R <patrik.r.andersson@ericsson.com>
Subject: Re: [dpdk-dev] [PATCH v3] vhost: allow for many vhost user ports
Date: Tue, 13 Dec 2016 13:15:20 +0000	[thread overview]
Message-ID: <AM4PR0701MB2146454F237F1645EDD3EDB8949B0@AM4PR0701MB2146.eurprd07.prod.outlook.com> (raw)
In-Reply-To: <20161213091454.GB18991@yliu-dev.sh.intel.com>



> -----Original Message-----
> From: Yuanhan Liu [mailto:yuanhan.liu@linux.intel.com]
> Sent: den 13 december 2016 10:15
> To: Jan Wickbom <jan.wickbom@ericsson.com>
> Cc: dev@dpdk.org; Patrik Andersson R <patrik.r.andersson@ericsson.com>
> Subject: Re: [PATCH v3] vhost: allow for many vhost user ports
> 
> On Mon, Dec 12, 2016 at 05:50:34PM +0100, Jan Wickbom wrote:
> > Currently select() is used to monitor file descriptors for vhostuser
> > ports. This limits the number of ports possible to create since the
> > fd number is used as index in the fd_set and we have seen fds > 1023.
> > This patch changes select() to poll(). This way we can keep an
> > packed (pollfd) array for the fds, e.g. as many fds as the size of
> > the array.
> >
> > Also see:
> > http://dpdk.org/ml/archives/dev/2016-April/037024.html
> >
> > Signed-off-by: Jan Wickbom <jan.wickbom@ericsson.com>
> > Reported-by: Patrik Andersson <patrik.r.andersson@ericsson.com>
> ...
> > +static struct pollfd rwfds[MAX_FDS];
> 
> Though it's unlikely, but just assume we have multiple instance of
> fdset_event_dispatch(pfdset), a global rwfds will not work.
> 
> Thought twice, and it's better to put it into the fdset struct:
> 
>     struct fdset {
>             struct fdentry fd[MAX_FDS];
>             struct pollfd rwfds[MAX_FDS];
>             ...
> 

Done


> >  /**
> >   * This functions runs in infinite blocking loop until there is no fd in
> >   * pfdset. It calls corresponding r/w handler if there is event on the fd.
> > @@ -229,55 +217,71 @@
> >  void
> >  fdset_event_dispatch(struct fdset *pfdset)
> >  {
> > -	fd_set rfds, wfds;
> > -	int i, maxfds;
> > +	int i;
> >  	struct fdentry *pfdentry;
> > -	int num = MAX_FDS;
> >  	fd_cb rcb, wcb;
> >  	void *dat;
> >  	int fd;
> >  	int remove1, remove2;
> > -	int ret;
> >
> >  	if (pfdset == NULL)
> >  		return;
> >
> > -	while (1) {
> > -		struct timeval tv;
> > -		tv.tv_sec = 1;
> > -		tv.tv_usec = 0;
> > -		FD_ZERO(&rfds);
> > -		FD_ZERO(&wfds);
> > -		pthread_mutex_lock(&pfdset->fd_mutex);
> > -
> > -		maxfds = fdset_fill(&rfds, &wfds, pfdset);
> > -
> > -		pthread_mutex_unlock(&pfdset->fd_mutex);
> > +	memset(rwfds, 0, sizeof(rwfds));
> >
> > +	while (1) {
> >  		/*
> > -		 * When select is blocked, other threads might
> unregister
> > +		 * When poll is blocked, other threads might
> unregister
> >  		 * listenfds from and register new listenfds into
> fdset.
> > -		 * When select returns, the entries for listenfds
> in the fdset
> > +		 * When poll returns, the entries for listenfds in
> the fdset
> >  		 * might have been updated. It is ok if there is
> unwanted call
> >  		 * for new listenfds.
> >  		 */
> > -		ret = select(maxfds + 1, &rfds, &wfds, NULL,
> &tv);
> > -		if (ret <= 0)
> > -			continue;
> > +		poll(rwfds, pfdset->num, 1000 /* millisecs */);
> >
> > -		for (i = 0; i < num; i++) {
> > -			remove1 = remove2 = 0;
> > +		for (i = 0; i < pfdset->num; ) {
> >  			pthread_mutex_lock(&pfdset-
> >fd_mutex);
> > +
> >  			pfdentry = &pfdset->fd[i];
> >  			fd = pfdentry->fd;
> > +
> > +			if (fd < 0) {
> > +				/* Removed during
> poll */
> > +
> > +
> 	fdset_move_last(pfdset, i);
> > +
> 	fdset_shrink(pfdset);
> > +
> > +
> 	pthread_mutex_unlock(&pfdset->fd_mutex);
> > +
> > +				continue;
> > +			}
> > +
> > +			if (!rwfds[i].revents) {
> > +				/* No revents,
> maybe added during poll */
> > +
> > +				rwfds[i].fd = fd;
> > +				rwfds[i].events =
> pfdentry->rcb ? POLLIN : 0;
> > +				rwfds[i].events |=
> pfdentry->wcb ? POLLOUT : 0;
> > +
> 	pthread_mutex_unlock(&pfdset->fd_mutex);
> > +
> > +				i++;
> > +				continue;
> 
> I think it's error-prone to manipulate the rwfds here. Besides, it
> registers an fd repeatedly.
> 
> The way I think of is:
> 
> - set rwfds[i] at fdset_add().
> 
>   This also simply makes sure that pfdset->rwfds[i] and pfdset->fd[i] is
>   correctly bond.
> 
> 	fdset_add(fdset, fd, ...) {
> 		lock();
> 
> 		i = fdset_find_free_slot(..);
> 
> 		pfdset->fd[i]->fd  = fd;
> 		pfdset->fd[i]->rcb = rcb;
> 		pfdset->fd[i]->...;
> 
> 		pfdset->rwfds[i]->fd = fd;
> 		pfdset->rwfds[i]->events  = ...;
> 		pfdset->rwfds[i]->revents = 0;
> 	}
> 
> 
> - set pfdset->fd[i]->fd = -1 on fdset_del. Note we should not decrease
>   'num' here, as we may be at poll.
> 
> 	fdset_del(fdset, fd) {
> 		lock();
> 
> 		i = fdset_find_fd(pfdset, fd);
> 		pfdset->fd[i]->fd = -1;
> 
> 		...
> 	}
> 
> 
> 
> - log down pfdset->num before poll, because 'num' may increase during poll.
> 
>   I think it's optional, since even 'num' is increased during poll, it just
>   leads to few more rwfds entries will be accessed. But it's not tracked by
>   kernel, and revents is initiated with 0, that there is no issue.
> 
> 
> - shrink the fdset and rwfds (together) for those removed entries,
> __outside__
>   the for loop after poll.
> 
> Works to you?

I did quite of a rework of this yesterday before reading your mail. I think
That should work, please have a look.
V4 coming in a minute
> 
> 	--yliu

  reply	other threads:[~2016-12-13 13:15 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-12-01 15:26 [dpdk-dev] [PATCH] " Jan Wickbom
2016-12-06  6:56 ` Yuanhan Liu
2016-12-06 11:42   ` Jan Wickbom
2016-12-07  7:43     ` Yuanhan Liu
2016-12-07 10:12 ` Yuanhan Liu
2016-12-07 13:23   ` Jan Wickbom
2016-12-08  5:50     ` Yuanhan Liu
2016-12-12 16:55       ` Jan Wickbom
2016-12-07 12:46 ` [dpdk-dev] [PATCH v2] " Jan Wickbom
2016-12-12 16:50 ` [dpdk-dev] [PATCH v3] " Jan Wickbom
2016-12-13  9:14   ` Yuanhan Liu
2016-12-13 13:15     ` Jan Wickbom [this message]
2016-12-13 13:19 ` [dpdk-dev] [PATCH v4] " Jan Wickbom
2016-12-14  3:25   ` Yuanhan Liu
2016-12-14 15:30 ` [dpdk-dev] [PATCH v5] " Jan Wickbom
2016-12-21  9:45   ` [dpdk-dev] [PATCH v6] " Yuanhan Liu
2016-12-21 18:06     ` Stephen Hemminger
2016-12-22  3:16       ` Yuanhan Liu
2017-01-12  4:05     ` Yuanhan Liu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=AM4PR0701MB2146454F237F1645EDD3EDB8949B0@AM4PR0701MB2146.eurprd07.prod.outlook.com \
    --to=jan.wickbom@ericsson.com \
    --cc=dev@dpdk.org \
    --cc=patrik.r.andersson@ericsson.com \
    --cc=yuanhan.liu@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).