DPDK patches and discussions
From: Tiwei Bie <tiwei.bie@intel.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Cc: maxime.coquelin@redhat.com, zhihong.wang@intel.com, dev@dpdk.org,
	seanbh@gmail.com
Subject: Re: [dpdk-dev] [PATCH 2/3] net/virtio-user: avoid parsing process mappings
Date: Mon, 17 Sep 2018 19:57:51 +0800
Message-ID: <20180917115751.GA7807@debian>
In-Reply-To: <e299be5a-59d6-16cb-b201-baa62f566db9@intel.com>

On Mon, Sep 17, 2018 at 11:17:42AM +0100, Burakov, Anatoly wrote:
> On 10-Sep-18 4:59 AM, Tiwei Bie wrote:
> > On Fri, Sep 07, 2018 at 01:21:35PM +0100, Burakov, Anatoly wrote:
> > > On 07-Sep-18 12:35 PM, Tiwei Bie wrote:
> > > > On Fri, Sep 07, 2018 at 10:39:16AM +0100, Burakov, Anatoly wrote:
> > > > > On 05-Sep-18 5:28 AM, Tiwei Bie wrote:
> > > > > > Recently some memory APIs were introduced to allow users to
> > > > > > get the file descriptor and offset for each memory segment.
> > > > > > We can leverage those APIs to get rid of the /proc magic on
> > > > > > memory table preparation in vhost-user backend.
> > > > > > 
> > > > > > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > > > > > ---
> > > > > 
> > > > > <snip>
> > > > > 
> > > > > > -	for (i = 0; i < num; ++i) {
> > > > > > -		mr = &msg->payload.memory.regions[i];
> > > > > > -		mr->guest_phys_addr = huges[i].addr; /* use vaddr! */
> > > > > > -		mr->userspace_addr = huges[i].addr;
> > > > > > -		mr->memory_size = huges[i].size;
> > > > > > -		mr->mmap_offset = 0;
> > > > > > -		fds[i] = open(huges[i].path, O_RDWR);
> > > > > > +	if (rte_memseg_get_fd_offset_thread_unsafe(ms, &offset) < 0) {
> > > > > > +		PMD_DRV_LOG(ERR, "Failed to get offset, ms=%p rte_errno=%d",
> > > > > > +			ms, rte_errno);
> > > > > > +		return -1;
> > > > > > +	}
> > > > > > +
> > > > > > +	start_addr = (uint64_t)(uintptr_t)ms->addr;
> > > > > > +	end_addr = start_addr + ms->len;
> > > > > > +
> > > > > > +	for (i = 0; i < wa->region_nr; i++) {
> > > > > 
> > > > > There has to be a better way than to run this loop on every segment. Maybe
> > > > > store the last-used region, and only do a region lookup if there's a mismatch?
> > > > > Generally, in single-file segments mode, you'll get lots of segments from
> > > > > one memseg list one after the other, so you will need to do a lookup once
> > > > > the memseg list changes, instead of on each segment.
> > > > 
> > > > We may have holes in one memseg list due to dynamic free.
> > > > Do you mean we just need to do rte_memseg_contig_walk()
> > > > and we can assume that the fds of the contiguous memsegs will
> > > > be the same?
> > > 
> > > No, i didn't mean that.
> > > 
> > > Whether or not you are in single-file segments mode, you still need to scan
> > > each segment. However, you lose your state when you exit this function, and
> > > thus have to look up the appropriate memory region (that matches your fd)
> > > again, over and over. It would be good if you could store the last-used
> > > memory region somewhere, so that the next time you come back into this
> > > function, if the memseg has the same fd, you will not have to look it up.
> > > 
> > > Something like this:
> > > 
> > > struct walk_arg {
> > > 	*last_used;
> > > 	<snip>
> > > }
> > > 
> > > int walk_func() {
> > > 	<snip>
> > > 	cur_region = wa->last_used; // check if it matches
> > > 	if (cur_region->fd != fd) {
> > > 		// fd is different - we've changed the segment
> > > 		<snip>
> > > 		wa->last_used = cur_region;
> > > 	}
> > > }
> > 
> > Thanks for the code. :)
> > 
> > > 
> > > So, cache the last-used region so you don't have to look it up again,
> > > because chances are you won't have to. That way, you will still do region
> > > lookups, but only when you have to - not every time.
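
For reference, a fleshed-out version of that sketch could look like the
following. This is only an illustration of the caching idea - the walk_arg
layout, the region_info type, and the region_nr bookkeeping are assumptions,
not the actual patch code:

	#include <stdint.h>
	#include <rte_common.h>
	#include <rte_memory.h>

	/* illustrative region bookkeeping - not the actual virtio-user structs */
	struct region_info {
		int fd;
		uint64_t start_addr;
		uint64_t end_addr;
	};

	struct walk_arg {
		struct region_info regions[8];	/* vhost-user allows at most 8 */
		uint32_t region_nr;
		struct region_info *last_used;	/* cache across callback calls */
	};

	static int
	walk_func(const struct rte_memseg_list *msl __rte_unused,
		  const struct rte_memseg *ms, void *arg)
	{
		struct walk_arg *wa = arg;
		struct region_info *cur_region = wa->last_used;
		uint32_t i;
		int fd;

		fd = rte_memseg_get_fd_thread_unsafe(ms);
		if (fd < 0)
			return -1;

		/* fast path: same fd as the previous segment */
		if (cur_region == NULL || cur_region->fd != fd) {
			/* slow path: fd changed, look the region up again */
			cur_region = NULL;
			for (i = 0; i < wa->region_nr; i++) {
				if (wa->regions[i].fd == fd) {
					cur_region = &wa->regions[i];
					break;
				}
			}
			if (cur_region == NULL)
				return -1;	/* <snip> or create a new region */
			wa->last_used = cur_region;
		}

		/* <snip> extend cur_region with this segment's range */
		return 0;
	}

The walk itself could then be driven by e.g.
rte_memseg_walk_thread_unsafe(walk_func, &wa), with the regions
accumulated in wa.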
> > 
> > I can do it, but I'm not sure this optimization is really
> > necessary, because this loop should be quite fast: the max
> > number of regions permitted by vhost-user is quite small. And
> > we actually need to do that loop at least once for each packet
> > in vhost-user's dequeue and enqueue paths, i.e. the data path.
> 
> The number of regions is small, but the number of segments may be in the
> thousands. Think worst case: 8K segments in the 16th region

The number of regions permitted by vhost-user is 8,
and most likely we will have just two regions, as
single-file-segments mode is mandatory when using
2MB pages.

> - with my code,
> you will execute only 16 iterations on the first segment and use "last used" for
> the rest of the segments,

We still need to do 8K iterations on the segments.

> while with your code, it'll be 8K times 16 :)

IMO, what we really need is a way to reduce "8K",
i.e. reduce the number of segments (which can
currently be in the thousands) that we need to parse.

And the loop should be faster than the calls to
rte_memseg_get_fd_thread_unsafe() and
rte_memseg_get_fd_offset_thread_unsafe() (which
are also made for each segment).
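
For scale, this is roughly the per-segment work under discussion (a sketch
following the hunk quoted above; get_seg_info and its signature are
illustrative, only the two rte_memseg_* calls are the real API):

	#include <stddef.h>
	#include <stdint.h>
	#include <rte_memory.h>

	/* per-segment cost: two library calls, then the short region
	 * loop (at most 8 iterations, omitted here) */
	static int
	get_seg_info(const struct rte_memseg *ms, int *fd, size_t *offset,
		     uint64_t *start_addr, uint64_t *end_addr)
	{
		*fd = rte_memseg_get_fd_thread_unsafe(ms);
		if (*fd < 0)
			return -1;

		if (rte_memseg_get_fd_offset_thread_unsafe(ms, offset) < 0)
			return -1;

		*start_addr = (uint64_t)(uintptr_t)ms->addr;
		*end_addr = *start_addr + ms->len;
		return 0;
	}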

> 
> You'll have to clarify the "for each packet" part, not sure I follow.

Take the vhost-PMD as an example: when doing Rx burst
and Tx burst, we need to do that loop at least once
for each mbuf (i.e. each packet).
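
The loop in question, in sketch form (illustrative only - the guest_region
type and field names are assumptions, not the actual vhost structures):
each descriptor address from the guest is translated by scanning the
region table, so it runs per packet:

	#include <stdint.h>

	struct guest_region {
		uint64_t guest_phys_addr;	/* guest physical base */
		uint64_t host_user_addr;	/* host virtual base */
		uint64_t size;
	};

	/* translate a guest physical address to a host virtual address
	 * by scanning the regions - done at least once per mbuf */
	static inline uint64_t
	gpa_to_vva(const struct guest_region *regions, uint32_t nregions,
		   uint64_t gpa)
	{
		uint32_t i;

		for (i = 0; i < nregions; i++) {
			const struct guest_region *r = &regions[i];

			if (gpa >= r->guest_phys_addr &&
			    gpa < r->guest_phys_addr + r->size)
				return gpa - r->guest_phys_addr +
					r->host_user_addr;
		}

		return 0; /* not found */
	}

With at most 8 regions the scan itself is cheap, which is the point
being made here.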

> 
> -- 
> Thanks,
> Anatoly


Thread overview: 23+ messages
2018-09-05  4:28 [dpdk-dev] [PATCH 0/3] Some fixes/improvements for virtio-user memory table Tiwei Bie
2018-09-05  4:28 ` [dpdk-dev] [PATCH 1/3] net/virtio-user: fix deadlock in memory events callback Tiwei Bie
2018-09-05  8:10   ` Sean Harte
2018-09-07  9:30   ` Burakov, Anatoly
2018-09-11 12:52   ` Maxime Coquelin
2018-09-05  4:28 ` [dpdk-dev] [PATCH 2/3] net/virtio-user: avoid parsing process mappings Tiwei Bie
2018-09-07  9:39   ` Burakov, Anatoly
2018-09-07 11:35     ` Tiwei Bie
2018-09-07 12:21       ` Burakov, Anatoly
2018-09-10  3:59         ` Tiwei Bie
2018-09-17 10:17           ` Burakov, Anatoly
2018-09-17 11:57             ` Tiwei Bie [this message]
2018-09-17 13:06               ` Burakov, Anatoly
2018-09-11 12:58   ` Maxime Coquelin
2018-09-05  4:28 ` [dpdk-dev] [PATCH 3/3] net/virtio-user: fix memory hotplug support in vhost-kernel Tiwei Bie
2018-09-07  9:44   ` Burakov, Anatoly
2018-09-07 11:37     ` Tiwei Bie
2018-09-07 12:24       ` Burakov, Anatoly
2018-09-10  4:04         ` Tiwei Bie
2018-09-17 10:18           ` Burakov, Anatoly
2018-09-11 13:10   ` Maxime Coquelin
2018-09-11 13:11 ` [dpdk-dev] [PATCH 0/3] Some fixes/improvements for virtio-user memory table Maxime Coquelin
2018-09-20  7:35 ` Maxime Coquelin
