References: <20180823025721.18300-1-tiwei.bie@intel.com>
 <992313c9-cd03-03db-888b-f5dd8c072cf0@intel.com>
In-Reply-To: <992313c9-cd03-03db-888b-f5dd8c072cf0@intel.com>
From: Sean Harte <seanbh@gmail.com>
Date: Thu, 23 Aug 2018 12:19:17 +0100
Message-ID: <CACA5i9YD9cYuti5b7-djuQq9sZRx2t-=aPk853QsG7q7mOztmA@mail.gmail.com>
To: anatoly.burakov@intel.com
Cc: tiwei.bie@intel.com, maxime.coquelin@redhat.com, zhihong.wang@intel.com, 
 dev@dpdk.org, stable@dpdk.org
Subject: Re: [dpdk-dev] [PATCH] net/virtio-user: fix memory hotplug support

On Thu, 23 Aug 2018 at 10:05, Burakov, Anatoly
<anatoly.burakov@intel.com> wrote:
>
> On 23-Aug-18 3:57 AM, Tiwei Bie wrote:
> > Deadlock can occur when allocating memory if a vhost-kernel
> > based virtio-user device is in use. Besides, it's possible
> > to have many more than 64 non-contiguous hugepage-backed
> > memory regions due to memory hotplug, which may cause
> > problems when handling the VHOST_SET_MEM_TABLE request. A
> > better solution is to have virtio-user pass all the VA
> > ranges reserved by DPDK to vhost-kernel.
> >
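
(Side note for context: VHOST_SET_MEM_TABLE takes a variable-length
region table. The layout below follows Linux's uapi <linux/vhost.h>
as I understand it, so treat it as a sketch rather than a verbatim
copy:)

	struct vhost_memory_region {
		uint64_t guest_phys_addr;
		uint64_t memory_size;   /* bytes */
		uint64_t userspace_addr;
		uint64_t flags_padding; /* no flags are currently specified */
	};

	struct vhost_memory {
		uint32_t nregions;
		uint32_t padding;
		struct vhost_memory_region regions[0];
	};

The vhost kernel module rejects tables with more entries than its
max_mem_regions limit (64 by default), which is why passing one entry
per memseg list, rather than one per memseg, matters here.
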
> > Bugzilla ID: 81
> > Fixes: 12ecb2f63b12 ("net/virtio-user: support memory hotplug")
> > Cc: stable@dpdk.org
> >
> > Reported-by: Seán Harte <seanbh@gmail.com>
> > Signed-off-by: Tiwei Bie <tiwei.bie@intel.com>
> > ---
> >   drivers/net/virtio/virtio_user/vhost_kernel.c | 64 ++++++++-----------
> >   1 file changed, 27 insertions(+), 37 deletions(-)
> >
> > diff --git a/drivers/net/virtio/virtio_user/vhost_kernel.c b/drivers/net/virtio/virtio_user/vhost_kernel.c
> > index b2444096c..49bd1b821 100644
> > --- a/drivers/net/virtio/virtio_user/vhost_kernel.c
> > +++ b/drivers/net/virtio/virtio_user/vhost_kernel.c
> > @@ -70,41 +70,12 @@ static uint64_t vhost_req_user_to_kernel[] = {
> >       [VHOST_USER_SET_MEM_TABLE] = VHOST_SET_MEM_TABLE,
> >   };
> >
> > -struct walk_arg {
> > -     struct vhost_memory_kernel *vm;
> > -     uint32_t region_nr;
> > -};
> > -static int
> > -add_memory_region(const struct rte_memseg_list *msl __rte_unused,
> > -             const struct rte_memseg *ms, size_t len, void *arg)
> > -{
> > -     struct walk_arg *wa = arg;
> > -     struct vhost_memory_region *mr;
> > -     void *start_addr;
> > -
> > -     if (wa->region_nr >= max_regions)
> > -             return -1;
> > -
> > -     mr = &wa->vm->regions[wa->region_nr++];
> > -     start_addr = ms->addr;
> > -
> > -     mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
> > -     mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
> > -     mr->memory_size = len;
> > -     mr->mmap_offset = 0;
> > -
> > -     return 0;
> > -}
> > -
> > -/* By default, vhost kernel module allows 64 regions, but DPDK allows
> > - * 256 segments. As a relief, below function merges those virtually
> > - * adjacent memsegs into one region.
> > - */
> >   static struct vhost_memory_kernel *
> >   prepare_vhost_memory_kernel(void)
> >   {
> > +     struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
> >       struct vhost_memory_kernel *vm;
> > -     struct walk_arg wa;
> > +     uint32_t region_nr = 0, i;
> >
> >       vm = malloc(sizeof(struct vhost_memory_kernel) +
> >                       max_regions *
> > @@ -112,15 +83,34 @@ prepare_vhost_memory_kernel(void)
> >       if (!vm)
> >               return NULL;
> >
> > -     wa.region_nr = 0;
> > -     wa.vm = vm;
> > +     for (i = 0; i < RTE_MAX_MEMSEG_LISTS; i++) {
> > +             struct rte_memseg_list *msl = &mcfg->memsegs[i];
> > +             struct vhost_memory_region *mr;
> > +             void *start_addr;
> > +             uint64_t len;
>
> There is a rte_memseg_list_walk() - please do not iterate over memseg
> lists manually.
>

rte_memseg_list_walk() can't be used here because
prepare_vhost_memory_kernel() is sometimes called from a memory
callback, and the callback already runs with memory_hotplug_lock
held, so the walk would hang trying to take the read lock a second
time. rte_memseg_list_walk_thread_unsafe() isn't appropriate either,
because prepare_vhost_memory_kernel() may also be called outside a
memory callback, where skipping the lock could race with hotplug.
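
To make the two call paths concrete, here's a rough sketch
(hypothetical helper names; assumes the 18.08 walk API from
rte_memory.h):

	#include <rte_memory.h>

	/* Count memseg lists; callback shape required by the walk API. */
	static int
	count_msl(const struct rte_memseg_list *msl __rte_unused, void *arg)
	{
		(*(unsigned int *)arg)++;
		return 0;
	}

	/* Reached from a memory event callback: the hotplug lock is
	 * already held by our caller, so rte_memseg_list_walk() would
	 * block forever trying to take it again; only the
	 * _thread_unsafe variant works here.
	 */
	static unsigned int
	count_msls_from_callback(void)
	{
		unsigned int n = 0;

		rte_memseg_list_walk_thread_unsafe(count_msl, &n);
		return n;
	}

	/* Reached from normal driver context: no lock is held, so the
	 * _thread_unsafe variant could race with concurrent hotplug;
	 * the locking variant is the right one here.
	 */
	static unsigned int
	count_msls_from_driver(void)
	{
		unsigned int n = 0;

		rte_memseg_list_walk(count_msl, &n);
		return n;
	}

Since prepare_vhost_memory_kernel() can be reached from both
contexts, neither walk variant is correct in all cases.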

> >
> > -     if (rte_memseg_contig_walk(add_memory_region, &wa) < 0) {
> > -             free(vm);
> > -             return NULL;
> > +             start_addr = msl->base_va;
> > +             len = msl->page_sz * msl->memseg_arr.len;
> > +
> > +             if (start_addr == NULL || len == 0)
> > +                     continue;
> > +
> > +             if (region_nr >= max_regions) {
> > +                     free(vm);
> > +                     return NULL;
> > +             }
> > +
> > +             mr = &vm->regions[region_nr++];
> > +             mr->guest_phys_addr = (uint64_t)(uintptr_t)start_addr;
> > +             mr->userspace_addr = (uint64_t)(uintptr_t)start_addr;
> > +             mr->memory_size = len;
> > +             mr->mmap_offset = 0; /* flags_padding */
> > +
> > +             PMD_DRV_LOG(DEBUG, "index=%u, addr=%p len=%" PRIu64,
> > +                             i, start_addr, len);
> >       }
> >
> > -     vm->nregions = wa.region_nr;
> > +     vm->nregions = region_nr;
> >       vm->padding = 0;
> >       return vm;
> >   }
> >
>
>
> --
> Thanks,
> Anatoly