From: Mohammed Gamal
To: Stephen Hemminger
Cc: dev@dpdk.org, maxime coquelin, Yuhui Jiang, Wei Shi
Date: Sat, 08 Dec 2018 10:10:19 +0200
Subject: Re: [dpdk-dev] Problems running netvsc multiq
Message-ID: <1544256619.5629.8.camel@redhat.com>
In-Reply-To: <20181207111841.29450b51@xeon-e3>
Organization: Red Hat
Reply-To: mgamal@redhat.com

On Fri, 2018-12-07 at 11:18 -0800, Stephen Hemminger wrote:
> On Fri, 07 Dec 2018 13:15:43 +0200
> Mohammed Gamal wrote:
>
> > On Wed, 2018-12-05 at 14:32 -0800, Stephen Hemminger wrote:
> > > The problem is a regression in 4.20 kernel. Bisecting now.
> >
> > I was bisecting the kernel and the change that seems to introduce
> > this regression is this one:
> >
> > commit ae6935ed7d424ffa74d634da00767e7b03c98fd3
> > Author: Stephen Hemminger
> > Date:   Fri Sep 14 09:10:17 2018 -0700
> >
> >     vmbus: split ring buffer allocation from open
> >
> >     The UIO driver needs the ring buffer to be persistent(reused)
> >     across open/close. Split the allocation and setup of ring buffer
> >     out of vmbus_open. For normal usage vmbus_open/vmbus_close there
> >     are no changes; only impacts uio_hv_generic which needs to keep
> >     ring buffer memory and reuse when application restarts.
> >
> >     Signed-off-by: Stephen Hemminger
> >     Signed-off-by: Greg Kroah-Hartman
> >
>
> Patch posted:
>
> From stephen@networkplumber.org Fri Dec  7 10:58:47 2018
> From: Stephen Hemminger
> Subject: [PATCH] vmbus: fix subchannel removal
>
> The changes to split ring allocation from open/close, broke
> the cleanup of subchannels. This resulted in problems using
> uio on network devices because the subchannel was left behind
> when the network device was unbound.
>
> The cause was in the disconnect logic which used list splice
> to move the subchannel list into a local variable. This won't
> work because the subchannel list is needed later during the
> process of the rescind messages (relid2channel).
>
> The fix is to just leave the subchannel list in place
> which is what the original code did. The list is cleaned
> up later when the host rescind is processed.
>
> Fixes: ae6935ed7d42 ("vmbus: split ring buffer allocation from open")
> Signed-off-by: Stephen Hemminger
> ---
>  drivers/hv/channel.c | 10 +---------
>  1 file changed, 1 insertion(+), 9 deletions(-)
>
> diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
> index fe00b12e4417..bea4c9850247 100644
> --- a/drivers/hv/channel.c
> +++ b/drivers/hv/channel.c
> @@ -701,20 +701,12 @@ static int vmbus_close_internal(struct vmbus_channel *channel)
>  int vmbus_disconnect_ring(struct vmbus_channel *channel)
>  {
>  	struct vmbus_channel *cur_channel, *tmp;
> -	unsigned long flags;
> -	LIST_HEAD(list);
>  	int ret;
>  
>  	if (channel->primary_channel != NULL)
>  		return -EINVAL;
>  
> -	/* Snapshot the list of subchannels */
> -	spin_lock_irqsave(&channel->lock, flags);
> -	list_splice_init(&channel->sc_list, &list);
> -	channel->num_sc = 0;
> -	spin_unlock_irqrestore(&channel->lock, flags);
> -
> -	list_for_each_entry_safe(cur_channel, tmp, &list, sc_list) {
> +	list_for_each_entry_safe(cur_channel, tmp, &channel->sc_list, sc_list) {
>  		if (cur_channel->rescind)
>  			wait_for_completion(&cur_channel->rescind_event);
>

Hi Stephen,

This does indeed work for the first run.
On any subsequent run, however, I get this:

testpmd: create a new mbuf pool : n=155456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
Configuring Port 0 (socket 0)
hn_dev_configure():  >>
hn_rndis_link_status(): link status 0x40020006
hn_subchan_configure(): open 1 subchannels
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
[...]
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
vmbus_uio_get_subchan(): ring mmap not found (yet) for: 19
^C
Signal 2 received, preparing to exit...
LATENCY_STATS: failed to remove Rx callback for pid=0, qid=0
LATENCY_STATS: failed to remove Rx callback for pid=0, qid=1
LATENCY_STATS: failed to remove Tx callback for pid=0, qid=0
LATENCY_STATS: failed to remove Tx callback for pid=0, qid=1

Shutting down port 0...
Stopping ports...
Done
Closing ports...
Port 0 is now not stopped
Done

Bye...

Do you see that on your end as well?
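
For anyone following the thread, the failure mode described in the quoted patch (splicing the subchannel list into a local variable so that a later lookup through the primary channel no longer finds the subchannel) can be sketched in plain userspace C. This is only an illustration under stated assumptions: the demo_* helpers are simplified stand-ins written for this note, not the kernel's list API, struct demo_subchan and demo_find() are hypothetical, and the id 7 is arbitrary. It does not attempt to reproduce the second-run problem reported above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for a circular doubly linked list head (not the kernel's). */
struct demo_list { struct demo_list *next, *prev; };

static void demo_list_init(struct demo_list *h) { h->next = h; h->prev = h; }

static void demo_list_add_tail(struct demo_list *n, struct demo_list *h)
{
    n->prev = h->prev;
    n->next = h;
    h->prev->next = n;
    h->prev = n;
}

/* Move all entries of 'from' to the front of 'to' and leave 'from' empty. */
static void demo_list_splice_init(struct demo_list *from, struct demo_list *to)
{
    if (from->next == from)
        return;                      /* nothing to move */
    struct demo_list *first = from->next, *last = from->prev;
    last->next = to->next;
    to->next->prev = last;
    first->prev = to;
    to->next = first;
    demo_list_init(from);
}

/* Hypothetical subchannel: just an id plus its list linkage. */
struct demo_subchan { unsigned int relid; struct demo_list node; };

/* Hypothetical lookup that walks the primary channel's subchannel list. */
static struct demo_subchan *demo_find(struct demo_list *sc_list, unsigned int relid)
{
    for (struct demo_list *p = sc_list->next; p != sc_list; p = p->next) {
        struct demo_subchan *sc = (struct demo_subchan *)
            ((char *)p - offsetof(struct demo_subchan, node));
        if (sc->relid == relid)
            return sc;
    }
    return NULL;
}

/* Returns 1 if the subchannel is still reachable from sc_list after "disconnect". */
static int found_after_disconnect(int splice_to_local)
{
    struct demo_list sc_list, local;
    struct demo_subchan sc = { .relid = 7 };   /* arbitrary id */

    demo_list_init(&sc_list);
    demo_list_init(&local);
    demo_list_add_tail(&sc.node, &sc_list);
    assert(demo_find(&sc_list, 7) != NULL);    /* visible before teardown */

    if (splice_to_local)
        demo_list_splice_init(&sc_list, &local);  /* the "snapshot" approach */

    return demo_find(&sc_list, 7) != NULL;
}
```

With the splice, found_after_disconnect(1) returns 0 because the primary's list has been emptied; leaving the list in place, found_after_disconnect(0) returns 1, which matches the reasoning given in the patch description.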