From: Min Tang
Date: Thu, 27 Feb 2020 13:24:49 -0500
To: Stephen Hemminger
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] net/netvsc: subchannel configuration failed due to unexpected NVS response
References: <20200227094704.3ab4aa25@hermes.lan>
In-Reply-To: <20200227094704.3ab4aa25@hermes.lan>

That quick fix was only meant to verify my guess. I agree that it needs a
more comprehensive fix.

Yes, the race condition is another issue here. In addition to that, I think
the function that sends the NVS_TYPE_RNDIS message needs to drain the
response message. I looked at the netvsc driver in the Linux kernel: it
receives all the VMBus messages asynchronously in another thread. That's
probably something we can think about for the DPDK driver.

On Thu, Feb 27, 2020 at 12:47 PM Stephen Hemminger
<stephen@networkplumber.org> wrote:

> On Thu, 27 Feb 2020 11:16:01 -0500
> Min Tang wrote:
>
> > Hi Stephen:
> >
> > I saw the following error messages when using DPDK 18.11.2 in Azure:
> >
> > hn_nvs_execute(): unexpected NVS resp 0x6b, expect 0x85
> > hn_dev_configure(): subchannel configuration failed
> >
> > It was not easy to reproduce it and it only occurred with multiple queues
> > enabled. In hn_nvs_execute it expects the response to match the request. In
> > the failed case, it was expecting NVS_TYPE_SUBCH_REQ (133 or 0x85) but
> > got NVS_TYPE_RNDIS (107 or 0x6b). Obviously somewhere the NVS_TYPE_RNDIS
> > message had been sent before the NVS_TYPE_SUBCH_REQ message was sent.
> > I looked at the code and found that the NVS_TYPE_RNDIS message needs a
> > completion response but it does not receive the response message anywhere.
> > The fix would be receiving and discarding the wrong response message(s).
> >
> > I put the following patches and it has fixed the problem.
> >
> > --- a/drivers/net/netvsc/hn_nvs.c 2020-02-27 11:08:29.755530969 -0500
> > +++ b/drivers/net/netvsc/hn_nvs.c 2020-02-27 11:07:21.567371798 -0500
> > @@ -92,7 +92,7 @@
> >      if (hdr->type != type) {
> >          PMD_DRV_LOG(ERR, "unexpected NVS resp %#x, expect %#x",
> >                      hdr->type, type);
> > -        goto retry;
> > +        return -EINVAL;
> >      }
> >
> >      if (len < resplen) {
>
> Thanks for the analysis. Not sure if this is the right fix.
> Looks like the control channel needs additional locking.
> Having two outstanding requests at once is not going to work well.
>
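
To make the two ideas in this thread concrete (serializing requests on the
control channel, and draining stale responses until the expected type
arrives), here is a minimal standalone sketch. It is not the hn_nvs.c code:
struct ctrl_chan, chan_init, chan_post, chan_recv and chan_wait_resp are
hypothetical names, the "channel" is just an in-memory FIFO of response
types, and only the two type values come from the log above.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Response type values as reported in the error log in this thread. */
#define NVS_TYPE_RNDIS     0x6b
#define NVS_TYPE_SUBCH_REQ 0x85

/* Hypothetical stand-in for the control channel: a small FIFO of pending
 * response types plus a lock. These are not the real driver structures. */
#define CHAN_DEPTH 8

struct ctrl_chan {
	pthread_mutex_t lock;
	uint32_t pending[CHAN_DEPTH];
	size_t head, tail;
};

static void chan_init(struct ctrl_chan *ch)
{
	pthread_mutex_init(&ch->lock, NULL);
	ch->head = ch->tail = 0;
}

/* Simulates the host posting a response of the given type. */
static int chan_post(struct ctrl_chan *ch, uint32_t type)
{
	int ret = 0;

	pthread_mutex_lock(&ch->lock);
	if (ch->tail - ch->head == CHAN_DEPTH)
		ret = -ENOSPC;
	else
		ch->pending[ch->tail++ % CHAN_DEPTH] = type;
	pthread_mutex_unlock(&ch->lock);
	return ret;
}

/* Pops one pending response. Caller must hold ch->lock. */
static int chan_recv(struct ctrl_chan *ch, uint32_t *type)
{
	if (ch->head == ch->tail)
		return -EAGAIN;
	*type = ch->pending[ch->head++ % CHAN_DEPTH];
	return 0;
}

/* Waits for a response of type `expect`, discarding any other response
 * that is already queued. The lock keeps a second caller from consuming
 * responses while this one is draining. Returns the number of discarded
 * responses, or -EAGAIN if the queue ran empty before a match was found. */
static int chan_wait_resp(struct ctrl_chan *ch, uint32_t expect)
{
	uint32_t type;
	int dropped = 0;
	int ret;

	assert(ch != NULL);

	pthread_mutex_lock(&ch->lock);
	for (;;) {
		ret = chan_recv(ch, &type);
		if (ret < 0)
			break;          /* nothing left to read */
		if (type == expect) {
			ret = dropped;  /* found the matching response */
			break;
		}
		dropped++;              /* stale response, e.g. NVS_TYPE_RNDIS */
	}
	pthread_mutex_unlock(&ch->lock);
	return ret;
}
```

With an NVS_TYPE_RNDIS response queued ahead of an NVS_TYPE_SUBCH_REQ
response, chan_wait_resp(&ch, NVS_TYPE_SUBCH_REQ) skips the first and
returns once it sees the second. A real driver would poll with a timeout
instead of returning -EAGAIN on an empty channel, and would have to decide
whether silently dropping a response is acceptable at all.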