From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 29841A0C4D; Mon, 4 Oct 2021 18:19:07 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id DC16F41331; Mon, 4 Oct 2021 18:19:06 +0200 (CEST) Received: from mail-il1-f176.google.com (mail-il1-f176.google.com [209.85.166.176]) by mails.dpdk.org (Postfix) with ESMTP id 6E920412E6 for ; Mon, 4 Oct 2021 18:19:05 +0200 (CEST) Received: by mail-il1-f176.google.com with SMTP id r9so18852946ile.5 for ; Mon, 04 Oct 2021 09:19:05 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20210112; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=mswiSURUE7gQxnSVboH7/Db4Q7DbWBN/ju1tnGOgkjk=; b=lVQEpkBZeHjxXXKCG6MjFnTtggFfJ4GSH2M6PQPguyhVNRJYMu+XR/XIC1yyIhQWSd 1YCZLWtNZYwsPDegFDlStDKgrlunuI//z/nBp9iw8X3kJtZ/5fk1AUUP0qMDJUAaQH0I fYpmwckRjkJ66fcUmJhD7v6Jt0G9g55U2waOSLQIlnOv6zdSuBBxrXjOGrKOMTvs8kDE BZj2Uzye/nCtntI0Ifg2AC2xCj4N4gIc5P+AfcDFuDv4TKensV9DTIHPBokQw7beaiqQ kUkQChCAoxvThUrtEUJJaojyQjZKtsKxEDuhtZpLg9884c4fZT1fpI5Oc+JbC8vT7dyy vNng== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=mswiSURUE7gQxnSVboH7/Db4Q7DbWBN/ju1tnGOgkjk=; b=m5n4tsX+iEz7M+JDLXgHphcT1lUOk04Urz+kL4DL6shRCgwyLeX6yVGdJdK09PyuRw JTb8F+vg2+DKTXfh2ztBKd3snSSbl/Mdxu/LbcHYOfK/xQaWmqA7kMGlBUHhWnXp665k DldrCvBWZSPi99u424ccftfuaZVg1nZzTmBOoyRuHk1oNc33gfrIF4jEFbGqDlKjeIR1 Ysrdwq8l0BVupIYF7HmqhEc9v6RV/E3/ZylFWh0a8QI1jORgv4KQPzga07kIB1rCtWPx pMV5PzAeOX/PYqpBgP131L8N+Tb8IssSLHcjwkei6+4rd7aeebwKL5+NBu+slqAsckhW yC3Q== X-Gm-Message-State: AOAM5308JdR2vSeebF6TJn9rpL+WcTqEflkuBmuT/apiRTEKEMUalhYM sea/GYv1O/CDNyEtTqFAFu07QAdakW5K++qyTeU= X-Google-Smtp-Source: ABdhPJz/9zMEc+66DCj+NJC1jzc1gXewsaVVrKdATA0zKzGZd/LuPLYhO/qCRY8aSN0SeNZJDncarsWX2MKacgacz4M= X-Received: by 2002:a05:6e02:154f:: with SMTP id j15mr10718157ilu.236.1633364344779; Mon, 04 Oct 2021 09:19:04 -0700 (PDT) MIME-Version: 1.0 References: <20210924105409.21711-1-eladv6@gmail.com> <3ae193df-292c-4907-df4a-88ce3d6735fc@intel.com> <1a17d552-8b81-04f9-7594-61e84ea7990f@intel.com> <8525082f-eb28-92db-11d3-ef4d24144be4@intel.com> In-Reply-To: <8525082f-eb28-92db-11d3-ef4d24144be4@intel.com> From: Elad Nachman Date: Mon, 4 Oct 2021 19:18:53 +0300 Message-ID: To: Ferruh Yigit Cc: Eric Christian , dev , Igor Ryzhov Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable X-Content-Filtered-By: Mailman/MimeDel 2.1.29 Subject: Re: [dpdk-dev] [PATCH v2] kni: Fix request overwritten X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Mon, Oct 4, 2021 at 7:05 PM Ferruh Yigit wrote: > On 10/4/2021 3:58 PM, Elad Nachman wrote: > > =D7=91=D7=AA=D7=90=D7=A8=D7=99=D7=9A =D7=99=D7=95=D7=9D =D7=91=D7=B3, 4= =D7=91=D7=90=D7=95=D7=A7=D7=B3 2021, 17:51, =D7=9E=D7=90=D7=AA Ferruh Yigi= t =E2=80=8F< > > ferruh.yigit@intel.com>: > > > >> On 10/4/2021 3:25 PM, Elad Nachman wrote: > >> > >> Can you please try to not top post, it will make impossible to follow > this > >> discussion later from the mail archives. > >> > >>> 1. Userspace will get an error > >> > >> So there is nothing special with returning '-EAGAIN', user will only > >> observe an > >> error. > >> Wasn't initial intention to use '-EAGAIN' to try request again? > >> > > To signal user-space to retry the operation. > > > > Not sure if it will reach to the end user. If user is calling "ifconfig > > down", it will just fail right, it won't recognize the error type. > > Unless this is common usage by the Linux network drivers, having this > usage in > KNI won't help much. I am for handling this in the kernel side if we can. > > If user calls ifconfig down it will not happen. It requires some multi-core race condition only Eric can recreate. > >> > >>> 2. Waiting with rtnl locked causes a deadlock; waiting with rtnl > unlocked > >>> for interface down command causes a crash because of a race condition > in > >>> the device delete/unregister list in the kernel. > >>> > >> > >> Why waiting with rthnl lock causes a deadlock? As said below we are > already > >> doing it, why it is different with retry logic? > >> > > Because it can be interface down request. > > > > (sure you like short answers) > > Please help me to see why "interface down" is special. Isn't it point of > your > patch to wait the request execution in the userspace even it is an async > request? > > And yet again, number of retry can be limited. > > No, it is not. Please look again: https://patches.dpdk.org/project/dpdk/patch/20210924105409.21711-1-eladv6@g= mail.com/ > > > > >> I agree to not wait with rtnl unlocked. > >> > >>> FYI, > >>> > >>> Elad. > >>> > >>> =D7=91=D7=AA=D7=90=D7=A8=D7=99=D7=9A =D7=99=D7=95=D7=9D =D7=91=D7=B3,= 4 =D7=91=D7=90=D7=95=D7=A7=D7=B3 2021, 17:13, =D7=9E=D7=90=D7=AA Ferruh Yi= git =E2=80=8F< > >>> ferruh.yigit@intel.com>: > >>> > >>>> On 10/4/2021 2:09 PM, Elad Nachman wrote: > >>>>> Hi, > >>>>> > >>>>> EAGAIN is propogated back to the kernel and to the caller. > >>>>> > >>>> > >>>> So will the user get an error, or it will be handled by the kernel a= nd > >>>> retried? > >>>> > >>>>> We cannot retry from the kni kernel module since we hold the rtnl > lock. > >>>>> > >>>> > >>>> Why not? We are already waiting until a command time out, like > >>>> 'kni_net_open()' > >>>> can retry if 'kni_net_process_request()' returns '-EAGAIN'. And we c= an > >>>> limit the > >>>> number of retry for safety. > >>>> > >>>>> FYI, > >>>>> > >>>>> Elad > >>>>> > >>>>> =D7=91=D7=AA=D7=90=D7=A8=D7=99=D7=9A =D7=99=D7=95=D7=9D =D7=91=D7= =B3, 4 =D7=91=D7=90=D7=95=D7=A7=D7=B3 2021, 16:05, =D7=9E=D7=90=D7=AA Ferru= h Yigit =E2=80=8F< > >>>>> ferruh.yigit@intel.com>: > >>>>> > >>>>>> On 9/24/2021 11:54 AM, Elad Nachman wrote: > >>>>>>> Fix lack of multiple KNI requests handling support by introducing= a > >>>>>>> request in progress flag which will fail additional requests with > >>>>>>> EAGAIN return code if the original request has not been processed > >>>>>>> by user-space. > >>>>>>> > >>>>>>> Bugzilla ID: 809 > >>>>>> > >>>>>> Hi Eric, > >>>>>> > >>>>>> Can you please test this patch, if it solves the issue you reporte= d? > >>>>>> > >>>>>>> > >>>>>>> Signed-off-by: Elad Nachman > >>>>>>> --- > >>>>>>> kernel/linux/kni/kni_net.c | 9 +++++++++ > >>>>>>> lib/kni/rte_kni.c | 2 ++ > >>>>>>> lib/kni/rte_kni_common.h | 1 + > >>>>>>> 3 files changed, 12 insertions(+) > >>>>>>> > >>>>>> > >>>>>> <...> > >>>>>> > >>>>>>> @@ -123,7 +124,15 @@ kni_net_process_request(struct net_device > *dev, > >>>>>> struct rte_kni_request *req) > >>>>>>> > >>>>>>> mutex_lock(&kni->sync_lock); > >>>>>>> > >>>>>>> + /* Check that existing request has been processed: */ > >>>>>>> + cur_req =3D (struct rte_kni_request *)kni->sync_kva; > >>>>>>> + if (cur_req->req_in_progress) { > >>>>>>> + ret =3D -EAGAIN; > >>>>>> > >>>>>> Overall logic in the KNI looks good to me, this helps to serialize > the > >>>>>> requests > >>>>>> even for async ones. > >>>>>> > >>>>>> But can you please clarify how it behaves in the kernel side with > >>>> '-EAGAIN' > >>>>>> return type? Will linux call the ndo again, or will it just fail. > >>>>>> > >>>>>> If it just fails should we handle the re-try on '-EAGAIN' within t= he > >> kni > >>>>>> module? > >>>>>> > >>>>>> > >>>> > >>>> > >> > >> Elad. > >