From: Elad Nachman
Date: Mon, 4 Oct 2021 17:58:45 +0300
To: Ferruh Yigit
Cc: Eric Christian, dev, Igor Ryzhov
Subject: Re: [dpdk-dev] [PATCH v2] kni: Fix request overwritten

On Mon, 4 Oct 2021, 17:51, Ferruh Yigit <ferruh.yigit@intel.com> wrote:

> On 10/4/2021 3:25 PM, Elad Nachman wrote:
>
> Can you please try not to top-post? It makes it impossible to follow this
> discussion later from the mail archives.
>
> > 1. Userspace will get an error
>
> So there is nothing special about returning '-EAGAIN'; the user will only
> observe an error.
> Wasn't the initial intention to use '-EAGAIN' to try the request again?

To signal user-space to retry the operation.

> > 2. Waiting with rtnl locked causes a deadlock; waiting with rtnl unlocked
> > for interface down command causes a crash because of a race condition in
> > the device delete/unregister list in the kernel.
>
> Why does waiting with the rtnl lock cause a deadlock? As said below, we are
> already doing it; why is it different with retry logic?

Because it can be an interface down request.
> I agree to not wait with rtnl unlocked.
>
> > FYI,
> >
> > Elad.
> >
> > On Mon, 4 Oct 2021, 17:13, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >
> >> On 10/4/2021 2:09 PM, Elad Nachman wrote:
> >>> Hi,
> >>>
> >>> EAGAIN is propagated back to the kernel and to the caller.
> >>>
> >>
> >> So will the user get an error, or will it be handled by the kernel and
> >> retried?
> >>
> >>> We cannot retry from the kni kernel module since we hold the rtnl lock.
> >>>
> >>
> >> Why not? We are already waiting until a command times out;
> >> 'kni_net_open()', for example, can retry if 'kni_net_process_request()'
> >> returns '-EAGAIN'. And we can limit the number of retries for safety.
> >>
> >>> FYI,
> >>>
> >>> Elad
> >>>
> >>> On Mon, 4 Oct 2021, 16:05, Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> >>>
> >>>> On 9/24/2021 11:54 AM, Elad Nachman wrote:
> >>>>> Fix lack of multiple KNI requests handling support by introducing a
> >>>>> request in progress flag which will fail additional requests with
> >>>>> EAGAIN return code if the original request has not been processed
> >>>>> by user-space.
> >>>>>
> >>>>> Bugzilla ID: 809
> >>>>
> >>>> Hi Eric,
> >>>>
> >>>> Can you please test whether this patch solves the issue you reported?
> >>>>
> >>>>>
> >>>>> Signed-off-by: Elad Nachman
> >>>>> ---
> >>>>>  kernel/linux/kni/kni_net.c | 9 +++++++++
> >>>>>  lib/kni/rte_kni.c          | 2 ++
> >>>>>  lib/kni/rte_kni_common.h   | 1 +
> >>>>>  3 files changed, 12 insertions(+)
> >>>>>
> >>>>
> >>>> <...>
> >>>>
> >>>>> @@ -123,7 +124,15 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
> >>>>>
> >>>>>       mutex_lock(&kni->sync_lock);
> >>>>>
> >>>>> +     /* Check that existing request has been processed: */
> >>>>> +     cur_req = (struct rte_kni_request *)kni->sync_kva;
> >>>>> +     if (cur_req->req_in_progress) {
> >>>>> +             ret = -EAGAIN;
> >>>>
> >>>> Overall logic in the KNI looks good to me; this helps to serialize the
> >>>> requests even for async ones.
> >>>>
> >>>> But can you please clarify how it behaves on the kernel side with the
> >>>> '-EAGAIN' return value? Will Linux call the ndo again, or will it just
> >>>> fail?
> >>>>
> >>>> If it just fails, should we handle the retry on '-EAGAIN' within the kni
> >>>> module?
> >>>>
> >>>>
> >>
> >>
> >
Elad.
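
For readers following the thread, the behaviour under discussion can be sketched as a small user-space model: a request slot refuses new requests with -EAGAIN while an earlier one is still pending, and the caller (not the code holding the lock) retries a bounded number of times. This is only an illustration under stated assumptions, not the patch itself. Only the flag name req_in_progress and the -EAGAIN return come from the quoted diff; the demo_* types and functions are hypothetical, and the real code runs in the kernel under kni->sync_lock with the rtnl lock held.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct rte_kni_request; only the field name
 * req_in_progress is taken from the patch quoted in this thread. */
struct demo_request {
    unsigned int req_id;
    bool req_in_progress;
};

/* Hypothetical model of the single shared request slot. */
struct demo_slot {
    struct demo_request cur;
};

/* Post a request. While the previous request is still pending, refuse
 * with -EAGAIN instead of overwriting it. */
static int demo_post_request(struct demo_slot *slot, unsigned int req_id)
{
    assert(slot != NULL);
    if (slot->cur.req_in_progress)
        return -EAGAIN;
    slot->cur.req_id = req_id;
    slot->cur.req_in_progress = true;
    return 0;
}

/* Model of the user-space handler finishing the pending request. */
static void demo_complete_request(struct demo_slot *slot)
{
    assert(slot != NULL);
    slot->cur.req_in_progress = false;
}

/* Caller-side retry with an upper bound on attempts. The optional
 * 'between' callback stands in for whatever happens between attempts
 * (here: the handler completing the earlier request). */
static int demo_post_with_retry(struct demo_slot *slot, unsigned int req_id,
                                int max_tries,
                                void (*between)(struct demo_slot *))
{
    int ret = -EAGAIN;

    for (int i = 0; i < max_tries && ret == -EAGAIN; i++) {
        ret = demo_post_request(slot, req_id);
        if (ret == -EAGAIN && between != NULL)
            between(slot);
    }
    return ret;
}
```

In this model a second request posted before the first completes gets -EAGAIN and leaves the first request untouched; retrying succeeds only once the pending request has been completed, which is why the retry has to happen somewhere that does not block the completion path.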