From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id D7868A034F; Mon, 22 Feb 2021 08:52:50 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 6D08422A25F; Mon, 22 Feb 2021 08:52:50 +0100 (CET) Received: from mail-il1-f181.google.com (mail-il1-f181.google.com [209.85.166.181]) by mails.dpdk.org (Postfix) with ESMTP id 1A19C40696 for ; Sun, 21 Feb 2021 09:03:42 +0100 (CET) Received: by mail-il1-f181.google.com with SMTP id o1so1323725ila.11 for ; Sun, 21 Feb 2021 00:03:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=nfo4ujCwKagfJUmHDXnxyOhUnGawb6Iw0Sh9o65cQOw=; b=l6//3zv80AzBZDdWHUJBDPccpZHpYxgzfXZFbYH0L/oX15WaKW/NYJrIlcYDhkFKiQ BLh98IIfexecKOpxeDS5fGa9dRawKrNu+1KXLio2M1t0pdJl8Zpv7cu4qfubDEt87yrD abw5Hoo76P+smTstHZNn68LE87qx9jwb25uAUHkAtkcvMVmw0H5I07NYkU6NkDfOrXE9 mjRlBzw/vxuf5obpzj64mo2jqOeQon6LIJ+M3HZqzV0TjxmkLJ+HPADE8pu3Mr9vYXwI 0zHCrDkY19t487c0DY50zvlVRLWBoRmgFDewnr4uQXfueVy1L9bwDuMkPqJdUOz42vKr U9jQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=nfo4ujCwKagfJUmHDXnxyOhUnGawb6Iw0Sh9o65cQOw=; b=JmKxSBVcFt20jEsNlkkkdHEd61mMKMAoDc+ufU2zlkNdpW4uaswqgCnIfv7y6Zy0ou MNHNz5B51UG9chwdlYdCurGLpehzbYZIz4Ou1DqYs6DvxngfChmrEzWvBJOBiYmRBGIl sBokrgRWUbxex+mv3OtLz2cMc9jCcfQ1Exb9BMPkqTnx/jfwbmQAShIqP6HHL82iSx0P kTEYeg8JEqewMnW1MXKnVd2VpY2qu2XH2SG+b8CC9wlxotJpDmWJf4q5nlxyyc/Knysy YuVcyPqi4+epFym8tYdn2alpD5TvqnEzIyS58hQ0bk6D3kbXeT4Cds3wNVJSbcJ/aFlB w3jQ== X-Gm-Message-State: AOAM533a79zalEqw8PJ8DROVGTK1qX3qCSBp0UG4Nd4ltBrfz93DI4Aq +TSb6WjtmQcH1LmIRz1WzJ6ZDc7a8z6f6Gy2fc0= X-Google-Smtp-Source: ABdhPJwcHOdXSwzuBAmpEvDPRTJGZmZ5ob1Uflh2hNaEy+wCJsH2N5muQjzIIqshBBonjoZ03B/0ejqFfHkar76o7tw= X-Received: by 2002:a92:d850:: with SMTP id h16mr9835075ilq.38.1613894621329; Sun, 21 Feb 2021 00:03:41 -0800 (PST) MIME-Version: 1.0 References: <20201126144613.4986-1-eladv6@gmail.com> <9f7284a8-4158-c246-c329-bfec27f543a5@intel.com> In-Reply-To: <9f7284a8-4158-c246-c329-bfec27f543a5@intel.com> From: Elad Nachman Date: Sun, 21 Feb 2021 10:03:30 +0200 Message-ID: To: Ferruh Yigit Cc: dev@dpdk.org, Igor Ryzhov , Stephen Hemminger Content-Type: text/plain; charset="UTF-8" X-Mailman-Approved-At: Mon, 22 Feb 2021 08:52:49 +0100 Subject: Re: [dpdk-dev] [PATCH] kni: fix rtnl deadlocks and race conditions X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi, Regarding the asynchronous call - thought about it, but then the request will always return OK to user-space and I will have no way to return failure error codes back to user-space. If the above explanation is acceptable, per your other comments - I can send a new patch without the parameter change , without the empty line, and with the comment moved to the proper place in the code. Waiting for your decision, Elad. On Fri, Feb 19, 2021 at 8:42 PM Ferruh Yigit wrote: > > On 11/26/2020 2:46 PM, Elad Nachman wrote: > > This patch leverages on Stephen Hemminger's 64106 patch from Dec 2019, > > and fixes the issues reported by Ferruh and Igor: > > > > A. KNI sync lock is being locked while rtnl is held. > > If two threads are calling kni_net_process_request() , > > then the first one wil take the sync lock, release rtnl lock then sleep. > > The second thread will try to lock sync lock while holding rtnl. > > The first thread will wake, and try to lock rtnl, resulting in a deadlock. > > The remedy is to release rtnl before locking the KNI sync lock. > > Since in between nothing is accessing Linux network-wise, > > no rtnl locking is needed. > > Hi Elad, > > Thanks for explanation, that clarifies the issue. > Also I confirm I don't see the hang, at least as much as I test. > > > > > B. There is a race condition in __dev_close_many() processing the > > close_list while the application terminates. > > It looks like if two vEth devices are terminating, > > and one releases the rtnl lock, the other takes it, > > updating the close_list in an unstable state, > > causing the close_list to become a circular linked list, > > hence list_for_each_entry() will endlessly loop inside > > __dev_close_many() . > > Since the description for the original patch indicate the > > original motivation was bringing the device up, > > I have changed kni_net_process_request() to hold the rtnl mutex > > in case of bringing the device down since this is the path called > > from __dev_close_many() , causing the corruption of the close_list. > > > > I can't reproduce this case, I see the protection in the code, but better to get > confirmation from Igor. > > > > Overall the issue seems calling a function pointed by 'rte_kni_ops' which > requires to acquire the rtnl lock. > So I wonder if this can't be handled in the ops function, by processing the > request asynchronously, > like recording the request, return from 'rte_kni_ops', and process the request > afterwards? > > I assume the application we mention is not kni sample application. > > > > > > > Signed-off-by: Elad Nachman > > --- > > kernel/linux/kni/kni_net.c | 47 +++++++++++++++++++++++++------------- > > 1 file changed, 31 insertions(+), 16 deletions(-) > > > > diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c > > index 4b752083d..cf5b0845d 100644 > > --- a/kernel/linux/kni/kni_net.c > > +++ b/kernel/linux/kni/kni_net.c > > @@ -17,6 +17,7 @@ > > #include > > #include > > #include > > +#include > > > > #include > > #include > > @@ -102,18 +103,26 @@ get_data_kva(struct kni_dev *kni, void *pkt_kva) > > * It can be called to process the request. > > */ > > static int > > -kni_net_process_request(struct kni_dev *kni, struct rte_kni_request *req) > > +kni_net_process_request(struct net_device *dev, struct rte_kni_request *req) > > { > > + struct kni_dev *kni = netdev_priv(dev); > > int ret = -1; > > void *resp_va; > > uint32_t num; > > int ret_val; > > + int req_is_dev_stop = 0; > > > > - if (!kni || !req) { > > - pr_err("No kni instance or request\n"); > > - return -EINVAL; > > - } > > + if (req->req_id == RTE_KNI_REQ_CFG_NETWORK_IF && > > + req->if_up == 0) > > + req_is_dev_stop = 1; > > > > + ASSERT_RTNL(); > > + > > + if (!req_is_dev_stop) { > > + dev_hold(dev); > > + rtnl_unlock(); > > + } > > + > > mutex_lock(&kni->sync_lock); > > > > /* Construct data */ > > @@ -125,8 +134,13 @@ kni_net_process_request(struct kni_dev *kni, struct rte_kni_request *req) > > goto fail; > > } > > > > + /* Since we need to wait and RTNL mutex is held > > + * drop the mutex and hold refernce to keep device > > + */ > > + > > Comment seems left here, need to go up. s/refernce/reference > > > ret_val = wait_event_interruptible_timeout(kni->wq, > > kni_fifo_count(kni->resp_q), 3 * HZ); > > + > > if (signal_pending(current) || ret_val <= 0) { > > ret = -ETIME; > > goto fail; > > @@ -144,6 +158,13 @@ kni_net_process_request(struct kni_dev *kni, struct rte_kni_request *req) > > > > fail: > > mutex_unlock(&kni->sync_lock); > > + > > + > > extra empty line > > > + if (!req_is_dev_stop) { > > + rtnl_lock(); > > + dev_put(dev); > > + } > > + > > return ret; > > } > > > > @@ -155,7 +176,6 @@ kni_net_open(struct net_device *dev) > > { > > int ret; > > struct rte_kni_request req; > > - struct kni_dev *kni = netdev_priv(dev); > > > > netif_start_queue(dev); > > if (kni_dflt_carrier == 1) > > @@ -168,7 +188,7 @@ kni_net_open(struct net_device *dev) > > > > /* Setting if_up to non-zero means up */ > > req.if_up = 1; > > - ret = kni_net_process_request(kni, &req); > > + ret = kni_net_process_request(dev, &req); > > > > Althoug it is not soo confusing, these lines and following ones are noise for > this patch, they are just for 'kni_net_process_request' paramter change. > > What do you think do the 'kni_net_process_request' parameter change in first > patch, and fix the issue in second, this way second patch can contain only the > actual changes required for fix.