From: Igor Ryzhov
Date: Fri, 26 Feb 2021 00:01:01 +0300
To: Elad Nachman
Cc: Ferruh Yigit, Stephen Hemminger, dev
Subject: Re: [dpdk-dev] [PATCH 2/2] kni: fix rtnl deadlocks and race conditions v4
References: <20201126144613.4986-1-eladv6@gmail.com> <20210225143239.14220-1-eladv6@gmail.com> <20210225143239.14220-2-eladv6@gmail.com>
In-Reply-To: <20210225143239.14220-2-eladv6@gmail.com>
List-Id: DPDK patches and discussions

Hi Elad,

Thanks for the patch, but this is still NACK from me.

The only real advantage of KNI over other exceptional-path techniques
like virtio-user is the ability to configure DPDK-managed interfaces directly
from the kernel using well-known utils like iproute2. A very important part
of this is getting responses from the DPDK app and knowing the actual result
of command execution.
If you're making async requests to the application and you don't know
the result, then what's the point of using KNI at all?

Igor

On Thu, Feb 25, 2021 at 5:32 PM Elad Nachman wrote:

> This part of the series includes my fixes for the issues reported
> by Ferruh and Igor (and Igor comments for v3 of the patch)
> on top of part 1 of the patch series:
>
> A. KNI sync lock is being locked while rtnl is held.
> If two threads are calling kni_net_process_request(),
> then the first one will take the sync lock, release rtnl lock then sleep.
> The second thread will try to lock sync lock while holding rtnl.
> The first thread will wake, and try to lock rtnl, resulting in a deadlock.
> The remedy is to release rtnl before locking the KNI sync lock.
> Since in between nothing is accessing Linux network-wise,
> no rtnl locking is needed.
>
> B. There is a race condition in __dev_close_many() processing the
> close_list while the application terminates.
> It looks like if two vEth devices are terminating,
> and one releases the rtnl lock, the other takes it,
> updating the close_list in an unstable state,
> causing the close_list to become a circular linked list,
> hence list_for_each_entry() will endlessly loop inside
> __dev_close_many().
> Since the description for the original patch indicate the
> original motivation was bringing the device up,
> I have changed kni_net_process_request() to hold the rtnl mutex
> in case of bringing the device down since this is the path called
> from __dev_close_many(), causing the corruption of the close_list.
> In order to prevent deadlock in Mellanox device in this case, the
> code has been modified not to wait for user-space while holding
> the rtnl lock.
> Instead, after the request has been sent, all locks are relinquished
> and the function exits immediately with return code of zero (success).
>
> To summarize:
> request != interface down : unlock rtnl, send request to user-space,
> wait for response, send the response error code to caller in user-space.
>
> request == interface down: send request to user-space, return immediately
> with error code of 0 (success) to user-space.
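The two paths in the summary quoted above can be modelled in plain user-space C to show what the caller ends up seeing. This is only a sketch under stated assumptions, not kernel code: `model_request`, `model_app_handler`, `model_app_fails`, `model_process_request` and the `MODEL_REQ_*` values are hypothetical names made up for illustration, and the DPDK application is reduced to a callback that returns an error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request IDs; only the naming echoes the patch. */
enum model_req_id {
	MODEL_REQ_CHANGE_MTU = 1,
	MODEL_REQ_CFG_NETWORK_IF = 2
};

/* Reduced, illustrative request; not the real struct rte_kni_request. */
struct model_request {
	uint32_t req_id;
	uint8_t if_up;
	int32_t skip_post_resp_to_q;
	int32_t result;
};

/* Hypothetical callback playing the role of the DPDK application. */
typedef int (*model_app_handler)(const struct model_request *req);

/* An application that always refuses the request. */
static int model_app_fails(const struct model_request *req)
{
	(void)req;
	return -EBUSY;
}

/* Returns the code the caller of the netdev operation would observe. */
static int model_process_request(struct model_request *req,
				 model_app_handler app)
{
	assert(req != NULL && app != NULL);

	if (req->req_id == MODEL_REQ_CFG_NETWORK_IF && req->if_up == 0) {
		/* Interface down: fire and forget, nobody waits. */
		req->skip_post_resp_to_q = 1;
		(void)app(req);		/* application result is discarded */
		req->result = 0;	/* success is reported unconditionally */
		return req->result;
	}

	/* Every other request: wait for and propagate the real result. */
	req->skip_post_resp_to_q = 0;
	req->result = app(req);
	return req->result;
}
```

With `model_app_fails` as the handler, an "interface up" request returns `-EBUSY`, while an "interface down" request returns 0 even though the application refused it. That difference is the loss of information the NACK is about.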
>
> Signed-off-by: Elad Nachman
>
>
> ---
> v4:
> * for if down case, send asynchronously with rtnl locked and without
> wait, returning immediately to avoid both kernel race conditions
> and deadlock in user-space
> v3:
> * Include original patch and new patch as a series of patch, added a
> comment to the new patch
> v2:
> * rebuild the patch as increment from patch 64106
> * fix comment and blank lines
> ---
> kernel/linux/kni/kni_net.c | 41 +++++++++++++++++++++++++++------
> lib/librte_kni/rte_kni.c | 7 ++++--
> lib/librte_kni/rte_kni_common.h | 1 +
> 3 files changed, 40 insertions(+), 9 deletions(-)
>
> diff --git a/kernel/linux/kni/kni_net.c b/kernel/linux/kni/kni_net.c
> index f0b6e9a8d..ba991802b 100644
> --- a/kernel/linux/kni/kni_net.c
> +++ b/kernel/linux/kni/kni_net.c
> @@ -110,12 +110,34 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
>  	void *resp_va;
>  	uint32_t num;
>  	int ret_val;
> +	int req_is_dev_stop = 0;
> +
> +	/* For configuring the interface to down,
> +	 * rtnl must be held all the way to prevent race condition
> +	 * inside __dev_close_many() between two netdev instances of KNI
> +	 */
> +	if (req->req_id == RTE_KNI_REQ_CFG_NETWORK_IF &&
> +			req->if_up == 0)
> +		req_is_dev_stop = 1;
>
>  	ASSERT_RTNL();
>
> +	/* Since we need to wait and RTNL mutex is held
> +	 * drop the mutex and hold reference to keep device
> +	 */
> +	if (!req_is_dev_stop) {
> +		dev_hold(dev);
> +		rtnl_unlock();
> +	}
> +
>  	mutex_lock(&kni->sync_lock);
>
> -	/* Construct data */
> +	/* Construct data, for dev stop send asynchronously
> +	 * so instruct user-space not to sent response as no
> +	 * one will be waiting for it.
> +	 */
> +	if (req_is_dev_stop)
> +		req->skip_post_resp_to_q = 1;
>  	memcpy(kni->sync_kva, req, sizeof(struct rte_kni_request));
>  	num = kni_fifo_put(kni->req_q, &kni->sync_va, 1);
>  	if (num < 1) {
> @@ -124,16 +146,16 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
>  		goto fail;
>  	}
>
> -	/* Since we need to wait and RTNL mutex is held
> -	 * drop the mutex and hold refernce to keep device
> +	/* No result available since request is handled
> +	 * asynchronously. set response to success.
>  	 */
> -	dev_hold(dev);
> -	rtnl_unlock();
> +	if (req_is_dev_stop) {
> +		req->result = 0;
> +		goto async;
> +	}
>
>  	ret_val = wait_event_interruptible_timeout(kni->wq,
>  			kni_fifo_count(kni->resp_q), 3 * HZ);
> -	rtnl_lock();
> -	dev_put(dev);
>
>  	if (signal_pending(current) || ret_val <= 0) {
>  		ret = -ETIME;
> @@ -148,10 +170,15 @@ kni_net_process_request(struct net_device *dev, struct rte_kni_request *req)
>  	}
>
>  	memcpy(req, kni->sync_kva, sizeof(struct rte_kni_request));
> +async:
>  	ret = 0;
>
>  fail:
>  	mutex_unlock(&kni->sync_lock);
> +	if (!req_is_dev_stop) {
> +		rtnl_lock();
> +		dev_put(dev);
> +	}
>  	return ret;
>  }
>
> diff --git a/lib/librte_kni/rte_kni.c b/lib/librte_kni/rte_kni.c
> index 837d0217d..6d777266d 100644
> --- a/lib/librte_kni/rte_kni.c
> +++ b/lib/librte_kni/rte_kni.c
> @@ -591,8 +591,11 @@ rte_kni_handle_request(struct rte_kni *kni)
>  		break;
>  	}
>
> -	/* Construct response mbuf and put it back to resp_q */
> -	ret = kni_fifo_put(kni->resp_q, (void **)&req, 1);
> +	/* if needed, construct response mbuf and put it back to resp_q */
> +	if (!req->skip_post_resp_to_q)
> +		ret = kni_fifo_put(kni->resp_q, (void **)&req, 1);
> +	else
> +		ret = 1;
>  	if (ret != 1) {
>  		RTE_LOG(ERR, KNI, "Fail to put the muf back to resp_q\n");
>  		return -1; /* It is an error of can't putting the mbuf back */
> diff --git a/lib/librte_kni/rte_kni_common.h b/lib/librte_kni/rte_kni_common.h
> index ffb318273..3b5d06850 100644
> --- a/lib/librte_kni/rte_kni_common.h
> +++ b/lib/librte_kni/rte_kni_common.h
> @@ -48,6 +48,7 @@ struct rte_kni_request {
>  		uint8_t promiscusity;/**< 1: promisc mode enable, 0: disable */
>  		uint8_t allmulti; /**< 1: all-multicast mode enable, 0: disable */
>  	};
> +	int32_t skip_post_resp_to_q; /**< 1: skip queue response 0: disable */
>  	int32_t result; /**< Result for processing request */
>  } __attribute__((__packed__));
>
> --
> 2.17.1
>
>
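The user-space half of the quoted change (the `rte_kni.c` hunk) can be sketched the same way. The model below is an assumption-laden illustration, not the library code: `model_resp_fifo`, `model_kni_request`, `model_fifo_put` and `model_handle_request` are hypothetical names, the FIFO is a fixed array rather than the real KNI ring, and the application's outcome is passed in as a plain integer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_FIFO_SLOTS 4

/* Hypothetical stand-in for the response queue shared with the kernel. */
struct model_resp_fifo {
	const void *slots[MODEL_FIFO_SLOTS];
	unsigned int count;
};

/* Reduced, illustrative request carrying only the two fields of interest. */
struct model_kni_request {
	int32_t skip_post_resp_to_q;
	int32_t result;
};

/* Enqueue one item; returns the number of items stored (0 or 1). */
static unsigned int model_fifo_put(struct model_resp_fifo *q, const void *item)
{
	if (q->count == MODEL_FIFO_SLOTS)
		return 0;
	q->slots[q->count++] = item;
	return 1;
}

/* Records the application's result and posts a response only if asked to. */
static int model_handle_request(struct model_resp_fifo *resp_q,
				struct model_kni_request *req,
				int32_t app_result)
{
	assert(resp_q != NULL && req != NULL);

	req->result = app_result;

	/* Asynchronous request: the result stays in user space. */
	if (req->skip_post_resp_to_q)
		return 0;

	/* Synchronous request: hand the result back through the queue. */
	if (model_fifo_put(resp_q, req) != 1)
		return -1;
	return 0;
}
```

In this model a failing result on an asynchronous request is written into the request but never queued, so nothing on the kernel side can read it.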