From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 79919A053A; Tue, 4 Aug 2020 01:51:55 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id C4955F64; Tue, 4 Aug 2020 01:51:54 +0200 (CEST) Received: from smtp-gw.pt.net (smtp-gw.pt.net [206.210.194.15]) by dpdk.org (Postfix) with ESMTP id 8194AF04 for ; Tue, 4 Aug 2020 01:51:52 +0200 (CEST) X-ASG-Debug-ID: 1596498710-09411a12e3122e80001-TfluYd Received: from mail.pt.net (mail.pt.net [206.210.194.11]) by smtp-gw.pt.net with ESMTP id vjKa8vpcD3DEEQqv (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Mon, 03 Aug 2020 18:51:51 -0500 (CDT) X-Barracuda-Envelope-From: lew@donzis.com X-Barracuda-Effective-Source-IP: mail.pt.net[206.210.194.11] X-Barracuda-Apparent-Source-IP: 206.210.194.11 Received: from localhost (localhost [IPv6:::1]) by mail.pt.net (Postfix) with ESMTP id E61A98427AB; Mon, 3 Aug 2020 18:51:50 -0500 (CDT) Received: from mail.pt.net ([IPv6:::1]) by localhost (mail.pt.net [IPv6:::1]) (amavisd-new, port 10032) with ESMTP id 9-xJegTMyKbh; Mon, 3 Aug 2020 18:51:50 -0500 (CDT) Received: from localhost (localhost [IPv6:::1]) by mail.pt.net (Postfix) with ESMTP id 5288E8427AC; Mon, 3 Aug 2020 18:51:50 -0500 (CDT) X-Virus-Scanned: amavisd-new at pt.net Received: from mail.pt.net ([IPv6:::1]) by localhost (mail.pt.net [IPv6:::1]) (amavisd-new, port 10026) with ESMTP id kWrY4cwtiUvD; Mon, 3 Aug 2020 18:51:50 -0500 (CDT) Received: from mail.pt.net (mail.pt.net [206.210.194.11]) by mail.pt.net (Postfix) with ESMTP id 25D968427AB; Mon, 3 Aug 2020 18:51:50 -0500 (CDT) Date: Mon, 3 Aug 2020 18:51:50 -0500 (CDT) From: Lewis Donzis To: Honnappa Nagarahalli Cc: dev , nd Message-ID: <302155956.2380944.1596498710016.JavaMail.zimbra@donzis.com> In-Reply-To: References: <1748858460.2301535.1596483785377.JavaMail.zimbra@donzis.com> MIME-Version: 1.0 X-ASG-Orig-Subj: Re: CPU hog & memory leak on FreeBSD X-Originating-IP: [206.210.194.11] X-Mailer: Zimbra 8.8.15_GA_3959 (ZimbraWebClient - GC84 (Mac)/8.8.15_GA_3953) Thread-Topic: CPU hog & memory leak on FreeBSD Thread-Index: AdZp7P9+CjpV23lCQQe6r8EkeLpQjHlKRB6r X-Barracuda-Connect: mail.pt.net[206.210.194.11] X-Barracuda-Start-Time: 1596498711 X-Barracuda-Encrypted: ECDHE-RSA-AES256-GCM-SHA384 X-Barracuda-URL: https://smtp-gw.pt.net:443/cgi-mod/mark.cgi X-Virus-Scanned: by bsmtpd at pt.net X-Barracuda-Scan-Msg-Size: 6810 X-Barracuda-BRTS-Status: 1 X-Barracuda-Spam-Score: 0.00 X-Barracuda-Spam-Status: No, SCORE=0.00 using global scores of TAG_LEVEL=1000.0 QUARANTINE_LEVEL=1000.0 KILL_LEVEL=9.0 tests= X-Barracuda-Spam-Report: Code version 3.2, rules version 3.2.3.83676 Rule breakdown below pts rule name description ---- ---------------------- -------------------------------------------------- Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit X-Content-Filtered-By: Mailman/MimeDel 2.1.15 Subject: Re: [dpdk-dev] CPU hog & memory leak on FreeBSD X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Sure, that would be great. This is from 18.11.9. Also, the entire modified file is attached. Thanks very much! lew 84,86c84,86 < struct rte_intr_callback *callback = NULL; < struct rte_intr_source *src = NULL; < int ret, add_event; --- > struct rte_intr_callback *callback; > struct rte_intr_source *src; > int ret, add_event = 0; 99,106c99 < /* allocate a new interrupt callback entity */ < callback = calloc(1, sizeof(*callback)); < if (callback == NULL) { < RTE_LOG(ERR, EAL, "Can not allocate memory\n"); < return -ENOMEM; < } < callback->cb_fn = cb; < callback->cb_arg = cb_arg; --- > rte_spinlock_lock(&intr_lock); 108,110c101 < rte_spinlock_lock(&intr_lock); < < /* check if there is at least one callback registered for the fd */ --- > /* find the source for this intr_handle */ 112,115c103,105 < if (src->intr_handle.fd == intr_handle->fd) { < /* we had no interrupts for this */ < if (TAILQ_EMPTY(&src->callbacks)) < add_event = 1; --- > if (src->intr_handle.fd == intr_handle->fd) > break; > } 117,121c107,121 < TAILQ_INSERT_TAIL(&(src->callbacks), callback, next); < ret = 0; < break; < } < } --- > /* If this is an alarm interrupt and it already has a callback, then we don't want to create a new callback > * because the only thing on the list should be eal_alarm_callback() and we may be called just to reset the timer. > */ > if (src != NULL && src->intr_handle.type == RTE_INTR_HANDLE_ALARM && !TAILQ_EMPTY(&src->callbacks)) { > callback = NULL; > } else { > /* allocate a new interrupt callback entity */ > callback = calloc(1, sizeof(*callback)); > if (callback == NULL) { > RTE_LOG(ERR, EAL, "Can not allocate memory\n"); > ret = -ENOMEM; > goto fail; > } > callback->cb_fn = cb; > callback->cb_arg = cb_arg; 123,134c123,137 < /* no existing callbacks for this - add new source */ < if (src == NULL) { < src = calloc(1, sizeof(*src)); < if (src == NULL) { < RTE_LOG(ERR, EAL, "Can not allocate memory\n"); < ret = -ENOMEM; < goto fail; < } else { < src->intr_handle = *intr_handle; < TAILQ_INIT(&src->callbacks); < TAILQ_INSERT_TAIL(&(src->callbacks), callback, next); < TAILQ_INSERT_TAIL(&intr_sources, src, next); --- > if (src == NULL) { > src = calloc(1, sizeof(*src)); > if (src == NULL) { > RTE_LOG(ERR, EAL, "Can not allocate memory\n"); > ret = -ENOMEM; > goto fail; > } else { > src->intr_handle = *intr_handle; > TAILQ_INIT(&src->callbacks); > TAILQ_INSERT_TAIL(&intr_sources, src, next); > } > } > > /* we had no interrupts for this */ > if (TAILQ_EMPTY(&src->callbacks)) 136,137c139,140 < ret = 0; < } --- > > TAILQ_INSERT_TAIL(&(src->callbacks), callback, next); 177c180 < return ret; --- > return 0; 181c184,185 < TAILQ_REMOVE(&(src->callbacks), callback, next); --- > if (callback != NULL) > TAILQ_REMOVE(&(src->callbacks), callback, next); ----- On Aug 3, 2020, at 6:22 PM, Honnappa Nagarahalli Honnappa.Nagarahalli@arm.com wrote: > Hello, > I can take a look if you can post the patch. > >> -----Original Message----- >> From: dev On Behalf Of Lewis Donzis >> Sent: Monday, August 3, 2020 2:43 PM >> To: dev@dpdk.org >> Subject: [dpdk-dev] CPU hog & memory leak on FreeBSD >> >> Hello. >> >> We've posted about this before, but I'm hoping to find someone willing to >> commit a patched version of lib/librte_eal/bsdapp/eal/eal_interrupts.c that >> corrects a memory leak and 100% CPU hog that is immediately noticeable >> with the i40e driver, at a minimum. We have modified this file to correct the >> problem, and would be happy to provide it back to whomever is a committer >> for this. > If you can send a patch, I can take a look. > >> >> The detailed explanation is: >> >> Currently, s etting an alarm with rte_eal_alarm_set() registers an alarm >> interrupt by calling rte_intr_callback_register(), which links the callback >> function (eal_alarm_callback) onto a list for that source and sets up a one- >> shot timer via kevent. Setting additional alarms links them on to the >> alarm_list, but also calls rte_eal_alarm_set() again, which links the callback >> function onto the source callback list again. >> >> When the alarm triggers and eal_alarm_callback() gets called, it goes down >> the list of pending alarms, calling as many callback functions as possible and >> removing each one from the list until it reaches one which has not yet expired. >> Once it's done, if alarm_list is not empty, it calls >> rte_intr_callback_register(), >> which then links yet another callback onto the interrupt source's list, thus >> creating an infinite loop. >> >> The problem is that the source callback list grows infinitely under this >> condition (essentially, when more than one alarm is queued). However, the >> call is necessary in order to reset the kevent timer. >> >> The proposed fix recognizes and leverages the fact that an alarm interrupt in >> FreeBSD should never have more than one callback on its list, so if > Is your fix applicable only for FreeBSD? > >> rte_intr_callback_register() is called with an interrupt handle type of >> RTE_INTR_HANDLE_ALARM, and if such an interrupt type already has a non- >> empty list, then a new callback is not created, but the kevent timer is >> restarted properly. >> >> A much simpler change is possible if we don't mind the overhead of allocating, >> filling-in, linking, de-linking, and freeing a callback unnecessarily. This > > proposed change makes every attempt to avoid that overhead.