From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) by dpdk.org (Postfix) with ESMTP id 80C692A5E for ; Tue, 6 Dec 2016 16:25:05 +0100 (CET) Received: from int-mx09.intmail.prod.int.phx2.redhat.com (int-mx09.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 6B77A8E3E6; Tue, 6 Dec 2016 15:25:04 +0000 (UTC) Received: from dhcp-25-97.bos.redhat.com (dhcp-25-172.bos.redhat.com [10.18.25.172]) by int-mx09.intmail.prod.int.phx2.redhat.com (8.14.4/8.14.4) with ESMTP id uB6FP17F008566 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NO); Tue, 6 Dec 2016 10:25:03 -0500 From: Aaron Conole To: Daniele Di Proietto Cc: Ilya Maximets , "dev\@openvswitch.org" , Dyasly Sergey , Heetae Ahn , "dev\@dpdk.org" , Kevin Traynor , Mauricio =?utf-8?Q?V=C3=A1squez?= References: <1480343514-9733-1-git-send-email-i.maximets@samsung.com> <52cb119f-621c-4eb8-d2fd-1428a26a46b0@samsung.com> <51C98877-0199-473A-AB2B-248B7874B62F@vmware.com> Date: Tue, 06 Dec 2016 10:25:01 -0500 In-Reply-To: <51C98877-0199-473A-AB2B-248B7874B62F@vmware.com> (Daniele Di Proietto's message of "Tue, 29 Nov 2016 18:57:38 +0000") Message-ID: User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1.50 (gnu/linux) MIME-Version: 1.0 Content-Type: text/plain X-Scanned-By: MIMEDefang 2.68 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Tue, 06 Dec 2016 15:25:05 +0000 (UTC) Subject: Re: [dpdk-dev] [PATCH] dpdk: Fix abort on double free. X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 06 Dec 2016 15:25:05 -0000 Daniele Di Proietto writes: > On 29/11/2016 07:57, "Aaron Conole" wrote: > >>Hi Ilya, >> >>Ilya Maximets writes: >> >>> On 28.11.2016 21:55, Aaron Conole wrote: >>>> Ilya Maximets writes: >>>> >>>>> According to DPDK API (lib/librte_eal/common/include/rte_eal.h): >>>>> >>>>> "After the call to rte_eal_init(), all arguments argv[x] >>>>> with x < ret may be modified and should not be accessed >>>>> by the application." >>>>> >>>>> This means, that OVS must not free the arguments passed to DPDK. >>>>> In real world, 'rte_eal_init()' replaces the last argument in >>>>> 'dpdk_argv' with the first one by doing this: >>>> >>>> Thanks for spotting this error, Ilya. >>>> >>>>> # eal_parse_args() from lib/librte_eal/linuxapp/eal/eal.c >>>>> >>>>> char *prgname = argv[0]; >>>>> ... >>>>> if (optind >= 0) >>>>> argv[optind-1] = prgname; >>>>> >>>>> This leads to double free inside 'deferred_argv_release()' and >>>>> possible ABORT at exit: >>>> >>>> I haven't seen this, which is both shocking and scary - the commit which >>>> does this copy is almost 4 years old; did you have to do anything >>>> specific for this behavior to occur? Did something change in DPDK >>>> recently that exposed this behavior? Just wondering how you reproduced >>>> it. >>> >>> Abort was caught up accidentally. I'm able to reproduce it on my a >>> little unusual testing system (ARMv8 + Fedora 21 + clang 3.5) without >>> any specific manipulations. The bug exists always but it's hard >>> for libc to detect double free here because there are many other >>> frees/allocations at exit time. I've used following patch to confirm >>> the issue if it wasn't detected by libc: >> >>Well, it's at least good that you can observe it consistently. Did you >>try my provided patch to see if that works as well? >> >>> diff --git a/lib/dpdk.c b/lib/dpdk.c >>> index 49a589a..65d2d28 100644 >>> --- a/lib/dpdk.c >>> +++ b/lib/dpdk.c >>> @@ -258,6 +258,8 @@ deferred_argv_release(void) >>> { >>> int result; >>> for (result = 0; result < dpdk_argc; ++result) { >>> + VLOG_INFO("DPDK ARGV release: %2d: 0x%" PRIx64 " (%s)", >>> + result, (intptr_t)dpdk_argv[result], dpdk_argv[result]); >>> free(dpdk_argv[result]); >>> } >>> >> >>It's quite glaring after studying the code. Really good catch! > > I agree, thanks for spotting this > >> >>>> >>>>> *** Error in `ovs-vswitchd': double free or corruption (fasttop) <...> *** >>>>> Program received signal SIGABRT, Aborted. >>>>> >>>>> #0 raise () from /lib64/libc.so.6 >>>>> #1 abort () from /lib64/libc.so.6 >>>>> #2 __libc_message () from /lib64/libc.so.6 >>>>> #3 free () from /lib64/libc.so.6 >>>>> #4 deferred_argv_release () at lib/dpdk.c:261 >>>>> #5 __run_exit_handlers () from /lib64/libc.so.6 >>>>> #6 exit () from /lib64/libc.so.6 >>>>> #7 __libc_start_main () from /lib64/libc.so.6 >>>>> #8 _start () >>>>> >>>>> Fix that by not calling free for the memory passed to DPDK. >>>>> >>>>> CC: Aaron Conole >>>>> Fixes: bab694097133 ("netdev-dpdk: Convert initialization from >>>>> cmdline to db") >>>>> Signed-off-by: Ilya Maximets >>>>> --- >>>> >>>> We need to free the memory - I think that is not a question; >>> >>> Actually, it is. According to DPDK API (see above) 'rte_eal_init()' >>> takes the ownership of 'argv'. This means that we must not free >>> or use this memory. >> >>Apologies for the ranty-wall of text below. >> >>DPDK *cannot* take ownership of freeing this memory, unless 1) it expects a >>completely separate array from argv/argc than the one passed during >>program execution and initialization, or 2) it expects the hosted >>environment to give it the responsibility of cleaning this up. It >>explicitly claims that the argv/argc is what comes from main(), and >>therefore should obey the restrictions and privileges afforded those >>variables. >> >>In fact, I don't even see anywhere that dpdk preserves argv, *at all*. >>Looking through the history very quickly (admittedly just back to commit >>af75078fece3615088e561357c1e97603e43a5fe in dpdk) confirms that dpdk >>hasn't stored the arguments anywhere to do any processing. >> >>DPDK api guide does NOT state that it takes possession - and that matches >>with what happens in the code, BUT I will agree the statement >> >> 'all arguments argv[x] with x < ret may be modified and should not be >> accessed by the application' >> >>is a bit ambiguous. I think it's trying to say that the application should do >>its getopt()s parsing before calling the dpdk init routine, because DPDK libs >>will change the array. I don't see a reason for modifying the array in >>the code (the `argv[optind-1] = progname`), but if the dpdk library wants >>to do that, it is free to do so according to C99 5.1.2.2.1; I think >>it's best we always free what we allocate, which is why I suggested the >>side array patch which stores additional pointers to the data to be >>free'd up at exit. >> >>I am not sure which is more appropriate, since this is an exit condition, >>after all. The memory will get free()d up eventually by the >>environments on which OvS runs. It doesn't _feel_ correct to leave the >>memory dangling, since we can free it. >> >>Anyone else have thoughts on this? > > I don't think it's a big deal to leak memory that has to be used until the process > terminates. There are other examples of this in OvS, such as 'timewarp_seq' in > lib/timeval.c. They should be reported by valgrind as "still reachable". With Ilya's incremental, it won't - we will drop the static variables which would still hold references, as well. So valgrind will report it as leaked for sure. > That said, at some point we might want to have 100% leak free valgrind runs, so > I think it's be better to free everything we allocate, so I would prefer Aaron's > solution. I don't think DPDK should expect the arguments to be available in exit > handlers, i.e. after main() returns. > > I don't feel strongly about it though, since, if I'm not mistaken, valgrind doesn't > support DPDK yet. I'm okay if you want to take Ilya's patches for this, since I can't truly test for this case (my environment hasn't reproduced the report). Alternately, I can put in some printf() calls and prove it out that way, then submit my patch formally. I would prefer if OvS would release the memory, since it isn't required. On further thought, though, I actually think it should be released immediately, and you can see my original proposal did that, and was told to change: https://mail.openvswitch.org/pipermail/ovs-dev/2015-December/306540.html So, I'd much prefer a patch which frees immediately after calling dpdk_init. Nothing in the dpdk API documentation makes me believe that it expects to use the memory. Does anyone object to me submitting a formal patch which does an immediate free of the memory passed to 'dpdk_init()' as the 'argv' parameter?