From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Peng, ZhihongX"
To: Stephen Hemminger
Cc: "Wang, Haiyue", "Zhang, Qi Z", "Xing, Beilei", dev@dpdk.org, "Lin, Xueqin", "Yu, PingX"
Date: Mon, 21 Dec 2020 07:35:10 +0000
Subject: Re: [dpdk-dev] [RFC] mem_debug add more log
Message-ID:
References: <20201218192109.50098-1-zhihongx.peng@intel.com> <20201218105424.6731d866@hermes.local>
In-Reply-To: <20201218105424.6731d866@hermes.local>
List-Id: DPDK patches and discussions

1. I think this implementation doesn't add significant overhead. Overhead is incurred only in rte_malloc and rte_free.
2. The existing address sanitizer infrastructure only supports libc malloc.

Regards,
Peng, Zhihong

-----Original Message-----
From: Stephen Hemminger
Sent: Saturday, December 19, 2020 2:54 AM
To: Peng, ZhihongX
Cc: Wang, Haiyue; Zhang, Qi Z; Xing, Beilei; dev@dpdk.org
Subject: Re: [dpdk-dev] [RFC] mem_debug add more log

On Fri, 18 Dec 2020 14:21:09 -0500
Peng Zhihong wrote:

> 1. The debugging log in the current DPDK RTE_MALLOC_DEBUG mode is
>    insufficient, which makes it difficult to locate issues, such as:
>    a) When a memory overflow occurs in rte_free, there is little log
>       information. Even if we abort there, we can find which API dumped
>       core, but we still need to read the source code to find out where
>       the requested memory was overflowed.
>    b) Current DPDK can NOT find the overflow if the memory is still in
>       use and has not been released.
>    c) If there are two contiguous blocks of memory, where the first
>       block is not released, an overflow occurs, and the second block
>       is also covered, the overflow will only be detected once the
>       second block of memory is released. However, current DPDK cannot
>       find the correct point of the overflow: it only detects the
>       overflow of the second block, but should detect the one of the
>       first block.
>       ----------------------------------------------------------------------------------
>       | header cookie | data1 | trailer cookie | header cookie | data2 | trailer cookie |
>       ----------------------------------------------------------------------------------
> 2. To fix the above issues, we can store the requested information when
>    DPDK requests memory, including the requested address and the file,
>    function and line number of the request, and then put it into a list.
>       --------------------      ----------------------      ----------------------
>       | struct list_head |----->| struct malloc_info |----->| struct malloc_info |
>       --------------------      ----------------------      ----------------------
>    The above 3 problems can be solved through this implementation:
>    a) If there is a memory overflow in rte_free, we can traverse the
>       list to find the information of the overflowed memory and print
>       it, like this:
>       code:
>       37 char *p = rte_zmalloc(NULL, 64, 0);
>       38 memset(p, 0, 65);
>       39 rte_free(p);
>       40 //rte_malloc_validate_all_memory();
>       memory error:
>       EAL: Error: Invalid memory
>       malloc memory address 0x17ff2c340 overflow in \
>       file:../examples/helloworld/main.c function:main line:37
>    b)c) Provide an interface, rte_malloc_validate_all_memory, to check
>       all memory on the list for overflow. By calling this function
>       manually at the exit point of the business logic, we can find all
>       overflow points in time.
>
> Signed-off-by: Peng Zhihong

Good concept, but doesn't this add significant overhead?

Maybe we could make rte_malloc work with the existing address sanitizer
infrastructure in gcc/clang? That would provide faster and more immediate
diagnostic info.