From: "Peng, ZhihongX"
To: Stephen Hemminger
CC: "Wang, Haiyue", "Zhang, Qi Z", "Xing, Beilei", "dev@dpdk.org",
    "Lin, Xueqin", "Yu, PingX"
Date: Fri, 25 Dec 2020 07:20:57 +0000
Message-ID: <918d438cd1cc4f4cbc6274218903072c@intel.com>
References: <20201218192109.50098-1-zhihongx.peng@intel.com>
    <20201218105424.6731d866@hermes.local>
    <20201221104421.5912a951@hermes.local>
In-Reply-To: <20201221104421.5912a951@hermes.local>
Subject: Re: [dpdk-dev] [RFC] mem_debug add more log

Our simple scheme performs better than ASan. We are trying the ASan
solution.

Regards,
Peng, Zhihong

-----Original Message-----
From: Stephen Hemminger
Sent: Tuesday, December 22, 2020 2:44 AM
To: Peng, ZhihongX
Cc: Wang, Haiyue; Zhang, Qi Z; Xing, Beilei; dev@dpdk.org; Lin, Xueqin;
    Yu, PingX
Subject: Re: [dpdk-dev] [RFC] mem_debug add more log

On Mon, 21 Dec 2020 07:35:10 +0000
"Peng, ZhihongX" wrote:

> 1. I think this implementation doesn't add significant overhead.
>    Overhead is incurred only in rte_malloc and rte_free.
>
> 2. The existing address sanitizer infrastructure only supports libc
>    malloc.
>
> Regards,
> Peng, Zhihong
>
> -----Original Message-----
> From: Stephen Hemminger
> Sent: Saturday, December 19, 2020 2:54 AM
> To: Peng, ZhihongX
> Cc: Wang, Haiyue; Zhang, Qi Z; Xing, Beilei; dev@dpdk.org
> Subject: Re: [dpdk-dev] [RFC] mem_debug add more log
>
> On Fri, 18 Dec 2020 14:21:09 -0500
> Peng Zhihong wrote:
>
> > 1. The debugging log in the current DPDK RTE_MALLOC_DEBUG mode is
> >    insufficient, which makes it difficult to locate issues such as:
> >    a) When a memory overflow occurs in rte_free, there is little log
> >       information. Even if we abort here, we can find which API
> >       dumped core, but we still need to read the source code to find
> >       out where the requested memory overflowed.
> >    b) Current DPDK cannot find the overflow if the memory has been
> >       used and not released.
> >    c) If there are two contiguous blocks of memory, and the first
> >       block overflows into the second while the first has not been
> >       released, a memory overflow is detected only once the second
> >       block is released. However, current DPDK cannot find the
> >       correct point of the overflow: it detects the overflow of the
> >       second block but should detect the one of the first block.
> >    ----------------------------------------------------------------------------------
> >    | header cookie | data1 | trailer cookie | header cookie | data2 |trailer cookie |
> >    ----------------------------------------------------------------------------------
> > 2. To fix the above issues, we can store the request information when
> >    DPDK requests memory, including the requested address and the
> >    requesting file, function and line number, and then put it into a
> >    list.
> >    --------------------     ----------------------     ----------------------
> >    | struct list_head |---->| struct malloc_info |---->| struct malloc_info |
> >    --------------------     ----------------------     ----------------------
> >    The above 3 problems can be solved through this implementation:
> >    a) If there is a memory overflow in rte_free, you can traverse the
> >       list to find the record of the overflowed memory and print it,
> >       like this:
> >       code:
> >       37 char *p = rte_zmalloc(NULL, 64, 0);
> >       38 memset(p, 0, 65);
> >       39 rte_free(p);
> >       40 //rte_malloc_validate_all_memory();
> >       memory error:
> >       EAL: Error: Invalid memory
> >       malloc memory address 0x17ff2c340 overflow in \
> >       file:../examples/helloworld/main.c function:main line:37
> >    b)c) Provide an interface to check for all memory overflows, the
> >       function rte_malloc_validate_all_memory. This function checks
> >       all memory on the list. By calling this function manually at
> >       the exit point of the business logic, we can find all overflow
> >       points in time.
> >
> > Signed-off-by: Peng Zhihong
>
> Good concept, but doesn't this add significant overhead?
>
> Maybe we could make rte_malloc work with the existing address sanitizer
> infrastructure in gcc/clang? That would provide faster, more immediate,
> and better diagnostic info.

Everybody builds their own custom debug hooks, and some of these are
worth sharing. But a lot of the time debug code becomes technical debt,
creates API/ABI issues and causes more trouble than it is worth.

Therefore my desire is for DPDK to be better supported by standard tools
such as valgrind and address sanitizer. The standard tools catch more
errors faster and do not create project maintenance workload.

See: https://github.com/google/sanitizers/wiki/AddressSanitizerAlgorithm
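
For reference, the tracking scheme described in the RFC text above can be
sketched in plain C roughly as follows. This is only an illustration under
stated assumptions: it wraps libc malloc/free rather than rte_malloc/rte_free,
and the names dbg_malloc, dbg_free, dbg_validate_all, DBG_MALLOC,
TRAILER_COOKIE, TRAILER_LEN and the fields of struct malloc_info are
hypothetical, not taken from the patch or from DPDK.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define TRAILER_COOKIE 0xA5u /* hypothetical guard byte, not DPDK's cookie */
#define TRAILER_LEN    8     /* hypothetical trailer size in bytes */

/* Hypothetical per-allocation record, modelled on the RFC's list diagram. */
struct malloc_info {
    struct malloc_info *next;
    void *addr;       /* address handed back to the caller */
    size_t size;      /* size the caller asked for */
    const char *file; /* call site that requested the memory */
    const char *func;
    int line;
};

static struct malloc_info *alloc_list; /* all live allocations */

/* Allocate size bytes plus a trailer and record the call site. */
static void *dbg_malloc(size_t size, const char *file, const char *func,
                        int line)
{
    unsigned char *p = malloc(size + TRAILER_LEN);
    struct malloc_info *mi = malloc(sizeof(*mi));

    if (p == NULL || mi == NULL) {
        free(p);
        free(mi);
        return NULL;
    }
    memset(p + size, TRAILER_COOKIE, TRAILER_LEN);
    mi->addr = p;
    mi->size = size;
    mi->file = file;
    mi->func = func;
    mi->line = line;
    mi->next = alloc_list;
    alloc_list = mi;
    return p;
}

#define DBG_MALLOC(sz) dbg_malloc((sz), __FILE__, __func__, __LINE__)

/* Return 1 if the trailer behind the block is intact, 0 otherwise. */
static int trailer_ok(const struct malloc_info *mi)
{
    const unsigned char *t = (const unsigned char *)mi->addr + mi->size;

    for (size_t i = 0; i < TRAILER_LEN; i++)
        if (t[i] != TRAILER_COOKIE)
            return 0;
    return 1;
}

static void report(const struct malloc_info *mi)
{
    fprintf(stderr, "memory address %p overflow in file:%s function:%s line:%d\n",
            mi->addr, mi->file, mi->func, mi->line);
}

/* Check every live block; return the number of overflowed blocks. */
static int dbg_validate_all(void)
{
    int bad = 0;

    for (const struct malloc_info *mi = alloc_list; mi != NULL; mi = mi->next) {
        if (!trailer_ok(mi)) {
            report(mi);
            bad++;
        }
    }
    return bad;
}

/* Check the block being freed, then unlink its record and release it. */
static void dbg_free(void *ptr)
{
    for (struct malloc_info **pp = &alloc_list; *pp != NULL; pp = &(*pp)->next) {
        struct malloc_info *mi = *pp;

        if (mi->addr == ptr) {
            if (!trailer_ok(mi))
                report(mi);
            *pp = mi->next;
            free(mi);
            free(ptr);
            return;
        }
    }
}
```

With this sketch, a caller that does p = DBG_MALLOC(64) followed by
memset(p, 0, 65) would have the overflow reported against the allocating
file, function and line, either by dbg_validate_all() while the block is
still live or by dbg_free(p). The linear list walk on every free is the kind
of cost the overhead question in the thread is about.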