From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 753C2A09F6; Fri, 18 Dec 2020 11:32:09 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id B1884CA6E; Fri, 18 Dec 2020 11:32:07 +0100 (CET) Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by dpdk.org (Postfix) with ESMTP id 19E30CA68 for ; Fri, 18 Dec 2020 11:32:04 +0100 (CET) IronPort-SDR: hRIikzcbfOLSogTMkRQ4Df9h08VBLqMxG8IexKqwUss1efiQwPNr5RwpT+UGPVnRhROul7NwbW 5p4jo5geM8TQ== X-IronPort-AV: E=McAfee;i="6000,8403,9838"; a="260144547" X-IronPort-AV: E=Sophos;i="5.78,430,1599548400"; d="scan'208";a="260144547" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Dec 2020 02:32:02 -0800 IronPort-SDR: 0qdKTj8VaKP9+27OCiKTgvR+j6C/jGWPAfhbxXMHyHyUd6kVXqIbpXdOT7wAbgWnknkN1nJU38 2xiRyiwugCYQ== X-IronPort-AV: E=Sophos;i="5.78,430,1599548400"; d="scan'208";a="561139725" Received: from unknown (HELO localhost.localdomain) ([10.240.183.63]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Dec 2020 02:31:59 -0800 From: Peng Zhihong To: haiyue.wang@intel.com, qi.z.zhang@intel.com, beilei.xing@intel.com Cc: dev@dpdk.org, Peng Zhihong Date: Fri, 18 Dec 2020 14:21:09 -0500 Message-Id: <20201218192109.50098-1-zhihongx.peng@intel.com> X-Mailer: git-send-email 2.17.1 Subject: [dpdk-dev] [RFC] mem_debug add more log X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" 1. The debugging log in current DPDK RTE_MALLOC_DEBUG mode is insufficient, which makes it difficult to locate the issues, such as: a) When a memeory overlflow occur in rte_free, there is a little log information. Even if abort here, we can find which API is core dumped but we still need to read the source code to find out where the requested memory is overflowed. b) Current DPDK can NOT find that the overflow if the memory has been used and not released. c) If there are two pieces of continuous memory, when the first block is not released and an overflow is occured and also the second block of memory is covered, a memory overflow will be detected once the second block of memory is released. However, current DPDK can not find the correct point of memory overflow. It only detect the memory overflow of the second block but should dedect the one of first block. ---------------------------------------------------------------------------------- | header cookie | data1 | trailer cookie | header cookie | data2 |trailer cookie | ---------------------------------------------------------------------------------- 2. To fix above issues, we can store the requested information When DPDK request memory. Including the requested address and requested momory's file, function and numbers of rows and then put it into a list. -------------------- ---------------------- ---------------------- | struct list_head |---->| struct malloc_info |---->| struct malloc_info | -------------------- ---------------------- ---------------------- The above 3 problems can be solved through this implementation: a) If there is a memory overflow in rte_free, you can traverse the list to find the information of overflow memory and print the overflow memory information. like this: code: 37 char *p = rte_zmalloc(NULL, 64, 0); 38 memset(p, 0, 65); 39 rte_free(p); 40 //rte_malloc_validate_all_memory(); memory error: EAL: Error: Invalid memory malloc memory address 0x17ff2c340 overflow in \ file:../examples/helloworld/main.c function:main line:37 b)c) Provide a interface to check all memory overflow in function rte_malloc_validate_all_memory, this function will check all memory on the list. Call this funcation manually at the exit point of business logic, we can find all overflow points in time. Signed-off-by: Peng Zhihong --- lib/librte_eal/common/eal_common_mcfg.c | 30 ++ lib/librte_eal/common/eal_memcfg.h | 3 + lib/librte_eal/common/rte_malloc.c | 380 ++++++++++++++++++++- lib/librte_eal/include/rte_eal_memconfig.h | 15 + lib/librte_eal/include/rte_malloc.h | 70 ++++ lib/librte_eal/version.map | 13 + 6 files changed, 510 insertions(+), 1 deletion(-) diff --git a/lib/librte_eal/common/eal_common_mcfg.c b/lib/librte_eal/common/eal_common_mcfg.c index c77ba97a9..914db6251 100644 --- a/lib/librte_eal/common/eal_common_mcfg.c +++ b/lib/librte_eal/common/eal_common_mcfg.c @@ -174,3 +174,33 @@ rte_mcfg_get_single_file_segments(void) struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; return (bool)mcfg->single_file_segments; } + +#ifdef RTE_MALLOC_DEBUG +void +rte_mcfg_malloc_debug_read_lock(void) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + rte_rwlock_read_lock(&mcfg->malloc_debug_lock); +} + +void +rte_mcfg_malloc_debug_read_unlock(void) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + rte_rwlock_read_unlock(&mcfg->malloc_debug_lock); +} + +void +rte_mcfg_malloc_debug_write_lock(void) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + rte_rwlock_write_lock(&mcfg->malloc_debug_lock); +} + +void +rte_mcfg_malloc_debug_write_unlock(void) +{ + struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config; + rte_rwlock_write_unlock(&mcfg->malloc_debug_lock); +} +#endif diff --git a/lib/librte_eal/common/eal_memcfg.h b/lib/librte_eal/common/eal_memcfg.h index ea013a9da..6d872d57b 100644 --- a/lib/librte_eal/common/eal_memcfg.h +++ b/lib/librte_eal/common/eal_memcfg.h @@ -37,6 +37,9 @@ struct rte_mem_config { rte_rwlock_t qlock; /**< used by tailqs for thread safety. */ rte_rwlock_t mplock; /**< used by mempool library for thread safety. */ rte_spinlock_t tlock; /**< used by timer library for thread safety. */ +#ifdef RTE_MALLOC_DEBUG + rte_rwlock_t malloc_debug_lock; /**< used by malloc debug. */ +#endif rte_rwlock_t memory_hotplug_lock; /**< Indicates whether memory hotplug request is in progress. */ diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c index 9d39e58c0..2c65a24db 100644 --- a/lib/librte_eal/common/rte_malloc.c +++ b/lib/librte_eal/common/rte_malloc.c @@ -20,9 +20,12 @@ #include #include #include +#include #include +#include + #include #include "malloc_elem.h" #include "malloc_heap.h" @@ -31,6 +34,28 @@ #include "eal_private.h" +#ifdef RTE_MALLOC_DEBUG + +TAILQ_HEAD(malloc_debug_info_list, rte_tailq_entry); + +static struct rte_tailq_elem malloc_debug_info_tailq = { + .name = "RTE_MALLOC_DEBUG", +}; +EAL_REGISTER_TAILQ(malloc_debug_info_tailq); + +struct malloc_info { + void *ptr; + char *file; + char *func; + int line; +}; + +static void find_overflow_info(void *ptr); +static void remove_malloc_info(void *ptr); +static void clean_malloc_debug_info_list(void); + +#endif + /* Free the memory space back to heap */ static void mem_free(void *addr, const bool trace_ena) @@ -39,8 +64,17 @@ mem_free(void *addr, const bool trace_ena) rte_eal_trace_mem_free(addr); if (addr == NULL) return; - if (malloc_heap_free(malloc_elem_from_data(addr)) < 0) + if (malloc_heap_free(malloc_elem_from_data(addr)) < 0) { RTE_LOG(ERR, EAL, "Error: Invalid memory\n"); +#ifdef RTE_MALLOC_DEBUG + find_overflow_info(addr); + clean_malloc_debug_info_list(); + abort(); +#endif + } +#ifdef RTE_MALLOC_DEBUG + remove_malloc_info(addr); +#endif } void @@ -55,6 +89,8 @@ eal_free_no_trace(void *addr) return mem_free(addr, false); } +#ifndef RTE_MALLOC_DEBUG + static void * malloc_socket(const char *type, size_t size, unsigned int align, int socket_arg, const bool trace_ena) @@ -209,6 +245,8 @@ rte_realloc(void *ptr, size_t size, unsigned int align) return rte_realloc_socket(ptr, size, align, SOCKET_ID_ANY); } +#endif + int rte_malloc_validate(const void *ptr, size_t *size) { @@ -667,3 +705,343 @@ rte_malloc_heap_destroy(const char *heap_name) return ret; } + +#ifdef RTE_MALLOC_DEBUG + +static void +add_malloc_info_to_list(void *ptr, const char *file, int line, + const char *func) +{ + struct rte_tailq_entry *te = NULL; + struct malloc_debug_info_list *debug_info_list = NULL; + struct malloc_info *debug_info = NULL; + + debug_info_list = RTE_TAILQ_CAST(malloc_debug_info_tailq.head, + malloc_debug_info_list); + + rte_mcfg_malloc_debug_write_lock(); + + te = malloc(sizeof(*te)); + if (te == NULL) { + printf("%s mallco te erro.\n", __func__); + goto error; + } + memset(te, 0, sizeof(*te)); + + debug_info = malloc(sizeof(struct malloc_info)); + if (debug_info == NULL) { + printf("%s mallco debug_info erro.\n", __func__); + goto error; + } + memset(debug_info, 0, sizeof(struct malloc_info)); + + debug_info->file = strdup(file); + if (debug_info->file == NULL) { + printf("%s mallco file erro.\n", __func__); + goto error; + } + + debug_info->func = strdup(func); + if (debug_info->func == NULL) { + printf("%s mallco func erro.\n", __func__); + goto error; + } + + debug_info->ptr = ptr; + debug_info->line = line; + + te->data = (void *)debug_info; + + TAILQ_INSERT_TAIL(debug_info_list, te, next); + + rte_mcfg_malloc_debug_write_unlock(); + return; + +error: + if (te) + free(te); + + if (debug_info->file) + free(debug_info->file); + + if (debug_info->func) + free(debug_info->func); + + if (debug_info) + free(debug_info); + + rte_mcfg_malloc_debug_write_unlock(); +} + +static void +find_overflow_info(void *ptr) +{ + struct rte_tailq_entry *te = NULL; + struct malloc_debug_info_list *debug_info_list = NULL; + struct malloc_info *debug_info = NULL; + + debug_info_list = RTE_TAILQ_CAST(malloc_debug_info_tailq.head, + malloc_debug_info_list); + + rte_mcfg_malloc_debug_read_lock(); + TAILQ_FOREACH(te, debug_info_list, next) { + debug_info = (struct malloc_info *) te->data; + if (ptr == debug_info->ptr) { + printf("malloc memory address %p overflow in file:%s " + "function:%s line:%d\n", + debug_info->ptr, debug_info->file, + debug_info->func, debug_info->line); + } + } + rte_mcfg_malloc_debug_read_unlock(); +} + +static void +remove_malloc_info(void *ptr) +{ + struct rte_tailq_entry *te = NULL; + struct malloc_debug_info_list *debug_info_list = NULL; + struct malloc_info *debug_info = NULL; + + debug_info_list = RTE_TAILQ_CAST(malloc_debug_info_tailq.head, + malloc_debug_info_list); + + rte_mcfg_malloc_debug_write_unlock(); + TAILQ_FOREACH(te, debug_info_list, next) { + debug_info = (struct malloc_info *) te->data; + if (ptr == debug_info->ptr) { + TAILQ_REMOVE(debug_info_list, te, next); + free(debug_info->file); + free(debug_info->func); + free(debug_info); + free(te); + break; + } + } + rte_mcfg_malloc_debug_write_unlock(); +} + +static void +clean_malloc_debug_info_list(void) +{ + struct rte_tailq_entry *te = NULL; + struct malloc_debug_info_list *debug_info_list = NULL; + struct malloc_info *debug_info = NULL; + + debug_info_list = RTE_TAILQ_CAST(malloc_debug_info_tailq.head, + malloc_debug_info_list); + + rte_mcfg_malloc_debug_write_unlock(); + TAILQ_FOREACH(te, debug_info_list, next) { + debug_info = (struct malloc_info *) te->data; + TAILQ_REMOVE(debug_info_list, te, next); + free(debug_info->file); + free(debug_info->func); + free(debug_info); + free(te); + } + rte_mcfg_malloc_debug_write_unlock(); +} + +void +rte_malloc_validate_all_memory(void) +{ + struct rte_tailq_entry *te = NULL; + struct malloc_debug_info_list *debug_info_list = NULL; + struct malloc_info *debug_info = NULL; + int is_overflow = 0; + + debug_info_list = RTE_TAILQ_CAST(malloc_debug_info_tailq.head, + malloc_debug_info_list); + + rte_mcfg_malloc_debug_read_lock(); + TAILQ_FOREACH(te, debug_info_list, next) { + debug_info = (struct malloc_info *) te->data; + if (rte_malloc_validate(debug_info->ptr, NULL)) { + printf("malloc memory address %p overflow in file:%s " + "function:%s line:%d\n", + debug_info->ptr, debug_info->file, + debug_info->func, debug_info->line); + is_overflow = 1; + } + } + if (is_overflow) { + clean_malloc_debug_info_list(); + abort(); + } + rte_mcfg_malloc_debug_read_unlock(); +} + +static void * +malloc_socket_debug(const char *type, size_t size, unsigned int align, + int socket_arg, const bool trace_ena, const char *file, + int line, const char *func) +{ + void *ptr; + + /* return NULL if size is 0 or alignment is not power-of-2 */ + if (size == 0 || (align && !rte_is_power_of_2(align))) + return NULL; + + /* if there are no hugepages and if we are not allocating from an + * external heap, use memory from any socket available. checking for + * socket being external may return -1 in case of invalid socket, but + * that's OK - if there are no hugepages, it doesn't matter. + */ + if (rte_malloc_heap_socket_is_external(socket_arg) != 1 && + !rte_eal_has_hugepages()) + socket_arg = SOCKET_ID_ANY; + + ptr = malloc_heap_alloc(type, size, socket_arg, 0, + align == 0 ? 1 : align, 0, false); + + add_malloc_info_to_list(ptr, file, line, func); + + if (trace_ena) + rte_eal_trace_mem_malloc(type, size, align, socket_arg, ptr); + return ptr; +} + +/* + * Allocate memory on specified heap. + */ +void * +rte_malloc_socket_debug(const char *type, size_t size, unsigned int align, + int socket_arg, const char *file, int line, const char *func) +{ + return malloc_socket_debug(type, size, align, socket_arg, true, + file, line, func); +} + +/* + * Allocate memory on default heap. + */ +void * +rte_malloc_debug(const char *type, size_t size, unsigned int align, + const char *file, int line, const char *func) +{ + return rte_malloc_socket_debug(type, size, align, SOCKET_ID_ANY, + file, line, func); +} + +/* + * Allocate zero'd memory on specified heap. + */ +void * +rte_zmalloc_socket_debug(const char *type, size_t size, unsigned int align, + int socket, const char *file, int line, const char *func) +{ + void *ptr = rte_malloc_socket_debug(type, size, align, + socket, file, line, func); + + /* + * If DEBUG is enabled, then freed memory is marked with poison + * value and set to zero on allocation. + * If DEBUG is not enabled then memory is already zeroed. + */ + if (ptr != NULL) + memset(ptr, 0, size); + + rte_eal_trace_mem_zmalloc(type, size, align, socket, ptr); + return ptr; +} + +/* + * Allocate zero'd memory on default heap. + */ +void * +rte_zmalloc_debug(const char *type, size_t size, unsigned int align, + const char *file, int line, const char *func) +{ + return rte_zmalloc_socket_debug(type, size, align, SOCKET_ID_ANY, + file, line, func); +} + +/* + * Allocate zero'd memory on specified heap. + */ +void * +rte_calloc_socket_debug(const char *type, size_t num, size_t size, + unsigned int align, int socket, const char *file, + int line, const char *func) +{ + return rte_zmalloc_socket_debug(type, num * size, align, socket, + file, line, func); +} + +/* + * Allocate zero'd memory on default heap. + */ +void * +rte_calloc_debug(const char *type, size_t num, size_t size, unsigned int align, + const char *file, int line, const char *func) +{ + return rte_zmalloc_debug(type, num * size, align, file, line, func); +} + +/* + * Resize allocated memory on specified heap. + */ +void * +rte_realloc_socket_debug(void *ptr, size_t size, unsigned int align, + int socket, const char *file, int line, const char *func) +{ + if (ptr == NULL) + return rte_malloc_socket_debug(NULL, size, align, socket, + file, line, func); + + struct malloc_elem *elem = malloc_elem_from_data(ptr); + if (elem == NULL) { + RTE_LOG(ERR, EAL, "Error: memory corruption detected\n"); + return NULL; + } + + size = RTE_CACHE_LINE_ROUNDUP(size), align = RTE_CACHE_LINE_ROUNDUP(align); + + /* check requested socket id and alignment matches first, and if ok, + * see if we can resize block + */ + if ((socket == SOCKET_ID_ANY || + (unsigned int)socket == elem->heap->socket_id) && + RTE_PTR_ALIGN(ptr, align) == ptr && + malloc_heap_resize(elem, size) == 0) { + rte_eal_trace_mem_realloc(size, align, socket, ptr); + return ptr; + } + + /* either requested socket id doesn't match, alignment is off + * or we have no room to expand, + * so move the data. + */ + void *new_ptr = rte_malloc_socket_debug(NULL, size, align, socket, + file, line, func); + if (new_ptr == NULL) + return NULL; + /* elem: |pad|data_elem|data|trailer| */ + const size_t old_size = elem->size - elem->pad - MALLOC_ELEM_OVERHEAD; + rte_memcpy(new_ptr, ptr, old_size < size ? old_size : size); + rte_free(ptr); + + rte_eal_trace_mem_realloc(size, align, socket, new_ptr); + return new_ptr; +} + +/* + * Resize allocated memory. + */ +void * +rte_realloc_debug(void *ptr, size_t size, unsigned int align, + const char *file, int line, const char *func) +{ + return rte_realloc_socket_debug(ptr, size, align, SOCKET_ID_ANY, + file, line, func); +} + +void * +eal_malloc_no_trace(const char *type, size_t size, unsigned int align) +{ + return malloc_socket_debug(type, size, align, SOCKET_ID_ANY, false, + __FILE__, __LINE__, __func__); +} + +#endif diff --git a/lib/librte_eal/include/rte_eal_memconfig.h b/lib/librte_eal/include/rte_eal_memconfig.h index dede2ee32..6482a3a42 100644 --- a/lib/librte_eal/include/rte_eal_memconfig.h +++ b/lib/librte_eal/include/rte_eal_memconfig.h @@ -122,6 +122,21 @@ __rte_experimental bool rte_mcfg_get_single_file_segments(void); +#ifdef RTE_MALLOC_DEBUG +void +rte_mcfg_malloc_debug_read_lock(void); + +void +rte_mcfg_malloc_debug_read_unlock(void); + +void +rte_mcfg_malloc_debug_write_lock(void); + +void +rte_mcfg_malloc_debug_write_unlock(void); + +#endif + #ifdef __cplusplus } #endif diff --git a/lib/librte_eal/include/rte_malloc.h b/lib/librte_eal/include/rte_malloc.h index 3af64f876..805f1a9b9 100644 --- a/lib/librte_eal/include/rte_malloc.h +++ b/lib/librte_eal/include/rte_malloc.h @@ -32,6 +32,8 @@ struct rte_malloc_socket_stats { size_t heap_allocsz_bytes; /**< Total allocated bytes on heap */ }; +#ifndef RTE_MALLOC_DEBUG + /** * This function allocates memory from the huge-page area of memory. The memory * is not cleared. In NUMA systems, the memory allocated resides on the same @@ -247,6 +249,74 @@ void * rte_calloc_socket(const char *type, size_t num, size_t size, unsigned align, int socket) __rte_alloc_size(2, 3); +#else + +void * +rte_malloc_debug(const char *type, size_t size, unsigned int align, + const char *file, int line, const char *func); + +void * +rte_zmalloc_debug(const char *type, size_t size, unsigned int align, + const char *file, int line, const char *func); + +void * +rte_calloc_debug(const char *type, size_t num, size_t size, unsigned int align, + const char *file, int line, const char *func); + +void * +rte_realloc_debug(void *ptr, size_t size, unsigned int align, + const char *file, int line, const char *func); + +__rte_experimental +void * +rte_realloc_socket_debug(void *ptr, size_t size, unsigned int align, + int socket, const char *file, int line, const char *func); + +void * +rte_malloc_socket_debug(const char *type, size_t size, unsigned int align, + int socket, const char *file, int line, const char *func); + +void * +rte_zmalloc_socket_debug(const char *type, size_t size, unsigned int align, + int socket, const char *file, int line, const char *func); + +void * +rte_calloc_socket_debug(const char *type, size_t num, size_t size, + unsigned int align, int socket, const char *file, + int line, const char *func); + +#define rte_malloc(type, size, align) \ + rte_malloc_debug(type, size, align, __FILE__, __LINE__, __func__) + +#define rte_zmalloc(type, size, align) \ + rte_zmalloc_debug(type, size, align, __FILE__, __LINE__, __func__) + +#define rte_calloc(type, num, size, align) \ + rte_calloc_debug(type, num, size, align, __FILE__, __LINE__, __func__) + +#define rte_realloc(ptr, size, align) \ + rte_realloc_debug(ptr, size, align, __FILE__, __LINE__, __func__) + +#define rte_malloc_socket(type, size, align, socket) \ + rte_malloc_socket_debug(type, size, align, socket,\ + __FILE__, __LINE__, __func__) + +#define rte_zmalloc_socket(type, size, align, socket) \ + rte_zmalloc_socket_debug(type, size, align, socket, \ + __FILE__, __LINE__, __func__) + +#define rte_calloc_socket(type, num, size, align, socket) \ + rte_calloc_socket_debug(type, num, size, align, socket, \ + __FILE__, __LINE__, __func__) + +#define rte_realloc_socket(ptr, size, align, socket) \ + rte_realloc_socket_debug(ptr, size, align, socket, \ + __FILE__, __LINE__, __func__) + +void rte_malloc_validate_all_memory(void); + +#endif + /** * Frees the memory space pointed to by the provided pointer. * diff --git a/lib/librte_eal/version.map b/lib/librte_eal/version.map index 354c068f3..bbd683062 100644 --- a/lib/librte_eal/version.map +++ b/lib/librte_eal/version.map @@ -133,6 +133,10 @@ DPDK_21 { rte_mcfg_tailq_read_unlock; rte_mcfg_tailq_write_lock; rte_mcfg_tailq_write_unlock; + rte_mcfg_malloc_debug_read_lock; + rte_mcfg_malloc_debug_read_unlock; + rte_mcfg_malloc_debug_write_lock; + rte_mcfg_malloc_debug_write_unlock; rte_mem_lock_page; rte_mem_virt2iova; rte_mem_virt2phy; @@ -217,6 +221,14 @@ DPDK_21 { rte_vlog; rte_zmalloc; rte_zmalloc_socket; + rte_malloc_debug; + rte_zmalloc_debug; + rte_calloc_debug; + rte_realloc_debug; + rte_malloc_socket_debug; + rte_zmalloc_socket_debug; + rte_calloc_socket_debug; + rte_malloc_validate_all_memory; local: *; }; @@ -319,6 +331,7 @@ EXPERIMENTAL { rte_fbarray_find_rev_biggest_used; rte_intr_callback_unregister_pending; rte_realloc_socket; + rte_realloc_socket_debug; # added in 19.08 rte_intr_ack; -- 2.17.1