From mboxrd@z Thu Jan 1 00:00:00 1970
From: Dmitry Kozlyuk
To:
CC: Anatoly Burakov
Subject: [PATCH v1 3/6] mem: add dirty malloc element support
Date: Mon, 17 Jan 2022 10:07:58 +0200
Message-ID: <20220117080801.481568-4-dkozlyuk@nvidia.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20220117080801.481568-1-dkozlyuk@nvidia.com>
References: <20211230143744.3550098-1-dkozlyuk@nvidia.com>
 <20220117080801.481568-1-dkozlyuk@nvidia.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Content-Type: text/plain
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org

The EAL malloc layer assumed that the content of all free elements
is filled with zeros ("clean"), as opposed to uninitialized ("dirty").
This assumption was ensured in two ways:

1. The EAL memalloc layer always returned clean memory.
2. Freed memory was cleared before being returned to the heap.

Clearing the memory can be as slow as around 14 GiB/s.
To avoid this cost, the memalloc layer is now allowed to return
dirty memory. Such segments are marked with RTE_MEMSEG_FLAG_DIRTY.
The allocator tracks elements that contain dirty memory
using a new flag in the element header.
When clean memory is requested via rte_zmalloc*()
and the suitable element is dirty, it is cleared on allocation.
When memory is deallocated, the freed element is joined
with adjacent free elements, and the dirty flag is updated:

    dirty + freed + dirty = dirty  =>  no need to clean
            freed + dirty = dirty      the freed memory

    clean + freed + clean = clean  =>  freed memory
    clean + freed         = clean      must be cleared
            freed + clean = clean
            freed         = clean

As a result, memory is either cleared on free, as before,
or it will be cleared on allocation if need be, but never twice.

Signed-off-by: Dmitry Kozlyuk
---
 lib/eal/common/malloc_elem.c | 22 +++++++++++++++++++---
 lib/eal/common/malloc_elem.h | 11 +++++++++--
 lib/eal/common/malloc_heap.c | 18 ++++++++++++------
 lib/eal/common/rte_malloc.c  | 21 ++++++++++++++-------
 lib/eal/include/rte_memory.h |  8 ++++++--
 5 files changed, 60 insertions(+), 20 deletions(-)

diff --git a/lib/eal/common/malloc_elem.c b/lib/eal/common/malloc_elem.c
index bdd20a162e..e04e0890fb 100644
--- a/lib/eal/common/malloc_elem.c
+++ b/lib/eal/common/malloc_elem.c
@@ -129,7 +129,7 @@ malloc_elem_find_max_iova_contig(struct malloc_elem *elem, size_t align)
 void
 malloc_elem_init(struct malloc_elem *elem, struct malloc_heap *heap,
 		struct rte_memseg_list *msl, size_t size,
-		struct malloc_elem *orig_elem, size_t orig_size)
+		struct malloc_elem *orig_elem, size_t orig_size, bool dirty)
 {
 	elem->heap = heap;
 	elem->msl = msl;
@@ -137,6 +137,7 @@ malloc_elem_init(struct malloc_elem *elem, struct malloc_heap *heap,
 	elem->next = NULL;
 	memset(&elem->free_list, 0, sizeof(elem->free_list));
 	elem->state = ELEM_FREE;
+	elem->dirty = dirty;
 	elem->size = size;
 	elem->pad = 0;
 	elem->orig_elem = orig_elem;
@@ -300,7 +301,7 @@ split_elem(struct malloc_elem *elem, struct malloc_elem *split_pt)
 	const size_t new_elem_size = elem->size - old_elem_size;
 
 	malloc_elem_init(split_pt, elem->heap, elem->msl, new_elem_size,
-			 elem->orig_elem, elem->orig_size);
+			 elem->orig_elem, elem->orig_size, elem->dirty);
 	split_pt->prev = elem;
 	split_pt->next = next_elem;
 	if (next_elem)
@@ -506,6 +507,7 @@ join_elem(struct malloc_elem *elem1, struct malloc_elem *elem2)
 	else
 		elem1->heap->last = elem1;
 	elem1->next = next;
+	elem1->dirty |= elem2->dirty;
 	if (elem1->pad) {
 		struct malloc_elem *inner = RTE_PTR_ADD(elem1, elem1->pad);
 		inner->size = elem1->size - elem1->pad;
@@ -579,6 +581,14 @@ malloc_elem_free(struct malloc_elem *elem)
 	ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
 	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
 
+	/*
+	 * Consider the element clean for the purposes of joining.
+	 * If both neighbors are clean or non-existent,
+	 * the joint element will be clean,
+	 * which means the memory should be cleared.
+	 * There is no need to clear the memory if the joint element is dirty.
+	 */
+	elem->dirty = false;
 	elem = malloc_elem_join_adjacent_free(elem);
 
 	malloc_elem_free_list_insert(elem);
@@ -588,8 +598,14 @@ malloc_elem_free(struct malloc_elem *elem)
 	/* decrease heap's count of allocated elements */
 	elem->heap->alloc_count--;
 
-	/* poison memory */
+#ifndef RTE_MALLOC_DEBUG
+	/* Normally clear the memory when needed. */
+	if (!elem->dirty)
+		memset(ptr, 0, data_len);
+#else
+	/* Always poison the memory in debug mode. */
 	memset(ptr, MALLOC_POISON, data_len);
+#endif
 
 	return elem;
 }
diff --git a/lib/eal/common/malloc_elem.h b/lib/eal/common/malloc_elem.h
index 15d8ba7af2..f2aa98821b 100644
--- a/lib/eal/common/malloc_elem.h
+++ b/lib/eal/common/malloc_elem.h
@@ -27,7 +27,13 @@ struct malloc_elem {
 	LIST_ENTRY(malloc_elem) free_list;
 	/**< list of free elements in heap */
 	struct rte_memseg_list *msl;
-	volatile enum elem_state state;
+	/** Element state, @c dirty and @c pad validity depends on it. */
+	/* An extra bit is needed to represent enum elem_state as signed int. */
+	enum elem_state state : 3;
+	/** If state == ELEM_FREE: the memory is not filled with zeroes. */
+	uint32_t dirty : 1;
+	/** Reserved for future use. */
+	uint32_t reserved : 28;
 	uint32_t pad;
 	size_t size;
 	struct malloc_elem *orig_elem;
@@ -320,7 +326,8 @@ malloc_elem_init(struct malloc_elem *elem,
 		struct rte_memseg_list *msl,
 		size_t size,
 		struct malloc_elem *orig_elem,
-		size_t orig_size);
+		size_t orig_size,
+		bool dirty);
 
 void
 malloc_elem_insert(struct malloc_elem *elem);
diff --git a/lib/eal/common/malloc_heap.c b/lib/eal/common/malloc_heap.c
index 55aad2711b..24080fc473 100644
--- a/lib/eal/common/malloc_heap.c
+++ b/lib/eal/common/malloc_heap.c
@@ -93,11 +93,11 @@ malloc_socket_to_heap_id(unsigned int socket_id)
  */
 static struct malloc_elem *
 malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list *msl,
-		void *start, size_t len)
+		void *start, size_t len, bool dirty)
 {
 	struct malloc_elem *elem = start;
 
-	malloc_elem_init(elem, heap, msl, len, elem, len);
+	malloc_elem_init(elem, heap, msl, len, elem, len, dirty);
 
 	malloc_elem_insert(elem);
 
@@ -135,7 +135,8 @@ malloc_add_seg(const struct rte_memseg_list *msl,
 
 	found_msl = &mcfg->memsegs[msl_idx];
 
-	malloc_heap_add_memory(heap, found_msl, ms->addr, len);
+	malloc_heap_add_memory(heap, found_msl, ms->addr, len,
+			ms->flags & RTE_MEMSEG_FLAG_DIRTY);
 
 	heap->total_size += len;
 
@@ -303,7 +304,8 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 	struct rte_memseg_list *msl;
 	struct malloc_elem *elem = NULL;
 	size_t alloc_sz;
-	int allocd_pages;
+	int allocd_pages, i;
+	bool dirty = false;
 	void *ret, *map_addr;
 
 	alloc_sz = (size_t)pg_sz * n_segs;
@@ -372,8 +374,12 @@ alloc_pages_on_heap(struct malloc_heap *heap, uint64_t pg_sz, size_t elt_size,
 		goto fail;
 	}
 
+	/* Element is dirty if it contains at least one dirty page. */
+	for (i = 0; i < allocd_pages; i++)
+		dirty |= ms[i]->flags & RTE_MEMSEG_FLAG_DIRTY;
+
 	/* add newly minted memsegs to malloc heap */
-	elem = malloc_heap_add_memory(heap, msl, map_addr, alloc_sz);
+	elem = malloc_heap_add_memory(heap, msl, map_addr, alloc_sz, dirty);
 
 	/* try once more, as now we have allocated new memory */
 	ret = find_suitable_element(heap, elt_size, flags, align, bound,
@@ -1260,7 +1266,7 @@ malloc_heap_add_external_memory(struct malloc_heap *heap,
 	memset(msl->base_va, 0, msl->len);
 
 	/* now, add newly minted memory to the malloc heap */
-	malloc_heap_add_memory(heap, msl, msl->base_va, msl->len);
+	malloc_heap_add_memory(heap, msl, msl->base_va, msl->len, false);
 
 	heap->total_size += msl->len;
 
diff --git a/lib/eal/common/rte_malloc.c b/lib/eal/common/rte_malloc.c
index d0bec26920..71a3f7ecb4 100644
--- a/lib/eal/common/rte_malloc.c
+++ b/lib/eal/common/rte_malloc.c
@@ -115,15 +115,22 @@ rte_zmalloc_socket(const char *type, size_t size, unsigned align, int socket)
 {
 	void *ptr = rte_malloc_socket(type, size, align, socket);
 
+	if (ptr != NULL) {
+		struct malloc_elem *elem = malloc_elem_from_data(ptr);
+
+		if (elem->dirty) {
+			memset(ptr, 0, size);
+		} else {
 #ifdef RTE_MALLOC_DEBUG
-	/*
-	 * If DEBUG is enabled, then freed memory is marked with poison
-	 * value and set to zero on allocation.
-	 * If DEBUG is not enabled then memory is already zeroed.
-	 */
-	if (ptr != NULL)
-		memset(ptr, 0, size);
+			/*
+			 * If DEBUG is enabled, then freed memory is marked
+			 * with a poison value and set to zero on allocation.
+			 * If DEBUG is disabled then memory is already zeroed.
+			 */
+			memset(ptr, 0, size);
 #endif
+		}
+	}
 
 	rte_eal_trace_mem_zmalloc(type, size, align, socket, ptr);
 	return ptr;
diff --git a/lib/eal/include/rte_memory.h b/lib/eal/include/rte_memory.h
index 6d018629ae..68b069fd04 100644
--- a/lib/eal/include/rte_memory.h
+++ b/lib/eal/include/rte_memory.h
@@ -19,6 +19,7 @@
 extern "C" {
 #endif
 
+#include
 #include
 #include
 #include
@@ -37,11 +38,14 @@ extern "C" {
 
 #define SOCKET_ID_ANY -1                    /**< Any NUMA socket. */
 
+/** Prevent this segment from being freed back to the OS. */
+#define RTE_MEMSEG_FLAG_DO_NOT_FREE RTE_BIT32(0)
+/** This segment is not filled with zeros. */
+#define RTE_MEMSEG_FLAG_DIRTY RTE_BIT32(1)
+
 /**
  * Physical memory segment descriptor.
  */
-#define RTE_MEMSEG_FLAG_DO_NOT_FREE (1 << 0)
-/**< Prevent this segment from being freed back to the OS. */
 struct rte_memseg {
 	rte_iova_t iova;            /**< Start IO address. */
 	RTE_STD_C11
-- 
2.25.1
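
Illustration (not part of the patch): the dirty-flag rules from the
commit message can be modelled outside of DPDK. The following is a
minimal sketch under stated assumptions: struct toy_neighbor,
toy_free_must_clear(), toy_zalloc_fixup() and toy_check_table() are
hypothetical names invented for this example and do not exist in EAL.
It only mirrors the logic visible in the diff (the freed element is
considered clean, the neighbors' flags are OR-ed in on join, and
a zeroing allocation clears only dirty elements).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a neighboring element; not the EAL struct. */
struct toy_neighbor {
	bool is_free; /* false: neighbor is absent or allocated, no join */
	bool dirty;   /* meaningful only when is_free is true */
};

/*
 * Free path: the freed element starts out clean, then the dirty flags
 * of the free neighbors it is joined with are OR-ed in.
 * Returns true when the freed data has to be cleared immediately.
 */
static bool
toy_free_must_clear(struct toy_neighbor prev, struct toy_neighbor next)
{
	bool joint_dirty = false;

	if (prev.is_free)
		joint_dirty = joint_dirty || prev.dirty;
	if (next.is_free)
		joint_dirty = joint_dirty || next.dirty;
	return !joint_dirty;
}

/* Zeroing allocation path: only a dirty element needs clearing. */
static void
toy_zalloc_fixup(bool elem_dirty, void *data, size_t size)
{
	assert(data != NULL);
	if (elem_dirty)
		memset(data, 0, size);
}

/* Walks the rows of the commit message table; returns 0 if all match. */
static int
toy_check_table(void)
{
	const struct toy_neighbor none = { false, false };
	const struct toy_neighbor clean = { true, false };
	const struct toy_neighbor dirty = { true, true };
	unsigned char buf[4] = { 0xAA, 0xAA, 0xAA, 0xAA };

	/* Joint element is dirty: no need to clean the freed memory. */
	if (toy_free_must_clear(dirty, dirty))
		return 1;
	if (toy_free_must_clear(none, dirty))
		return 2;
	/* Joint element is clean: freed memory must be cleared. */
	if (!toy_free_must_clear(clean, clean))
		return 3;
	if (!toy_free_must_clear(clean, none))
		return 4;
	if (!toy_free_must_clear(none, clean))
		return 5;
	if (!toy_free_must_clear(none, none))
		return 6;
	/* A dirty element is cleared when zeroed memory is requested. */
	toy_zalloc_fixup(true, buf, sizeof(buf));
	if (buf[0] != 0 || buf[3] != 0)
		return 7;
	return 0;
}
```

Calling toy_check_table() from a test program's main() should return 0.
A mixed case such as dirty + freed + clean is not listed in the table,
but it follows from the OR in join_elem(): the joint element is dirty,
so the model reports that no clearing is needed on free.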