DPDK patches and discussions
 help / color / mirror / Atom feed
From: Anatoly Burakov <anatoly.burakov@intel.com>
To: dev@dpdk.org
Cc: Bruce Richardson <bruce.richardson@intel.com>,
	Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>,
	Narcisa Ana Maria Vasile <navasile@linux.microsoft.com>,
	Dmitry Malloy <dmitrym@microsoft.com>,
	Pallavi Kadam <pallavi.kadam@intel.com>,
	stephen@networkplumber.org
Subject: [dpdk-dev] [PATCH v2] eal: detach memsegs on cleanup
Date: Mon, 14 Sep 2020 12:26:29 +0100
Message-ID: <bb71e5eff4af6c225cb638ad8d4acb12956cc5b0.1600082462.git.anatoly.burakov@intel.com> (raw)
In-Reply-To: <cab1b8bcc35a164d6683118985412d937aec137b.1600081980.git.anatoly.burakov@intel.com>

Currently, we don't detach the shared memory on EAL cleanup, which
leaves the page table descriptors still holding on to the file
descriptors as well as memory space occupied by them. Fix it by adding
another detach stage that closes the internal memory allocator resource
references, detaches shared fbarrays and unmaps the shared mem config.

Bugzilla ID: 380

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---

Notes:
    v2:
    - Fixed checkpatch warnings
    
    Not backporting to stable because this fix isn't critical but is rather
    "nice to have".

 lib/librte_eal/common/eal_common_memory.c | 54 +++++++++++++++
 lib/librte_eal/common/eal_memalloc.h      |  3 +
 lib/librte_eal/common/eal_private.h       |  7 ++
 lib/librte_eal/freebsd/eal.c              |  2 +
 lib/librte_eal/freebsd/eal_memalloc.c     |  5 ++
 lib/librte_eal/linux/eal.c                |  2 +
 lib/librte_eal/linux/eal_memalloc.c       | 84 +++++++++++++++++++++++
 lib/librte_eal/windows/eal.c              |  3 +-
 lib/librte_eal/windows/eal_memalloc.c     |  7 ++
 9 files changed, 166 insertions(+), 1 deletion(-)

diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index 33917fa835..ca14fdd96e 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -1002,6 +1002,60 @@ rte_extmem_detach(void *va_addr, size_t len)
 	return sync_memory(va_addr, len, false);
 }
 
+/* detach all EAL memory */
+int
+rte_eal_memory_detach(void)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	size_t page_sz = rte_mem_page_size();
+	unsigned int i;
+
+	rte_rwlock_write_lock(&mcfg->memory_hotplug_lock);
+
+	/* detach internal memory subsystem data first */
+	if (eal_memalloc_cleanup())
+		RTE_LOG(ERR, EAL, "Could not release memory subsystem data\n");
+
+	for (i = 0; i < RTE_DIM(mcfg->memsegs); i++) {
+		struct rte_memseg_list *msl = &mcfg->memsegs[i];
+
+		/* skip uninitialized segments */
+		if (msl->base_va == NULL)
+			continue;
+		/*
+		 * external segments are supposed to be detached at this point,
+		 * but if they aren't, we can't really do anything about it,
+		 * because if we skip them here, they'll become invalid after
+		 * we unmap the memconfig anyway. however, if this is externally
+		 * referenced memory, we have no business unmapping it.
+		 */
+		if (!msl->external)
+			if (munmap(msl->base_va, msl->len) != 0)
+				RTE_LOG(ERR, EAL, "Could not unmap memory: %s\n",
+						strerror(errno));
+
+		/*
+		 * we are detaching the fbarray rather than destroying because
+		 * other processes might still reference this fbarray, and we
+		 * have no way of knowing if they still do.
+		 */
+		if (rte_fbarray_detach(&msl->memseg_arr))
+			RTE_LOG(ERR, EAL, "Could not detach fbarray: %s\n",
+					rte_strerror(rte_errno));
+	}
+	rte_rwlock_write_unlock(&mcfg->memory_hotplug_lock);
+
+	/*
+	 * we've detached the memseg lists, so we can unmap the shared mem
+	 * config - we can't zero it out because it might still be referenced
+	 * by other processes.
+	 */
+	munmap(mcfg, RTE_ALIGN(sizeof(*mcfg), page_sz));
+	rte_eal_get_configuration()->mem_config = NULL;
+
+	return 0;
+}
+
 /* init memory subsystem */
 int
 rte_eal_memory_init(void)
diff --git a/lib/librte_eal/common/eal_memalloc.h b/lib/librte_eal/common/eal_memalloc.h
index e953cd84e6..ebc3a6f6c1 100644
--- a/lib/librte_eal/common/eal_memalloc.h
+++ b/lib/librte_eal/common/eal_memalloc.h
@@ -93,4 +93,7 @@ eal_memalloc_get_seg_fd_offset(int list_idx, int seg_idx, size_t *offset);
 int
 eal_memalloc_init(void);
 
+int
+eal_memalloc_cleanup(void);
+
 #endif /* EAL_MEMALLOC_H */
diff --git a/lib/librte_eal/common/eal_private.h b/lib/librte_eal/common/eal_private.h
index a6a6381567..84c4621768 100644
--- a/lib/librte_eal/common/eal_private.h
+++ b/lib/librte_eal/common/eal_private.h
@@ -433,6 +433,13 @@ int rte_eal_hugepage_init(void);
  */
 int rte_eal_hugepage_attach(void);
 
+/**
+ * Detaches all memory mappings from a process.
+ *
+ * This function is private to the EAL.
+ */
+int rte_eal_memory_detach(void);
+
 /**
  * Find a bus capable of identifying a device.
  *
diff --git a/lib/librte_eal/freebsd/eal.c b/lib/librte_eal/freebsd/eal.c
index 798add0b59..361593a005 100644
--- a/lib/librte_eal/freebsd/eal.c
+++ b/lib/librte_eal/freebsd/eal.c
@@ -963,6 +963,8 @@ rte_eal_cleanup(void)
 		eal_get_internal_configuration();
 	rte_service_finalize();
 	rte_mp_channel_cleanup();
+	/* after this point, any DPDK pointers will become dangling */
+	rte_eal_memory_detach();
 	rte_trace_save();
 	eal_trace_fini();
 	eal_cleanup_config(internal_conf);
diff --git a/lib/librte_eal/freebsd/eal_memalloc.c b/lib/librte_eal/freebsd/eal_memalloc.c
index 6893448db7..00ab02cb63 100644
--- a/lib/librte_eal/freebsd/eal_memalloc.c
+++ b/lib/librte_eal/freebsd/eal_memalloc.c
@@ -74,6 +74,11 @@ eal_memalloc_get_seg_fd_offset(int list_idx __rte_unused,
 	return -ENOTSUP;
 }
 
+int eal_memalloc_cleanup(void)
+{
+	return 0;
+}
+
 int
 eal_memalloc_init(void)
 {
diff --git a/lib/librte_eal/linux/eal.c b/lib/librte_eal/linux/eal.c
index 0960f01d05..eb69f4aedf 100644
--- a/lib/librte_eal/linux/eal.c
+++ b/lib/librte_eal/linux/eal.c
@@ -1359,6 +1359,8 @@ rte_eal_cleanup(void)
 		rte_memseg_walk(mark_freeable, NULL);
 	rte_service_finalize();
 	rte_mp_channel_cleanup();
+	/* after this point, any DPDK pointers will become dangling */
+	rte_eal_memory_detach();
 	rte_trace_save();
 	eal_trace_fini();
 	eal_cleanup_config(internal_conf);
diff --git a/lib/librte_eal/linux/eal_memalloc.c b/lib/librte_eal/linux/eal_memalloc.c
index db60e79975..351cbdec6b 100644
--- a/lib/librte_eal/linux/eal_memalloc.c
+++ b/lib/librte_eal/linux/eal_memalloc.c
@@ -1418,6 +1418,31 @@ secondary_msl_create_walk(const struct rte_memseg_list *msl,
 	return 0;
 }
 
+static int
+secondary_msl_destroy_walk(const struct rte_memseg_list *msl,
+		void *arg __rte_unused)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	struct rte_memseg_list *local_msl;
+	int msl_idx, ret;
+
+	if (msl->external)
+		return 0;
+
+	msl_idx = msl - mcfg->memsegs;
+	local_msl = &local_memsegs[msl_idx];
+
+	ret = rte_fbarray_destroy(&local_msl->memseg_arr);
+	if (ret < 0) {
+		RTE_LOG(ERR, EAL, "Cannot destroy local memory map\n");
+		return -1;
+	}
+	local_msl->base_va = NULL;
+	local_msl->len = 0;
+
+	return 0;
+}
+
 static int
 alloc_list(int list_idx, int len)
 {
@@ -1450,6 +1475,34 @@ alloc_list(int list_idx, int len)
 	return 0;
 }
 
+static int
+destroy_list(int list_idx)
+{
+	const struct internal_config *internal_conf =
+			eal_get_internal_configuration();
+
+	/* single-file segments mode does not need fd list */
+	if (!internal_conf->single_file_segments) {
+		int *fds = fd_list[list_idx].fds;
+		int i;
+		/* go through each fd and ensure it's closed */
+		for (i = 0; i < fd_list[list_idx].len; i++) {
+			if (fds[i] >= 0) {
+				close(fds[i]);
+				fds[i] = -1;
+			}
+		}
+		free(fds);
+		fd_list[list_idx].fds = NULL;
+		fd_list[list_idx].len = 0;
+	} else if (fd_list[list_idx].memseg_list_fd >= 0) {
+		close(fd_list[list_idx].memseg_list_fd);
+		fd_list[list_idx].count = 0;
+		fd_list[list_idx].memseg_list_fd = -1;
+	}
+	return 0;
+}
+
 static int
 fd_list_create_walk(const struct rte_memseg_list *msl,
 		void *arg __rte_unused)
@@ -1467,6 +1520,20 @@ fd_list_create_walk(const struct rte_memseg_list *msl,
 	return alloc_list(msl_idx, len);
 }
 
+static int
+fd_list_destroy_walk(const struct rte_memseg_list *msl, void *arg __rte_unused)
+{
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
+	int msl_idx;
+
+	if (msl->external)
+		return 0;
+
+	msl_idx = msl - mcfg->memsegs;
+
+	return destroy_list(msl_idx);
+}
+
 int
 eal_memalloc_set_seg_fd(int list_idx, int seg_idx, int fd)
 {
@@ -1603,6 +1670,23 @@ eal_memalloc_get_seg_fd_offset(int list_idx, int seg_idx, size_t *offset)
 	return 0;
 }
 
+int
+eal_memalloc_cleanup(void)
+{
+	/* close all remaining fd's - these are per-process, so it's safe */
+	if (rte_memseg_list_walk(fd_list_destroy_walk, NULL))
+		return -1;
+
+	/* destroy the shadow page table if we're a secondary process */
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY)
+		return 0;
+
+	if (rte_memseg_list_walk(secondary_msl_destroy_walk, NULL))
+		return -1;
+
+	return 0;
+}
+
 int
 eal_memalloc_init(void)
 {
diff --git a/lib/librte_eal/windows/eal.c b/lib/librte_eal/windows/eal.c
index e50601dd36..d5db007717 100644
--- a/lib/librte_eal/windows/eal.c
+++ b/lib/librte_eal/windows/eal.c
@@ -250,7 +250,8 @@ rte_eal_cleanup(void)
 {
 	struct internal_config *internal_conf =
 		eal_get_internal_configuration();
-
+	/* after this point, any DPDK pointers will become dangling */
+	rte_eal_memory_detach();
 	eal_cleanup_config(internal_conf);
 	return 0;
 }
diff --git a/lib/librte_eal/windows/eal_memalloc.c b/lib/librte_eal/windows/eal_memalloc.c
index d8cae3ebc1..85a9712cea 100644
--- a/lib/librte_eal/windows/eal_memalloc.c
+++ b/lib/librte_eal/windows/eal_memalloc.c
@@ -437,6 +437,13 @@ eal_memalloc_sync_with_primary(void)
 	return -1;
 }
 
+int
+eal_memalloc_cleanup(void)
+{
+	/* not implemented */
+	return 0;
+}
+
 int
 eal_memalloc_init(void)
 {
-- 
2.17.1

  reply	other threads:[~2020-09-14 11:26 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-09-14 11:18 [dpdk-dev] [PATCH] " Anatoly Burakov
2020-09-14 11:26 ` Anatoly Burakov [this message]
2020-09-14 11:45   ` [dpdk-dev] [PATCH v3] " Anatoly Burakov
2020-09-14 13:04     ` [dpdk-dev] [PATCH v4] " Anatoly Burakov
2020-10-15  9:54       ` Burakov, Anatoly
2020-10-20 11:53         ` Thomas Monjalon
2020-11-22 18:16           ` Thomas Monjalon
2020-11-27 12:56             ` Burakov, Anatoly
2020-11-27 13:21               ` Thomas Monjalon
2021-02-11 10:53                 ` Burakov, Anatoly
2021-02-11 16:23       ` Stephen Hemminger
2021-03-03  9:04       ` David Marchand
2021-02-11 13:33   ` [dpdk-dev] [PATCH v2] " Burakov, Anatoly

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=bb71e5eff4af6c225cb638ad8d4acb12956cc5b0.1600082462.git.anatoly.burakov@intel.com \
    --to=anatoly.burakov@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=dmitry.kozliuk@gmail.com \
    --cc=dmitrym@microsoft.com \
    --cc=navasile@linux.microsoft.com \
    --cc=pallavi.kadam@intel.com \
    --cc=stephen@networkplumber.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git