patches for DPDK stable branches
From: Kevin Traynor <ktraynor@redhat.com>
To: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Vipin Varghese <vipin.varghese@intel.com>, dpdk stable <stable@dpdk.org>
Subject: [dpdk-stable] patch 'eal: clean up unused files on initialization' has been queued to LTS release 18.11.1
Date: Fri,  4 Jan 2019 13:23:51 +0000
Message-ID: <20190104132455.15170-9-ktraynor@redhat.com>
In-Reply-To: <20190104132455.15170-1-ktraynor@redhat.com>

Hi,

FYI, your patch has been queued to LTS release 18.11.1

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 01/11/19 (11 January
2019). So please shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if any rebasing was needed
to apply it to the stable branch. If there were code changes for rebasing
(i.e. not only metadata diffs), please double-check that the rebase was
done correctly.

Thanks.

Kevin Traynor

---
From 8c95205c36c6872e2a96a70bd0044d91cbe1792a Mon Sep 17 00:00:00 2001
From: Anatoly Burakov <anatoly.burakov@intel.com>
Date: Tue, 13 Nov 2018 15:54:44 +0000
Subject: [PATCH] eal: clean up unused files on initialization

[ upstream commit 0a529578f162df8b16e4eb7423e55570f3d13c97 ]

When creating process data structures, EAL creates many files in
the EAL runtime directory. Because we allow multiple secondary
processes to run, each secondary process gets its own unique
files. With many secondary processes starting and exiting on the
system, the runtime directory will, over time, accumulate large
numbers of sockets, fbarray files and other files that sit there
unused because the process that created them died long ago. This
may lead to exhaustion of disk (or RAM) space in the runtime
directory.

Fix this by removing every unlocked file at initialization that
matches either socket or fbarray naming convention. We cannot be
sure of any other files, so we'll leave them alone. Also, remove
similar code from mp socket code.
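
As a rough illustration of the name filter described above (a sketch,
not the patch itself): only the two glob patterns are taken from the
patch. The helper name matches_cleanup_filter() and the sample file
names in the note below are made up for this example.

```c
#include <fnmatch.h>
#include <stdbool.h>
#include <stddef.h>

/* Patterns taken from the patch; everything else here is illustrative. */
static const char * const cleanup_filters[] = {
	"fbarray_*",
	"mp_socket_*",
};

/* Hypothetical helper: true if `name` is a candidate for cleanup. */
static bool
matches_cleanup_filter(const char *name)
{
	size_t n = sizeof(cleanup_filters) / sizeof(cleanup_filters[0]);
	size_t i;

	for (i = 0; i < n; i++) {
		/* fnmatch() returns 0 when the name matches the glob pattern */
		if (fnmatch(cleanup_filters[i], name, 0) == 0)
			return true;
	}
	return false;
}
```

With these patterns, a name such as "fbarray_example" or
"mp_socket_1234" is a candidate for removal, while "config" or a bare
"mp_socket" (no trailing underscore) is left alone.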

We do it at the end of init, rather than at the beginning, because
secondary process will use primary process' data structures even
if the primary itself has died, and we don't want to remove those
before we lock them.
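
The lock probe itself can be sketched in isolation. This is a minimal
example under stated assumptions, not the code from the patch: the
helper name remove_unlocked() is hypothetical, it takes a single
pattern, it only exercises regular files, and it does not take the
directory-wide lock that the patch uses to avoid races. It relies on
flock() locks belonging to the open file description, so a second
open() of a file that some process still holds locked fails the
non-blocking probe.

```c
#define _GNU_SOURCE
#include <dirent.h>
#include <fcntl.h>
#include <fnmatch.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DPDK): remove every file in `path`
 * whose name matches `pattern` and on which nobody holds a flock().
 * Returns the number of files removed, or -1 if the directory cannot
 * be opened.
 */
static int
remove_unlocked(const char *path, const char *pattern)
{
	DIR *dir = opendir(path);
	struct dirent *ent;
	int dir_fd, fd;
	int removed = 0;

	if (dir == NULL)
		return -1;
	dir_fd = dirfd(dir);

	while ((ent = readdir(dir)) != NULL) {
		/* leave alone anything that does not match the pattern */
		if (fnmatch(pattern, ent->d_name, 0) != 0)
			continue;

		fd = openat(dir_fd, ent->d_name, O_RDONLY);
		if (fd == -1)
			continue;

		/* non-blocking probe: fails if another holder has the lock */
		if (flock(fd, LOCK_EX | LOCK_NB) == 0 &&
		    unlinkat(dir_fd, ent->d_name, 0) == 0)
			removed++;

		/* closing the descriptor also drops our probe lock */
		close(fd);
	}

	closedir(dir);
	return removed;
}
```

A file whose owner has already taken its lock survives the sweep,
which is why the ordering matters: the sweep has to run after the
current process has locked everything it intends to keep using.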

Bugzilla ID: 106

Reported-by: Vipin Varghese <vipin.varghese@intel.com>
Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/bsdapp/eal/eal.c         | 100 ++++++++++++++++++++++++
 lib/librte_eal/common/eal_common_proc.c |  30 -------
 lib/librte_eal/common/eal_filesystem.h  |   3 +
 lib/librte_eal/linuxapp/eal/eal.c       |  99 +++++++++++++++++++++++
 4 files changed, 202 insertions(+), 30 deletions(-)

diff --git a/lib/librte_eal/bsdapp/eal/eal.c b/lib/librte_eal/bsdapp/eal/eal.c
index b8152a75c..41ddb5a22 100644
--- a/lib/librte_eal/bsdapp/eal/eal.c
+++ b/lib/librte_eal/bsdapp/eal/eal.c
@@ -4,4 +4,6 @@
  */
 
+#include <dirent.h>
+#include <fnmatch.h>
 #include <stdio.h>
 #include <stdlib.h>
@@ -142,4 +144,90 @@ eal_create_runtime_dir(void)
 }
 
+int
+eal_clean_runtime_dir(void)
+{
+	DIR *dir;
+	struct dirent *dirent;
+	int dir_fd, fd, lck_result;
+	static const char * const filters[] = {
+		"fbarray_*",
+		"mp_socket_*"
+	};
+
+	/* open directory */
+	dir = opendir(runtime_dir);
+	if (!dir) {
+		RTE_LOG(ERR, EAL, "Unable to open runtime directory %s\n",
+				runtime_dir);
+		goto error;
+	}
+	dir_fd = dirfd(dir);
+
+	/* lock the directory before doing anything, to avoid races */
+	if (flock(dir_fd, LOCK_EX) < 0) {
+		RTE_LOG(ERR, EAL, "Unable to lock runtime directory %s\n",
+			runtime_dir);
+		goto error;
+	}
+
+	dirent = readdir(dir);
+	if (!dirent) {
+		RTE_LOG(ERR, EAL, "Unable to read runtime directory %s\n",
+				runtime_dir);
+		goto error;
+	}
+
+	while (dirent != NULL) {
+		unsigned int f_idx;
+		bool skip = true;
+
+		/* skip files that don't match the patterns */
+		for (f_idx = 0; f_idx < RTE_DIM(filters); f_idx++) {
+			const char *filter = filters[f_idx];
+
+			if (fnmatch(filter, dirent->d_name, 0) == 0) {
+				skip = false;
+				break;
+			}
+		}
+		if (skip) {
+			dirent = readdir(dir);
+			continue;
+		}
+
+		/* try and lock the file */
+		fd = openat(dir_fd, dirent->d_name, O_RDONLY);
+
+		/* skip to next file */
+		if (fd == -1) {
+			dirent = readdir(dir);
+			continue;
+		}
+
+		/* non-blocking lock */
+		lck_result = flock(fd, LOCK_EX | LOCK_NB);
+
+		/* if lock succeeds, remove the file */
+		if (lck_result != -1)
+			unlinkat(dir_fd, dirent->d_name, 0);
+		close(fd);
+		dirent = readdir(dir);
+	}
+
+	/* closedir closes dir_fd and drops the lock */
+	closedir(dir);
+	return 0;
+
+error:
+	if (dir)
+		closedir(dir);
+
+	RTE_LOG(ERR, EAL, "Error while clearing runtime dir: %s\n",
+		strerror(errno));
+
+	return -1;
+}
+
+
 const char *
 rte_eal_get_runtime_dir(void)
@@ -808,4 +896,16 @@ rte_eal_init(int argc, char **argv)
 	}
 
+	/*
+	 * Clean up unused files in runtime directory. We do this at the end of
+	 * init and not at the beginning because we want to clean stuff up
+	 * whether we are primary or secondary process, but we cannot remove
+	 * primary process' files because secondary should be able to run even
+	 * if primary process is dead.
+	 */
+	if (eal_clean_runtime_dir() < 0) {
+		rte_eal_init_alert("Cannot clear runtime directory\n");
+		return -1;
+	}
+
 	rte_eal_mcfg_complete();
 
diff --git a/lib/librte_eal/common/eal_common_proc.c b/lib/librte_eal/common/eal_common_proc.c
index 1c3f09aad..6b876590a 100644
--- a/lib/librte_eal/common/eal_common_proc.c
+++ b/lib/librte_eal/common/eal_common_proc.c
@@ -543,27 +543,4 @@ open_socket_fd(void)
 }
 
-static int
-unlink_sockets(const char *filter)
-{
-	int dir_fd;
-	DIR *mp_dir;
-	struct dirent *ent;
-
-	mp_dir = opendir(mp_dir_path);
-	if (!mp_dir) {
-		RTE_LOG(ERR, EAL, "Unable to open directory %s\n", mp_dir_path);
-		return -1;
-	}
-	dir_fd = dirfd(mp_dir);
-
-	while ((ent = readdir(mp_dir))) {
-		if (fnmatch(filter, ent->d_name, 0) == 0)
-			unlinkat(dir_fd, ent->d_name, 0);
-	}
-
-	closedir(mp_dir);
-	return 0;
-}
-
 int
 rte_mp_channel_init(void)
@@ -604,11 +581,4 @@ rte_mp_channel_init(void)
 	}
 
-	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
-			unlink_sockets(mp_filter)) {
-		RTE_LOG(ERR, EAL, "failed to unlink mp sockets\n");
-		close(dir_fd);
-		return -1;
-	}
-
 	if (open_socket_fd() < 0) {
 		close(dir_fd);
diff --git a/lib/librte_eal/common/eal_filesystem.h b/lib/librte_eal/common/eal_filesystem.h
index 6e0331fdb..64a028db7 100644
--- a/lib/librte_eal/common/eal_filesystem.h
+++ b/lib/librte_eal/common/eal_filesystem.h
@@ -26,4 +26,7 @@ int
 eal_create_runtime_dir(void);
 
+int
+eal_clean_runtime_dir(void);
+
 #define RUNTIME_CONFIG_FNAME "config"
 static inline const char *
diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
index 361744d40..d252c8591 100644
--- a/lib/librte_eal/linuxapp/eal/eal.c
+++ b/lib/librte_eal/linuxapp/eal/eal.c
@@ -14,5 +14,7 @@
 #include <getopt.h>
 #include <sys/file.h>
+#include <dirent.h>
 #include <fcntl.h>
+#include <fnmatch.h>
 #include <stddef.h>
 #include <errno.h>
@@ -150,4 +152,89 @@ eal_create_runtime_dir(void)
 }
 
+int
+eal_clean_runtime_dir(void)
+{
+	DIR *dir;
+	struct dirent *dirent;
+	int dir_fd, fd, lck_result;
+	static const char * const filters[] = {
+		"fbarray_*",
+		"mp_socket_*"
+	};
+
+	/* open directory */
+	dir = opendir(runtime_dir);
+	if (!dir) {
+		RTE_LOG(ERR, EAL, "Unable to open runtime directory %s\n",
+				runtime_dir);
+		goto error;
+	}
+	dir_fd = dirfd(dir);
+
+	/* lock the directory before doing anything, to avoid races */
+	if (flock(dir_fd, LOCK_EX) < 0) {
+		RTE_LOG(ERR, EAL, "Unable to lock runtime directory %s\n",
+			runtime_dir);
+		goto error;
+	}
+
+	dirent = readdir(dir);
+	if (!dirent) {
+		RTE_LOG(ERR, EAL, "Unable to read runtime directory %s\n",
+				runtime_dir);
+		goto error;
+	}
+
+	while (dirent != NULL) {
+		unsigned int f_idx;
+		bool skip = true;
+
+		/* skip files that don't match the patterns */
+		for (f_idx = 0; f_idx < RTE_DIM(filters); f_idx++) {
+			const char *filter = filters[f_idx];
+
+			if (fnmatch(filter, dirent->d_name, 0) == 0) {
+				skip = false;
+				break;
+			}
+		}
+		if (skip) {
+			dirent = readdir(dir);
+			continue;
+		}
+
+		/* try and lock the file */
+		fd = openat(dir_fd, dirent->d_name, O_RDONLY);
+
+		/* skip to next file */
+		if (fd == -1) {
+			dirent = readdir(dir);
+			continue;
+		}
+
+		/* non-blocking lock */
+		lck_result = flock(fd, LOCK_EX | LOCK_NB);
+
+		/* if lock succeeds, remove the file */
+		if (lck_result != -1)
+			unlinkat(dir_fd, dirent->d_name, 0);
+		close(fd);
+		dirent = readdir(dir);
+	}
+
+	/* closedir closes dir_fd and drops the lock */
+	closedir(dir);
+	return 0;
+
+error:
+	if (dir)
+		closedir(dir);
+
+	RTE_LOG(ERR, EAL, "Error while clearing runtime dir: %s\n",
+		strerror(errno));
+
+	return -1;
+}
+
 const char *
 rte_eal_get_runtime_dir(void)
@@ -1097,4 +1184,16 @@ rte_eal_init(int argc, char **argv)
 	}
 
+	/*
+	 * Clean up unused files in runtime directory. We do this at the end of
+	 * init and not at the beginning because we want to clean stuff up
+	 * whether we are primary or secondary process, but we cannot remove
+	 * primary process' files because secondary should be able to run even
+	 * if primary process is dead.
+	 */
+	if (eal_clean_runtime_dir() < 0) {
+		rte_eal_init_alert("Cannot clear runtime directory\n");
+		return -1;
+	}
+
 	rte_eal_mcfg_complete();
 
-- 
2.19.0

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2019-01-04 13:23:07.672255523 +0000
+++ 0009-eal-clean-up-unused-files-on-initialization.patch	2019-01-04 13:23:07.000000000 +0000
@@ -1,8 +1,10 @@
-From 0a529578f162df8b16e4eb7423e55570f3d13c97 Mon Sep 17 00:00:00 2001
+From 8c95205c36c6872e2a96a70bd0044d91cbe1792a Mon Sep 17 00:00:00 2001
 From: Anatoly Burakov <anatoly.burakov@intel.com>
 Date: Tue, 13 Nov 2018 15:54:44 +0000
 Subject: [PATCH] eal: clean up unused files on initialization
 
+[ upstream commit 0a529578f162df8b16e4eb7423e55570f3d13c97 ]
+
 When creating process data structures, EAL will create many files
 in EAL runtime directory. Because we allow multiple secondary
 processes to run, each secondary process gets their own unique
@@ -24,7 +26,6 @@
 before we lock them.
 
 Bugzilla ID: 106
-Cc: stable@dpdk.org
 
 Reported-by: Vipin Varghese <vipin.varghese@intel.com>
 Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
