patches for DPDK stable branches
From: Kevin Traynor <ktraynor@redhat.com>
To: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: dpdk stable <stable@dpdk.org>
Subject: [dpdk-stable] patch 'malloc: fix deadlock when reading stats' has been queued to LTS release 18.11.1
Date: Fri,  4 Jan 2019 13:24:25 +0000
Message-ID: <20190104132455.15170-43-ktraynor@redhat.com>
In-Reply-To: <20190104132455.15170-1-ktraynor@redhat.com>

Hi,

FYI, your patch has been queued to LTS release 18.11.1

Note it hasn't been pushed to http://dpdk.org/browse/dpdk-stable yet.
It will be pushed if I get no objections before 01/11/19. So please
shout if anyone has objections.

Also note that after the patch there's a diff of the upstream commit vs the
patch applied to the branch. This will indicate if there was any rebasing
needed to apply to the stable branch. If there were code changes for rebasing
(i.e. not only metadata diffs), please double check that the rebase was
correctly done.

Thanks.

Kevin Traynor

---
From ed54c0902f56a47e1d269824f543852fdac1796f Mon Sep 17 00:00:00 2001
From: Anatoly Burakov <anatoly.burakov@intel.com>
Date: Fri, 21 Dec 2018 12:26:05 +0000
Subject: [PATCH] malloc: fix deadlock when reading stats

[ upstream commit ba731ea1dda3f1a1b7eb4248d5988de746d3ef0a ]

Currently, malloc statistics and external heap creation code
use memory hotplug lock as a way to synchronize accesses to
heaps (as in, locking the hotplug lock to prevent list of heaps
from changing under our feet). At the same time, malloc
statistics code will also lock the heap because it needs to
access heap data and does not want any other thread to allocate
anything from that heap.

In such a scheme, it is possible to enter a deadlock with the
following sequence of events:

thread 1		thread 2
rte_malloc()
			rte_malloc_dump_stats()
take heap lock
			take hotplug lock
failed to allocate,
attempt to take
hotplug lock
			attempt to take heap lock

Neither thread will be able to continue, as both of them are
waiting for the other one to drop the lock. Adding an
additional lock will require an ABI change, so instead of
that, make malloc statistics calls thread-unsafe with
respect to creating/destroying heaps.
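
The lock-order inversion described above can be reproduced outside DPDK.
The following is only a minimal sketch, not the code from this patch: it
uses plain pthread mutexes as hypothetical stand-ins for the heap lock and
the hotplug lock, and the names run_actor() and count_blocked() are made up
for illustration. A barrier forces the interleaving from the table, and
pthread_mutex_trylock() is used in place of a blocking lock so the program
reports the cycle rather than hanging.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two locks named in the commit message. */
static pthread_mutex_t heap_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t hotplug_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_barrier_t step;

struct actor {
    pthread_mutex_t *first;  /* lock held before the attempt, or NULL */
    pthread_mutex_t *second; /* lock attempted while holding 'first' */
    int blocked;             /* 1 if 'second' was busy */
};

static void *run_actor(void *arg)
{
    struct actor *a = arg;
    int rc;

    if (a->first != NULL)
        pthread_mutex_lock(a->first);
    pthread_barrier_wait(&step); /* both threads hold their first lock */

    rc = pthread_mutex_trylock(a->second);
    a->blocked = (rc == EBUSY);
    pthread_barrier_wait(&step); /* both attempts done; safe to release */

    if (rc == 0)
        pthread_mutex_unlock(a->second);
    if (a->first != NULL)
        pthread_mutex_unlock(a->first);
    return NULL;
}

/*
 * Returns how many of the two threads found their second lock busy.
 * 2 means each thread waits on the other, i.e. a deadlock with real locks.
 */
int count_blocked(int stats_takes_hotplug)
{
    /* Allocation path: heap lock first, then hotplug lock on failure. */
    struct actor alloc = { &heap_lock, &hotplug_lock, 0 };
    /* Stats path: hotplug lock first (old scheme) or no outer lock (new). */
    struct actor stats = {
        stats_takes_hotplug ? &hotplug_lock : NULL, &heap_lock, 0
    };
    pthread_t t1, t2;
    int rc;

    rc = pthread_barrier_init(&step, NULL, 2);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, run_actor, &alloc);
    assert(rc == 0);
    rc = pthread_create(&t2, NULL, run_actor, &stats);
    assert(rc == 0);
    (void)rc;

    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    pthread_barrier_destroy(&step);
    return alloc.blocked + stats.blocked;
}
```

With these assumptions, count_blocked(1) models the old ordering and
returns 2: both threads are stuck on each other. count_blocked(0) models
the stats path without the hotplug lock and returns 1: the stats thread
merely waits for the heap lock while the allocating thread makes progress.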

Fixes: 72cf92b31855 ("malloc: index heaps using heap ID rather than NUMA node")

Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
 lib/librte_eal/common/include/rte_malloc.h |  9 +++++++++
 lib/librte_eal/common/rte_malloc.c         | 19 +++----------------
 2 files changed, 12 insertions(+), 16 deletions(-)

diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
index a5290b074..54a12467a 100644
--- a/lib/librte_eal/common/include/rte_malloc.h
+++ b/lib/librte_eal/common/include/rte_malloc.h
@@ -252,4 +252,7 @@ rte_malloc_validate(const void *ptr, size_t *size);
  * Get heap statistics for the specified heap.
  *
+ * @note This function is not thread-safe with respect to
+ *    ``rte_malloc_heap_create()``/``rte_malloc_heap_destroy()`` functions.
+ *
  * @param socket
  *   An unsigned integer specifying the socket to get heap statistics for
@@ -462,4 +465,7 @@ rte_malloc_heap_socket_is_external(int socket_id);
  * NULL, all memory types will be dumped.
  *
+ * @note This function is not thread-safe with respect to
+ *    ``rte_malloc_heap_create()``/``rte_malloc_heap_destroy()`` functions.
+ *
  * @param f
  *   A pointer to a file for output
@@ -474,4 +480,7 @@ rte_malloc_dump_stats(FILE *f, const char *type);
  * Dump contents of all malloc heaps to a file.
  *
+ * @note This function is not thread-safe with respect to
+ *    ``rte_malloc_heap_create()``/``rte_malloc_heap_destroy()`` functions.
+ *
  * @param f
  *   A pointer to a file for output
diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
index 06cf1e666..47c2bec72 100644
--- a/lib/librte_eal/common/rte_malloc.c
+++ b/lib/librte_eal/common/rte_malloc.c
@@ -157,18 +157,12 @@ rte_malloc_get_socket_stats(int socket,
 {
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
-	int heap_idx, ret = -1;
-
-	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
+	int heap_idx;
 
 	heap_idx = malloc_socket_to_heap_id(socket);
 	if (heap_idx < 0)
-		goto unlock;
+		return -1;
 
-	ret = malloc_heap_get_stats(&mcfg->malloc_heaps[heap_idx],
+	return malloc_heap_get_stats(&mcfg->malloc_heaps[heap_idx],
 			socket_stats);
-unlock:
-	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
-
-	return ret;
 }
 
@@ -182,12 +176,8 @@ rte_malloc_dump_heaps(FILE *f)
 	unsigned int idx;
 
-	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
-
 	for (idx = 0; idx < RTE_MAX_HEAPS; idx++) {
 		fprintf(f, "Heap id: %u\n", idx);
 		malloc_heap_dump(&mcfg->malloc_heaps[idx], f);
 	}
-
-	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
 }
 
@@ -263,6 +253,4 @@ rte_malloc_dump_stats(FILE *f, __rte_unused const char *type)
 	struct rte_malloc_socket_stats sock_stats;
 
-	rte_rwlock_read_lock(&mcfg->memory_hotplug_lock);
-
 	/* Iterate through all initialised heaps */
 	for (heap_id = 0; heap_id < RTE_MAX_HEAPS; heap_id++) {
@@ -281,5 +269,4 @@ rte_malloc_dump_stats(FILE *f, __rte_unused const char *type)
 		fprintf(f, "\tFree_count:%u,\n", sock_stats.free_count);
 	}
-	rte_rwlock_read_unlock(&mcfg->memory_hotplug_lock);
 	return;
 }
-- 
2.19.0

---
  Diff of the applied patch vs upstream commit (please double-check if non-empty):
---
--- -	2019-01-04 13:23:08.478061537 +0000
+++ 0043-malloc-fix-deadlock-when-reading-stats.patch	2019-01-04 13:23:07.000000000 +0000
@@ -1,8 +1,10 @@
-From ba731ea1dda3f1a1b7eb4248d5988de746d3ef0a Mon Sep 17 00:00:00 2001
+From ed54c0902f56a47e1d269824f543852fdac1796f Mon Sep 17 00:00:00 2001
 From: Anatoly Burakov <anatoly.burakov@intel.com>
 Date: Fri, 21 Dec 2018 12:26:05 +0000
 Subject: [PATCH] malloc: fix deadlock when reading stats
 
+[ upstream commit ba731ea1dda3f1a1b7eb4248d5988de746d3ef0a ]
+
 Currently, malloc statistics and external heap creation code
 use memory hotplug lock as a way to synchronize accesses to
 heaps (as in, locking the hotplug lock to prevent list of heaps
@@ -31,28 +33,13 @@
 respect to creating/destroying heaps.
 
 Fixes: 72cf92b31855 ("malloc: index heaps using heap ID rather than NUMA node")
-Cc: stable@dpdk.org
 
 Signed-off-by: Anatoly Burakov <anatoly.burakov@intel.com>
 ---
- doc/guides/rel_notes/release_19_02.rst     |  4 ++++
  lib/librte_eal/common/include/rte_malloc.h |  9 +++++++++
  lib/librte_eal/common/rte_malloc.c         | 19 +++----------------
- 3 files changed, 16 insertions(+), 16 deletions(-)
+ 2 files changed, 12 insertions(+), 16 deletions(-)
 
-diff --git a/doc/guides/rel_notes/release_19_02.rst b/doc/guides/rel_notes/release_19_02.rst
-index 47768288a..0b248d55d 100644
---- a/doc/guides/rel_notes/release_19_02.rst
-+++ b/doc/guides/rel_notes/release_19_02.rst
-@@ -127,4 +127,8 @@ API Changes
-     fd's (such as in-memory or no-huge mode)
- 
-+* eal: Functions ``rte_malloc_dump_stats()``, ``rte_malloc_dump_heaps()`` and
-+  ``rte_malloc_get_socket_stats()`` are no longer safe to call concurrently with
-+  ``rte_malloc_heap_create()`` or ``rte_malloc_heap_destroy()`` function calls.
-+
- * pdump: The ``rte_pdump_set_socket_dir()``, the parameter ``path`` of
-   ``rte_pdump_init()`` and enum ``rte_pdump_socktype`` were deprecated
 diff --git a/lib/librte_eal/common/include/rte_malloc.h b/lib/librte_eal/common/include/rte_malloc.h
 index a5290b074..54a12467a 100644
 --- a/lib/librte_eal/common/include/rte_malloc.h
@@ -82,7 +69,7 @@
   * @param f
   *   A pointer to a file for output
 diff --git a/lib/librte_eal/common/rte_malloc.c b/lib/librte_eal/common/rte_malloc.c
-index 09051c236..b39de3c99 100644
+index 06cf1e666..47c2bec72 100644
 --- a/lib/librte_eal/common/rte_malloc.c
 +++ b/lib/librte_eal/common/rte_malloc.c
 @@ -157,18 +157,12 @@ rte_malloc_get_socket_stats(int socket,
