DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH] eal: support including mapped memory in core dump
@ 2024-10-23 23:18 Dmitry Kozlyuk
  2024-10-24  2:19 ` Stephen Hemminger
                   ` (5 more replies)
  0 siblings, 6 replies; 21+ messages in thread
From: Dmitry Kozlyuk @ 2024-10-23 23:18 UTC (permalink / raw)
  To: dev; +Cc: Dmitry Kozlyuk, Anatoly Burakov, Lewis Donzis

From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

Commit d72e4042c5eb ("mem: exclude unused memory from core dump")
unconditionally excluded all hugepage memory managed by DPDK.
The rationale was to avoid overly large core dumps
generated from reserved (PROT_NONE) but not mapped memory.
Mapped hugepages, however, may hold data useful for debugging,
even if not being fully used and containing some garbage.
Users may want to include those hugepages in core dump.

Add `--huge-dump` EAL command-line option to include in core dump
all mapped hugepages and also non-hugepage memory
allocated with `--no-huge` (as it substitutes for hugepages).
Using this option requires more disk storage for core dumps,
and also may include sensitive data in core dump,
which is why it must be explicitly enabled and a warning is printed.

Linux requires /proc/self/coredump_filter adjustment
to include hugepages mapped with MAP_SHARED in core dump.
Windows EAL requires no change since it automatically
excludes reserved memory and includes committed memory.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
---
Lewis, testing on FreeBSD would be appreciated.

 doc/guides/linux_gsg/eal_args.include.rst |  8 +++
 lib/eal/common/eal_common_dynmem.c        |  1 +
 lib/eal/common/eal_common_options.c       |  7 ++
 lib/eal/common/eal_internal_cfg.h         |  2 +
 lib/eal/common/eal_options.h              |  2 +
 lib/eal/freebsd/eal_memory.c              | 19 +++++
 lib/eal/linux/eal_memalloc.c              |  8 +++
 lib/eal/linux/eal_memory.c                | 86 +++++++++++++++++++++--
 8 files changed, 129 insertions(+), 4 deletions(-)

diff --git a/doc/guides/linux_gsg/eal_args.include.rst b/doc/guides/linux_gsg/eal_args.include.rst
index 9cfbf7de84..0cab6b2f16 100644
--- a/doc/guides/linux_gsg/eal_args.include.rst
+++ b/doc/guides/linux_gsg/eal_args.include.rst
@@ -122,6 +122,14 @@ Memory-related options
     to system pthread stack size unless the optional size (in kbytes) is
     specified.
 
+*   ``--huge-dump``
+
+    Include in code dump all hugepages mapped by DPDK either at startup or later,
+    and ordirary pages mapped by DPDK at startup when ``--no-huge`` is given.
+    Hugepages increase core dumps size roughly by the amount of memory used.
+    Dumped memory often contains processed data, which may be sensitive.
+    Primary and secondary processes can use this option independently.
+
 Debugging options
 ~~~~~~~~~~~~~~~~~
 
diff --git a/lib/eal/common/eal_common_dynmem.c b/lib/eal/common/eal_common_dynmem.c
index b4dc231940..e6497307b5 100644
--- a/lib/eal/common/eal_common_dynmem.c
+++ b/lib/eal/common/eal_common_dynmem.c
@@ -7,6 +7,7 @@
 #include <stdlib.h>
 #include <string.h>
 
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_string_fns.h>
 
diff --git a/lib/eal/common/eal_common_options.c b/lib/eal/common/eal_common_options.c
index f1a5e329a5..b02a9894ff 100644
--- a/lib/eal/common/eal_common_options.c
+++ b/lib/eal/common/eal_common_options.c
@@ -105,6 +105,7 @@ eal_long_options[] = {
 	{OPT_NO_TELEMETRY,      0, NULL, OPT_NO_TELEMETRY_NUM     },
 	{OPT_FORCE_MAX_SIMD_BITWIDTH, 1, NULL, OPT_FORCE_MAX_SIMD_BITWIDTH_NUM},
 	{OPT_HUGE_WORKER_STACK, 2, NULL, OPT_HUGE_WORKER_STACK_NUM     },
+	{OPT_HUGE_DUMP,         0, NULL, OPT_HUGE_DUMP_NUM        },
 
 	{0,                     0, NULL, 0                        }
 };
@@ -1942,6 +1943,11 @@ eal_parse_common_option(int opt, const char *optarg,
 			return -1;
 		}
 		break;
+	case OPT_HUGE_DUMP_NUM:
+		conf->huge_dump = 1;
+		EAL_LOG(WARNING, "Using --"OPT_HUGE_DUMP
+			" may result in core dump files which are large or contain sensitive data.");
+		break;
 
 	/* don't know what to do, leave this to caller */
 	default:
@@ -2250,6 +2256,7 @@ eal_common_usage(void)
 	       "  --"OPT_IN_MEMORY"   Operate entirely in memory. This will\n"
 	       "                      disable secondary process support\n"
 	       "  --"OPT_BASE_VIRTADDR"     Base virtual address\n"
+	       "  --"OPT_HUGE_DUMP"   Include hugepages in core dump.\n"
 	       "  --"OPT_TELEMETRY"   Enable telemetry support (on by default)\n"
 	       "  --"OPT_NO_TELEMETRY"   Disable telemetry support\n"
 	       "  --"OPT_FORCE_MAX_SIMD_BITWIDTH" Force the max SIMD bitwidth\n"
diff --git a/lib/eal/common/eal_internal_cfg.h b/lib/eal/common/eal_internal_cfg.h
index 167ec501fa..5e03c3952c 100644
--- a/lib/eal/common/eal_internal_cfg.h
+++ b/lib/eal/common/eal_internal_cfg.h
@@ -103,6 +103,8 @@ struct internal_config {
 	struct simd_bitwidth max_simd_bitwidth;
 	/**< max simd bitwidth path to use */
 	size_t huge_worker_stack_size; /**< worker thread stack size */
+	/** True to include mapped hugepages in coredump. */
+	unsigned int huge_dump;
 };
 
 void eal_reset_internal_config(struct internal_config *internal_cfg);
diff --git a/lib/eal/common/eal_options.h b/lib/eal/common/eal_options.h
index 3cc9cb6412..604d39417d 100644
--- a/lib/eal/common/eal_options.h
+++ b/lib/eal/common/eal_options.h
@@ -89,6 +89,8 @@ enum {
 	OPT_FORCE_MAX_SIMD_BITWIDTH_NUM,
 #define OPT_HUGE_WORKER_STACK  "huge-worker-stack"
 	OPT_HUGE_WORKER_STACK_NUM,
+#define OPT_HUGE_DUMP "huge-dump"
+	OPT_HUGE_DUMP_NUM,
 
 	OPT_LONG_MAX_NUM
 };
diff --git a/lib/eal/freebsd/eal_memory.c b/lib/eal/freebsd/eal_memory.c
index a6f3ba226e..cd99af4b0e 100644
--- a/lib/eal/freebsd/eal_memory.c
+++ b/lib/eal/freebsd/eal_memory.c
@@ -94,6 +94,12 @@ rte_eal_hugepage_init(void)
 
 		eal_memseg_list_populate(msl, addr, n_segs);
 
+		if (internal_conf->huge_dump) {
+			if (eal_mem_set_dump(addr, mem_sz, true) < 0)
+				EAL_LOG(WARNING, "Failed to include pages in core dump (address %p, size %zu): %s",
+					addr, mem_sz, rte_strerror(rte_errno));
+		}
+
 		return 0;
 	}
 
@@ -205,6 +211,12 @@ rte_eal_hugepage_init(void)
 
 			rte_fbarray_set_used(arr, ms_idx);
 
+			if (internal_conf->huge_dump) {
+				if (eal_mem_set_dump(addr, page_sz, true) < 0)
+					EAL_LOG(WARNING, "Failed to include hugepage in core dump (address %p, size %zu): %s",
+						addr, page_sz, rte_strerror(rte_errno));
+			}
+
 			EAL_LOG(INFO, "Mapped memory segment %u @ %p: physaddr:0x%"
 					PRIx64", len %zu",
 					seg_idx++, addr, physaddr, page_sz);
@@ -232,6 +244,7 @@ static int
 attach_segment(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 		void *arg)
 {
+	struct internal_config *internal_conf = eal_get_internal_configuration();
 	struct attach_walk_args *wa = arg;
 	void *addr;
 
@@ -245,6 +258,12 @@ attach_segment(const struct rte_memseg_list *msl, const struct rte_memseg *ms,
 		return -1;
 	wa->seg_idx++;
 
+	if (internal_conf->huge_dump) {
+		if (eal_mem_set_dump(addr, ms->len, true) < 0)
+			EAL_LOG(WARNING, "Failed to include hugepage in core dump (address %p, size %zu): %s",
+				addr, ms->len, rte_strerror(rte_errno));
+	}
+
 	return 0;
 }
 
diff --git a/lib/eal/linux/eal_memalloc.c b/lib/eal/linux/eal_memalloc.c
index e354efc95d..611b1c65db 100644
--- a/lib/eal/linux/eal_memalloc.c
+++ b/lib/eal/linux/eal_memalloc.c
@@ -28,6 +28,7 @@
 #include <linux/mman.h> /* for hugetlb-related mmap flags */
 
 #include <rte_common.h>
+#include <rte_errno.h>
 #include <rte_log.h>
 #include <rte_eal.h>
 #include <rte_memory.h>
@@ -688,6 +689,13 @@ alloc_seg(struct rte_memseg *ms, void *addr, int socket_id,
 	ms->socket_id = socket_id;
 	ms->flags = dirty ? RTE_MEMSEG_FLAG_DIRTY : 0;
 
+	if (internal_conf->huge_dump) {
+		if (eal_mem_set_dump(addr, alloc_sz, true) < 0)
+			EAL_LOG(WARNING,
+				"Failed to include hugepage in coredump (address %p, size %zu): %s",
+				addr, alloc_sz, rte_strerror(rte_errno));
+	}
+
 	return 0;
 
 mapped:
diff --git a/lib/eal/linux/eal_memory.c b/lib/eal/linux/eal_memory.c
index 45879ca743..c1cb8d71e7 100644
--- a/lib/eal/linux/eal_memory.c
+++ b/lib/eal/linux/eal_memory.c
@@ -659,6 +659,7 @@ unmap_unneeded_hugepages(struct hugepage_file *hugepg_tbl,
 static int
 remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 {
+	const struct internal_config *internal_conf = eal_get_internal_configuration();
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct rte_memseg_list *msl;
 	struct rte_fbarray *arr;
@@ -668,10 +669,9 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 	uint64_t page_sz;
 	size_t memseg_len;
 	int socket_id;
-#ifndef RTE_ARCH_64
-	const struct internal_config *internal_conf =
-		eal_get_internal_configuration();
-#endif
+	void *map_addr;
+	size_t map_len;
+
 	page_sz = hugepages[seg_start].size;
 	socket_id = hugepages[seg_start].socket_id;
 	seg_len = seg_end - seg_start;
@@ -721,6 +721,9 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 		return -1;
 	}
 
+	map_addr = RTE_PTR_ADD(msl->base_va, ms_idx * page_sz);
+	map_len = page_sz * seg_len;
+
 #ifdef RTE_ARCH_PPC_64
 	/* for PPC64 we go through the list backwards */
 	for (cur_page = seg_end - 1; cur_page >= seg_start;
@@ -793,6 +796,14 @@ remap_segment(struct hugepage_file *hugepages, int seg_start, int seg_end)
 			EAL_LOG(ERR, "Could not store segment fd: %s",
 				rte_strerror(rte_errno));
 	}
+
+	if (internal_conf->huge_dump) {
+		if (eal_mem_set_dump(map_addr, map_len, true) < 0)
+			EAL_LOG(WARNING,
+				"Failed to include hugepages in core dump (address %p, size %zu): %s",
+				map_addr, map_len, rte_strerror(rte_errno));
+	}
+
 	EAL_LOG(DEBUG, "Allocated %" PRIu64 "M on socket %i",
 			(seg_len * page_sz) >> 20, socket_id);
 	return seg_len;
@@ -1241,6 +1252,13 @@ eal_legacy_hugepage_init(void)
 					__func__);
 			goto fail;
 		}
+
+		if (internal_conf->huge_dump) {
+			if (eal_mem_set_dump(prealloc_addr, mem_sz, true) < 0)
+				EAL_LOG(WARNING, "Failed to include pages in core dump (address %p, size %zu): %s",
+					prealloc_addr, mem_sz, rte_strerror(rte_errno));
+		}
+
 		return 0;
 	}
 
@@ -1518,6 +1536,7 @@ getFileSize(int fd)
 static int
 eal_legacy_hugepage_attach(void)
 {
+	struct internal_config *internal_conf = eal_get_internal_configuration();
 	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	struct hugepage_file *hp = NULL;
 	unsigned int num_hp = 0;
@@ -1616,6 +1635,12 @@ eal_legacy_hugepage_attach(void)
 		if (eal_memalloc_set_seg_fd(msl_idx, ms_idx, fd) < 0)
 			EAL_LOG(ERR, "Could not store segment fd: %s",
 				rte_strerror(rte_errno));
+
+		if (internal_conf->huge_dump) {
+			if (eal_mem_set_dump(map_addr, map_sz, true) < 0)
+				EAL_LOG(WARNING, "Failed to include hugepage in core dump (address %p, size %zu): %s",
+					map_addr, map_sz, rte_strerror(rte_errno));
+		}
 	}
 	/* unmap the hugepage config file, since we are done using it */
 	munmap(hp, size);
@@ -1650,12 +1675,65 @@ eal_hugepage_attach(void)
 	return 0;
 }
 
+static int
+enable_shared_hugepage_coredump(void)
+{
+	const char *path = "/proc/self/coredump_filter";
+	const unsigned long shared_hugepage_flag = RTE_BIT64(6);
+
+	FILE *f;
+	uint64_t coredump_filter;
+
+	f = fopen(path, "r");
+	if (f == NULL) {
+		rte_errno = errno;
+		EAL_LOG(ERR, "Failed to open %s for reading: %s", path, strerror(errno));
+		return -1;
+	}
+
+	if (fscanf(f, "%"SCNx64, &coredump_filter) != 1) {
+		rte_errno = errno;
+		EAL_LOG(ERR, "Failed to parse %s: %s", path, strerror(errno));
+		fclose(f);
+		return -1;
+	}
+
+	fclose(f);
+
+	if (coredump_filter & shared_hugepage_flag)
+		return 0;
+
+	f = fopen(path, "w");
+	if (f == NULL) {
+		rte_errno = errno;
+		EAL_LOG(ERR, "Failed to open %s for writing: %s", path, strerror(errno));
+		return -1;
+	}
+
+	coredump_filter |= shared_hugepage_flag;
+	if (fprintf(f, "%"PRIx64, coredump_filter) <= 0) {
+		rte_errno = EIO;
+		EAL_LOG(ERR, "Failed to write %"PRIx64" to %s", coredump_filter, path);
+		fclose(f);
+		return -1;
+	}
+
+	fclose(f);
+	return 0;
+}
+
 int
 rte_eal_hugepage_init(void)
 {
 	const struct internal_config *internal_conf =
 		eal_get_internal_configuration();
 
+	if (internal_conf->huge_dump && enable_shared_hugepage_coredump() < 0) {
+		EAL_LOG(ERR, "Failed to enable shared hugepage core dump: %s",
+			rte_strerror(rte_errno));
+		return -1;
+	}
+
 	return internal_conf->legacy_mem ?
 			eal_legacy_hugepage_init() :
 			eal_dynmem_hugepage_init();
-- 
2.38.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] eal: support including mapped memory in core dump
  2024-10-23 23:18 [PATCH] eal: support including mapped memory in core dump Dmitry Kozlyuk
@ 2024-10-24  2:19 ` Stephen Hemminger
  2024-10-24  2:31 ` Stephen Hemminger
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 21+ messages in thread
From: Stephen Hemminger @ 2024-10-24  2:19 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: dev, Anatoly Burakov, Lewis Donzis

On Thu, 24 Oct 2024 02:18:59 +0300
Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> wrote:

> From: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
> 
> Commit d72e4042c5eb ("mem: exclude unused memory from core dump")
> unconditionally excluded all hugepage memory managed by DPDK.
> The rationale was to avoid overly large core dumps
> generated from reserved (PROT_NONE) but not mapped memory.
> Mapped hugepages, however, may hold data useful for debugging,
> even if not being fully used and containing some garbage.
> Users may want to include those hugepages in core dump.
> 
> Add `--huge-dump` EAL command-line option to include in core dump
> all mapped hugepages and also non-hugepage memory
> allocated with `--no-huge` (as it substitutes for hugepages).
> Using this option requires more disk storage for core dumps,
> and also may include sensitive data in core dump,
> which is why it must be explicitly enabled and a warning is printed.
> 
> Linux requires /proc/self/coredump_filter adjustment
> to include hugepages mapped with MAP_SHARED in core dump.
> Windows EAL requires no change since it automatically
> excludes reserved memory and includes committed memory.
> 
> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>

Has build errors.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] eal: support including mapped memory in core dump
  2024-10-23 23:18 [PATCH] eal: support including mapped memory in core dump Dmitry Kozlyuk
  2024-10-24  2:19 ` Stephen Hemminger
@ 2024-10-24  2:31 ` Stephen Hemminger
  2024-10-24  7:07   ` Morten Brørup
  2024-10-24  7:22 ` Morten Brørup
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Stephen Hemminger @ 2024-10-24  2:31 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: dev, Anatoly Burakov, Lewis Donzis

On Thu, 24 Oct 2024 02:18:59 +0300
Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> wrote:

> --- a/lib/eal/common/eal_internal_cfg.h
> +++ b/lib/eal/common/eal_internal_cfg.h
> @@ -103,6 +103,8 @@ struct internal_config {
>  	struct simd_bitwidth max_simd_bitwidth;
>  	/**< max simd bitwidth path to use */
>  	size_t huge_worker_stack_size; /**< worker thread stack size */
> +	/** True to include mapped hugepages in coredump. */
> +	unsigned int huge_dump;

No need to waste 4 bytes for a flag.
Find an existing one byte hole and use that.

struct internal_config {
	volatile size_t            memory;               /*     0     8 */
	volatile unsigned int      force_nchannel;       /*     8     4 */
	volatile unsigned int      force_nrank;          /*    12     4 */
	volatile unsigned int      no_hugetlbfs;         /*    16     4 */
	struct hugepage_file_discipline hugepage_file;   /*    20     2 */

	/* XXX 2 bytes hole, try to pack */

	volatile unsigned int      no_pci;               /*    24     4 */
	volatile unsigned int      no_hpet;              /*    28     4 */
	volatile unsigned int      vmware_tsc_map;       /*    32     4 */
	volatile unsigned int      no_shconf;            /*    36     4 */
	volatile unsigned int      in_memory;            /*    40     4 */
	volatile unsigned int      create_uio_dev;       /*    44     4 */
	volatile enum rte_proc_type_t  process_type;     /*    48     4 */
	volatile unsigned int      force_sockets;        /*    52     4 */
	volatile volatile uint64_t   socket_mem;         /*    56   256 */
	/* --- cacheline 4 boundary (256 bytes) was 56 bytes ago --- */
	volatile unsigned int      force_socket_limits;  /*   312     4 */

	/* XXX 4 bytes hole, try to pack */

	/* --- cacheline 5 boundary (320 bytes) --- */
	volatile volatile uint64_t   socket_limit;       /*   320   256 */
	/* --- cacheline 9 boundary (576 bytes) --- */
	uintptr_t                  base_virtaddr;        /*   576     8 */
	volatile unsigned int      legacy_mem;           /*   584     4 */
	volatile unsigned int      match_allocations;    /*   588     4 */
	volatile unsigned int      single_file_segments; /*   592     4 */
	volatile int               syslog_facility;      /*   596     4 */
	volatile enum rte_intr_mode  vfio_intr_mode;     /*   600     4 */
	rte_uuid_t                 vfio_vf_token;        /*   604    16 */

	/* XXX 4 bytes hole, try to pack */

	char *                     hugefile_prefix;      /*   624     8 */
	char *                     hugepage_dir;         /*   632     8 */
	/* --- cacheline 10 boundary (640 bytes) --- */
	char *                     user_mbuf_pool_ops_name; /*   640     8 */
	unsigned int               num_hugepage_sizes;   /*   648     4 */

	/* XXX 4 bytes hole, try to pack */

	struct hugepage_info       hugepage_info[3];     /*   656 12720 */
	/* --- cacheline 209 boundary (13376 bytes) --- */
	enum rte_iova_mode         iova_mode;            /* 13376     4 */

	/* XXX 4 bytes hole, try to pack */

	rte_cpuset_t               ctrl_cpuset;          /* 13384   128 */
	/* --- cacheline 211 boundary (13504 bytes) was 8 bytes ago --- */
	volatile unsigned int      init_complete;        /* 13512     4 */
	unsigned int               no_telemetry;         /* 13516     4 */
	struct simd_bitwidth       max_simd_bitwidth;    /* 13520     4 */

	/* XXX last struct has 1 hole */
	/* XXX 4 bytes hole, try to pack */

	size_t                     huge_worker_stack_size; /* 13528     8 */

	/* size: 13536, cachelines: 212, members: 34 */
	/* sum members: 13514, holes: 6, sum holes: 22 */
	/* member types with holes: 1, total: 1 */
	/* last cacheline: 32 bytes */
};

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH] eal: support including mapped memory in core dump
  2024-10-24  2:31 ` Stephen Hemminger
@ 2024-10-24  7:07   ` Morten Brørup
  0 siblings, 0 replies; 21+ messages in thread
From: Morten Brørup @ 2024-10-24  7:07 UTC (permalink / raw)
  To: Stephen Hemminger, Dmitry Kozlyuk; +Cc: dev, Anatoly Burakov, Lewis Donzis

> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Thursday, 24 October 2024 04.31
> 
> On Thu, 24 Oct 2024 02:18:59 +0300
> Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> wrote:
> 
> > --- a/lib/eal/common/eal_internal_cfg.h
> > +++ b/lib/eal/common/eal_internal_cfg.h
> > @@ -103,6 +103,8 @@ struct internal_config {
> >  	struct simd_bitwidth max_simd_bitwidth;
> >  	/**< max simd bitwidth path to use */
> >  	size_t huge_worker_stack_size; /**< worker thread stack size */
> > +	/** True to include mapped hugepages in coredump. */
> > +	unsigned int huge_dump;
> 
> No need to waste 4 bytes for a flag.
> Find an existing one byte hole and use that.

Disagree with Stephen on this one.
Using unsigned int for Booleans is the convention in this structure.
Please following the convention and stick with unsigned int.

De-bloating this structure belongs in a separate patch series.
I have no clue how "hot" the structure is, so don't know if it is worth the effort.

> 
> struct internal_config {
> 	volatile size_t            memory;               /*     0     8
> */
> 	volatile unsigned int      force_nchannel;       /*     8     4
> */
> 	volatile unsigned int      force_nrank;          /*    12     4
> */
> 	volatile unsigned int      no_hugetlbfs;         /*    16     4
> */
> 	struct hugepage_file_discipline hugepage_file;   /*    20     2
> */
> 
> 	/* XXX 2 bytes hole, try to pack */
> 
> 	volatile unsigned int      no_pci;               /*    24     4
> */
> 	volatile unsigned int      no_hpet;              /*    28     4
> */
> 	volatile unsigned int      vmware_tsc_map;       /*    32     4
> */
> 	volatile unsigned int      no_shconf;            /*    36     4
> */
> 	volatile unsigned int      in_memory;            /*    40     4
> */
> 	volatile unsigned int      create_uio_dev;       /*    44     4
> */
> 	volatile enum rte_proc_type_t  process_type;     /*    48     4
> */
> 	volatile unsigned int      force_sockets;        /*    52     4
> */
> 	volatile volatile uint64_t   socket_mem;         /*    56   256
> */
> 	/* --- cacheline 4 boundary (256 bytes) was 56 bytes ago --- */
> 	volatile unsigned int      force_socket_limits;  /*   312     4
> */
> 
> 	/* XXX 4 bytes hole, try to pack */
> 
> 	/* --- cacheline 5 boundary (320 bytes) --- */
> 	volatile volatile uint64_t   socket_limit;       /*   320   256
> */
> 	/* --- cacheline 9 boundary (576 bytes) --- */
> 	uintptr_t                  base_virtaddr;        /*   576     8
> */
> 	volatile unsigned int      legacy_mem;           /*   584     4
> */
> 	volatile unsigned int      match_allocations;    /*   588     4
> */
> 	volatile unsigned int      single_file_segments; /*   592     4
> */
> 	volatile int               syslog_facility;      /*   596     4
> */
> 	volatile enum rte_intr_mode  vfio_intr_mode;     /*   600     4
> */
> 	rte_uuid_t                 vfio_vf_token;        /*   604    16
> */
> 
> 	/* XXX 4 bytes hole, try to pack */
> 
> 	char *                     hugefile_prefix;      /*   624     8
> */
> 	char *                     hugepage_dir;         /*   632     8
> */
> 	/* --- cacheline 10 boundary (640 bytes) --- */
> 	char *                     user_mbuf_pool_ops_name; /*   640
> 8 */
> 	unsigned int               num_hugepage_sizes;   /*   648     4
> */
> 
> 	/* XXX 4 bytes hole, try to pack */
> 
> 	struct hugepage_info       hugepage_info[3];     /*   656 12720
> */
> 	/* --- cacheline 209 boundary (13376 bytes) --- */
> 	enum rte_iova_mode         iova_mode;            /* 13376     4
> */
> 
> 	/* XXX 4 bytes hole, try to pack */
> 
> 	rte_cpuset_t               ctrl_cpuset;          /* 13384   128
> */
> 	/* --- cacheline 211 boundary (13504 bytes) was 8 bytes ago ---
> */
> 	volatile unsigned int      init_complete;        /* 13512     4
> */
> 	unsigned int               no_telemetry;         /* 13516     4
> */
> 	struct simd_bitwidth       max_simd_bitwidth;    /* 13520     4
> */
> 
> 	/* XXX last struct has 1 hole */
> 	/* XXX 4 bytes hole, try to pack */
> 
> 	size_t                     huge_worker_stack_size; /* 13528     8
> */
> 
> 	/* size: 13536, cachelines: 212, members: 34 */
> 	/* sum members: 13514, holes: 6, sum holes: 22 */
> 	/* member types with holes: 1, total: 1 */
> 	/* last cacheline: 32 bytes */
> };

^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH] eal: support including mapped memory in core dump
  2024-10-23 23:18 [PATCH] eal: support including mapped memory in core dump Dmitry Kozlyuk
  2024-10-24  2:19 ` Stephen Hemminger
  2024-10-24  2:31 ` Stephen Hemminger
@ 2024-10-24  7:22 ` Morten Brørup
  2024-10-24  8:25   ` Dmitry Kozlyuk
  2024-10-24 12:55 ` Lewis Donzis
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 21+ messages in thread
From: Morten Brørup @ 2024-10-24  7:22 UTC (permalink / raw)
  To: Dmitry Kozlyuk, dev; +Cc: Anatoly Burakov, Lewis Donzis, Stephen Hemminger

> From: Dmitry Kozlyuk [mailto:dmitry.kozliuk@gmail.com]
> Sent: Thursday, 24 October 2024 01.19
> 
> Commit d72e4042c5eb ("mem: exclude unused memory from core dump")
> unconditionally excluded all hugepage memory managed by DPDK.
> The rationale was to avoid overly large core dumps
> generated from reserved (PROT_NONE) but not mapped memory.
> Mapped hugepages, however, may hold data useful for debugging,
> even if not being fully used and containing some garbage.
> Users may want to include those hugepages in core dump.
> 
> Add `--huge-dump` EAL command-line option to include in core dump
> all mapped hugepages and also non-hugepage memory
> allocated with `--no-huge` (as it substitutes for hugepages).

I agree with the behavior of also dumping --no-huge mapped.
Perhaps because of this, the option should be named differently; something like --dump-verbose, --dump-all, --dump-mapped.

> Using this option requires more disk storage for core dumps,
> and also may include sensitive data in core dump,
> which is why it must be explicitly enabled and a warning is printed.
> 
> Linux requires /proc/self/coredump_filter adjustment
> to include hugepages mapped with MAP_SHARED in core dump.
> Windows EAL requires no change since it automatically
> excludes reserved memory and includes committed memory.

The above O/S specific information should also be included as notes in the documentation.

> 
> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
> ---
> Lewis, testing on FreeBSD would be appreciated.

If there are any notes about FreeBSD, like the notes about Linux and Windows above, it should also be included in the documentation.

> --- a/doc/guides/linux_gsg/eal_args.include.rst
> +++ b/doc/guides/linux_gsg/eal_args.include.rst
> @@ -122,6 +122,14 @@ Memory-related options
>      to system pthread stack size unless the optional size (in kbytes)
> is
>      specified.
> 
> +*   ``--huge-dump``
> +
> +    Include in code dump all hugepages mapped by DPDK either at
> startup or later,
> +    and ordirary pages mapped by DPDK at startup when ``--no-huge`` is
> given.
> +    Hugepages increase core dumps size roughly by the amount of memory
> used.
> +    Dumped memory often contains processed data, which may be
> sensitive.
> +    Primary and secondary processes can use this option independently.
> +

With O/S specific notes added to the documentation,
Acked-by: Morten Brørup <mb@smartsharesystems.com>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] eal: support including mapped memory in core dump
  2024-10-24  7:22 ` Morten Brørup
@ 2024-10-24  8:25   ` Dmitry Kozlyuk
  0 siblings, 0 replies; 21+ messages in thread
From: Dmitry Kozlyuk @ 2024-10-24  8:25 UTC (permalink / raw)
  To: Morten Brørup; +Cc: dev, Anatoly Burakov, Lewis Donzis, Stephen Hemminger

2024-10-24 09:22 (UTC+0200), Morten Brørup:
> > From: Dmitry Kozlyuk [mailto:dmitry.kozliuk@gmail.com]
> > Sent: Thursday, 24 October 2024 01.19
[...]
> > Add `--huge-dump` EAL command-line option to include in core dump
> > all mapped hugepages and also non-hugepage memory
> > allocated with `--no-huge` (as it substitutes for hugepages).  
> 
> I agree with the behavior of also dumping --no-huge mapped.
> Perhaps because of this, the option should be named differently; something
> like --dump-verbose, --dump-all, --dump-mapped. 

Or one may interpret "huge dump" as "the dump will be huge" :)
I'll use "--dump-mapped" as the most precise, thanks.

> > Linux requires /proc/self/coredump_filter adjustment
> > to include hugepages mapped with MAP_SHARED in core dump.
> > Windows EAL requires no change since it automatically
> > excludes reserved memory and includes committed memory.  
> 
> The above O/S specific information should also be included as notes in the documentation.
[...]
> If there are any notes about FreeBSD, like the notes about Linux and
> Windows above, it should also be included in the documentation.

"Linux requires..." here explains what DPDK code in the patch is doing
and "Windows EAL requires no change" explains why there are no code changes.
No OS-specific user action is required for --huge-dump to work.
I'll rephrase that comment and add a release note about the new option,
but I think documentation already contains everything users need.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] eal: support including mapped memory in core dump
  2024-10-23 23:18 [PATCH] eal: support including mapped memory in core dump Dmitry Kozlyuk
                   ` (2 preceding siblings ...)
  2024-10-24  7:22 ` Morten Brørup
@ 2024-10-24 12:55 ` Lewis Donzis
  2024-10-24 16:38 ` Stephen Hemminger
  2024-10-25 20:26 ` [PATCH v2 0/2] Hugepage inclusion " Dmitry Kozlyuk
  5 siblings, 0 replies; 21+ messages in thread
From: Lewis Donzis @ 2024-10-24 12:55 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: dev, anatoly burakov



----- On Oct 23, 2024, at 6:18 PM, Dmitry Kozlyuk dmitry.kozliuk@gmail.com wrote:
> Lewis, testing on FreeBSD would be appreciated.

Well, unfortunately, it's not working very well...  The contigmem memory was not included in the core dump.

I added logging just before the madvise() call to prove that it's calling madvise() properly, and it looks good.
I then tried forcing all madvise() calls to use EAL_DODUMP.
And finally, I tried removing the call to madvise() altogether.

None of those things changed the size of the core dump from what it was, and my test of trying to access an mbuf in the debugger still says the memory is inaccessible.

What's even worse is, it appears that I had already tried the above work-arounds back on May 28th and mentioned that in an e-mail to you.  Apparently I totally forgot about those experiments!  At the time, I wondered if it was just something fundamental about FreeBSD and contigmem.

It may take some digging through the FreeBSD source code to see how the dumper works.

So I apologize for not remembering that this wasn't entirely a DPDK issue.  That said, it's nice that this is available on Linux and maybe we'll find a way to make it work on FreeBSD.

Thanks again for the efforts,
lew

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] eal: support including mapped memory in core dump
  2024-10-23 23:18 [PATCH] eal: support including mapped memory in core dump Dmitry Kozlyuk
                   ` (3 preceding siblings ...)
  2024-10-24 12:55 ` Lewis Donzis
@ 2024-10-24 16:38 ` Stephen Hemminger
  2024-10-24 20:54   ` Dmitry Kozlyuk
  2024-10-25 20:26 ` [PATCH v2 0/2] Hugepage inclusion " Dmitry Kozlyuk
  5 siblings, 1 reply; 21+ messages in thread
From: Stephen Hemminger @ 2024-10-24 16:38 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: dev, Anatoly Burakov, Lewis Donzis

On Thu, 24 Oct 2024 02:18:59 +0300
Dmitry Kozlyuk <dmitry.kozliuk@gmail.com> wrote:

> +static int
> +enable_shared_hugepage_coredump(void)
> +{
> +	const char *path = "/proc/self/coredump_filter";
> +	const unsigned long shared_hugepage_flag = RTE_BIT64(6);
> +
> +	FILE *f;
> +	uint64_t coredump_filter;
> +
> +	f = fopen(path, "r");
> +	if (f == NULL) {
> +		rte_errno = errno;
> +		EAL_LOG(ERR, "Failed to open %s for reading: %s", path, strerror(errno));
> +		return -1;
> +	}
> +
> +	if (fscanf(f, "%"SCNx64, &coredump_filter) != 1) {
> +		rte_errno = errno;
> +		EAL_LOG(ERR, "Failed to parse %s: %s", path, strerror(errno));
> +		fclose(f);
> +		return -1;
> +	}
> +
> +	fclose(f);
> +
> +	if (coredump_filter & shared_hugepage_flag)
> +		return 0;
> +
> +	f = fopen(path, "w");
> +	if (f == NULL) {
> +		rte_errno = errno;
> +		EAL_LOG(ERR, "Failed to open %s for writing: %s", path, strerror(errno));
> +		return -1;
> +	}
> +
> +	coredump_filter |= shared_hugepage_flag;
> +	if (fprintf(f, "%"PRIx64, coredump_filter) <= 0) {
> +		rte_errno = EIO;
> +		EAL_LOG(ERR, "Failed to write %"PRIx64" to %s", coredump_filter, path);
> +		fclose(f);
> +		return -1;
> +	}
> +
> +	fclose(f);
> +	return 0;
> +}
> +

Having a process set a system global value like coredump_filter via an internal
call seems like a potential problem. What about other processes on the system?
It may not even be allowed if using a hardened kernel.

I would prefer that madvise() be used, and document the required change to
coredump_filter.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] eal: support including mapped memory in core dump
  2024-10-24 16:38 ` Stephen Hemminger
@ 2024-10-24 20:54   ` Dmitry Kozlyuk
  2024-10-25  0:22     ` Dmitry Kozlyuk
  0 siblings, 1 reply; 21+ messages in thread
From: Dmitry Kozlyuk @ 2024-10-24 20:54 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Anatoly Burakov, Lewis Donzis

2024-10-24 09:38 (UTC-0700), Stephen Hemminger:
> Having a process set a system global value like coredump_filter via an internal
> call seems like a potential problem. What about other processes on the system?
> It may not even be allowed if using a hardened kernel.
> 
> I would prefer that madvise() be used, and document the required change to
> coredump_filter.

/proc/self/coredump_filter affects only the process and its children.
madvise() done on hugepages will be ignored unless this bit is set.
So this must be done, and it seems convenient to require no user interaction.
If changing /proc/self/coredump_filter is disallowed,
EAL startup will fail and the user will have to go the way you described.
So the current solution:
- is convenient for a typical case
- is still usable in a hypothetical hardening case

On FreeBSD, including hugepages in core dump will require a global setting.
There I've been planning to go your way and have the user configure it,
because it is impossible to restrict to one process.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH] eal: support including mapped memory in core dump
  2024-10-24 20:54   ` Dmitry Kozlyuk
@ 2024-10-25  0:22     ` Dmitry Kozlyuk
  0 siblings, 0 replies; 21+ messages in thread
From: Dmitry Kozlyuk @ 2024-10-25  0:22 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: dev, Anatoly Burakov, Lewis Donzis

2024-10-24 23:54 (UTC+0300), Dmitry Kozlyuk:
> 2024-10-24 09:38 (UTC-0700), Stephen Hemminger:
> > Having a process set a system global value like coredump_filter via an internal
> > call seems like a potential problem. What about other processes on the system?
> > It may not even be allowed if using a hardened kernel.
> > 
> > I would prefer that madvise() be used, and document the required change to
> > coredump_filter.  
> 
> /proc/self/coredump_filter affects only the process and its children.
> madvise() done on hugepages will be ignored unless this bit is set.
> So this must be done, and it seems convenient to require no user interaction.
> If changing /proc/self/coredump_filter is disallowed,
> EAL startup will fail and the user will have to go the way you described.
> So the current solution:
> - is convenient for a typical case
> - is still usable in a hypothetical hardening case
> 
> On FreeBSD, including hugepages in core dump will require a global setting.
> There I've been planning to go your way and have the user configure it,
> because it is impossible to restrict to one process.

On the second thought, why block or enforce anything at startup;
it is more flexible if we allow the user to enable dumping hugepages
at whatever moment they wish, including prior to startup.
This series would then become purely documentation for Linux
and similar + a bit of code for FreeBSD.

^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 0/2] Hugepage inclusion in core dump
  2024-10-23 23:18 [PATCH] eal: support including mapped memory in core dump Dmitry Kozlyuk
                   ` (4 preceding siblings ...)
  2024-10-24 16:38 ` Stephen Hemminger
@ 2024-10-25 20:26 ` Dmitry Kozlyuk
  2024-10-25 20:26   ` [PATCH v2 1/2] contigmem: support including mapped buffers " Dmitry Kozlyuk
                     ` (2 more replies)
  5 siblings, 3 replies; 21+ messages in thread
From: Dmitry Kozlyuk @ 2024-10-25 20:26 UTC (permalink / raw)
  To: dev; +Cc: Anatoly Burakov, Lewis Donzis, Dmitry Kozlyuk

From: Dmitry Kozlyuk <kozlyuk@bifit.com>

Hugepages often contain information useful for debugging
and thus desired for inclusion in core dumps.
However, including them has implications and drawbacks,
so it must be an opt-in at user's discretion.
This patchset add the ability to do so for FreeBSD
and for Linux it just documents existing settings.

v2: * Make it work for FreeBSD (thanks Lewis for testing).
    * Drop all EAL code, document required user actions.

Dmitry Kozlyuk (2):
  contigmem: support including mapped buffers in core dump
  doc: add instruction for including hugepages in core dump

 doc/guides/freebsd_gsg/build_dpdk.rst      | 26 ++++++++++++++++------
 doc/guides/linux_gsg/build_sample_apps.rst | 17 ++++++++++++++
 kernel/freebsd/contigmem/contigmem.c       |  8 +++++++
 3 files changed, 44 insertions(+), 7 deletions(-)

-- 
2.38.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 1/2] contigmem: support including mapped buffers in core dump
  2024-10-25 20:26 ` [PATCH v2 0/2] Hugepage inclusion " Dmitry Kozlyuk
@ 2024-10-25 20:26   ` Dmitry Kozlyuk
  2024-10-26 11:43     ` Lewis Donzis
  2024-11-19 15:43     ` Bruce Richardson
  2024-10-25 20:26   ` [PATCH v2 2/2] doc: add instruction for including hugepages " Dmitry Kozlyuk
  2024-10-26  3:18   ` [PATCH v2 0/2] Hugepage inclusion " Morten Brørup
  2 siblings, 2 replies; 21+ messages in thread
From: Dmitry Kozlyuk @ 2024-10-25 20:26 UTC (permalink / raw)
  To: dev; +Cc: Anatoly Burakov, Lewis Donzis, Dmitry Kozlyuk

It was impossible to include mapped contigmem buffers in core dump.
Add hw.contigmem.coredump_enable tunable to control the inclusion.

Mappings backed by OBJ_FICTITIOUS are excluded from core dump.
While contigmem is a device and its OBJT_DEVICE mappings get this flag,
they are actually backed by memory.
Clear OBJ_FICTITIOUS mapping bit to include them in core dump.
Do this only if hw.contigmem.coredump_enable=1.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
---
 doc/guides/freebsd_gsg/build_dpdk.rst | 27 ++++++++++++++++++++-------
 kernel/freebsd/contigmem/contigmem.c  |  8 ++++++++
 2 files changed, 28 insertions(+), 7 deletions(-)

diff --git a/doc/guides/freebsd_gsg/build_dpdk.rst b/doc/guides/freebsd_gsg/build_dpdk.rst
index 86e8e5a805..387ec52820 100644
--- a/doc/guides/freebsd_gsg/build_dpdk.rst
+++ b/doc/guides/freebsd_gsg/build_dpdk.rst
@@ -86,6 +86,22 @@ module loading using::
     kenv hw.contigmem.num_buffers=n
     kenv hw.contigmem.buffer_size=m
 
+Where n is the number of blocks and m is the size in bytes of each area of
+contiguous memory.  A default of two buffers of size 1073741824 bytes (1 Gigabyte)
+each is set during module load if they are not specified in the environment.
+
+Buffers are excluded from core dump by default.
+Mapped buffers can be included in core dump using the following tunable:
+
+    hw.contigmem.coredump_enable=1
+
+.. note::
+
+    Including contigmem buffers in core dump file increases its size,
+    which may fill the storage or overload the transport.
+    Buffers typically hold data processed by the application,
+    like network packets, which may contain sensitive information.
+
 The kernel environment variables can also be specified during boot by placing the
 following in ``/boot/loader.conf``:
 
@@ -93,15 +109,12 @@ following in ``/boot/loader.conf``:
 
     hw.contigmem.num_buffers=n
     hw.contigmem.buffer_size=m
+    hw.contigmem.coredump_enable=1
 
 The variables can be inspected using the following command::
 
     sysctl -a hw.contigmem
 
-Where n is the number of blocks and m is the size in bytes of each area of
-contiguous memory.  A default of two buffers of size 1073741824 bytes (1 Gigabyte)
-each is set during module load if they are not specified in the environment.
-
 The module can then be loaded using kldload::
 
     kldload contigmem
@@ -115,13 +128,13 @@ up time.  This can be achieved by placing lines similar to the following into
 
     hw.contigmem.num_buffers=1
     hw.contigmem.buffer_size=1073741824
+    hw.contigmem.coredump_enable=1
     contigmem_load="YES"
 
 .. note::
 
-    The contigmem_load directive should be placed after any definitions of
-    ``hw.contigmem.num_buffers`` and ``hw.contigmem.buffer_size`` if the default values
-    are not to be used.
+    The ``contigmem_load`` directive should be placed after any definitions
+    of ``hw.contigmem.*`` if the default values are not to be used.
 
 An error such as::
 
diff --git a/kernel/freebsd/contigmem/contigmem.c b/kernel/freebsd/contigmem/contigmem.c
index 7dd87599d9..591e46dded 100644
--- a/kernel/freebsd/contigmem/contigmem.c
+++ b/kernel/freebsd/contigmem/contigmem.c
@@ -51,6 +51,7 @@ static d_close_t        contigmem_close;
 
 static int              contigmem_num_buffers = RTE_CONTIGMEM_DEFAULT_NUM_BUFS;
 static int64_t          contigmem_buffer_size = RTE_CONTIGMEM_DEFAULT_BUF_SIZE;
+static bool             contigmem_coredump_enable;
 
 static eventhandler_tag contigmem_eh_tag;
 static struct contigmem_buffer contigmem_buffers[RTE_CONTIGMEM_MAX_NUM_BUFS];
@@ -59,6 +60,7 @@ static int              contigmem_refcnt;
 
 TUNABLE_INT("hw.contigmem.num_buffers", &contigmem_num_buffers);
 TUNABLE_QUAD("hw.contigmem.buffer_size", &contigmem_buffer_size);
+TUNABLE_BOOL("hw.contigmem.coredump_enable", &contigmem_coredump_enable);
 
 static SYSCTL_NODE(_hw, OID_AUTO, contigmem, CTLFLAG_RD, 0, "contigmem");
 
@@ -68,6 +70,8 @@ SYSCTL_QUAD(_hw_contigmem, OID_AUTO, buffer_size, CTLFLAG_RD,
 	&contigmem_buffer_size, 0, "Size of each contiguous buffer");
 SYSCTL_INT(_hw_contigmem, OID_AUTO, num_references, CTLFLAG_RD,
 	&contigmem_refcnt, 0, "Number of references to contigmem");
+SYSCTL_BOOL(_hw_contigmem, OID_AUTO, coredump_enable, CTLFLAG_RD,
+	&contigmem_coredump_enable, 0, "Include mapped buffers in core dump");
 
 static SYSCTL_NODE(_hw_contigmem, OID_AUTO, physaddr, CTLFLAG_RD, 0,
 	"physaddr");
@@ -357,5 +361,9 @@ contigmem_mmap_single(struct cdev *cdev, vm_ooffset_t *offset, vm_size_t size,
 	*obj = cdev_pager_allocate(vmh, OBJT_DEVICE, &contigmem_cdev_pager_ops,
 			size, nprot, *offset, curthread->td_ucred);
 
+	/* Mappings backed by OBJ_FICTITIOUS are excluded from core dump. */
+	if (contigmem_coredump_enable)
+		(*obj)->flags &= ~OBJ_FICTITIOUS;
+
 	return 0;
 }
-- 
2.38.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* [PATCH v2 2/2] doc: add instruction for including hugepages in core dump
  2024-10-25 20:26 ` [PATCH v2 0/2] Hugepage inclusion " Dmitry Kozlyuk
  2024-10-25 20:26   ` [PATCH v2 1/2] contigmem: support including mapped buffers " Dmitry Kozlyuk
@ 2024-10-25 20:26   ` Dmitry Kozlyuk
  2024-10-26  3:18   ` [PATCH v2 0/2] Hugepage inclusion " Morten Brørup
  2 siblings, 0 replies; 21+ messages in thread
From: Dmitry Kozlyuk @ 2024-10-25 20:26 UTC (permalink / raw)
  To: dev; +Cc: Anatoly Burakov, Lewis Donzis, Dmitry Kozlyuk

Linux excludes hugepages from core dump by default.
Describe the means to override this behavior
as well as implications of doing so.
Only mapped hugepages will be included in core dump,
because Linux EAL explicitly excludes reserved regions.

Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
---
 doc/guides/linux_gsg/build_sample_apps.rst | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)

diff --git a/doc/guides/linux_gsg/build_sample_apps.rst b/doc/guides/linux_gsg/build_sample_apps.rst
index 4f99617233..86d8712e69 100644
--- a/doc/guides/linux_gsg/build_sample_apps.rst
+++ b/doc/guides/linux_gsg/build_sample_apps.rst
@@ -180,6 +180,24 @@ Similarly, on a four socket system, to allocate 1 GB memory on each of sockets 0
 No memory will be reserved on any CPU socket that is not explicitly referenced, for example, socket 3 in this case.
 If the DPDK cannot allocate enough memory on each socket, the EAL initialization fails.
 
+Whether hugepages are included in core dump is controlled by ``/proc/<pid>/coredump_filter``.
+It is ``33`` (hexadecimal) by default, which means that hugepages are exclued from core dump.
+This setting is per-process and is inherited.  Refer to ``core(5)`` for details.
+To include mapped hugepages in core dump, set bit 6 (``0x40``) in the parent process or shell
+before running a DPDK application:
+
+.. code-block:: shell
+
+    echo 0x73 > /proc/self/coredump_filter
+    ./rte-app ...
+
+.. note::
+
+    Including hugepages in core dump file increases its size,
+    which may fill the storage or overload the transport.
+    Hugepages typically hold data processed by the application,
+    like network packets, which may contain sensitive information.
+
 Additional Sample Applications
 ------------------------------
 
-- 
2.38.4


^ permalink raw reply	[flat|nested] 21+ messages in thread

* RE: [PATCH v2 0/2] Hugepage inclusion in core dump
  2024-10-25 20:26 ` [PATCH v2 0/2] Hugepage inclusion " Dmitry Kozlyuk
  2024-10-25 20:26   ` [PATCH v2 1/2] contigmem: support including mapped buffers " Dmitry Kozlyuk
  2024-10-25 20:26   ` [PATCH v2 2/2] doc: add instruction for including hugepages " Dmitry Kozlyuk
@ 2024-10-26  3:18   ` Morten Brørup
  2 siblings, 0 replies; 21+ messages in thread
From: Morten Brørup @ 2024-10-26  3:18 UTC (permalink / raw)
  To: Dmitry Kozlyuk, dev; +Cc: Anatoly Burakov, Lewis Donzis, Dmitry Kozlyuk

For the series,
Acked-by: Morten Brørup <mb@smartsharesystems.com>


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] contigmem: support including mapped buffers in core dump
  2024-10-25 20:26   ` [PATCH v2 1/2] contigmem: support including mapped buffers " Dmitry Kozlyuk
@ 2024-10-26 11:43     ` Lewis Donzis
  2024-10-28 13:26       ` Dmitry Kozlyuk
  2024-11-19 15:43     ` Bruce Richardson
  1 sibling, 1 reply; 21+ messages in thread
From: Lewis Donzis @ 2024-10-26 11:43 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: dev, anatoly burakov

Is the extra control necessary, i.e., why not just always do this and let the EAL option control whether the pages get dumped?

----- On Oct 25, 2024, at 3:26 PM, Dmitry Kozlyuk dmitry.kozliuk@gmail.com wrote:

> It was impossible to include mapped contigmem buffers in core dump.
> Add hw.contigmem.coredump_enable tunable to control the inclusion.
> 
> Mappings backed by OBJ_FICTITIOUS are excluded from core dump.
> While contigmem is a device and its OBJT_DEVICE mappings get this flag,
> they are actually backed by memory.
> Clear OBJ_FICTITIOUS mapping bit to include them in core dump.
> Do this only if hw.contigmem.coredump_enable=1.
> 
> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
> ---
> doc/guides/freebsd_gsg/build_dpdk.rst | 27 ++++++++++++++++++++-------
> kernel/freebsd/contigmem/contigmem.c  |  8 ++++++++
> 2 files changed, 28 insertions(+), 7 deletions(-)
> 
> diff --git a/doc/guides/freebsd_gsg/build_dpdk.rst
> b/doc/guides/freebsd_gsg/build_dpdk.rst
> index 86e8e5a805..387ec52820 100644
> --- a/doc/guides/freebsd_gsg/build_dpdk.rst
> +++ b/doc/guides/freebsd_gsg/build_dpdk.rst
> @@ -86,6 +86,22 @@ module loading using::
>     kenv hw.contigmem.num_buffers=n
>     kenv hw.contigmem.buffer_size=m
> 
> +Where n is the number of blocks and m is the size in bytes of each area of
> +contiguous memory.  A default of two buffers of size 1073741824 bytes (1
> Gigabyte)
> +each is set during module load if they are not specified in the environment.
> +
> +Buffers are excluded from core dump by default.
> +Mapped buffers can be included in core dump using the following tunable:
> +
> +    hw.contigmem.coredump_enable=1
> +
> +.. note::
> +
> +    Including contigmem buffers in core dump file increases its size,
> +    which may fill the storage or overload the transport.
> +    Buffers typically hold data processed by the application,
> +    like network packets, which may contain sensitive information.
> +
> The kernel environment variables can also be specified during boot by placing
> the
> following in ``/boot/loader.conf``:
> 
> @@ -93,15 +109,12 @@ following in ``/boot/loader.conf``:
> 
>     hw.contigmem.num_buffers=n
>     hw.contigmem.buffer_size=m
> +    hw.contigmem.coredump_enable=1
> 
> The variables can be inspected using the following command::
> 
>     sysctl -a hw.contigmem
> 
> -Where n is the number of blocks and m is the size in bytes of each area of
> -contiguous memory.  A default of two buffers of size 1073741824 bytes (1
> Gigabyte)
> -each is set during module load if they are not specified in the environment.
> -
> The module can then be loaded using kldload::
> 
>     kldload contigmem
> @@ -115,13 +128,13 @@ up time.  This can be achieved by placing lines similar to
> the following into
> 
>     hw.contigmem.num_buffers=1
>     hw.contigmem.buffer_size=1073741824
> +    hw.contigmem.coredump_enable=1
>     contigmem_load="YES"
> 
> .. note::
> 
> -    The contigmem_load directive should be placed after any definitions of
> -    ``hw.contigmem.num_buffers`` and ``hw.contigmem.buffer_size`` if the
> default values
> -    are not to be used.
> +    The ``contigmem_load`` directive should be placed after any definitions
> +    of ``hw.contigmem.*`` if the default values are not to be used.
> 
> An error such as::
> 
> diff --git a/kernel/freebsd/contigmem/contigmem.c
> b/kernel/freebsd/contigmem/contigmem.c
> index 7dd87599d9..591e46dded 100644
> --- a/kernel/freebsd/contigmem/contigmem.c
> +++ b/kernel/freebsd/contigmem/contigmem.c
> @@ -51,6 +51,7 @@ static d_close_t        contigmem_close;
> 
> static int              contigmem_num_buffers = RTE_CONTIGMEM_DEFAULT_NUM_BUFS;
> static int64_t          contigmem_buffer_size = RTE_CONTIGMEM_DEFAULT_BUF_SIZE;
> +static bool             contigmem_coredump_enable;
> 
> static eventhandler_tag contigmem_eh_tag;
> static struct contigmem_buffer contigmem_buffers[RTE_CONTIGMEM_MAX_NUM_BUFS];
> @@ -59,6 +60,7 @@ static int              contigmem_refcnt;
> 
> TUNABLE_INT("hw.contigmem.num_buffers", &contigmem_num_buffers);
> TUNABLE_QUAD("hw.contigmem.buffer_size", &contigmem_buffer_size);
> +TUNABLE_BOOL("hw.contigmem.coredump_enable", &contigmem_coredump_enable);
> 
> static SYSCTL_NODE(_hw, OID_AUTO, contigmem, CTLFLAG_RD, 0, "contigmem");
> 
> @@ -68,6 +70,8 @@ SYSCTL_QUAD(_hw_contigmem, OID_AUTO, buffer_size, CTLFLAG_RD,
> 	&contigmem_buffer_size, 0, "Size of each contiguous buffer");
> SYSCTL_INT(_hw_contigmem, OID_AUTO, num_references, CTLFLAG_RD,
> 	&contigmem_refcnt, 0, "Number of references to contigmem");
> +SYSCTL_BOOL(_hw_contigmem, OID_AUTO, coredump_enable, CTLFLAG_RD,
> +	&contigmem_coredump_enable, 0, "Include mapped buffers in core dump");
> 
> static SYSCTL_NODE(_hw_contigmem, OID_AUTO, physaddr, CTLFLAG_RD, 0,
> 	"physaddr");
> @@ -357,5 +361,9 @@ contigmem_mmap_single(struct cdev *cdev, vm_ooffset_t
> *offset, vm_size_t size,
> 	*obj = cdev_pager_allocate(vmh, OBJT_DEVICE, &contigmem_cdev_pager_ops,
> 			size, nprot, *offset, curthread->td_ucred);
> 
> +	/* Mappings backed by OBJ_FICTITIOUS are excluded from core dump. */
> +	if (contigmem_coredump_enable)
> +		(*obj)->flags &= ~OBJ_FICTITIOUS;
> +
> 	return 0;
> }
> --
> 2.38.4

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] contigmem: support including mapped buffers in core dump
  2024-10-26 11:43     ` Lewis Donzis
@ 2024-10-28 13:26       ` Dmitry Kozlyuk
  2024-10-28 13:35         ` Lewis Donzis
  2024-11-19 15:48         ` Bruce Richardson
  0 siblings, 2 replies; 21+ messages in thread
From: Dmitry Kozlyuk @ 2024-10-28 13:26 UTC (permalink / raw)
  To: Lewis Donzis; +Cc: dev, Anatoly Burakov, Stephen Hemminger, Morten Brørup

2024-10-26 06:43 (UTC-0500), Lewis Donzis:
> Is the extra control necessary, i.e., why not just always do this and let
> the EAL option control whether the pages get dumped?

I've been evaluating your suggestion and see no downsides,
except contigmem default behavior change, but does it have non-DPDK users?
If no one objects, I'll prepare v3 doing the following:
1) everything from v2,
2) except always mark contigmem buffers as dumpable,
3) add --dump-mapped back and make DPDK disable dumping by default.
As a result, in order to include mapped memory in coredump:
* FreeBSD will require only "--dump-mapped";
* Linux will require both "coredump_filter" setup and "--dump-mapped".

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] contigmem: support including mapped buffers in core dump
  2024-10-28 13:26       ` Dmitry Kozlyuk
@ 2024-10-28 13:35         ` Lewis Donzis
  2024-11-19 15:48         ` Bruce Richardson
  1 sibling, 0 replies; 21+ messages in thread
From: Lewis Donzis @ 2024-10-28 13:35 UTC (permalink / raw)
  To: Dmitry Kozlyuk
  Cc: dev, anatoly burakov, Stephen Hemminger, Morten Brørup

It seems unlikely that there are other users of contigmem on FreeBSD, especially since it's not necessary for other applications.

On FreeBSD, use of large and huge pages is automatic; you just call mmap() with a large size and it automatically tries to use the largest physical pages possible.  So the only thing contigmem is doing for us on FreeBSD is providing the physical address and, of course, making it consistent with Linux.

At least, that's my understanding.


----- On Oct 28, 2024, at 8:26 AM, Dmitry Kozlyuk dmitry.kozliuk@gmail.com wrote:

> 2024-10-26 06:43 (UTC-0500), Lewis Donzis:
>> Is the extra control necessary, i.e., why not just always do this and let
>> the EAL option control whether the pages get dumped?
> 
> I've been evaluating your suggestion and see no downsides,
> except contigmem default behavior change, but does it have non-DPDK users?
> If no one objects, I'll prepare v3 doing the following:
> 1) everything from v2,
> 2) except always mark contigmem buffers as dumpable,
> 3) add --dump-mapped back and make DPDK disable dumping by default.
> As a result, in order to include mapped memory in coredump:
> * FreeBSD will require only "--dump-mapped";
> * Linux will require both "coredump_filter" setup and "--dump-mapped".

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] contigmem: support including mapped buffers in core dump
  2024-10-25 20:26   ` [PATCH v2 1/2] contigmem: support including mapped buffers " Dmitry Kozlyuk
  2024-10-26 11:43     ` Lewis Donzis
@ 2024-11-19 15:43     ` Bruce Richardson
  1 sibling, 0 replies; 21+ messages in thread
From: Bruce Richardson @ 2024-11-19 15:43 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: dev, Anatoly Burakov, Lewis Donzis

On Fri, Oct 25, 2024 at 11:26:14PM +0300, Dmitry Kozlyuk wrote:
> It was impossible to include mapped contigmem buffers in core dump.
> Add hw.contigmem.coredump_enable tunable to control the inclusion.
> 
> Mappings backed by OBJ_FICTITIOUS are excluded from core dump.
> While contigmem is a device and its OBJT_DEVICE mappings get this flag,
> they are actually backed by memory.
> Clear OBJ_FICTITIOUS mapping bit to include them in core dump.
> Do this only if hw.contigmem.coredump_enable=1.
> 
> Signed-off-by: Dmitry Kozlyuk <dmitry.kozliuk@gmail.com>
> ---
Patch LGTM

Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] contigmem: support including mapped buffers in core dump
  2024-10-28 13:26       ` Dmitry Kozlyuk
  2024-10-28 13:35         ` Lewis Donzis
@ 2024-11-19 15:48         ` Bruce Richardson
  2024-11-19 18:27           ` Dmitry Kozlyuk
  1 sibling, 1 reply; 21+ messages in thread
From: Bruce Richardson @ 2024-11-19 15:48 UTC (permalink / raw)
  To: Dmitry Kozlyuk
  Cc: Lewis Donzis, dev, Anatoly Burakov, Stephen Hemminger,
	Morten Brørup

On Mon, Oct 28, 2024 at 04:26:06PM +0300, Dmitry Kozlyuk wrote:
> 2024-10-26 06:43 (UTC-0500), Lewis Donzis:
> > Is the extra control necessary, i.e., why not just always do this and let
> > the EAL option control whether the pages get dumped?
> 
> I've been evaluating your suggestion and see no downsides,
> except contigmem default behavior change, but does it have non-DPDK users?
> If no one objects, I'll prepare v3 doing the following:
> 1) everything from v2,
> 2) except always mark contigmem buffers as dumpable,
> 3) add --dump-mapped back and make DPDK disable dumping by default.
> As a result, in order to include mapped memory in coredump:
> * FreeBSD will require only "--dump-mapped";
> * Linux will require both "coredump_filter" setup and "--dump-mapped".

That all seems very reasonable, but it is late in the release cycle at this
point. Do you think it would be good to just take the v2 as is for 24.11,
and then change things thereafter to have more dynamic/runtime control of
the dump-setting in 25.03? 

From compatibility viewpoint, I don't think there are any non-DPDK users,
so if behaviour with EAL doesn't change, I don't think having the kernel
module default be different matters as we go from one release to the next.

/Bruce

^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] contigmem: support including mapped buffers in core dump
  2024-11-19 15:48         ` Bruce Richardson
@ 2024-11-19 18:27           ` Dmitry Kozlyuk
  2024-11-19 23:09             ` Thomas Monjalon
  0 siblings, 1 reply; 21+ messages in thread
From: Dmitry Kozlyuk @ 2024-11-19 18:27 UTC (permalink / raw)
  To: Bruce Richardson
  Cc: Lewis Donzis, dev, Anatoly Burakov, Stephen Hemminger,
	Morten Brørup

2024-11-19 15:48 (UTC+0000), Bruce Richardson:
> On Mon, Oct 28, 2024 at 04:26:06PM +0300, Dmitry Kozlyuk wrote:
> > 2024-10-26 06:43 (UTC-0500), Lewis Donzis:  
> > > Is the extra control necessary, i.e., why not just always do this and let
> > > the EAL option control whether the pages get dumped?  
> > 
> > I've been evaluating your suggestion and see no downsides,
> > except contigmem default behavior change, but does it have non-DPDK users?
> > If no one objects, I'll prepare v3 doing the following:
> > 1) everything from v2,
> > 2) except always mark contigmem buffers as dumpable,
> > 3) add --dump-mapped back and make DPDK disable dumping by default.
> > As a result, in order to include mapped memory in coredump:
> > * FreeBSD will require only "--dump-mapped";
> > * Linux will require both "coredump_filter" setup and "--dump-mapped".  
> 
> That all seems very reasonable, but it is late in the release cycle at this
> point. Do you think it would be good to just take the v2 as is for 24.11,
> and then change things thereafter to have more dynamic/runtime control of
> the dump-setting in 25.03? 
>
> From compatibility viewpoint, I don't think there are any non-DPDK users,
> so if behaviour with EAL doesn't change, I don't think having the kernel
> module default be different matters as we go from one release to the next.

If you think it is worth adding to 24.11, then by all means.
V2 works as intended and it solves Lewis' problem.
I wasn't rushing with v3 just because I knew that
new EAL options can only target 25.03 since 24.11 RC was out already.


^ permalink raw reply	[flat|nested] 21+ messages in thread

* Re: [PATCH v2 1/2] contigmem: support including mapped buffers in core dump
  2024-11-19 18:27           ` Dmitry Kozlyuk
@ 2024-11-19 23:09             ` Thomas Monjalon
  0 siblings, 0 replies; 21+ messages in thread
From: Thomas Monjalon @ 2024-11-19 23:09 UTC (permalink / raw)
  To: Dmitry Kozlyuk
  Cc: Bruce Richardson, dev, Lewis Donzis, dev, Anatoly Burakov,
	Stephen Hemminger, Morten Brørup

19/11/2024 19:27, Dmitry Kozlyuk:
> 2024-11-19 15:48 (UTC+0000), Bruce Richardson:
> > On Mon, Oct 28, 2024 at 04:26:06PM +0300, Dmitry Kozlyuk wrote:
> > > 2024-10-26 06:43 (UTC-0500), Lewis Donzis:  
> > > > Is the extra control necessary, i.e., why not just always do this and let
> > > > the EAL option control whether the pages get dumped?  
> > > 
> > > I've been evaluating your suggestion and see no downsides,
> > > except contigmem default behavior change, but does it have non-DPDK users?
> > > If no one objects, I'll prepare v3 doing the following:
> > > 1) everything from v2,
> > > 2) except always mark contigmem buffers as dumpable,
> > > 3) add --dump-mapped back and make DPDK disable dumping by default.
> > > As a result, in order to include mapped memory in coredump:
> > > * FreeBSD will require only "--dump-mapped";
> > > * Linux will require both "coredump_filter" setup and "--dump-mapped".  
> > 
> > That all seems very reasonable, but it is late in the release cycle at this
> > point. Do you think it would be good to just take the v2 as is for 24.11,
> > and then change things thereafter to have more dynamic/runtime control of
> > the dump-setting in 25.03? 
> >
> > From compatibility viewpoint, I don't think there are any non-DPDK users,
> > so if behaviour with EAL doesn't change, I don't think having the kernel
> > module default be different matters as we go from one release to the next.
> 
> If you think it is worth adding to 24.11, then by all means.
> V2 works as intended and it solves Lewis' problem.
> I wasn't rushing with v3 just because I knew that
> new EAL options can only target 25.03 since 24.11 RC was out already.

Applied, thanks.



^ permalink raw reply	[flat|nested] 21+ messages in thread

end of thread, other threads:[~2024-11-19 23:09 UTC | newest]

Thread overview: 21+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2024-10-23 23:18 [PATCH] eal: support including mapped memory in core dump Dmitry Kozlyuk
2024-10-24  2:19 ` Stephen Hemminger
2024-10-24  2:31 ` Stephen Hemminger
2024-10-24  7:07   ` Morten Brørup
2024-10-24  7:22 ` Morten Brørup
2024-10-24  8:25   ` Dmitry Kozlyuk
2024-10-24 12:55 ` Lewis Donzis
2024-10-24 16:38 ` Stephen Hemminger
2024-10-24 20:54   ` Dmitry Kozlyuk
2024-10-25  0:22     ` Dmitry Kozlyuk
2024-10-25 20:26 ` [PATCH v2 0/2] Hugepage inclusion " Dmitry Kozlyuk
2024-10-25 20:26   ` [PATCH v2 1/2] contigmem: support including mapped buffers " Dmitry Kozlyuk
2024-10-26 11:43     ` Lewis Donzis
2024-10-28 13:26       ` Dmitry Kozlyuk
2024-10-28 13:35         ` Lewis Donzis
2024-11-19 15:48         ` Bruce Richardson
2024-11-19 18:27           ` Dmitry Kozlyuk
2024-11-19 23:09             ` Thomas Monjalon
2024-11-19 15:43     ` Bruce Richardson
2024-10-25 20:26   ` [PATCH v2 2/2] doc: add instruction for including hugepages " Dmitry Kozlyuk
2024-10-26  3:18   ` [PATCH v2 0/2] Hugepage inclusion " Morten Brørup

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).