DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [RFC PATCH] ci: catch coredumps
@ 2021-01-11 10:01 David Marchand
  2021-01-11 13:17 ` Aaron Conole
  2021-01-25 15:05 ` [dpdk-dev] [PATCH v2] " David Marchand
  0 siblings, 2 replies; 6+ messages in thread
From: David Marchand @ 2021-01-11 10:01 UTC (permalink / raw)
  To: dev; +Cc: Aaron Conole, Michael Santana, Olivier Matz

Parts of the unit tests code rely on forked/secondary processes
(expectedly) failing.
A crash in those situations could be missed so add a check on coredumps
presence after unit tests have run.

In some situations (like explicitly call rte_panic), coredump generation
must be disabled to avoid false positives.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Sending as a RFC, as this is a "nice to have" patch I had in store for
some time, but I did not see the actual need so far.

We could attach the coredumps in GHA result, but it would be hardly usable
without attaching all generated binaries...
Opinions?

---
 .ci/linux-build.sh    |  8 ++++++++
 app/test/test_debug.c | 11 +++++++++--
 app/test/test_mbuf.c  |  9 ++++++++-
 3 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index d2c821adf3..d00a5804b4 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -57,7 +57,11 @@ meson build --werror $OPTS
 ninja -C build
 
 if [ "$AARCH64" != "true" ]; then
+    ulimit -c unlimited
+    sudo sysctl -w kernel.core_pattern=/tmp/dpdk-core.%e.%p
+
     devtools/test-null.sh
+    ! ls /tmp/dpdk-core.*.* 2>/dev/null
 fi
 
 if [ "$ABI_CHECKS" = "true" ]; then
@@ -102,5 +106,9 @@ if [ "$ABI_CHECKS" = "true" ]; then
 fi
 
 if [ "$RUN_TESTS" = "true" ]; then
+    ulimit -c unlimited
+    sudo sysctl -w kernel.core_pattern=/tmp/dpdk-core.%e.%p
+
     sudo meson test -C build --suite fast-tests -t 3
+    ! ls /tmp/dpdk-core.*.* 2>/dev/null
 fi
diff --git a/app/test/test_debug.c b/app/test/test_debug.c
index 834a7386f5..23b24db177 100644
--- a/app/test/test_debug.c
+++ b/app/test/test_debug.c
@@ -4,6 +4,8 @@
 
 #include <stdio.h>
 #include <stdint.h>
+#include <sys/resource.h>
+#include <sys/time.h>
 #include <sys/wait.h>
 #include <unistd.h>
 
@@ -28,9 +30,14 @@ test_panic(void)
 
 	pid = fork();
 
-	if (pid == 0)
+	if (pid == 0) {
+		struct rlimit rl;
+
+		/* No need to generate a coredump when panicking. */
+		rl.rlim_cur = rl.rlim_max = 0;
+		setrlimit(RLIMIT_CORE, &rl);
 		rte_panic("Test Debug\n");
-	else if (pid < 0){
+	} else if (pid < 0) {
 		printf("Fork Failed\n");
 		return -1;
 	}
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index a40f7d4883..47a7b197d7 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1174,6 +1174,8 @@ test_refcnt_mbuf(void)
 }
 
 #include <unistd.h>
+#include <sys/resource.h>
+#include <sys/time.h>
 #include <sys/wait.h>
 
 /* use fork() to test mbuf errors panic */
@@ -1186,9 +1188,14 @@ verify_mbuf_check_panics(struct rte_mbuf *buf)
 	pid = fork();
 
 	if (pid == 0) {
+		struct rlimit rl;
+
+		/* No need to generate a coredump when panicking. */
+		rl.rlim_cur = rl.rlim_max = 0;
+		setrlimit(RLIMIT_CORE, &rl);
 		rte_mbuf_sanity_check(buf, 1); /* should panic */
 		exit(0);  /* return normally if it doesn't panic */
-	} else if (pid < 0){
+	} else if (pid < 0) {
 		printf("Fork Failed\n");
 		return -1;
 	}
-- 
2.23.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [RFC PATCH] ci: catch coredumps
  2021-01-11 10:01 [dpdk-dev] [RFC PATCH] ci: catch coredumps David Marchand
@ 2021-01-11 13:17 ` Aaron Conole
  2021-01-25 15:05 ` [dpdk-dev] [PATCH v2] " David Marchand
  1 sibling, 0 replies; 6+ messages in thread
From: Aaron Conole @ 2021-01-11 13:17 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Michael Santana, Olivier Matz

David Marchand <david.marchand@redhat.com> writes:

> Parts of the unit tests code rely on forked/secondary processes
> (expectedly) failing.
> A crash in those situations could be missed so add a check on coredumps
> presence after unit tests have run.
>
> In some situations (like explicitly call rte_panic), coredump generation
> must be disabled to avoid false positives.
>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> Sending as a RFC, as this is a "nice to have" patch I had in store for
> some time, but I did not see the actual need so far.
>
> We could attach the coredumps in GHA result, but it would be hardly usable
> without attaching all generated binaries...
> Opinions?

I think it's a good start.  We may not need to attach the binaries at
all - if we have access to gdb, we can run a script to just run simple
commands like 'thread apply all bt' and maybe some info commands.  That
can all be stuffed into the logs.  Usually a source level backtrace is
enough to work through what happened.

> ---
>  .ci/linux-build.sh    |  8 ++++++++
>  app/test/test_debug.c | 11 +++++++++--
>  app/test/test_mbuf.c  |  9 ++++++++-
>  3 files changed, 25 insertions(+), 3 deletions(-)
>
> diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
> index d2c821adf3..d00a5804b4 100755
> --- a/.ci/linux-build.sh
> +++ b/.ci/linux-build.sh
> @@ -57,7 +57,11 @@ meson build --werror $OPTS
>  ninja -C build
>  
>  if [ "$AARCH64" != "true" ]; then
> +    ulimit -c unlimited
> +    sudo sysctl -w kernel.core_pattern=/tmp/dpdk-core.%e.%p
> +
>      devtools/test-null.sh
> +    ! ls /tmp/dpdk-core.*.* 2>/dev/null
>  fi
>  
>  if [ "$ABI_CHECKS" = "true" ]; then
> @@ -102,5 +106,9 @@ if [ "$ABI_CHECKS" = "true" ]; then
>  fi
>  
>  if [ "$RUN_TESTS" = "true" ]; then
> +    ulimit -c unlimited
> +    sudo sysctl -w kernel.core_pattern=/tmp/dpdk-core.%e.%p
> +
>      sudo meson test -C build --suite fast-tests -t 3
> +    ! ls /tmp/dpdk-core.*.* 2>/dev/null
>  fi
> diff --git a/app/test/test_debug.c b/app/test/test_debug.c
> index 834a7386f5..23b24db177 100644
> --- a/app/test/test_debug.c
> +++ b/app/test/test_debug.c
> @@ -4,6 +4,8 @@
>  
>  #include <stdio.h>
>  #include <stdint.h>
> +#include <sys/resource.h>
> +#include <sys/time.h>
>  #include <sys/wait.h>
>  #include <unistd.h>
>  
> @@ -28,9 +30,14 @@ test_panic(void)
>  
>  	pid = fork();
>  
> -	if (pid == 0)
> +	if (pid == 0) {
> +		struct rlimit rl;
> +
> +		/* No need to generate a coredump when panicking. */
> +		rl.rlim_cur = rl.rlim_max = 0;
> +		setrlimit(RLIMIT_CORE, &rl);
>  		rte_panic("Test Debug\n");
> -	else if (pid < 0){
> +	} else if (pid < 0) {
>  		printf("Fork Failed\n");
>  		return -1;
>  	}
> diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
> index a40f7d4883..47a7b197d7 100644
> --- a/app/test/test_mbuf.c
> +++ b/app/test/test_mbuf.c
> @@ -1174,6 +1174,8 @@ test_refcnt_mbuf(void)
>  }
>  
>  #include <unistd.h>
> +#include <sys/resource.h>
> +#include <sys/time.h>
>  #include <sys/wait.h>
>  
>  /* use fork() to test mbuf errors panic */
> @@ -1186,9 +1188,14 @@ verify_mbuf_check_panics(struct rte_mbuf *buf)
>  	pid = fork();
>  
>  	if (pid == 0) {
> +		struct rlimit rl;
> +
> +		/* No need to generate a coredump when panicking. */
> +		rl.rlim_cur = rl.rlim_max = 0;
> +		setrlimit(RLIMIT_CORE, &rl);
>  		rte_mbuf_sanity_check(buf, 1); /* should panic */
>  		exit(0);  /* return normally if it doesn't panic */
> -	} else if (pid < 0){
> +	} else if (pid < 0) {
>  		printf("Fork Failed\n");
>  		return -1;
>  	}


^ permalink raw reply	[flat|nested] 6+ messages in thread

* [dpdk-dev] [PATCH v2] ci: catch coredumps
  2021-01-11 10:01 [dpdk-dev] [RFC PATCH] ci: catch coredumps David Marchand
  2021-01-11 13:17 ` Aaron Conole
@ 2021-01-25 15:05 ` David Marchand
  2021-03-02 16:16   ` Luca Boccassi
                     ` (2 more replies)
  1 sibling, 3 replies; 6+ messages in thread
From: David Marchand @ 2021-01-25 15:05 UTC (permalink / raw)
  To: dev; +Cc: Aaron Conole, Michael Santana, Olivier Matz

Parts of the unit tests code rely on forked/secondary processes
(expectedly) failing.
A crash in those situations could be missed so add a check on coredumps
presence after unit tests have run.
When unit tests fail, it can also help checking for coredumps as it
could give more insights on what happened.

In some situations (like explicit call to rte_panic), coredump generation
must be disabled to avoid false positives.

Signed-off-by: David Marchand <david.marchand@redhat.com>
---
Changelog since RFC v1
- removed RFC,
- pushed coredumps into gdb for in-situ analysis,
- gdb presence is used to enable the check. Travis config is left
  untouched for now,

---
 .ci/linux-build.sh          | 37 ++++++++++++++++++++++++++++++++++---
 .github/workflows/build.yml |  4 ++++
 app/test/test_debug.c       | 11 +++++++++--
 app/test/test_mbuf.c        |  9 ++++++++-
 4 files changed, 55 insertions(+), 6 deletions(-)

diff --git a/.ci/linux-build.sh b/.ci/linux-build.sh
index d2c821adf3..26c30a2301 100755
--- a/.ci/linux-build.sh
+++ b/.ci/linux-build.sh
@@ -4,7 +4,10 @@ on_error() {
     if [ $? = 0 ]; then
         exit
     fi
-    FILES_TO_PRINT="build/meson-logs/testlog.txt build/.ninja_log build/meson-logs/meson-log.txt"
+    FILES_TO_PRINT="build/meson-logs/testlog.txt"
+    FILES_TO_PRINT="$FILES_TO_PRINT build/.ninja_log"
+    FILES_TO_PRINT="$FILES_TO_PRINT build/meson-logs/meson-log.txt"
+    FILES_TO_PRINT="$FILES_TO_PRINT build/gdb.log"
 
     for pr_file in $FILES_TO_PRINT; do
         if [ -e "$pr_file" ]; then
@@ -30,6 +33,26 @@ install_libabigail() {
     rm ${version}.tar.gz
 }
 
+configure_coredump() {
+    # No point in configuring coredump without gdb
+    which gdb >/dev/null || return 0
+    ulimit -c unlimited
+    sudo sysctl -w kernel.core_pattern=/tmp/dpdk-core.%e.%p
+}
+
+catch_coredump() {
+    ls /tmp/dpdk-core.*.* 2>/dev/null || return 0
+    for core in /tmp/dpdk-core.*.*; do
+        binary=$(sudo readelf -n $core |grep $(pwd)/build/ 2>/dev/null |head -n1)
+        [ -x $binary ] || binary=
+        sudo gdb $binary -c $core \
+            -ex 'info threads' \
+            -ex 'thread apply all bt full' \
+            -ex 'quit'
+    done |tee -a build/gdb.log
+    return 1
+}
+
 if [ "$AARCH64" = "true" ]; then
     # convert the arch specifier
     OPTS="$OPTS --cross-file config/arm/arm64_armv8_linux_gcc"
@@ -57,7 +80,11 @@ meson build --werror $OPTS
 ninja -C build
 
 if [ "$AARCH64" != "true" ]; then
-    devtools/test-null.sh
+    failed=
+    configure_coredump
+    devtools/test-null.sh || failed="true"
+    catch_coredump
+    [ "$failed" != "true" ]
 fi
 
 if [ "$ABI_CHECKS" = "true" ]; then
@@ -102,5 +129,9 @@ if [ "$ABI_CHECKS" = "true" ]; then
 fi
 
 if [ "$RUN_TESTS" = "true" ]; then
-    sudo meson test -C build --suite fast-tests -t 3
+    failed=
+    configure_coredump
+    sudo meson test -C build --suite fast-tests -t 3 || failed="true"
+    catch_coredump
+    [ "$failed" != "true" ]
 fi
diff --git a/.github/workflows/build.yml b/.github/workflows/build.yml
index a5b579adda..786525e5a3 100644
--- a/.github/workflows/build.yml
+++ b/.github/workflows/build.yml
@@ -104,6 +104,9 @@ jobs:
       if: env.AARCH64 == 'true'
       run: sudo apt install -y gcc-aarch64-linux-gnu libc6-dev-arm64-cross
         pkg-config-aarch64-linux-gnu
+    - name: Install test tools packages
+      if: env.AARCH64 != 'true' || env.RUN_TESTS == 'true'
+      run: sudo apt install -y gdb
     - name: Install doc generation packages
       if: env.BUILD_DOCS == 'true'
       run: sudo apt install -y doxygen graphviz python3-sphinx
@@ -124,3 +127,4 @@ jobs:
           build/meson-logs/testlog.txt
           build/.ninja_log
           build/meson-logs/meson-log.txt
+          build/gdb.log
diff --git a/app/test/test_debug.c b/app/test/test_debug.c
index 834a7386f5..23b24db177 100644
--- a/app/test/test_debug.c
+++ b/app/test/test_debug.c
@@ -4,6 +4,8 @@
 
 #include <stdio.h>
 #include <stdint.h>
+#include <sys/resource.h>
+#include <sys/time.h>
 #include <sys/wait.h>
 #include <unistd.h>
 
@@ -28,9 +30,14 @@ test_panic(void)
 
 	pid = fork();
 
-	if (pid == 0)
+	if (pid == 0) {
+		struct rlimit rl;
+
+		/* No need to generate a coredump when panicking. */
+		rl.rlim_cur = rl.rlim_max = 0;
+		setrlimit(RLIMIT_CORE, &rl);
 		rte_panic("Test Debug\n");
-	else if (pid < 0){
+	} else if (pid < 0) {
 		printf("Fork Failed\n");
 		return -1;
 	}
diff --git a/app/test/test_mbuf.c b/app/test/test_mbuf.c
index a40f7d4883..47a7b197d7 100644
--- a/app/test/test_mbuf.c
+++ b/app/test/test_mbuf.c
@@ -1174,6 +1174,8 @@ test_refcnt_mbuf(void)
 }
 
 #include <unistd.h>
+#include <sys/resource.h>
+#include <sys/time.h>
 #include <sys/wait.h>
 
 /* use fork() to test mbuf errors panic */
@@ -1186,9 +1188,14 @@ verify_mbuf_check_panics(struct rte_mbuf *buf)
 	pid = fork();
 
 	if (pid == 0) {
+		struct rlimit rl;
+
+		/* No need to generate a coredump when panicking. */
+		rl.rlim_cur = rl.rlim_max = 0;
+		setrlimit(RLIMIT_CORE, &rl);
 		rte_mbuf_sanity_check(buf, 1); /* should panic */
 		exit(0);  /* return normally if it doesn't panic */
-	} else if (pid < 0){
+	} else if (pid < 0) {
 		printf("Fork Failed\n");
 		return -1;
 	}
-- 
2.23.0


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [PATCH v2] ci: catch coredumps
  2021-01-25 15:05 ` [dpdk-dev] [PATCH v2] " David Marchand
@ 2021-03-02 16:16   ` Luca Boccassi
  2021-03-02 21:17   ` Aaron Conole
  2021-03-03  9:03   ` David Marchand
  2 siblings, 0 replies; 6+ messages in thread
From: Luca Boccassi @ 2021-03-02 16:16 UTC (permalink / raw)
  To: David Marchand, dev; +Cc: Aaron Conole, Michael Santana, Olivier Matz

On Mon, 2021-01-25 at 16:05 +0100, David Marchand wrote:
> Parts of the unit tests code rely on forked/secondary processes
> (expectedly) failing.
> A crash in those situations could be missed so add a check on coredumps
> presence after unit tests have run.
> When unit tests fail, it can also help checking for coredumps as it
> could give more insights on what happened.
> 
> In some situations (like explicit call to rte_panic), coredump generation
> must be disabled to avoid false positives.
> 
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---
> Changelog since RFC v1
> - removed RFC,
> - pushed coredumps into gdb for in-situ analysis,
> - gdb presence is used to enable the check. Travis config is left
>   untouched for now,
> 
> ---
>  .ci/linux-build.sh          | 37 ++++++++++++++++++++++++++++++++++---
>  .github/workflows/build.yml |  4 ++++
>  app/test/test_debug.c       | 11 +++++++++--
>  app/test/test_mbuf.c        |  9 ++++++++-
>  4 files changed, 55 insertions(+), 6 deletions(-)

Acked-by: Luca Boccassi <bluca@debian.org>

This fixes an issue for me, with running the debug_autotest under a
reproducible build CI environment.

-- 
Kind regards,
Luca Boccassi

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [PATCH v2] ci: catch coredumps
  2021-01-25 15:05 ` [dpdk-dev] [PATCH v2] " David Marchand
  2021-03-02 16:16   ` Luca Boccassi
@ 2021-03-02 21:17   ` Aaron Conole
  2021-03-03  9:03   ` David Marchand
  2 siblings, 0 replies; 6+ messages in thread
From: Aaron Conole @ 2021-03-02 21:17 UTC (permalink / raw)
  To: David Marchand; +Cc: dev, Michael Santana, Olivier Matz

David Marchand <david.marchand@redhat.com> writes:

> Parts of the unit tests code rely on forked/secondary processes
> (expectedly) failing.
> A crash in those situations could be missed so add a check on coredumps
> presence after unit tests have run.
> When unit tests fail, it can also help checking for coredumps as it
> could give more insights on what happened.
>
> In some situations (like explicit call to rte_panic), coredump generation
> must be disabled to avoid false positives.
>
> Signed-off-by: David Marchand <david.marchand@redhat.com>
> ---

I had this in my backlog and accidentally marked it 'visited' - sorry
about that.

Acked-by: Aaron Conole <aconole@redhat.com>


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [dpdk-dev] [PATCH v2] ci: catch coredumps
  2021-01-25 15:05 ` [dpdk-dev] [PATCH v2] " David Marchand
  2021-03-02 16:16   ` Luca Boccassi
  2021-03-02 21:17   ` Aaron Conole
@ 2021-03-03  9:03   ` David Marchand
  2 siblings, 0 replies; 6+ messages in thread
From: David Marchand @ 2021-03-03  9:03 UTC (permalink / raw)
  To: dev; +Cc: Aaron Conole, Michael Santana, Olivier Matz, dpdk stable

On Mon, Jan 25, 2021 at 4:06 PM David Marchand
<david.marchand@redhat.com> wrote:
>
> Parts of the unit tests code rely on forked/secondary processes
> (expectedly) failing.
> A crash in those situations could be missed so add a check on coredumps
> presence after unit tests have run.
> When unit tests fail, it can also help checking for coredumps as it
> could give more insights on what happened.
>
> In some situations (like explicit call to rte_panic), coredump generation
> must be disabled to avoid false positives.
>

As suggested by Luca, marking for backport.
Cc: stable@dpdk.org

> Signed-off-by: David Marchand <david.marchand@redhat.com>
Acked-by: Luca Boccassi <bluca@debian.org>
Acked-by: Aaron Conole <aconole@redhat.com>


-- 
David Marchand


^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-03-03  9:04 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-01-11 10:01 [dpdk-dev] [RFC PATCH] ci: catch coredumps David Marchand
2021-01-11 13:17 ` Aaron Conole
2021-01-25 15:05 ` [dpdk-dev] [PATCH v2] " David Marchand
2021-03-02 16:16   ` Luca Boccassi
2021-03-02 21:17   ` Aaron Conole
2021-03-03  9:03   ` David Marchand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).