DPDK patches and discussions
From: Yongseok Koh <yskoh@mellanox.com>
To: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: Shahaf Shuler <shahafs@mellanox.com>,
	Olga Shern <olgas@mellanox.com>,
	Thomas Monjalon <thomasm@mellanox.com>,
	Talat Batheesh <talatb@mellanox.com>,
	 Noa Spanier <noas@mellanox.com>, dev <dev@dpdk.org>
Subject: Re: [dpdk-dev] rte_memcpy() moves data incorrectly on Ubuntu 18.04 on Intel Skylake.
Date: Wed, 12 Sep 2018 23:37:32 +0000
Message-ID: <C6817FD4-9D7A-4B47-B220-91B2661C1754@mellanox.com>
In-Reply-To: <FD73C016-5F3F-49BD-865B-50444C1AEB1E@mellanox.com>

> On Sep 12, 2018, at 1:56 PM, Yongseok Koh <yskoh@mellanox.com> wrote:
> 
> Hi, Christian
> 
> We've recently encountered a weird issue with Ubuntu 18.04 on the Skylake
> server. I can always reproduce this crash and I have narrowed it down. I
> suspect it could be a GCC issue.
> 
> 
> [1] How to reproduce
> - ConnectX-4Lx/ConnectX-5 with mlx5 PMD in DPDK 18.02.1
> - Ubuntu 18.04 on Intel Skylake server
> - gcc (Ubuntu 7.3.0-16ubuntu3) 7.3.0
> - Testpmd crashes when it starts to forward traffic. Easy to reproduce.
> - Only happens on the Skylake server.
> - DPDK 18.05 and later don't have this issue. git-bisect gives no clue.

This was because I had enabled MEMPOOL_DEBUG and MLX5_DEBUG. Since the mempool
functions and rte_memcpy() are inline functions, the code generated for them is
affected by these options. Now I can see the crash regardless of the version:
18.02, 18.05 and 18.08.

Thanks,
Yongseok.

> 
> 
> [2] Failure point
> 
> The attached patch gives some insight into why it crashes. The following is
> the output of the patch and the GDB commands.
> 
> In summary, rte_memcpy() doesn't work as expected. In __mempool_generic_put(),
> there's rte_memcpy() to move the array of objects to the lcore cache. If I run
> memcmp() right after rte_memcpy(dst, src, n), the data in dst differs from the
> data in src. It looks like some of the data got shifted by a few bytes, as you
> can see below.
> 
> 	[GDB command]
> 	$dst = 0x7ffff4e09ea8
> 	$src = 0x7fffce3fb970
> 	$n = 256
> 	x/32gx 0x7ffff4e09ea8
> 	x/32gx 0x7fffce3fb970
> 	testpmd: /home/mlnxtest/dpdk/build/include/rte_mempool.h:1140: __mempool_generic_put: Assertion `0' failed.
> 
> 	Thread 4 "lcore-slave-1" received signal SIGABRT, Aborted.
> 	[Switching to Thread 0x7fffce3ff700 (LWP 69913)]
> 	(gdb) x/32gx 0x7ffff4e09ea8
> 	0x7ffff4e09ea8: 0x00007fffaac38ec0      0x00007fffaac38500
> 	0x7ffff4e09eb8: 0x00007fffaac37b40      0x00007fffaac37180
> 	0x7ffff4e09ec8: 0x850000007fffaac3      0x7b4000007fffaac3
> 	0x7ffff4e09ed8: 0x00007fffaac35440      0x00007fffaac34a80
> 	0x7ffff4e09ee8: 0xaac3850000007fff      0xaac37b4000007fff
> 	0x7ffff4e09ef8: 0x00007fffaac32d40      0x00007fffaac32380
> 	0x7ffff4e09f08: 0x7fffaac385000000      0x7fffaac37b400000
> 	0x7ffff4e09f18: 0x00007fffaac30640      0x00007fffaac2fc80
> 	0x7ffff4e09f28: 0x00007fffaac2f2c0      0x00007fffaac2e900
> 	0x7ffff4e09f38: 0x00007fffaac2df40      0x00007fffaac2d580
> 	0x7ffff4e09f48: 0x00007fffaac2cbc0      0x00007fffaac2c200
> 	0x7ffff4e09f58: 0x00007fffaac2b840      0x00007fffaac2ae80
> 	0x7ffff4e09f68: 0x00007fffaac2a4c0      0x00007fffaac29b00
> 	0x7ffff4e09f78: 0x00007fffaac29140      0x00007fffaac28780
> 	0x7ffff4e09f88: 0x00007fffaac27dc0      0x00007fffaac27400
> 	0x7ffff4e09f98: 0x00007fffaac26a40      0x00007fffaac26080
> 	(gdb) x/32gx 0x7fffce3fb970
> 	0x7fffce3fb970: 0x00007fffaac38ec0      0x00007fffaac38500
> 	0x7fffce3fb980: 0x00007fffaac37b40      0x00007fffaac37180
> 	0x7fffce3fb990: 0x00007fffaac367c0      0x00007fffaac35e00
> 	0x7fffce3fb9a0: 0x00007fffaac35440      0x00007fffaac34a80
> 	0x7fffce3fb9b0: 0x00007fffaac340c0      0x00007fffaac33700
> 	0x7fffce3fb9c0: 0x00007fffaac32d40      0x00007fffaac32380
> 	0x7fffce3fb9d0: 0x00007fffaac319c0      0x00007fffaac31000
> 	0x7fffce3fb9e0: 0x00007fffaac30640      0x00007fffaac2fc80
> 	0x7fffce3fb9f0: 0x00007fffaac2f2c0      0x00007fffaac2e900
> 	0x7fffce3fba00: 0x00007fffaac2df40      0x00007fffaac2d580
> 	0x7fffce3fba10: 0x00007fffaac2cbc0      0x00007fffaac2c200
> 	0x7fffce3fba20: 0x00007fffaac2b840      0x00007fffaac2ae80
> 	0x7fffce3fba30: 0x00007fffaac2a4c0      0x00007fffaac29b00
> 	0x7fffce3fba40: 0x00007fffaac29140      0x00007fffaac28780
> 	0x7fffce3fba50: 0x00007fffaac27dc0      0x00007fffaac27400
> 	0x7fffce3fba60: 0x00007fffaac26a40      0x00007fffaac26080
> 
> 
> AFAIK, AVX512F support is disabled by default in DPDK as it is still
> experimental (CONFIG_RTE_ENABLE_AVX512=n). But with gcc optimization, the AVX2
> version of rte_memcpy() seems to be compiled with 512-bit instructions. If I
> disable them by adding EXTRA_CFLAGS="-mno-avx512f", then it works fine and
> doesn't crash.
> 
> Do you have any idea regarding this issue or are you already aware of it?
> 
> 
> Thanks,
> Yongseok
> 
> 
> $ git diff
> diff --git a/config/common_base b/config/common_base
> index ad03cf433..f512b5a88 100644
> --- a/config/common_base
> +++ b/config/common_base
> @@ -275,8 +275,8 @@ CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE=8
> #
> # Compile burst-oriented Mellanox ConnectX-4 & ConnectX-5 (MLX5) PMD
> #
> -CONFIG_RTE_LIBRTE_MLX5_PMD=n
> -CONFIG_RTE_LIBRTE_MLX5_DEBUG=n
> +CONFIG_RTE_LIBRTE_MLX5_PMD=y
> +CONFIG_RTE_LIBRTE_MLX5_DEBUG=y
> CONFIG_RTE_LIBRTE_MLX5_DLOPEN_DEPS=n
> CONFIG_RTE_LIBRTE_MLX5_TX_MP_CACHE=8
> 
> @@ -597,7 +597,7 @@ CONFIG_RTE_RING_USE_C11_MEM_MODEL=n
> #
> CONFIG_RTE_LIBRTE_MEMPOOL=y
> CONFIG_RTE_MEMPOOL_CACHE_MAX_SIZE=512
> -CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=n
> +CONFIG_RTE_LIBRTE_MEMPOOL_DEBUG=y
> 
> #
> # Compile Mempool drivers
> diff --git a/lib/librte_mempool/rte_mempool.h b/lib/librte_mempool/rte_mempool.h
> index 8b1b7f7ed..9f48028d9 100644
> --- a/lib/librte_mempool/rte_mempool.h
> +++ b/lib/librte_mempool/rte_mempool.h
> @@ -39,6 +39,7 @@
> #include <errno.h>
> #include <inttypes.h>
> #include <sys/queue.h>
> +#include <assert.h>
> 
> #include <rte_config.h>
> #include <rte_spinlock.h>
> @@ -1123,6 +1124,22 @@ __mempool_generic_put(struct rte_mempool *mp, void * const *obj_table,
>        /* Add elements back into the cache */
>        rte_memcpy(&cache_objs[0], obj_table, sizeof(void *) * n);
> 
> +       if(memcmp(&cache_objs[0], obj_table, sizeof(void *) * n)) {
> +               printf("[GDB command] \n"
> +                      "$dst = %p\n"
> +                      "$src = %p\n"
> +                      "$n = %zu\n"
> +                      "x/%zugx %p\n"
> +                      "x/%zugx %p\n",
> +                      (void *)&cache_objs[0],
> +                      (const void *)obj_table,
> +                      sizeof(void *) * n,
> +                      sizeof(void *) * n / 8, (void *)&cache_objs[0],
> +                      sizeof(void *) * n / 8, (const void *)obj_table
> +                      );
> +               assert(0);
> +       }
> +
>        cache->len += n;
> 
>        if (cache->len >= cache->flushthresh) {
> 
> 

