DPDK patches and discussions
 help / color / mirror / Atom feed
* [PATCH] eal: zero out new added memory
@ 2022-08-27  9:25 chengtcli
  2022-08-27  9:57 ` Dmitry Kozlyuk
  0 siblings, 1 reply; 16+ messages in thread
From: chengtcli @ 2022-08-27  9:25 UTC (permalink / raw)
  To: dev; +Cc: lic121

From: lic121 <lic121@chinatelecom.cn>

When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket() doens't
zero oute allocaed memory. Because memory are zeroed out when free
in malloc_elem_free(). But seems the initial allocated memory is
not zeroed out as expected.

This patch zero out initial allocated memory in
malloc_heap_add_memory().

With dpdk 20.11.5, "QLogic Corp. FastLinQ QL41000" probe triggers
this problem.
```
  Stack trace of thread 412780:
  #0  0x0000000000e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
  #1  0x0000000000e4df54 ecore_get_hw_info (dpdk-testpmd)
  #2  0x0000000000e504aa ecore_hw_prepare (dpdk-testpmd)
  #3  0x0000000000e8a7ca qed_probe (dpdk-testpmd)
  #4  0x0000000000e83c59 qede_common_dev_init (dpdk-testpmd)
  #5  0x0000000000e84c8e qede_eth_dev_init (dpdk-testpmd)
  #6  0x00000000009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
  #7  0x00000000009734e3 rte_bus_probe (dpdk-testpmd)
  #8  0x00000000009933bd rte_eal_init (dpdk-testpmd)
  #9  0x000000000041768f main (dpdk-testpmd)
  #10 0x00007f41a7001b17 __libc_start_main (libc.so.6)
  #11 0x000000000067e34a _start (dpdk-testpmd)
```

Signed-off-by: lic121 <lic121@chinatelecom.cn>
---
 lib/librte_eal/common/malloc_heap.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
index f4e20ea..1607401 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -96,11 +96,19 @@
 		void *start, size_t len)
 {
 	struct malloc_elem *elem = start;
+	void *ptr;
+	size_t data_len
+
 
 	malloc_elem_init(elem, heap, msl, len, elem, len);
 
 	malloc_elem_insert(elem);
 
+	/* Zero out new added memory. */
+	*ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
+	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+	memset(ptr, 0, data_len);
+
 	elem = malloc_elem_join_adjacent_free(elem);
 
 	malloc_elem_free_list_insert(elem);
-- 
1.8.3.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-27  9:25 [PATCH] eal: zero out new added memory chengtcli
@ 2022-08-27  9:57 ` Dmitry Kozlyuk
  2022-08-27 13:31   ` lic121
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Kozlyuk @ 2022-08-27  9:57 UTC (permalink / raw)
  To: chengtcli; +Cc: dev, lic121

2022-08-27 09:25 (UTC+0000), chengtcli@qq.com:
> From: lic121 <lic121@chinatelecom.cn>
> 
> When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket() doens't
> zero oute allocaed memory. Because memory are zeroed out when free
> in malloc_elem_free(). But seems the initial allocated memory is
> not zeroed out as expected.
> 
> This patch zero out initial allocated memory in
> malloc_heap_add_memory().
> 
> With dpdk 20.11.5, "QLogic Corp. FastLinQ QL41000" probe triggers
> this problem.
> ```
>   Stack trace of thread 412780:
>   #0  0x0000000000e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
>   #1  0x0000000000e4df54 ecore_get_hw_info (dpdk-testpmd)
>   #2  0x0000000000e504aa ecore_hw_prepare (dpdk-testpmd)
>   #3  0x0000000000e8a7ca qed_probe (dpdk-testpmd)
>   #4  0x0000000000e83c59 qede_common_dev_init (dpdk-testpmd)
>   #5  0x0000000000e84c8e qede_eth_dev_init (dpdk-testpmd)
>   #6  0x00000000009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
>   #7  0x00000000009734e3 rte_bus_probe (dpdk-testpmd)
>   #8  0x00000000009933bd rte_eal_init (dpdk-testpmd)
>   #9  0x000000000041768f main (dpdk-testpmd)
>   #10 0x00007f41a7001b17 __libc_start_main (libc.so.6)
>   #11 0x000000000067e34a _start (dpdk-testpmd)
> ```
> 
> Signed-off-by: lic121 <lic121@chinatelecom.cn>
> ---
>  lib/librte_eal/common/malloc_heap.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
> 
> diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
> index f4e20ea..1607401 100644
> --- a/lib/librte_eal/common/malloc_heap.c
> +++ b/lib/librte_eal/common/malloc_heap.c
> @@ -96,11 +96,19 @@
>  		void *start, size_t len)
>  {
>  	struct malloc_elem *elem = start;
> +	void *ptr;
> +	size_t data_len
> +
>  
>  	malloc_elem_init(elem, heap, msl, len, elem, len);
>  
>  	malloc_elem_insert(elem);
>  
> +	/* Zero out new added memory. */
> +	*ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> +	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> +	memset(ptr, 0, data_len);
> +
>  	elem = malloc_elem_join_adjacent_free(elem);
>  
>  	malloc_elem_free_list_insert(elem);

Hi,

The kernel ensures that the newly mapped memory is zeroed,
and DPDK ensures that files in hugetlbfs are not re-mapped.
What makes you think that it is not zeroed?
Were you able to catch [start; start+len) contain non-zero bytes
at the start of this function?
If so, is it system memory (not an external heap)?
If so, what is the CPU, kernel, any custom settings?

Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?

This patch cannot be accepted as-is anyway:
1. It zeroes memory even if the code was called not via rte_zmalloc().
2. It leads to zeroing on both alloc and free, which is suboptimal.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-27  9:57 ` Dmitry Kozlyuk
@ 2022-08-27 13:31   ` lic121
  2022-08-27 14:56     ` Dmitry Kozlyuk
  0 siblings, 1 reply; 16+ messages in thread
From: lic121 @ 2022-08-27 13:31 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: dev, lic121

On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> 2022-08-27 09:25 (UTC+0000), chengtcli@qq.com:
> > From: lic121 <lic121@chinatelecom.cn>
> > 
> > When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket() doens't
> > zero oute allocaed memory. Because memory are zeroed out when free
> > in malloc_elem_free(). But seems the initial allocated memory is
> > not zeroed out as expected.
> > 
> > This patch zero out initial allocated memory in
> > malloc_heap_add_memory().
> > 
> > With dpdk 20.11.5, "QLogic Corp. FastLinQ QL41000" probe triggers
> > this problem.
> > ```
> >   Stack trace of thread 412780:
> >   #0  0x0000000000e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> >   #1  0x0000000000e4df54 ecore_get_hw_info (dpdk-testpmd)
> >   #2  0x0000000000e504aa ecore_hw_prepare (dpdk-testpmd)
> >   #3  0x0000000000e8a7ca qed_probe (dpdk-testpmd)
> >   #4  0x0000000000e83c59 qede_common_dev_init (dpdk-testpmd)
> >   #5  0x0000000000e84c8e qede_eth_dev_init (dpdk-testpmd)
> >   #6  0x00000000009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> >   #7  0x00000000009734e3 rte_bus_probe (dpdk-testpmd)
> >   #8  0x00000000009933bd rte_eal_init (dpdk-testpmd)
> >   #9  0x000000000041768f main (dpdk-testpmd)
> >   #10 0x00007f41a7001b17 __libc_start_main (libc.so.6)
> >   #11 0x000000000067e34a _start (dpdk-testpmd)
> > ```
> > 
> > Signed-off-by: lic121 <lic121@chinatelecom.cn>
> > ---
> >  lib/librte_eal/common/malloc_heap.c | 8 ++++++++
> >  1 file changed, 8 insertions(+)
> > 
> > diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
> > index f4e20ea..1607401 100644
> > --- a/lib/librte_eal/common/malloc_heap.c
> > +++ b/lib/librte_eal/common/malloc_heap.c
> > @@ -96,11 +96,19 @@
> >  		void *start, size_t len)
> >  {
> >  	struct malloc_elem *elem = start;
> > +	void *ptr;
> > +	size_t data_len
> > +
> >  
> >  	malloc_elem_init(elem, heap, msl, len, elem, len);
> >  
> >  	malloc_elem_insert(elem);
> >  
> > +	/* Zero out new added memory. */
> > +	*ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > +	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > +	memset(ptr, 0, data_len);
> > +
> >  	elem = malloc_elem_join_adjacent_free(elem);
> >  
> >  	malloc_elem_free_list_insert(elem);
> 
> Hi,
> 
> The kernel ensures that the newly mapped memory is zeroed,
> and DPDK ensures that files in hugetlbfs are not re-mapped.
> What makes you think that it is not zeroed?
> Were you able to catch [start; start+len) contain non-zero bytes
> at the start of this function?
> If so, is it system memory (not an external heap)?
> If so, what is the CPU, kernel, any custom settings?
> 
> Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?
> 
> This patch cannot be accepted as-is anyway:
> 1. It zeroes memory even if the code was called not via rte_zmalloc().
> 2. It leads to zeroing on both alloc and free, which is suboptimal.

Hi Dmitry, thanks for the review.

In rte_eth_dev_pci_allocate(), imediately after rte_zmalloc_socket()[1]
I printed
the content in gdb. It's not zero.

print ((struct qede_dev *)(eth_dev->data->dev_private))->edev->p_iov_info

cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
kernel: 4.19.90-2102

[1]
https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_pci.h#L91-L93


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-27 13:31   ` lic121
@ 2022-08-27 14:56     ` Dmitry Kozlyuk
  2022-08-29  1:18       ` lic121
  0 siblings, 1 reply; 16+ messages in thread
From: Dmitry Kozlyuk @ 2022-08-27 14:56 UTC (permalink / raw)
  To: lic121; +Cc: dev, lic121

2022-08-27 13:31 (UTC+0000), lic121:
> On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > 2022-08-27 09:25 (UTC+0000), chengtcli@qq.com:  
> > > From: lic121 <lic121@chinatelecom.cn>
> > > 
> > > When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket() doens't
> > > zero oute allocaed memory. Because memory are zeroed out when free
> > > in malloc_elem_free(). But seems the initial allocated memory is
> > > not zeroed out as expected.
> > > 
> > > This patch zero out initial allocated memory in
> > > malloc_heap_add_memory().
> > > 
> > > With dpdk 20.11.5, "QLogic Corp. FastLinQ QL41000" probe triggers
> > > this problem.
> > > ```
> > >   Stack trace of thread 412780:
> > >   #0  0x0000000000e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> > >   #1  0x0000000000e4df54 ecore_get_hw_info (dpdk-testpmd)
> > >   #2  0x0000000000e504aa ecore_hw_prepare (dpdk-testpmd)
> > >   #3  0x0000000000e8a7ca qed_probe (dpdk-testpmd)
> > >   #4  0x0000000000e83c59 qede_common_dev_init (dpdk-testpmd)
> > >   #5  0x0000000000e84c8e qede_eth_dev_init (dpdk-testpmd)
> > >   #6  0x00000000009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> > >   #7  0x00000000009734e3 rte_bus_probe (dpdk-testpmd)
> > >   #8  0x00000000009933bd rte_eal_init (dpdk-testpmd)
> > >   #9  0x000000000041768f main (dpdk-testpmd)
> > >   #10 0x00007f41a7001b17 __libc_start_main (libc.so.6)
> > >   #11 0x000000000067e34a _start (dpdk-testpmd)
> > > ```
> > > 
> > > Signed-off-by: lic121 <lic121@chinatelecom.cn>
> > > ---
> > >  lib/librte_eal/common/malloc_heap.c | 8 ++++++++
> > >  1 file changed, 8 insertions(+)
> > > 
> > > diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
> > > index f4e20ea..1607401 100644
> > > --- a/lib/librte_eal/common/malloc_heap.c
> > > +++ b/lib/librte_eal/common/malloc_heap.c
> > > @@ -96,11 +96,19 @@
> > >  		void *start, size_t len)
> > >  {
> > >  	struct malloc_elem *elem = start;
> > > +	void *ptr;
> > > +	size_t data_len
> > > +
> > >  
> > >  	malloc_elem_init(elem, heap, msl, len, elem, len);
> > >  
> > >  	malloc_elem_insert(elem);
> > >  
> > > +	/* Zero out new added memory. */
> > > +	*ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > > +	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > > +	memset(ptr, 0, data_len);
> > > +
> > >  	elem = malloc_elem_join_adjacent_free(elem);
> > >  
> > >  	malloc_elem_free_list_insert(elem);  
> > 
> > Hi,
> > 
> > The kernel ensures that the newly mapped memory is zeroed,
> > and DPDK ensures that files in hugetlbfs are not re-mapped.
> > What makes you think that it is not zeroed?
> > Were you able to catch [start; start+len) contain non-zero bytes
> > at the start of this function?
> > If so, is it system memory (not an external heap)?
> > If so, what is the CPU, kernel, any custom settings?
> > 
> > Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?
> > 
> > This patch cannot be accepted as-is anyway:
> > 1. It zeroes memory even if the code was called not via rte_zmalloc().
> > 2. It leads to zeroing on both alloc and free, which is suboptimal.  
> 
> Hi Dmitry, thanks for the review.
> 
> In rte_eth_dev_pci_allocate(), imediately after rte_zmalloc_socket()[1]
> I printed
> the content in gdb. It's not zero.
> 
> print ((struct qede_dev *)(eth_dev->data->dev_private))->edev->p_iov_info
> 
> cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
> kernel: 4.19.90-2102
> 
> [1]
> https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_pci.h#L91-L93

Sorry, it seems that something is wrong with your debug.
Your link is for 20.11.0.
In 20.11.5 (apparently always) struct qede_dev::edev is not a pointer [2].
Even if it was, in zeroed memory it would be a NULL pointer,
reading a member would give a random value at NULL + some offset.
I suggest to print content of the allocated memory with rte_hexdump().

[2]:
http://git.dpdk.org/dpdk-stable/tree/drivers/net/qede/qede_ethdev.h?h=v20.11.5#n223

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-27 14:56     ` Dmitry Kozlyuk
@ 2022-08-29  1:18       ` lic121
  2022-08-29 11:37         ` lic121
  0 siblings, 1 reply; 16+ messages in thread
From: lic121 @ 2022-08-29  1:18 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: dev, lic121

On Sat, Aug 27, 2022 at 05:56:54PM +0300, Dmitry Kozlyuk wrote:
> 2022-08-27 13:31 (UTC+0000), lic121:
> > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > > 2022-08-27 09:25 (UTC+0000), chengtcli@qq.com:  
> > > > From: lic121 <lic121@chinatelecom.cn>
> > > > 
> > > > When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket() doens't
> > > > zero oute allocaed memory. Because memory are zeroed out when free
> > > > in malloc_elem_free(). But seems the initial allocated memory is
> > > > not zeroed out as expected.
> > > > 
> > > > This patch zero out initial allocated memory in
> > > > malloc_heap_add_memory().
> > > > 
> > > > With dpdk 20.11.5, "QLogic Corp. FastLinQ QL41000" probe triggers
> > > > this problem.
> > > > ```
> > > >   Stack trace of thread 412780:
> > > >   #0  0x0000000000e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> > > >   #1  0x0000000000e4df54 ecore_get_hw_info (dpdk-testpmd)
> > > >   #2  0x0000000000e504aa ecore_hw_prepare (dpdk-testpmd)
> > > >   #3  0x0000000000e8a7ca qed_probe (dpdk-testpmd)
> > > >   #4  0x0000000000e83c59 qede_common_dev_init (dpdk-testpmd)
> > > >   #5  0x0000000000e84c8e qede_eth_dev_init (dpdk-testpmd)
> > > >   #6  0x00000000009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> > > >   #7  0x00000000009734e3 rte_bus_probe (dpdk-testpmd)
> > > >   #8  0x00000000009933bd rte_eal_init (dpdk-testpmd)
> > > >   #9  0x000000000041768f main (dpdk-testpmd)
> > > >   #10 0x00007f41a7001b17 __libc_start_main (libc.so.6)
> > > >   #11 0x000000000067e34a _start (dpdk-testpmd)
> > > > ```
> > > > 
> > > > Signed-off-by: lic121 <lic121@chinatelecom.cn>
> > > > ---
> > > >  lib/librte_eal/common/malloc_heap.c | 8 ++++++++
> > > >  1 file changed, 8 insertions(+)
> > > > 
> > > > diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
> > > > index f4e20ea..1607401 100644
> > > > --- a/lib/librte_eal/common/malloc_heap.c
> > > > +++ b/lib/librte_eal/common/malloc_heap.c
> > > > @@ -96,11 +96,19 @@
> > > >  		void *start, size_t len)
> > > >  {
> > > >  	struct malloc_elem *elem = start;
> > > > +	void *ptr;
> > > > +	size_t data_len
> > > > +
> > > >  
> > > >  	malloc_elem_init(elem, heap, msl, len, elem, len);
> > > >  
> > > >  	malloc_elem_insert(elem);
> > > >  
> > > > +	/* Zero out new added memory. */
> > > > +	*ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > > > +	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > > > +	memset(ptr, 0, data_len);
> > > > +
> > > >  	elem = malloc_elem_join_adjacent_free(elem);
> > > >  
> > > >  	malloc_elem_free_list_insert(elem);  
> > > 
> > > Hi,
> > > 
> > > The kernel ensures that the newly mapped memory is zeroed,
> > > and DPDK ensures that files in hugetlbfs are not re-mapped.
> > > What makes you think that it is not zeroed?
> > > Were you able to catch [start; start+len) contain non-zero bytes
> > > at the start of this function?
> > > If so, is it system memory (not an external heap)?
> > > If so, what is the CPU, kernel, any custom settings?
> > > 
> > > Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?
> > > 
> > > This patch cannot be accepted as-is anyway:
> > > 1. It zeroes memory even if the code was called not via rte_zmalloc().
> > > 2. It leads to zeroing on both alloc and free, which is suboptimal.  
> > 
> > Hi Dmitry, thanks for the review.
> > 
> > In rte_eth_dev_pci_allocate(), imediately after rte_zmalloc_socket()[1]
> > I printed
> > the content in gdb. It's not zero.
> > 
> > print ((struct qede_dev *)(eth_dev->data->dev_private))->edev->p_iov_info
> > 
> > cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
> > kernel: 4.19.90-2102
> > 
> > [1]
> > https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_pci.h#L91-L93
> 
> Sorry, it seems that something is wrong with your debug.
> Your link is for 20.11.0.
> In 20.11.5 (apparently always) struct qede_dev::edev is not a pointer [2].
> Even if it was, in zeroed memory it would be a NULL pointer,
> reading a member would give a random value at NULL + some offset.
> I suggest to print content of the allocated memory with rte_hexdump().
> 

Sorry I didn't describe my debug clear. At first I debuged with version
20.11.0, I found that the rte_zmalloc_socket() memory is dirty. Then I
tried 20.11.5, I didn't debug on 20.11.5 but the behave is the same(nic
failed to be probed). So in the commit msg I said v20.11.5 has the
issue. But when I describe my debug I uesd 20.11.0 url.

More debug info:
1. I reproduced the issue for tens of times, every time the printed var
has the same value.
2. After search malloc_heap_add_memory, I found that there are 3 places
where call this function to add memory, malloc_add_seg(),
alloc_pages_on_heap() and malloc_heap_add_external_memory(). Firstly, I
zero out memory only for malloc_add_seg(), it didn't fix the issue. Then
I zero out meory in malloc_heap_add_memory() to cover all 3 cases, this
time nic is probed successfully.
3. Once nic is probed, I roll back my fix code, try to reproduce the
issue. But I can't reproduce anymore. So I guess: the memory allocated
when probe qede nic is at a fixed memory location. Because every time in
my debug the printed var has the same value. After I zeroed out the
allocated memory once, I can't reproduce the issue anymore.

> [2]:
> http://git.dpdk.org/dpdk-stable/tree/drivers/net/qede/qede_ethdev.h?h=v20.11.5#n223

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-29  1:18       ` lic121
@ 2022-08-29 11:37         ` lic121
  2022-08-29 11:57           ` David Marchand
  0 siblings, 1 reply; 16+ messages in thread
From: lic121 @ 2022-08-29 11:37 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: dev, lic121

On Mon, Aug 29, 2022 at 01:18:36AM +0000, lic121 wrote:
> On Sat, Aug 27, 2022 at 05:56:54PM +0300, Dmitry Kozlyuk wrote:
> > 2022-08-27 13:31 (UTC+0000), lic121:
> > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > > > 2022-08-27 09:25 (UTC+0000), chengtcli@qq.com:  
> > > > > From: lic121 <lic121@chinatelecom.cn>
> > > > > 
> > > > > When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket() doens't
> > > > > zero oute allocaed memory. Because memory are zeroed out when free
> > > > > in malloc_elem_free(). But seems the initial allocated memory is
> > > > > not zeroed out as expected.
> > > > > 
> > > > > This patch zero out initial allocated memory in
> > > > > malloc_heap_add_memory().
> > > > > 
> > > > > With dpdk 20.11.5, "QLogic Corp. FastLinQ QL41000" probe triggers
> > > > > this problem.
> > > > > ```
> > > > >   Stack trace of thread 412780:
> > > > >   #0  0x0000000000e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> > > > >   #1  0x0000000000e4df54 ecore_get_hw_info (dpdk-testpmd)
> > > > >   #2  0x0000000000e504aa ecore_hw_prepare (dpdk-testpmd)
> > > > >   #3  0x0000000000e8a7ca qed_probe (dpdk-testpmd)
> > > > >   #4  0x0000000000e83c59 qede_common_dev_init (dpdk-testpmd)
> > > > >   #5  0x0000000000e84c8e qede_eth_dev_init (dpdk-testpmd)
> > > > >   #6  0x00000000009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> > > > >   #7  0x00000000009734e3 rte_bus_probe (dpdk-testpmd)
> > > > >   #8  0x00000000009933bd rte_eal_init (dpdk-testpmd)
> > > > >   #9  0x000000000041768f main (dpdk-testpmd)
> > > > >   #10 0x00007f41a7001b17 __libc_start_main (libc.so.6)
> > > > >   #11 0x000000000067e34a _start (dpdk-testpmd)
> > > > > ```
> > > > > 
> > > > > Signed-off-by: lic121 <lic121@chinatelecom.cn>
> > > > > ---
> > > > >  lib/librte_eal/common/malloc_heap.c | 8 ++++++++
> > > > >  1 file changed, 8 insertions(+)
> > > > > 
> > > > > diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
> > > > > index f4e20ea..1607401 100644
> > > > > --- a/lib/librte_eal/common/malloc_heap.c
> > > > > +++ b/lib/librte_eal/common/malloc_heap.c
> > > > > @@ -96,11 +96,19 @@
> > > > >  		void *start, size_t len)
> > > > >  {
> > > > >  	struct malloc_elem *elem = start;
> > > > > +	void *ptr;
> > > > > +	size_t data_len
> > > > > +
> > > > >  
> > > > >  	malloc_elem_init(elem, heap, msl, len, elem, len);
> > > > >  
> > > > >  	malloc_elem_insert(elem);
> > > > >  
> > > > > +	/* Zero out new added memory. */
> > > > > +	*ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > > > > +	data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > > > > +	memset(ptr, 0, data_len);
> > > > > +
> > > > >  	elem = malloc_elem_join_adjacent_free(elem);
> > > > >  
> > > > >  	malloc_elem_free_list_insert(elem);  
> > > > 
> > > > Hi,
> > > > 
> > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > and DPDK ensures that files in hugetlbfs are not re-mapped.
> > > > What makes you think that it is not zeroed?
> > > > Were you able to catch [start; start+len) contain non-zero bytes
> > > > at the start of this function?
> > > > If so, is it system memory (not an external heap)?
> > > > If so, what is the CPU, kernel, any custom settings?
> > > > 
> > > > Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?
> > > > 
> > > > This patch cannot be accepted as-is anyway:
> > > > 1. It zeroes memory even if the code was called not via rte_zmalloc().
> > > > 2. It leads to zeroing on both alloc and free, which is suboptimal.  
> > > 
> > > Hi Dmitry, thanks for the review.
> > > 
> > > In rte_eth_dev_pci_allocate(), imediately after rte_zmalloc_socket()[1]
> > > I printed
> > > the content in gdb. It's not zero.
> > > 
> > > print ((struct qede_dev *)(eth_dev->data->dev_private))->edev->p_iov_info
> > > 
> > > cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
> > > kernel: 4.19.90-2102
> > > 
> > > [1]
> > > https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_pci.h#L91-L93
> > 
> > Sorry, it seems that something is wrong with your debug.
> > Your link is for 20.11.0.
> > In 20.11.5 (apparently always) struct qede_dev::edev is not a pointer [2].
> > Even if it was, in zeroed memory it would be a NULL pointer,
> > reading a member would give a random value at NULL + some offset.
> > I suggest to print content of the allocated memory with rte_hexdump().
> > 
> 
> Sorry I didn't describe my debug clear. At first I debuged with version
> 20.11.0, I found that the rte_zmalloc_socket() memory is dirty. Then I
> tried 20.11.5, I didn't debug on 20.11.5 but the behave is the same(nic
> failed to be probed). So in the commit msg I said v20.11.5 has the
> issue. But when I describe my debug I uesd 20.11.0 url.
> 
> More debug info:
> 1. I reproduced the issue for tens of times, every time the printed var
> has the same value.
> 2. After search malloc_heap_add_memory, I found that there are 3 places
> where call this function to add memory, malloc_add_seg(),
> alloc_pages_on_heap() and malloc_heap_add_external_memory(). Firstly, I
> zero out memory only for malloc_add_seg(), it didn't fix the issue. Then
> I zero out meory in malloc_heap_add_memory() to cover all 3 cases, this
> time nic is probed successfully.
> 3. Once nic is probed, I roll back my fix code, try to reproduce the
> issue. But I can't reproduce anymore. So I guess: the memory allocated
> when probe qede nic is at a fixed memory location. Because every time in
> my debug the printed var has the same value. After I zeroed out the
> allocated memory once, I can't reproduce the issue anymore.
> 
> > [2]:
> > http://git.dpdk.org/dpdk-stable/tree/drivers/net/qede/qede_ethdev.h?h=v20.11.5#n223

Today we probaly meet the same issue with intel E810 nic, the behave is
that E810 nic can be probed on some host, but can't one some other. On
the same host, one E810 may be probed while the other one can't be.
After I applied this patch, no such issue anymore.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-29 11:37         ` lic121
@ 2022-08-29 11:57           ` David Marchand
  2022-08-29 12:37             ` Morten Brørup
  0 siblings, 1 reply; 16+ messages in thread
From: David Marchand @ 2022-08-29 11:57 UTC (permalink / raw)
  To: lic121; +Cc: Dmitry Kozlyuk, dev

On Mon, Aug 29, 2022 at 1:38 PM lic121 <chengtcli@qq.com> wrote:
>
> On Mon, Aug 29, 2022 at 01:18:36AM +0000, lic121 wrote:
> > On Sat, Aug 27, 2022 at 05:56:54PM +0300, Dmitry Kozlyuk wrote:
> > > 2022-08-27 13:31 (UTC+0000), lic121:
> > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > > > > 2022-08-27 09:25 (UTC+0000), chengtcli@qq.com:
> > > > > > From: lic121 <lic121@chinatelecom.cn>
> > > > > >
> > > > > > When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket() doens't
> > > > > > zero oute allocaed memory. Because memory are zeroed out when free
> > > > > > in malloc_elem_free(). But seems the initial allocated memory is
> > > > > > not zeroed out as expected.
> > > > > >
> > > > > > This patch zero out initial allocated memory in
> > > > > > malloc_heap_add_memory().
> > > > > >
> > > > > > With dpdk 20.11.5, "QLogic Corp. FastLinQ QL41000" probe triggers
> > > > > > this problem.
> > > > > > ```
> > > > > >   Stack trace of thread 412780:
> > > > > >   #0  0x0000000000e5fb99 ecore_int_igu_read_cam (dpdk-testpmd)
> > > > > >   #1  0x0000000000e4df54 ecore_get_hw_info (dpdk-testpmd)
> > > > > >   #2  0x0000000000e504aa ecore_hw_prepare (dpdk-testpmd)
> > > > > >   #3  0x0000000000e8a7ca qed_probe (dpdk-testpmd)
> > > > > >   #4  0x0000000000e83c59 qede_common_dev_init (dpdk-testpmd)
> > > > > >   #5  0x0000000000e84c8e qede_eth_dev_init (dpdk-testpmd)
> > > > > >   #6  0x00000000009dd5a7 rte_pci_probe_one_driver (dpdk-testpmd)
> > > > > >   #7  0x00000000009734e3 rte_bus_probe (dpdk-testpmd)
> > > > > >   #8  0x00000000009933bd rte_eal_init (dpdk-testpmd)
> > > > > >   #9  0x000000000041768f main (dpdk-testpmd)
> > > > > >   #10 0x00007f41a7001b17 __libc_start_main (libc.so.6)
> > > > > >   #11 0x000000000067e34a _start (dpdk-testpmd)
> > > > > > ```
> > > > > >
> > > > > > Signed-off-by: lic121 <lic121@chinatelecom.cn>
> > > > > > ---
> > > > > >  lib/librte_eal/common/malloc_heap.c | 8 ++++++++
> > > > > >  1 file changed, 8 insertions(+)
> > > > > >
> > > > > > diff --git a/lib/librte_eal/common/malloc_heap.c b/lib/librte_eal/common/malloc_heap.c
> > > > > > index f4e20ea..1607401 100644
> > > > > > --- a/lib/librte_eal/common/malloc_heap.c
> > > > > > +++ b/lib/librte_eal/common/malloc_heap.c
> > > > > > @@ -96,11 +96,19 @@
> > > > > >               void *start, size_t len)
> > > > > >  {
> > > > > >       struct malloc_elem *elem = start;
> > > > > > +     void *ptr;
> > > > > > +     size_t data_len
> > > > > > +
> > > > > >
> > > > > >       malloc_elem_init(elem, heap, msl, len, elem, len);
> > > > > >
> > > > > >       malloc_elem_insert(elem);
> > > > > >
> > > > > > +     /* Zero out new added memory. */
> > > > > > +     *ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
> > > > > > +     data_len = elem->size - MALLOC_ELEM_OVERHEAD;
> > > > > > +     memset(ptr, 0, data_len);
> > > > > > +
> > > > > >       elem = malloc_elem_join_adjacent_free(elem);
> > > > > >
> > > > > >       malloc_elem_free_list_insert(elem);
> > > > >
> > > > > Hi,
> > > > >
> > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.
> > > > > What makes you think that it is not zeroed?
> > > > > Were you able to catch [start; start+len) contain non-zero bytes
> > > > > at the start of this function?
> > > > > If so, is it system memory (not an external heap)?
> > > > > If so, what is the CPU, kernel, any custom settings?
> > > > >
> > > > > Can it be the PMD or the app that uses rte_malloc instead of rte_zmalloc?
> > > > >
> > > > > This patch cannot be accepted as-is anyway:
> > > > > 1. It zeroes memory even if the code was called not via rte_zmalloc().
> > > > > 2. It leads to zeroing on both alloc and free, which is suboptimal.
> > > >
> > > > Hi Dmitry, thanks for the review.
> > > >
> > > > In rte_eth_dev_pci_allocate(), imediately after rte_zmalloc_socket()[1]
> > > > I printed
> > > > the content in gdb. It's not zero.
> > > >
> > > > print ((struct qede_dev *)(eth_dev->data->dev_private))->edev->p_iov_info
> > > >
> > > > cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
> > > > kernel: 4.19.90-2102
> > > >
> > > > [1]
> > > > https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_pci.h#L91-L93
> > >
> > > Sorry, it seems that something is wrong with your debug.
> > > Your link is for 20.11.0.
> > > In 20.11.5 (apparently always) struct qede_dev::edev is not a pointer [2].
> > > Even if it was, in zeroed memory it would be a NULL pointer,
> > > reading a member would give a random value at NULL + some offset.
> > > I suggest to print content of the allocated memory with rte_hexdump().
> > >
> >
> > Sorry I didn't describe my debug clear. At first I debuged with version
> > 20.11.0, I found that the rte_zmalloc_socket() memory is dirty. Then I
> > tried 20.11.5, I didn't debug on 20.11.5 but the behave is the same(nic
> > failed to be probed). So in the commit msg I said v20.11.5 has the
> > issue. But when I describe my debug I uesd 20.11.0 url.
> >
> > More debug info:
> > 1. I reproduced the issue for tens of times, every time the printed var
> > has the same value.
> > 2. After search malloc_heap_add_memory, I found that there are 3 places
> > where call this function to add memory, malloc_add_seg(),
> > alloc_pages_on_heap() and malloc_heap_add_external_memory(). Firstly, I
> > zero out memory only for malloc_add_seg(), it didn't fix the issue. Then
> > I zero out meory in malloc_heap_add_memory() to cover all 3 cases, this
> > time nic is probed successfully.
> > 3. Once nic is probed, I roll back my fix code, try to reproduce the
> > issue. But I can't reproduce anymore. So I guess: the memory allocated
> > when probe qede nic is at a fixed memory location. Because every time in
> > my debug the printed var has the same value. After I zeroed out the
> > allocated memory once, I can't reproduce the issue anymore.
> >
> > > [2]:
> > > http://git.dpdk.org/dpdk-stable/tree/drivers/net/qede/qede_ethdev.h?h=v20.11.5#n223
>
> Today we probaly meet the same issue with intel E810 nic, the behave is
> that E810 nic can be probed on some host, but can't one some other. On
> the same host, one E810 may be probed while the other one can't be.
> After I applied this patch, no such issue anymore.

Are you perhaps running your DPDK application from inside a container?
I remember tracking an issue which had to do with reusing a "dirty"
hugepage file (because of SELinux forbidding to destroy those files).


-- 
David Marchand


^ permalink raw reply	[flat|nested] 16+ messages in thread

* RE: [PATCH] eal: zero out new added memory
  2022-08-29 11:57           ` David Marchand
@ 2022-08-29 12:37             ` Morten Brørup
  2022-08-29 12:43               ` David Marchand
  2022-08-29 12:49               ` Dmitry Kozlyuk
  0 siblings, 2 replies; 16+ messages in thread
From: Morten Brørup @ 2022-08-29 12:37 UTC (permalink / raw)
  To: David Marchand, lic121; +Cc: Dmitry Kozlyuk, dev

> From: David Marchand [mailto:david.marchand@redhat.com]
> Sent: Monday, 29 August 2022 13.58
> 
> On Mon, Aug 29, 2022 at 1:38 PM lic121 <chengtcli@qq.com> wrote:
> >
> > On Mon, Aug 29, 2022 at 01:18:36AM +0000, lic121 wrote:
> > > On Sat, Aug 27, 2022 at 05:56:54PM +0300, Dmitry Kozlyuk wrote:
> > > > 2022-08-27 13:31 (UTC+0000), lic121:
> > > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > > > > > 2022-08-27 09:25 (UTC+0000), chengtcli@qq.com:
> > > > > > > From: lic121 <lic121@chinatelecom.cn>
> > > > > > >
> > > > > > > When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket()
> doens't
> > > > > > > zero oute allocaed memory. Because memory are zeroed out
> when free
> > > > > > > in malloc_elem_free(). But seems the initial allocated
> memory is
> > > > > > > not zeroed out as expected.
> > > > > > >
> > > > > > > This patch zero out initial allocated memory in
> > > > > > > malloc_heap_add_memory().
> > > > > > >

[...]

> > > > > > Hi,
> > > > > >
> > > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.

David, are you suggesting that this invariant - guaranteeing that DPDK memory is zeroed - was violated by SELinux in the SELinux/container issue you were tracking?

If so, the method to ensure the invariant is faulty for SELinux. Assuming DPDK supports SELinux, this bug should be fixed.

> > > > > > What makes you think that it is not zeroed?
> > > > > > Were you able to catch [start; start+len) contain non-zero
> bytes
> > > > > > at the start of this function?
> > > > > > If so, is it system memory (not an external heap)?
> > > > > > If so, what is the CPU, kernel, any custom settings?
> > > > > >
> > > > > > Can it be the PMD or the app that uses rte_malloc instead of
> rte_zmalloc?
> > > > > >
> > > > > > This patch cannot be accepted as-is anyway:
> > > > > > 1. It zeroes memory even if the code was called not via
> rte_zmalloc().
> > > > > > 2. It leads to zeroing on both alloc and free, which is
> suboptimal.
> > > > >
> > > > > Hi Dmitry, thanks for the review.
> > > > >
> > > > > In rte_eth_dev_pci_allocate(), imediately after
> rte_zmalloc_socket()[1]
> > > > > I printed
> > > > > the content in gdb. It's not zero.
> > > > >
> > > > > print ((struct qede_dev *)(eth_dev->data->dev_private))->edev-
> >p_iov_info
> > > > >
> > > > > cpu: Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
> > > > > kernel: 4.19.90-2102
> > > > >
> > > > > [1]
> > > > >
> https://github.com/DPDK/dpdk/blob/v20.11/lib/librte_ethdev/rte_ethdev_p
> ci.h#L91-L93
> > > >
> > > > Sorry, it seems that something is wrong with your debug.
> > > > Your link is for 20.11.0.
> > > > In 20.11.5 (apparently always) struct qede_dev::edev is not a
> pointer [2].
> > > > Even if it was, in zeroed memory it would be a NULL pointer,
> > > > reading a member would give a random value at NULL + some offset.
> > > > I suggest to print content of the allocated memory with
> rte_hexdump().
> > > >
> > >
> > > Sorry I didn't describe my debug clear. At first I debuged with
> version
> > > 20.11.0, I found that the rte_zmalloc_socket() memory is dirty.
> Then I
> > > tried 20.11.5, I didn't debug on 20.11.5 but the behave is the
> same(nic
> > > failed to be probed). So in the commit msg I said v20.11.5 has the
> > > issue. But when I describe my debug I uesd 20.11.0 url.
> > >
> > > More debug info:
> > > 1. I reproduced the issue for tens of times, every time the printed
> var
> > > has the same value.
> > > 2. After search malloc_heap_add_memory, I found that there are 3
> places
> > > where call this function to add memory, malloc_add_seg(),
> > > alloc_pages_on_heap() and malloc_heap_add_external_memory().
> Firstly, I
> > > zero out memory only for malloc_add_seg(), it didn't fix the issue.
> Then
> > > I zero out meory in malloc_heap_add_memory() to cover all 3 cases,
> this
> > > time nic is probed successfully.
> > > 3. Once nic is probed, I roll back my fix code, try to reproduce
> the
> > > issue. But I can't reproduce anymore. So I guess: the memory
> allocated
> > > when probe qede nic is at a fixed memory location. Because every
> time in
> > > my debug the printed var has the same value. After I zeroed out the
> > > allocated memory once, I can't reproduce the issue anymore.
> > >
> > > > [2]:
> > > > http://git.dpdk.org/dpdk-
> stable/tree/drivers/net/qede/qede_ethdev.h?h=v20.11.5#n223
> >
> > Today we probaly meet the same issue with intel E810 nic, the behave
> is
> > that E810 nic can be probed on some host, but can't one some other.
> On
> > the same host, one E810 may be probed while the other one can't be.
> > After I applied this patch, no such issue anymore.
> 
> Are you perhaps running your DPDK application from inside a container?
> I remember tracking an issue which had to do with reusing a "dirty"
> hugepage file (because of SELinux forbidding to destroy those files).
> 
> 
> --
> David Marchand
> 


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-29 12:37             ` Morten Brørup
@ 2022-08-29 12:43               ` David Marchand
  2022-08-29 12:49               ` Dmitry Kozlyuk
  1 sibling, 0 replies; 16+ messages in thread
From: David Marchand @ 2022-08-29 12:43 UTC (permalink / raw)
  To: Morten Brørup; +Cc: lic121, Dmitry Kozlyuk, dev

On Mon, Aug 29, 2022 at 2:37 PM Morten Brørup <mb@smartsharesystems.com> wrote:
>
> > From: David Marchand [mailto:david.marchand@redhat.com]
> > Sent: Monday, 29 August 2022 13.58
> >
> > On Mon, Aug 29, 2022 at 1:38 PM lic121 <chengtcli@qq.com> wrote:
> > >
> > > On Mon, Aug 29, 2022 at 01:18:36AM +0000, lic121 wrote:
> > > > On Sat, Aug 27, 2022 at 05:56:54PM +0300, Dmitry Kozlyuk wrote:
> > > > > 2022-08-27 13:31 (UTC+0000), lic121:
> > > > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:
> > > > > > > 2022-08-27 09:25 (UTC+0000), chengtcli@qq.com:
> > > > > > > > From: lic121 <lic121@chinatelecom.cn>
> > > > > > > >
> > > > > > > > When RTE_MALLOC_DEBUG not configured, rte_zmalloc_socket()
> > doens't
> > > > > > > > zero oute allocaed memory. Because memory are zeroed out
> > when free
> > > > > > > > in malloc_elem_free(). But seems the initial allocated
> > memory is
> > > > > > > > not zeroed out as expected.
> > > > > > > >
> > > > > > > > This patch zero out initial allocated memory in
> > > > > > > > malloc_heap_add_memory().
> > > > > > > >
>
> [...]
>
> > > > > > > Hi,
> > > > > > >
> > > > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.
>
> David, are you suggesting that this invariant - guaranteeing that DPDK memory is zeroed - was violated by SELinux in the SELinux/container issue you were tracking?
>
> If so, the method to ensure the invariant is faulty for SELinux. Assuming DPDK supports SELinux, this bug should be fixed.

SELinux was preventing some file manipulations.
And I found the fix I had sent, which is already in 20.11, so it is
not the same issue.
aa48ddf4f0d2 ("mem: fix allocation in container with SELinux")

On the other hand, it is the kind of side effect to keep in mind when
debugging those "my memory is not zero'd" issues.


-- 
David Marchand


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-29 12:37             ` Morten Brørup
  2022-08-29 12:43               ` David Marchand
@ 2022-08-29 12:49               ` Dmitry Kozlyuk
  2022-08-30  1:11                 ` lic121
  1 sibling, 1 reply; 16+ messages in thread
From: Dmitry Kozlyuk @ 2022-08-29 12:49 UTC (permalink / raw)
  To: Morten Brørup; +Cc: David Marchand, lic121, dev

2022-08-29 14:37 (UTC+0200), Morten Brørup:
> > From: David Marchand [mailto:david.marchand@redhat.com]
> > Sent: Monday, 29 August 2022 13.58
> >
> > > > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:  
> > > > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.  
> 
> David, are you suggesting that this invariant - guaranteeing that DPDK memory is zeroed - was violated by SELinux in the SELinux/container issue you were tracking?
> 
> If so, the method to ensure the invariant is faulty for SELinux. Assuming DPDK supports SELinux, this bug should be fixed.

+1, I'd like to know more about that case.

EAL checks the unlink() result, so if it fails, the allocation should fail
and the invariant should not be broken.
Code from 20.11.5:

	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
			unlink(path) == -1 &&
			errno != ENOENT) {
		RTE_LOG(DEBUG, EAL, "%s(): could not remove '%s': %s\n",
			__func__, path, strerror(errno));
		return -1;
	}

Can SELinux restriction result in errno == ENOENT?
I'd expect EPERM/EACCESS.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-29 12:49               ` Dmitry Kozlyuk
@ 2022-08-30  1:11                 ` lic121
  2022-08-30  9:49                   ` lic121
  0 siblings, 1 reply; 16+ messages in thread
From: lic121 @ 2022-08-30  1:11 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: Morten Brørup, David Marchand, dev

On Mon, Aug 29, 2022 at 03:49:25PM +0300, Dmitry Kozlyuk wrote:
> 2022-08-29 14:37 (UTC+0200), Morten Brørup:
> > > From: David Marchand [mailto:david.marchand@redhat.com]
> > > Sent: Monday, 29 August 2022 13.58
> > >
> > > > > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:  
> > > > > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.  
> > 
> > David, are you suggesting that this invariant - guaranteeing that DPDK memory is zeroed - was violated by SELinux in the SELinux/container issue you were tracking?
> > 
> > If so, the method to ensure the invariant is faulty for SELinux. Assuming DPDK supports SELinux, this bug should be fixed.
> 
> +1, I'd like to know more about that case.
> 
> EAL checks the unlink() result, so if it fails, the allocation should fail
> and the invariant should not be broken.
> Code from 20.11.5:
> 
> 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> 			unlink(path) == -1 &&
> 			errno != ENOENT) {
> 		RTE_LOG(DEBUG, EAL, "%s(): could not remove '%s': %s\n",
> 			__func__, path, strerror(errno));
> 		return -1;
> 	}
> 
> Can SELinux restriction result in errno == ENOENT?
> I'd expect EPERM/EACCESS.

Thanks for your info, the selinux is disabled on my server. Also I
checked that the selinux fix is already in my dpdk. Could any other
settings may cause dirty memory? If you can think of any thing related,
I can have a try.

BTW, this is my nic info:
```
Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)

driver: ice
version: 1.9.3
firmware-version: 2.30 0x80005d22 1.2877.0
expansion-rom-version:
bus-info: 0000:3b:00.1
supports-statistics: yes
supports-test: yes
supports-eeprom-access: yes
supports-register-dump: yes
supports-priv-flags: yes
```

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-30  1:11                 ` lic121
@ 2022-08-30  9:49                   ` lic121
  2022-08-30 10:59                     ` Dmitry Kozlyuk
  0 siblings, 1 reply; 16+ messages in thread
From: lic121 @ 2022-08-30  9:49 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: Morten Brørup, David Marchand, dev

On Tue, Aug 30, 2022 at 01:11:25AM +0000, lic121 wrote:
> On Mon, Aug 29, 2022 at 03:49:25PM +0300, Dmitry Kozlyuk wrote:
> > 2022-08-29 14:37 (UTC+0200), Morten Brørup:
> > > > From: David Marchand [mailto:david.marchand@redhat.com]
> > > > Sent: Monday, 29 August 2022 13.58
> > > >
> > > > > > > > On Sat, Aug 27, 2022 at 12:57:50PM +0300, Dmitry Kozlyuk wrote:  
> > > > > > > > > The kernel ensures that the newly mapped memory is zeroed,
> > > > > > > > > and DPDK ensures that files in hugetlbfs are not re-mapped.  
> > > 
> > > David, are you suggesting that this invariant - guaranteeing that DPDK memory is zeroed - was violated by SELinux in the SELinux/container issue you were tracking?
> > > 
> > > If so, the method to ensure the invariant is faulty for SELinux. Assuming DPDK supports SELinux, this bug should be fixed.
> > 
> > +1, I'd like to know more about that case.
> > 
> > EAL checks the unlink() result, so if it fails, the allocation should fail
> > and the invariant should not be broken.
> > Code from 20.11.5:
> > 
> > 	if (rte_eal_process_type() == RTE_PROC_PRIMARY &&
> > 			unlink(path) == -1 &&
> > 			errno != ENOENT) {
> > 		RTE_LOG(DEBUG, EAL, "%s(): could not remove '%s': %s\n",
> > 			__func__, path, strerror(errno));
> > 		return -1;
> > 	}
> > 
> > Can SELinux restriction result in errno == ENOENT?
> > I'd expect EPERM/EACCESS.
> 
> Thanks for your info, the selinux is disabled on my server. Also I
> checked that the selinux fix is already in my dpdk. Could any other
> settings may cause dirty memory? If you can think of any thing related,
> I can have a try.
> 
> BTW, this is my nic info:
> ```
> Intel Corporation Ethernet Controller E810-XXV for SFP (rev 02)
> 
> driver: ice
> version: 1.9.3
> firmware-version: 2.30 0x80005d22 1.2877.0
> expansion-rom-version:
> bus-info: 0000:3b:00.1
> supports-statistics: yes
> supports-test: yes
> supports-eeprom-access: yes
> supports-register-dump: yes
> supports-priv-flags: yes
> ```


update with more debugs:

Preparation:
1. set hugepage size to 2 GB.
```
[root@gz15-compute-s3-55e247e16e22 huge]# grep -i huge /proc/meminfo
AnonHugePages:    124928 kB
ShmemHugePages:        0 kB
HugePages_Total:       2
HugePages_Free:        2
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:    1048576 kB
Hugetlb:         2097152 kB
```

2. make a simple programe to poison memory
```c
#include <stdio.h>
#include <sys/mman.h>
#include <string.h>

static int memvcmp(void *memory, unsigned char val, size_t size)
{
    unsigned char *mm = (unsigned char*)memory;
    return (*mm == val) && memcmp(mm, mm + 1, size - 1) == 0;
}

int main(int argc, char *argv[]){
    size_t size = 2 * (1 << 30)-1;
    void *ptr2 = mmap(NULL,  size,
                            PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS |
                            MAP_HUGETLB, -1, 0);
    if (! ptr2) {
        printf("failed to allocted mm");
        return 0;
    }
    if (argc > 1) {
        memset(ptr2, 0xff, size);
    }
    unsigned char * ss = ptr2;
    printf("ss: %x\n", *ss);
    if (memvcmp(ptr2, 0, size)){
        printf("all zero\n");
    } else {
        printf("not all zero\n");
    }
}
```

3. insert debug info to check if memory all zero
```
diff --git a/lib/librte_eal/common/malloc_heap.c
b/lib/librte_eal/common/malloc_heap.c
index 5a09247a6..026560333 100644
--- a/lib/librte_eal/common/malloc_heap.c
+++ b/lib/librte_eal/common/malloc_heap.c
@@ -91,16 +91,32 @@ malloc_socket_to_heap_id(unsigned int socket_id)
 /*
  * Expand the heap with a memory area.
  */
+static int memvcmp(void *memory, unsigned char val, size_t size)
+{
+    unsigned char *mm = (unsigned char*)memory;
+    return (*mm == val) && memcmp(mm, mm + 1, size - 1) == 0;
+}
 static struct malloc_elem *
 malloc_heap_add_memory(struct malloc_heap *heap, struct rte_memseg_list
*msl,
                void *start, size_t len)
 {
        struct malloc_elem *elem = start;
+       void *ptr;
+       size_t data_len;
+

        malloc_elem_init(elem, heap, msl, len, elem, len);

        malloc_elem_insert(elem);

+       ptr = RTE_PTR_ADD(elem, MALLOC_ELEM_HEADER_LEN);
+       data_len = elem->size - MALLOC_ELEM_OVERHEAD;
+    if (memvcmp(ptr, 0, data_len)){
+        RTE_LOG(ERR, EAL, "liiiiiiilog: all zero\n");
+    } else {
+        RTE_LOG(ERR, EAL, "liiiiiiilog: not all zero\n");
+    }
+
        elem = malloc_elem_join_adjacent_free(elem);

        malloc_elem_free_list_insert(elem);
```

debug steps:
1. poison 2GB memory
```
[root@gz15-compute-s3-55e247e16e22 secure]# rm -rf
/dev/hugepages/rtemap_* ; huge/a.out 1
ss: ff
not all zero
```
2. Run testpmd(with no nic bind vfio-pci)
```
[root@gz15-compute-s3-55e247e16e22 secure]# dpdk-testpmd -l 0-3 -n 4 --
-i --nb-cores=3
EAL: Detected 64 lcore(s)
EAL: Detected 2 NUMA nodes
EAL: Multi-process socket /var/run/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
EAL: Probing VFIO support...
EAL: VFIO support initialized
EAL: liiiiiiilog: not all zero
EAL: No legacy callbacks, legacy socket not created
testpmd: No probed ethernet devices
Interactive-mode selected
testpmd: create a new mbuf pool <mb_pool_0>: n=171456, size=2176,
socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
testpmd: create a new mbuf pool <mb_pool_1>: n=171456, size=2176,
socket=1
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: liiiiiiilog: not all zero
Done
testpmd>
```

Dirty memory happens even no nic probe.
I tried on two CPUs, the same issue.
- Intel(R) Xeon(R) Gold 5218 CPU @ 2.30GHz
- Intel(R) Xeon(R) Platinum 8378A CPU @ 3.00GHz


^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-30  9:49                   ` lic121
@ 2022-08-30 10:59                     ` Dmitry Kozlyuk
  2022-08-30 12:47                       ` lic121
  2022-08-30 12:53                       ` lic121
  0 siblings, 2 replies; 16+ messages in thread
From: Dmitry Kozlyuk @ 2022-08-30 10:59 UTC (permalink / raw)
  To: lic121; +Cc: Morten Brørup, David Marchand, dev

Thank you for the most detailed info!

1. If you run the poisoner program the second time,
   does it also see dirty memory immediately after mmap()?

2. Kernel 4.19.90-2102 patchlevel 2102 is very high,
   can there be any unusual patches applied?
   Your host has "compute" in its name,
   can it have patches that trade security for performance?

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-30 10:59                     ` Dmitry Kozlyuk
@ 2022-08-30 12:47                       ` lic121
  2022-08-30 12:53                       ` lic121
  1 sibling, 0 replies; 16+ messages in thread
From: lic121 @ 2022-08-30 12:47 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: Morten Brørup, David Marchand, dev

On Tue, Aug 30, 2022 at 01:59:16PM +0300, Dmitry Kozlyuk wrote:
> Thank you for the most detailed info!
> 
> 1. If you run the poisoner program the second time,
>    does it also see dirty memory immediately after mmap()?

If I run the poisoner program the second time, no dirty memory.

There could be some difference on how my poisoner program and how
testpmd using mmap. Because I notice that testpmd leaves the hugepage
files under /dev/hugepage/rtemap_xxx even after testpmd exits. But my
poisoner program didn't create any file under /dev/hugepage.

> 
> 2. Kernel 4.19.90-2102 patchlevel 2102 is very high,
>    can there be any unusual patches applied?
>    Your host has "compute" in its name,
>    can it have patches that trade security for performance?

I may need to talk with the kernel team on this. Indeed, I failed to
reprouce the issue on a virtuam machine of kernel
4.18.0-305.12.1.el8_4.x86_64(2M page size instead of 1G). 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-30 10:59                     ` Dmitry Kozlyuk
  2022-08-30 12:47                       ` lic121
@ 2022-08-30 12:53                       ` lic121
  2022-09-03 13:53                         ` lic121
  1 sibling, 1 reply; 16+ messages in thread
From: lic121 @ 2022-08-30 12:53 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: Morten Brørup, David Marchand, dev

On Tue, Aug 30, 2022 at 01:59:16PM +0300, Dmitry Kozlyuk wrote:
> Thank you for the most detailed info!
> 
> 1. If you run the poisoner program the second time,
>    does it also see dirty memory immediately after mmap()?

No, running the poisoner program again doesn't show dirty memory
immediately after mmap.

I assume that there are some differences on how my poisoner program
using hugepage and how testpmd using hugepage. I notice that testpmd
leaves some files under /dev/hugepages/rtemap_xxx even after testpmd
process exits. But my program didn't create any file under
/dev/hugepages/.
> 
> 2. Kernel 4.19.90-2102 patchlevel 2102 is very high,
>    can there be any unusual patches applied?
>    Your host has "compute" in its name,
>    can it have patches that trade security for performance?

I may need to talk to the kernel team. Indeed, I failed to reproduce the
issue on a Virtual Machine of kernel 4.18.0-305.12.1.el8_4.x86_64.(2M
page size insetad of 1G)

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH] eal: zero out new added memory
  2022-08-30 12:53                       ` lic121
@ 2022-09-03 13:53                         ` lic121
  0 siblings, 0 replies; 16+ messages in thread
From: lic121 @ 2022-09-03 13:53 UTC (permalink / raw)
  To: Dmitry Kozlyuk; +Cc: Morten Brørup, David Marchand, dev

On Tue, Aug 30, 2022 at 12:53:37PM +0000, lic121 wrote:
> On Tue, Aug 30, 2022 at 01:59:16PM +0300, Dmitry Kozlyuk wrote:
> > Thank you for the most detailed info!
> > 
> > 1. If you run the poisoner program the second time,
> >    does it also see dirty memory immediately after mmap()?
> 
> No, running the poisoner program again doesn't show dirty memory
> immediately after mmap.
> 
> I assume that there are some differences on how my poisoner program
> using hugepage and how testpmd using hugepage. I notice that testpmd
> leaves some files under /dev/hugepages/rtemap_xxx even after testpmd
> process exits. But my program didn't create any file under
> /dev/hugepages/.
> > 
> > 2. Kernel 4.19.90-2102 patchlevel 2102 is very high,
> >    can there be any unusual patches applied?
> >    Your host has "compute" in its name,
> >    can it have patches that trade security for performance?
> 
> I may need to talk to the kernel team. Indeed, I failed to reproduce the
> issue on a Virtual Machine of kernel 4.18.0-305.12.1.el8_4.x86_64.(2M
> page size insetad of 1G)

Verified that this issue is caused by an down stream linux kernel issue.
Thanks for your attention.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2022-09-03 13:53 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-27  9:25 [PATCH] eal: zero out new added memory chengtcli
2022-08-27  9:57 ` Dmitry Kozlyuk
2022-08-27 13:31   ` lic121
2022-08-27 14:56     ` Dmitry Kozlyuk
2022-08-29  1:18       ` lic121
2022-08-29 11:37         ` lic121
2022-08-29 11:57           ` David Marchand
2022-08-29 12:37             ` Morten Brørup
2022-08-29 12:43               ` David Marchand
2022-08-29 12:49               ` Dmitry Kozlyuk
2022-08-30  1:11                 ` lic121
2022-08-30  9:49                   ` lic121
2022-08-30 10:59                     ` Dmitry Kozlyuk
2022-08-30 12:47                       ` lic121
2022-08-30 12:53                       ` lic121
2022-09-03 13:53                         ` lic121

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).