About memory coherency

DPDK usage discussions

* About memory coherency
@ 2022-08-09  2:38 Nick Tian
  2022-08-09  2:54 ` Nick Tian
  0 siblings, 1 reply; 13+ messages in thread
From: Nick Tian @ 2022-08-09  2:38 UTC (permalink / raw)
  To: users

Hi,
I am confused about the "no-huge" option of DPDK 21.11.
The DPDK usage text says: --no-huge: Use malloc instead of hugetlbfs.
But when I checked the EAL source code, I found code like the excerpt further down.
It looks like the "no-huge" option makes DPDK use memfd_create --> ftruncate --> mmap to reserve memory,
and then provide it to the application through rte_malloc.
Am I right?
If so, what does the "malloc" in "use malloc instead of hugetlbfs" refer to?

EAL_memory.c
static int eal_legacy_hugepage_init(void){
....
 if (internal_conf->no_hugetlbfs) {
....
#ifdef MEMFD_SUPPORTED
  /* create a memfd and store it in the segment fd table */
  memfd = memfd_create("nohuge", 0);
......
   /* we got an fd - now resize it */
   if (ftruncate(memfd, internal_conf->memory) < 0) {
.....
    fd = memfd;
    flags = MAP_SHARED;   }
....
  prealloc_addr = msl->base_va;
  addr = mmap(prealloc_addr, mem_sz, PROT_READ | PROT_WRITE,
    flags | MAP_FIXED, fd, 0);
...

^ permalink raw reply	[flat|nested] 13+ messages in thread

* About memory coherency
  2022-08-09  2:38 About memory coherency Nick Tian
@ 2022-08-09  2:54 ` Nick Tian
  2022-08-09  9:04   ` Kinsella, Ray
  0 siblings, 1 reply; 13+ messages in thread
From: Nick Tian @ 2022-08-09  2:54 UTC (permalink / raw)
  To: users

* RE: About memory coherency
  2022-08-09  2:54 ` Nick Tian
@ 2022-08-09  9:04   ` Kinsella, Ray
  2022-08-09  9:25     ` Burakov, Anatoly
  0 siblings, 1 reply; 13+ messages in thread
From: Kinsella, Ray @ 2022-08-09  9:04 UTC (permalink / raw)
  To: Nick Tian, users; +Cc: Burakov, Anatoly

I may be incorrect, but is it not simply the case that, when the no-huge parameter is used, MAP_HUGETLB is omitted from the flags?

Ray K

* RE: About memory coherency
  2022-08-09  9:04   ` Kinsella, Ray
@ 2022-08-09  9:25     ` Burakov, Anatoly
  2022-08-09  9:41       ` Re: RE: " Nick Tian
  0 siblings, 1 reply; 13+ messages in thread
From: Burakov, Anatoly @ 2022-08-09  9:25 UTC (permalink / raw)
  To: Kinsella, Ray, Nick Tian, users

There are two different issues at play here.

The purpose of the “no-huge” flag is to run DPDK without requiring hugepage memory. Originally, this was done with an anonymous mmap() call, so the memory was not backed by any fd at all. That is a problem for vhost-user, which relies on fds for its shared memory implementation. This is what memfd (a relatively recent addition to the kernel) addresses: it enables vhost-user with no-huge, because memfd actually does create an fd to back our memory.

That said, while the description says “malloc”, that is technically incorrect, because no malloc is involved in the process. The term “malloc” is simply shorthand for “use regular memory” and should be understood in that context.
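
To make the difference between the two allocation paths described above concrete, here is a minimal sketch in plain C (Linux only). It is not the EAL code: the helper name reserve_regular_memory, the memfd name "nohuge-demo", and the MAP_PRIVATE | MAP_ANONYMOUS flags for the fd-less path are assumptions made for illustration. The point is only that the memfd path yields a file descriptor that could be shared with another process, while the anonymous path does not.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a DPDK function): reserve `len` bytes of regular,
 * non-hugepage memory.
 *   use_memfd != 0: back the region with a memfd, so *fd_out receives a
 *                   descriptor that could be passed to another process.
 *   use_memfd == 0: anonymous mapping, no fd exists (*fd_out is set to -1).
 * Returns the mapped address, or MAP_FAILED on error.
 */
static void *reserve_regular_memory(size_t len, int use_memfd, int *fd_out)
{
    int fd = -1;
    int flags = MAP_PRIVATE | MAP_ANONYMOUS;   /* fd-less path */
    void *addr;

    assert(fd_out != NULL);

    if (use_memfd) {
        /* create an in-memory file and size it to the requested length */
        fd = memfd_create("nohuge-demo", 0);
        if (fd < 0)
            return MAP_FAILED;
        if (ftruncate(fd, (off_t)len) < 0) {
            close(fd);
            return MAP_FAILED;
        }
        flags = MAP_SHARED;                    /* fd-backed path */
    }

    addr = mmap(NULL, len, PROT_READ | PROT_WRITE, flags, fd, 0);
    if (addr == MAP_FAILED && fd >= 0) {
        close(fd);
        fd = -1;
    }
    *fd_out = fd;
    return addr;
}
```

Neither path calls malloc(), and both return ordinary page-cache or anonymous pages, which is consistent with reading “malloc” as “regular memory”.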

Thanks,
Anatoly

* Re: RE: About memory coherency
  2022-08-09  9:25     ` Burakov, Anatoly
@ 2022-08-09  9:41       ` Nick Tian
  2022-08-09  9:59         ` Burakov, Anatoly
  2022-08-09 11:32         ` Re: Re: RE: " Nick Tian
  0 siblings, 2 replies; 13+ messages in thread
From: Nick Tian @ 2022-08-09  9:41 UTC (permalink / raw)
  To: Burakov, Anatoly, Kinsella, Ray, users
  Cc: Jason Liu, Sunshine Qin, Mediter Li

Hi Burakov,
Thanks for your reply.
BTW, about the memory reserved by memfd_create --> ftruncate --> mmap:
what is the coherency between the cache and DDR? In other words, is it cacheable or uncacheable?
Is it possible for an application to pass this memory to a device with a DMA controller (I mean passing the physical address converted by rte_mem_virt2phy to the DMA controller)?
If yes, how can we ensure coherency between the cache and DDR?

static int eal_legacy_hugepage_init(void)
  memfd = memfd_create("nohuge", 0);
...
    fd = memfd;
    flags = MAP_SHARED; //MAP_SHARED means UNCACHEABLE?

* RE: Re: RE: About memory coherency
  2022-08-09  9:41       ` Re: RE: " Nick Tian
@ 2022-08-09  9:59         ` Burakov, Anatoly
  2022-08-09 11:32         ` Re: Re: RE: " Nick Tian
  1 sibling, 0 replies; 13+ messages in thread
From: Burakov, Anatoly @ 2022-08-09  9:59 UTC (permalink / raw)
  To: Nick Tian, Kinsella, Ray, users; +Cc: Jason Liu, Sunshine Qin, Mediter Li

You should not use that memory’s real physical addresses, because we cannot guarantee that the kernel won’t change them under our feet. Please do not do that.
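
For context, a user-space virtual-to-physical lookup on Linux is generally done through /proc/self/pagemap, and the result is only a snapshot: nothing pins the page, so the kernel may migrate or swap it afterwards. The sketch below shows such a lookup under stated assumptions. The helper names are hypothetical, and whether rte_mem_virt2phy works exactly this way is an assumption rather than something stated in this thread. The bit layout used (bit 63 = page present, bits 0-54 = page frame number) is the one in the kernel's pagemap documentation; unprivileged readers typically see a PFN of zero.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <unistd.h>

#define PAGEMAP_PRESENT  (UINT64_C(1) << 63)        /* page is in RAM       */
#define PAGEMAP_PFN_MASK ((UINT64_C(1) << 55) - 1)  /* bits 0-54 hold a PFN */

/* Hypothetical helper: read the 64-bit pagemap entry covering `va`.
 * Returns 0 on success, -1 if the entry cannot be read. */
static int read_pagemap_entry(const void *va, uint64_t *entry)
{
    long page_size = sysconf(_SC_PAGESIZE);
    int fd;
    off_t off;
    ssize_t n;

    if (page_size <= 0)
        return -1;
    fd = open("/proc/self/pagemap", O_RDONLY);
    if (fd < 0)
        return -1;
    /* one 8-byte entry per virtual page, indexed by virtual page number */
    off = (off_t)((uintptr_t)va / (uintptr_t)page_size * sizeof(uint64_t));
    n = pread(fd, entry, sizeof(*entry), off);
    close(fd);
    return n == (ssize_t)sizeof(*entry) ? 0 : -1;
}

/* Hypothetical helper: decode an entry into a physical address.
 * Returns -1 if the page is not present or the PFN is hidden (zero). */
static int pagemap_entry_to_phys(uint64_t entry, uint64_t page_size,
                                 uint64_t offset_in_page, uint64_t *phys)
{
    uint64_t pfn = entry & PAGEMAP_PFN_MASK;

    assert(phys != NULL);
    if (!(entry & PAGEMAP_PRESENT) || pfn == 0)
        return -1;
    *phys = pfn * page_size + offset_in_page;
    return 0;
}
```

Even when this lookup succeeds, the address is valid only at the moment of the read, which is the reason for the warning above.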

However, if you use IOVA as VA mode (that is, if you have an IOMMU on your machine and you’re using VFIO to bind devices to DPDK), then no-huge memory can be used with IOMMU, because then the kernel/IOMMU takes care of all the VA to PA mappings. I honestly cannot answer the “cacheable” question, as it never comes up (at least on IA). What are you trying to do, and how is this relevant?

In general, no-huge is meant to be a debug option, and is neither intended nor adequately tested for production workloads.

Thanks,
Anatoly

* Re: Re: RE: About memory coherency
  2022-08-09  9:41       ` Re: RE: " Nick Tian
  2022-08-09  9:59         ` Burakov, Anatoly
@ 2022-08-09 11:32         ` Nick Tian
  2022-08-09 11:44           ` Kinsella, Ray
  1 sibling, 1 reply; 13+ messages in thread
From: Nick Tian @ 2022-08-09 11:32 UTC (permalink / raw)
  To: Nick Tian, Burakov, Anatoly, Kinsella, Ray, users
  Cc: Jason Liu, Sunshine Qin, Mediter Li

Hi Anatoly,
Let me provide some detail about our use case.
1. Our HW platform does not support an IOMMU; it is an ARM A53 based platform.
2. We are trying to use DPDK to accelerate Ethernet forwarding performance with our own HW accelerator (let's abbreviate it as HAC).
3. We configure DPDK with "no-huge", use rte_malloc to allocate memory, and pass the physical address of this memory to the HAC (by writing the physical address to its register).
4. When the HAC receives an Ethernet packet, it parses the packet data and writes the descriptor and payload to the memory allocated in step 3. (The HAC knows where to write because we have already given it the address through its register.)
5. The application uses the virtual address of this memory to read the descriptor and payload and process them further.
As you know, memory that is passed to HW should be "uncacheable" and "physically contiguous" to avoid incoherency and other issues.
(Suppose the 64 bytes starting at Address1 are already in the CPU L2 cache and, in the meantime, the HAC writes 32 bytes starting at Address1. An incoherency issue then occurs: if software reads the content at Address1, it gets
the old copy from the cache instead of the latest one in DDR.)

So I need to know whether the memory allocated by rte_malloc (configured with no-huge) is uncacheable and physically contiguous.
We need it to be physically contiguous and uncacheable; otherwise it is not suitable for this use case.
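
The stale-read scenario in the parenthetical above can be illustrated with a toy software model. This is only a sketch: the single 64-byte "cache line", the function names, and the explicit invalidate step are all hypothetical and stand in for whatever the real platform provides (a coherent interconnect, an uncached mapping, or cache maintenance done by a driver). It does not touch real caches.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define LINE_SIZE 64

/* Toy model of one CPU cache line covering 64 bytes of DRAM. */
struct toy_cache_line {
    int valid;
    uint8_t data[LINE_SIZE];
};

/* CPU read: on a miss the whole line is filled from DRAM; on a hit the
 * cached copy is returned without looking at DRAM again. */
static uint8_t cpu_read(struct toy_cache_line *line, const uint8_t *dram,
                        size_t idx)
{
    if (!line->valid) {
        memcpy(line->data, dram, LINE_SIZE);
        line->valid = 1;
    }
    return line->data[idx];
}

/* Non-coherent device write: data lands in DRAM only; the cache is not
 * notified, so a valid line keeps its old contents. */
static void dma_write(uint8_t *dram, const uint8_t *src, size_t len)
{
    memcpy(dram, src, len);
}

/* Discard the cached copy so that the next CPU read refills from DRAM. */
static void cache_invalidate(struct toy_cache_line *line)
{
    line->valid = 0;
}
```

In this model, a cpu_read() that follows a dma_write() keeps returning the old byte until cache_invalidate() is called, which mirrors the 64-byte/32-byte case described in the message.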

* RE: Re: Re: RE: About memory coherency
  2022-08-09 11:32         ` Re: Re: RE: " Nick Tian
@ 2022-08-09 11:44           ` Kinsella, Ray
  2022-08-09 11:57             ` Nick Tian
  2022-08-09 12:16             ` Re: Re: RE: " Burakov, Anatoly
  0 siblings, 2 replies; 13+ messages in thread
From: Kinsella, Ray @ 2022-08-09 11:44 UTC (permalink / raw)
  To: Nick Tian, Burakov, Anatoly, users
  Cc: Jason Liu, Sunshine Qin, Mediter Li, jerinj

Hi Nick,

Adding Jerin to direct your query. I think you need some of the ARM folks to chime in here, as coherency constraints on ARM are different from those on Intel.

Ray K

* RE: About memory coherency
  2022-08-09 11:44           ` Kinsella, Ray
@ 2022-08-09 11:57             ` Nick Tian
  2022-08-09 12:16             ` Re: Re: RE: " Burakov, Anatoly
  1 sibling, 0 replies; 13+ messages in thread
From: Nick Tian @ 2022-08-09 11:57 UTC (permalink / raw)
  To: Kinsella, Ray, Burakov, Anatoly, users
  Cc: Jason Liu, Sunshine Qin, Mediter Li, jerinj

Hi Jerin,
Let me reorganize my question.
We are using DPDK configured with the "no-huge" option to implement our Ethernet forwarding application. (On my HW platform, Linux hangs during startup when I enable hugepages.)
After checking the EAL source code, I found that when DPDK is configured with "no-huge", it allocates memory with memfd_create --> ftruncate --> mmap.
Our use case is as follows:

1. Our HW platform does not support an IOMMU; it is an ARM A53 based platform.
2. We are trying to use DPDK to accelerate Ethernet forwarding performance with our own HW accelerator (let's abbreviate it as HAC).
3. We configure DPDK with "no-huge", use rte_malloc to allocate memory, and pass the physical address of this memory to the HAC (by writing the physical address to its register).
4. When the HAC receives an Ethernet packet, it parses the packet data and writes the descriptor and payload to the memory allocated in step 3. (The HAC knows where to write because we have already given it the physical address through its register.)
5. The application uses the virtual address of this memory to read the descriptor and payload and process them further.
I need the memory I get from rte_malloc to be physically contiguous and uncacheable, to avoid incoherency and other issues.
I want to know whether memory allocated via memfd_create --> ftruncate --> mmap is physically contiguous and uncacheable.
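
On the contiguity half of the question: pages obtained by mmap of regular (non-hugepage) memory are allocated by the kernel one page at a time, so there is no general guarantee that neighbouring virtual pages map to neighbouring physical frames. A small sketch of how one might check a buffer, assuming the page frame numbers of its pages have already been collected by some other means (for example from /proc/self/pagemap with sufficient privileges), is shown here; the function name is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: given the page frame numbers of the consecutive
 * virtual pages of a buffer, return 1 if the frames are also consecutive
 * (the buffer is physically contiguous), 0 otherwise. Buffers of zero or
 * one page are trivially contiguous.
 */
static int pfns_are_contiguous(const uint64_t *pfns, size_t n_pages)
{
    size_t i;

    for (i = 1; i < n_pages; i++) {
        if (pfns[i] != pfns[i - 1] + 1)
            return 0;
    }
    return 1;
}
```

A passing check only describes the mapping at the time the frame numbers were read; it says nothing about cacheability and does not prevent the kernel from moving the pages later.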

 ------------------原始邮件 ------------------
发件人:Kinsella, Ray <ray.kinsella@intel.com>
发送时间:08/09/22 19:44:26
收件人:Nick Tian <nick.tian@longsailingsemi.com>, Burakov, Anatoly <anatoly.burakov@intel.com>, users@dpdk.org <users@dpdk.org>
抄送:Jason Liu <jason.liu@longsailingsemi.com>, Sunshine Qin <sunshine.qin@longsailingsemi.com>, Mediter Li <mediter.li@longsailingsemi.com>, jerinj@marvell.com <jerinj@marvell.com>
主题:RE: 回复：回复：RE: About memory coherency

Hi Nick,

Adding Jerin to direct your query. As I think you need some of the ARM guys to chime in here, coherency constraints are ARM is different to Intel.

Ray K

From: Nick Tian <nick.tian@longsailingsemi.com>
Sent: Tuesday 9 August 2022 12:32
To: Nick Tian <nick.tian@longsailingsemi.com>; Burakov, Anatoly <anatoly.burakov@intel.com>; Kinsella, Ray <ray.kinsella@intel.com>; users@dpdk.org
Cc: Jason Liu <jason.liu@longsailingsemi.com>; Sunshine Qin <sunshine.qin@longsailingsemi.com>; Mediter Li <mediter.li@longsailingsemi.com>
Subject: Re: Re: RE: About memory coherency

Hi Anatoly,
Let me provide some detail about our use case.
1. Our HW platform does not support an IOMMU; it's an ARM A53-based platform.
2. We are trying to use DPDK to accelerate Ethernet forwarding performance with our own HW accelerator (abbreviated as HAC below).
3. We configure DPDK with "no-huge", use rte_malloc to allocate memory, and pass the physical address of this memory to the HAC by writing it to the HAC's register.
4. When the HAC receives an Ethernet packet, it parses the packet data and writes the descriptor and payload to the memory allocated in step 3. (The HAC knows where to write because we have already given it the address through its register.)
5. The application uses the virtual address of this memory to read the descriptor and payload for further processing.
As you know, memory that is passed to HW should be "uncacheable" and "physically contiguous" to avoid incoherency and other issues.
(Suppose the 64 bytes starting at Address1 are already in the CPU L2 cache and, in the meantime, the HAC writes 32 bytes starting at Address1. An incoherency then occurs: if software reads the content at Address1, it gets the old copy from the cache instead of the latest one in DDR.)

So I need to know whether the memory allocated by rte_malloc (configured with no-huge) is UNCACHEABLE and physically contiguous.
We need it to be physically contiguous and UNCACHEABLE; otherwise it is not suitable for this use case.
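The 64-byte/32-byte example above comes down to cache-line arithmetic: cache maintenance works on whole lines, so even a partial DMA write leaves an entire line stale. The sketch below only computes which lines a DMA write touches. The names `line_span` and `cache_lines_covering` are hypothetical, the 64-byte line size is taken from the example rather than from any A53 documentation, and the code does not perform any cache invalidation itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result type: first line-aligned address and number of lines. */
struct line_span {
    uintptr_t first;
    size_t count;
};

/*
 * Hypothetical helper: return the cache lines overlapped by a device write
 * of `len` bytes starting at `addr`. `line_size` must be a power of two.
 */
static struct line_span cache_lines_covering(uintptr_t addr, size_t len,
                                             size_t line_size)
{
    struct line_span span = { 0, 0 };

    assert(line_size > 0 && (line_size & (line_size - 1)) == 0);
    if (len == 0)
        return span;

    /* Round the first and last byte down to their line boundaries. */
    uintptr_t mask = ~(uintptr_t)(line_size - 1);
    uintptr_t first = addr & mask;
    uintptr_t last = (addr + len - 1) & mask;

    span.first = first;
    span.count = (size_t)((last - first) / line_size) + 1;
    return span;
}
```

With a 64-byte line, a 32-byte write at a line-aligned Address1 falls inside a single line, so the CPU's cached copy of all 64 bytes is affected, not just the 32 bytes the device wrote.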

------------------ Original message ------------------
From: Nick Tian <nick.tian@longsailingsemi.com>
Sent: 08/09/22 17:41:35
To: Burakov, Anatoly <anatoly.burakov@intel.com>, Kinsella, Ray <ray.kinsella@intel.com>, users@dpdk.org <users@dpdk.org>
Cc: Jason Liu <jason.liu@longsailingsemi.com>, Sunshine Qin <sunshine.qin@longsailingsemi.com>, Mediter Li <mediter.li@longsailingsemi.com>
Subject: Re: RE: About memory coherency

Hi Burakov,
Thanks for your reply.
BTW, regarding the memory reserved by memfd_create-->ftruncate-->mmap:
what is the coherency behavior between cache and DDR? In other words, is it cacheable or uncacheable?
Is it possible for an application to pass this memory to a device with a DMA controller (I mean passing the PHY address converted by rte_mem_virt2phy to the DMA controller)?
If yes, how can we ensure coherency between cache and DDR?

static int eal_legacy_hugepage_init(void)
  memfd = memfd_create("nohuge", 0);
...
   fd = memfd;
    flags = MAP_SHARED; //MAP_SHARED means UNCACHEABLE?

------------------ Original message ------------------
From: Burakov, Anatoly <anatoly.burakov@intel.com>
Sent: 08/09/22 17:25:35
To: Kinsella, Ray <ray.kinsella@intel.com>, Nick Tian <nick.tian@longsailingsemi.com>, users@dpdk.org <users@dpdk.org>
Subject: RE: About memory coherency

There are two different issues at play here.

The purpose of the “no-huge” flag is to run DPDK without requiring hugepage memory. Originally, this was done using an anonymous mmap() call, so this memory was not using any fd’s at all. This presents a problem with vhost-user, because it relies on fd’s for its shared memory implementation. This is what memfd (a relatively recent addition to the kernel) addresses: it enables usage of vhost-user with no-huge because memfd actually does create an fd to back our memory.

That said, while the description says “malloc”, it is technically incorrect because there’s no malloc involved in the process. The “malloc” term is simply shorthand for “use regular memory”, and should be understood in that context.
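As an illustration of the fd-backed path described here, the following is a minimal sketch of the memfd_create -> ftruncate -> mmap sequence using plain Linux calls. It is not the EAL code: `map_memfd_region` is a hypothetical helper, the mapping address is left to the kernel instead of using MAP_FIXED with a preallocated address, and it assumes a glibc that exposes memfd_create (2.27 or later).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not a DPDK function: reserve `len` bytes of ordinary
 * (non-hugepage) memory backed by a memfd. On success the mapping is
 * returned and the backing fd is stored in *fd_out; on failure NULL is
 * returned.
 */
static void *map_memfd_region(size_t len, int *fd_out)
{
    assert(len > 0 && fd_out != NULL);

    /* Create an anonymous, RAM-backed file; this is what provides the fd. */
    int fd = memfd_create("nohuge-sketch", 0);
    if (fd < 0)
        return NULL;

    /* A new memfd has size 0, so resize it before mapping. */
    if (ftruncate(fd, (off_t)len) < 0) {
        close(fd);
        return NULL;
    }

    /* MAP_SHARED makes stores visible through other mappings of the fd. */
    void *addr = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (addr == MAP_FAILED) {
        close(fd);
        return NULL;
    }

    *fd_out = fd;
    return addr;
}
```

Because the region has an fd, another mapping of the same fd (in this process or one that receives the fd) sees the same pages, which is the property vhost-user relies on. MAP_SHARED only controls that sharing; it says nothing about whether the memory is cacheable.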

Thanks,
Anatoly

From: Kinsella, Ray <ray.kinsella@intel.com>
Sent: Tuesday, August 9, 2022 10:04 AM
To: Nick Tian <nick.tian@longsailingsemi.com>;users@dpdk.org
Cc: Burakov, Anatoly <anatoly.burakov@intel.com>
Subject: RE: About memory coherency

I may be incorrect, but is it not simply the case that, when using the no-huge parameter, MAP_HUGETLB is omitted from flags?

Ray K

From: Nick Tian <nick.tian@longsailingsemi.com>
Sent: Tuesday 9 August 2022 03:55
To: users@dpdk.org
Subject: About memory coherency

Hi,
I am confused about the "no-huge" option of DPDK 21.11.
The DPDK usage text says: --no-huge: Use malloc instead of hugetlbfs.
But when I checked the EAL source code, I found a code fragment like the one below.
It looks like the "no-huge" option leads DPDK to use memfd_create-->ftruncate-->mmap to reserve memory
and then provide it to the application through rte_malloc.
Am I right?
If so, what does the "malloc" in "use malloc instead of hugetlbfs" refer to?

EAL_memory.c
static int eal_legacy_hugepage_init(void){
....
 if (internal_conf->no_hugetlbfs) {
....
#ifdef MEMFD_SUPPORTED
  /* create a memfd and store it in the segment fd table */
  memfd = memfd_create("nohuge", 0);
......
  /* we got an fd - now resize it */
   if (ftruncate(memfd, internal_conf->memory) < 0) {
.....
   fd = memfd;
    flags = MAP_SHARED;   }
....
  prealloc_addr = msl->base_va;
  addr = mmap(prealloc_addr, mem_sz, PROT_READ | PROT_WRITE,
    flags | MAP_FIXED, fd, 0);
...


* RE: Re: Re: RE: About memory coherency
  2022-08-09 11:44           ` Kinsella, Ray
  2022-08-09 11:57             ` Nick Tian
@ 2022-08-09 12:16             ` Burakov, Anatoly
  2022-08-09 13:05               ` Re: About memory coherency nick.tian
  2022-08-09 13:16               ` Re: Re: RE: About memory coherency Jerin Jacob Kollanukkaran
  1 sibling, 2 replies; 13+ messages in thread
From: Burakov, Anatoly @ 2022-08-09 12:16 UTC (permalink / raw)
  To: Kinsella, Ray, Nick Tian, users
  Cc: Jason Liu, Sunshine Qin, Mediter Li, jerinj

[-- Attachment #1: Type: text/plain, Size: 7296 bytes --]

Memory allocated with ‘no-huge’ will not be physically contiguous. Whether it’s cacheable depends, I assume, on your platform. I can’t comment on ARM platforms, so I’ll defer to Jerin et al. to comment on this 😊

Thanks,
Anatoly


* Re: About memory coherency
  2022-08-09 12:16             ` Re: Re: RE: " Burakov, Anatoly
@ 2022-08-09 13:05               ` nick.tian
  2022-08-09 13:13                 ` Burakov, Anatoly
  2022-08-09 13:16               ` Re: Re: RE: About memory coherency Jerin Jacob Kollanukkaran
  1 sibling, 1 reply; 13+ messages in thread
From: nick.tian@longsailingsemi.com @ 2022-08-09 13:05 UTC (permalink / raw)
  To: Burakov, Anatoly, Kinsella, Ray, users
  Cc: Jason Liu, Sunshine Qin, Mediter Li, jerinj

[-- Attachment #1: Type: text/html, Size: 22962 bytes --]

^ permalink raw reply	[flat|nested] 13+ messages in thread

* RE: Re: About memory coherency
  2022-08-09 13:05               ` Re: About memory coherency nick.tian
@ 2022-08-09 13:13                 ` Burakov, Anatoly
  0 siblings, 0 replies; 13+ messages in thread
From: Burakov, Anatoly @ 2022-08-09 13:13 UTC (permalink / raw)
  To: nick.tian@longsailingsemi.com, Kinsella, Ray, users
  Cc: Jason Liu, Sunshine Qin, Mediter Li, jerinj

[-- Attachment #1: Type: text/plain, Size: 9762 bytes --]

rte_malloc merely distributes memory that was already allocated from the system using mmap(). Physical contiguity isn’t guaranteed even with hugepages unless you’re using an IOMMU or legacy mem mode, and without hugepages there is no way we can allocate physically contiguous memory, because we have no control over what the kernel gives us.

So, the underlying memory will not be physically contiguous and, as a consequence, neither will the memory given out by rte_malloc.
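To make the contiguity point concrete, here is a small sketch of how an application could check it for one buffer. It assumes the caller has already obtained the physical address of each page backing the buffer (for example with rte_mem_virt2phy, which is mentioned earlier in this thread); `pages_physically_contiguous` is a hypothetical helper, not a DPDK API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: `page_phys[i]` is the physical address of the i-th
 * page of a virtually contiguous buffer. Returns true only if every page
 * directly follows the previous one in physical memory.
 */
static bool pages_physically_contiguous(const uint64_t *page_phys,
                                        size_t n_pages, uint64_t page_size)
{
    assert(page_size > 0);

    for (size_t i = 1; i < n_pages; i++) {
        /* A gap or reordering between neighbours breaks contiguity. */
        if (page_phys[i] != page_phys[i - 1] + page_size)
            return false;
    }
    return true;
}
```

A buffer that fits inside a single page trivially passes this check; anything spanning several regular pages may or may not, depending on what the kernel happened to hand out.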

Thanks,
Anatoly

From: nick.tian@longsailingsemi.com <nick.tian@longsailingsemi.com>
Sent: Tuesday, August 9, 2022 2:06 PM
To: Burakov, Anatoly <anatoly.burakov@intel.com>; Kinsella, Ray <ray.kinsella@intel.com>; users@dpdk.org
Cc: Jason Liu <jason.liu@longsailingsemi.com>; Sunshine Qin <sunshine.qin@longsailingsemi.com>; Mediter Li <mediter.li@longsailingsemi.com>; jerinj@marvell.com
Subject: Re: About memory coherency

Hi Anatoly,
> Memory allocated with ‘no-huge’ will not be physically contiguous.

Do you mean that DPDK cannot guarantee that the memory allocated by a single rte_malloc call is physically contiguous?

Or that memory1 (allocated by rte_malloc first) and memory2 (allocated by rte_malloc later) are not contiguous with each other?


* Re: Re: Re: RE: About memory coherency
  2022-08-09 12:16             ` Re: Re: RE: " Burakov, Anatoly
  2022-08-09 13:05               ` Re: About memory coherency nick.tian
@ 2022-08-09 13:16               ` Jerin Jacob Kollanukkaran
  1 sibling, 0 replies; 13+ messages in thread
From: Jerin Jacob Kollanukkaran @ 2022-08-09 13:16 UTC (permalink / raw)
  To: Burakov, Anatoly, Kinsella, Ray, Nick Tian, users
  Cc: Jason Liu, Sunshine Qin, Mediter Li

[-- Attachment #1: Type: text/plain, Size: 8199 bytes --]

> I assume, depending on your platform. I can’t comment on ARM platforms, so I’ll defer to Jerin et al. to comment on this 😊

it will be cacheable.


end of thread, other threads:[~2022-08-11  8:26 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-08-09  2:38 About memory coherency Nick Tian
2022-08-09  2:54 ` Nick Tian
2022-08-09  9:04   ` Kinsella, Ray
2022-08-09  9:25     ` Burakov, Anatoly
2022-08-09  9:41       ` Re: RE: " Nick Tian
2022-08-09  9:59         ` Burakov, Anatoly
2022-08-09 11:32         ` Re: Re: RE: " Nick Tian
2022-08-09 11:44           ` Kinsella, Ray
2022-08-09 11:57             ` Nick Tian
2022-08-09 12:16             ` Re: Re: RE: " Burakov, Anatoly
2022-08-09 13:05               ` Re: About memory coherency nick.tian
2022-08-09 13:13                 ` Burakov, Anatoly
2022-08-09 13:16               ` Re: Re: RE: About memory coherency Jerin Jacob Kollanukkaran
