DPDK usage discussions
From: Sudhakar Vajha <sudhakar.vajha@oracle.com>
To: "users@dpdk.org" <users@dpdk.org>
Subject: KASLR enabled in Linux Kernel
Date: Mon, 22 Jan 2024 09:58:48 +0000	[thread overview]
Message-ID: <PH0PR10MB472839FEC31AB6CC8352C5F0EA752@PH0PR10MB4728.namprd10.prod.outlook.com> (raw)



Hi Team,

The issue we are facing:
We have a telecom product, the "Session Border Controller (SBC)" 6400, in which we use DPDK version 22.11.1.

Users have encountered cases where enabling ASLR in the Linux kernel causes DPDK initialization to fail on the SBC 6400 platform. Because ASLR is required for FIPS, this is a problem for users who need both the security of ASLR and the high-performance packet processing that DPDK offers.

Problem analysis
DPDK determines the number of memory types as follows:
number of huge page sizes * number of NUMA nodes present in the system, i.e. 2 * 1 = 2. In our case there are two memory types (huge page sizes of 1GB and 2MB) on one NUMA node.

Deciding the amount of memory going towards each memory type is a balancing act between maximum segments per type, maximum memory per type, and number of detected NUMA nodes. The goal is to make sure each memory type gets at least one memseg list.

The total amount of memory is limited by RTE_MAX_MEM_MB value.

The total amount of memory per type is limited by either RTE_MAX_MEM_MB_PER_TYPE, or by RTE_MAX_MEM_MB divided by the number of detected NUMA nodes. Additionally, maximum number of segments per type is also limited by RTE_MAX_MEMSEG_PER_TYPE. This is because for smaller page sizes, it can take hundreds of thousands of segments to reach the above specified per-type memory limits.

Additionally, each type may have multiple memseg lists associated with it, each limited by either RTE_MAX_MEM_MB_PER_LIST for bigger page sizes, or RTE_MAX_MEMSEG_PER_LIST segments for smaller ones. The number of memseg lists per type is decided based on the above limits and also takes the number of detected NUMA nodes into account, to make sure that DPDK does not run out of memseg lists before all NUMA nodes are populated with memory.

          #define RTE_MAX_MEM_MB 524288           /* defined in rte_build_config.h */
          #define RTE_MAX_MEM_MB_PER_TYPE 65536   /* defined in rte_config.h */
          #define RTE_MAX_MEMSEG_PER_LIST 32768
          #define RTE_MAX_MEM_MB_PER_LIST 65536
          #define RTE_MAX_MEMSEG_PER_TYPE 32768

          max_mem = (uint64_t)RTE_MAX_MEM_MB << 20;
          max_mem_per_type = RTE_MIN((uint64_t)RTE_MAX_MEM_MB_PER_TYPE << 20,
                                                             max_mem / n_memtypes);

          The following logs are captured from 6400 during boot-up time:
          EAL: eal_dynmem_memseg_lists_init:117 n_memtypes = 2!!!!!
          EAL: eal_dynmem_memseg_lists_init:124 max_mem:549755813888 max_mem_per_type :68719476736
          EAL: eal_dynmem_memseg_lists_init:132 max_seglists_per_type = 64!!!!!
          EAL: eal_dynmem_memseg_lists_init:175 max_segs_per_type = 64!!!!!
          EAL: eal_dynmem_memseg_lists_init:179 max_segs_per_list = 64!!!!!
          EAL: eal_dynmem_memseg_lists_init:184 max_mem_per_list = 68719476736!!!!!
          EAL: eal_dynmem_memseg_lists_init:188 n_segs = 64!!!!!

One memseg list is created for each memory type, named as follows:
*         memseg-1048576k-0-0 (1GB) with 64 segments
*         memseg-2048k-0-0 (2MB) with 32768 segments
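
The segment counts above can be reproduced from the limits quoted earlier. The following is only a sketch of that arithmetic, modeled on the description in this message rather than copied from the DPDK sources. The macro names without the RTE_ prefix and the helper names (min_u64, count_memtypes, mem_per_type, segs_per_list) are hypothetical, and the constants are the ones quoted in this message; other builds may use different values.

```c
#include <assert.h>
#include <stdint.h>

/* Limits as quoted in this message (assumed; other builds may differ). */
#define MAX_MEM_MB           524288ULL
#define MAX_MEM_MB_PER_TYPE  65536ULL
#define MAX_MEMSEG_PER_LIST  32768ULL
#define MAX_MEM_MB_PER_LIST  65536ULL
#define MAX_MEMSEG_PER_TYPE  32768ULL

static uint64_t min_u64(uint64_t a, uint64_t b)
{
    return a < b ? a : b;
}

/* One memory type per (huge page size, NUMA node) pair. */
static unsigned int count_memtypes(unsigned int n_page_sizes,
                                   unsigned int n_numa_nodes)
{
    return n_page_sizes * n_numa_nodes;
}

/* Memory budget of one type, in bytes. */
static uint64_t mem_per_type(unsigned int n_memtypes)
{
    uint64_t max_mem = MAX_MEM_MB << 20;

    assert(n_memtypes > 0);
    return min_u64(MAX_MEM_MB_PER_TYPE << 20, max_mem / n_memtypes);
}

/* Number of segments in one memseg list for the given page size. */
static uint64_t segs_per_list(uint64_t page_sz, unsigned int n_memtypes)
{
    uint64_t max_segs_per_type, max_segs_per_list, max_mem_per_list;

    assert(page_sz > 0);
    max_segs_per_type = min_u64(mem_per_type(n_memtypes) / page_sz,
                                MAX_MEMSEG_PER_TYPE);
    max_segs_per_list = min_u64(max_segs_per_type, MAX_MEMSEG_PER_LIST);
    max_mem_per_list = min_u64(max_segs_per_list * page_sz,
                               MAX_MEM_MB_PER_LIST << 20);
    return min_u64(max_segs_per_list, max_mem_per_list / page_sz);
}
```

With two memory types, segs_per_list(1ULL << 30, 2) gives 64 and segs_per_list(2ULL << 20, 2) gives 32768, which agrees with the logged n_segs value and the two memseg lists listed above.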

During SBC 6400 initialization, we request 64 huge pages of 1GB each. If DPDK obtains all 64 huge pages in one contiguous physical memory region, remapping them into the 64-segment memory segment list works without any issue. With ASLR enabled, however, there is no guarantee that the huge pages will be allocated in contiguous physical memory. If they do happen to be contiguous, even with ASLR enabled, the remapping into the memory segment list is done in a single step. This is the default behavior.

The issue occurs when the 64 huge pages are not physically contiguous and have to be remapped into the memory segment list. In that case the remapping is done in two steps:

1st step:

Huge page memory layout:
   0    1     2     3     4     5     6     7     8     9  ...  63

For example, if pages 0-9 are contiguous and the remaining huge pages are located elsewhere in physical memory, only pages 0-9 are remapped into the memory segment list in this step.
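
To make the notion of a contiguous run concrete, here is a small sketch that measures how many pages form one physically contiguous run. It is an illustration only: the function name contiguous_run_len and the input array of physical addresses are hypothetical and do not come from DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Length of the physically contiguous run of pages starting at index
 * `start`. `phys` holds one physical address per huge page, sorted in
 * ascending order (hypothetical input for this sketch).
 */
static size_t contiguous_run_len(const uint64_t *phys, size_t n_pages,
                                 size_t start, uint64_t page_sz)
{
    size_t end = start + 1;

    assert(start < n_pages);
    /* Extend the run while each page directly follows the previous one. */
    while (end < n_pages && phys[end] == phys[end - 1] + page_sz)
        end++;
    return end - start;
}
```

For 64 pages of 1GB where pages 0-9 are adjacent and pages 10-63 start at an unrelated address, the function returns 10 for start 0 and 54 for start 10, i.e. the two runs that are remapped in the two steps.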



Memory segment list:
   0    1     2     3     4     5     6     7     8     9  ...  63

2nd Step:

The remaining huge pages are then remapped. This time the memory segment list is not empty (it already holds 10 segments, for pages 0-9), so DPDK leaves space for one segment as a hole and tries to remap the huge pages into the segments that remain. Because the number of huge pages equals the size of the memory segment list (64) and one segment has been set aside for the hole, DPDK cannot find enough free segments in the list.


Huge page memory layout:
   0    1     2     3     4     5     6     7     8     9                   10   11  ...  63

54 huge pages remain, and DPDK tries to remap them into the memory segment list.

However, the memory segment list has only 53 free segments, because one segment was left as a hole. The memory allocation therefore fails, and DPDK initialization fails with it.

Memory segment list:
   0    1     2     3     4     5     6     7     8     9     10   11   12   13  ...  63

[Figure (image001.png, image002.png): space is left for a hole if the memory segment list is not empty; the remaining space holds only 53 segments.]

Note: Why does DPDK leave space for a hole in the memory segment list?

Basically, DPDK needs to know how many segments there are so that it can map all pages into one address space and leave appropriate holes between segments, so that rte_malloc does not concatenate them into one big segment. In this case, however, all 64 pages belong to one address space, so leaving space for a hole should not be required.
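
The two-step remapping described above can be modeled with a plain array of slots. This is a simplified model of the behavior described in this message, not DPDK's implementation: the names find_free_run and remap_run and the bool array standing in for the memory segment list are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Index of the first run of `n` free slots, or -1 if there is none. */
static int find_free_run(const bool *used, size_t len, size_t n)
{
    size_t run = 0;

    for (size_t i = 0; i < len; i++) {
        run = used[i] ? 0 : run + 1;
        if (run == n)
            return (int)(i + 1 - n);
    }
    return -1;
}

/*
 * Place `n_pages` contiguous pages into the list. If the list already
 * holds segments, one extra free slot is requested and left empty in
 * front of the new run (the hole). Returns the first slot used, or -1.
 */
static int remap_run(bool *used, size_t len, size_t n_pages)
{
    bool empty = true;
    int idx;

    assert(n_pages > 0);
    for (size_t i = 0; i < len; i++)
        if (used[i])
            empty = false;

    idx = find_free_run(used, len, n_pages + (empty ? 0 : 1));
    if (idx < 0)
        return -1;
    if (!empty)
        idx++; /* skip the hole */
    for (size_t i = 0; i < n_pages; i++)
        used[(size_t)idx + i] = true;
    return idx;
}
```

In this model, a 64-slot list accepts the first run of 10 pages at slot 0, and the second run of 54 pages then fails because 55 free slots are needed and only 54 are left. With a 65-slot list the second run is placed at slot 11, leaving slot 10 as the hole.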



The following are my queries:

1.    Why is space left in the memseg list, and what is the significance of the hole?

2.    Can I scale the memory segment list up to more than 64 segments?



Regards,

Sudhakar


