Hi All,
I'm seeing a crash inside rte_mempool_create when I allocate multiple large mempools (several GB each). Specifically, the first argument passed to alloc_seg (Linux) is NULL. I investigated a bit, and it looks like there is a bug in rte_fbarray_find_next_n_free.
alloc_seg_walk uses rte_fbarray_find_next_n_free to check whether the current memseg_arr has n (6 in my case) consecutive free pages. In reality it does not, but rte_fbarray_find_next_n_free reports that it does, so we read out of bounds from memseg_arr, which leads to the NULL argument.
The bug is in the find_next_n function, which is called from rte_fbarray_find_next_n_free:
first = MASK_LEN_TO_IDX(start);
first_mod = MASK_LEN_TO_MOD(start);
ignore_msk = ~((1ULL << first_mod) - 1); // in my case first_mod is 0, so ignore_msk is all 1s
last = MASK_LEN_TO_IDX(arr->len);
last_mod = MASK_LEN_TO_MOD(arr->len);
last_msk = ~(UINT64_MAX << last_mod); // arr->len is 32 (arr is memseg_arr), so only the low 32 bits are set
.....SKIP.....
/* if we're looking for free spaces, invert the mask */
if (!used) // THIS IS TRUE
	cur_msk = ~cur_msk;
/* combine current ignore mask with last index ignore mask */
if (msk_idx == last) // TRUE
	ignore_msk |= last_msk; // BUG HERE
// Since ignore_msk is already all 1s, OR-ing in last_msk changes nothing, so last_msk has no effect. The result is incorrect: bits 32 and above are not masked off by last_msk and are treated as free space.
/* if we have an ignore mask, ignore once */
if (ignore_msk) {
	cur_msk &= ignore_msk;
	ignore_msk = 0; // With more than one mask (msk->n_masks > 1) this reset hides the issue: by the time we reach the BUG line, ignore_msk is already 0 and does not absorb last_msk. In my case n_masks is 1.
}
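To make the arithmetic concrete, here is a minimal standalone sketch of the mask computation above for a single 64-bit mask word. It is not DPDK code: free_mask_as_posted and its parameters are hypothetical names, and it assumes start < len < 64 and that bits of the used mask at or above len are 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not DPDK code: reproduces the mask arithmetic of the
 * quoted find_next_n snippet for a single 64-bit mask word (msk_idx == last).
 * `used` has bit i set when entry i is in use; bits at or above `len` are 0.
 */
static uint64_t free_mask_as_posted(uint64_t used, unsigned int start,
                                    unsigned int len)
{
    assert(start < len && len < 64); /* sketch covers one partial word only */

    uint64_t ignore_msk = ~((1ULL << start) - 1); /* all 1s when start == 0 */
    uint64_t last_msk = ~(UINT64_MAX << len);     /* low `len` bits set */
    uint64_t cur_msk = ~used;                     /* looking for free entries */

    ignore_msk |= last_msk; /* the line marked BUG: OR cannot clear any bit */
    if (ignore_msk)
        cur_msk &= ignore_msk;
    return cur_msk;
}
```

With used = 0x11111111 (every fourth entry of a 32-entry array in use), start = 0 and len = 32, this returns 0xFFFFFFFFEEEEEEEE: bits 32..63 are still set, i.e. entries past the end of the array are reported as free.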
I think ignore_msk |= last_msk; should be replaced with cur_msk &= last_msk;. I tried this, and DPDK no longer crashes and seems to work fine, but I'd like someone familiar with this code to take a look.
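As a sanity check of the proposed change, here is a small standalone sketch, again not DPDK code. free_mask_proposed and first_run are hypothetical helpers, and the sketch assumes a single 64-bit mask word with start < len < 64. It clips the inverted mask with last_msk and then does a naive search for n consecutive free bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: free-entry mask with the proposed cur_msk &= last_msk. */
static uint64_t free_mask_proposed(uint64_t used, unsigned int start,
                                   unsigned int len)
{
    assert(start < len && len < 64); /* sketch covers one partial word only */

    uint64_t ignore_msk = ~((1ULL << start) - 1); /* drop bits below start */
    uint64_t last_msk = ~(UINT64_MAX << len);     /* keep bits below len */
    uint64_t cur_msk = ~used;                     /* looking for free entries */

    cur_msk &= last_msk;   /* proposed change: clip to the array length */
    cur_msk &= ignore_msk;
    return cur_msk;
}

/* Naive search: index of the first run of n set bits in msk, or -1 if none. */
static int first_run(uint64_t msk, unsigned int n)
{
    unsigned int run = 0;

    for (unsigned int i = 0; i < 64; i++) {
        if (msk & (1ULL << i)) {
            if (++run == n)
                return (int)(i - n + 1);
        } else {
            run = 0;
        }
    }
    return -1;
}
```

For used = 0x11111111, start = 0, len = 32 and n = 6, first_run(free_mask_proposed(...), 6) returns -1, which is the expected "no room in this array" answer. Running first_run on the unclipped mask 0xFFFFFFFFEEEEEEEE instead returns 29, a run covering entries 29..34 that extends past the 32-entry array.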
Best Regards,
Oleksandr