Hello Bruce,

Thank you for your mail! I have attached the complete log that we observed.

On your suggestion, we experimented with the --in-memory option and it
worked! Since this option enables both --no-shconf and --huge-unlink, we
also tried each option separately. The test ran fine with only the
--no-shconf option, but it failed when only the --huge-unlink option was
added.

Now that the --no-shconf option also works for us, we would like to
understand whether it will impact multi-queue performance, and whether
there are any other costs associated with using this flag.

Best Regards,
Avijit Pandey
Cloud SME | VoerEirAB
+919598570190

From: Bruce Richardson
Date: Wednesday, 27 March 2024 at 20:25
To: Avijit Pandey
Cc: dev@dpdk.org
Subject: Re: Error in rte_eal_init() when multiple PODs over single node of K8 cluster

On Wed, Mar 27, 2024 at 12:42:55PM +0000, Avijit Pandey wrote:
> Hello Devs,
>
> I hope this email finds you well.
>
> I am reaching out to seek assistance regarding an issue I am facing in
> DPDK within my Kubernetes cluster.
>
> I have deployed a Kubernetes cluster v1.26.0, and I am currently
> running network testing through DPPD-PROX ([1]commit/02425932) using
> DPDK (v22.11.0). I have deployed 3 pairs of PODs (3 server pods and 3
> client pods) on a single K8s node. The server generates and sends
> traffic to the receiver pod.
>
> During the automated testing, I encounter an error: "Error in
> rte_eal_init()." This error occurs randomly, and I am unable to
> determine the root cause. However, this issue does not occur when I use
> a single pair of PODs (1 server pod and 1 client pod). The traffic is
> sent and received through the SR-IOV NICs.
>
> With master core index 23, full core mask is 0x2800000
>
> EAL command line: /opt/samplevnf/VNFs/DPPD-PROX/build/prox
> -c0x2800000 --main-lcore=23 -n4 --allow 0000:86:04.6
>
> error Error in rte_eal_init()
>

Not sure what the problem is exactly, without a better error message. Can
you provide the EAL output in the failure case, perhaps using the
--log-level flag to raise the log level if the error is not clear from
the default output?

Also, when running multiple instances of DPDK on a single system, I'd
generally recommend passing the --in-memory flag to each instance to
avoid conflicts over hugepage files. (This will disable support for DPDK
multi-process operation, so don't use the flag if that is a feature you
are using.)

/Bruce

PS: a couple of other comments on your command line that may be of
interest, since it's a little longer than it needs to be :-)

- We'd generally recommend, for clarity, using the "-l" (core list) flag
  rather than "-c" for specifying the cores to use. In your case
  "-c 0x2800000" should be equivalent to the more comprehensible
  "-l 23,25".
- DPDK always uses the lowest core number as the main lcore, so in the
  example above --main-lcore=23 should be superfluous and can be omitted.
- For mempool creation, -n 4 is the default in DPDK if unspecified, so
  again that flag can be dropped without impact, unless something
  specific in the app depends on it in some other way.
- If you want to shorten your allow list a little, the "0000:" can be
  dropped from the PCI address. So "--allow 0000:86:04.6" can be
  "-a 86:04.6".
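
Putting those suggestions together (untested, and assuming nothing in the
app itself depends on the -n setting), the invocation above could
presumably be shortened to something like:

  /opt/samplevnf/VNFs/DPPD-PROX/build/prox -l 23,25 -a 86:04.6

with --in-memory (or --no-shconf) added as discussed above when several
instances share the same node.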