Hi Stephen and Michal

Thanks a lot for all the discussions and the progress made on this. I appreciate it.
Sorry for the late reply. To answer your questions:

1. Is the application you're using single-process or multi-process?
If multi-process, from which process are you probing for the xstats?
>> The system has both primary and secondary processes running, but the stats are being fetched from the primary process only. I'm not sure whether the presence of secondary processes can cause the crash even when we fetch stats only from the primary process. Can we confirm this from the code?

2. Have you tried running latest DPDK v20.11 LTS?
>> It's DPDK v20.11.1. Did not try with the latest 20.11 LTS.

3. What kernel module are you using (igb_uio/vfio-pci)?
>> It's igb_uio.

4. On what AWS instance type was it reproduced?
>> It's c5n.2xlarge (8 cores; 1 primary process and 6 secondary processes).

5. Is the Seg Fault happening the first time you call for the xstats?
>> Yes. That's correct.
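
For reference, here is a minimal sketch of an application-side guard, assuming (as noted in the quoted reply below) that the ENA admin queue is only safe to use from the primary process. It is an illustration, not the driver's or our application's actual code. The names `proc_type`, `xstats_reader_t`, `guarded_xstats_read` and `counting_reader` are hypothetical; in a real DPDK application the process type would come from rte_eal_process_type() and the reader would wrap a call such as rte_eth_xstats_get().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in for DPDK's process type; a real application would call
 * rte_eal_process_type() and compare against RTE_PROC_PRIMARY. */
enum proc_type { PROC_PRIMARY, PROC_SECONDARY };

/* Hypothetical callback type: wraps whatever actually reads the xstats
 * (for example, a call to rte_eth_xstats_get()). */
typedef int (*xstats_reader_t)(unsigned int port_id, void *ctx);

/* Run the reader only in the primary process; otherwise return -ENOTSUP
 * without touching the device. */
static int
guarded_xstats_read(enum proc_type type, unsigned int port_id,
		    xstats_reader_t reader, void *ctx)
{
	assert(reader != NULL);

	if (type != PROC_PRIMARY) {
		fprintf(stderr,
			"port %u: xstats skipped in secondary process\n",
			port_id);
		return -ENOTSUP;
	}

	return reader(port_id, ctx);
}

/* Demonstration reader: only counts how many times it was invoked. */
static int
counting_reader(unsigned int port_id, void *ctx)
{
	(void)port_id;
	(*(int *)ctx)++;
	return 0;
}
```

With this shape, a secondary process gets a clean error code instead of reaching the driver, and the primary process path is unchanged.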

Regards
Amiya



On Wed, Apr 20, 2022 at 4:39 AM Stephen Hemminger <stephen@networkplumber.org> wrote:
On Tue, 19 Apr 2022 22:27:32 +0200
Michał Krawczyk <mk@semihalf.com> wrote:

> Thanks Stephen, indeed the issue reproduces in the secondary process.
>
> Basically ENA v2.2.1 is not MP aware, meaning it cannot be used safely
> from the secondary process. The main obstacle is the admin queue which
> is used for processing the hardware requests which can be used safely
> only from the primary process. It's not strictly a bug, as we weren't
> exposing 'MP Awareness' in the PMD features list, it's more like a
> lack of proper MP support.
>
> The latest ENA PMD release should be MP safe. We currently don't have
> a PMD backport ready for the older LTS release (but we're planning to do
> so for ENA v2.6.0 on the amzn-drivers repository:
> https://github.com/amzn/amzn-drivers/tree/master/userspace/dpdk ).

I wish that ENA did not have its own versioning scheme.
Driver versions are meaningful only to the driver writer/vendor, they
don't help the end user.

Since backporting is not part of the stable process, I suggest doing what
XDP did for 21.11 and earlier releases.

diff --git a/drivers/net/ena/ena_ethdev.c b/drivers/net/ena/ena_ethdev.c
index 634c97acf60d..3778349f3fe9 100644
--- a/drivers/net/ena/ena_ethdev.c
+++ b/drivers/net/ena/ena_ethdev.c
@@ -3212,6 +3212,12 @@ static int ena_rx_queue_intr_disable(struct rte_eth_dev *dev,
 static int eth_ena_pci_probe(struct rte_pci_driver *pci_drv __rte_unused,
        struct rte_pci_device *pci_dev)
 {
+       if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+               PMD_INIT_LOG(ERR,
+                           "Ena PMD does not support secondary processes\n");
+               return -ENOTSUP;
+       }
+
        return rte_eth_dev_pci_generic_probe(pci_dev,
                sizeof(struct ena_adapter), eth_ena_dev_init);
 }