From: "Burakov, Anatoly"
To: "dev@dpdk.org"
Cc: Thomas Monjalon, "Richardson, Bruce"
Date: Fri, 26 Jul 2019 10:05:02 +0100
Message-ID: <29cf5458-8459-0187-13b1-44277283fc93@intel.com>
Subject: [dpdk-dev] Should we disallow running secondaries after primary has died?

Hi all,

While investigating this bug:

https://bugs.dpdk.org/show_bug.cgi?id=284

I came to the realization that, when the primary process dies, very little actually works.
There are some documented issues that already apply when secondary processes keep running after the primary has died, such as the memory map becoming static and hotplug no longer working. What is less well known (and less well documented) is that VFIO also stops working completely for newly initializing processes. At some point since the 18.xx releases, we fixed a long-standing VFIO-related bug that had to do with creating a new container every time a secondary was run; secondary processes now reuse the primary process's container instead.

This means that, for VFIO devices, secondary process *initialization* will fail after the primary process has died, because there is no longer a process from which to get the VFIO container. Things will still sort of work with igb_uio or in vfio-noiommu mode, but again: no memory map updates, no hotplug, and potentially other things that I don't even know about.

Therefore, while ideally we would like people to keep the primary process running at all times, the least we can do to avoid documenting a complex matrix of "what is supported in which case" is to disallow secondary process initialization after the primary process has died. ("Disallow" as in "explicitly document it as unsupported", although we can also outright prevent it if we want; rte_eal_primary_proc_alive() will tell us that.)

Thoughts?

-- 
Thanks,
Anatoly
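
As a rough illustration of the "outright prevent" option, the check could be as small as a gate around the liveness probe. This is only a sketch under stated assumptions: `secondary_init_allowed()` and the two stub probes are hypothetical names made up for this example, not existing EAL functions, and the probe is passed as a function pointer purely so the snippet builds without DPDK headers. The pointer type is meant to match `rte_eal_primary_proc_alive()`, which takes a config file path (NULL for the default) and returns nonzero when the primary is alive.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Same shape as rte_eal_primary_proc_alive() from <rte_eal.h>: takes a
 * config file path (NULL selects the default) and returns nonzero if the
 * primary process is alive. A function pointer is used here only so the
 * sketch builds without DPDK.
 */
typedef int (*primary_alive_fn)(const char *config_file_path);

/*
 * Hypothetical helper, not an existing EAL function: returns true if a
 * secondary process should be allowed to continue initializing.
 */
static bool
secondary_init_allowed(primary_alive_fn probe, const char *config_file_path)
{
	assert(probe != NULL);

	if (!probe(config_file_path)) {
		fprintf(stderr,
			"primary process is not running, "
			"refusing to initialize secondary\n");
		return false;
	}
	return true;
}

/* Stand-in probes for illustration only (assumptions, not DPDK code). */
static int stub_primary_alive(const char *path) { (void)path; return 1; }
static int stub_primary_dead(const char *path) { (void)path; return 0; }
```

In a real application the call would presumably be `secondary_init_allowed(rte_eal_primary_proc_alive, NULL)` early in secondary startup. Where exactly such a check would belong inside EAL is left open here.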