From: Maxime Coquelin
To: santosh, Jerin Jacob
Cc: thomas@monjalon.net, bruce.richardson@intel.com, dev@dpdk.org, hemant.agrawal@nxp.com, shreyansh.jain@nxp.com, gaetan.rivet@6wind.com
References: <20170608110513.22548-1-santosh.shukla@caviumnetworks.com> <20170608110513.22548-8-santosh.shukla@caviumnetworks.com> <730e333b-a9ab-df8b-cf7a-1e0186c6152d@redhat.com> <20170705154314.GA4635@jerin> <2fe366fb-15fa-f754-458e-3f4e8be18699@redhat.com> <20170706094939.GA1709@jerin> <89425d75-3f79-d3e8-f0b1-330292866bbb@redhat.com>
Date: Thu, 6 Jul 2017 15:08:09 +0200
Subject:
Re: [dpdk-dev] [PATCH 07/10] linuxapp/eal_vfio: honor iova mode before mapping

On 07/06/2017 01:19 PM, santosh wrote:
> On Thursday 06 July 2017 04:29 PM, Maxime Coquelin wrote:
>
>>
>> On 07/06/2017 11:49 AM, Jerin Jacob wrote:
>>> -----Original Message-----
>>>> Date: Thu, 6 Jul 2017 09:58:41 +0200
>>>> From: Maxime Coquelin
>>>> To: Jerin Jacob
>>>> CC: Santosh Shukla ,
>>>> thomas@monjalon.net, bruce.richardson@intel.com, dev@dpdk.org,
>>>> hemant.agrawal@nxp.com, shreyansh.jain@nxp.com, gaetan.rivet@6wind.com
>>>> Subject: Re: [dpdk-dev] [PATCH 07/10] linuxapp/eal_vfio: honor iova mode
>>>> before mapping
>>>> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
>>>> Thunderbird/52.1.0
>>>>
>>>>
>>>>
>>>> On 07/05/2017 05:43 PM, Jerin Jacob wrote:
>>>>> -----Original Message-----
>>>>>> Date: Wed, 5 Jul 2017 11:14:01 +0200
>>>>>> From: Maxime Coquelin
>>>>>> To: Santosh Shukla ,
>>>>>> thomas@monjalon.net, bruce.richardson@intel.com, dev@dpdk.org
>>>>>> CC: jerin.jacob@caviumnetworks.com, hemant.agrawal@nxp.com,
>>>>>> shreyansh.jain@nxp.com, gaetan.rivet@6wind.com
>>>>>> Subject: Re: [dpdk-dev] [PATCH 07/10] linuxapp/eal_vfio: honor iova mode
>>>>>> before mapping
>>>>>> User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
>>>>>> Thunderbird/52.1.0
>>>>>>
>>>>>>
>>>>>>
>>>>>> On 06/08/2017 01:05 PM, Santosh Shukla wrote:
>>>>>>> Check iova mode and accordingly map iova to pa or va.
>>>>>>>
>>>>>>> Signed-off-by: Santosh Shukla
>>>>>>> Signed-off-by: Jerin Jacob
>>>>>>> ---
>>>>>>> lib/librte_eal/linuxapp/eal/eal_vfio.c | 10 ++++++++--
>>>>>>> 1 file changed, 8 insertions(+), 2 deletions(-)
>>>>>>>
>>>>>>> diff --git a/lib/librte_eal/linuxapp/eal/eal_vfio.c b/lib/librte_eal/linuxapp/eal/eal_vfio.c
>>>>>>> index 04914406f..348b7a7f4 100644
>>>>>>> --- a/lib/librte_eal/linuxapp/eal/eal_vfio.c
>>>>>>> +++ b/lib/librte_eal/linuxapp/eal/eal_vfio.c
>>>>>>> @@ -706,7 +706,10 @@ vfio_type1_dma_map(int vfio_container_fd)
>>>>>>>          dma_map.argsz = sizeof(struct vfio_iommu_type1_dma_map);
>>>>>>>          dma_map.vaddr = ms[i].addr_64;
>>>>>>>          dma_map.size = ms[i].len;
>>>>>>> -        dma_map.iova = ms[i].phys_addr;
>>>>>>> +        if (rte_eal_iova_mode() == RTE_IOVA_VA)
>>>>>>> +            dma_map.iova = dma_map.vaddr;
>>>>>>> +        else
>>>>>>> +            dma_map.iova = ms[i].phys_addr;
>>>>>>>          dma_map.flags = VFIO_DMA_MAP_FLAG_READ | VFIO_DMA_MAP_FLAG_WRITE;
>>>>>>
>>>>>> IIUC, it is changing default behavior for VFIO devices.
>>>>>>
>>>>>> I see a possible problem, but I'm not sure the case is valid.
>>>>>>
>>>>>> Imagine you have two devices in the iommu group, and the two devices are
>>>>>> used in separate processes. Each process could try two different
>>>>>> physical addresses at the same virtual address, and so the second map
>>>>>> would fail.
>>>>>
>>>>> IMO, Doesn't look like a problem. Here is the data flow
>>>>>
>>>>> 1) The vfio DMA map function(vfio_type1_dma_map()) will be called only
>>>>> on primary process
>>>>> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_vfio.c#n359
>>>>>
>>>>> 2) On secondary process, DPDK rte_eal_huge_page_attach() will make sure
>>>>> that, the Secondary process has the _same_ virtual address as primary or
>>>>> exit from on attach.
>>>>> http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal_memory.c#n1452
>>>>>
>>>>> 3) Since secondary process adds the mapped the virtual address in step (2).
>>>>> in the page table in OS. On SMMU entry miss(When device
>>>>> request from I/O transaction), OS will load the mapping and update the SMMU
>>>>> "context" with page tables from MMU.
>>>>
>>>> Ok thanks for the detailed info, but what about the case where the same
>>>> iommu group is used by two primary processes?
>>>
>>> Does that case exist with DPDK? We always need to blacklist same BDF in
>>> the secondary process to make things work with existing DPDK setup. Which
>>> make sense as well. Only primary process configures the HW blocks.
>>
>> I meant the case when two BDF are in the same IOMMU group (if ACS is not
>> supported at some point in the hierarchy). And I meant two primary
>> processes running, like for example two containers running each a DPDK
>> application.
>>
>> Maybe this is not a valid use-case (it is not secure, as it would break
>> isolation between the two containers), but it seems that it is something
>> DPDK allows today, if I'm not mistaken.
>>
> I'm not sure how two primary process could run, as because latter primary process
> would try accessing /var/run/.rte_config and would fail at this [1] point.
>
> It's not valid use-case for dpdk (imo).
> [1] http://dpdk.org/browse/dpdk/tree/lib/librte_eal/linuxapp/eal/eal.c#n204

Yes, this is possible. I had never used it before, but Thomas told me it
is supported by setting the --file-prefix option.

I gave it a try, and I confirm it works:

session 1> ./install/bin/testpmd -l 0,2 --socket-mem=1024 -w 0000:05:00.0
--proc-type=primary --file-prefix=app1 -- --disable-hw-vlan -i --rxq=1
--txq=1 --nb-cores=1 --forward-mode=io

session 2> ./install/bin/testpmd -l 0,3 --socket-mem=1024 -w 0000:05:00.1
--proc-type=primary --file-prefix=app2 -- --disable-hw-vlan -i --rxq=1
--txq=1 --nb-cores=1 --forward-mode=io

In the above example, two ports of the same card are used by two
processes. Note that in this case, ACS is supported and both ports have
their own iommu group.

Maxime
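
For illustration only, the concern discussed in this thread (two mappings landing on the same IOVA within one IOMMU domain once IOVA equals the virtual address) can be sketched as a toy model. This is not DPDK or kernel code: `enum iova_mode`, `struct seg`, `pick_iova()`, `struct domain` and `domain_map()` are hypothetical names, and the model simply assumes that a domain rejects a mapping whose IOVA range overlaps an existing one. Whether two primary processes can actually share one IOMMU domain is the open question of the thread and is not settled by this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the EAL IOVA modes named in the patch. */
enum iova_mode { IOVA_PA, IOVA_VA };

/* Hypothetical memory segment: virtual address, physical address, length. */
struct seg {
    uint64_t vaddr;
    uint64_t phys_addr;
    uint64_t len;
};

/* Mirrors the if/else added by the patch: IOVA is the VA or the PA. */
static uint64_t pick_iova(enum iova_mode mode, const struct seg *s)
{
    return mode == IOVA_VA ? s->vaddr : s->phys_addr;
}

/* Toy model of one IOMMU domain: a small table of mapped IOVA ranges. */
#define MAX_MAPS 8
struct domain {
    struct { uint64_t iova, len; } maps[MAX_MAPS];
    size_t n;
};

/*
 * Assumption of this sketch: a mapping that overlaps an existing IOVA
 * range is rejected. Returns 0 on success, -1 on overlap or full table.
 */
static int domain_map(struct domain *d, uint64_t iova, uint64_t len)
{
    for (size_t i = 0; i < d->n; i++) {
        uint64_t start = d->maps[i].iova;
        uint64_t end = start + d->maps[i].len;

        if (iova < end && start < iova + len)
            return -1;
    }
    if (d->n == MAX_MAPS)
        return -1;
    d->maps[d->n].iova = iova;
    d->maps[d->n].len = len;
    d->n++;
    return 0;
}
```

With two segments that share a virtual address but have different physical addresses, mapping both into one `struct domain` succeeds in `IOVA_PA` mode, while in `IOVA_VA` mode the second `domain_map()` call returns -1.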