DPDK patches and discussions
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
To: Stephen Hemminger <stephen@networkplumber.org>,
	Ferruh Yigit <ferruh.yigit@intel.com>
Cc: dpdk-dev <dev@dpdk.org>, Thomas Monjalon <thomas@monjalon.net>
Subject: Re: [dpdk-dev] [RFC] pci: force address of mappings in secondary process
Date: Mon, 28 Jan 2019 09:59:56 +0000
Message-ID: <a537ff45-bf21-9cbc-9735-39e0d7440a7b@intel.com>
In-Reply-To: <20190124093734.0d3312c2@shemminger-XPS-13-9360>

On 23-Jan-19 8:37 PM, Stephen Hemminger wrote:
> On Wed, 23 Jan 2019 19:21:03 +0000
> Ferruh Yigit <ferruh.yigit@intel.com> wrote:
> 
>> On 7/12/2017 9:58 AM, jianfeng.tan at intel.com (Tan, Jianfeng) wrote:
>>>> -----Original Message-----
>>>> From: Gonzalez Monroy, Sergio
>>>> Sent: Wednesday, July 12, 2017 3:32 PM
>>>> To: Tan, Jianfeng; Stephen Hemminger; dev at dpdk.org
>>>> Subject: Re: [dpdk-dev] [RFC] pci: force address of mappings in secondary
>>>> process
>>>>
>>>> On 12/07/2017 03:45, Tan, Jianfeng wrote:
>>>>>   
>>>>>> -----Original Message-----
>>>>>> From: Gonzalez Monroy, Sergio
>>>>>> Sent: Tuesday, July 11, 2017 7:36 PM
>>>>>> To: Tan, Jianfeng; Stephen Hemminger; dev at dpdk.org
>>>>>> Subject: Re: [dpdk-dev] [RFC] pci: force address of mappings in secondary
>>>>>> process
>>>>>>
>>>>>> On 11/07/2017 02:56, Tan, Jianfeng wrote:
>>>>>>>> -----Original Message-----
>>>>>>>> From: dev [mailto:dev-bounces at dpdk.org] On Behalf Of Stephen
>>>>>>>> Hemminger
>>>>>>>> Sent: Tuesday, July 11, 2017 9:13 AM
>>>>>>>> To: dev at dpdk.org
>>>>>>>> Cc: Stephen Hemminger
>>>>>>>> Subject: [dpdk-dev] [RFC] pci: force address of mappings in secondary
>>>>>>>> process
>>>>>>>>
>>>>>>>> The PCI memory resources in the secondary process should be in
>>>>>>>> the exact same location as the primary process. Otherwise
>>>>>>>> there is a risk of a stray pointer.
>>>>>>>>
>>>>>>>> Not sure if this is right, but it looks like a potential
>>>>>>>> problem.
>>>>>>>>
>>>>>>>> ---
>>>>>>>>     lib/librte_eal/common/eal_common_pci_uio.c | 2 +-
>>>>>>>>     1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_eal/common/eal_common_pci_uio.c b/lib/librte_eal/common/eal_common_pci_uio.c
>>>>>>>> index 367a6816dcb8..2156b1a436c4 100644
>>>>>>>> --- a/lib/librte_eal/common/eal_common_pci_uio.c
>>>>>>>> +++ b/lib/librte_eal/common/eal_common_pci_uio.c
>>>>>>>> @@ -77,7 +77,7 @@ pci_uio_map_secondary(struct rte_pci_device *dev)
>>>>>>>>
>>>>>>>>     			void *mapaddr = pci_map_resource(uio_res->maps[i].addr,
>>>>>>>>     					fd, (off_t)uio_res->maps[i].offset,
>>>>>>>> -					(size_t)uio_res->maps[i].size, 0);
>>>>>>>> +					(size_t)uio_res->maps[i].size, MAP_FIXED);
>>>>>>>>     			/* fd is not needed in slave process, close it */
>>>>>>>>     			close(fd);
>>>>>>>>     			if (mapaddr != uio_res->maps[i].addr) {
>>>>>>>> --
>>>>>>>> 2.11.0
>>>>>>> +1 for this RFC. I once encountered the same problem and solved it the
>>>>>>> same way. The addr parameter of the mmap() syscall is only a hint, not
>>>>>>> a requirement, even when the VMA is not yet occupied.
>>>>>>> Thanks,
>>>>>>> Jianfeng
>>>>>> How do you know the VMA is not occupied?
>>>>> I checked /proc/self/maps.
>>>>>   
>>>>>> I think the risk here is that the dynamic linker loaded some shared
>>>>>> library in that VMA, and forcing MAP_FIXED is not a safe solution.
>>>>>> What I have observed is that Linux will return a different VMA than the
>>>>>> one hinted when there is already a mapping in the requested/hinted VMA.
>>>>> IMO, that's not the target of this RFC. The target is to solve the
>>>>> situation (in the current primary/secondary model) where the kernel does
>>>>> not use the addr even though there is no VMA at that addr. This is my
>>>>> understanding; Stephen, please correct me if I'm wrong.
>>>>
>>>> The point I was trying to make is, that you do not know if there is a
>>>> mapping or not in that address, and by using MAP_FIXED it will unmap
>>>> whatever was in there before.
>>>
>>> Oh, I missed that if there is a conflict, the existing VMA will be unmapped. That's a bad side effect.
>>>    
>>>>
>>>> So unless you parse /proc/self/maps and check that the VMA range is not
>>>> being used, forcing MAP_FIXED is not safe.
>>>>   
>>>>>> I reckon this is a similar issue as we have with the multi-process model
>>>>>> when we do not get the VMA requested for the huge-pages.
>>>>>> AFAIK we do not have a robust solution for this issue other than
>>>>>> restarting the program and hoping the dynamic linker does not map
>>>>>> anything in the VMA ranges that we need to map from the primary. This
>>>>>> also assumes that the application does not allocate memory or map
>>>>>> things before calling eal_init, as it could potentially use VMA ranges
>>>>>> that we need in the secondary process.
>>>>> This is another problem.
>>>>
>>>> It is the same problem, VMA ranges that we need to map being already used.
>>>
>>> Still two problems from my side:
>>> (1) A VMA already exists on that addr/len range; conflict happens.
>>> (2) The kernel does not give DPDK the requested VMA even though no VMA exists in that range; there is no conflict.
>>
>> Hi Stephen,
>>
>> Sergio and Jianfeng, who were carrying this discussion, are no longer
>> active in DPDK, so I have cc'ed Anatoly.
>>
>> Is this RFC still valid?
>> Should we expect an update on it?
>>
>> Thanks,
>> ferruh
>>
>>>
>>> Thanks,
>>> Jianfeng
>>>    
>>>>
>>>> Thanks,
>>>> Sergio
>>>>   
>>>>>> The proposal for new secondary process model would solve these issues:
>>>>>> http://dpdk.org/ml/archives/dev/2017-May/066147.html
>>>>> And yes, this might happen to solve the issue targeted by this RFC. But
>>>>> until the new model is out, this patch seems a workable fix for the
>>>>> original issue.
>>>>>
>>>>> Thanks,
>>>>> Jianfeng
>>>>>   
>>>>>> Thanks,
>>>>>> Sergio
>>>>   
>>>
>>>    
>>
> 
> MAP_FIXED is the wrong solution. If the secondary passes the address
> it wants and gets something else, that means the requested range overlaps
> an existing mapping.
> The current code returns an error which is the best response.
> 

...which is why this: 
http://patches.dpdk.org/bundle/aburakov/reliable_device_map/

is a better approach :)

-- 
Thanks,
Anatoly
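
To make the two positions in this thread concrete, here is a minimal sketch
in C. It is not DPDK code: range_is_mapped() and map_at_exact() are
hypothetical helper names, and an anonymous private mapping stands in for the
UIO resource mapping so that the example is self-contained. The first helper
does the /proc/self/maps check mentioned above; the second passes the address
as a hint without MAP_FIXED, verifies the result, and fails instead of
silently replacing whatever was already mapped there.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <stdint.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper (not a DPDK function): returns 1 if any mapping
 * listed in /proc/self/maps overlaps [addr, addr + len), 0 if the range
 * looks free, and -1 if the file cannot be opened. Linux only.
 */
static int range_is_mapped(const void *addr, size_t len)
{
	uintptr_t want_start = (uintptr_t)addr;
	uintptr_t want_end = want_start + len;
	unsigned long start, end;
	char line[512];
	int found = 0;
	FILE *f = fopen("/proc/self/maps", "r");

	if (f == NULL)
		return -1;
	while (fgets(line, sizeof(line), f) != NULL) {
		/* Each entry starts with "start-end" in hexadecimal. */
		if (sscanf(line, "%lx-%lx", &start, &end) != 2)
			continue;
		if (start < want_end && want_start < end) {
			found = 1;
			break;
		}
	}
	fclose(f);
	return found;
}

/*
 * Hypothetical helper: request a mapping at `hint` WITHOUT MAP_FIXED.
 * The kernel treats the address as advisory, so the result is checked;
 * if the mapping landed elsewhere it is undone and NULL is returned.
 * Existing mappings are never replaced.
 */
static void *map_at_exact(void *hint, size_t len)
{
	void *got = mmap(hint, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (got == MAP_FAILED)
		return NULL;
	if (got != hint) {
		munmap(got, len);
		return NULL;
	}
	return got;
}
```

A caller in a secondary process would treat a NULL return from map_at_exact()
as a hard error, which matches the behaviour of the existing check
(mapaddr != uio_res->maps[i].addr) in the quoted diff. Note that the
/proc/self/maps check is only a snapshot: another thread can map the range
between the check and the mmap() call, so it does not by itself make
MAP_FIXED safe.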

Thread overview: 11+ messages
2017-07-11  1:12 Stephen Hemminger
2017-07-11  1:56 ` Tan, Jianfeng
2017-07-11 11:35   ` Sergio Gonzalez Monroy
2017-07-11 20:00     ` Stephen Hemminger
2017-07-12  7:24       ` Sergio Gonzalez Monroy
2017-07-12  2:45     ` Tan, Jianfeng
2017-07-12  7:31       ` Sergio Gonzalez Monroy
2017-07-12  8:58         ` Tan, Jianfeng
2019-01-23 19:21           ` Ferruh Yigit
2019-01-23 20:37             ` Stephen Hemminger
2019-01-28  9:59               ` Burakov, Anatoly [this message]
