Hello,

As of DPDK 2.1, backing files are created in hugetlbfs during mapping (in eal_memory.c::rte_eal_hugepage_init()), and these files are not cleaned up (unlinked) after initialization (mmap-ing). This means that when the application crashes or is stopped, the memory is still consumed. Therefore, is there any reason not to unlink the backing files after initialization? If not, I will send a patch for the change.

--
- Thanks
char * (*shesha) (uint64_t cache, uint8_t F00D)
{ return 0x0000C0DE; }
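A minimal sketch of the pattern being proposed, assuming a POSIX system: map a backing file, then unlink it while the mapping stays usable. To keep it runnable anywhere it uses a regular file rather than hugetlbfs. `map_and_unlink` and the path passed to it are hypothetical names for illustration, not DPDK code, and a real hugetlbfs mapping would need a length that is a multiple of the huge page size.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not part of DPDK): create a backing file at `path`,
 * map `len` bytes of it shared, then unlink the file name.
 * Returns the mapping, or NULL on failure.
 */
static void *map_and_unlink(const char *path, size_t len)
{
    int fd = open(path, O_CREAT | O_RDWR, 0600);
    if (fd < 0)
        return NULL;

    /* Give the regular file a size so every mapped byte is backed. */
    if (ftruncate(fd, (off_t)len) != 0) {
        close(fd);
        unlink(path);
        return NULL;
    }

    void *va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping keeps its own reference to the file */
    if (va == MAP_FAILED) {
        unlink(path);
        return NULL;
    }

    /* Remove the name only; the mapped pages remain valid. */
    unlink(path);
    return va;
}
```

Once the name is gone, the pages are released when the last mapping and descriptor go away, which includes the case where the process crashes. The cost is that no other process can open the file by path afterwards.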
Additional info:

Before starting the application:
-------------------------------------
cat /sys/devices/system/node/node*/meminfo | grep HugePages_
Node 0 HugePages_Total: 2048
Node 0 HugePages_Free: 2048
Node 0 HugePages_Surp: 0
Node 1 HugePages_Total: 2048
Node 1 HugePages_Free: 2048
Node 1 HugePages_Surp: 0

While the application is running:
-------------------------------------
cat /sys/devices/system/node/node*/meminfo | grep HugePages_
Node 0 HugePages_Total: 2048
Node 0 HugePages_Free: 1536
Node 0 HugePages_Surp: 0
Node 1 HugePages_Total: 2048
Node 1 HugePages_Free: 1536
Node 1 HugePages_Surp: 0

After the application is stopped:
-------------------------------------
cat /sys/devices/system/node/node*/meminfo | grep HugePages_
Node 0 HugePages_Total: 2048
Node 0 HugePages_Free: 1536
Node 0 HugePages_Surp: 0
Node 1 HugePages_Total: 2048
Node 1 HugePages_Free: 1536
Node 1 HugePages_Surp: 0

With UNLINKING in eal_memory.c::rte_eal_hugepage_init() and after the application is stopped:
------------------------------------------------------------
cat /sys/devices/system/node/node*/meminfo | grep HugePages_
Node 0 HugePages_Total: 2048
Node 0 HugePages_Free: 2048
Node 0 HugePages_Surp: 0
Node 1 HugePages_Total: 2048
Node 1 HugePages_Free: 2048
Node 1 HugePages_Surp: 0

--
- Thanks
char * (*shesha) (uint64_t cache, uint8_t F00D)
{ return 0x0000C0DE; }

From: dev <dev-bounces@dpdk.org> on behalf of Cisco Employee <shesha@cisco.com>
Date: Monday, September 28, 2015 at 5:04 PM
To: "dev@dpdk.org" <dev@dpdk.org>
Subject: [dpdk-dev] Unlinking hugepage backing file after initialiation

> Hello,
> As of DPDK 2.1, backing files are created in hugetlbfs during mapping (in eal_memory.c::rte_eal_hugepage_init()) and these files are not cleaned up (unlinked) after initialization (mmap-ing). This means, when the application crashes or is stopped, the memory is still consumed.
> Therefore, is there any reason not to unlink backing files after initialization? If no, I will send a patch for the change.
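The per-node counters quoted in the additional info can be compared programmatically. The sketch below assumes the "Node <n> <key>: <value>" line format shown in that output and sums one counter across nodes. `sum_node_counter` is a hypothetical helper, and the text is passed in as a string so the example does not depend on the sysfs files being present.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: sum the values of every "Node <n> <key>: <value>"
 * line in `text`, e.g. key = "HugePages_Free".
 * Returns -1 if no line carried that key.
 */
static long sum_node_counter(const char *text, const char *key)
{
    long total = 0;
    int matched = 0;
    const char *line = text;

    while (line != NULL && *line != '\0') {
        int node;
        char name[64];
        long value;

        /* Line format assumed from the quoted output. */
        if (sscanf(line, "Node %d %63[^:]: %ld", &node, name, &value) == 3 &&
            strcmp(name, key) == 0) {
            total += value;
            matched = 1;
        }

        /* Advance to the start of the next line, if any. */
        line = strchr(line, '\n');
        if (line != NULL)
            line++;
    }
    return matched ? total : -1;
}
```

With the numbers quoted for the stopped application, the total is 4096 pages and the free count is 3072, so 1024 huge pages are still held by the leftover backing files.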
On 9/29/2015 8:04 AM, shesha Sreenivasamurthy (shesha) wrote:
> Hello,
> As of DPDK 2.1, backing files are created in hugetlbfs during mapping (in eal_memory.c::rte_eal_hugepage_init()) and these files are not cleaned up (unlinked) after initialization (mmap-ing). This means, when the application crashes or is stopped, the memory is still consumed. Therefore, is there any reason not to unlink backing files after initialization? If no, I will send a patch for the change.

shesha:
You remind me of the virtio unexpected-crash issue. DPDK runs in user space, so it is quite possible for it to die unexpectedly, either by crashing or by being killed.
When the DPDK virtio app crashes, it doesn't have a chance to notify the host, so the host is still using its memory, which is backed by guest huge pages.
If the huge page files are still reserved in hugetlbfs, we have a chance to recover virtio first and then unlink the huge pages.
Otherwise, if the huge pages are allocated by another process, that process's memory could be corrupted by the host.

Certainly it was not implemented like that for this purpose, but I think it is a temporary solution for this user space virtio driver issue.

/huawei
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of shesha Sreenivasamurthy (shesha)
> Sent: Tuesday, September 29, 2015 1:04 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] Unlinking hugepage backing file after initialiation
>
> Hello,
> As of DPDK 2.1, backing files are created in hugetlbfs during mapping (in eal_memory.c::rte_eal_hugepage_init()) and these files are
> not cleaned up (unlinked) after initialization (mmap-ing). This means, when the application crashes or is stopped, the memory is still
> consumed. Therefore, is there any reason not to unlink backing files after initialization

For secondary process(es) to be able to open/map them too?
Konstantin

> ? If no, I will send a patch for the change.
>
> --
> - Thanks
> char * (*shesha) (uint64_t cache, uint8_t F00D)
> { return 0x0000C0DE; }
On Tue, Sep 29, 2015 at 09:03:15AM +0000, Ananyev, Konstantin wrote:
> > Hello,
> > As of DPDK 2.1, backing files are created in hugetlbfs during mapping (in eal_memory.c::rte_eal_hugepage_init()) and these files are
> > not cleaned up (unlinked) after initialization (mmap-ing). This means, when the application crashes or is stopped, the memory is still
> > consumed. Therefore, is there any reason not to unlink backing files after initialization
>
> For secondary process(es) to be able to open/map them too?
> Konstantin
>

Exactly. The hugepages are kept present on the file system so that secondary processes can use them to attach to a primary process's memory in a multi-process setup.

What is done instead is that any old hugepage files are cleaned up when the application starts (or restarts).

Regards,
/Bruce
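To illustrate why the file name matters for a second process, here is a minimal sketch of attaching to an existing backing file by path. It is not the EAL's secondary-process code, which does considerably more. `attach_by_path` is a hypothetical helper, and a regular file stands in for a hugetlbfs file.

```c
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: attach to an existing backing file by name, the way
 * a second process would have to. Returns the mapping, or NULL if the file
 * cannot be opened or mapped.
 */
static void *attach_by_path(const char *path, size_t len)
{
    int fd = open(path, O_RDWR); /* no O_CREAT: the file must already exist */
    if (fd < 0)
        return NULL;

    void *va = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);
    return va == MAP_FAILED ? NULL : va;
}
```

While the name exists, a second mapping sees the same pages as the first. After an unlink, the open fails and there is nothing to attach to by path, which is the trade-off raised in the replies above.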
What do you mean by a secondary process attaching to a primary process (a master-slave setup)? If the first process crashed, how can we be sure that the memory is not half written?

--
- Thanks
char * (*shesha) (uint64_t cache, uint8_t F00D)
{ return 0x0000C0DE; }

From: Bruce Richardson <bruce.richardson@intel.com>
Organization: Intel Shannon Ltd.
Date: Tuesday, September 29, 2015 at 4:14 AM
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: Cisco Employee <shesha@cisco.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Unlinking hugepage backing file after initialiation

> Exactly. The hugepages are kept present on the file system so that secondary processes can use them to attach to a primary process memory in a multi-process setup.
>
> What is done instead is that any old hugepage files are cleaned up when the application starts (or restarts).
>
> Regards,
> /Bruce
On 9/29/2015 10:38 AM, Xie, Huawei wrote:
> Certainly it is not implemented like that for this purpose, but i think
> it is a temporary solution for this user space virtio driver issue.

I realized it is not a virtio-specific issue; it applies to all user space drivers. And the chance of hitting it is very, very small.
Also, as Bruce and Konstantin commented, it is implemented this way for multiple processes.

/huawei
If huge pages are allocated for the guest and the guest crashes, there is a chance that the new guest may not be able to get huge pages again, because some other guest or process on the host has used them. But I am not able to understand the memory corruption you are talking about. In my opinion, if a process using a piece of memory goes away, it should not re-attach to the same piece of memory without running a sanity check on it.

--
- Thanks
char * (*shesha) (uint64_t cache, uint8_t F00D)
{ return 0x0000C0DE; }

From: "Xie, Huawei" <huawei.xie@intel.com>
Date: Tuesday, September 29, 2015 at 8:15 AM
To: Cisco Employee <shesha@cisco.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, "Michael S. Tsirkin" <mst@redhat.com>
Subject: Re: [dpdk-dev] Unlinking hugepage backing file after initialiation

> When the dpdk virtio app crashes, it doesn't have a chance to notify host, so host is still using its memory, backed by guest huge page.
> If huge page files are still reserved in hugetlbfs, we have a chance to recover virtio first, then unlink the huge pages.
> Otherwise if the huge pages are allocated by other process, its memory could be corrupted by host.
>
> I realized it is not a virtio specific issue, but apply to all user space driver. And the chance is very very small.
> Also commented by Bruce/Konstantin, it is implemented this way for multiple processes.
>
> /huawei
On Tue, Sep 29, 2015 at 03:48:08PM +0000, shesha Sreenivasamurthy (shesha) wrote:
> If huge pages are allocated for the guest and if the guest crashes there may be
> a chance that the new guest may not be able to get huge pages again as some
> other guest or process on the host used it. But I am not able to understand
> memory corruption you are talking about. In my opinion, if a process using a
> piece of memory goes away, it should not re-attach to the same piece of memory
> without running a sanity check on it.
Guest memory is allocated and freed by the hypervisor, right?
I don't think it's DPDK's job.
--
MST
Sure. Then, is there any real reason why the backing files should not be unlinked?

--
- Thanks
char * (*shesha) (uint64_t cache, uint8_t F00D)
{ return 0x0000C0DE; }

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Tuesday, September 29, 2015 at 9:16 AM
To: Cisco Employee <shesha@cisco.com>
Cc: "Xie, Huawei" <huawei.xie@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Unlinking hugepage backing file after initialiation

> guest memory is allocated and freed by hypervisor, right?
> I don't think it's dpdk's job.
>
> --
> MST
On Tue, Sep 29, 2015 at 05:50:00PM +0000, shesha Sreenivasamurthy (shesha) wrote:
> Sure. Then, is there any real reason why the backing files should not be
> unlinked ?
AFAIK qemu unlinks them already.
--
MST
What I heard is the following: a multi-process DPDK application, working in either master-worker or master-slave fashion, can potentially benefit from keeping the backing files in hugetlbfs. However, it does not work today, as the pages are cleaned and added back when the application restarts. On the other hand, for a single-process application there is actually no benefit in keeping the pages around.

Therefore, I was wondering if we can make this configurable by passing a command-line argument that will either unlink or keep the backing files.

--
- Thanks
char * (*shesha) (uint64_t cache, uint8_t F00D)
{ return 0x0000C0DE; }

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Tuesday, September 29, 2015 at 2:35 PM
To: Cisco Employee <shesha@cisco.com>
Cc: "Xie, Huawei" <huawei.xie@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] Unlinking hugepage backing file after initialiation

> AFAIK qemu unlinks them already.
>
> --
> MST
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of shesha Sreenivasamurthy (shesha)
> Sent: Wednesday, September 30, 2015 10:44 PM
> To: dev@dpdk.org
> Cc: Michael S. Tsirkin
> Subject: Re: [dpdk-dev] Unlinking hugepage backing file after initialiation
>
> What I heard is the following: A multi-process DPDK application, working either in master-worker or master-slave fashion, can
> potentially benefit by keeping the backing files in hugetlbfs. However, it is does not work today as the pages are cleaned and added
> back when the application restarts.

Who says it is not working?
I admit that the DPDK MP model is probably a bit constrained, but it does work.
It is probably good to read some docs:
http://dpdk.org/doc/guides/prog_guide/multi_proc_support.html
and/or look at the code that does MP support inside DPDK.
I think that might make things clearer.
Konstantin

> On the other hand, for a single process application there is actually no benefit keeping the pages
> around.
>
> Therefore, I was wondering if we can make this configurable by passing a command line argument that will either unlink or keep the
> backing files.
>
> --
> - Thanks
> char * (*shesha) (uint64_t cache, uint8_t F00D)
> { return 0x0000C0DE; }
My bad for saying it's not working; apologies.

Isn't it correct to say that single-process applications do not benefit from having the backing files? In that case, can we make this configurable by passing a command-line argument that will either unlink or keep the backing files, defaulting to keeping them? Single-process applications that do not need these files around could then pass an additional parameter to unlink them.

--
- Thanks
char * (*shesha) (uint64_t cache, uint8_t F00D)
{ return 0x0000C0DE; }

From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Date: Wednesday, September 30, 2015 at 2:53 PM
To: Cisco Employee <shesha@cisco.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Subject: RE: [dpdk-dev] Unlinking hugepage backing file after initialiation

> Who says it is not working?
> I admit that DPDK MP model is probably a bit constrained, but it does work.
> It is probably good to read some docs:
> http://dpdk.org/doc/guides/prog_guide/multi_proc_support.html
> and/or look at the code that does MP support inside DPDK.
> I think that might make things clearer.
> Konstantin
On Wed, Sep 30, 2015 at 10:04:36PM +0000, shesha Sreenivasamurthy (shesha) wrote:
> My bad that I said its not working, apologies.
>
> Isn't it correct to say that single process application do not benefit from having backing files ? In that case can make this configurable by passing a command line argument that will either unlink or keep the backing files, defaulting it to keeping the backing files. Single process application to do not need these files around can pass additional param to unlink these files ?
>

Sure. Or else the user can just use rm after starting the application. Or the application itself can also remove the files after starting up. There is no reason that this needs to be done by the EAL :-)

/Bruce
On 9/30/2015 5:36 AM, Michael S. Tsirkin wrote:
> On Tue, Sep 29, 2015 at 05:50:00PM +0000, shesha Sreenivasamurthy (shesha) wrote:
>> Sure. Then, is there any real reason why the backing files should not be
>> unlinked ?
> AFAIK qemu unlinks them already.
Sorry, I didn't make it clear. Let us take the physical Ethernet
controller in the host as an example:
1) DPDK app1 unlinks the huge pages after initialization.
2) DPDK app1 crashes or gets killed unexpectedly.
3) The NIC is still DMAing to the buffer memory allocated from
the huge pages.
4) Another app, app2, starts and allocates memory from hugetlbfs, and the
memory it gets happens to be that buffer memory.
The NIC then DMAs into app2's memory, which corrupts app2.
Btw, the window is very, very narrow, but we could avoid this
corruption if we don't unlink the huge pages immediately. We could
reinitialize the NIC through the binding operation and then remove the
huge pages.
I mentioned virtio the first time. In its case, the one doing the DMA
is vhost, and I am talking about the huge pages inside the guest, not
the huge pages used to back guest memory.
So we had better not unlink the huge pages unless we have another
solution to avoid the corruption.
On Mon, Oct 05, 2015 at 01:08:52PM +0000, Xie, Huawei wrote:
> On 9/30/2015 5:36 AM, Michael S. Tsirkin wrote:
> > On Tue, Sep 29, 2015 at 05:50:00PM +0000, shesha Sreenivasamurthy (shesha) wrote:
> >> Sure. Then, is there any real reason why the backing files should not be
> >> unlinked ?
> > AFAIK qemu unlinks them already.
> Sorry, i didn't make it clear. Let us take the physical Ethernet
> controller in the host for example
>
> 1) DPDK app1 unlinked huge page after initialization.
> 2) DPDK app1 crashed or got killed unexpectedly.
> 3) The nic device is just DMAing to the buffer memory allocated from
> the huge page.
> 4) Another app2 started, allocated memory from the hugetlbfs, and the
> memory allocated happened to be the buffer memory.
> Ok, the nic device dmaed to memory of app2, which corrupted app2.
> Btw, the window opened is very very narrow, but we could avoid this
> corruption if we don't unlink huge page immediately. We could
> reinitialize the nic through binding operation and then remove the huge
> page.
>
> I mentioned virtio at the first time. For its case, the one who does DMA
> is vhost and i am talking about the guest huge page not the huge pages
> used to back guest memory.
>
> So we had better not unlink huge pages unless we have other solution to
> avoid the corruption.
Oh, I get it now. It's when you (ab)use UIO to bypass all normal kernel
protections. There's no problem when using VFIO.
So kernel doesn't protect you in case of a crash, but I guess you
can try to protect yourself.
For example, write a separate service that you can pass the hugepage FDs
and the device FDs to. Have it hold on to them, and when it detects your
app crashed, have it reset the device before closing the FDs.
Just make sure that one doesn't crash :).
But really, people should just use VFIO.
--
MST
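The "separate service that holds the FDs" idea relies on passing open descriptors between processes. A minimal sketch of that one step, assuming a Unix-domain socket and `SCM_RIGHTS` ancillary data, is shown below. `send_fd` and `recv_fd` are hypothetical helper names, and crash detection and the device reset are not shown.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical helper: send one descriptor over a Unix-domain socket. */
static int send_fd(int sock, int fd)
{
    char byte = 0; /* at least one byte of ordinary data must accompany it */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr hdr; /* forces correct alignment of the buffer */
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;

    memset(&ctrl, 0, sizeof(ctrl));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Hypothetical helper: receive one descriptor; returns it, or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr hdr;
        char buf[CMSG_SPACE(sizeof(int))];
    } ctrl;
    struct msghdr msg;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctrl.buf;
    msg.msg_controllen = sizeof(ctrl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    struct cmsghdr *c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET ||
        c->cmsg_type != SCM_RIGHTS || c->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(c), sizeof(int));
    return fd;
}
```

The receiver gets its own reference to the underlying open file, so the resource stays open after the sender closes its copy or dies. That is the property the holder service would depend on.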
On 10/5/2015 9:20 PM, Michael S. Tsirkin wrote:
> But really, people should just use VFIO.
I am not sure about this: if the app crashes and the kernel a) unmaps
the memory and then b) tears down the IOMMU mappings, does another app
still have a chance to allocate the same memory between step a and step b?
We need to check whether the memory stays bound to the huge page files
after it is allocated.