Dear Stephen,

The problem has not been solved, but I found a workaround.

According to the documentation (https://doc.dpdk.org/guides/prog_guide/gpudev.html, sec 11.3), rte_extmem_register should be invoked with GPU_PAGE_SIZE as an argument. If GPU_PAGE_SIZE is set to 2 MB instead of 64 kB, registration of 72 GB of GPU memory (on a Grace Hopper) is done in about ten seconds, not hours.

    rte_extmem_register(ext_mem.buf_ptr, ext_mem.buf_len, NULL, ext_mem.buf_iova, GPU_PAGE_SIZE);

Thanks,
John Romein

On 05-10-2024 00:16, Stephen Hemminger wrote:
> On Wed, 1 Nov 2023 22:21:16 +0100
> John Romein wrote:
>
>> Dear Slava,
>>
>> Thank you for looking at the patch.  With the original code, I saw that
>> the application spent literally hours in this function during program
>> start up, if tens of gigabytes of GPU memory are registered.  This was
>> due to qsort being invoked for every new added item (to keep the list
>> sorted).  So I tried to write equivalent code that sorts the list only
>> once, after all items were added.  At least for our application, this
>> works well and is /much/ faster, as the complexity decreased from n^2
>> log(n) to n log(n).  But I must admit that I have no idea /what/ is
>> being sorted, or why; I only understand this isolated piece of code (or
>> at least I think so).  So if you think there are better ways to
>> initialize the list, then I am sure you will be absolutely right.  But I
>> will not be able to implement this, as I do not understand the full
>> context of the code.
>>
>> Kind Regards,  John
> Looks like the problem remains but patch has been sitting around for 11 months.
> Was this resolved?
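
P.S. For readers of the archive, here is a minimal sketch of the two sorting patterns described in the quoted message: re-sorting after every insertion versus sorting once after all items have been added. It is not the driver code. The struct chunk type, the comparator and both function names are hypothetical and serve only to illustrate the difference in complexity (roughly n^2 log(n) versus n log(n)).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for one registered memory chunk. */
struct chunk {
    unsigned long addr;
};

/* Order chunks by ascending address. */
static int chunk_cmp(const void *a, const void *b)
{
    const struct chunk *x = a;
    const struct chunk *y = b;

    return (x->addr > y->addr) - (x->addr < y->addr);
}

/* Pattern attributed to the original code: re-sort after every insertion. */
static void fill_sort_each(struct chunk *list, const unsigned long *addrs,
                           size_t n)
{
    assert(list != NULL && addrs != NULL);
    for (size_t i = 0; i < n; i++) {
        list[i].addr = addrs[i];
        qsort(list, i + 1, sizeof(list[0]), chunk_cmp);
    }
}

/* Pattern attributed to the patch: append everything, then sort once. */
static void fill_sort_once(struct chunk *list, const unsigned long *addrs,
                           size_t n)
{
    assert(list != NULL && addrs != NULL);
    for (size_t i = 0; i < n; i++)
        list[i].addr = addrs[i];
    qsort(list, n, sizeof(list[0]), chunk_cmp);
}
```

Both functions leave the list in the same order; only the number of qsort calls differs (n calls versus one). With 72 GB registered in 64 kB pages, n would be about 1.2 million entries if the list holds one entry per page, which I have not verified.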