To: Maxime Coquelin, Jianfeng Tan, dev@dpdk.org
Cc: tiwei.bie@intel.com, zhiyong.yang@intel.com
References: <1524756847-141034-1-git-send-email-jianfeng.tan@intel.com>
 <93064ddf-b753-4d3e-2992-4fb94e984b36@intel.com>
 <968eaf20-b450-c047-782e-d417c78e732b@redhat.com>
From: "Burakov, Anatoly"
Message-ID: <212da774-c715-f479-2614-44ec68963e40@intel.com>
Date: Fri, 4 May 2018 08:52:30 +0100
In-Reply-To: <968eaf20-b450-c047-782e-d417c78e732b@redhat.com>
Subject: Re: [dpdk-dev] [PATCH] net/virtio-user: fix hugepage files enumeration
List-Id: DPDK patches and discussions

On 04-May-18 8:40 AM, Maxime Coquelin wrote:
> Hi Anatoly,
>
> On 04/27/2018 11:31 AM, Burakov, Anatoly wrote:
>> On 26-Apr-18 4:34 PM, Jianfeng Tan wrote:
>>> After the commit 2a04139f66b4 ("eal: add single file segments option"),
>>> one hugepage file could contain multiple hugepages which are further
>>> mapped to different memory regions.
>>>
>>> Original enumeration implementation cannot handle this situation.
>>>
>>> This patch filters out the duplicated files; and adjust the size after
>>> the enumeration.
>>>
>>> Fixes: 6a84c37e3975 ("net/virtio-user: add vhost-user adapter layer")
>>>
>>> Signed-off-by: Jianfeng Tan
>>> ---
>>>   .../howto/virtio_user_for_container_networking.rst |  3 ++-
>>>   drivers/net/virtio/virtio_user/vhost_user.c        | 28 ++++++++++++++++++++--
>>>   2 files changed, 28 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/doc/guides/howto/virtio_user_for_container_networking.rst b/doc/guides/howto/virtio_user_for_container_networking.rst
>>> index aa68b53..476ce3a 100644
>>> --- a/doc/guides/howto/virtio_user_for_container_networking.rst
>>> +++ b/doc/guides/howto/virtio_user_for_container_networking.rst
>>> @@ -109,7 +109,8 @@ We have below limitations in this solution:
>>>    * Cannot work with --no-huge option. Currently, DPDK uses anonymous mapping
>>>      under this option which cannot be reopened to share with vhost backend.
>>>    * Cannot work when there are more than VHOST_MEMORY_MAX_NREGIONS(8) hugepages.
>>> -   In another word, do not use 2MB hugepage so far.
>>> +   If you have more regions (especially when 2MB hugepages are used), the option,
>>> +   --single-file-segments, can help to reduce the number of shared files.
>>>    * Applications should not use file name like HUGEFILE_FMT ("%smap_%d"). That
>>>      will bring confusion when sharing hugepage files with backend by name.
>>>    * Root privilege is a must. DPDK resolves physical addresses of hugepages
>>> diff --git a/drivers/net/virtio/virtio_user/vhost_user.c b/drivers/net/virtio/virtio_user/vhost_user.c
>>> index a6df97a..01201c9 100644
>>> --- a/drivers/net/virtio/virtio_user/vhost_user.c
>>> +++ b/drivers/net/virtio/virtio_user/vhost_user.c
>>> @@ -138,12 +138,13 @@ struct hugepage_file_info {
>>>   static int
>>>   get_hugepage_file_info(struct hugepage_file_info huges[], int max)
>>>   {
>>> -    int idx;
>>> +    int idx, k, exist;
>>>       FILE *f;
>>>       char buf[BUFSIZ], *tmp, *tail;
>>>       char *str_underline, *str_start;
>>>       int huge_index;
>>>       uint64_t v_start, v_end;
>>> +    struct stat stats;
>>>       f = fopen("/proc/self/maps", "r");
>>>       if (!f) {
>>> @@ -183,16 +184,39 @@ get_hugepage_file_info(struct hugepage_file_info huges[], int max)
>>>           if (sscanf(str_start, "map_%d", &huge_index) != 1)
>>>               continue;
>>> +        /* skip duplicated file which is mapped to different regions */
>>> +        for (k = 0, exist = -1; k < idx; ++k) {
>>> +            if (!strcmp(huges[k].path, tmp)) {
>>> +                exist = k;
>>> +                break;
>>> +            }
>>> +        }
>>> +        if (exist >= 0)
>>> +            continue;
>>> +
>>>           if (idx >= max) {
>>>               PMD_DRV_LOG(ERR, "Exceed maximum of %d", max);
>>>               goto error;
>>>           }
>>> +
>>>           huges[idx].addr = v_start;
>>> -        huges[idx].size = v_end - v_start;
>>> +        huges[idx].size = v_end - v_start; /* To be corrected later */
>>>           snprintf(huges[idx].path, PATH_MAX, "%s", tmp);
>>>           idx++;
>>>       }
>>> +    /* correct the size for files who have many regions */
>>> +    for (k = 0; k < idx; ++k) {
>>> +        if (stat(huges[k].path, &stats) < 0) {
>>> +            PMD_DRV_LOG(ERR, "Failed to stat %s, %s\n",
>>> +                    huges[k].path, strerror(errno));
>>> +            continue;
>>> +        }
>>> +        huges[k].size = stats.st_size;
>>> +        PMD_DRV_LOG(INFO, "file %s, size %"PRIx64"\n",
>>> +                huges[k].path, huges[k].size);
>>> +    }
>>> +
>>>       fclose(f);
>>>       return idx;
>>>
>>
>> That sounds like potentially a lot of strcmp()'s (quadratic?). Can't
>> it be sped up somehow? Maybe use rte_hash for storing this data?
>>
>
> This patch is required to have virtio-user to work with 2MB pages.
> While it may be improved later, I think we should pick it for v18.05.
> Is it fine for you?

Looks fine to me otherwise.

>
> Thanks,
> Maxime
>

-- 
Thanks,
Anatoly
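
For reference, here is a rough sketch of what the hash-based lookup suggested in the thread could look like. It is not the patch's code and it is not rte_hash: it is a standalone open-addressing string set in plain C, and every name in it (path_set, path_set_add, SKETCH_SLOTS, SKETCH_PATH_MAX) is hypothetical and chosen only for illustration. The idea is to replace the linear strcmp() scan with an expected constant-time lookup per /proc/self/maps line.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PATH_MAX 256 /* hypothetical; the driver uses PATH_MAX */
#define SKETCH_SLOTS    64  /* hypothetical; must be a power of two */

/* Fixed-size open-addressing set of path strings (illustration only). */
struct path_set {
    char paths[SKETCH_SLOTS][SKETCH_PATH_MAX];
    int used[SKETCH_SLOTS];
    int count;
};

/* 32-bit FNV-1a hash of a NUL-terminated string. */
static uint32_t
path_hash(const char *s)
{
    uint32_t h = 2166136261u;

    while (*s) {
        h ^= (unsigned char)*s++;
        h *= 16777619u;
    }
    return h;
}

static void
path_set_init(struct path_set *ps)
{
    memset(ps, 0, sizeof(*ps));
}

/*
 * Returns 1 if the path was newly inserted, 0 if it was already present,
 * and -1 if the path is too long or the set is at capacity.
 */
static int
path_set_add(struct path_set *ps, const char *path)
{
    uint32_t i, slot;

    assert(ps != NULL && path != NULL);
    if (strlen(path) >= SKETCH_PATH_MAX)
        return -1;

    slot = path_hash(path) & (SKETCH_SLOTS - 1);
    for (i = 0; i < SKETCH_SLOTS; i++) {
        if (!ps->used[slot]) {
            /* Keep the table at most half full so probes stay short. */
            if (ps->count >= SKETCH_SLOTS / 2)
                return -1;
            snprintf(ps->paths[slot], SKETCH_PATH_MAX, "%s", path);
            ps->used[slot] = 1;
            ps->count++;
            return 1;
        }
        if (!strcmp(ps->paths[slot], path))
            return 0;
        /* Linear probing: try the next slot, wrapping around. */
        slot = (slot + 1) & (SKETCH_SLOTS - 1);
    }
    return -1;
}
```

Assuming a `struct path_set seen` initialised with path_set_init() before the parsing loop, the per-line check in get_hugepage_file_info() could call path_set_add(&seen, tmp) and skip the line when it returns 0. Note that the strcmp() loop in the patch only scans entries already stored in huges[], and idx never exceeds max, so each line costs at most max comparisons; with a small max the linear scan may well be cheap enough in practice.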