From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 8CFF7A052B;
	Fri, 31 Jul 2020 13:55:30 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id DDF972BB8;
	Fri, 31 Jul 2020 13:55:29 +0200 (CEST)
Received: from mga11.intel.com (mga11.intel.com [192.55.52.93])
 by dpdk.org (Postfix) with ESMTP id 68FE311A2;
 Fri, 31 Jul 2020 13:55:27 +0200 (CEST)
IronPort-SDR: mthjHDR4IlOx5LB9aULR6Yaxkova9sjk7+BuIfyGTNG6an+nyUsaUMkkjIc9GDCr99Ou3lMunk
 BYcwpUtjFyNw==
X-IronPort-AV: E=McAfee;i="6000,8403,9698"; a="149588635"
X-IronPort-AV: E=Sophos;i="5.75,418,1589266800"; d="scan'208";a="149588635"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga007.jf.intel.com ([10.7.209.58])
 by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 31 Jul 2020 04:55:26 -0700
IronPort-SDR: Q9Kwaz2Km9W2P3OhiiUEFfgPGDi+q3Q6HocAFb4AQwPb3PGJjWQbxZXezFy6hshoZHKcqEMqPA
 Thd6Jkq55/fQ==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.75,418,1589266800"; d="scan'208";a="331043831"
Received: from aburakov-mobl.ger.corp.intel.com (HELO [10.213.220.248])
 ([10.213.220.248])
 by orsmga007.jf.intel.com with ESMTP; 31 Jul 2020 04:55:24 -0700
To: wangyunjian <wangyunjian@huawei.com>, "dev@dpdk.org" <dev@dpdk.org>,
 "david.marchand@redhat.com" <david.marchand@redhat.com>
Cc: "Lilijun (Jerry)" <jerry.lilijun@huawei.com>,
 xudingke <xudingke@huawei.com>, "stable@dpdk.org" <stable@dpdk.org>
References: <cdb0d6482a5c3a5d7d23625d230d9ef41d657320.1594903876.git.wangyunjian@huawei.com>
 <1595515713-24640-1-git-send-email-wangyunjian@huawei.com>
 <c35b31da-703e-d4d4-6283-0bb4e8ad57e2@intel.com>
 <34EFBCA9F01B0748BEB6B629CE643AE60D0EAAD8@dggemm513-mbx.china.huawei.com>
 <a75edde1-a30c-71d1-5191-6c37cad23779@intel.com>
 <34EFBCA9F01B0748BEB6B629CE643AE60D0F695F@dggemm513-mbx.china.huawei.com>
From: "Burakov, Anatoly" <anatoly.burakov@intel.com>
Message-ID: <21d092c3-83ce-a3f4-2411-193a5f46a111@intel.com>
Date: Fri, 31 Jul 2020 12:55:23 +0100
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:68.0) Gecko/20100101
 Thunderbird/68.10.0
MIME-Version: 1.0
In-Reply-To: <34EFBCA9F01B0748BEB6B629CE643AE60D0F695F@dggemm513-mbx.china.huawei.com>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [dpdk-dev] [PATCH v2] eal/linux: do not create user mem map
 repeatedly when it exists
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

On 30-Jul-20 2:16 PM, wangyunjian wrote:
>> -----Original Message-----
>> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
>> Sent: Monday, July 27, 2020 5:24 PM
>> To: wangyunjian <wangyunjian@huawei.com>; dev@dpdk.org;
>> david.marchand@redhat.com
>> Cc: Lilijun (Jerry) <jerry.lilijun@huawei.com>; xudingke
>> <xudingke@huawei.com>; stable@dpdk.org
>> Subject: Re: [dpdk-dev] [PATCH v2] eal/linux: do not create user mem map
>> repeatedly when it exists
>>
>> On 25-Jul-20 10:59 AM, wangyunjian wrote:
>>>> -----Original Message-----
>>>> From: Burakov, Anatoly [mailto:anatoly.burakov@intel.com]
>>>> Sent: Friday, July 24, 2020 9:25 PM
>>>> To: wangyunjian <wangyunjian@huawei.com>; dev@dpdk.org;
>>>> david.marchand@redhat.com
>>>> Cc: Lilijun (Jerry) <jerry.lilijun@huawei.com>; xudingke
>>>> <xudingke@huawei.com>; stable@dpdk.org
>>>> Subject: Re: [dpdk-dev] [PATCH v2] eal/linux: do not create user mem
>>>> map repeatedly when it exists
>>>>
>>>> On 23-Jul-20 3:48 PM, wangyunjian wrote:
>>>>> From: Yunjian Wang <wangyunjian@huawei.com>
>>>>>
>>>>> Currently, we will create new user mem map entry for the same memory
>>>>> segment, but in fact it has already been added to the user mem maps.
>>>>> It's not necessary to create it twice.
>>>>>
>>>>> To resolve the issue, add support to remove the same entry in the
>>>>> function compact_user_maps().
>>>>>
>>>>> Fixes: 0cbce3a167f1 ("vfio: skip DMA map failure if already mapped")
>>>>> Cc: stable@dpdk.org
>>>>>
>>>>> Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
>>>>> ---
>>>>> v2:
>>>>> * Remove the same entry in the function compact_user_maps()
>>>>> ---
>>>>>     lib/librte_eal/linux/eal_vfio.c | 5 +++++
>>>>>     1 file changed, 5 insertions(+)
>>>>>
>>>>> diff --git a/lib/librte_eal/linux/eal_vfio.c
>>>>> b/lib/librte_eal/linux/eal_vfio.c index abb12a354..df99307b7 100644
>>>>> --- a/lib/librte_eal/linux/eal_vfio.c
>>>>> +++ b/lib/librte_eal/linux/eal_vfio.c
>>>>> @@ -167,6 +167,10 @@ adjust_map(struct user_mem_map *src, struct
>>>> user_mem_map *end,
>>>>>     static int
>>>>>     merge_map(struct user_mem_map *left, struct user_mem_map
>> *right)
>>>>>     {
>>>>> +	/* merge the same maps into one */
>>>>> +	if (memcmp(left, right, sizeof(struct user_mem_map)) == 0)
>>>>> +		goto out;
>>>>> +
>>>>
>>>> merge_map looks for adjacent maps only, but does not handle maps that
>>>> are wholly contained within one another ("the same map" also matches
>>>> this definition). wouldn't it be better to check for that instead of
>>>> *just* handling identical maps?
>>>
>>> What about using the initial implementation?
>>> We don't create new user mem map entry for the same memory segment.
>>
>> I don't like this implementation because it relies on particulars of how VFIO
>> mapping work without explicitly specifying them. I.e. it's prone to breaking
>> when changing code. That's not even mentioning that we have no guarantees
>> on kernel behavior in that particular case being identical on all supported
>> platforms.
>>
>> I would honestly prefer an explicit compaction over implicit one.
> 
> What about this implementation?

Again, this works, but i feel like specializing it to just merge the 
exact same maps is missing an opportunity to provide a more general 
solution that merges same *and* subset maps.

> 
> diff --git a/lib/librte_eal/linux/eal_vfio.c b/lib/librte_eal/linux/eal_vfio.c
> index e07979936..8dcb04cd9 100644
> --- a/lib/librte_eal/linux/eal_vfio.c
> +++ b/lib/librte_eal/linux/eal_vfio.c
> @@ -179,6 +179,19 @@ merge_map(struct user_mem_map *left, struct user_mem_map *right)
>   	return 1;
>   }
>   
> +/* try merging two same maps into one, return 1 if succeeded */
> +static int
> +merge_same_map(struct user_mem_map *left, struct user_mem_map *right)
> +{
> +	if (memcmp(left, right, sizeof(struct user_mem_map)) != 0) {
> +		return 0;
> +	}
> +
> +	memset(right, 0, sizeof(*right));
> +
> +	return 1;
> +}
> +
>   static struct user_mem_map *
>   find_user_mem_map(struct user_mem_maps *user_mem_maps, uint64_t addr,
>   		uint64_t iova, uint64_t len)
> @@ -232,7 +245,7 @@ compact_user_maps(struct user_mem_maps *user_mem_maps)
>   		if (is_null_map(l) || is_null_map(r))
>   			continue;
>   
> -		if (merge_map(l, r))
> +		if (merge_map(l, r) || merge_same_map(l, r))
>   			n_merged++;
>   	}
> 
> Thanks,
> Yunjian
> 
>>
>>>
>>> @@ -1828,6 +1828,13 @@  container_dma_map(struct vfio_config
>> *vfio_cfg, uint64_t vaddr, uint64_t iova,
>>>    		ret = -1;
>>>    		goto out;
>>>    	}
>>> +
>>> +	/* we don't need create new user mem map entry
>>> +	 * for the same memory segment.
>>> +	 */
>>> +	if (errno == EBUSY || errno == EEXIST)
>>> +		goto out;
>>> +
>>>    	/* create new user mem map entry */
>>>    	new_map =
>> &user_mem_maps->maps[user_mem_maps->n_maps++];
>>>    	new_map->addr = vaddr;
>>>
>>> Thanks,
>>> Yunjian
>>>>
>>>>>     	if (left->addr + left->len != right->addr)
>>>>>     		return 0;
>>>>>     	if (left->iova + left->len != right->iova) @@ -174,6 +178,7 @@
>>>>> merge_map(struct user_mem_map *left, struct
>>>> user_mem_map *right)
>>>>>
>>>>>     	left->len += right->len;
>>>>>
>>>>> +out:
>>>>>     	memset(right, 0, sizeof(*right));
>>>>>
>>>>>     	return 1;
>>>>>
>>>>
>>>>
>>>> --
>>>> Thanks,
>>>> Anatoly
>>
>>
>> --
>> Thanks,
>> Anatoly


-- 
Thanks,
Anatoly