From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrew Rybchenko <arybchenko@solarflare.com>
To: "Burakov, Anatoly", dev@dpdk.org
CC: Olivier MATZ
Subject: Re: [dpdk-dev] [PATCH v1 3/9] mempool: remove callback to get capabilities
Date: Thu, 15 Mar 2018 14:49:46 +0300
Message-ID: <7873dc42-d58f-7dd2-abe0-3eac3ee5ebfb@solarflare.com>
References: <1516713372-10572-1-git-send-email-arybchenko@solarflare.com>
 <1520696382-16400-1-git-send-email-arybchenko@solarflare.com>
 <1520696382-16400-4-git-send-email-arybchenko@solarflare.com>
 <72792bc9-8e64-9a18-ce6c-ffd6777f1748@solarflare.com>
 <216ac4e5-2815-de23-4185-fd28210c0477@intel.com>

On 03/15/2018 12:48 PM, Burakov, Anatoly wrote:
> On 14-Mar-18 5:24 PM, Andrew Rybchenko wrote:
>> On 03/14/2018 07:53 PM, Burakov, Anatoly wrote:
>>> On 14-Mar-18 4:12 PM, Andrew Rybchenko wrote:
>>>> On 03/14/2018 05:40 PM, Burakov, Anatoly wrote:
>>>>> On 10-Mar-18 3:39 PM, Andrew Rybchenko wrote:
>>>>>> The callback was introduced to let generic code know the octeontx
>>>>>> mempool driver's requirements: use a single physically contiguous
>>>>>> memory chunk to store all objects and align object addresses to
>>>>>> the total object size. Now these requirements are met using new
>>>>>> callbacks to calculate the required memory chunk size and to
>>>>>> populate objects using the provided memory chunk.
>>>>>>
>>>>>> These capability flags are not used anywhere else.
>>>>>>
>>>>>> Restricting capabilities to flags is not generic and is likely to
>>>>>> be insufficient to describe mempool driver features. If required
>>>>>> in the future, an API which returns structured information may be
>>>>>> added.
>>>>>>
>>>>>> Signed-off-by: Andrew Rybchenko <arybchenko@solarflare.com>
>>>>>> ---
>>>>>
>>>>> Just a general comment - it is not enough to describe minimum
>>>>> memchunk requirements. With the memory hotplug patchset that is
>>>>> hopefully getting merged in 18.05, memzones will no longer be
>>>>> guaranteed to be IOVA-contiguous. So, if a driver requires its
>>>>> mempool to be populated not just from a single memzone, but from a
>>>>> single *physically contiguous* memzone, going by these callbacks
>>>>> alone will not do, because whether or not something should be a
>>>>> single memzone says nothing about whether that memzone has to also
>>>>> be IOVA-contiguous.
>>>>>
>>>>> So I believe this needs to stay in one form or another.
>>>>>
>>>>> (Also, it would be nice to have a flag that a user could pass to
>>>>> mempool_create that would force the memzone reservation to be
>>>>> IOVA-contiguous, but that's a topic for another conversation. The
>>>>> prime user for this would be KNI.)
>>>>
>>>> I think that min_chunk_size should be treated as IOVA-contiguous.
>>>
>>> Why? It's perfectly reasonable to e.g. implement a software mempool
>>> driver that would perform some optimizations due to all objects
>>> being in the same VA-contiguous memzone, yet not depend on the
>>> underlying physical memory layout. These are two separate concerns
>>> IMO.
>>
>> It looks like there is some misunderstanding here, or I simply don't
>> understand your point.
>> Above I mean that a driver should be able to advertise its
>> requirements on IOVA-contiguous regions.
>> If a driver does not care about the physical memory layout, no
>> problem.
>
> Please correct me if I'm wrong, but my understanding was that you
> wanted to use min_chunk as a way to express minimum requirements for
> IOVA-contiguous memory. If I understood you correctly, I don't think
> that's the way to go, because there could be valid use cases where a
> mempool driver would like to advertise min_chunk_size to be equal to
> its total size (i.e. allocate everything in a single memzone), yet not
> require that memzone to be IOVA-contiguous. I think these are two
> different concerns, and one does not, and should not, imply the other.

Aha, you're saying that virtual-contiguity and IOVA-contiguity
requirements are different things, and that there could be use cases
where virtual contiguity is important but IOVA-contiguity is not
required. That is perfectly fine.
As I understand it, IOVA-contiguous (physical) typically implies
virtually contiguous as well, and a requirement to have everything
virtually contiguous with only some blocks physically contiguous is
unlikely. So the requirement may be reduced to either virtual or
physical contiguity.
If a mempool does not care about physical contiguity at all, the
MEMPOOL_F_NO_PHYS_CONTIG flag should be used and min_chunk_size should
express the virtual-contiguity requirement.
If a mempool requires physically contiguous objects, there is *no*
MEMPOOL_F_NO_PHYS_CONTIG flag and min_chunk_size expresses the
physical-contiguity requirement.
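In code, this interpretation would boil down to something like the
following (a minimal sketch only; the helper name is illustrative and
not part of this series):

    #include <stddef.h>

    #include <rte_mempool.h>

    /*
     * Size of the IOVA-contiguous block each memory chunk must
     * provide; 0 means there is no IOVA-contiguity requirement at all.
     */
    static size_t
    iova_contig_granularity(const struct rte_mempool *mp,
                            size_t min_chunk_size)
    {
        if (mp->flags & MEMPOOL_F_NO_PHYS_CONTIG)
            return 0; /* min_chunk_size is a virtual requirement only */
        return min_chunk_size; /* each chunk must be IOVA-contiguous */
    }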
>>
>>>
>>>> So, we have 4 levels:
>>>>   - MEMPOOL_F_NO_PHYS_CONTIG (min_chunk_size == 0) --
>>>> IOVA-contiguous is not required at all
>>>>   - no MEMPOOL_F_NO_PHYS_CONTIG (min_chunk_size == total_obj_size)
>>>> -- each object should be IOVA-contiguous
>>>>   - min_chunk_size > total_obj_size -- a group of objects should be
>>>> IOVA-contiguous
>>>>   - min_chunk_size == <total mem size> -- all objects should be
>>>> IOVA-contiguous
>>>
>>> I don't think this "automagic" decision on what should be
>>> IOVA-contiguous or not is the way to go. It needlessly complicates
>>> things, when all it takes is another flag passed to the mempool
>>> allocator somewhere.
>>
>> No, it is not just one flag. We really need option (3) above: a group
>> of objects IOVA-contiguous, in [1].
>> Of course, it is possible to use option (4) instead: everything
>> IOVA-contiguous, but I think that is bad - the region may be very big
>> and hard or impossible to allocate due to fragmentation.
>
> Exactly: we shouldn't be forcing IOVA-contiguous memory just because a
> mempool requested a big min_chunk_size, nor do I think it is wise to
> encode such heuristics (referring to your 4 "levels" quoted above)
> into the mempool allocator.
>
>>> I'm not sure what is the best solution here. Perhaps another option
>>> would be to let mempool drivers allocate their memory as well? I.e.
>>> leave the current behavior as the default, as it is likely to be
>>> suitable for nearly all use cases, but provide another option to
>>> override memory allocation completely, so that e.g. octeontx could
>>> just do a memzone_reserve_contig() without regard for the default
>>> allocation settings. I think this could be the cleanest solution.
>>
>> For me it is hard to say. I don't know the DPDK history well enough
>> to say why there is a mempool API to populate objects in externally
>> provided memory. If it may be removed, it is OK with me to do memory
>> allocation inside rte_mempool or the mempool drivers. Otherwise, if
>> it is still allowed to allocate memory externally and pass it to the
>> mempool, there must be a way to express the IOVA-contiguity
>> requirements.
>>
>> [1] https://dpdk.org/dev/patchwork/patch/34338/
>
> Populating mempool objects is not the same as reserving memory where
> those objects would reside. The closest to "allocate memory
> externally" we have is rte_mempool_xmem_create(), which you are
> removing in this patchset.

It is not the only function. Other functions remain:
rte_mempool_populate_iova(), rte_mempool_populate_iova_tab() and
rte_mempool_populate_virt(). These functions may be used to add memory
areas to a mempool and populate objects in them. So the memory is
allocated externally, and the external entity needs to know the
requirements on the memory it allocates: the size, and whether virtual
or both virtual and physical contiguity is needed.
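For example, the external flow with rte_mempool_populate_iova() looks
roughly like this (a sketch only; the memzone name, sizes and error
handling are illustrative):

    #include <errno.h>

    #include <rte_lcore.h>
    #include <rte_memzone.h>
    #include <rte_mempool.h>

    /*
     * The caller, not the mempool, decides where the memory comes
     * from, so it must know the driver's size and contiguity
     * requirements before reserving anything.
     */
    static int
    populate_externally(struct rte_mempool *mp, size_t len)
    {
        const struct rte_memzone *mz;

        mz = rte_memzone_reserve("ext-mem", len, rte_socket_id(), 0);
        if (mz == NULL)
            return -ENOMEM;

        /* Hand the externally reserved area to the mempool; objects
         * are laid out inside it by the populate machinery. */
        return rte_mempool_populate_iova(mp, mz->addr, mz->iova,
                                         mz->len, NULL, NULL);
    }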
>>>
>>>>
>>>> If so, how should allocation be implemented?
>>>>   1. if (min_chunk_size > min_page_size)
>>>>      a. try to allocate everything IOVA-contiguous
>>>>      b. if that cannot be done, allocate in min_chunk_size
>>>>         IOVA-contiguous pieces
>>>>   2. else allocate non-contiguous
>>>>
>>>> --
>>>> Andrew.
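For reference, a sketch of the strategy quoted above. It assumes a
memzone flag that forces an IOVA-contiguous reservation, here called
RTE_MEMZONE_IOVA_CONTIG; no such flag exists today, so treat it as
purely hypothetical:

    #include <stddef.h>

    #include <rte_memzone.h>

    static const struct rte_memzone *
    reserve_for_mempool(const char *name, size_t total_size,
                        size_t min_chunk_size, size_t min_page_size,
                        int socket_id)
    {
        const struct rte_memzone *mz;

        if (min_chunk_size <= min_page_size)
            /* 2. any page-sized piece satisfies the requirement */
            return rte_memzone_reserve(name, total_size, socket_id, 0);

        /* 1a. try to get everything in one IOVA-contiguous region */
        mz = rte_memzone_reserve(name, total_size, socket_id,
                                 RTE_MEMZONE_IOVA_CONTIG);
        if (mz != NULL)
            return mz;

        /* 1b. fall back to one min_chunk_size IOVA-contiguous piece;
         * the caller repeats this until enough chunks are reserved */
        return rte_memzone_reserve(name, min_chunk_size, socket_id,
                                   RTE_MEMZONE_IOVA_CONTIG);
    }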