From: "Burakov, Anatoly"
To: Andrew Rybchenko, dev@dpdk.org
Cc: Olivier MATZ
Subject: Re: [dpdk-dev] [PATCH v1 3/9] mempool: remove callback to get capabilities
Date: Thu, 15 Mar 2018 09:48:41 +0000

On 14-Mar-18 5:24 PM, Andrew Rybchenko wrote:
> On 03/14/2018 07:53 PM, Burakov, Anatoly wrote:
>> On 14-Mar-18 4:12 PM, Andrew Rybchenko wrote:
>>> On 03/14/2018 05:40 PM, Burakov, Anatoly wrote:
>>>> On 10-Mar-18 3:39 PM, Andrew Rybchenko wrote:
>>>>> The callback was introduced to let generic code know about the
>>>>> octeontx mempool driver's requirement to use a single physically
>>>>> contiguous memory chunk to store all objects and to align object
>>>>> addresses to the total object size. These requirements are now met
>>>>> using the new callbacks that calculate the required memory chunk
>>>>> size and populate objects using the provided memory chunk.
>>>>>
>>>>> These capability flags are not used anywhere else.
>>>>>
>>>>> Restricting capabilities to flags is not generic and is likely to
>>>>> be insufficient to describe mempool driver features. If required
>>>>> in the future, an API which returns structured information may be
>>>>> added.
>>>>>
>>>>> Signed-off-by: Andrew Rybchenko
>>>>> ---
>>>>
>>>> Just a general comment - it is not enough to describe minimum
>>>> memchunk requirements. With the memory hotplug patchset that is
>>>> hopefully getting merged in 18.05, memzones will no longer be
>>>> guaranteed to be IOVA-contiguous. So, if a driver requires its
>>>> mempool to be populated not just from a single memzone, but from a
>>>> single *physically contiguous* memzone, going by callbacks alone
>>>> will not do, because whether or not something should be a single
>>>> memzone says nothing about whether that memzone also has to be
>>>> IOVA-contiguous.
>>>>
>>>> So I believe this needs to stay in one form or another.
>>>>
>>>> (It would also be nice to have a flag that a user could pass to
>>>> mempool_create that would force the memzone reservation to be
>>>> IOVA-contiguous, but that's a topic for another conversation. The
>>>> prime user for this would be KNI.)
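To illustrate the kind of flag I mean above - this is a rough sketch
only; MEMPOOL_F_IOVA_CONTIG and its value are hypothetical, not existing
DPDK API:

#include <rte_mempool.h>

/* Hypothetical flag, for illustration only (not existing DPDK API,
 * value chosen arbitrarily): request that the pool memory is reserved
 * IOVA-contiguous even if the mempool driver itself does not need it
 * (e.g. for KNI). */
#define MEMPOOL_F_IOVA_CONTIG 0x0400

static struct rte_mempool *
create_kni_pool(const char *name, unsigned int n, unsigned int elt_size,
                int socket_id)
{
        /* Same arguments as a normal pool, plus the extra flag; the
         * mempool allocator would then have to reserve a physically
         * contiguous memzone, and fail if it cannot. */
        return rte_mempool_create(name, n, elt_size,
                                  0,    /* cache size */
                                  0,    /* private data size */
                                  NULL, NULL,   /* no pool constructor */
                                  NULL, NULL,   /* no per-object init */
                                  socket_id,
                                  MEMPOOL_F_IOVA_CONTIG);
}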
>>>
>>> I think that min_chunk_size should be treated as IOVA-contiguous.
>>
>> Why? It's perfectly reasonable to e.g. implement a software mempool
>> driver that would perform some optimizations due to all objects being
>> in the same VA-contiguous memzone, yet not be dependent on the
>> underlying physical memory layout. These are two separate concerns IMO.
>
> It looks like there is some misunderstanding here, or I simply don't
> understand your point.
> Above I mean that the driver should be able to advertise its
> requirements on IOVA-contiguous regions.
> If the driver does not care about physical memory layout, no problem.

Please correct me if I'm wrong, but my understanding was that you wanted
to use min_chunk as a way to express minimum requirements for
IOVA-contiguous memory. If I understood you correctly, I don't think
that's the way to go, because there could be valid use cases where a
mempool driver would like to advertise min_chunk_size equal to its total
size (i.e. allocate everything in a single memzone), yet not require that
memzone to be IOVA-contiguous. I think these are two different concerns,
and one does not, and should not, imply the other.

>
>>
>>> So, we have 4 levels:
>>>   - MEMPOOL_F_NO_PHYS_CONTIG (min_chunk_size == 0) --
>>>     IOVA-contiguous memory is not required at all
>>>   - no MEMPOOL_F_NO_PHYS_CONTIG (min_chunk_size == total_obj_size) --
>>>     each object should be IOVA-contiguous
>>>   - min_chunk_size > total_obj_size -- a group of objects should be
>>>     IOVA-contiguous
>>>   - min_chunk_size == <total size> -- all objects should be
>>>     IOVA-contiguous
>>
>> I don't think this "automagic" decision on what should be
>> IOVA-contiguous or not is the way to go. It needlessly complicates
>> things, when all it takes is another flag passed to the mempool
>> allocator somewhere.
>
> No, it is not just one flag. We really need option (3) above: a group
> of objects IOVA-contiguous, in [1].
> Of course, it is possible to use option (4) instead: everything
> IOVA-contiguous, but I think it is bad - it may be very big and
> hard/impossible to allocate due to fragmentation.

Exactly: we shouldn't be forcing IOVA-contiguous memory just because a
mempool requested a big min_chunk_size, nor do I think it is wise to
encode such heuristics (referring to your 4 "levels" quoted above) into
the mempool allocator.

>> I'm not sure what the best solution is here. Perhaps another option
>> would be to let mempool drivers allocate their memory as well? I.e.
>> leave the current behavior as the default, as it's likely to be
>> suitable for nearly all use cases, but provide another option to
>> override memory allocation completely, so that e.g. octeontx could
>> just do a memzone_reserve_contig() without regard for the default
>> allocation settings. I think this could be the cleanest solution.
>
> For me it is hard to say. I don't know DPDK history well enough to say
> why there is a mempool API to populate objects on externally provided
> memory. If it may be removed, it is OK for me to do memory allocation
> inside rte_mempool or mempool drivers. Otherwise, if it is still
> allowed to allocate memory externally and pass it to mempool, there
> must be a way to express IOVA-contiguous requirements.
>
> [1] https://dpdk.org/dev/patchwork/patch/34338/

Populating mempool objects is not the same as reserving the memory where
those objects would reside. The closest to "allocate memory externally"
that we have is rte_mempool_xmem_create(), which you are removing in
this patchset.
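To sketch what I mean by "override memory allocation completely" - names
below are hypothetical, there is no such callback in rte_mempool_ops
today:

#include <rte_mempool.h>

/* Hypothetical per-driver callback, for illustration only. */
typedef int (*rte_mempool_populate_pool_t)(struct rte_mempool *mp);

static int
mempool_populate(struct rte_mempool *mp,
                 rte_mempool_populate_pool_t drv_populate)
{
        /* If the driver supplies its own allocation routine, use it:
         * e.g. octeontx could reserve a single IOVA-contiguous memzone
         * here, without generic code having to know about that
         * requirement. */
        if (drv_populate != NULL)
                return drv_populate(mp);

        /* Otherwise keep today's default behaviour. */
        return rte_mempool_populate_default(mp);
}

Drivers that don't care about physical memory layout would simply not
provide the callback and would get the default path.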
>
>>
>>>
>>> If so, how allocation should be implemented?
>>>   1. if (min_chunk_size > min_page_size)
>>>      a. try all contiguous
>>>      b. if cannot, do by mem_chunk_size contiguous
>>>   2. else allocate non-contiguous
>>>
>>> --
>>> Andrew.

-- 
Thanks,
Anatoly