* [dpdk-dev] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-04 12:53 [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask Alejandro Lucero
@ 2018-07-04 12:53 ` Alejandro Lucero
2018-07-10 8:56 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 2/6] ethdev: add function for checking IOVAs by a device Alejandro Lucero
` (4 subsequent siblings)
5 siblings, 1 reply; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-04 12:53 UTC (permalink / raw)
To: dev; +Cc: stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
A device can suffer addressing limitations. This functions checks
memsegs have iovas within the supported range based on dma mask.
PMD should use this during initialization if supported devices
suffer addressing limitations, returning an error if this function
returns memsegs out of range.
Another potential usage is for emulated IOMMU hardware with addressing
limitations.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_eal/common/eal_common_memory.c | 33 ++++++++++++++++++++++++++++++
lib/librte_eal/common/include/rte_memory.h | 3 +++
lib/librte_eal/rte_eal_version.map | 1 +
3 files changed, 37 insertions(+)
diff --git a/lib/librte_eal/common/eal_common_memory.c b/lib/librte_eal/common/eal_common_memory.c
index fc6c44d..f5efebe 100644
--- a/lib/librte_eal/common/eal_common_memory.c
+++ b/lib/librte_eal/common/eal_common_memory.c
@@ -109,6 +109,39 @@
}
}
+/* check memseg iovas are within the required range based on dma mask */
+int
+rte_eal_check_dma_mask(uint8_t maskbits)
+{
+
+ const struct rte_mem_config *mcfg;
+ uint64_t mask;
+ int i;
+
+ /* create dma mask */
+ mask = ~((1ULL << maskbits) - 1);
+
+ /* get pointer to global configuration */
+ mcfg = rte_eal_get_configuration()->mem_config;
+
+ for (i = 0; i < RTE_MAX_MEMSEG; i++) {
+ if (mcfg->memseg[i].addr == NULL)
+ break;
+
+ if (mcfg->memseg[i].iova & mask) {
+ RTE_LOG(INFO, EAL,
+ "memseg[%d] iova %"PRIx64" out of range:\n",
+ i, mcfg->memseg[i].iova);
+
+ RTE_LOG(INFO, EAL, "\tusing dma mask %"PRIx64"\n",
+ mask);
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
/* return the number of memory channels */
unsigned rte_memory_get_nchannel(void)
{
diff --git a/lib/librte_eal/common/include/rte_memory.h b/lib/librte_eal/common/include/rte_memory.h
index 80a8fc0..b2a0168 100644
--- a/lib/librte_eal/common/include/rte_memory.h
+++ b/lib/librte_eal/common/include/rte_memory.h
@@ -209,6 +209,9 @@ struct rte_memseg {
*/
unsigned rte_memory_get_nrank(void);
+/* check memsegs iovas are within a range based on dma mask */
+int rte_eal_check_dma_mask(uint8_t maskbits);
+
/**
* Drivers based on uio will not load unless physical
* addresses are obtainable. It is only possible to get
diff --git a/lib/librte_eal/rte_eal_version.map b/lib/librte_eal/rte_eal_version.map
index f4f46c1..aa6cf87 100644
--- a/lib/librte_eal/rte_eal_version.map
+++ b/lib/librte_eal/rte_eal_version.map
@@ -184,6 +184,7 @@ DPDK_17.11 {
rte_eal_create_uio_dev;
rte_bus_get_iommu_class;
+ rte_eal_check_dma_mask;
rte_eal_has_pci;
rte_eal_iova_mode;
rte_eal_mbuf_default_mempool_ops;
--
1.9.1
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses Alejandro Lucero
@ 2018-07-10 8:56 ` Eelco Chaudron
2018-07-10 9:34 ` Alejandro Lucero
0 siblings, 1 reply; 25+ messages in thread
From: Eelco Chaudron @ 2018-07-10 8:56 UTC (permalink / raw)
To: Alejandro Lucero
Cc: dev, stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
> A device can suffer addressing limitations. This functions checks
> memsegs have iovas within the supported range based on dma mask.
>
> PMD should use this during initialization if supported devices
> suffer addressing limitations, returning an error if this function
> returns memsegs out of range.
>
> Another potential usage is for emulated IOMMU hardware with addressing
> limitations.
>
> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
> ---
> lib/librte_eal/common/eal_common_memory.c | 33
> ++++++++++++++++++++++++++++++
> lib/librte_eal/common/include/rte_memory.h | 3 +++
> lib/librte_eal/rte_eal_version.map | 1 +
> 3 files changed, 37 insertions(+)
>
> diff --git a/lib/librte_eal/common/eal_common_memory.c
> b/lib/librte_eal/common/eal_common_memory.c
> index fc6c44d..f5efebe 100644
> --- a/lib/librte_eal/common/eal_common_memory.c
> +++ b/lib/librte_eal/common/eal_common_memory.c
> @@ -109,6 +109,39 @@
> }
> }
>
> +/* check memseg iovas are within the required range based on dma mask
> */
> +int
> +rte_eal_check_dma_mask(uint8_t maskbits)
> +{
> +
> + const struct rte_mem_config *mcfg;
> + uint64_t mask;
> + int i;
> +
I think we should add some sanity check to the input maskbits, i.e.
[64,0) or [64, 32]? What would be a reasonable lower bound.
> + /* create dma mask */
> + mask = ~((1ULL << maskbits) - 1);
> +
> + /* get pointer to global configuration */
> + mcfg = rte_eal_get_configuration()->mem_config;
> +
> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
> + if (mcfg->memseg[i].addr == NULL)
> + break;
> +
> + if (mcfg->memseg[i].iova & mask) {
> + RTE_LOG(INFO, EAL,
> + "memseg[%d] iova %"PRIx64" out of range:\n",
> + i, mcfg->memseg[i].iova);
> +
> + RTE_LOG(INFO, EAL, "\tusing dma mask %"PRIx64"\n",
> + mask);
> + return -1;
> + }
> + }
> +
> + return 0;
> +}
> +
> /* return the number of memory channels */
> unsigned rte_memory_get_nchannel(void)
> {
> diff --git a/lib/librte_eal/common/include/rte_memory.h
> b/lib/librte_eal/common/include/rte_memory.h
> index 80a8fc0..b2a0168 100644
> --- a/lib/librte_eal/common/include/rte_memory.h
> +++ b/lib/librte_eal/common/include/rte_memory.h
> @@ -209,6 +209,9 @@ struct rte_memseg {
> */
> unsigned rte_memory_get_nrank(void);
>
> +/* check memsegs iovas are within a range based on dma mask */
> +int rte_eal_check_dma_mask(uint8_t maskbits);
> +
> /**
> * Drivers based on uio will not load unless physical
> * addresses are obtainable. It is only possible to get
> diff --git a/lib/librte_eal/rte_eal_version.map
> b/lib/librte_eal/rte_eal_version.map
> index f4f46c1..aa6cf87 100644
> --- a/lib/librte_eal/rte_eal_version.map
> +++ b/lib/librte_eal/rte_eal_version.map
> @@ -184,6 +184,7 @@ DPDK_17.11 {
>
> rte_eal_create_uio_dev;
> rte_bus_get_iommu_class;
> + rte_eal_check_dma_mask;
> rte_eal_has_pci;
> rte_eal_iova_mode;
> rte_eal_mbuf_default_mempool_ops;
> --
> 1.9.1
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-10 8:56 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
@ 2018-07-10 9:34 ` Alejandro Lucero
2018-07-10 10:06 ` Eelco Chaudron
0 siblings, 1 reply; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-10 9:34 UTC (permalink / raw)
To: Eelco Chaudron
Cc: dev, stable, Burakov, Anatoly, Maxime Coquelin, Ferruh Yigit
On Tue, Jul 10, 2018 at 9:56 AM, Eelco Chaudron <echaudro@redhat.com> wrote:
>
>
> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>
> A device can suffer addressing limitations. This functions checks
>> memsegs have iovas within the supported range based on dma mask.
>>
>> PMD should use this during initialization if supported devices
>> suffer addressing limitations, returning an error if this function
>> returns memsegs out of range.
>>
>> Another potential usage is for emulated IOMMU hardware with addressing
>> limitations.
>>
>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>> ---
>> lib/librte_eal/common/eal_common_memory.c | 33
>> ++++++++++++++++++++++++++++++
>> lib/librte_eal/common/include/rte_memory.h | 3 +++
>> lib/librte_eal/rte_eal_version.map | 1 +
>> 3 files changed, 37 insertions(+)
>>
>> diff --git a/lib/librte_eal/common/eal_common_memory.c
>> b/lib/librte_eal/common/eal_common_memory.c
>> index fc6c44d..f5efebe 100644
>> --- a/lib/librte_eal/common/eal_common_memory.c
>> +++ b/lib/librte_eal/common/eal_common_memory.c
>> @@ -109,6 +109,39 @@
>> }
>> }
>>
>> +/* check memseg iovas are within the required range based on dma mask */
>> +int
>> +rte_eal_check_dma_mask(uint8_t maskbits)
>> +{
>> +
>> + const struct rte_mem_config *mcfg;
>> + uint64_t mask;
>> + int i;
>> +
>>
>
> I think we should add some sanity check to the input maskbits, i.e. [64,0)
> or [64, 32]? What would be a reasonable lower bound.
>
>
This is not a user's API, so any invocation will be reviewed, but I guess
adding a sanity check here does not harm.
Not sure about lower bound but upper should 64, although it does not make
sense but it is safe. Lower bound is not so problematic.
>
> + /* create dma mask */
>> + mask = ~((1ULL << maskbits) - 1);
>> +
>> + /* get pointer to global configuration */
>> + mcfg = rte_eal_get_configuration()->mem_config;
>> +
>> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>> + if (mcfg->memseg[i].addr == NULL)
>> + break;
>> +
>> + if (mcfg->memseg[i].iova & mask) {
>> + RTE_LOG(INFO, EAL,
>> + "memseg[%d] iova %"PRIx64" out of
>> range:\n",
>> + i, mcfg->memseg[i].iova);
>> +
>> + RTE_LOG(INFO, EAL, "\tusing dma mask %"PRIx64"\n",
>> + mask);
>> + return -1;
>> + }
>> + }
>> +
>> + return 0;
>> +}
>> +
>> /* return the number of memory channels */
>> unsigned rte_memory_get_nchannel(void)
>> {
>> diff --git a/lib/librte_eal/common/include/rte_memory.h
>> b/lib/librte_eal/common/include/rte_memory.h
>> index 80a8fc0..b2a0168 100644
>> --- a/lib/librte_eal/common/include/rte_memory.h
>> +++ b/lib/librte_eal/common/include/rte_memory.h
>> @@ -209,6 +209,9 @@ struct rte_memseg {
>> */
>> unsigned rte_memory_get_nrank(void);
>>
>> +/* check memsegs iovas are within a range based on dma mask */
>> +int rte_eal_check_dma_mask(uint8_t maskbits);
>> +
>> /**
>> * Drivers based on uio will not load unless physical
>> * addresses are obtainable. It is only possible to get
>> diff --git a/lib/librte_eal/rte_eal_version.map
>> b/lib/librte_eal/rte_eal_version.map
>> index f4f46c1..aa6cf87 100644
>> --- a/lib/librte_eal/rte_eal_version.map
>> +++ b/lib/librte_eal/rte_eal_version.map
>> @@ -184,6 +184,7 @@ DPDK_17.11 {
>>
>> rte_eal_create_uio_dev;
>> rte_bus_get_iommu_class;
>> + rte_eal_check_dma_mask;
>> rte_eal_has_pci;
>> rte_eal_iova_mode;
>> rte_eal_mbuf_default_mempool_ops;
>> --
>> 1.9.1
>>
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-10 9:34 ` Alejandro Lucero
@ 2018-07-10 10:06 ` Eelco Chaudron
2018-07-10 10:52 ` Alejandro Lucero
0 siblings, 1 reply; 25+ messages in thread
From: Eelco Chaudron @ 2018-07-10 10:06 UTC (permalink / raw)
To: Alejandro Lucero
Cc: dev, stable, Burakov, Anatoly, Maxime Coquelin, Ferruh Yigit
On 10 Jul 2018, at 11:34, Alejandro Lucero wrote:
> On Tue, Jul 10, 2018 at 9:56 AM, Eelco Chaudron <echaudro@redhat.com>
> wrote:
>
>>
>>
>> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>>
>> A device can suffer addressing limitations. This functions checks
>>> memsegs have iovas within the supported range based on dma mask.
>>>
>>> PMD should use this during initialization if supported devices
>>> suffer addressing limitations, returning an error if this function
>>> returns memsegs out of range.
>>>
>>> Another potential usage is for emulated IOMMU hardware with
>>> addressing
>>> limitations.
>>>
>>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>> ---
>>> lib/librte_eal/common/eal_common_memory.c | 33
>>> ++++++++++++++++++++++++++++++
>>> lib/librte_eal/common/include/rte_memory.h | 3 +++
>>> lib/librte_eal/rte_eal_version.map | 1 +
>>> 3 files changed, 37 insertions(+)
>>>
>>> diff --git a/lib/librte_eal/common/eal_common_memory.c
>>> b/lib/librte_eal/common/eal_common_memory.c
>>> index fc6c44d..f5efebe 100644
>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>> @@ -109,6 +109,39 @@
>>> }
>>> }
>>>
>>> +/* check memseg iovas are within the required range based on dma
>>> mask */
>>> +int
>>> +rte_eal_check_dma_mask(uint8_t maskbits)
>>> +{
>>> +
>>> + const struct rte_mem_config *mcfg;
>>> + uint64_t mask;
>>> + int i;
>>> +
>>>
>>
>> I think we should add some sanity check to the input maskbits, i.e.
>> [64,0)
>> or [64, 32]? What would be a reasonable lower bound.
>>
>>
> This is not a user's API, so any invocation will be reviewed, but I
> guess
> adding a sanity check here does not harm.
>
> Not sure about lower bound but upper should 64, although it does not
> make
> sense but it is safe. Lower bound is not so problematic.
>
>
>>
>> + /* create dma mask */
>>> + mask = ~((1ULL << maskbits) - 1);
>>> +
>>> + /* get pointer to global configuration */
>>> + mcfg = rte_eal_get_configuration()->mem_config;
>>> +
>>> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>>> + if (mcfg->memseg[i].addr == NULL)
>>> + break;
Looking at some other code, it looks like NULL entries might exists. So
should a continue; rather than a break; be used here?
>>> +
>>> + if (mcfg->memseg[i].iova & mask) {
>>> + RTE_LOG(INFO, EAL,
>>> + "memseg[%d] iova %"PRIx64" out of
>>> range:\n",
>>> + i, mcfg->memseg[i].iova);
>>> +
>>> + RTE_LOG(INFO, EAL, "\tusing dma mask
>>> %"PRIx64"\n",
>>> + mask);
>>> + return -1;
>>> + }
>>> + }
>>> +
>>> + return 0;
>>> +}
>>> +
>>> /* return the number of memory channels */
>>> unsigned rte_memory_get_nchannel(void)
>>> {
>>> diff --git a/lib/librte_eal/common/include/rte_memory.h
>>> b/lib/librte_eal/common/include/rte_memory.h
>>> index 80a8fc0..b2a0168 100644
>>> --- a/lib/librte_eal/common/include/rte_memory.h
>>> +++ b/lib/librte_eal/common/include/rte_memory.h
>>> @@ -209,6 +209,9 @@ struct rte_memseg {
>>> */
>>> unsigned rte_memory_get_nrank(void);
>>>
>>> +/* check memsegs iovas are within a range based on dma mask */
>>> +int rte_eal_check_dma_mask(uint8_t maskbits);
>>> +
>>> /**
>>> * Drivers based on uio will not load unless physical
>>> * addresses are obtainable. It is only possible to get
>>> diff --git a/lib/librte_eal/rte_eal_version.map
>>> b/lib/librte_eal/rte_eal_version.map
>>> index f4f46c1..aa6cf87 100644
>>> --- a/lib/librte_eal/rte_eal_version.map
>>> +++ b/lib/librte_eal/rte_eal_version.map
>>> @@ -184,6 +184,7 @@ DPDK_17.11 {
>>>
>>> rte_eal_create_uio_dev;
>>> rte_bus_get_iommu_class;
>>> + rte_eal_check_dma_mask;
>>> rte_eal_has_pci;
>>> rte_eal_iova_mode;
>>> rte_eal_mbuf_default_mempool_ops;
>>> --
>>> 1.9.1
>>>
>>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-10 10:06 ` Eelco Chaudron
@ 2018-07-10 10:52 ` Alejandro Lucero
2018-07-10 11:14 ` Eelco Chaudron
0 siblings, 1 reply; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-10 10:52 UTC (permalink / raw)
To: Eelco Chaudron
Cc: dev, stable, Burakov, Anatoly, Maxime Coquelin, Ferruh Yigit
On Tue, Jul 10, 2018 at 11:06 AM, Eelco Chaudron <echaudro@redhat.com>
wrote:
>
>
> On 10 Jul 2018, at 11:34, Alejandro Lucero wrote:
>
> On Tue, Jul 10, 2018 at 9:56 AM, Eelco Chaudron <echaudro@redhat.com>
>> wrote:
>>
>>
>>>
>>> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>>>
>>> A device can suffer addressing limitations. This functions checks
>>>
>>>> memsegs have iovas within the supported range based on dma mask.
>>>>
>>>> PMD should use this during initialization if supported devices
>>>> suffer addressing limitations, returning an error if this function
>>>> returns memsegs out of range.
>>>>
>>>> Another potential usage is for emulated IOMMU hardware with addressing
>>>> limitations.
>>>>
>>>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>> ---
>>>> lib/librte_eal/common/eal_common_memory.c | 33
>>>> ++++++++++++++++++++++++++++++
>>>> lib/librte_eal/common/include/rte_memory.h | 3 +++
>>>> lib/librte_eal/rte_eal_version.map | 1 +
>>>> 3 files changed, 37 insertions(+)
>>>>
>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c
>>>> b/lib/librte_eal/common/eal_common_memory.c
>>>> index fc6c44d..f5efebe 100644
>>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>>> @@ -109,6 +109,39 @@
>>>> }
>>>> }
>>>>
>>>> +/* check memseg iovas are within the required range based on dma mask
>>>> */
>>>> +int
>>>> +rte_eal_check_dma_mask(uint8_t maskbits)
>>>> +{
>>>> +
>>>> + const struct rte_mem_config *mcfg;
>>>> + uint64_t mask;
>>>> + int i;
>>>> +
>>>>
>>>>
>>> I think we should add some sanity check to the input maskbits, i.e.
>>> [64,0)
>>> or [64, 32]? What would be a reasonable lower bound.
>>>
>>>
>>> This is not a user's API, so any invocation will be reviewed, but I guess
>> adding a sanity check here does not harm.
>>
>> Not sure about lower bound but upper should 64, although it does not make
>> sense but it is safe. Lower bound is not so problematic.
>>
>>
>>
>>> + /* create dma mask */
>>>
>>>> + mask = ~((1ULL << maskbits) - 1);
>>>> +
>>>> + /* get pointer to global configuration */
>>>> + mcfg = rte_eal_get_configuration()->mem_config;
>>>> +
>>>> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>>>> + if (mcfg->memseg[i].addr == NULL)
>>>> + break;
>>>>
>>>
> Looking at some other code, it looks like NULL entries might exists. So
> should a continue; rather than a break; be used here?
>
>
I do not think so. memsegs are allocated sequentially, so first with addr
as NULL implies no more memsegs.
>
> +
>>>> + if (mcfg->memseg[i].iova & mask) {
>>>> + RTE_LOG(INFO, EAL,
>>>> + "memseg[%d] iova %"PRIx64" out of
>>>> range:\n",
>>>> + i, mcfg->memseg[i].iova);
>>>> +
>>>> + RTE_LOG(INFO, EAL, "\tusing dma mask
>>>> %"PRIx64"\n",
>>>> + mask);
>>>> + return -1;
>>>> + }
>>>> + }
>>>> +
>>>> + return 0;
>>>> +}
>>>> +
>>>> /* return the number of memory channels */
>>>> unsigned rte_memory_get_nchannel(void)
>>>> {
>>>> diff --git a/lib/librte_eal/common/include/rte_memory.h
>>>> b/lib/librte_eal/common/include/rte_memory.h
>>>> index 80a8fc0..b2a0168 100644
>>>> --- a/lib/librte_eal/common/include/rte_memory.h
>>>> +++ b/lib/librte_eal/common/include/rte_memory.h
>>>> @@ -209,6 +209,9 @@ struct rte_memseg {
>>>> */
>>>> unsigned rte_memory_get_nrank(void);
>>>>
>>>> +/* check memsegs iovas are within a range based on dma mask */
>>>> +int rte_eal_check_dma_mask(uint8_t maskbits);
>>>> +
>>>> /**
>>>> * Drivers based on uio will not load unless physical
>>>> * addresses are obtainable. It is only possible to get
>>>> diff --git a/lib/librte_eal/rte_eal_version.map
>>>> b/lib/librte_eal/rte_eal_version.map
>>>> index f4f46c1..aa6cf87 100644
>>>> --- a/lib/librte_eal/rte_eal_version.map
>>>> +++ b/lib/librte_eal/rte_eal_version.map
>>>> @@ -184,6 +184,7 @@ DPDK_17.11 {
>>>>
>>>> rte_eal_create_uio_dev;
>>>> rte_bus_get_iommu_class;
>>>> + rte_eal_check_dma_mask;
>>>> rte_eal_has_pci;
>>>> rte_eal_iova_mode;
>>>> rte_eal_mbuf_default_mempool_ops;
>>>> --
>>>> 1.9.1
>>>>
>>>>
>>>
>
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-10 10:52 ` Alejandro Lucero
@ 2018-07-10 11:14 ` Eelco Chaudron
2018-07-10 11:33 ` Burakov, Anatoly
2018-07-10 11:40 ` Alejandro Lucero
0 siblings, 2 replies; 25+ messages in thread
From: Eelco Chaudron @ 2018-07-10 11:14 UTC (permalink / raw)
To: Alejandro Lucero
Cc: dev, stable, Burakov, Anatoly, Maxime Coquelin, Ferruh Yigit
On 10 Jul 2018, at 12:52, Alejandro Lucero wrote:
> On Tue, Jul 10, 2018 at 11:06 AM, Eelco Chaudron <echaudro@redhat.com>
> wrote:
>
>>
>>
>> On 10 Jul 2018, at 11:34, Alejandro Lucero wrote:
>>
>> On Tue, Jul 10, 2018 at 9:56 AM, Eelco Chaudron <echaudro@redhat.com>
>>> wrote:
>>>
>>>
>>>>
>>>> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>>>>
>>>> A device can suffer addressing limitations. This functions checks
>>>>
>>>>> memsegs have iovas within the supported range based on dma mask.
>>>>>
>>>>> PMD should use this during initialization if supported devices
>>>>> suffer addressing limitations, returning an error if this function
>>>>> returns memsegs out of range.
>>>>>
>>>>> Another potential usage is for emulated IOMMU hardware with
>>>>> addressing
>>>>> limitations.
>>>>>
>>>>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>> ---
>>>>> lib/librte_eal/common/eal_common_memory.c | 33
>>>>> ++++++++++++++++++++++++++++++
>>>>> lib/librte_eal/common/include/rte_memory.h | 3 +++
>>>>> lib/librte_eal/rte_eal_version.map | 1 +
>>>>> 3 files changed, 37 insertions(+)
>>>>>
>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c
>>>>> b/lib/librte_eal/common/eal_common_memory.c
>>>>> index fc6c44d..f5efebe 100644
>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>>>> @@ -109,6 +109,39 @@
>>>>> }
>>>>> }
>>>>>
>>>>> +/* check memseg iovas are within the required range based on dma
>>>>> mask
>>>>> */
>>>>> +int
>>>>> +rte_eal_check_dma_mask(uint8_t maskbits)
>>>>> +{
>>>>> +
>>>>> + const struct rte_mem_config *mcfg;
>>>>> + uint64_t mask;
>>>>> + int i;
>>>>> +
>>>>>
>>>>>
>>>> I think we should add some sanity check to the input maskbits, i.e.
>>>> [64,0)
>>>> or [64, 32]? What would be a reasonable lower bound.
>>>>
>>>>
>>>> This is not a user's API, so any invocation will be reviewed, but I
>>>> guess
>>> adding a sanity check here does not harm.
>>>
>>> Not sure about lower bound but upper should 64, although it does not
>>> make
>>> sense but it is safe. Lower bound is not so problematic.
>>>
>>>
>>>
>>>> + /* create dma mask */
>>>>
>>>>> + mask = ~((1ULL << maskbits) - 1);
>>>>> +
>>>>> + /* get pointer to global configuration */
>>>>> + mcfg = rte_eal_get_configuration()->mem_config;
>>>>> +
>>>>> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>>>>> + if (mcfg->memseg[i].addr == NULL)
>>>>> + break;
>>>>>
>>>>
>> Looking at some other code, it looks like NULL entries might exists.
>> So
>> should a continue; rather than a break; be used here?
>>
>>
> I do not think so. memsegs are allocated sequentially, so first with
> addr
> as NULL implies no more memsegs.
I was referring to the mem walk functions, rte_memseg_list_walk(). Maybe
some having more experience with this area can review/comment.
>
>
>>
>> +
>>>>> + if (mcfg->memseg[i].iova & mask) {
>>>>> + RTE_LOG(INFO, EAL,
>>>>> + "memseg[%d] iova %"PRIx64" out of
>>>>> range:\n",
>>>>> + i, mcfg->memseg[i].iova);
>>>>> +
>>>>> + RTE_LOG(INFO, EAL, "\tusing dma mask
>>>>> %"PRIx64"\n",
>>>>> + mask);
>>>>> + return -1;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> /* return the number of memory channels */
>>>>> unsigned rte_memory_get_nchannel(void)
>>>>> {
>>>>> diff --git a/lib/librte_eal/common/include/rte_memory.h
>>>>> b/lib/librte_eal/common/include/rte_memory.h
>>>>> index 80a8fc0..b2a0168 100644
>>>>> --- a/lib/librte_eal/common/include/rte_memory.h
>>>>> +++ b/lib/librte_eal/common/include/rte_memory.h
>>>>> @@ -209,6 +209,9 @@ struct rte_memseg {
>>>>> */
>>>>> unsigned rte_memory_get_nrank(void);
>>>>>
>>>>> +/* check memsegs iovas are within a range based on dma mask */
>>>>> +int rte_eal_check_dma_mask(uint8_t maskbits);
>>>>> +
>>>>> /**
>>>>> * Drivers based on uio will not load unless physical
>>>>> * addresses are obtainable. It is only possible to get
>>>>> diff --git a/lib/librte_eal/rte_eal_version.map
>>>>> b/lib/librte_eal/rte_eal_version.map
>>>>> index f4f46c1..aa6cf87 100644
>>>>> --- a/lib/librte_eal/rte_eal_version.map
>>>>> +++ b/lib/librte_eal/rte_eal_version.map
>>>>> @@ -184,6 +184,7 @@ DPDK_17.11 {
>>>>>
>>>>> rte_eal_create_uio_dev;
>>>>> rte_bus_get_iommu_class;
>>>>> + rte_eal_check_dma_mask;
>>>>> rte_eal_has_pci;
>>>>> rte_eal_iova_mode;
>>>>> rte_eal_mbuf_default_mempool_ops;
>>>>> --
>>>>> 1.9.1
>>>>>
>>>>>
>>>>
>>
>>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-10 11:14 ` Eelco Chaudron
@ 2018-07-10 11:33 ` Burakov, Anatoly
2018-07-10 11:43 ` Alejandro Lucero
2018-07-10 11:40 ` Alejandro Lucero
1 sibling, 1 reply; 25+ messages in thread
From: Burakov, Anatoly @ 2018-07-10 11:33 UTC (permalink / raw)
To: Eelco Chaudron, Alejandro Lucero
Cc: dev, stable, Maxime Coquelin, Ferruh Yigit
On 10-Jul-18 12:14 PM, Eelco Chaudron wrote:
>
>
> On 10 Jul 2018, at 12:52, Alejandro Lucero wrote:
>
>> On Tue, Jul 10, 2018 at 11:06 AM, Eelco Chaudron <echaudro@redhat.com>
>> wrote:
>>
>>>
>>>
>>> On 10 Jul 2018, at 11:34, Alejandro Lucero wrote:
>>>
>>> On Tue, Jul 10, 2018 at 9:56 AM, Eelco Chaudron <echaudro@redhat.com>
>>>> wrote:
>>>>
>>>>
>>>>>
>>>>> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>>>>>
>>>>> A device can suffer addressing limitations. This functions checks
>>>>>
>>>>>> memsegs have iovas within the supported range based on dma mask.
>>>>>>
>>>>>> PMD should use this during initialization if supported devices
>>>>>> suffer addressing limitations, returning an error if this function
>>>>>> returns memsegs out of range.
>>>>>>
>>>>>> Another potential usage is for emulated IOMMU hardware with
>>>>>> addressing
>>>>>> limitations.
>>>>>>
>>>>>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>> ---
>>>>>> lib/librte_eal/common/eal_common_memory.c | 33
>>>>>> ++++++++++++++++++++++++++++++
>>>>>> lib/librte_eal/common/include/rte_memory.h | 3 +++
>>>>>> lib/librte_eal/rte_eal_version.map | 1 +
>>>>>> 3 files changed, 37 insertions(+)
>>>>>>
>>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c
>>>>>> b/lib/librte_eal/common/eal_common_memory.c
>>>>>> index fc6c44d..f5efebe 100644
>>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>>>>> @@ -109,6 +109,39 @@
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> +/* check memseg iovas are within the required range based on dma
>>>>>> mask
>>>>>> */
>>>>>> +int
>>>>>> +rte_eal_check_dma_mask(uint8_t maskbits)
>>>>>> +{
>>>>>> +
>>>>>> + const struct rte_mem_config *mcfg;
>>>>>> + uint64_t mask;
>>>>>> + int i;
>>>>>> +
>>>>>>
>>>>>>
>>>>> I think we should add some sanity check to the input maskbits, i.e.
>>>>> [64,0)
>>>>> or [64, 32]? What would be a reasonable lower bound.
>>>>>
>>>>>
>>>>> This is not a user's API, so any invocation will be reviewed, but I
>>>>> guess
>>>> adding a sanity check here does not harm.
>>>>
>>>> Not sure about lower bound but upper should 64, although it does not
>>>> make
>>>> sense but it is safe. Lower bound is not so problematic.
>>>>
>>>>
>>>>
>>>>> + /* create dma mask */
>>>>>
>>>>>> + mask = ~((1ULL << maskbits) - 1);
>>>>>> +
>>>>>> + /* get pointer to global configuration */
>>>>>> + mcfg = rte_eal_get_configuration()->mem_config;
>>>>>> +
>>>>>> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>>>>>> + if (mcfg->memseg[i].addr == NULL)
>>>>>> + break;
>>>>>>
>>>>>
>>> Looking at some other code, it looks like NULL entries might exists. So
>>> should a continue; rather than a break; be used here?
>>>
>>>
>> I do not think so. memsegs are allocated sequentially, so first with addr
>> as NULL implies no more memsegs.
>
> I was referring to the mem walk functions, rte_memseg_list_walk(). Maybe
> some having more experience with this area can review/comment.
Pre-18.05, all memsegs are allocated continuously. Memseg lists and
memseg list walk functions are 18.05+.
Alejandro, perhaps it would be worth it to tag your patchset with
"pre-18.05" to avoid similar confusion in the future?
--
Thanks,
Anatoly
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-10 11:33 ` Burakov, Anatoly
@ 2018-07-10 11:43 ` Alejandro Lucero
2018-07-10 11:55 ` Eelco Chaudron
0 siblings, 1 reply; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-10 11:43 UTC (permalink / raw)
To: Burakov, Anatoly
Cc: Eelco Chaudron, dev, stable, Maxime Coquelin, Ferruh Yigit
On Tue, Jul 10, 2018 at 12:33 PM, Burakov, Anatoly <
anatoly.burakov@intel.com> wrote:
> On 10-Jul-18 12:14 PM, Eelco Chaudron wrote:
>
>>
>>
>> On 10 Jul 2018, at 12:52, Alejandro Lucero wrote:
>>
>> On Tue, Jul 10, 2018 at 11:06 AM, Eelco Chaudron <echaudro@redhat.com>
>>> wrote:
>>>
>>>
>>>>
>>>> On 10 Jul 2018, at 11:34, Alejandro Lucero wrote:
>>>>
>>>> On Tue, Jul 10, 2018 at 9:56 AM, Eelco Chaudron <echaudro@redhat.com>
>>>>
>>>>> wrote:
>>>>>
>>>>>
>>>>>
>>>>>> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>>>>>>
>>>>>> A device can suffer addressing limitations. This functions checks
>>>>>>
>>>>>> memsegs have iovas within the supported range based on dma mask.
>>>>>>>
>>>>>>> PMD should use this during initialization if supported devices
>>>>>>> suffer addressing limitations, returning an error if this function
>>>>>>> returns memsegs out of range.
>>>>>>>
>>>>>>> Another potential usage is for emulated IOMMU hardware with
>>>>>>> addressing
>>>>>>> limitations.
>>>>>>>
>>>>>>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>>> ---
>>>>>>> lib/librte_eal/common/eal_common_memory.c | 33
>>>>>>> ++++++++++++++++++++++++++++++
>>>>>>> lib/librte_eal/common/include/rte_memory.h | 3 +++
>>>>>>> lib/librte_eal/rte_eal_version.map | 1 +
>>>>>>> 3 files changed, 37 insertions(+)
>>>>>>>
>>>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c
>>>>>>> b/lib/librte_eal/common/eal_common_memory.c
>>>>>>> index fc6c44d..f5efebe 100644
>>>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>>>>>> @@ -109,6 +109,39 @@
>>>>>>> }
>>>>>>> }
>>>>>>>
>>>>>>> +/* check memseg iovas are within the required range based on dma
>>>>>>> mask
>>>>>>> */
>>>>>>> +int
>>>>>>> +rte_eal_check_dma_mask(uint8_t maskbits)
>>>>>>> +{
>>>>>>> +
>>>>>>> + const struct rte_mem_config *mcfg;
>>>>>>> + uint64_t mask;
>>>>>>> + int i;
>>>>>>> +
>>>>>>>
>>>>>>>
>>>>>>> I think we should add some sanity check to the input maskbits, i.e.
>>>>>> [64,0)
>>>>>> or [64, 32]? What would be a reasonable lower bound.
>>>>>>
>>>>>>
>>>>>> This is not a user's API, so any invocation will be reviewed, but I
>>>>>> guess
>>>>>>
>>>>> adding a sanity check here does not harm.
>>>>>
>>>>> Not sure about lower bound but upper should 64, although it does not
>>>>> make
>>>>> sense but it is safe. Lower bound is not so problematic.
>>>>>
>>>>>
>>>>>
>>>>> + /* create dma mask */
>>>>>>
>>>>>> + mask = ~((1ULL << maskbits) - 1);
>>>>>>> +
>>>>>>> + /* get pointer to global configuration */
>>>>>>> + mcfg = rte_eal_get_configuration()->mem_config;
>>>>>>> +
>>>>>>> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>>>>>>> + if (mcfg->memseg[i].addr == NULL)
>>>>>>> + break;
>>>>>>>
>>>>>>>
>>>>>> Looking at some other code, it looks like NULL entries might exists.
>>>> So
>>>> should a continue; rather than a break; be used here?
>>>>
>>>>
>>>> I do not think so. memsegs are allocated sequentially, so first with
>>> addr
>>> as NULL implies no more memsegs.
>>>
>>
>> I was referring to the mem walk functions, rte_memseg_list_walk(). Maybe
>> some having more experience with this area can review/comment.
>>
>
> Pre-18.05, all memsegs are allocated continuously. Memseg lists and memseg
> list walk functions are 18.05+.
>
> Alejandro, perhaps it would be worth it to tag your patchset with
> "pre-18.05" to avoid similar confusion in the future?
>
>
Yes, that will help. I'm sending a new version shortly and I'll make it
clear.
> --
> Thanks,
> Anatoly
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-10 11:43 ` Alejandro Lucero
@ 2018-07-10 11:55 ` Eelco Chaudron
0 siblings, 0 replies; 25+ messages in thread
From: Eelco Chaudron @ 2018-07-10 11:55 UTC (permalink / raw)
To: Alejandro Lucero
Cc: Burakov, Anatoly, dev, stable, Maxime Coquelin, Ferruh Yigit
On 10 Jul 2018, at 13:43, Alejandro Lucero wrote:
> On Tue, Jul 10, 2018 at 12:33 PM, Burakov, Anatoly <
> anatoly.burakov@intel.com> wrote:
>
>> On 10-Jul-18 12:14 PM, Eelco Chaudron wrote:
>>
>>>
>>>
>>> On 10 Jul 2018, at 12:52, Alejandro Lucero wrote:
>>>
>>> On Tue, Jul 10, 2018 at 11:06 AM, Eelco Chaudron
>>> <echaudro@redhat.com>
>>>> wrote:
>>>>
>>>>
>>>>>
>>>>> On 10 Jul 2018, at 11:34, Alejandro Lucero wrote:
>>>>>
>>>>> On Tue, Jul 10, 2018 at 9:56 AM, Eelco Chaudron
>>>>> <echaudro@redhat.com>
>>>>>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>>>>>>>
>>>>>>> A device can suffer addressing limitations. This functions
>>>>>>> checks
>>>>>>>
>>>>>>> memsegs have iovas within the supported range based on dma mask.
>>>>>>>>
>>>>>>>> PMD should use this during initialization if supported devices
>>>>>>>> suffer addressing limitations, returning an error if this
>>>>>>>> function
>>>>>>>> returns memsegs out of range.
>>>>>>>>
>>>>>>>> Another potential usage is for emulated IOMMU hardware with
>>>>>>>> addressing
>>>>>>>> limitations.
>>>>>>>>
>>>>>>>> Signed-off-by: Alejandro Lucero
>>>>>>>> <alejandro.lucero@netronome.com>
>>>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>>>> ---
>>>>>>>> lib/librte_eal/common/eal_common_memory.c | 33
>>>>>>>> ++++++++++++++++++++++++++++++
>>>>>>>> lib/librte_eal/common/include/rte_memory.h | 3 +++
>>>>>>>> lib/librte_eal/rte_eal_version.map | 1 +
>>>>>>>> 3 files changed, 37 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c
>>>>>>>> b/lib/librte_eal/common/eal_common_memory.c
>>>>>>>> index fc6c44d..f5efebe 100644
>>>>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>>>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>>>>>>> @@ -109,6 +109,39 @@
>>>>>>>> }
>>>>>>>> }
>>>>>>>>
>>>>>>>> +/* check memseg iovas are within the required range based on
>>>>>>>> dma
>>>>>>>> mask
>>>>>>>> */
>>>>>>>> +int
>>>>>>>> +rte_eal_check_dma_mask(uint8_t maskbits)
>>>>>>>> +{
>>>>>>>> +
>>>>>>>> + const struct rte_mem_config *mcfg;
>>>>>>>> + uint64_t mask;
>>>>>>>> + int i;
>>>>>>>> +
>>>>>>>>
>>>>>>>>
>>>>>>>> I think we should add some sanity check to the input maskbits,
>>>>>>>> i.e.
>>>>>>> [64,0)
>>>>>>> or [64, 32]? What would be a reasonable lower bound.
>>>>>>>
>>>>>>>
>>>>>>> This is not a user's API, so any invocation will be reviewed,
>>>>>>> but I
>>>>>>> guess
>>>>>>>
>>>>>> adding a sanity check here does not harm.
>>>>>>
>>>>>> Not sure about lower bound but upper should 64, although it does
>>>>>> not
>>>>>> make
>>>>>> sense but it is safe. Lower bound is not so problematic.
>>>>>>
>>>>>>
>>>>>>
>>>>>> + /* create dma mask */
>>>>>>>
>>>>>>> + mask = ~((1ULL << maskbits) - 1);
>>>>>>>> +
>>>>>>>> + /* get pointer to global configuration */
>>>>>>>> + mcfg = rte_eal_get_configuration()->mem_config;
>>>>>>>> +
>>>>>>>> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>>>>>>>> + if (mcfg->memseg[i].addr == NULL)
>>>>>>>> + break;
>>>>>>>>
>>>>>>>>
>>>>>>> Looking at some other code, it looks like NULL entries might
>>>>>>> exists.
>>>>> So
>>>>> should a continue; rather than a break; be used here?
>>>>>
>>>>>
>>>>> I do not think so. memsegs are allocated sequentially, so first
>>>>> with
>>>> addr
>>>> as NULL implies no more memsegs.
>>>>
>>>
>>> I was referring to the mem walk functions, rte_memseg_list_walk().
>>> Maybe
>>> some having more experience with this area can review/comment.
>>>
>>
>> Pre-18.05, all memsegs are allocated continuously. Memseg lists and
>> memseg
>> list walk functions are 18.05+.
>>
>> Alejandro, perhaps it would be worth it to tag your patchset with
>> "pre-18.05" to avoid similar confusion in the future?
>>
>>
> Yes, that will help. I'm sending a new version shortly and I'll make
> it
> clear.
Thanks, I’ll review the new version if it’s ready before the end of
tomorrow CET, as I will be on PTO.
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses
2018-07-10 11:14 ` Eelco Chaudron
2018-07-10 11:33 ` Burakov, Anatoly
@ 2018-07-10 11:40 ` Alejandro Lucero
1 sibling, 0 replies; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-10 11:40 UTC (permalink / raw)
To: Eelco Chaudron
Cc: dev, stable, Burakov, Anatoly, Maxime Coquelin, Ferruh Yigit
On Tue, Jul 10, 2018 at 12:14 PM, Eelco Chaudron <echaudro@redhat.com>
wrote:
>
>
> On 10 Jul 2018, at 12:52, Alejandro Lucero wrote:
>
> On Tue, Jul 10, 2018 at 11:06 AM, Eelco Chaudron <echaudro@redhat.com>
>> wrote:
>>
>>
>>>
>>> On 10 Jul 2018, at 11:34, Alejandro Lucero wrote:
>>>
>>> On Tue, Jul 10, 2018 at 9:56 AM, Eelco Chaudron <echaudro@redhat.com>
>>>
>>>> wrote:
>>>>
>>>>
>>>>
>>>>> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>>>>>
>>>>> A device can suffer addressing limitations. This functions checks
>>>>>
>>>>> memsegs have iovas within the supported range based on dma mask.
>>>>>>
>>>>>> PMD should use this during initialization if supported devices
>>>>>> suffer addressing limitations, returning an error if this function
>>>>>> returns memsegs out of range.
>>>>>>
>>>>>> Another potential usage is for emulated IOMMU hardware with addressing
>>>>>> limitations.
>>>>>>
>>>>>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>>>>>> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
>>>>>> ---
>>>>>> lib/librte_eal/common/eal_common_memory.c | 33
>>>>>> ++++++++++++++++++++++++++++++
>>>>>> lib/librte_eal/common/include/rte_memory.h | 3 +++
>>>>>> lib/librte_eal/rte_eal_version.map | 1 +
>>>>>> 3 files changed, 37 insertions(+)
>>>>>>
>>>>>> diff --git a/lib/librte_eal/common/eal_common_memory.c
>>>>>> b/lib/librte_eal/common/eal_common_memory.c
>>>>>> index fc6c44d..f5efebe 100644
>>>>>> --- a/lib/librte_eal/common/eal_common_memory.c
>>>>>> +++ b/lib/librte_eal/common/eal_common_memory.c
>>>>>> @@ -109,6 +109,39 @@
>>>>>> }
>>>>>> }
>>>>>>
>>>>>> +/* check memseg iovas are within the required range based on dma mask
>>>>>> */
>>>>>> +int
>>>>>> +rte_eal_check_dma_mask(uint8_t maskbits)
>>>>>> +{
>>>>>> +
>>>>>> + const struct rte_mem_config *mcfg;
>>>>>> + uint64_t mask;
>>>>>> + int i;
>>>>>> +
>>>>>>
>>>>>>
>>>>>> I think we should add some sanity check to the input maskbits, i.e.
>>>>> [64,0)
>>>>> or [64, 32]? What would be a reasonable lower bound.
>>>>>
>>>>>
>>>>> This is not a user's API, so any invocation will be reviewed, but I
>>>>> guess
>>>>>
>>>> adding a sanity check here does not harm.
>>>>
>>>> Not sure about lower bound but upper should 64, although it does not
>>>> make
>>>> sense but it is safe. Lower bound is not so problematic.
>>>>
>>>>
>>>>
>>>> + /* create dma mask */
>>>>>
>>>>> + mask = ~((1ULL << maskbits) - 1);
>>>>>> +
>>>>>> + /* get pointer to global configuration */
>>>>>> + mcfg = rte_eal_get_configuration()->mem_config;
>>>>>> +
>>>>>> + for (i = 0; i < RTE_MAX_MEMSEG; i++) {
>>>>>> + if (mcfg->memseg[i].addr == NULL)
>>>>>> + break;
>>>>>>
>>>>>>
>>>>> Looking at some other code, it looks like NULL entries might exists. So
>>> should a continue; rather than a break; be used here?
>>>
>>>
>>> I do not think so. memsegs are allocated sequentially, so first with addr
>> as NULL implies no more memsegs.
>>
>
> I was referring to the mem walk functions, rte_memseg_list_walk(). Maybe
> some having more experience with this area can review/comment.
>
>
This patchset applies to 17.11.3 which has not that function implemented.
You can see what rte_eal_get_physmem_size and rte_dump_physmem_layout do in
lib/librte_eal/common/eal_common_memory.c file regarding memseg "walks"
when addr is NULL.
>
>
>>
>>
>>> +
>>>
>>>> + if (mcfg->memseg[i].iova & mask) {
>>>>>> + RTE_LOG(INFO, EAL,
>>>>>> + "memseg[%d] iova %"PRIx64" out of
>>>>>> range:\n",
>>>>>> + i, mcfg->memseg[i].iova);
>>>>>> +
>>>>>> + RTE_LOG(INFO, EAL, "\tusing dma mask
>>>>>> %"PRIx64"\n",
>>>>>> + mask);
>>>>>> + return -1;
>>>>>> + }
>>>>>> + }
>>>>>> +
>>>>>> + return 0;
>>>>>> +}
>>>>>> +
>>>>>> /* return the number of memory channels */
>>>>>> unsigned rte_memory_get_nchannel(void)
>>>>>> {
>>>>>> diff --git a/lib/librte_eal/common/include/rte_memory.h
>>>>>> b/lib/librte_eal/common/include/rte_memory.h
>>>>>> index 80a8fc0..b2a0168 100644
>>>>>> --- a/lib/librte_eal/common/include/rte_memory.h
>>>>>> +++ b/lib/librte_eal/common/include/rte_memory.h
>>>>>> @@ -209,6 +209,9 @@ struct rte_memseg {
>>>>>> */
>>>>>> unsigned rte_memory_get_nrank(void);
>>>>>>
>>>>>> +/* check memsegs iovas are within a range based on dma mask */
>>>>>> +int rte_eal_check_dma_mask(uint8_t maskbits);
>>>>>> +
>>>>>> /**
>>>>>> * Drivers based on uio will not load unless physical
>>>>>> * addresses are obtainable. It is only possible to get
>>>>>> diff --git a/lib/librte_eal/rte_eal_version.map
>>>>>> b/lib/librte_eal/rte_eal_version.map
>>>>>> index f4f46c1..aa6cf87 100644
>>>>>> --- a/lib/librte_eal/rte_eal_version.map
>>>>>> +++ b/lib/librte_eal/rte_eal_version.map
>>>>>> @@ -184,6 +184,7 @@ DPDK_17.11 {
>>>>>>
>>>>>> rte_eal_create_uio_dev;
>>>>>> rte_bus_get_iommu_class;
>>>>>> + rte_eal_check_dma_mask;
>>>>>> rte_eal_has_pci;
>>>>>> rte_eal_iova_mode;
>>>>>> rte_eal_mbuf_default_mempool_ops;
>>>>>> --
>>>>>> 1.9.1
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>
>>>
>
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [dpdk-dev] [PATCH v3 2/6] ethdev: add function for checking IOVAs by a device
2018-07-04 12:53 [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask Alejandro Lucero
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses Alejandro Lucero
@ 2018-07-04 12:53 ` Alejandro Lucero
2018-07-07 17:30 ` Andrew Rybchenko
2018-07-10 8:57 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 3/6] bus/pci: use IOVAs check when setting IOVA mode Alejandro Lucero
` (3 subsequent siblings)
5 siblings, 2 replies; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-04 12:53 UTC (permalink / raw)
To: dev; +Cc: stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
A PMD should invoke this function for checking memsegs iovas are within
the supported range by the device.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
---
lib/librte_ether/rte_ethdev.h | 13 +++++++++++++
lib/librte_ether/rte_ethdev_version.map | 1 +
2 files changed, 14 insertions(+)
diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
index eba11ca..e51a432 100644
--- a/lib/librte_ether/rte_ethdev.h
+++ b/lib/librte_ether/rte_ethdev.h
@@ -2799,6 +2799,19 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
int rte_eth_dev_set_vlan_pvid(uint16_t port_id, uint16_t pvid, int on);
/**
+ * check device dma mask within expected range based on dma mask.
+ *
+ * @param maskbits
+ * mask length in bits
+ *
+ */
+static inline int
+rte_eth_dev_check_dma_mask(uint8_t maskbits)
+{
+ return rte_eal_check_dma_mask(maskbits);
+}
+
+/**
*
* Retrieve a burst of input packets from a receive queue of an Ethernet
* device. The retrieved packets are stored in *rte_mbuf* structures whose
diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
index e9681ac..0b11b8a 100644
--- a/lib/librte_ether/rte_ethdev_version.map
+++ b/lib/librte_ether/rte_ethdev_version.map
@@ -191,6 +191,7 @@ DPDK_17.08 {
DPDK_17.11 {
global:
+ rte_eth_dev_check_dma_mask;
rte_eth_dev_get_sec_ctx;
rte_eth_dev_pool_ops_supported;
rte_eth_dev_reset;
--
1.9.1
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [PATCH v3 2/6] ethdev: add function for checking IOVAs by a device
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 2/6] ethdev: add function for checking IOVAs by a device Alejandro Lucero
@ 2018-07-07 17:30 ` Andrew Rybchenko
2018-07-10 8:57 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
1 sibling, 0 replies; 25+ messages in thread
From: Andrew Rybchenko @ 2018-07-07 17:30 UTC (permalink / raw)
To: Alejandro Lucero, dev
Cc: stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
On 04.07.2018 15:53, Alejandro Lucero wrote:
> A PMD should invoke this function for checking memsegs iovas are within
> the supported range by the device.
>
> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
> ---
> lib/librte_ether/rte_ethdev.h | 13 +++++++++++++
> lib/librte_ether/rte_ethdev_version.map | 1 +
> 2 files changed, 14 insertions(+)
>
> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.h
> index eba11ca..e51a432 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -2799,6 +2799,19 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t port_id,
> int rte_eth_dev_set_vlan_pvid(uint16_t port_id, uint16_t pvid, int on);
>
> /**
> + * check device dma mask within expected range based on dma mask.
> + *
> + * @param maskbits
> + * mask length in bits
> + *
> + */
> +static inline int
> +rte_eth_dev_check_dma_mask(uint8_t maskbits)
> +{
> + return rte_eal_check_dma_mask(maskbits);
I'm afraid I don't understand why do we need the wrapper.
May PMD use EAL function directly?
> +}
> +
> +/**
> *
> * Retrieve a burst of input packets from a receive queue of an Ethernet
> * device. The retrieved packets are stored in *rte_mbuf* structures whose
> diff --git a/lib/librte_ether/rte_ethdev_version.map b/lib/librte_ether/rte_ethdev_version.map
> index e9681ac..0b11b8a 100644
> --- a/lib/librte_ether/rte_ethdev_version.map
> +++ b/lib/librte_ether/rte_ethdev_version.map
> @@ -191,6 +191,7 @@ DPDK_17.08 {
> DPDK_17.11 {
> global:
>
> + rte_eth_dev_check_dma_mask;
> rte_eth_dev_get_sec_ctx;
> rte_eth_dev_pool_ops_supported;
> rte_eth_dev_reset;
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 2/6] ethdev: add function for checking IOVAs by a device
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 2/6] ethdev: add function for checking IOVAs by a device Alejandro Lucero
2018-07-07 17:30 ` Andrew Rybchenko
@ 2018-07-10 8:57 ` Eelco Chaudron
2018-07-10 9:42 ` Alejandro Lucero
1 sibling, 1 reply; 25+ messages in thread
From: Eelco Chaudron @ 2018-07-10 8:57 UTC (permalink / raw)
To: Alejandro Lucero
Cc: dev, stable, anatoly.burakov, maxime.coquelin, ferruh.yigit,
Andrew Rybchenko
On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
> A PMD should invoke this function for checking memsegs iovas are
> within
> the supported range by the device.
>
> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Agree with Andrew here, why not call rte_eal_check_dma_mask() directly
in nfp_net_txq_full()?
> ---
> lib/librte_ether/rte_ethdev.h | 13 +++++++++++++
> lib/librte_ether/rte_ethdev_version.map | 1 +
> 2 files changed, 14 insertions(+)
>
> diff --git a/lib/librte_ether/rte_ethdev.h
> b/lib/librte_ether/rte_ethdev.h
> index eba11ca..e51a432 100644
> --- a/lib/librte_ether/rte_ethdev.h
> +++ b/lib/librte_ether/rte_ethdev.h
> @@ -2799,6 +2799,19 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t
> port_id,
> int rte_eth_dev_set_vlan_pvid(uint16_t port_id, uint16_t pvid, int
> on);
>
> /**
> + * check device dma mask within expected range based on dma mask.
> + *
> + * @param maskbits
> + * mask length in bits
> + *
> + */
> +static inline int
> +rte_eth_dev_check_dma_mask(uint8_t maskbits)
> +{
> + return rte_eal_check_dma_mask(maskbits);
> +}
> +
> +/**
> *
> * Retrieve a burst of input packets from a receive queue of an
> Ethernet
> * device. The retrieved packets are stored in *rte_mbuf* structures
> whose
> diff --git a/lib/librte_ether/rte_ethdev_version.map
> b/lib/librte_ether/rte_ethdev_version.map
> index e9681ac..0b11b8a 100644
> --- a/lib/librte_ether/rte_ethdev_version.map
> +++ b/lib/librte_ether/rte_ethdev_version.map
> @@ -191,6 +191,7 @@ DPDK_17.08 {
> DPDK_17.11 {
> global:
>
> + rte_eth_dev_check_dma_mask;
> rte_eth_dev_get_sec_ctx;
> rte_eth_dev_pool_ops_supported;
> rte_eth_dev_reset;
> --
> 1.9.1
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 2/6] ethdev: add function for checking IOVAs by a device
2018-07-10 8:57 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
@ 2018-07-10 9:42 ` Alejandro Lucero
2018-07-10 9:44 ` Alejandro Lucero
0 siblings, 1 reply; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-10 9:42 UTC (permalink / raw)
To: Eelco Chaudron
Cc: dev, stable, Burakov, Anatoly, Maxime Coquelin, Ferruh Yigit,
Andrew Rybchenko
On Tue, Jul 10, 2018 at 9:57 AM, Eelco Chaudron <echaudro@redhat.com> wrote:
>
>
> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>
> A PMD should invoke this function for checking memsegs iovas are within
>> the supported range by the device.
>>
>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>>
>
> Agree with Andrew here, why not call rte_eal_check_dma_mask() directly in
> nfp_net_txq_full()?
>
>
My idea was to add this indirection for handling dma mask when just part of
the IOVAs are not usable. Now, if the dma mask finds a problem, the PMD
does not make any port initialization.
Memory management is changing and ideally an app should just allocate
memory safe to be used by the PMD when that memory is going to be used for
sending or receiving data, what is not always the case.
It is true this indirection is not being used for any purpose by now, so
yes, I could use a direct call the the EAL one.
>
> ---
>> lib/librte_ether/rte_ethdev.h | 13 +++++++++++++
>> lib/librte_ether/rte_ethdev_version.map | 1 +
>> 2 files changed, 14 insertions(+)
>>
>> diff --git a/lib/librte_ether/rte_ethdev.h b/lib/librte_ether/rte_ethdev.
>> h
>> index eba11ca..e51a432 100644
>> --- a/lib/librte_ether/rte_ethdev.h
>> +++ b/lib/librte_ether/rte_ethdev.h
>> @@ -2799,6 +2799,19 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t
>> port_id,
>> int rte_eth_dev_set_vlan_pvid(uint16_t port_id, uint16_t pvid, int on);
>>
>> /**
>> + * check device dma mask within expected range based on dma mask.
>> + *
>> + * @param maskbits
>> + * mask length in bits
>> + *
>> + */
>> +static inline int
>> +rte_eth_dev_check_dma_mask(uint8_t maskbits)
>> +{
>> + return rte_eal_check_dma_mask(maskbits);
>> +}
>> +
>> +/**
>> *
>> * Retrieve a burst of input packets from a receive queue of an Ethernet
>> * device. The retrieved packets are stored in *rte_mbuf* structures
>> whose
>> diff --git a/lib/librte_ether/rte_ethdev_version.map
>> b/lib/librte_ether/rte_ethdev_version.map
>> index e9681ac..0b11b8a 100644
>> --- a/lib/librte_ether/rte_ethdev_version.map
>> +++ b/lib/librte_ether/rte_ethdev_version.map
>> @@ -191,6 +191,7 @@ DPDK_17.08 {
>> DPDK_17.11 {
>> global:
>>
>> + rte_eth_dev_check_dma_mask;
>> rte_eth_dev_get_sec_ctx;
>> rte_eth_dev_pool_ops_supported;
>> rte_eth_dev_reset;
>> --
>> 1.9.1
>>
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 2/6] ethdev: add function for checking IOVAs by a device
2018-07-10 9:42 ` Alejandro Lucero
@ 2018-07-10 9:44 ` Alejandro Lucero
0 siblings, 0 replies; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-10 9:44 UTC (permalink / raw)
To: Eelco Chaudron
Cc: dev, stable, Burakov, Anatoly, Maxime Coquelin, Ferruh Yigit,
Andrew Rybchenko
On Tue, Jul 10, 2018 at 10:42 AM, Alejandro Lucero <
alejandro.lucero@netronome.com> wrote:
>
>
> On Tue, Jul 10, 2018 at 9:57 AM, Eelco Chaudron <echaudro@redhat.com>
> wrote:
>
>>
>>
>> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>>
>> A PMD should invoke this function for checking memsegs iovas are within
>>> the supported range by the device.
>>>
>>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>>>
>>
>> Agree with Andrew here, why not call rte_eal_check_dma_mask() directly in
>> nfp_net_txq_full()?
>>
>>
BTW, IOVA checking can not be done inside PMD functions like that one
because that is in the fast path. Any IOVA checking needs to be done at
initialization time.
>
> My idea was to add this indirection for handling dma mask when just part
> of the IOVAs are not usable. Now, if the dma mask finds a problem, the PMD
> does not make any port initialization.
>
> Memory management is changing and ideally an app should just allocate
> memory safe to be used by the PMD when that memory is going to be used for
> sending or receiving data, what is not always the case.
>
> It is true this indirection is not being used for any purpose by now, so
> yes, I could use a direct call the the EAL one.
>
>
>>
>> ---
>>> lib/librte_ether/rte_ethdev.h | 13 +++++++++++++
>>> lib/librte_ether/rte_ethdev_version.map | 1 +
>>> 2 files changed, 14 insertions(+)
>>>
>>> diff --git a/lib/librte_ether/rte_ethdev.h
>>> b/lib/librte_ether/rte_ethdev.h
>>> index eba11ca..e51a432 100644
>>> --- a/lib/librte_ether/rte_ethdev.h
>>> +++ b/lib/librte_ether/rte_ethdev.h
>>> @@ -2799,6 +2799,19 @@ int rte_eth_dev_set_vlan_ether_type(uint16_t
>>> port_id,
>>> int rte_eth_dev_set_vlan_pvid(uint16_t port_id, uint16_t pvid, int on);
>>>
>>> /**
>>> + * check device dma mask within expected range based on dma mask.
>>> + *
>>> + * @param maskbits
>>> + * mask length in bits
>>> + *
>>> + */
>>> +static inline int
>>> +rte_eth_dev_check_dma_mask(uint8_t maskbits)
>>> +{
>>> + return rte_eal_check_dma_mask(maskbits);
>>> +}
>>> +
>>> +/**
>>> *
>>> * Retrieve a burst of input packets from a receive queue of an Ethernet
>>> * device. The retrieved packets are stored in *rte_mbuf* structures
>>> whose
>>> diff --git a/lib/librte_ether/rte_ethdev_version.map
>>> b/lib/librte_ether/rte_ethdev_version.map
>>> index e9681ac..0b11b8a 100644
>>> --- a/lib/librte_ether/rte_ethdev_version.map
>>> +++ b/lib/librte_ether/rte_ethdev_version.map
>>> @@ -191,6 +191,7 @@ DPDK_17.08 {
>>> DPDK_17.11 {
>>> global:
>>>
>>> + rte_eth_dev_check_dma_mask;
>>> rte_eth_dev_get_sec_ctx;
>>> rte_eth_dev_pool_ops_supported;
>>> rte_eth_dev_reset;
>>> --
>>> 1.9.1
>>>
>>
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [dpdk-dev] [PATCH v3 3/6] bus/pci: use IOVAs check when setting IOVA mode
2018-07-04 12:53 [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask Alejandro Lucero
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 1/6] mem: add function for checking memsegs IOVAs addresses Alejandro Lucero
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 2/6] ethdev: add function for checking IOVAs by a device Alejandro Lucero
@ 2018-07-04 12:53 ` Alejandro Lucero
2018-07-10 10:14 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 4/6] mem: use address hint for mapping hugepages Alejandro Lucero
` (2 subsequent siblings)
5 siblings, 1 reply; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-04 12:53 UTC (permalink / raw)
To: dev; +Cc: stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
Although VT-d emulation currently only supports 39 bits, it could
be iovas being within that supported range. This patch allows
IOVA mode in such a case.
Indeed, memory initialization code can be modified for using lower
virtual addresses than those used by the kernel for 64 bits processes
by default, and therefore memsegs iovas can use 39 bits or less for
most system. And this is likely 100% true for VMs.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
---
drivers/bus/pci/linux/pci.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
index 74deef3..792c819 100644
--- a/drivers/bus/pci/linux/pci.c
+++ b/drivers/bus/pci/linux/pci.c
@@ -43,6 +43,7 @@
#include <rte_devargs.h>
#include <rte_memcpy.h>
#include <rte_vfio.h>
+#include <rte_memory.h>
#include "eal_private.h"
#include "eal_filesystem.h"
@@ -613,10 +614,12 @@
fclose(fp);
mgaw = ((vtd_cap_reg & VTD_CAP_MGAW_MASK) >> VTD_CAP_MGAW_SHIFT) + 1;
- if (mgaw < X86_VA_WIDTH)
+
+ if (!rte_eal_check_dma_mask(mgaw))
+ return true;
+ else
return false;
- return true;
}
#elif defined(RTE_ARCH_PPC_64)
static bool
@@ -640,13 +643,17 @@
{
struct rte_pci_device *dev = NULL;
struct rte_pci_driver *drv = NULL;
+ int iommu_dma_mask_check_done = 0;
FOREACH_DRIVER_ON_PCIBUS(drv) {
FOREACH_DEVICE_ON_PCIBUS(dev) {
if (!rte_pci_match(drv, dev))
continue;
- if (!pci_one_device_iommu_support_va(dev))
- return false;
+ if (!iommu_dma_mask_check_done) {
+ if (pci_one_device_iommu_support_va(dev) < 0)
+ return false;
+ iommu_dma_mask_check_done = 1;
+ }
}
}
return true;
--
1.9.1
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 3/6] bus/pci: use IOVAs check when setting IOVA mode
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 3/6] bus/pci: use IOVAs check when setting IOVA mode Alejandro Lucero
@ 2018-07-10 10:14 ` Eelco Chaudron
2018-07-10 15:37 ` Alejandro Lucero
0 siblings, 1 reply; 25+ messages in thread
From: Eelco Chaudron @ 2018-07-10 10:14 UTC (permalink / raw)
To: Alejandro Lucero
Cc: dev, stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
> Although VT-d emulation currently only supports 39 bits, it could
> be iovas being within that supported range. This patch allows
> IOVA mode in such a case.
>
> Indeed, memory initialization code can be modified for using lower
> virtual addresses than those used by the kernel for 64 bits processes
> by default, and therefore memsegs iovas can use 39 bits or less for
> most system. And this is likely 100% true for VMs.
>
> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
> ---
> drivers/bus/pci/linux/pci.c | 15 +++++++++++----
> 1 file changed, 11 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
> index 74deef3..792c819 100644
> --- a/drivers/bus/pci/linux/pci.c
> +++ b/drivers/bus/pci/linux/pci.c
> @@ -43,6 +43,7 @@
> #include <rte_devargs.h>
> #include <rte_memcpy.h>
> #include <rte_vfio.h>
> +#include <rte_memory.h>
>
> #include "eal_private.h"
> #include "eal_filesystem.h"
> @@ -613,10 +614,12 @@
> fclose(fp);
>
> mgaw = ((vtd_cap_reg & VTD_CAP_MGAW_MASK) >> VTD_CAP_MGAW_SHIFT) +
> 1;
> - if (mgaw < X86_VA_WIDTH)
> +
> + if (!rte_eal_check_dma_mask(mgaw))
If think in this case we still need to check the X86_VA_WIDTH, i.e.
if (mgaw < X86_VA_WIDTH && !rte_eal_check_dma_mask(mgaw))
> + return true;
> + else
> return false;
>
> - return true;
> }
> #elif defined(RTE_ARCH_PPC_64)
> static bool
> @@ -640,13 +643,17 @@
> {
> struct rte_pci_device *dev = NULL;
> struct rte_pci_driver *drv = NULL;
> + int iommu_dma_mask_check_done = 0;
>
> FOREACH_DRIVER_ON_PCIBUS(drv) {
> FOREACH_DEVICE_ON_PCIBUS(dev) {
> if (!rte_pci_match(drv, dev))
> continue;
> - if (!pci_one_device_iommu_support_va(dev))
> - return false;
> + if (!iommu_dma_mask_check_done) {
> + if (pci_one_device_iommu_support_va(dev) < 0)
> + return false;
> + iommu_dma_mask_check_done = 1;
Not sure why this change? Why do we only need to check one device on the
bus?
In addition, if this is what was intended, rather than a variable you
can return true in this case, or did you intended to clear the
iommu_dma_mask_check_done on every PCI BUS iteration?
> + }
> }
> }
> return true;
> --
> 1.9.1
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 3/6] bus/pci: use IOVAs check when setting IOVA mode
2018-07-10 10:14 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
@ 2018-07-10 15:37 ` Alejandro Lucero
0 siblings, 0 replies; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-10 15:37 UTC (permalink / raw)
To: Eelco Chaudron
Cc: dev, stable, Burakov, Anatoly, Maxime Coquelin, Ferruh Yigit
On Tue, Jul 10, 2018 at 11:14 AM, Eelco Chaudron <echaudro@redhat.com>
wrote:
>
>
> On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
>
> Although VT-d emulation currently only supports 39 bits, it could
>> be iovas being within that supported range. This patch allows
>> IOVA mode in such a case.
>>
>> Indeed, memory initialization code can be modified for using lower
>> virtual addresses than those used by the kernel for 64 bits processes
>> by default, and therefore memsegs iovas can use 39 bits or less for
>> most system. And this is likely 100% true for VMs.
>>
>> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
>> ---
>> drivers/bus/pci/linux/pci.c | 15 +++++++++++----
>> 1 file changed, 11 insertions(+), 4 deletions(-)
>>
>> diff --git a/drivers/bus/pci/linux/pci.c b/drivers/bus/pci/linux/pci.c
>> index 74deef3..792c819 100644
>> --- a/drivers/bus/pci/linux/pci.c
>> +++ b/drivers/bus/pci/linux/pci.c
>> @@ -43,6 +43,7 @@
>> #include <rte_devargs.h>
>> #include <rte_memcpy.h>
>> #include <rte_vfio.h>
>> +#include <rte_memory.h>
>>
>> #include "eal_private.h"
>> #include "eal_filesystem.h"
>> @@ -613,10 +614,12 @@
>> fclose(fp);
>>
>> mgaw = ((vtd_cap_reg & VTD_CAP_MGAW_MASK) >> VTD_CAP_MGAW_SHIFT)
>> + 1;
>> - if (mgaw < X86_VA_WIDTH)
>> +
>> + if (!rte_eal_check_dma_mask(mgaw))
>>
>
> If think in this case we still need to check the X86_VA_WIDTH, i.e.
> if (mgaw < X86_VA_WIDTH && !rte_eal_check_dma_mask(mgaw))
>
>
> + return true;
>> + else
>> return false;
>>
>> - return true;
>> }
>> #elif defined(RTE_ARCH_PPC_64)
>> static bool
>> @@ -640,13 +643,17 @@
>> {
>> struct rte_pci_device *dev = NULL;
>> struct rte_pci_driver *drv = NULL;
>> + int iommu_dma_mask_check_done = 0;
>>
>> FOREACH_DRIVER_ON_PCIBUS(drv) {
>> FOREACH_DEVICE_ON_PCIBUS(dev) {
>> if (!rte_pci_match(drv, dev))
>> continue;
>> - if (!pci_one_device_iommu_support_va(dev))
>> - return false;
>> + if (!iommu_dma_mask_check_done) {
>> + if (pci_one_device_iommu_support_va(dev)
>> < 0)
>> + return false;
>> + iommu_dma_mask_check_done = 1;
>>
>
> Not sure why this change? Why do we only need to check one device on the
> bus?
>
>
Because there is just one emulated IOMMU hardware. The limitation in this
case is not in a specific PCI device. And I do not think it is possible to
have two different (emulated or not) IOMMU hardware. Yes, you can have more
than one controller but being same IOMMU type.
> In addition, if this is what was intended, rather than a variable you can
> return true in this case, or did you intended to clear the
> iommu_dma_mask_check_done on every PCI BUS iteration?
>
>
If pci_one_device_iommu_support_va, because the dma check, finds out the
IOVAs are out of range, then the IOVA mode is PA and no further checks are
required. But there could be a PCI device precluding the IOVA VA, so all
the PCI devices need to be processed.
> + }
>> }
>> }
>> return true;
>> --
>> 1.9.1
>>
>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [dpdk-dev] [PATCH v3 4/6] mem: use address hint for mapping hugepages
2018-07-04 12:53 [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask Alejandro Lucero
` (2 preceding siblings ...)
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 3/6] bus/pci: use IOVAs check when setting IOVA mode Alejandro Lucero
@ 2018-07-04 12:53 ` Alejandro Lucero
2018-07-10 11:15 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 5/6] net/nfp: check hugepages IOVAs based on DMA mask Alejandro Lucero
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 6/6] net/nfp: support IOVA VA mode Alejandro Lucero
5 siblings, 1 reply; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-04 12:53 UTC (permalink / raw)
To: dev; +Cc: stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
Linux kernel uses a really high address as starting address for
serving mmaps calls. If there exists addressing limitations and
IOVA mode is VA, this starting address is likely too high for
those devices. However, it is possible to use a lower address in
the process virtual address space as with 64 bits there is a lot
of available space.
This patch adds an address hint as starting address for 64 bits
systems.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
---
lib/librte_eal/linuxapp/eal/eal_memory.c | 55 ++++++++++++++++++++++++++------
1 file changed, 46 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_memory.c b/lib/librte_eal/linuxapp/eal/eal_memory.c
index 17c20d4..2ed4017 100644
--- a/lib/librte_eal/linuxapp/eal/eal_memory.c
+++ b/lib/librte_eal/linuxapp/eal/eal_memory.c
@@ -88,6 +88,23 @@
static uint64_t baseaddr_offset;
+#ifdef RTE_ARCH_64
+/*
+ * Linux kernel uses a really high address as starting address for serving
+ * mmaps calls. If there exists addressing limitations and IOVA mode is VA,
+ * this starting address is likely too high for those devices. However, it
+ * is possible to use a lower address in the process virtual address space
+ * as with 64 bits there is a lot of available space.
+ *
+ * Current known limitations are 39 or 40 bits. Setting the starting address
+ * at 4GB implies there are 508GB or 1020GB for mapping the available
+ * hugepages. This is likely enough for most systems, although a device with
+ * addressing limitations should call rte_dev_check_dma_mask for ensuring all
+ * memory is within supported range.
+ */
+static uint64_t baseaddr = 0x100000000;
+#endif
+
static bool phys_addrs_available = true;
#define RANDOMIZE_VA_SPACE_FILE "/proc/sys/kernel/randomize_va_space"
@@ -250,6 +267,23 @@
}
}
+static void *
+get_addr_hint(void)
+{
+ if (internal_config.base_virtaddr != 0) {
+ return (void *) (uintptr_t)
+ (internal_config.base_virtaddr +
+ baseaddr_offset);
+ } else {
+#ifdef RTE_ARCH_64
+ return (void *) (uintptr_t) (baseaddr +
+ baseaddr_offset);
+#else
+ return NULL;
+#endif
+ }
+}
+
/*
* Try to mmap *size bytes in /dev/zero. If it is successful, return the
* pointer to the mmap'd area and keep *size unmodified. Else, retry
@@ -260,16 +294,10 @@
static void *
get_virtual_area(size_t *size, size_t hugepage_sz)
{
- void *addr;
+ void *addr, *addr_hint;
int fd;
long aligned_addr;
- if (internal_config.base_virtaddr != 0) {
- addr = (void*) (uintptr_t) (internal_config.base_virtaddr +
- baseaddr_offset);
- }
- else addr = NULL;
-
RTE_LOG(DEBUG, EAL, "Ask a virtual area of 0x%zx bytes\n", *size);
fd = open("/dev/zero", O_RDONLY);
@@ -278,7 +306,9 @@
return NULL;
}
do {
- addr = mmap(addr,
+ addr_hint = get_addr_hint();
+
+ addr = mmap(addr_hint,
(*size) + hugepage_sz, PROT_READ,
#ifdef RTE_ARCH_PPC_64
MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB,
@@ -286,8 +316,15 @@
MAP_PRIVATE,
#endif
fd, 0);
- if (addr == MAP_FAILED)
+ if (addr == MAP_FAILED) {
+ /* map failed. Let's try with less memory */
*size -= hugepage_sz;
+ } else if (addr_hint && addr != addr_hint) {
+ /* hint was not used. Try with another offset */
+ munmap(addr, (*size) + hugepage_sz);
+ addr = MAP_FAILED;
+ baseaddr_offset += 0x100000000;
+ }
} while (addr == MAP_FAILED && *size > 0);
if (addr == MAP_FAILED) {
--
1.9.1
^ permalink raw reply [flat|nested] 25+ messages in thread
* Re: [dpdk-dev] [dpdk-stable] [PATCH v3 4/6] mem: use address hint for mapping hugepages
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 4/6] mem: use address hint for mapping hugepages Alejandro Lucero
@ 2018-07-10 11:15 ` Eelco Chaudron
0 siblings, 0 replies; 25+ messages in thread
From: Eelco Chaudron @ 2018-07-10 11:15 UTC (permalink / raw)
To: Alejandro Lucero
Cc: dev, stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
On 4 Jul 2018, at 14:53, Alejandro Lucero wrote:
> Linux kernel uses a really high address as starting address for
> serving mmaps calls. If there exists addressing limitations and
> IOVA mode is VA, this starting address is likely too high for
> those devices. However, it is possible to use a lower address in
> the process virtual address space as with 64 bits there is a lot
> of available space.
>
> This patch adds an address hint as starting address for 64 bits
> systems.
>
> Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
> Acked-by: Anatoly Burakov <anatoly.burakov@intel.com>
Looks good to me!
Cheers,
Eelco
Acked-by: Eelco Chaudron <echaudro@redhat.com>
^ permalink raw reply [flat|nested] 25+ messages in thread
* [dpdk-dev] [PATCH v3 5/6] net/nfp: check hugepages IOVAs based on DMA mask
2018-07-04 12:53 [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask Alejandro Lucero
` (3 preceding siblings ...)
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 4/6] mem: use address hint for mapping hugepages Alejandro Lucero
@ 2018-07-04 12:53 ` Alejandro Lucero
2018-07-10 10:17 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 6/6] net/nfp: support IOVA VA mode Alejandro Lucero
5 siblings, 1 reply; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-04 12:53 UTC (permalink / raw)
To: dev; +Cc: stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
NFP devices can not handle DMA addresses requiring more than
40 bits. This patch uses rte_dev_check_dma_mask with 40 bits
and avoids device initialization if memory out of NFP range.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
---
drivers/net/nfp/nfp_net.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index d9cd047..5976f37 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -2649,6 +2649,14 @@ uint32_t nfp_net_txq_full(struct nfp_net_txq *txq)
pci_dev = RTE_ETH_DEV_TO_PCI(eth_dev);
+ /* NFP can not handle DMA addresses requiring more than 40 bits */
+ if (rte_eth_dev_check_dma_mask(40) < 0) {
+ RTE_LOG(INFO, PMD, "device %s can not be used:",
+ pci_dev->device.name);
+ RTE_LOG(INFO, PMD, "\trestricted dma mask to 40 bits!\n");
+ return -ENODEV;
+ };
+
if ((pci_dev->id.device_id == PCI_DEVICE_ID_NFP4000_PF_NIC) ||
(pci_dev->id.device_id == PCI_DEVICE_ID_NFP6000_PF_NIC)) {
port = get_pf_port_number(eth_dev->data->name);
--
1.9.1
^ permalink raw reply [flat|nested] 25+ messages in thread
* [dpdk-dev] [PATCH v3 6/6] net/nfp: support IOVA VA mode
2018-07-04 12:53 [dpdk-dev] [PATCH v3 0/6] use IOVAs check based on DMA mask Alejandro Lucero
` (4 preceding siblings ...)
2018-07-04 12:53 ` [dpdk-dev] [PATCH v3 5/6] net/nfp: check hugepages IOVAs based on DMA mask Alejandro Lucero
@ 2018-07-04 12:53 ` Alejandro Lucero
2018-07-10 10:18 ` [dpdk-dev] [dpdk-stable] " Eelco Chaudron
5 siblings, 1 reply; 25+ messages in thread
From: Alejandro Lucero @ 2018-07-04 12:53 UTC (permalink / raw)
To: dev; +Cc: stable, anatoly.burakov, maxime.coquelin, ferruh.yigit
NFP can handle IOVA as VA. It requires to check those IOVAs
being in the supported range what is done during initialization.
Signed-off-by: Alejandro Lucero <alejandro.lucero@netronome.com>
---
drivers/net/nfp/nfp_net.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/nfp/nfp_net.c b/drivers/net/nfp/nfp_net.c
index 5976f37..354dec3 100644
--- a/drivers/net/nfp/nfp_net.c
+++ b/drivers/net/nfp/nfp_net.c
@@ -3053,14 +3053,16 @@ static int eth_nfp_pci_remove(struct rte_pci_device *pci_dev)
static struct rte_pci_driver rte_nfp_net_pf_pmd = {
.id_table = pci_id_nfp_pf_net_map,
- .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+ RTE_PCI_DRV_IOVA_AS_VA,
.probe = nfp_pf_pci_probe,
.remove = eth_nfp_pci_remove,
};
static struct rte_pci_driver rte_nfp_net_vf_pmd = {
.id_table = pci_id_nfp_vf_net_map,
- .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC,
+ .drv_flags = RTE_PCI_DRV_NEED_MAPPING | RTE_PCI_DRV_INTR_LSC |
+ RTE_PCI_DRV_IOVA_AS_VA,
.probe = eth_nfp_pci_probe,
.remove = eth_nfp_pci_remove,
};
--
1.9.1
^ permalink raw reply [flat|nested] 25+ messages in thread