* [dpdk-dev] [PATCH 0/5] Add a PA-VA Translation table for DPAAx
@ 2018-09-25 12:54 Shreyansh Jain
  2018-09-25 12:54 ` [dpdk-dev] [PATCH 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
                   ` (5 more replies)
  0 siblings, 6 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-09-25 12:54 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, anatoly.burakov, Shreyansh Jain

::Background::
After the restructuring of memory in last release(s), one of the major
impacts on fslmc/dpaa bus (and its devices) was the performance drop when
using physical addressing.

Previously, it was assumed that physical range was contiguous for any
given request for hugepage memory. That way, whenever a virtual address
was returned, it was easy to fetch the physical equivalent, in almost
constant time.

But, with the memory hotplug series, that assumption was negated. Every
call that device drivers made for rte_mem_virt2iova or rte_mem_virt2phy
was expensive. (Using IOVA_CONTIG is an app dependency which is not a
practical option).

For fslmc, working on Physical or Virtual (IOMMU supported) address is
an optional thing. For dpaa bus, it is not optional and only physical
addressing is supported. Thus, it impacted dpaa bus the most.

::DPAAX PA-VA Table::
 - A simple table containing entries for all physical memory ranges
   available on a particular SoC (in this case, NXP's LS104x and LS20xx
   series, which are handled by dpaa and fslmc bus, respectively).
   As of now, this is SoC dependent for fetching range.
 - We populate the table either through the mempool handler (for
   mempool pinned memory) or through the memory event callbacks (for
   cases where working memory is allocated by application).
 - Though the aim is only to translate addresses for descriptors which
   are Rx'd from devices, this is a generic layer which should work in
   other cases as well (though, not the target of current testing).

::About patches::
Patch 1: There was an issue in existing PA/VA mode reporting being done
         by fslmc bus. This patch fixes it.
Patch 2: Common libraries/components can be a dependency for the bus,
         thus blocking parallel compilation.
Patch 3: Add the library in common/dpaax. This is a single patch as
         functions are mostly inter-linked.
Patch 4~5: Add support in dpaa and fslmc bus, respectively. It is not
         possible to unlink the bus and device drivers, thus, these
         patches have blanket change across all drivers.

::Next Steps::
 - Some optimizations are required to tune the access pattern of the
   table. These would be posted as additional patches.
 - In case there is any possible split of patches, I will post another
   version. But until then, this is the layout.

::Other Notes::
 - There are some checkpatch warnings about 80char limit. I am
   currently ignoring them as I prefer to keep those lines longer for
   readability.

Shreyansh Jain (5):
  bus/fslmc: fix physical addressing check
  drivers: common as dependency for bus
  common/dpaax: add library for PA VA translation table
  dpaa: enable dpaax library
  fslmc: enable dpaax library

 config/common_base                            |   5 +
 config/common_linuxapp                        |   5 +
 drivers/Makefile                              |   1 +
 drivers/bus/dpaa/Makefile                     |   1 +
 drivers/bus/dpaa/dpaa_bus.c                   |   4 +
 drivers/bus/dpaa/meson.build                  |   2 +-
 drivers/bus/dpaa/rte_dpaa_bus.h               |   6 +
 drivers/bus/fslmc/Makefile                    |   1 +
 drivers/bus/fslmc/fslmc_bus.c                 |  24 +
 drivers/bus/fslmc/meson.build                 |   2 +-
 drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c      |   7 -
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h       |  32 +-
 drivers/common/Makefile                       |   4 +
 drivers/common/dpaax/Makefile                 |  31 ++
 drivers/common/dpaax/dpaax_iova_table.c       | 509 ++++++++++++++++++
 drivers/common/dpaax/dpaax_iova_table.h       | 104 ++++
 drivers/common/dpaax/dpaax_logs.h             |  39 ++
 drivers/common/dpaax/meson.build              |  12 +
 .../common/dpaax/rte_common_dpaax_version.map |  12 +
 drivers/common/meson.build                    |   2 +-
 drivers/crypto/dpaa2_sec/Makefile             |   1 +
 drivers/crypto/dpaa_sec/Makefile              |   1 +
 drivers/crypto/dpaa_sec/dpaa_sec.c            |   6 +
 drivers/event/dpaa/Makefile                   |   1 +
 drivers/event/dpaa2/Makefile                  |   2 +
 drivers/mempool/dpaa/Makefile                 |   1 +
 drivers/mempool/dpaa/dpaa_mempool.c           |   3 +
 drivers/mempool/dpaa/dpaa_mempool.h           |   4 +-
 drivers/mempool/dpaa2/Makefile                |   1 +
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c      |  29 +-
 drivers/net/dpaa/Makefile                     |   1 +
 drivers/net/dpaa2/Makefile                    |   1 +
 drivers/raw/dpaa2_cmdif/Makefile              |   2 +
 drivers/raw/dpaa2_qdma/Makefile               |   1 +
 mk/rte.app.mk                                 |   2 +
 35 files changed, 799 insertions(+), 60 deletions(-)
 create mode 100644 drivers/common/dpaax/Makefile
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.c
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.h
 create mode 100644 drivers/common/dpaax/dpaax_logs.h
 create mode 100644 drivers/common/dpaax/meson.build
 create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map

-- 
2.17.1

^ permalink raw reply	[flat|nested] 53+ messages in thread
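The table described under ::DPAAX PA-VA Table:: can be pictured with a minimal, self-contained sketch. It is not the code from this series: it assumes a single contiguous physical range, a 2 MiB chunk size, and a range whose start and length are chunk-aligned. All names (sketch_region, sketch_add, sketch_get_va) are hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed chunk size for this sketch: one slot per 2 MiB of physical memory. */
#define SKETCH_SPLIT      ((uint64_t)1 << 21)
#define SKETCH_SPLIT_MASK (~(SKETCH_SPLIT - 1))

/* One contiguous physical range and the VA recorded for each chunk in it. */
struct sketch_region {
	uint64_t start;    /* first physical address of the range */
	uint64_t len;      /* length of the range in bytes */
	uintptr_t *slots;  /* slots[n] = VA of chunk n, 0 if not populated */
};

/* Allocate zeroed slots; start and len must be non-zero-length and aligned. */
static int
sketch_region_init(struct sketch_region *r, uint64_t start, uint64_t len)
{
	if (len == 0 || ((start | len) & (SKETCH_SPLIT - 1)) != 0)
		return -1;
	r->start = start;
	r->len = len;
	r->slots = calloc((size_t)(len / SKETCH_SPLIT), sizeof(*r->slots));
	return r->slots != NULL ? 0 : -1;
}

/* Record that [pa, pa + len) is mapped at va, filling one slot per chunk. */
static int
sketch_add(struct sketch_region *r, uint64_t pa, uintptr_t va, uint64_t len)
{
	uint64_t first = pa & SKETCH_SPLIT_MASK;            /* chunk holding pa */
	uint64_t end = pa + len;
	uintptr_t va_chunk = va - (uintptr_t)(pa - first);  /* VA of that chunk */

	if (len == 0 || pa < r->start || end > r->start + r->len)
		return -1;

	for (uint64_t p = first; p < end; p += SKETCH_SPLIT) {
		r->slots[(p - r->start) / SKETCH_SPLIT] = va_chunk;
		va_chunk += (uintptr_t)SKETCH_SPLIT;
	}
	return 0;
}

/* Translate pa to a VA in constant time; NULL if out of range or unpopulated. */
static void *
sketch_get_va(const struct sketch_region *r, uint64_t pa)
{
	uintptr_t base;

	if (pa < r->start || pa >= r->start + r->len)
		return NULL;

	base = r->slots[(pa - r->start) / SKETCH_SPLIT];
	if (base == 0)
		return NULL;

	return (void *)(base + (uintptr_t)(pa & (SKETCH_SPLIT - 1)));
}
```

The lookup is an index computation plus one load, which is the property the cover letter is after. A table covering several ranges would first have to pick the right range, and that step is left out of this sketch.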
* [dpdk-dev] [PATCH 1/5] bus/fslmc: fix physical addressing check
  2018-09-25 12:54 [dpdk-dev] [PATCH 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
@ 2018-09-25 12:54 ` Shreyansh Jain
  2018-09-25 12:54 ` [dpdk-dev] [PATCH 2/5] drivers: common as dependency for bus Shreyansh Jain
                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-09-25 12:54 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, anatoly.burakov, Shreyansh Jain, hemant.agrawal

In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported
class is RTE_IOVA_PA.

Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA")
Cc: hemant.agrawal@nxp.com

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/bus/fslmc/fslmc_bus.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index d2900edc5..f5135e538 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -487,6 +487,10 @@ rte_dpaa2_get_iommu_class(void)
 	bool is_vfio_noiommu_enabled = 1;
 	bool has_iova_va;
 
+#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
+	return RTE_IOVA_PA;
+#endif
+
 	if (TAILQ_EMPTY(&rte_fslmc_bus.device_list))
 		return RTE_IOVA_DC;
 
-- 
2.17.1

^ permalink raw reply	[flat|nested] 53+ messages in thread
* [dpdk-dev] [PATCH 2/5] drivers: common as dependency for bus
  2018-09-25 12:54 [dpdk-dev] [PATCH 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
  2018-09-25 12:54 ` [dpdk-dev] [PATCH 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
@ 2018-09-25 12:54 ` Shreyansh Jain
  2018-09-25 12:54 ` [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain
                   ` (3 subsequent siblings)
  5 siblings, 0 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-09-25 12:54 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, anatoly.burakov, Shreyansh Jain

It is possible that a bus requires a common library for compilation.
Prior to this patch, bus and common were compiled in parallel. With
this patch, bus depends on common and is built after it.

This is especially important for the DPAA/FSLMC buses which are
going to use the common/dpaax library.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/Makefile b/drivers/Makefile
index 75660765e..7d5da5d9f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -5,6 +5,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-y += common
 DIRS-y += bus
+DEPDIRS-bus := common
 DIRS-y += mempool
 DEPDIRS-mempool := common bus
 DIRS-y += net
-- 
2.17.1

^ permalink raw reply	[flat|nested] 53+ messages in thread
* [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table
  2018-09-25 12:54 [dpdk-dev] [PATCH 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
  2018-09-25 12:54 ` [dpdk-dev] [PATCH 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
  2018-09-25 12:54 ` [dpdk-dev] [PATCH 2/5] drivers: common as dependency for bus Shreyansh Jain
@ 2018-09-25 12:54 ` Shreyansh Jain
  2018-09-25 13:28   ` Burakov, Anatoly
  2018-09-25 12:54 ` [dpdk-dev] [PATCH 4/5] dpaa: enable dpaax library Shreyansh Jain
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 53+ messages in thread
From: Shreyansh Jain @ 2018-09-25 12:54 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: dev, anatoly.burakov, Shreyansh Jain

A common library, valid for dpaaX drivers, which is used to maintain
a local copy of PA->VA translations.

In case of physical addressing mode (one of the options for FSLMC, and
the only option for DPAA bus), the addresses of descriptors Rx'd are
physical. These need to be converted into equivalent VA for rte_mbuf
and other similar calls.

Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This
library is an attempt to reduce the overall cost associated with
this translation.

A small table is maintained, containing continuous entries
representing a contiguous physical range. Each of these entries
stores the equivalent VA, which is fed during mempool creation, or
memory allocation/deallocation callbacks.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- config/common_base | 5 + config/common_linuxapp | 5 + drivers/common/Makefile | 4 + drivers/common/dpaax/Makefile | 31 ++ drivers/common/dpaax/dpaax_iova_table.c | 509 ++++++++++++++++++ drivers/common/dpaax/dpaax_iova_table.h | 104 ++++ drivers/common/dpaax/dpaax_logs.h | 39 ++ drivers/common/dpaax/meson.build | 12 + .../common/dpaax/rte_common_dpaax_version.map | 12 + drivers/common/meson.build | 2 +- 10 files changed, 722 insertions(+), 1 deletion(-) create mode 100644 drivers/common/dpaax/Makefile create mode 100644 drivers/common/dpaax/dpaax_iova_table.c create mode 100644 drivers/common/dpaax/dpaax_iova_table.h create mode 100644 drivers/common/dpaax/dpaax_logs.h create mode 100644 drivers/common/dpaax/meson.build create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map diff --git a/config/common_base b/config/common_base index 6eb65ba4e..ba3dd05bc 100644 --- a/config/common_base +++ b/config/common_base @@ -138,6 +138,11 @@ CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE=n # CONFIG_RTE_ETHDEV_TX_PREPARE_NOOP=n +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=n + # # Compile the Intel FPGA bus # diff --git a/config/common_linuxapp b/config/common_linuxapp index 9c5ea9d89..57a847f3e 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -29,6 +29,11 @@ CONFIG_RTE_PROC_INFO=y CONFIG_RTE_LIBRTE_VMBUS=y CONFIG_RTE_LIBRTE_NETVSC_PMD=y +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=y + # NXP DPAA BUS and drivers CONFIG_RTE_LIBRTE_DPAA_BUS=y CONFIG_RTE_LIBRTE_DPAA_MEMPOOL=y diff --git a/drivers/common/Makefile b/drivers/common/Makefile index 5f72da0ed..1a5a6706c 100644 --- a/drivers/common/Makefile +++ b/drivers/common/Makefile @@ -12,4 +12,8 @@ ifeq ($(CONFIG_RTE_LIBRTE_MVPP2_PMD),y) DIRS-y += mvep endif +ifeq ($(CONFIG_RTE_LIBRTE_COMMON_DPAAX),y) +DIRS-y += dpaax +endif + include $(RTE_SDK)/mk/rte.subdir.mk diff --git 
a/drivers/common/dpaax/Makefile b/drivers/common/dpaax/Makefile new file mode 100644 index 000000000..94d2cf0ce --- /dev/null +++ b/drivers/common/dpaax/Makefile @@ -0,0 +1,31 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright 2018 NXP +# + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# library name +# +LIB = librte_common_dpaax.a + +CFLAGS += -DALLOW_EXPERIMENTAL_API +CFLAGS += -O3 +CFLAGS += $(WERROR_FLAGS) + +# versioning export map +EXPORT_MAP := rte_common_dpaax_version.map + +# library version +LIBABIVER := 1 + +# +# all source are stored in SRCS-y +# +SRCS-y += dpaax_iova_table.c + +LDLIBS += -lrte_eal + +SYMLINK-y-include += dpaax_iova_table.h + +include $(RTE_SDK)/mk/rte.lib.mk \ No newline at end of file diff --git a/drivers/common/dpaax/dpaax_iova_table.c b/drivers/common/dpaax/dpaax_iova_table.c new file mode 100644 index 000000000..73acd1646 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.c @@ -0,0 +1,509 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#include <rte_memory.h> + +#include "dpaax_iova_table.h" +#include "dpaax_logs.h" + +/* Global dpaax logger identifier */ +int dpaax_logger; + +/* Global table reference */ +struct dpaax_iova_table *dpaax_iova_table_p; + +static int dpaax_handle_memevents(void); + +/* A structure representing the device-tree node available in /proc/device-tree. + */ +struct reg_node { + phys_addr_t addr; + size_t len; +}; + +/* A ntohll equivalent routine + * XXX: This is only applicable for 64 bit environment. 
+ */ +static void +rotate_8(unsigned char *arr) +{ + uint32_t temp; + uint32_t *first_half; + uint32_t *second_half; + + first_half = (uint32_t *)(arr); + second_half = (uint32_t *)(arr + 4); + + temp = *first_half; + *first_half = *second_half; + *second_half = temp; + + *first_half = ntohl(*first_half); + *second_half = ntohl(*second_half); +} + +/* read_memory_nodes + * Memory layout for DPAAx platforms (LS1043, LS1046, LS1088, LS2088, LX2160) + * are populated by Uboot and available in device tree: + * /proc/device-tree/memory@<address>/reg <= register. + * Entries are of the form: + * (<8 byte start addr><8 byte length>)(..more similar blocks of start,len>).. + * + * @param count + * OUT populate number of entries found in memory node + * @return + * Pointer to array of reg_node elements, count size + */ +static struct reg_node * +read_memory_node(unsigned int *count) +{ + int fd, ret, i; + unsigned int j; + glob_t result = {0}; + struct stat statbuf = {0}; + char file_data[MEM_NODE_FILE_LEN]; + struct reg_node *nodes = NULL; + + *count = 0; + + ret = glob(MEM_NODE_PATH_GLOB, 0, NULL, &result); + if (ret != 0) { + DPAAX_ERR("Unable to glob device-tree memory node: (%s)(%d)", + MEM_NODE_PATH_GLOB, ret); + goto out; + } + + if (result.gl_pathc != 1) { + /* Either more than one memory@<addr> node found, or none. + * In either case, cannot work ahead. + */ + DPAAX_ERR("Found (%zu) entries in device-tree. 
Not supported!", + result.gl_pathc); + goto out; + } + + DPAAX_DEBUG("Opening and parsing device-tree node: (%s)", + result.gl_pathv[0]); + fd = open(result.gl_pathv[0], O_RDONLY); + if (fd < 0) { + DPAAX_ERR("Unable to open the device-tree node: (%s)(fd=%d)", + MEM_NODE_PATH_GLOB, fd); + goto cleanup; + } + + /* Stat to get the file size */ + ret = fstat(fd, &statbuf); + if (ret != 0) { + DPAAX_ERR("Unable to get device-tree memory node size."); + goto cleanup; + } + + DPAAX_DEBUG("Size of device-tree mem node: %lu", statbuf.st_size); + if (statbuf.st_size > MEM_NODE_FILE_LEN) { + DPAAX_WARN("More memory nodes available than assumed."); + DPAAX_WARN("System may not work properly!"); + } + + ret = read(fd, file_data, statbuf.st_size > MEM_NODE_FILE_LEN ? + MEM_NODE_FILE_LEN : statbuf.st_size); + if (ret <= 0) { + DPAAX_ERR("Unable to read device-tree memory node: (%d)", ret); + goto cleanup; + } + + /* The reg node should be multiple of 16 bytes, 8 bytes each for addr + * and len. + */ + *count = (statbuf.st_size / 16); + if ((*count) <= 0 || (statbuf.st_size % 16 != 0)) { + DPAAX_ERR("Invalid memory node values or count. 
(size=%lu)", + statbuf.st_size); + goto cleanup; + } + + /* each entry is of 16 bytes, and size/16 is total count of entries */ + nodes = malloc(sizeof(struct reg_node) * (*count)); + if (!nodes) { + DPAAX_ERR("Failure in allocating working memory."); + goto cleanup; + } + memset(nodes, 0, sizeof(struct reg_node) * (*count)); + + for (i = 0, j = 0; i < (statbuf.st_size) && j < (*count); i += 16, j++) { + memcpy(&nodes[j], file_data + i, 16); + /* Rotate (ntohl) each 8 byte entry */ + rotate_8((unsigned char *)(&(nodes[j].addr))); + rotate_8((unsigned char *)(&(nodes[j].len))); + } + + DPAAX_DEBUG("Device-tree memory node data:"); + do { + DPAAX_DEBUG("\n %08" PRIx64 " %08zu", nodes[j].addr, nodes[j].len); + } while (--j); + +cleanup: + close(fd); + globfree(&result); +out: + return nodes; +} + +int +dpaax_iova_table_populate(void) +{ + int ret; + unsigned int i, node_count; + size_t tot_memory_size, total_table_size; + struct reg_node *nodes; + struct dpaax_iovat_element *entry; + + /* dpaax_iova_table_p is a singleton - only one instance should be + * created. + */ + if (dpaax_iova_table_p) { + DPAAX_DEBUG("Multiple allocation attempt for IOVA Table (%p)", + dpaax_iova_table_p); + /* This can be an error case as well - some path not cleaning + * up table - but, for now, it is assumed that if IOVA Table + * pointer is valid, table is allocated. 
+ */ + return 0; + } + + nodes = read_memory_node(&node_count); + if (nodes == NULL || node_count <= 0) { + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + return -1; + } + + tot_memory_size = 0; + for (i = 0; i < node_count; i++) + tot_memory_size += nodes[i].len; + + DPAAX_DEBUG("Total available PA memory size: %zu", tot_memory_size); + + /* Total table size = meta data + tot_memory_size/8 */ + total_table_size = sizeof(struct dpaax_iova_table) + + (sizeof(struct dpaax_iovat_element) * node_count) + + ((tot_memory_size / DPAAX_MEM_SPLIT) * sizeof(uint64_t)); + + /* TODO: This memory doesn't need to shared but needs to be always + * pinned to RAM (no swap out) - using hugepage rather than malloc + */ + dpaax_iova_table_p = rte_zmalloc(NULL, total_table_size, 0); + if (dpaax_iova_table_p == NULL) { + DPAAX_WARN("Unable to allocate memory for PA->VA Table;"); + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + free(nodes); + return -1; + } + + /* Initialize table */ + dpaax_iova_table_p->count = node_count; + entry = dpaax_iova_table_p->entries; + + DPAAX_DEBUG("IOVA Table entries: (entry start = %p)", (void *)entry); + DPAAX_DEBUG("\t(entry),(start),(len),(next)"); + + for (i = 0; i < node_count; i++) { + /* dpaax_iova_table_p + * | dpaax_iova_table_p->entries + * | | + * | | + * V V + * +------+------+-------+---+----------+---------+--- + * |iova_ |entry | entry | | pages | pages | + * |table | 1 | 2 |...| entry 1 | entry2 | + * +-----'+.-----+-------+---+;---------+;--------+--- + * \ \ / / + * `~~~~~~|~~~~~>pages / + * \ / + * `~~~~~~~~~~~>pages + */ + entry[i].start = nodes[i].addr; + entry[i].len = nodes[i].len; + if (i > 0) + entry[i].pages = entry[i-1].pages + + ((entry[i-1].len/DPAAX_MEM_SPLIT)); + else + entry[i].pages = (uint64_t *)((unsigned char *)entry + + (sizeof(struct dpaax_iovat_element) * + node_count)); + + DPAAX_DEBUG("\t(%u),(%8"PRIx64"),(%8zu),(%8p)", 
+ i, entry[i].start, entry[i].len, entry[i].pages); + } + + /* Release memory associated with nodes array - not required now */ + free(nodes); + + DPAAX_DEBUG("Adding mem-event handler\n"); + ret = dpaax_handle_memevents(); + if (ret) { + DPAAX_ERR("Unable to add mem-event handler"); + DPAAX_WARN("Cases with non-buffer pool mem won't work!"); + } + + return 0; +} + +void +dpaax_iova_table_depopulate(void) +{ + if (dpaax_iova_table_p == NULL) + return; + + rte_free(dpaax_iova_table_p->entries); + dpaax_iova_table_p = NULL; + + DPAAX_DEBUG("IOVA Table cleanedup"); +} + +int +dpaax_iova_table_add(phys_addr_t paddr, void *vaddr, size_t length) +{ + int found = 0; + unsigned int i; + size_t req_length = length, e_offset; + struct dpaax_iovat_element *entry; + uintptr_t align_vaddr; + phys_addr_t align_paddr; + + align_paddr = paddr & DPAAX_MEM_SPLIT_MASK; + align_vaddr = ((uintptr_t)vaddr & DPAAX_MEM_SPLIT_MASK); + + /* Check if paddr is available in table */ + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + if (align_paddr < entry[i].start) { + /* Address lower than start, but not found in previous + * iteration shouldn't exist. + */ + DPAAX_ERR("Add: Incorrect entry for PA->VA Table" + "(%"PRIu64")", paddr); + DPAAX_ERR("Add: Lowest address: %"PRIu64"", + entry[i].start); + return -1; + } + + if (align_paddr > (entry[i].start + entry[i].len)) + continue; + + /* align_paddr >= start && align_paddr < (start + len) */ + found = 1; + + do { + e_offset = ((align_paddr - entry[i].start) / DPAAX_MEM_SPLIT); + /* TODO: Whatif something already exists at this + * location - is that an error? For now, ignoring the + * case. 
+ */ + entry[i].pages[e_offset] = align_vaddr; + DPAAX_DEBUG("Added: vaddr=%zu for Phy:%"PRIu64" at %zu" + " remaining len %zu", align_vaddr, + align_paddr, e_offset, req_length); + + /* Incoming request can be larger than the + * DPAAX_MEM_SPLIT size - in which case, multiple + * entries in entry->pages[] are filled up. + */ + if (req_length <= DPAAX_MEM_SPLIT) + break; + align_paddr += DPAAX_MEM_SPLIT; + align_vaddr += DPAAX_MEM_SPLIT; + req_length -= DPAAX_MEM_SPLIT; + } while (1); + + break; + } + + if (!found) { + /* There might be case where the incoming physical address is + * beyond the address discovered in the memory node of + * device-tree. Specially if some malloc'd area is used by EAL + * and the memevent handlers passes that across. But, this is + * not necessarily an error. + */ + DPAAX_DEBUG("Add: Unable to find slot for vaddr:(%p)," + " phy(%"PRIu64")", + vaddr, paddr); + return -1; + } + + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for vaddr:(%p)," + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, + vaddr, paddr, length); + return 0; +} + +int +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused) +{ + int found = 0; + unsigned int i; + size_t e_offset; + struct dpaax_iovat_element *entry; + phys_addr_t align_paddr; + + align_paddr = paddr & DPAAX_MEM_SPLIT_MASK; + + /* Check if paddr is available in table */ + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + if (align_paddr < entry[i].start) { + /* Address lower than start, but not found in previous + * iteration shouldn't exist. 
+ */ + DPAAX_ERR("Del: Incorrect entry for PA->VA Table " + "(%"PRIu64")", paddr); + DPAAX_ERR("Del: Lowest address: %"PRIu64, + entry[i].start); + return -1; + } + + if (align_paddr > (entry[i].start + entry[i].len)) + continue; + + /* align_paddr >= start && align_paddr < (start + len) */ + found = 1; + e_offset = (align_paddr / DPAAX_MEM_SPLIT); + entry->pages[e_offset] = 0; + + /* Addition might have populated multiple entries, but removal + * won't do that. Someone might be using internal entries. + * Removal is essentially a dummy - maynot be ever required + */ + break; + } + + if (!found) { + DPAAX_WARN("Del: Unable to find slot for phy(%"PRIu64")", + paddr); + return -1; + } + + DPAAX_DEBUG("Del: Found slot at (%"PRIu64")[(%zu)] for phy(%"PRIu64")", + entry[i].start, e_offset, paddr); + return 0; +} + +/* dpaax_iova_table_dump + * Dump the table, with its entries, on screen. Only works in Debug Mode + * Not for weak hearted - the tables can get quite large + */ +void +dpaax_iova_table_dump(void) +{ + unsigned int i, j; + struct dpaax_iovat_element *entry; + + /* In case DEBUG is not enabled, some 'if' conditions might misbehave + * as they have nothing else in them except a DPAAX_DEBUG() which if + * tuned out would leave 'if' naked. 
+ */ + if (rte_log_get_global_level() < RTE_LOG_DEBUG) { + DPAAX_ERR("Set log level to Debug for PA->Table dump!"); + return; + } + + DPAAX_DEBUG(" === Start of PA->VA Translation Table ==="); + if (dpaax_iova_table_p == NULL) + DPAAX_DEBUG("\tNULL"); + + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + DPAAX_DEBUG("\t(%16i),(%16"PRIu64"),(%16zu),(%16p)", + i, entry[i].start, entry[i].len, entry[i].pages); + DPAAX_DEBUG("\t\t (PA), (VA)"); + for (j = 0; j < (entry->len/DPAAX_MEM_SPLIT); j++) { + if (entry[i].pages[j] == 0) + continue; + DPAAX_DEBUG("\t\t(%16"PRIx64"),(%16"PRIx64")", + (entry[i].start + (j * sizeof(uint64_t))), + entry[i].pages[j]); + } + } + DPAAX_DEBUG(" === End of PA->VA Translation Table ==="); +} + +static void +dpaax_memevent_cb(enum rte_mem_event type, const void *addr, size_t len, + void *arg __rte_unused) +{ + struct rte_memseg_list *msl; + struct rte_memseg *ms; + size_t cur_len = 0, map_len = 0; + phys_addr_t phys_addr; + void *virt_addr; + int ret; + + DPAAX_DEBUG("Called with addr=%p, len=%zu", addr, len); + + msl = rte_mem_virt2memseg_list(addr); + + while (cur_len < len) { + const void *va = RTE_PTR_ADD(addr, cur_len); + + ms = rte_mem_virt2memseg(va, msl); + phys_addr = rte_mem_virt2phy(ms->addr); + virt_addr = ms->addr; + map_len = ms->len; + + DPAAX_DEBUG("Request for %s, va=%p, virt_addr=%p," + "iova=%"PRIu64", map_len=%zu", + type == RTE_MEM_EVENT_ALLOC ? + "alloc" : "dealloc", + va, virt_addr, phys_addr, map_len); + + if (type == RTE_MEM_EVENT_ALLOC) + ret = dpaax_iova_table_add(phys_addr, virt_addr, + map_len); + else + ret = dpaax_iova_table_del(phys_addr, map_len); + + if (ret != 0) { + DPAAX_ERR("PA-Table entry update failed. 
" + "Map=%d, addr=%p, len=%zu, err:(%d)", + type, va, map_len, ret); + return; + } + + cur_len += map_len; + } +} + +static int +dpaax_memevent_walk_memsegs(const struct rte_memseg_list *msl __rte_unused, + const struct rte_memseg *ms, size_t len, + void *arg __rte_unused) +{ + DPAAX_DEBUG("Walking for %p (pa=%"PRIu64") and len %zu", + ms->addr, ms->phys_addr, len); + dpaax_iova_table_add(rte_mem_virt2phy(ms->addr), ms->addr, len); + return 0; +} + +static int +dpaax_handle_memevents(void) +{ + /* First, walk through all memsegs and pin them, before installing + * handler. This assures that all memseg which have already been + * identified/allocated by EAL, are already part of PA->VA Table. This + * is especially for cases where application allocates memory before + * the EAL or this is an externally allocated memory passed to EAL. + */ + rte_memseg_contig_walk_thread_unsafe(dpaax_memevent_walk_memsegs, NULL); + + return rte_mem_event_callback_register("dpaax_memevents_cb", + dpaax_memevent_cb, NULL); +} + +RTE_INIT(dpaax_log) +{ + dpaax_logger = rte_log_register("pmd.common.dpaax"); + if (dpaax_logger >= 0) + rte_log_set_level(dpaax_logger, RTE_LOG_NOTICE); +} diff --git a/drivers/common/dpaax/dpaax_iova_table.h b/drivers/common/dpaax/dpaax_iova_table.h new file mode 100644 index 000000000..056e4e0a1 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.h @@ -0,0 +1,104 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_IOVA_TABLE_H_ +#define _DPAAX_IOVA_TABLE_H_ + +#include <unistd.h> +#include <stdio.h> +#include <string.h> +#include <stdbool.h> +#include <string.h> +#include <stdlib.h> +#include <inttypes.h> +#include <sys/stat.h> +#include <sys/types.h> +#include <dirent.h> +#include <fcntl.h> +#include <glob.h> +#include <errno.h> +#include <arpa/inet.h> + +#include <rte_eal.h> +#include <rte_branch_prediction.h> +#include <rte_memory.h> +#include <rte_malloc.h> + +struct dpaax_iovat_element { + phys_addr_t start; 
/**< Start address of block of physical pages */ + size_t len; /**< Difference of end-start for quick access */ + uint64_t *pages; /**< VA for each physical page in this block */ +}; + +struct dpaax_iova_table { + unsigned int count; /**< No. of blocks of contiguous physical pages */ + struct dpaax_iovat_element entries[0]; +}; + +/* Pointer to the table, which is common for DPAA/DPAA2 and only a single + * instance is required across net/crypto/event drivers. This table is + * populated iff devices are found on the bus. + */ +extern struct dpaax_iova_table *dpaax_iova_table_p; + +/* Device tree file for memory layout is named 'memory@<addr>' where the 'addr' + * is SoC dependent, or even Uboot fixup dependent. + */ +#define MEM_NODE_PATH_GLOB "/proc/device-tree/memory[@0-9]*/reg" +/* Device file should be multiple of 16 bytes, each containing 8 byte of addr + * and its length. Assuming max of 5 entries. + */ +#define MEM_NODE_FILE_LEN ((16 * 5) + 1) + +/* Table is made up of DPAAX_MEM_SPLIT elements for each contiguous zone. This + * helps avoid separate handling for cases where more than one size of hugepage + * is supported. 
+ */ +#define DPAAX_MEM_SPLIT (1<<21) +#define DPAAX_MEM_SPLIT_MASK ~(DPAAX_MEM_SPLIT - 1) /**< Floor aligned */ +#define DPAAX_MEM_SPLIT_MASK_OFF (DPAAX_MEM_SPLIT - 1) /**< Offset */ + +/* APIs exposed */ +int dpaax_iova_table_populate(void); +void dpaax_iova_table_depopulate(void); +int dpaax_iova_table_add(phys_addr_t paddr, void *vaddr, size_t length); +int dpaax_iova_table_del(phys_addr_t paddr, size_t len); +void dpaax_iova_table_dump(void); + +static inline void *dpaax_iova_table_get_va(phys_addr_t paddr) __attribute__((hot)); + +static inline void * +dpaax_iova_table_get_va(phys_addr_t paddr) { + unsigned int i = 0, index; + void *vaddr = 0; + phys_addr_t paddr_align = paddr & DPAAX_MEM_SPLIT_MASK; + size_t offset = paddr & DPAAX_MEM_SPLIT_MASK_OFF; + struct dpaax_iovat_element *entry; + + entry = dpaax_iova_table_p->entries; + + do { + if (unlikely(i > dpaax_iova_table_p->count)) + break; + + if (paddr_align < entry[i].start) { + /* Incorrect paddr; Not in memory range */ + return 0; + } + + if (paddr_align > (entry[i].start + entry[i].len)) { + i++; + continue; + } + + /* paddr > entry->start && paddr <= entry->(start+len) */ + index = (paddr_align - entry[i].start)/DPAAX_MEM_SPLIT; + vaddr = (void *)((uintptr_t)entry[i].pages[index] + offset); + break; + } while (1); + + return vaddr; +} + +#endif /* _DPAAX_IOVA_TABLE_H_ */ diff --git a/drivers/common/dpaax/dpaax_logs.h b/drivers/common/dpaax/dpaax_logs.h new file mode 100644 index 000000000..bf1b27cc1 --- /dev/null +++ b/drivers/common/dpaax/dpaax_logs.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_LOGS_H_ +#define _DPAAX_LOGS_H_ + +#include <rte_log.h> + +extern int dpaax_logger; + +#define DPAAX_LOG(level, fmt, args...) \ + rte_log(RTE_LOG_ ## level, dpaax_logger, "dpaax: " fmt "\n", \ + ##args) + +/* Debug logs are with Function names */ +#define DPAAX_DEBUG(fmt, args...) 
\ + rte_log(RTE_LOG_DEBUG, dpaax_logger, "dpaax: %s(): " fmt "\n", \ + __func__, ##args) + +#define DPAAX_INFO(fmt, args...) \ + DPAAX_LOG(INFO, fmt, ## args) +#define DPAAX_ERR(fmt, args...) \ + DPAAX_LOG(ERR, fmt, ## args) +#define DPAAX_WARN(fmt, args...) \ + DPAAX_LOG(WARNING, fmt, ## args) + +/* DP Logs, toggled out at compile time if level lower than current level */ +#define DPAAX_DP_LOG(level, fmt, args...) \ + RTE_LOG_DP(level, PMD, fmt, ## args) + +#define DPAAX_DP_DEBUG(fmt, args...) \ + DPAAX_DP_LOG(DEBUG, fmt, ## args) +#define DPAAX_DP_INFO(fmt, args...) \ + DPAAX_DP_LOG(INFO, fmt, ## args) +#define DPAAX_DP_WARN(fmt, args...) \ + DPAAX_DP_LOG(WARNING, fmt, ## args) + +#endif /* _DPAAX_LOGS_H_ */ diff --git a/drivers/common/dpaax/meson.build b/drivers/common/dpaax/meson.build new file mode 100644 index 000000000..98a1bdd48 --- /dev/null +++ b/drivers/common/dpaax/meson.build @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2018 NXP + +allow_experimental_apis = true + +if host_machine.system() != 'linux' + build = false +endif + +sources = files('dpaax_iova_table.c') + +cflags += ['-D_GNU_SOURCE'] diff --git a/drivers/common/dpaax/rte_common_dpaax_version.map b/drivers/common/dpaax/rte_common_dpaax_version.map new file mode 100644 index 000000000..6c0efde20 --- /dev/null +++ b/drivers/common/dpaax/rte_common_dpaax_version.map @@ -0,0 +1,12 @@ +DPDK_18.11 { + global: + + dpaax_iova_table_add; + dpaax_iova_table_del; + dpaax_iova_table_depopulate; + dpaax_iova_table_dump; + dpaax_iova_table_p; + dpaax_iova_table_populate; + + local: *; +}; \ No newline at end of file diff --git a/drivers/common/meson.build b/drivers/common/meson.build index f828ce7f7..0257d4d2b 100644 --- a/drivers/common/meson.build +++ b/drivers/common/meson.build @@ -2,6 +2,6 @@ # Copyright(c) 2018 Cavium, Inc std_deps = ['eal'] -drivers = ['mvep', 'octeontx', 'qat'] +drivers = ['dpaax', 'mvep', 'octeontx', 'qat'] config_flag_fmt = 'RTE_LIBRTE_@0@_COMMON' 
driver_name_fmt = 'rte_common_@0@' -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
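The device-tree handling in read_memory_node() amounts to decoding pairs of big-endian 64-bit values and then sizing the table from them. The following is a minimal sketch of that step, not the patch's implementation: it assumes, as the patch comment states, that each entry is 8 bytes of start address followed by 8 bytes of length, and it decodes byte by byte instead of swapping 32-bit halves with ntohl(), so it does not depend on host endianness or pointer casts. The names sketch_reg, sketch_be64, sketch_parse_reg and sketch_slot_count are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* One decoded <start, length> pair from the memory node's reg property. */
struct sketch_reg {
	uint64_t addr;
	uint64_t len;
};

/* Decode one big-endian 64-bit value, independent of host byte order. */
static uint64_t
sketch_be64(const unsigned char *p)
{
	uint64_t v = 0;

	for (int i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Parse a raw reg blob into out[]. Returns the number of entries, or -1 if
 * the blob is empty, not a multiple of 16 bytes, or holds more than max
 * entries.
 */
static int
sketch_parse_reg(const unsigned char *buf, size_t size,
		 struct sketch_reg *out, size_t max)
{
	size_t n = size / 16;

	if (size == 0 || size % 16 != 0 || n > max)
		return -1;

	for (size_t i = 0; i < n; i++) {
		out[i].addr = sketch_be64(buf + 16 * i);
		out[i].len = sketch_be64(buf + 16 * i + 8);
	}
	return (int)n;
}

/* Number of per-chunk VA slots needed to cover all ranges at a given split. */
static uint64_t
sketch_slot_count(const struct sketch_reg *regs, size_t n, uint64_t split)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += regs[i].len / split;
	return total;
}
```

As a worked figure under these assumptions: two ranges of 2 GiB and 6 GiB at a 2 MiB split need 1024 + 3072 = 4096 slots, which at 8 bytes per slot is 32 KiB of table.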
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table
  2018-09-25 12:54 ` [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain
@ 2018-09-25 13:28   ` Burakov, Anatoly
  2018-09-25 13:39     ` Shreyansh Jain
  0 siblings, 1 reply; 53+ messages in thread
From: Burakov, Anatoly @ 2018-09-25 13:28 UTC (permalink / raw)
  To: Shreyansh Jain, ferruh.yigit; +Cc: dev

On 25-Sep-18 1:54 PM, Shreyansh Jain wrote:
> A common library, valid for dpaaX drivers, which is used to maintain
> a local copy of PA->VA translations.
>
> In case of physical addressing mode (one of the options for FSLMC, and
> the only option for DPAA bus), the addresses of descriptors Rx'd are
> physical. These need to be converted into equivalent VA for rte_mbuf
> and other similar calls.
>
> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This
> library is an attempt to reduce the overall cost associated with
> this translation.
>
> A small table is maintained, containing continuous entries
> representing a contiguous physical range. Each of these entries
> stores the equivalent VA, which is fed during mempool creation, or
> memory allocation/deallocation callbacks.
>
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> ---

Hi Shreyansh,

So, basically, you're reimplementing old DPDK's memory view (storing
VA's in a PA-centric way). Makes sense :)

I should caution you that right now, external memory allocator
implementation does *not* trigger any callbacks for newly added memory.
So, anything coming from external memory will not be reflected in your
table, unless it happens to be already there before
dpaax_iova_table_populate() gets called. This patchset makes a good
argument for why perhaps it should trigger callbacks. Thoughts?

Also, a couple of nitpicks below.

>   config/common_base                            |   5 +
>   config/common_linuxapp                        |   5 +
>   drivers/common/Makefile                       |   4 +
>   drivers/common/dpaax/Makefile                 |  31 ++
>   drivers/common/dpaax/dpaax_iova_table.c       | 509 ++++++++++++++++++
>   drivers/common/dpaax/dpaax_iova_table.h       | 104 ++++
>   drivers/common/dpaax/dpaax_logs.h             |  39 ++
>   drivers/common/dpaax/meson.build              |  12 +

<snip>

> +	DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for vaddr:(%p),"
> +		    " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset,
> +		    vaddr, paddr, length);
> +	return 0;
> +}
> +
> +int
> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused)

len is not unused.

> +{
> +	int found = 0;
> +	unsigned int i;
> +	size_t e_offset;
> +	struct dpaax_iovat_element *entry;
> +	phys_addr_t align_paddr;
> +
> +	align_paddr = paddr & DPAAX_MEM_SPLIT_MASK;
> +
> +	/* Check if paddr is available in table */

<snip>

> +static inline void *
> +dpaax_iova_table_get_va(phys_addr_t paddr) {
> +	unsigned int i = 0, index;
> +	void *vaddr = 0;
> +	phys_addr_t paddr_align = paddr & DPAAX_MEM_SPLIT_MASK;
> +	size_t offset = paddr & DPAAX_MEM_SPLIT_MASK_OFF;
> +	struct dpaax_iovat_element *entry;
> +
> +	entry = dpaax_iova_table_p->entries;
> +
> +	do {
> +		if (unlikely(i > dpaax_iova_table_p->count))
> +			break;
> +
> +		if (paddr_align < entry[i].start) {
> +			/* Incorrect paddr; Not in memory range */
> +			return 0;

NULL?

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-09-25 13:28 ` Burakov, Anatoly @ 2018-09-25 13:39 ` Shreyansh Jain 2018-09-25 13:51 ` Burakov, Anatoly 2018-10-09 10:45 ` Shreyansh Jain 0 siblings, 2 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-09-25 13:39 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: ferruh.yigit, dev Hello Anatoly, On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: > On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >> A common library, valid for dpaaX drivers, which is used to maintain >> a local copy of PA->VA translations. >> >> In case of physical addressing mode (one of the option for FSLMC, and >> only option for DPAA bus), the addresses of descriptors Rx'd are >> physical. These need to be converted into equivalent VA for rte_mbuf >> and other similar calls. >> >> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >> library is an attempt to reduce the overall cost associated with >> this translation. >> >> A small table is maintained, containing continuous entries >> representing a continguous physical range. Each of these entries >> stores the equivalent VA, which is fed during mempool creation, or >> memory allocation/deallocation callbacks. >> >> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> >> --- > > Hi Shreyansh, > > So, basically, you're reimplementing old DPDK's memory view (storing > VA's in a PA-centric way). Makes sense :) Yes, and frankly, I couldn't come up with any other way. > > I should caution you that right now, external memory allocator > implementation does *not* trigger any callbacks for newly added memory. > So, anything coming from external memory will not be reflected in your > table, unless it happens to be already there before > dpaax_iova_table_populate() gets called. This patchset makes a good > argument for why perhaps it should trigger callbacks. Thoughts? Oh. 
Then I must be finishing reading through your patches for external memory sooner. I didn't realize this. > > Also, a couple of nitpicks below. > >> config/common_base | 5 + >> config/common_linuxapp | 5 + >> drivers/common/Makefile | 4 + >> drivers/common/dpaax/Makefile | 31 ++ >> drivers/common/dpaax/dpaax_iova_table.c | 509 ++++++++++++++++++ >> drivers/common/dpaax/dpaax_iova_table.h | 104 ++++ >> drivers/common/dpaax/dpaax_logs.h | 39 ++ >> drivers/common/dpaax/meson.build | 12 + > > <snip> > >> + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for vaddr:(%p)," >> + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, >> + vaddr, paddr, length); >> + return 0; >> +} >> + >> +int >> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused) > > len is not unused. I will fix this. Actually, this function itself is useless - more for symmetry reason. Callers would be either simply updating the table, or ignoring it completely. But, yes, this is indeed wrong that I set that unused. > >> +{ >> + int found = 0; >> + unsigned int i; >> + size_t e_offset; >> + struct dpaax_iovat_element *entry; >> + phys_addr_t align_paddr; >> + >> + align_paddr = paddr & DPAAX_MEM_SPLIT_MASK; >> + >> + /* Check if paddr is available in table */ > > <snip> > >> +static inline void * >> +dpaax_iova_table_get_va(phys_addr_t paddr) { >> + unsigned int i = 0, index; >> + void *vaddr = 0; >> + phys_addr_t paddr_align = paddr & DPAAX_MEM_SPLIT_MASK; >> + size_t offset = paddr & DPAAX_MEM_SPLIT_MASK_OFF; >> + struct dpaax_iovat_element *entry; >> + >> + entry = dpaax_iova_table_p->entries; >> + >> + do { >> + if (unlikely(i > dpaax_iova_table_p->count)) >> + break; >> + >> + if (paddr_align < entry[i].start) { >> + /* Incorrect paddr; Not in memory range */ >> + return 0; > > NULL? Yes, NULL. I will fix that as well. > ^ permalink raw reply [flat|nested] 53+ messages in thread
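For readers following the two nitpicks above, here is a minimal standalone sketch of a PA->VA lookup that returns NULL on a miss. It is not the code from the patch: the names `pa_va_entry`, `pa_va_table`, `pa_va_lookup` and the `SPLIT_*` macros are hypothetical, and a 2MB slot size, entries sorted by start address, and a 64-bit platform are all assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed slot granularity: 2MB, as discussed in the thread. */
#define SPLIT_SIZE ((uint64_t)1 << 21)
#define SPLIT_MASK (~(SPLIT_SIZE - 1))
#define SPLIT_OFF  (SPLIT_SIZE - 1)

/* Hypothetical layout: one entry per contiguous physical range. */
struct pa_va_entry {
	uint64_t start;  /* first physical address covered by the entry */
	uint64_t len;    /* bytes covered, a multiple of SPLIT_SIZE */
	uint64_t *pages; /* one VA per 2MB slot, 0 when not populated */
};

struct pa_va_table {
	unsigned int count;          /* number of valid entries */
	struct pa_va_entry *entries; /* sorted by start, non-overlapping */
};

/* Return the VA for paddr, or NULL when no translation is known. */
static void *
pa_va_lookup(const struct pa_va_table *t, uint64_t paddr)
{
	uint64_t aligned = paddr & SPLIT_MASK;
	uint64_t offset = paddr & SPLIT_OFF;
	unsigned int i;

	assert(t != NULL);

	/* i < count keeps the index strictly inside the entries array. */
	for (i = 0; i < t->count; i++) {
		const struct pa_va_entry *e = &t->entries[i];

		if (aligned < e->start)
			return NULL; /* sorted: no later entry can match */
		if (aligned >= e->start + e->len)
			continue;

		size_t slot = (size_t)((aligned - e->start) / SPLIT_SIZE);

		if (e->pages[slot] == 0)
			return NULL; /* range is known, slot not populated */
		return (void *)(uintptr_t)(e->pages[slot] + offset);
	}
	return NULL;
}
```

The loop bound here is `i < count`; whether the `i > count` check in the quoted snippet is an off-by-one depends on how the patch defines `count`, which the snipped code does not show.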
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-09-25 13:39 ` Shreyansh Jain @ 2018-09-25 13:51 ` Burakov, Anatoly 2018-09-25 14:00 ` Shreyansh Jain 2018-10-09 10:45 ` Shreyansh Jain 1 sibling, 1 reply; 53+ messages in thread From: Burakov, Anatoly @ 2018-09-25 13:51 UTC (permalink / raw) To: Shreyansh Jain; +Cc: ferruh.yigit, dev On 25-Sep-18 2:39 PM, Shreyansh Jain wrote: > Hello Anatoly, > > On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>> A common library, valid for dpaaX drivers, which is used to maintain >>> a local copy of PA->VA translations. >>> >>> In case of physical addressing mode (one of the option for FSLMC, and >>> only option for DPAA bus), the addresses of descriptors Rx'd are >>> physical. These need to be converted into equivalent VA for rte_mbuf >>> and other similar calls. >>> >>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>> library is an attempt to reduce the overall cost associated with >>> this translation. >>> >>> A small table is maintained, containing continuous entries >>> representing a continguous physical range. Each of these entries >>> stores the equivalent VA, which is fed during mempool creation, or >>> memory allocation/deallocation callbacks. >>> >>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> >>> --- >> >> Hi Shreyansh, >> >> So, basically, you're reimplementing old DPDK's memory view (storing >> VA's in a PA-centric way). Makes sense :) > > Yes, and frankly, I couldn't come up with any other way. > >> >> I should caution you that right now, external memory allocator >> implementation does *not* trigger any callbacks for newly added >> memory. So, anything coming from external memory will not be reflected >> in your table, unless it happens to be already there before >> dpaax_iova_table_populate() gets called. 
This patchset makes a good >> argument for why perhaps it should trigger callbacks. Thoughts? > > Oh. Then I must be finishing reading through your patches for external > memory sooner. I didn't realize this. To be clear, the current implementation of external memory allocators is not necessarily final - it's not too late to add callbacks to enable your use case better, if that's required (and it should be pretty easy to implement as well). -- Thanks, Anatoly ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-09-25 13:51 ` Burakov, Anatoly @ 2018-09-25 14:00 ` Shreyansh Jain 2018-09-25 14:08 ` Burakov, Anatoly 0 siblings, 1 reply; 53+ messages in thread From: Shreyansh Jain @ 2018-09-25 14:00 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: ferruh.yigit, dev On Tuesday 25 September 2018 07:21 PM, Burakov, Anatoly wrote: > On 25-Sep-18 2:39 PM, Shreyansh Jain wrote: >> Hello Anatoly, >> >> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>> A common library, valid for dpaaX drivers, which is used to maintain >>>> a local copy of PA->VA translations. >>>> >>>> In case of physical addressing mode (one of the option for FSLMC, and >>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>> physical. These need to be converted into equivalent VA for rte_mbuf >>>> and other similar calls. >>>> >>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>> library is an attempt to reduce the overall cost associated with >>>> this translation. >>>> >>>> A small table is maintained, containing continuous entries >>>> representing a continguous physical range. Each of these entries >>>> stores the equivalent VA, which is fed during mempool creation, or >>>> memory allocation/deallocation callbacks. >>>> >>>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> >>>> --- >>> >>> Hi Shreyansh, >>> >>> So, basically, you're reimplementing old DPDK's memory view (storing >>> VA's in a PA-centric way). Makes sense :) >> >> Yes, and frankly, I couldn't come up with any other way. >> >>> >>> I should caution you that right now, external memory allocator >>> implementation does *not* trigger any callbacks for newly added >>> memory. So, anything coming from external memory will not be >>> reflected in your table, unless it happens to be already there before >>> dpaax_iova_table_populate() gets called. 
This patchset makes a good
>>> argument for why perhaps it should trigger callbacks. Thoughts?
>>
>> Oh. Then I must be finishing reading through your patches for external
>> memory sooner. I didn't realize this.
>
> To be clear, the current implementation of external memory allocators is
> not necessarily final - it's not too late to add callbacks to enable
> your use case better, if that's required (and it should be pretty easy
> to implement as well).
>

Is there any reason why external memory may not be raising callbacks right now? I might have missed a previous conversation on this. Or maybe it is just lack of need.

As for whether it is required - I do see a need. It is definitely possible that after rte_eal_init has completed (and the underlying probe), applications allocate memory. In which case, even existing memevent callbacks (like the one in fslmc_bus, which VFIO/DMA maps the area) would have issues. From the external memory patchset, I do see that it is assumed DMA mapping is the caller's responsibility.

Having such a callback would help drivers reduce that throwback of responsibility.

(Speaking of external memory patches, I also realize that my memevent callback in this patch series needs to handle msl->external).

^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-09-25 14:00 ` Shreyansh Jain @ 2018-09-25 14:08 ` Burakov, Anatoly 2018-09-26 10:16 ` Burakov, Anatoly 0 siblings, 1 reply; 53+ messages in thread From: Burakov, Anatoly @ 2018-09-25 14:08 UTC (permalink / raw) To: Shreyansh Jain; +Cc: ferruh.yigit, dev On 25-Sep-18 3:00 PM, Shreyansh Jain wrote: > On Tuesday 25 September 2018 07:21 PM, Burakov, Anatoly wrote: >> On 25-Sep-18 2:39 PM, Shreyansh Jain wrote: >>> Hello Anatoly, >>> >>> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>>> A common library, valid for dpaaX drivers, which is used to maintain >>>>> a local copy of PA->VA translations. >>>>> >>>>> In case of physical addressing mode (one of the option for FSLMC, and >>>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>>> physical. These need to be converted into equivalent VA for rte_mbuf >>>>> and other similar calls. >>>>> >>>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>>> library is an attempt to reduce the overall cost associated with >>>>> this translation. >>>>> >>>>> A small table is maintained, containing continuous entries >>>>> representing a continguous physical range. Each of these entries >>>>> stores the equivalent VA, which is fed during mempool creation, or >>>>> memory allocation/deallocation callbacks. >>>>> >>>>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> >>>>> --- >>>> >>>> Hi Shreyansh, >>>> >>>> So, basically, you're reimplementing old DPDK's memory view (storing >>>> VA's in a PA-centric way). Makes sense :) >>> >>> Yes, and frankly, I couldn't come up with any other way. >>> >>>> >>>> I should caution you that right now, external memory allocator >>>> implementation does *not* trigger any callbacks for newly added >>>> memory. 
So, anything coming from external memory will not be >>>> reflected in your table, unless it happens to be already there >>>> before dpaax_iova_table_populate() gets called. This patchset makes >>>> a good argument for why perhaps it should trigger callbacks. Thoughts? >>> >>> Oh. Then I must be finishing reading through your patches for >>> external memory sooner. I didn't realize this. >> >> To be clear, the current implementation of external memory allocators >> is not necessarily final - it's not too late to add callbacks to >> enable your use case better, if that's required (and it should be >> pretty easy to implement as well). >> > > Is there any reason why external may not be raising call back right now? > I might have missed any previous conversation on this. Or may be, it is > just lack of need. Well, pretty much - it didn't occur to me that it may be needed. I specifically went out of my way to note that it is the responsibility of the user to perform any DMA mappings, but i missed the fact that there may be other users interested to know that a user has just added a new external memory segment. > > As for whether it is required - I do see a need. It is definitely > possible that after rte_eal_init has been completed (and underlying > probe), applications allocate memory. In which case, even existing > memevent callbacks (like the one in fslmc_bus, which VFIO/DMA maps the > area) would have issues. From the external memory patchset, I do see > that it is assumed DMA mapping is caller's responsibility. > > Having such callback would help drives reduce that throwback of > responsibility. I do not want to assume that user necessarily wants to map external memory for DMA unless explicitly asked to do so. At the same time, i can see that some uses may not have anything to do with DMA mapping and may instead be cases like yours, where you just need the address. 
In our case, we can just ignore external memory in VFIO and virtio callbacks, but still allow other callbacks to handle external memory as they see fit. > > (Speaking of external memory patches, I also realize that my memevent > callback in this patch series need to handle msl->external). Yes, we have to be careful on merge. > -- Thanks, Anatoly ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-09-25 14:08 ` Burakov, Anatoly @ 2018-09-26 10:16 ` Burakov, Anatoly 0 siblings, 0 replies; 53+ messages in thread From: Burakov, Anatoly @ 2018-09-26 10:16 UTC (permalink / raw) To: Shreyansh Jain; +Cc: ferruh.yigit, dev On 25-Sep-18 3:08 PM, Burakov, Anatoly wrote: > On 25-Sep-18 3:00 PM, Shreyansh Jain wrote: >> On Tuesday 25 September 2018 07:21 PM, Burakov, Anatoly wrote: >>> On 25-Sep-18 2:39 PM, Shreyansh Jain wrote: >>>> Hello Anatoly, >>>> >>>> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>>>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>>>> A common library, valid for dpaaX drivers, which is used to maintain >>>>>> a local copy of PA->VA translations. >>>>>> >>>>>> In case of physical addressing mode (one of the option for FSLMC, and >>>>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>>>> physical. These need to be converted into equivalent VA for rte_mbuf >>>>>> and other similar calls. >>>>>> >>>>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>>>> library is an attempt to reduce the overall cost associated with >>>>>> this translation. >>>>>> >>>>>> A small table is maintained, containing continuous entries >>>>>> representing a continguous physical range. Each of these entries >>>>>> stores the equivalent VA, which is fed during mempool creation, or >>>>>> memory allocation/deallocation callbacks. >>>>>> >>>>>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> >>>>>> --- >>>>> >>>>> Hi Shreyansh, >>>>> >>>>> So, basically, you're reimplementing old DPDK's memory view >>>>> (storing VA's in a PA-centric way). Makes sense :) >>>> >>>> Yes, and frankly, I couldn't come up with any other way. >>>> >>>>> >>>>> I should caution you that right now, external memory allocator >>>>> implementation does *not* trigger any callbacks for newly added >>>>> memory. 
So, anything coming from external memory will not be >>>>> reflected in your table, unless it happens to be already there >>>>> before dpaax_iova_table_populate() gets called. This patchset makes >>>>> a good argument for why perhaps it should trigger callbacks. Thoughts? >>>> >>>> Oh. Then I must be finishing reading through your patches for >>>> external memory sooner. I didn't realize this. >>> >>> To be clear, the current implementation of external memory allocators >>> is not necessarily final - it's not too late to add callbacks to >>> enable your use case better, if that's required (and it should be >>> pretty easy to implement as well). >>> >> >> Is there any reason why external may not be raising call back right >> now? I might have missed any previous conversation on this. Or may be, >> it is just lack of need. > > Well, pretty much - it didn't occur to me that it may be needed. I > specifically went out of my way to note that it is the responsibility of > the user to perform any DMA mappings, but i missed the fact that there > may be other users interested to know that a user has just added a new > external memory segment. > >> >> As for whether it is required - I do see a need. It is definitely >> possible that after rte_eal_init has been completed (and underlying >> probe), applications allocate memory. In which case, even existing >> memevent callbacks (like the one in fslmc_bus, which VFIO/DMA maps the >> area) would have issues. From the external memory patchset, I do see >> that it is assumed DMA mapping is caller's responsibility. >> >> Having such callback would help drives reduce that throwback of >> responsibility. > > I do not want to assume that user necessarily wants to map external > memory for DMA unless explicitly asked to do so. At the same time, i can > see that some uses may not have anything to do with DMA mapping and may > instead be cases like yours, where you just need the address. 
In our > case, we can just ignore external memory in VFIO and virtio callbacks, > but still allow other callbacks to handle external memory as they see fit. > >> >> (Speaking of external memory patches, I also realize that my memevent >> callback in this patch series need to handle msl->external). > > Yes, we have to be careful on merge. > Hi Shreyansh, I'm currently implementing callback support for external memory. I think the decision to leave DMA mapping to the user may have been a bad one. For starters, one of the buses (bus/fslmc) has its own VFIO infrastructure independent of EAL's VFIO, so if we don't map it there - we can't map it at all because there's no generic manual way to do a DMA map with bus/fslmc driver after the fact. Also, if we leave VFIO mapping to the user, we end up with an inconsistency where we provide memory callbacks which can potentially create DMA mappings (such as what some MLX drivers do), but VFIO DMA mappings will be ignored because "reasons". The only place where we really *don't* want to see external memory is virtio, because there is no segment fd support for external memory yet. Every other place, i think it's a good idea to not skip external memory. So, starting from v5, DMA mapping will be performed automatically for external memory when IOVA addresses are available. -- Thanks, Anatoly ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-09-25 13:39 ` Shreyansh Jain 2018-09-25 13:51 ` Burakov, Anatoly @ 2018-10-09 10:45 ` Shreyansh Jain 2018-10-11 9:03 ` Burakov, Anatoly 1 sibling, 1 reply; 53+ messages in thread From: Shreyansh Jain @ 2018-10-09 10:45 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: ferruh.yigit, dev On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote: > Hello Anatoly, > > On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>> A common library, valid for dpaaX drivers, which is used to maintain >>> a local copy of PA->VA translations. >>> >>> In case of physical addressing mode (one of the option for FSLMC, and >>> only option for DPAA bus), the addresses of descriptors Rx'd are >>> physical. These need to be converted into equivalent VA for rte_mbuf >>> and other similar calls. >>> >>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>> library is an attempt to reduce the overall cost associated with >>> this translation. >>> >>> A small table is maintained, containing continuous entries >>> representing a continguous physical range. Each of these entries >>> stores the equivalent VA, which is fed during mempool creation, or >>> memory allocation/deallocation callbacks. >>> [...] > >> >> Also, a couple of nitpicks below. 
>>
>>> config/common_base | 5 +
>>> config/common_linuxapp | 5 +
>>> drivers/common/Makefile | 4 +
>>> drivers/common/dpaax/Makefile | 31 ++
>>> drivers/common/dpaax/dpaax_iova_table.c | 509 ++++++++++++++++++
>>> drivers/common/dpaax/dpaax_iova_table.h | 104 ++++
>>> drivers/common/dpaax/dpaax_logs.h | 39 ++
>>> drivers/common/dpaax/meson.build | 12 +
>>
>> <snip>
>>
>>> + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for vaddr:(%p),"
>>> + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset,
>>> + vaddr, paddr, length);
>>> + return 0;
>>> +}
>>> +
>>> +int
>>> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused)
>>
>> len is not unused.
>
> I will fix this.
> Actually, this function itself is useless - more for symmetry reason.
> Callers would be either simply updating the table, or ignoring it
> completely. But, yes, this is indeed wrong that I set that unused.
>

Actually, I was wrong in my first reply. In case of dpaax_iova_table_del(), len is indeed redundant. This is because the mapping is for a complete page (min of 2MB size), even if the request is for lesser length. So, removal of a single entry (of fixed size) would be done.

In fact, while on this, I think deleting a PA->VA entry itself is incorrect (not just useless). A single entry (~2MB equivalent) can represent multiple users (working on a rte_malloc'd area, for example). So, effectively, it's always an update - not an add or del.

I will send updated series with this change.

[...]

^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-10-09 10:45 ` Shreyansh Jain @ 2018-10-11 9:03 ` Burakov, Anatoly 2018-10-11 10:02 ` Shreyansh Jain 0 siblings, 1 reply; 53+ messages in thread From: Burakov, Anatoly @ 2018-10-11 9:03 UTC (permalink / raw) To: Shreyansh Jain; +Cc: ferruh.yigit, dev On 09-Oct-18 11:45 AM, Shreyansh Jain wrote: > On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote: >> Hello Anatoly, >> >> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>> A common library, valid for dpaaX drivers, which is used to maintain >>>> a local copy of PA->VA translations. >>>> >>>> In case of physical addressing mode (one of the option for FSLMC, and >>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>> physical. These need to be converted into equivalent VA for rte_mbuf >>>> and other similar calls. >>>> >>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>> library is an attempt to reduce the overall cost associated with >>>> this translation. >>>> >>>> A small table is maintained, containing continuous entries >>>> representing a continguous physical range. Each of these entries >>>> stores the equivalent VA, which is fed during mempool creation, or >>>> memory allocation/deallocation callbacks. >>>> > > [...] > >> >>> >>> Also, a couple of nitpicks below. 
>>>
>>>> config/common_base | 5 +
>>>> config/common_linuxapp | 5 +
>>>> drivers/common/Makefile | 4 +
>>>> drivers/common/dpaax/Makefile | 31 ++
>>>> drivers/common/dpaax/dpaax_iova_table.c | 509
>>>> ++++++++++++++++++
>>>> drivers/common/dpaax/dpaax_iova_table.h | 104 ++++
>>>> drivers/common/dpaax/dpaax_logs.h | 39 ++
>>>> drivers/common/dpaax/meson.build | 12 +
>>>
>>> <snip>
>>>
>>>> + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for
>>>> vaddr:(%p),"
>>>> + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset,
>>>> + vaddr, paddr, length);
>>>> + return 0;
>>>> +}
>>>> +
>>>> +int
>>>> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused)
>>>
>>> len is not unused.
>>
>> I will fix this.
>> Actually, this function itself is useless - more for symmetry reason.
>> Callers would be either simply updating the table, or ignoring it
>> completely. But, yes, this is indeed wrong that I set that unused.
>>
>
> Actually, I was wrong in my first reply. In case of
> dpaax_iova_table_del(), len is indeed redundant. This is because the
> mapping is for a complete page (min of 2MB size), even if the request is
> for lesser length. So, removal of a single entry (of fixed size) would
> be done.
>
> In fact, while on this, I think deleting a PA->VA entry itself is
> incorrect (not just useless). A single entry (~2MB equivalent) can
> represent multiple users (working on a rte_malloc'd area, for example).
> So, effectively, it's always an update - not an add or del.

I'm not sure what you mean here. If you got a mem event about memory area being freed, it's guaranteed to *not* have any users - neither malloc, nor any other memory. And len is always page-aligned.

>
> I will send updated series with this change.
>
> [...]
>
>

--
Thanks,
Anatoly

^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-10-11 9:03 ` Burakov, Anatoly @ 2018-10-11 10:02 ` Shreyansh Jain 2018-10-11 10:07 ` Shreyansh Jain 2018-10-11 10:09 ` Burakov, Anatoly 0 siblings, 2 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-10-11 10:02 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: ferruh.yigit, dev On Thursday 11 October 2018 02:33 PM, Burakov, Anatoly wrote: > On 09-Oct-18 11:45 AM, Shreyansh Jain wrote: >> On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote: >>> Hello Anatoly, >>> >>> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>>> A common library, valid for dpaaX drivers, which is used to maintain >>>>> a local copy of PA->VA translations. >>>>> >>>>> In case of physical addressing mode (one of the option for FSLMC, and >>>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>>> physical. These need to be converted into equivalent VA for rte_mbuf >>>>> and other similar calls. >>>>> >>>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>>> library is an attempt to reduce the overall cost associated with >>>>> this translation. >>>>> >>>>> A small table is maintained, containing continuous entries >>>>> representing a continguous physical range. Each of these entries >>>>> stores the equivalent VA, which is fed during mempool creation, or >>>>> memory allocation/deallocation callbacks. >>>>> >> >> [...] >> >>> >>>> >>>> Also, a couple of nitpicks below. 
>>>>
>>>>> config/common_base | 5 +
>>>>> config/common_linuxapp | 5 +
>>>>> drivers/common/Makefile | 4 +
>>>>> drivers/common/dpaax/Makefile | 31 ++
>>>>> drivers/common/dpaax/dpaax_iova_table.c | 509
>>>>> ++++++++++++++++++
>>>>> drivers/common/dpaax/dpaax_iova_table.h | 104 ++++
>>>>> drivers/common/dpaax/dpaax_logs.h | 39 ++
>>>>> drivers/common/dpaax/meson.build | 12 +
>>>>
>>>> <snip>
>>>>
>>>>> + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for
>>>>> vaddr:(%p),"
>>>>> + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset,
>>>>> + vaddr, paddr, length);
>>>>> + return 0;
>>>>> +}
>>>>> +
>>>>> +int
>>>>> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused)
>>>>
>>>> len is not unused.
>>>
>>> I will fix this.
>>> Actually, this function itself is useless - more for symmetry reason.
>>> Callers would be either simply updating the table, or ignoring it
>>> completely. But, yes, this is indeed wrong that I set that unused.
>>>
>>
>> Actually, I was wrong in my first reply. In case of
>> dpaax_iova_table_del(), len is indeed redundant. This is because the
>> mapping is for a complete page (min of 2MB size), even if the request
>> is for lesser length. So, removal of a single entry (of fixed size)
>> would be done.
>>
>> In fact, while on this, I think deleting a PA->VA entry itself is
>> incorrect (not just useless). A single entry (~2MB equivalent) can
>> represent multiple users (working on a rte_malloc'd area, for
>> example). So, effectively, it's always an update - not an add or del.
>
> I'm not sure what you mean here. If you got a mem event about memory
> area being freed, it's guaranteed to *not* have any users - neither
> malloc, nor any other memory. And len is always page-aligned.

OK. Maybe I am getting this wrong, but consider this:

1) hugepage size=2MB
2) a = malloc(1M)
   this will pin an entry in table for a block starting at VA=(a) and PA=(a').
   Each entry is 2MB long - that means that even if someone were to look up a+1048577 for an equivalent PA, they would get it (though that is an incorrect access).
3) b = malloc(1M)
   this *might* lead to a case where the same 2MB page is used and VA(b) == VA(a) + 1MB. Being hugepage backed, PA(b) == PA(a) + 1MB.
   = After b, the PA-VA table has a single 2MB entry representing two mallocs. It can be used for translation by any thread requesting the PAs of a or b.
4) free(a)
   - this would attempt to remove one 2MB entry from the PA-VA table. But 'b' is still valid. A call to get_pa(VA(b)) should still return PA(b).
   - 'len' is not even used, as an entry in the PA-VA table is of a fixed size.

In the above, (3) is an assumption I am making based on my understanding of how the mem allocator works. Is that wrong?

Basically, this is a restriction of this table - it has a min chunk of 2MB - even for 1G hugepages - and hence, it is not possible to honor deletes. I know this is convoluted logic - but this keeps it simple and usable without much performance impact.

[...]

^ permalink raw reply [flat|nested] 53+ messages in thread
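The arithmetic behind steps (2) to (4) can be checked with a small standalone sketch. The helper names `chunk_of` and `same_chunk` are hypothetical, the 2MB chunk size is the one stated in the message, and the case where both allocations land in the same hugepage is the same assumption the message makes in step (3).

```c
#include <assert.h>
#include <stdint.h>

/* Fixed table granularity described in the message: 2MB per entry. */
#define CHUNK_SIZE ((uint64_t)2 * 1024 * 1024)
#define CHUNK_MASK (~(CHUNK_SIZE - 1))

/* Physical address of the 2MB chunk that contains paddr. */
static uint64_t
chunk_of(uint64_t paddr)
{
	return paddr & CHUNK_MASK;
}

/* Non-zero when both physical addresses share one table entry. */
static int
same_chunk(uint64_t pa_a, uint64_t pa_b)
{
	assert(CHUNK_SIZE != 0);
	return chunk_of(pa_a) == chunk_of(pa_b);
}
```

If `pa_b` is `pa_a + 1MB` inside one hugepage, `same_chunk()` is true, so dropping the entry when `a` is freed would also drop the translation for `b`. That is the reason given above for treating a user-initiated delete as unsafe at this granularity.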
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-10-11 10:02 ` Shreyansh Jain @ 2018-10-11 10:07 ` Shreyansh Jain 2018-10-11 10:13 ` Burakov, Anatoly 2018-10-11 10:09 ` Burakov, Anatoly 1 sibling, 1 reply; 53+ messages in thread From: Shreyansh Jain @ 2018-10-11 10:07 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: ferruh.yigit, dev On Thursday 11 October 2018 03:32 PM, Shreyansh Jain wrote: > On Thursday 11 October 2018 02:33 PM, Burakov, Anatoly wrote: >> On 09-Oct-18 11:45 AM, Shreyansh Jain wrote: >>> On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote: >>>> Hello Anatoly, >>>> >>>> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>>>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>>>> A common library, valid for dpaaX drivers, which is used to maintain >>>>>> a local copy of PA->VA translations. >>>>>> >>>>>> In case of physical addressing mode (one of the option for FSLMC, and >>>>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>>>> physical. These need to be converted into equivalent VA for rte_mbuf >>>>>> and other similar calls. >>>>>> >>>>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>>>> library is an attempt to reduce the overall cost associated with >>>>>> this translation. >>>>>> >>>>>> A small table is maintained, containing continuous entries >>>>>> representing a continguous physical range. Each of these entries >>>>>> stores the equivalent VA, which is fed during mempool creation, or >>>>>> memory allocation/deallocation callbacks. >>>>>> >>> >>> [...] >>> >>>> >>>>> >>>>> Also, a couple of nitpicks below. 
>>>>> >>>>>> cosnfig/common_base | 5 + >>>>>> config/common_linuxapp | 5 + >>>>>> drivers/common/Makefile | 4 + >>>>>> drivers/common/dpaax/Makefile | 31 ++ >>>>>> drivers/common/dpaax/dpaax_iova_table.c | 509 >>>>>> ++++++++++++++++++ >>>>>> drivers/common/dpaax/dpaax_iova_table.h | 104 ++++ >>>>>> drivers/common/dpaax/dpaax_logs.h | 39 ++ >>>>>> drivers/common/dpaax/meson.build | 12 + >>>>> >>>>> <snip> >>>>> >>>>>> + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for >>>>>> vaddr:(%p)," >>>>>> + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, >>>>>> + vaddr, paddr, length); >>>>>> + return 0; >>>>>> +} >>>>>> + >>>>>> +int >>>>>> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused) >>>>> >>>>> len is not unused. >>>> >>>> I will fix this. >>>> Actually, this function itself is useless - more for symmetry reason. >>>> Callers would be either simply updating the table, or ignoring it >>>> completely. But, yes, this is indeed wrong that I set that unused. >>>> >>> >>> Actually, I was wrong in my first reply. In case of >>> dpaax_iova_table_del(), len is indeed redundant. This is because the >>> mapping is for a complete page (min of 2MB size), even if the request >>> is for lesser length. So, removal of a single entry (of fixed size) >>> would be done. >>> >>> In fact, while on this, I think deleting a PA->VA entry itself is >>> incorrect (not just useless). A single entry (~2MB equivalent) can >>> represent multiple users (working on a rte_malloc'd area, for >>> example). So, effectively, its always an update - not an add or del. >> >> I'm not sure what you mean here. If you got a mem event about memory >> area being freed, it's guaranteed to *not* have any users - neither >> malloc, nor any other memory. And len is always page-aligned. > > ok. Maybe I am getting this wrong, but consider this: > > 1) hugepage size=2MB > 2) a = malloc(1M) > this will pin an entry in table for a block starting at VA=(a) and > PA=(a'). 
> Each entry is of 2MB length - that means, even if someone were
> to access a+1048577 for an equivalent PA, they would get it (though,
> that is a incorrect access).
> 3) b = malloc(1M)
> this *might* lead to a case where same 2MB page is used and
> VA=(b==(a+1MB)). Being hugepage backed, PA=(b=PA(a)+1M).
> = After b, the PA-VA table has a single entry of 2MB, representing two
> mallocs. It can be used for translation for any thread requesting PAs of
> a or b.
> 4) Free(a)
> - this would attempt to remove one 2MB entry from PA-VA table. But,
> 'b' is already valid. Access to get_pa(VA(b)) should return me the PA(b).
> - 'len' is not even used as the entry in PA-VA table is of a fixed size.

Just to add to this:
- In the mem_event callback path, it definitely can't happen that the
  same page is still being served by another rte_malloc.
- But delete calls can also come from users of the PA-VA table, based on
  their own rte_free().

And your comment makes me think - I should probably delete an entry from
the table only when the mem_event callback is received.

>
> In the above, (3) is an assumption I am making based on my understanding
> how mem allocator is working. Is that wrong?
>
> Basically, this is a restriction of this table - it has a min chunk of
> 2MB - even for 1G hugepages - and hence, it is not possible to honor
> deletes. I know this is convoluted logic - but, this keeps it simple and
> use-able without much performance impact.
>
> [...]
>
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-10-11 10:07 ` Shreyansh Jain @ 2018-10-11 10:13 ` Burakov, Anatoly 2018-10-11 10:39 ` Shreyansh Jain 0 siblings, 1 reply; 53+ messages in thread From: Burakov, Anatoly @ 2018-10-11 10:13 UTC (permalink / raw) To: Shreyansh Jain; +Cc: ferruh.yigit, dev On 11-Oct-18 11:07 AM, Shreyansh Jain wrote: > On Thursday 11 October 2018 03:32 PM, Shreyansh Jain wrote: >> On Thursday 11 October 2018 02:33 PM, Burakov, Anatoly wrote: >>> On 09-Oct-18 11:45 AM, Shreyansh Jain wrote: >>>> On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote: >>>>> Hello Anatoly, >>>>> >>>>> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>>>>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>>>>> A common library, valid for dpaaX drivers, which is used to maintain >>>>>>> a local copy of PA->VA translations. >>>>>>> >>>>>>> In case of physical addressing mode (one of the option for FSLMC, >>>>>>> and >>>>>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>>>>> physical. These need to be converted into equivalent VA for rte_mbuf >>>>>>> and other similar calls. >>>>>>> >>>>>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>>>>> library is an attempt to reduce the overall cost associated with >>>>>>> this translation. >>>>>>> >>>>>>> A small table is maintained, containing continuous entries >>>>>>> representing a continguous physical range. Each of these entries >>>>>>> stores the equivalent VA, which is fed during mempool creation, or >>>>>>> memory allocation/deallocation callbacks. >>>>>>> >>>> >>>> [...] >>>> >>>>> >>>>>> >>>>>> Also, a couple of nitpicks below. 
>>>>>> >>>>>>> cosnfig/common_base | 5 + >>>>>>> config/common_linuxapp | 5 + >>>>>>> drivers/common/Makefile | 4 + >>>>>>> drivers/common/dpaax/Makefile | 31 ++ >>>>>>> drivers/common/dpaax/dpaax_iova_table.c | 509 >>>>>>> ++++++++++++++++++ >>>>>>> drivers/common/dpaax/dpaax_iova_table.h | 104 ++++ >>>>>>> drivers/common/dpaax/dpaax_logs.h | 39 ++ >>>>>>> drivers/common/dpaax/meson.build | 12 + >>>>>> >>>>>> <snip> >>>>>> >>>>>>> + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for >>>>>>> vaddr:(%p)," >>>>>>> + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, >>>>>>> + vaddr, paddr, length); >>>>>>> + return 0; >>>>>>> +} >>>>>>> + >>>>>>> +int >>>>>>> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused) >>>>>> >>>>>> len is not unused. >>>>> >>>>> I will fix this. >>>>> Actually, this function itself is useless - more for symmetry reason. >>>>> Callers would be either simply updating the table, or ignoring it >>>>> completely. But, yes, this is indeed wrong that I set that unused. >>>>> >>>> >>>> Actually, I was wrong in my first reply. In case of >>>> dpaax_iova_table_del(), len is indeed redundant. This is because the >>>> mapping is for a complete page (min of 2MB size), even if the >>>> request is for lesser length. So, removal of a single entry (of >>>> fixed size) would be done. >>>> >>>> In fact, while on this, I think deleting a PA->VA entry itself is >>>> incorrect (not just useless). A single entry (~2MB equivalent) can >>>> represent multiple users (working on a rte_malloc'd area, for >>>> example). So, effectively, its always an update - not an add or del. >>> >>> I'm not sure what you mean here. If you got a mem event about memory >>> area being freed, it's guaranteed to *not* have any users - neither >>> malloc, nor any other memory. And len is always page-aligned. >> >> ok. 
Maybe I am getting this wrong, but consider this: >> >> 1) hugepage size=2MB >> 2) a = malloc(1M) >> this will pin an entry in table for a block starting at VA=(a) and >> PA=(a'). Each entry is of 2MB length - that means, even if someone >> were to access a+1048577 for an equivalent PA, they would get it >> (though, that is a incorrect access). >> 3) b = malloc(1M) >> this *might* lead to a case where same 2MB page is used and >> VA=(b==(a+1MB)). Being hugepage backed, PA=(b=PA(a)+1M). >> = After b, the PA-VA table has a single entry of 2MB, representing two >> mallocs. It can be used for translation for any thread requesting PAs >> of a or b. >> 4) Free(a) >> - this would attempt to remove one 2MB entry from PA-VA table. But, >> 'b' is already valid. Access to get_pa(VA(b)) should return me the PA(b). >> - 'len' is not even used as the entry in PA-VA table is of a fixed >> size. > > Just to add to this: > - if talking about the mem_event callback, it definitely won't be a case > where same page is still being served under another rte_malloc > - But, calls can come to delete from users of PA-VA table based on their > own rte_free(). > > And, your comment makes me think - I should probably del entry from the > table only when mem_event callback is received. Mem events are not triggered on rte_free(), they're triggered on page deallocation. A call to rte_free/rte_memzone_free/rte_mempool_free etc. *might* trigger a page deallocation, but *only* if the memory area being freed encompasses an entire page. If you rte_malloc() 64 bytes and then rte_free() those 64 bytes, you won't get a mem event *unless* these were the only 64 bytes allocated on a particular page, and the entire page is no longer used by anything else. > >> >> In the above, (3) is an assumption I am making based on my >> understanding how mem allocator is working. Is that wrong? 
>>
>> Basically, this is a restriction of this table - it has a min chunk of
>> 2MB - even for 1G hugepages - and hence, it is not possible to honor
>> deletes. I know this is convoluted logic - but, this keeps it simple
>> and use-able without much performance impact.
>>
>> [...]
>>
>
>

--
Thanks,
Anatoly
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-10-11 10:13 ` Burakov, Anatoly @ 2018-10-11 10:39 ` Shreyansh Jain 2018-10-11 10:46 ` Burakov, Anatoly 0 siblings, 1 reply; 53+ messages in thread From: Shreyansh Jain @ 2018-10-11 10:39 UTC (permalink / raw) To: Burakov, Anatoly; +Cc: ferruh.yigit, dev On Thursday 11 October 2018 03:43 PM, Burakov, Anatoly wrote: > On 11-Oct-18 11:07 AM, Shreyansh Jain wrote: >> On Thursday 11 October 2018 03:32 PM, Shreyansh Jain wrote: >>> On Thursday 11 October 2018 02:33 PM, Burakov, Anatoly wrote: >>>> On 09-Oct-18 11:45 AM, Shreyansh Jain wrote: >>>>> On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote: >>>>>> Hello Anatoly, >>>>>> >>>>>> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>>>>>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>>>>>> A common library, valid for dpaaX drivers, which is used to >>>>>>>> maintain >>>>>>>> a local copy of PA->VA translations. >>>>>>>> >>>>>>>> In case of physical addressing mode (one of the option for >>>>>>>> FSLMC, and >>>>>>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>>>>>> physical. These need to be converted into equivalent VA for >>>>>>>> rte_mbuf >>>>>>>> and other similar calls. >>>>>>>> >>>>>>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>>>>>> library is an attempt to reduce the overall cost associated with >>>>>>>> this translation. >>>>>>>> >>>>>>>> A small table is maintained, containing continuous entries >>>>>>>> representing a continguous physical range. Each of these entries >>>>>>>> stores the equivalent VA, which is fed during mempool creation, or >>>>>>>> memory allocation/deallocation callbacks. >>>>>>>> >>>>> >>>>> [...] >>>>> >>>>>> >>>>>>> >>>>>>> Also, a couple of nitpicks below. 
>>>>>>> >>>>>>>> cosnfig/common_base | 5 + >>>>>>>> config/common_linuxapp | 5 + >>>>>>>> drivers/common/Makefile | 4 + >>>>>>>> drivers/common/dpaax/Makefile | 31 ++ >>>>>>>> drivers/common/dpaax/dpaax_iova_table.c | 509 >>>>>>>> ++++++++++++++++++ >>>>>>>> drivers/common/dpaax/dpaax_iova_table.h | 104 ++++ >>>>>>>> drivers/common/dpaax/dpaax_logs.h | 39 ++ >>>>>>>> drivers/common/dpaax/meson.build | 12 + >>>>>>> >>>>>>> <snip> >>>>>>> >>>>>>>> + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for >>>>>>>> vaddr:(%p)," >>>>>>>> + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, >>>>>>>> + vaddr, paddr, length); >>>>>>>> + return 0; >>>>>>>> +} >>>>>>>> + >>>>>>>> +int >>>>>>>> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused) >>>>>>> >>>>>>> len is not unused. >>>>>> >>>>>> I will fix this. >>>>>> Actually, this function itself is useless - more for symmetry reason. >>>>>> Callers would be either simply updating the table, or ignoring it >>>>>> completely. But, yes, this is indeed wrong that I set that unused. >>>>>> >>>>> >>>>> Actually, I was wrong in my first reply. In case of >>>>> dpaax_iova_table_del(), len is indeed redundant. This is because >>>>> the mapping is for a complete page (min of 2MB size), even if the >>>>> request is for lesser length. So, removal of a single entry (of >>>>> fixed size) would be done. >>>>> >>>>> In fact, while on this, I think deleting a PA->VA entry itself is >>>>> incorrect (not just useless). A single entry (~2MB equivalent) can >>>>> represent multiple users (working on a rte_malloc'd area, for >>>>> example). So, effectively, its always an update - not an add or del. >>>> >>>> I'm not sure what you mean here. If you got a mem event about memory >>>> area being freed, it's guaranteed to *not* have any users - neither >>>> malloc, nor any other memory. And len is always page-aligned. >>> >>> ok. 
Maybe I am getting this wrong, but consider this: >>> >>> 1) hugepage size=2MB >>> 2) a = malloc(1M) >>> this will pin an entry in table for a block starting at VA=(a) and >>> PA=(a'). Each entry is of 2MB length - that means, even if someone >>> were to access a+1048577 for an equivalent PA, they would get it >>> (though, that is a incorrect access). >>> 3) b = malloc(1M) >>> this *might* lead to a case where same 2MB page is used and >>> VA=(b==(a+1MB)). Being hugepage backed, PA=(b=PA(a)+1M). >>> = After b, the PA-VA table has a single entry of 2MB, representing >>> two mallocs. It can be used for translation for any thread requesting >>> PAs of a or b. >>> 4) Free(a) >>> - this would attempt to remove one 2MB entry from PA-VA table. But, >>> 'b' is already valid. Access to get_pa(VA(b)) should return me the >>> PA(b). >>> - 'len' is not even used as the entry in PA-VA table is of a fixed >>> size. >> >> Just to add to this: >> - if talking about the mem_event callback, it definitely won't be a >> case where same page is still being served under another rte_malloc >> - But, calls can come to delete from users of PA-VA table based on >> their own rte_free(). >> >> And, your comment makes me think - I should probably del entry from >> the table only when mem_event callback is received. > > Mem events are not triggered on rte_free(), they're triggered on page > deallocation. A call to rte_free/rte_memzone_free/rte_mempool_free etc. > *might* trigger a page deallocation, but *only* if the memory area being > freed encompasses an entire page. If you rte_malloc() 64 bytes and then > rte_free() those 64 bytes, you won't get a mem event *unless* these were > the only 64 bytes allocated on a particular page, and the entire page is > no longer used by anything else. My understanding is same. 
But it seems my explanation wasn't clear:

On rte_free(), I am not expecting a mem_event to be raised - but the
caller of rte_free() (the eth or crypto drivers, or applications) may
call the PA-VA table del function to remove the entries.

This voluntary delete of a table entry, by drivers or applications
calling the del function of the PA-VA table, is not correct.

The path where the mem_event callback clears the PA-VA table entry is
correct (which I removed in v2) - at that point the page (len) would
definitely not be in use by anyone and can be removed from the PA-VA
table.

And, yes, I agree that a mem_event may not be raised on an rte_free().

>
>>
>>>
>>> In the above, (3) is an assumption I am making based on my
>>> understanding how mem allocator is working. Is that wrong?
>>>
>>> Basically, this is a restriction of this table - it has a min chunk
>>> of 2MB - even for 1G hugepages - and hence, it is not possible to
>>> honor deletes. I know this is convoluted logic - but, this keeps it
>>> simple and use-able without much performance impact.
>>>
>>> [...]
>>>
>>
>>
>
>
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-10-11 10:39 ` Shreyansh Jain @ 2018-10-11 10:46 ` Burakov, Anatoly 0 siblings, 0 replies; 53+ messages in thread From: Burakov, Anatoly @ 2018-10-11 10:46 UTC (permalink / raw) To: Shreyansh Jain; +Cc: ferruh.yigit, dev On 11-Oct-18 11:39 AM, Shreyansh Jain wrote: > On Thursday 11 October 2018 03:43 PM, Burakov, Anatoly wrote: >> On 11-Oct-18 11:07 AM, Shreyansh Jain wrote: >>> On Thursday 11 October 2018 03:32 PM, Shreyansh Jain wrote: >>>> On Thursday 11 October 2018 02:33 PM, Burakov, Anatoly wrote: >>>>> On 09-Oct-18 11:45 AM, Shreyansh Jain wrote: >>>>>> On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote: >>>>>>> Hello Anatoly, >>>>>>> >>>>>>> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>>>>>>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>>>>>>> A common library, valid for dpaaX drivers, which is used to >>>>>>>>> maintain >>>>>>>>> a local copy of PA->VA translations. >>>>>>>>> >>>>>>>>> In case of physical addressing mode (one of the option for >>>>>>>>> FSLMC, and >>>>>>>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>>>>>>> physical. These need to be converted into equivalent VA for >>>>>>>>> rte_mbuf >>>>>>>>> and other similar calls. >>>>>>>>> >>>>>>>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>>>>>>> library is an attempt to reduce the overall cost associated with >>>>>>>>> this translation. >>>>>>>>> >>>>>>>>> A small table is maintained, containing continuous entries >>>>>>>>> representing a continguous physical range. Each of these entries >>>>>>>>> stores the equivalent VA, which is fed during mempool creation, or >>>>>>>>> memory allocation/deallocation callbacks. >>>>>>>>> >>>>>> >>>>>> [...] >>>>>> >>>>>>> >>>>>>>> >>>>>>>> Also, a couple of nitpicks below. 
>>>>>>>> >>>>>>>>> cosnfig/common_base | 5 + >>>>>>>>> config/common_linuxapp | 5 + >>>>>>>>> drivers/common/Makefile | 4 + >>>>>>>>> drivers/common/dpaax/Makefile | 31 ++ >>>>>>>>> drivers/common/dpaax/dpaax_iova_table.c | 509 >>>>>>>>> ++++++++++++++++++ >>>>>>>>> drivers/common/dpaax/dpaax_iova_table.h | 104 ++++ >>>>>>>>> drivers/common/dpaax/dpaax_logs.h | 39 ++ >>>>>>>>> drivers/common/dpaax/meson.build | 12 + >>>>>>>> >>>>>>>> <snip> >>>>>>>> >>>>>>>>> + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for >>>>>>>>> vaddr:(%p)," >>>>>>>>> + " phy(%"PRIu64"), len(%zu)", entry[i].start, >>>>>>>>> e_offset, >>>>>>>>> + vaddr, paddr, length); >>>>>>>>> + return 0; >>>>>>>>> +} >>>>>>>>> + >>>>>>>>> +int >>>>>>>>> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused) >>>>>>>> >>>>>>>> len is not unused. >>>>>>> >>>>>>> I will fix this. >>>>>>> Actually, this function itself is useless - more for symmetry >>>>>>> reason. >>>>>>> Callers would be either simply updating the table, or ignoring it >>>>>>> completely. But, yes, this is indeed wrong that I set that unused. >>>>>>> >>>>>> >>>>>> Actually, I was wrong in my first reply. In case of >>>>>> dpaax_iova_table_del(), len is indeed redundant. This is because >>>>>> the mapping is for a complete page (min of 2MB size), even if the >>>>>> request is for lesser length. So, removal of a single entry (of >>>>>> fixed size) would be done. >>>>>> >>>>>> In fact, while on this, I think deleting a PA->VA entry itself is >>>>>> incorrect (not just useless). A single entry (~2MB equivalent) can >>>>>> represent multiple users (working on a rte_malloc'd area, for >>>>>> example). So, effectively, its always an update - not an add or del. >>>>> >>>>> I'm not sure what you mean here. If you got a mem event about >>>>> memory area being freed, it's guaranteed to *not* have any users - >>>>> neither malloc, nor any other memory. And len is always page-aligned. >>>> >>>> ok. 
Maybe I am getting this wrong, but consider this: >>>> >>>> 1) hugepage size=2MB >>>> 2) a = malloc(1M) >>>> this will pin an entry in table for a block starting at VA=(a) >>>> and PA=(a'). Each entry is of 2MB length - that means, even if >>>> someone were to access a+1048577 for an equivalent PA, they would >>>> get it (though, that is a incorrect access). >>>> 3) b = malloc(1M) >>>> this *might* lead to a case where same 2MB page is used and >>>> VA=(b==(a+1MB)). Being hugepage backed, PA=(b=PA(a)+1M). >>>> = After b, the PA-VA table has a single entry of 2MB, representing >>>> two mallocs. It can be used for translation for any thread >>>> requesting PAs of a or b. >>>> 4) Free(a) >>>> - this would attempt to remove one 2MB entry from PA-VA table. >>>> But, 'b' is already valid. Access to get_pa(VA(b)) should return me >>>> the PA(b). >>>> - 'len' is not even used as the entry in PA-VA table is of a fixed >>>> size. >>> >>> Just to add to this: >>> - if talking about the mem_event callback, it definitely won't be a >>> case where same page is still being served under another rte_malloc >>> - But, calls can come to delete from users of PA-VA table based on >>> their own rte_free(). >>> >>> And, your comment makes me think - I should probably del entry from >>> the table only when mem_event callback is received. >> >> Mem events are not triggered on rte_free(), they're triggered on page >> deallocation. A call to rte_free/rte_memzone_free/rte_mempool_free >> etc. *might* trigger a page deallocation, but *only* if the memory >> area being freed encompasses an entire page. If you rte_malloc() 64 >> bytes and then rte_free() those 64 bytes, you won't get a mem event >> *unless* these were the only 64 bytes allocated on a particular page, >> and the entire page is no longer used by anything else. > > My understanding is same. 
> But, it seems my explanation wasn't well written:
>
> For a rte_free(), I am not expecting that mem_event is raised - but, the
> caller of rte_free() (the eth or crypto drivers, or applications) may
> call the PA-VA table del function to remove the entries.
>
> This voluntary delete of table entry from the drivers or applications
> using PA-VA calling del of PA-VA table - is not correct.

Yes, so it appears :)

--
Thanks,
Anatoly
* Re: [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table 2018-10-11 10:02 ` Shreyansh Jain 2018-10-11 10:07 ` Shreyansh Jain @ 2018-10-11 10:09 ` Burakov, Anatoly 1 sibling, 0 replies; 53+ messages in thread From: Burakov, Anatoly @ 2018-10-11 10:09 UTC (permalink / raw) To: Shreyansh Jain; +Cc: ferruh.yigit, dev On 11-Oct-18 11:02 AM, Shreyansh Jain wrote: > On Thursday 11 October 2018 02:33 PM, Burakov, Anatoly wrote: >> On 09-Oct-18 11:45 AM, Shreyansh Jain wrote: >>> On Tuesday 25 September 2018 07:09 PM, Shreyansh Jain wrote: >>>> Hello Anatoly, >>>> >>>> On Tuesday 25 September 2018 06:58 PM, Burakov, Anatoly wrote: >>>>> On 25-Sep-18 1:54 PM, Shreyansh Jain wrote: >>>>>> A common library, valid for dpaaX drivers, which is used to maintain >>>>>> a local copy of PA->VA translations. >>>>>> >>>>>> In case of physical addressing mode (one of the option for FSLMC, and >>>>>> only option for DPAA bus), the addresses of descriptors Rx'd are >>>>>> physical. These need to be converted into equivalent VA for rte_mbuf >>>>>> and other similar calls. >>>>>> >>>>>> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This >>>>>> library is an attempt to reduce the overall cost associated with >>>>>> this translation. >>>>>> >>>>>> A small table is maintained, containing continuous entries >>>>>> representing a continguous physical range. Each of these entries >>>>>> stores the equivalent VA, which is fed during mempool creation, or >>>>>> memory allocation/deallocation callbacks. >>>>>> >>> >>> [...] >>> >>>> >>>>> >>>>> Also, a couple of nitpicks below. 
>>>>> >>>>>> cosnfig/common_base | 5 + >>>>>> config/common_linuxapp | 5 + >>>>>> drivers/common/Makefile | 4 + >>>>>> drivers/common/dpaax/Makefile | 31 ++ >>>>>> drivers/common/dpaax/dpaax_iova_table.c | 509 >>>>>> ++++++++++++++++++ >>>>>> drivers/common/dpaax/dpaax_iova_table.h | 104 ++++ >>>>>> drivers/common/dpaax/dpaax_logs.h | 39 ++ >>>>>> drivers/common/dpaax/meson.build | 12 + >>>>> >>>>> <snip> >>>>> >>>>>> + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for >>>>>> vaddr:(%p)," >>>>>> + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, >>>>>> + vaddr, paddr, length); >>>>>> + return 0; >>>>>> +} >>>>>> + >>>>>> +int >>>>>> +dpaax_iova_table_del(phys_addr_t paddr, size_t len __rte_unused) >>>>> >>>>> len is not unused. >>>> >>>> I will fix this. >>>> Actually, this function itself is useless - more for symmetry reason. >>>> Callers would be either simply updating the table, or ignoring it >>>> completely. But, yes, this is indeed wrong that I set that unused. >>>> >>> >>> Actually, I was wrong in my first reply. In case of >>> dpaax_iova_table_del(), len is indeed redundant. This is because the >>> mapping is for a complete page (min of 2MB size), even if the request >>> is for lesser length. So, removal of a single entry (of fixed size) >>> would be done. >>> >>> In fact, while on this, I think deleting a PA->VA entry itself is >>> incorrect (not just useless). A single entry (~2MB equivalent) can >>> represent multiple users (working on a rte_malloc'd area, for >>> example). So, effectively, its always an update - not an add or del. >> >> I'm not sure what you mean here. If you got a mem event about memory >> area being freed, it's guaranteed to *not* have any users - neither >> malloc, nor any other memory. And len is always page-aligned. > > ok. Maybe I am getting this wrong, but consider this: > > 1) hugepage size=2MB > 2) a = malloc(1M) > this will pin an entry in table for a block starting at VA=(a) and > PA=(a'). 
> Each entry is of 2MB length - that means, even if someone were
> to access a+1048577 for an equivalent PA, they would get it (though,
> that is a incorrect access).
> 3) b = malloc(1M)
> this *might* lead to a case where same 2MB page is used and
> VA=(b==(a+1MB)). Being hugepage backed, PA=(b=PA(a)+1M).
> = After b, the PA-VA table has a single entry of 2MB, representing two
> mallocs. It can be used for translation for any thread requesting PAs of
> a or b.
> 4) Free(a)
> - this would attempt to remove one 2MB entry from PA-VA table. But,

No, it wouldn't. Not unless 'b' was also freed. Malloc will not free the
area unless there are no users of this area (as in, it checks if a free
malloc element encompasses an entire page). If you do two allocations,
1MB in size each, the hugepage will contain two busy malloc elements, one
meg each. You free one, but the other one isn't free, so no action is
taken.

> 'b' is already valid. Access to get_pa(VA(b)) should return me the PA(b).
> - 'len' is not even used as the entry in PA-VA table is of a fixed size.
>
> In the above, (3) is an assumption I am making based on my understanding
> how mem allocator is working. Is that wrong?
>
> Basically, this is a restriction of this table - it has a min chunk of
> 2MB - even for 1G hugepages - and hence, it is not possible to honor
> deletes. I know this is convoluted logic - but, this keeps it simple and
> use-able without much performance impact.

The granularity of mem events is the page size. You will never get a mem
event on anything other than a full page, and you are guaranteed that it
is free and not used by anything within DPDK.

>
> [...]
>
>

--
Thanks,
Anatoly
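The rule described in this reply (a page is given back, and a mem event raised, only once nothing on it is busy) can be modelled as below. This is a toy model and not DPDK's malloc heap: struct sketch_page and the sketch_page_* helpers are hypothetical, and they track only a busy-byte count for a single page. Whether a real allocator actually releases an empty page depends on its own policy; the sketch only captures the condition stated above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for one hugepage. */
struct sketch_page {
	size_t page_sz;           /* e.g. 2MB */
	size_t busy_bytes;        /* bytes currently held by allocations */
	unsigned int free_events; /* how many "page freed" events were raised */
};

/* Account for an allocation of len bytes; -1 if it does not fit. */
static int
sketch_page_alloc(struct sketch_page *p, size_t len)
{
	if (len == 0 || len > p->page_sz - p->busy_bytes)
		return -1;
	p->busy_bytes += len;
	return 0;
}

/*
 * Account for a free of len bytes.
 * Returns 1 if the page became completely unused (the only case in which
 * a free event is raised), 0 if other users remain, -1 on bad input.
 */
static int
sketch_page_free(struct sketch_page *p, size_t len)
{
	if (len == 0 || len > p->busy_bytes)
		return -1;
	p->busy_bytes -= len;
	if (p->busy_bytes != 0)
		return 0; /* someone else still uses the page: no event */
	p->free_events++;
	return 1;
}
```

In the two-allocation example from the thread, freeing 'a' returns 0 and raises nothing; only the later free of 'b' empties the page and raises the single event, with the length being the full page.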
* [dpdk-dev] [PATCH 4/5] dpaa: enable dpaax library 2018-09-25 12:54 [dpdk-dev] [PATCH 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain ` (2 preceding siblings ...) 2018-09-25 12:54 ` [dpdk-dev] [PATCH 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain @ 2018-09-25 12:54 ` Shreyansh Jain 2018-09-25 12:54 ` [dpdk-dev] [PATCH 5/5] fslmc: " Shreyansh Jain 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-09-25 12:54 UTC (permalink / raw) To: ferruh.yigit; +Cc: dev, anatoly.burakov, Shreyansh Jain With this patch, dpaa bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa, event/dpaa and net/dpaa as they are dependent on the bus/dpaa and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/dpaa/Makefile | 1 + drivers/bus/dpaa/dpaa_bus.c | 4 ++++ drivers/bus/dpaa/meson.build | 2 +- drivers/bus/dpaa/rte_dpaa_bus.h | 6 ++++++ drivers/crypto/dpaa_sec/Makefile | 1 + drivers/crypto/dpaa_sec/dpaa_sec.c | 6 ++++++ drivers/event/dpaa/Makefile | 1 + drivers/mempool/dpaa/Makefile | 1 + drivers/mempool/dpaa/dpaa_mempool.c | 3 +++ drivers/mempool/dpaa/dpaa_mempool.h | 4 +--- drivers/net/dpaa/Makefile | 1 + mk/rte.app.mk | 1 + 12 files changed, 27 insertions(+), 4 deletions(-) diff --git a/drivers/bus/dpaa/Makefile b/drivers/bus/dpaa/Makefile index bffaa9d92..5eb7c24db 100644 --- a/drivers/bus/dpaa/Makefile +++ b/drivers/bus/dpaa/Makefile @@ -48,5 +48,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_BUS) += \ LDLIBS += -lpthread LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c index 16fabd1be..8bfb085e9 100644 --- a/drivers/bus/dpaa/dpaa_bus.c +++ 
b/drivers/bus/dpaa/dpaa_bus.c @@ -34,6 +34,7 @@ #include <rte_dpaa_bus.h> #include <rte_dpaa_logs.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -546,6 +547,9 @@ rte_dpaa_bus_probe(void) fclose(svr_file); } + /* And initialize the PA->VA translation table */ + dpaax_iova_table_populate(); + /* For each registered driver, and device, call the driver->probe */ TAILQ_FOREACH(dev, &rte_dpaa_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_dpaa_bus.driver_list, next) { diff --git a/drivers/bus/dpaa/meson.build b/drivers/bus/dpaa/meson.build index d10b62c03..42676fbc5 100644 --- a/drivers/bus/dpaa/meson.build +++ b/drivers/bus/dpaa/meson.build @@ -5,7 +5,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev'] +deps += ['common_dpaax', 'eventdev'] sources = files('base/fman/fman.c', 'base/fman/fman_hw.c', 'base/fman/netcfg_layer.c', diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h index 15dc6a4ac..1d580a000 100644 --- a/drivers/bus/dpaa/rte_dpaa_bus.h +++ b/drivers/bus/dpaa/rte_dpaa_bus.h @@ -8,6 +8,7 @@ #include <rte_bus.h> #include <rte_mempool.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -110,6 +111,11 @@ extern struct dpaa_memseg_list rte_dpaa_memsegs; static inline void *rte_dpaa_mem_ptov(phys_addr_t paddr) { struct dpaa_memseg *ms; + void *va; + + va = dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* Check if the address is already part of the memseg list internally * maintained by the dpaa driver. 
diff --git a/drivers/crypto/dpaa_sec/Makefile b/drivers/crypto/dpaa_sec/Makefile index 9be447041..674a7a398 100644 --- a/drivers/crypto/dpaa_sec/Makefile +++ b/drivers/crypto/dpaa_sec/Makefile @@ -38,5 +38,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_DPAA_SEC) += dpaa_sec.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c index 2f0a5d285..65df12592 100644 --- a/drivers/crypto/dpaa_sec/dpaa_sec.c +++ b/drivers/crypto/dpaa_sec/dpaa_sec.c @@ -107,6 +107,12 @@ dpaa_mem_vtop(void *vaddr) static inline void * dpaa_mem_ptov(rte_iova_t paddr) { + void *va; + + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va)) + return va; + return rte_mem_iova2virt(paddr); } diff --git a/drivers/event/dpaa/Makefile b/drivers/event/dpaa/Makefile index ddd855227..6f93e7f40 100644 --- a/drivers/event/dpaa/Makefile +++ b/drivers/event/dpaa/Makefile @@ -34,5 +34,6 @@ LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs LDLIBS += -lrte_eventdev -lrte_pmd_dpaa -lrte_bus_vdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/Makefile b/drivers/mempool/dpaa/Makefile index da8da1e90..9cf36856c 100644 --- a/drivers/mempool/dpaa/Makefile +++ b/drivers/mempool/dpaa/Makefile @@ -31,5 +31,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += dpaa_mempool.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c index 1c121223b..ce3f370ff 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.c +++ b/drivers/mempool/dpaa/dpaa_mempool.c @@ -26,6 +26,7 @@ #include <rte_ring.h> #include <dpaa_mempool.h> +#include <dpaax_iova_table.h> /* List of all the memseg information locally 
maintained in dpaa driver. This * is to optimize the PA_to_VA searches until a better mechanism (algo) is @@ -280,6 +281,8 @@ dpaa_populate(struct rte_mempool *mp, unsigned int max_objs, MEMPOOL_INIT_FUNC_TRACE(); + dpaax_iova_table_add(paddr, vaddr, len); + if (!mp || !mp->pool_data) { DPAA_MEMPOOL_ERR("Invalid mempool provided\n"); return 0; diff --git a/drivers/mempool/dpaa/dpaa_mempool.h b/drivers/mempool/dpaa/dpaa_mempool.h index 092f326cb..533e1c6e2 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.h +++ b/drivers/mempool/dpaa/dpaa_mempool.h @@ -43,10 +43,8 @@ struct dpaa_bp_info { }; static inline void * -DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info, uint64_t addr) +DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info __rte_unused, uint64_t addr) { - if (bp_info->ptov_off) - return ((void *) (size_t)(addr + bp_info->ptov_off)); return rte_dpaa_mem_ptov(addr); } diff --git a/drivers/net/dpaa/Makefile b/drivers/net/dpaa/Makefile index d7a0a50c5..1c4f7d914 100644 --- a/drivers/net/dpaa/Makefile +++ b/drivers/net/dpaa/Makefile @@ -38,6 +38,7 @@ LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax # install this header file SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA_PMD)-include := rte_pmd_dpaa.h diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 899d51a23..89a008fe3 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -115,6 +115,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n) _LDLIBS-$(CONFIG_RTE_DRIVER_MEMPOOL_BUCKET) += -lrte_mempool_bucket _LDLIBS-$(CONFIG_RTE_DRIVER_MEMPOOL_STACK) += -lrte_mempool_stack ifeq ($(CONFIG_RTE_LIBRTE_DPAA_BUS),y) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += -lrte_mempool_dpaa endif ifeq ($(CONFIG_RTE_EAL_VFIO)$(CONFIG_RTE_LIBRTE_FSLMC_BUS),yy) -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
* [dpdk-dev] [PATCH 5/5] fslmc: enable dpaax library 2018-09-25 12:54 [dpdk-dev] [PATCH 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain ` (3 preceding siblings ...) 2018-09-25 12:54 ` [dpdk-dev] [PATCH 4/5] dpaa: enable dpaax library Shreyansh Jain @ 2018-09-25 12:54 ` Shreyansh Jain 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-09-25 12:54 UTC (permalink / raw) To: ferruh.yigit; +Cc: dev, anatoly.burakov, Shreyansh Jain With this patch, fslmc bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa2, event/dpaa2, net/dpaa2, raw/dpaa2_cmdif and raw/dpaa2_qdma as they are dependent on the bus/fslmc and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/fslmc/Makefile | 1 + drivers/bus/fslmc/fslmc_bus.c | 20 +++++++++++++++ drivers/bus/fslmc/meson.build | 2 +- drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c | 7 ------ drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 32 ++++++++---------------- drivers/crypto/dpaa2_sec/Makefile | 1 + drivers/event/dpaa2/Makefile | 2 ++ drivers/mempool/dpaa2/Makefile | 1 + drivers/mempool/dpaa2/dpaa2_hw_mempool.c | 29 +++------------------ drivers/net/dpaa2/Makefile | 1 + drivers/raw/dpaa2_cmdif/Makefile | 2 ++ drivers/raw/dpaa2_qdma/Makefile | 1 + mk/rte.app.mk | 1 + 13 files changed, 45 insertions(+), 55 deletions(-) diff --git a/drivers/bus/fslmc/Makefile b/drivers/bus/fslmc/Makefile index 515d0f534..c5b580a4a 100644 --- a/drivers/bus/fslmc/Makefile +++ b/drivers/bus/fslmc/Makefile @@ -19,6 +19,7 @@ CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/qbman/include CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax # versioning export map EXPORT_MAP := rte_bus_fslmc_version.map diff --git 
a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c index f5135e538..7dbe01e08 100644 --- a/drivers/bus/fslmc/fslmc_bus.c +++ b/drivers/bus/fslmc/fslmc_bus.c @@ -20,6 +20,8 @@ #include <fslmc_vfio.h> #include "fslmc_logs.h" +#include <dpaax_iova_table.h> + int dpaa2_logtype_bus; #define VFIO_IOMMU_GROUP_PATH "/sys/kernel/iommu_groups" @@ -375,6 +377,19 @@ rte_fslmc_probe(void) probe_all = rte_fslmc_bus.bus.conf.scan_mode != RTE_BUS_SCAN_WHITELIST; + /* In case of PA, the FD addresses returned by qbman APIs are physical + * addresses, which need conversion into equivalent VA address for + * rte_mbuf. For that, a table (a serial array, in memory) is used to + * increase translation efficiency. + * This has to be done before probe as some device initialization + * (during) probe allocate memory (dpaa2_sec) which needs to be pinned + * to this table. + */ + ret = dpaax_iova_table_populate(); + if (ret) { + DPAA2_BUS_WARN("PA->VA Translation table not available;"); + } + TAILQ_FOREACH(dev, &rte_fslmc_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_fslmc_bus.driver_list, next) { ret = rte_fslmc_match(drv, dev); @@ -450,6 +465,11 @@ rte_fslmc_driver_unregister(struct rte_dpaa2_driver *driver) fslmc_bus = driver->fslmc_bus; + /* Cleanup the PA->VA Translation table; from wherever this function + * is called.
+ */ + dpaax_iova_table_depopulate(); + TAILQ_REMOVE(&fslmc_bus->driver_list, driver, next); /* Update Bus references */ driver->fslmc_bus = NULL; diff --git a/drivers/bus/fslmc/meson.build b/drivers/bus/fslmc/meson.build index 22a56a6fc..49d71d2ba 100644 --- a/drivers/bus/fslmc/meson.build +++ b/drivers/bus/fslmc/meson.build @@ -5,7 +5,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev', 'kvargs'] +deps += ['common_dpaax', 'eventdev', 'kvargs'] sources = files('fslmc_bus.c', 'fslmc_vfio.c', 'mc/dpbp.c', diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c b/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c index db49d637f..39c5adf90 100644 --- a/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c +++ b/drivers/bus/fslmc/portal/dpaa2_hw_dpbp.c @@ -28,13 +28,6 @@ #include "portal/dpaa2_hw_pvt.h" #include "portal/dpaa2_hw_dpio.h" -/* List of all the memseg information locally maintained in dpaa2 driver. This - * is to optimize the PA_to_VA searches until a better mechanism (algo) is - * available. 
- */ -struct dpaa2_memseg_list rte_dpaa2_memsegs - = TAILQ_HEAD_INITIALIZER(rte_dpaa2_memsegs); - TAILQ_HEAD(dpbp_dev_list, dpaa2_dpbp_dev); static struct dpbp_dev_list dpbp_dev_list = TAILQ_HEAD_INITIALIZER(dpbp_dev_list); /*!< DPBP device list */ diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h index ec8f42806..7306d2598 100644 --- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h +++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h @@ -9,6 +9,7 @@ #define _DPAA2_HW_PVT_H_ #include <rte_eventdev.h> +#include <dpaax_iova_table.h> #include <mc/fsl_mc_sys.h> #include <fsl_qbman_portal.h> @@ -277,42 +278,29 @@ enum qbman_fd_format { */ #define DPAA2_EQ_RESP_ALWAYS 1 -/* Various structures representing contiguous memory maps */ -struct dpaa2_memseg { - TAILQ_ENTRY(dpaa2_memseg) next; - char *vaddr; - rte_iova_t iova; - size_t len; -}; - -TAILQ_HEAD(dpaa2_memseg_list, dpaa2_memseg); -extern struct dpaa2_memseg_list rte_dpaa2_memsegs; - #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA extern uint8_t dpaa2_virt_mode; static void *dpaa2_mem_ptov(phys_addr_t paddr) __attribute__((unused)); -/* todo - this is costly, need to write a fast coversion routine */ + static void *dpaa2_mem_ptov(phys_addr_t paddr) { - struct dpaa2_memseg *ms; + void *va; if (dpaa2_virt_mode) return (void *)(size_t)paddr; - /* Check if the address is already part of the memseg list internally - * maintained by the dpaa2 driver. 
- */ - TAILQ_FOREACH(ms, &rte_dpaa2_memsegs, next) { - if (paddr >= ms->iova && paddr < - ms->iova + ms->len) - return RTE_PTR_ADD(ms->vaddr, (uintptr_t)(paddr - ms->iova)); - } + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* If not, Fallback to full memseg list searching */ - return rte_mem_iova2virt(paddr); + va = rte_mem_iova2virt(paddr); + + return va; } static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused)); + static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) { const struct rte_memseg *memseg; diff --git a/drivers/crypto/dpaa2_sec/Makefile b/drivers/crypto/dpaa2_sec/Makefile index da3d8f84f..1f951a14b 100644 --- a/drivers/crypto/dpaa2_sec/Makefile +++ b/drivers/crypto/dpaa2_sec/Makefile @@ -51,5 +51,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_cryptodev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/event/dpaa2/Makefile b/drivers/event/dpaa2/Makefile index 46f7d061e..de6771551 100644 --- a/drivers/event/dpaa2/Makefile +++ b/drivers/event/dpaa2/Makefile @@ -21,6 +21,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal LDLIBS += -lrte_eal -lrte_eventdev LDLIBS += -lrte_bus_fslmc -lrte_mempool_dpaa2 -lrte_pmd_dpaa2 LDLIBS += -lrte_bus_vdev -lrte_pmd_dpaa2_sec +LDLIBS += -lrte_common_dpaax CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2 CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2/mc CFLAGS += -I$(RTE_SDK)/drivers/crypto/dpaa2_sec @@ -39,4 +40,5 @@ CFLAGS += -DALLOW_EXPERIMENTAL_API SRCS-$(CONFIG_RTE_LIBRTE_PMD_DPAA2_EVENTDEV) += dpaa2_hw_dpcon.c SRCS-$(CONFIG_RTE_LIBRTE_PMD_DPAA2_EVENTDEV) += dpaa2_eventdev.c + include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa2/Makefile b/drivers/mempool/dpaa2/Makefile index 9e4c87d79..0fc69c3bf 100644 --- a/drivers/mempool/dpaa2/Makefile +++ b/drivers/mempool/dpaa2/Makefile @@ -30,6 +30,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += 
dpaa2_hw_mempool.c LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL)-include := rte_dpaa2_mempool.h diff --git a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c index 84ff12811..e74825598 100644 --- a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c +++ b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c @@ -30,6 +30,8 @@ #include "dpaa2_hw_mempool.h" #include "dpaa2_hw_mempool_logs.h" +#include <dpaax_iova_table.h> + struct dpaa2_bp_info rte_dpaa2_bpid_info[MAX_BPID]; static struct dpaa2_bp_list *h_bp_list; @@ -393,31 +395,8 @@ dpaa2_populate(struct rte_mempool *mp, unsigned int max_objs, void *vaddr, rte_iova_t paddr, size_t len, rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg) { - struct dpaa2_memseg *ms; - - /* For each memory chunk pinned to the Mempool, a linked list of the - * contained memsegs is created for searching when PA to VA - * conversion is required. - */ - ms = rte_zmalloc(NULL, sizeof(struct dpaa2_memseg), 0); - if (!ms) { - DPAA2_MEMPOOL_ERR("Unable to allocate internal memory."); - DPAA2_MEMPOOL_WARN("Fast Physical to Virtual Addr translation would not be available."); - /* If the element is not added, it would only lead to failure - * in searching for the element and the logic would Fallback - * to traditional DPDK memseg traversal code. So, this is not - * a blocking error - but, error would be printed on screen. - */ - return 0; - } - - ms->vaddr = vaddr; - ms->iova = paddr; - ms->len = len; - /* Head insertions are generally faster than tail insertions as the - * buffers pinned are picked from rear end. 
- */ - TAILQ_INSERT_HEAD(&rte_dpaa2_memsegs, ms, next); + /* Insert entry into the PA->VA Table */ + dpaax_iova_table_add(paddr, vaddr, len); return rte_mempool_op_populate_default(mp, max_objs, vaddr, paddr, len, obj_cb, obj_cb_arg); diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile index 42d45c1a8..59f7bf4a7 100644 --- a/drivers/net/dpaa2/Makefile +++ b/drivers/net/dpaa2/Makefile @@ -41,5 +41,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/raw/dpaa2_cmdif/Makefile b/drivers/raw/dpaa2_cmdif/Makefile index 9b863dda2..83b5ecb56 100644 --- a/drivers/raw/dpaa2_cmdif/Makefile +++ b/drivers/raw/dpaa2_cmdif/Makefile @@ -21,6 +21,7 @@ LDLIBS += -lrte_eal LDLIBS += -lrte_kvargs LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_cmdif_version.map @@ -33,4 +34,5 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_DPAA2_CMDIF_RAWDEV) += dpaa2_cmdif.c SYMLINK-$(CONFIG_RTE_LIBRTE_PMD_DPAA2_CMDIF_RAWDEV)-include += rte_pmd_dpaa2_cmdif.h + include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/raw/dpaa2_qdma/Makefile b/drivers/raw/dpaa2_qdma/Makefile index d88809ead..2f79a3f41 100644 --- a/drivers/raw/dpaa2_qdma/Makefile +++ b/drivers/raw/dpaa2_qdma/Makefile @@ -22,6 +22,7 @@ LDLIBS += -lrte_mempool LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev LDLIBS += -lrte_ring +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_qdma_version.map diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 89a008fe3..abfbe387c 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -119,6 +119,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += -lrte_mempool_dpaa endif ifeq ($(CONFIG_RTE_EAL_VFIO)$(CONFIG_RTE_LIBRTE_FSLMC_BUS),yy) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += 
-lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += -lrte_mempool_dpaa2 endif -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
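The lookup order that dpaa2_mem_ptov() and rte_dpaa_mem_ptov() follow in these patches (try the fast table first, fall back to the full memseg search only on a miss) can be sketched in isolation. This is a minimal illustration, not the library code: every name below (pa_range, fast_table, slow_list, sketch_mem_ptov) is hypothetical, the "slow list" merely stands in for the rte_mem_iova2virt() walk, and the physical addresses are made up.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins; none of these names come from the patch. */
struct pa_range {
	uint64_t pa;   /* start of a physically contiguous chunk */
	size_t len;    /* chunk length in bytes */
	uint8_t *va;   /* virtual address that 'pa' is mapped to */
};

#define SKETCH_DIM(a) (sizeof(a) / sizeof((a)[0]))

static uint8_t chunk_a[4096];
static uint8_t chunk_b[4096];

/* Fast table: knows only chunk_a, as if only that chunk had been pinned. */
static const struct pa_range fast_table[] = {
	{ 0x80000000ULL, sizeof(chunk_a), chunk_a },
};

/* Full list: plays the role of the slower, complete memseg search. */
static const struct pa_range slow_list[] = {
	{ 0x80000000ULL, sizeof(chunk_a), chunk_a },
	{ 0x90000000ULL, sizeof(chunk_b), chunk_b },
};

/* Counts how often the fallback was needed (for illustration only). */
static unsigned int slow_path_hits;

static void *
range_lookup(const struct pa_range *tbl, size_t n, uint64_t pa)
{
	size_t i;

	assert(tbl != NULL);
	for (i = 0; i < n; i++) {
		/* Written as a subtraction so that pa + len cannot overflow */
		if (pa >= tbl[i].pa && pa - tbl[i].pa < tbl[i].len)
			return tbl[i].va + (pa - tbl[i].pa);
	}
	return NULL;
}

/* Fast table first; only a miss pays for the full search. */
static void *
sketch_mem_ptov(uint64_t pa)
{
	void *va = range_lookup(fast_table, SKETCH_DIM(fast_table), pa);

	if (va != NULL)
		return va;

	slow_path_hits++;
	return range_lookup(slow_list, SKETCH_DIM(slow_list), pa);
}
```

A miss in the fast table is not an error in this arrangement; it only costs the slower search, which mirrors the non-blocking behaviour the patch relies on when an address was never added to the table.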
* [dpdk-dev] [PATCH v2 0/5] Add a PA-VA Translation table for DPAAx
  2018-09-25 12:54 [dpdk-dev] [PATCH 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
                   ` (4 preceding siblings ...)
  2018-09-25 12:54 ` [dpdk-dev] [PATCH 5/5] fslmc: " Shreyansh Jain
@ 2018-10-09 11:25 ` Shreyansh Jain
  2018-10-09 11:25   ` [dpdk-dev] [PATCH v2 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
                     ` (5 more replies)
  5 siblings, 6 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-09 11:25 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: anatoly.burakov, dev, Shreyansh Jain

::Background::
After the restructuring of memory in the last release(s), one of the major
impacts on the fslmc/dpaa buses (and their devices) was a performance drop
when using physical addressing.

Previously, it was assumed that the physical range was contiguous for any
given request for hugepage memory. That way, whenever a virtual address was
returned, it was easy to fetch the physical equivalent in almost constant
time. But, with the memory hotplug series, that assumption was negated.
Every call that device drivers made to rte_mem_virt2iova or rte_mem_virt2phy
became expensive. (Using IOVA_CONTIG is an app dependency, which is not a
practical option.)

For fslmc, working on Physical or Virtual (IOMMU supported) addresses is
optional. For the dpaa bus, it is not optional and only physical addressing
is supported. Thus, it impacted the dpaa bus the most.

::DPAAX PA-VA Table::
- A simple table containing entries for all physical memory ranges available
  on a particular SoC (in this case, NXP's LS104x and LS20xx series, which
  are handled by the dpaa and fslmc bus, respectively).
  As of now, fetching the range is SoC dependent.
- We populate the table either through the mempool handler (for mempool
  pinned memory) or through the memory event callbacks (for cases where
  working memory is allocated by the application).
- Though the aim is only to translate addresses for descriptors which are
  Rx'd from devices, this is a generic layer which should work in other
  cases as well (though that is not the target of current testing).

::About patches::
Patch 1: There was an issue in the existing PA/VA mode reporting done by the
         fslmc bus. This patch fixes it.
Patch 2: Common libraries/components can be a dependency for the bus, thus
         blocking parallel compilation.
Patch 3: Add the library in common/dpaax. This is a single patch as the
         functions are mostly inter-linked.
Patch 4~5: Add support in the dpaa and fslmc bus, respectively. It is not
         possible to unlink the bus and device drivers, thus these patches
         make a blanket change across all drivers.

::Next Steps::
- Some optimizations are required to tune the access pattern of the table.
  These would be posted as additional patches.
- In case there is any possible split of patches, I will post another
  version. But until then, this is the layout.

::Version History::
v1->v2:
- Rework of review comments on v1
- Removed dpaax_iova_table_del API - that is redundant
- Changed dpaax_iova_table_add to dpaax_iova_table_update to make it more
  relevant
- Previous patch removed an advertised API (rte_dpaa2_memsegs). This is
  fixed. A deprecation notice would now be sent for removal in the next
  release.
- Rebase on master (5f73c2670f); also verified on net-next/master
  (317f8b01f)

Shreyansh Jain (5):
  bus/fslmc: fix physical addressing check
  drivers: common as dependency for bus
  common/dpaax: add library for PA VA translation table
  dpaa: enable dpaax library
  fslmc: enable dpaax library

 config/common_base                            |   5 +
 config/common_linuxapp                        |   5 +
 drivers/Makefile                              |   1 +
 drivers/bus/dpaa/Makefile                     |   1 +
 drivers/bus/dpaa/dpaa_bus.c                   |   4 +
 drivers/bus/dpaa/meson.build                  |   2 +-
 drivers/bus/dpaa/rte_dpaa_bus.h               |   6 +
 drivers/bus/fslmc/Makefile                    |   1 +
 drivers/bus/fslmc/fslmc_bus.c                 |  24 +
 drivers/bus/fslmc/meson.build                 |   2 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h       |  21 +-
 drivers/common/Makefile                       |   4 +
 drivers/common/dpaax/Makefile                 |  31 ++
 drivers/common/dpaax/dpaax_iova_table.c       | 456 ++++++++++++++++++
 drivers/common/dpaax/dpaax_iova_table.h       | 103 ++++
 drivers/common/dpaax/dpaax_logs.h             |  39 ++
 drivers/common/dpaax/meson.build              |  12 +
 .../common/dpaax/rte_common_dpaax_version.map |  11 +
 drivers/common/meson.build                    |   2 +-
 drivers/crypto/dpaa2_sec/Makefile             |   1 +
 drivers/crypto/dpaa_sec/Makefile              |   1 +
 drivers/crypto/dpaa_sec/dpaa_sec.c            |   6 +
 drivers/event/dpaa/Makefile                   |   1 +
 drivers/event/dpaa2/Makefile                  |   1 +
 drivers/mempool/dpaa/Makefile                 |   1 +
 drivers/mempool/dpaa/dpaa_mempool.c           |   4 +
 drivers/mempool/dpaa/dpaa_mempool.h           |   4 +-
 drivers/mempool/dpaa2/Makefile                |   1 +
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c      |  29 +-
 drivers/net/dpaa/Makefile                     |   1 +
 drivers/net/dpaa2/Makefile                    |   1 +
 drivers/raw/dpaa2_cmdif/Makefile              |   1 +
 drivers/raw/dpaa2_qdma/Makefile               |   1 +
 mk/rte.app.mk                                 |   2 +
 34 files changed, 743 insertions(+), 42 deletions(-)
 create mode 100644 drivers/common/dpaax/Makefile
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.c
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.h
 create mode 100644 drivers/common/dpaax/dpaax_logs.h
 create mode 100644 drivers/common/dpaax/meson.build
 create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map

-- 
2.17.1

^ permalink raw reply
[flat|nested] 53+ messages in thread
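To make the cover letter's idea concrete, here is a minimal sketch of a PA->VA table that is filled from populate-style callbacks and then queried in constant time. It is an illustration under assumptions, not the dpaax implementation: it assumes a single physically contiguous region, a fixed 2 MB page granularity, and a table of only eight slots; all names (sketch_table, sketch_table_update, sketch_table_get_va) and addresses are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 21                       /* 2 MB pages: an assumption */
#define SKETCH_PAGE_SIZE  (1ULL << SKETCH_PAGE_SHIFT)
#define SKETCH_NUM_PAGES  8                        /* tiny table, for illustration */

struct sketch_table {
	uint64_t start;                     /* first physical address covered */
	uintptr_t pages[SKETCH_NUM_PAGES];  /* VA of each page base, 0 = unknown */
};

/* One region starting at a made-up, page-aligned physical address. */
static struct sketch_table tbl = { .start = 0x80000000ULL };

/* Record that [pa, pa + len) is mapped at [va, va + len).
 * Returns 0 on success, -1 if the range is not covered by the table.
 */
static int
sketch_table_update(uint64_t pa, uintptr_t va, uint64_t len)
{
	uint64_t end = tbl.start + SKETCH_NUM_PAGES * SKETCH_PAGE_SIZE;
	uint64_t first, off;

	assert((tbl.start & (SKETCH_PAGE_SIZE - 1)) == 0);

	if (len == 0 || pa < tbl.start || pa >= end || len > end - pa)
		return -1;

	/* Step back to the page boundary so each slot stores a page-base VA */
	first = pa & ~(SKETCH_PAGE_SIZE - 1);
	va -= (uintptr_t)(pa - first);

	for (off = first; off < pa + len; off += SKETCH_PAGE_SIZE) {
		tbl.pages[(off - tbl.start) >> SKETCH_PAGE_SHIFT] = va;
		va += (uintptr_t)SKETCH_PAGE_SIZE;
	}
	return 0;
}

/* Constant-time lookup: one subtraction, one shift, one array read.
 * Returns 0 when the address is outside the table or not yet populated.
 */
static uintptr_t
sketch_table_get_va(uint64_t pa)
{
	uint64_t idx;

	if (pa < tbl.start)
		return 0;

	idx = (pa - tbl.start) >> SKETCH_PAGE_SHIFT;
	if (idx >= SKETCH_NUM_PAGES || tbl.pages[idx] == 0)
		return 0;

	return tbl.pages[idx] + (uintptr_t)(pa & (SKETCH_PAGE_SIZE - 1));
}
```

The design choice illustrated here is trading memory (one slot per page of the physical range) for a lookup that does not depend on how many chunks were registered; a caller that gets 0 back would fall through to the regular EAL translation.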
* [dpdk-dev] [PATCH v2 1/5] bus/fslmc: fix physical addressing check 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain @ 2018-10-09 11:25 ` Shreyansh Jain 2018-10-12 9:01 ` Pavan Nikhilesh 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 2/5] drivers: common as dependency for bus Shreyansh Jain ` (4 subsequent siblings) 5 siblings, 1 reply; 53+ messages in thread From: Shreyansh Jain @ 2018-10-09 11:25 UTC (permalink / raw) To: ferruh.yigit; +Cc: anatoly.burakov, dev, Shreyansh Jain, hemant.agrawal In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported class is RTE_IOVA_PA. Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA") Cc: hemant.agrawal@nxp.com Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/fslmc/fslmc_bus.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c index bfe81e236..a4f9a9eee 100644 --- a/drivers/bus/fslmc/fslmc_bus.c +++ b/drivers/bus/fslmc/fslmc_bus.c @@ -491,6 +491,10 @@ rte_dpaa2_get_iommu_class(void) bool is_vfio_noiommu_enabled = 1; bool has_iova_va; +#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA + return RTE_IOVA_PA; +#endif + if (TAILQ_EMPTY(&rte_fslmc_bus.device_list)) return RTE_IOVA_DC; -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/5] bus/fslmc: fix physical addressing check 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain @ 2018-10-12 9:01 ` Pavan Nikhilesh 2018-10-12 10:44 ` Shreyansh Jain 0 siblings, 1 reply; 53+ messages in thread From: Pavan Nikhilesh @ 2018-10-12 9:01 UTC (permalink / raw) To: Shreyansh Jain, anatoly.burakov, hemant.agrawal; +Cc: jkollanukkaran, dev Hi Shreyansh, On Tue, Oct 09, 2018 at 04:55:44PM +0530, Shreyansh Jain wrote: > In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported > class is RTE_IOVA_PA. > > Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA") > Cc: hemant.agrawal@nxp.com > > Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> > --- > drivers/bus/fslmc/fslmc_bus.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c > index bfe81e236..a4f9a9eee 100644 > --- a/drivers/bus/fslmc/fslmc_bus.c > +++ b/drivers/bus/fslmc/fslmc_bus.c > @@ -491,6 +491,10 @@ rte_dpaa2_get_iommu_class(void) > bool is_vfio_noiommu_enabled = 1; > bool has_iova_va; > > +#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA > + return RTE_IOVA_PA; > +#endif > + As, RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is set to true by default[1] and fslmc bus being always registered[2] irrespective of the underlying platform, the IOVA class will be always returned as PA. This will break multiple platforms as some work only when IOVA as VA. 
I think you need to verify if the underlying platform is really FLMC similar to DPAA[3] [1] ->[master]ltp-pvn[dpdk] $ grep -nir "RTE_LIBRTE_DPAA2_USE_PHYS_IOVA" config/ config/meson.build:86:dpdk_conf.set('RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', true) config/common_base:218:CONFIG_RTE_LIBRTE_DPAA2_USE_PHYS_IOVA=y [2] config/common_linuxapp:45:CONFIG_RTE_LIBRTE_FSLMC_BUS=y [3] static enum rte_iova_mode rte_dpaa_get_iommu_class(void) { if ((access(DPAA_DEV_PATH1, F_OK) != 0) && (access(DPAA_DEV_PATH2, F_OK) != 0)) { return RTE_IOVA_DC; } return RTE_IOVA_PA; } > if (TAILQ_EMPTY(&rte_fslmc_bus.device_list)) > return RTE_IOVA_DC; > > -- > 2.17.1 > Thanks, Pavan. ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/5] bus/fslmc: fix physical addressing check 2018-10-12 9:01 ` Pavan Nikhilesh @ 2018-10-12 10:44 ` Shreyansh Jain 2018-10-12 16:29 ` Pavan Nikhilesh 0 siblings, 1 reply; 53+ messages in thread From: Shreyansh Jain @ 2018-10-12 10:44 UTC (permalink / raw) To: Pavan Nikhilesh; +Cc: anatoly.burakov, hemant.agrawal, jkollanukkaran, dev On Friday 12 October 2018 02:31 PM, Pavan Nikhilesh wrote: > Hi Shreyansh, > > On Tue, Oct 09, 2018 at 04:55:44PM +0530, Shreyansh Jain wrote: >> In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported >> class is RTE_IOVA_PA. >> >> Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA") >> Cc: hemant.agrawal@nxp.com >> >> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> >> --- >> drivers/bus/fslmc/fslmc_bus.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c >> index bfe81e236..a4f9a9eee 100644 >> --- a/drivers/bus/fslmc/fslmc_bus.c >> +++ b/drivers/bus/fslmc/fslmc_bus.c >> @@ -491,6 +491,10 @@ rte_dpaa2_get_iommu_class(void) >> bool is_vfio_noiommu_enabled = 1; >> bool has_iova_va; >> >> +#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA >> + return RTE_IOVA_PA; >> +#endif >> + > > As, RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is set to true by default[1] and fslmc bus > being always registered[2] irrespective of the underlying platform, the IOVA class > will be always returned as PA. > This will break multiple platforms as some work only when IOVA as VA. I think > you need to verify if the underlying platform is really FLMC similar to DPAA[3] Thats a good catch and bad patch from me :( - Thanks for review. 
I will do this now: ---->8--- static enum rte_iova_mode rte_dpaa2_get_iommu_class(void) { bool is_vfio_noiommu_enabled = 1; bool has_iova_va; - #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA - return RTE_IOVA_PA; - #endif if (TAILQ_EMPTY(&rte_fslmc_bus.device_list)) return RTE_IOVA_DC; + #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA + return RTE_IOVA_PA; + #endif ---->8--- In this case, in case no FSLMC device is detected (which would be cases you are referring to), DC would be returned. There is no other explicit way for me to check the PA/VA combination on the DPAA2 bus. Even for the DPAA function [3]that you have mentioned, that is not actually checking PA/VA applicability - it is just checking if we have DPAA enabled or not (complete bus). Is that OK? > > [1] > ->[master]ltp-pvn[dpdk] $ grep -nir "RTE_LIBRTE_DPAA2_USE_PHYS_IOVA" config/ > config/meson.build:86:dpdk_conf.set('RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', true) > config/common_base:218:CONFIG_RTE_LIBRTE_DPAA2_USE_PHYS_IOVA=y > > [2] > config/common_linuxapp:45:CONFIG_RTE_LIBRTE_FSLMC_BUS=y > > [3] > static enum rte_iova_mode > rte_dpaa_get_iommu_class(void) > { > if ((access(DPAA_DEV_PATH1, F_OK) != 0) && > (access(DPAA_DEV_PATH2, F_OK) != 0)) { > return RTE_IOVA_DC; > } > return RTE_IOVA_PA; > } > > >> if (TAILQ_EMPTY(&rte_fslmc_bus.device_list)) >> return RTE_IOVA_DC; >> ^ permalink raw reply [flat|nested] 53+ messages in thread
* Re: [dpdk-dev] [PATCH v2 1/5] bus/fslmc: fix physical addressing check 2018-10-12 10:44 ` Shreyansh Jain @ 2018-10-12 16:29 ` Pavan Nikhilesh 0 siblings, 0 replies; 53+ messages in thread From: Pavan Nikhilesh @ 2018-10-12 16:29 UTC (permalink / raw) To: Shreyansh Jain, anatoly.burakov, hemant.agrawal, jkollanukkaran; +Cc: dev On Fri, Oct 12, 2018 at 04:14:36PM +0530, Shreyansh Jain wrote: > On Friday 12 October 2018 02:31 PM, Pavan Nikhilesh wrote: > > Hi Shreyansh, > > > > On Tue, Oct 09, 2018 at 04:55:44PM +0530, Shreyansh Jain wrote: > > > In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported > > > class is RTE_IOVA_PA. > > > > > > Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA") > > > Cc: hemant.agrawal@nxp.com > > > > > > Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> > > > --- > > > drivers/bus/fslmc/fslmc_bus.c | 4 ++++ > > > 1 file changed, 4 insertions(+) > > > > > > diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c > > > index bfe81e236..a4f9a9eee 100644 > > > --- a/drivers/bus/fslmc/fslmc_bus.c > > > +++ b/drivers/bus/fslmc/fslmc_bus.c > > > @@ -491,6 +491,10 @@ rte_dpaa2_get_iommu_class(void) > > > bool is_vfio_noiommu_enabled = 1; > > > bool has_iova_va; > > > > > > +#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA > > > + return RTE_IOVA_PA; > > > +#endif > > > + > > > > As, RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is set to true by default[1] and fslmc bus > > being always registered[2] irrespective of the underlying platform, the IOVA class > > will be always returned as PA. > > This will break multiple platforms as some work only when IOVA as VA. I think > > you need to verify if the underlying platform is really FLMC similar to DPAA[3] > > Thats a good catch and bad patch from me :( - Thanks for review. 
> I will do this now: > > ---->8--- > static enum rte_iova_mode > rte_dpaa2_get_iommu_class(void) > { > bool is_vfio_noiommu_enabled = 1; > bool has_iova_va; > > - #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA > - return RTE_IOVA_PA; > - #endif > > if (TAILQ_EMPTY(&rte_fslmc_bus.device_list)) > return RTE_IOVA_DC; > > + #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA > + return RTE_IOVA_PA; > + #endif > ---->8--- > > In this case, in case no FSLMC device is detected (which would be cases > you are referring to), DC would be returned. > > There is no other explicit way for me to check the PA/VA combination on > the DPAA2 bus. Even for the DPAA function [3]that you have mentioned, > that is not actually checking PA/VA applicability - it is just checking > if we have DPAA enabled or not (complete bus). > > Is that OK? Looks good to me :) cheers - Pavan. > > > > > [1] > > ->[master]ltp-pvn[dpdk] $ grep -nir "RTE_LIBRTE_DPAA2_USE_PHYS_IOVA" config/ > > config/meson.build:86:dpdk_conf.set('RTE_LIBRTE_DPAA2_USE_PHYS_IOVA', true) > > config/common_base:218:CONFIG_RTE_LIBRTE_DPAA2_USE_PHYS_IOVA=y > > > > [2] > > config/common_linuxapp:45:CONFIG_RTE_LIBRTE_FSLMC_BUS=y > > > > [3] > > static enum rte_iova_mode > > rte_dpaa_get_iommu_class(void) > > { > > if ((access(DPAA_DEV_PATH1, F_OK) != 0) && > > (access(DPAA_DEV_PATH2, F_OK) != 0)) { > > return RTE_IOVA_DC; > > } > > return RTE_IOVA_PA; > > } > > > > > > > if (TAILQ_EMPTY(&rte_fslmc_bus.device_list)) > > > return RTE_IOVA_DC; > > > > ^ permalink raw reply [flat|nested] 53+ messages in thread
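The ordering agreed on in this exchange (report "don't care" when the bus has no devices, and only then let the physical-addressing build option force PA) can be sketched as a standalone function. This is a simplified model, not the fslmc code: the enum, the SKETCH_USE_PHYS_IOVA macro (standing in for RTE_LIBRTE_DPAA2_USE_PHYS_IOVA) and the two parameters are hypothetical, and the VFIO/no-IOMMU checks of the real function are reduced to a single flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the DC/PA/VA choices discussed in the thread. */
enum sketch_iova_mode {
	SKETCH_IOVA_DC, /* don't care: bus has no opinion */
	SKETCH_IOVA_PA, /* physical addressing required */
	SKETCH_IOVA_VA  /* virtual addressing possible */
};

/* Plays the role of the compile-time physical addressing option. */
#define SKETCH_USE_PHYS_IOVA 1

static enum sketch_iova_mode
sketch_get_iommu_class(unsigned int num_devices, bool has_iova_va)
{
	/* No device was scanned on this bus, so this platform is not ours:
	 * stay neutral and let the other buses decide.
	 */
	if (num_devices == 0)
		return SKETCH_IOVA_DC;

#ifdef SKETCH_USE_PHYS_IOVA
	/* Devices exist and the build asks for physical addressing. */
	(void)has_iova_va;
	return SKETCH_IOVA_PA;
#else
	return has_iova_va ? SKETCH_IOVA_VA : SKETCH_IOVA_PA;
#endif
}
```

With the check in the original position (before the empty-list test), the first call in a usage such as `sketch_get_iommu_class(0, true)` would have returned PA even on a platform without any such device, which is the breakage the review pointed out.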
* [dpdk-dev] [PATCH v2 2/5] drivers: common as dependency for bus
  2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
  2018-10-09 11:25   ` [dpdk-dev] [PATCH v2 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
@ 2018-10-09 11:25   ` Shreyansh Jain
  2018-10-09 11:25   ` [dpdk-dev] [PATCH v2 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-09 11:25 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: anatoly.burakov, dev, Shreyansh Jain

Prior to this patch, bus and common were compiled in parallel. With this
patch, bus depends on common and is built after it. This is especially
important for the DPAA/FSLMC buses, which are going to use the common/dpaax
library.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/Makefile b/drivers/Makefile
index 75660765e..7d5da5d9f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -5,6 +5,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-y += common
 DIRS-y += bus
+DEPDIRS-bus := common
 DIRS-y += mempool
 DEPDIRS-mempool := common bus
 DIRS-y += net
-- 
2.17.1

^ permalink raw reply	[flat|nested] 53+ messages in thread
* [dpdk-dev] [PATCH v2 3/5] common/dpaax: add library for PA VA translation table
  2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
  2018-10-09 11:25   ` [dpdk-dev] [PATCH v2 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
  2018-10-09 11:25   ` [dpdk-dev] [PATCH v2 2/5] drivers: common as dependency for bus Shreyansh Jain
@ 2018-10-09 11:25   ` Shreyansh Jain
  2018-10-09 11:25   ` [dpdk-dev] [PATCH v2 4/5] dpaa: enable dpaax library Shreyansh Jain
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-09 11:25 UTC (permalink / raw)
  To: ferruh.yigit; +Cc: anatoly.burakov, dev, Shreyansh Jain

A common library, valid for dpaaX drivers, which is used to maintain a
local copy of PA->VA translations.

In case of physical addressing mode (one of the options for FSLMC, and the
only option for the DPAA bus), the addresses of descriptors Rx'd are
physical. These need to be converted into the equivalent VA for rte_mbuf
and other similar calls.

Using rte_mem_virt2iova or rte_mem_virt2phy is expensive. This library is
an attempt to reduce the overall cost associated with this translation.

A small table is maintained, containing consecutive entries that together
represent a contiguous physical range. Each of these entries stores the
equivalent VA, which is fed in during mempool creation or from the memory
allocation/deallocation callbacks.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- config/common_base | 5 + config/common_linuxapp | 5 + drivers/common/Makefile | 4 + drivers/common/dpaax/Makefile | 31 ++ drivers/common/dpaax/dpaax_iova_table.c | 456 ++++++++++++++++++ drivers/common/dpaax/dpaax_iova_table.h | 103 ++++ drivers/common/dpaax/dpaax_logs.h | 39 ++ drivers/common/dpaax/meson.build | 12 + .../common/dpaax/rte_common_dpaax_version.map | 11 + drivers/common/meson.build | 2 +- 10 files changed, 667 insertions(+), 1 deletion(-) create mode 100644 drivers/common/dpaax/Makefile create mode 100644 drivers/common/dpaax/dpaax_iova_table.c create mode 100644 drivers/common/dpaax/dpaax_iova_table.h create mode 100644 drivers/common/dpaax/dpaax_logs.h create mode 100644 drivers/common/dpaax/meson.build create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map diff --git a/config/common_base b/config/common_base index acc5211bc..cf4e28a95 100644 --- a/config/common_base +++ b/config/common_base @@ -138,6 +138,11 @@ CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE=n # CONFIG_RTE_ETHDEV_TX_PREPARE_NOOP=n +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=n + # # Compile the Intel FPGA bus # diff --git a/config/common_linuxapp b/config/common_linuxapp index 9c5ea9d89..57a847f3e 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -29,6 +29,11 @@ CONFIG_RTE_PROC_INFO=y CONFIG_RTE_LIBRTE_VMBUS=y CONFIG_RTE_LIBRTE_NETVSC_PMD=y +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=y + # NXP DPAA BUS and drivers CONFIG_RTE_LIBRTE_DPAA_BUS=y CONFIG_RTE_LIBRTE_DPAA_MEMPOOL=y diff --git a/drivers/common/Makefile b/drivers/common/Makefile index 5f72da0ed..1a5a6706c 100644 --- a/drivers/common/Makefile +++ b/drivers/common/Makefile @@ -12,4 +12,8 @@ ifeq ($(CONFIG_RTE_LIBRTE_MVPP2_PMD),y) DIRS-y += mvep endif +ifeq ($(CONFIG_RTE_LIBRTE_COMMON_DPAAX),y) +DIRS-y += dpaax +endif + include $(RTE_SDK)/mk/rte.subdir.mk diff --git 
a/drivers/common/dpaax/Makefile b/drivers/common/dpaax/Makefile new file mode 100644 index 000000000..94d2cf0ce --- /dev/null +++ b/drivers/common/dpaax/Makefile @@ -0,0 +1,31 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright 2018 NXP +# + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# library name +# +LIB = librte_common_dpaax.a + +CFLAGS += -DALLOW_EXPERIMENTAL_API +CFLAGS += -O3 +CFLAGS += $(WERROR_FLAGS) + +# versioning export map +EXPORT_MAP := rte_common_dpaax_version.map + +# library version +LIBABIVER := 1 + +# +# all source are stored in SRCS-y +# +SRCS-y += dpaax_iova_table.c + +LDLIBS += -lrte_eal + +SYMLINK-y-include += dpaax_iova_table.h + +include $(RTE_SDK)/mk/rte.lib.mk \ No newline at end of file diff --git a/drivers/common/dpaax/dpaax_iova_table.c b/drivers/common/dpaax/dpaax_iova_table.c new file mode 100644 index 000000000..05af78629 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.c @@ -0,0 +1,456 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#include <rte_memory.h> + +#include "dpaax_iova_table.h" +#include "dpaax_logs.h" + +/* Global dpaax logger identifier */ +int dpaax_logger; + +/* Global table reference */ +struct dpaax_iova_table *dpaax_iova_table_p; + +static int dpaax_handle_memevents(void); + +/* A structure representing the device-tree node available in /proc/device-tree. + */ +struct reg_node { + phys_addr_t addr; + size_t len; +}; + +/* A ntohll equivalent routine + * XXX: This is only applicable for 64 bit environment. 
+ */ +static void +rotate_8(unsigned char *arr) +{ + uint32_t temp; + uint32_t *first_half; + uint32_t *second_half; + + first_half = (uint32_t *)(arr); + second_half = (uint32_t *)(arr + 4); + + temp = *first_half; + *first_half = *second_half; + *second_half = temp; + + *first_half = ntohl(*first_half); + *second_half = ntohl(*second_half); +} + +/* read_memory_nodes + * Memory layout for DPAAx platforms (LS1043, LS1046, LS1088, LS2088, LX2160) + * are populated by Uboot and available in device tree: + * /proc/device-tree/memory@<address>/reg <= register. + * Entries are of the form: + * (<8 byte start addr><8 byte length>)(..more similar blocks of start,len>).. + * + * @param count + * OUT populate number of entries found in memory node + * @return + * Pointer to array of reg_node elements, count size + */ +static struct reg_node * +read_memory_node(unsigned int *count) +{ + int fd, ret, i; + unsigned int j; + glob_t result = {0}; + struct stat statbuf = {0}; + char file_data[MEM_NODE_FILE_LEN]; + struct reg_node *nodes = NULL; + + *count = 0; + + ret = glob(MEM_NODE_PATH_GLOB, 0, NULL, &result); + if (ret != 0) { + DPAAX_ERR("Unable to glob device-tree memory node: (%s)(%d)", + MEM_NODE_PATH_GLOB, ret); + goto out; + } + + if (result.gl_pathc != 1) { + /* Either more than one memory@<addr> node found, or none. + * In either case, cannot work ahead. + */ + DPAAX_ERR("Found (%zu) entries in device-tree. 
Not supported!", + result.gl_pathc); + goto out; + } + + DPAAX_DEBUG("Opening and parsing device-tree node: (%s)", + result.gl_pathv[0]); + fd = open(result.gl_pathv[0], O_RDONLY); + if (fd < 0) { + DPAAX_ERR("Unable to open the device-tree node: (%s)(fd=%d)", + MEM_NODE_PATH_GLOB, fd); + goto cleanup; + } + + /* Stat to get the file size */ + ret = fstat(fd, &statbuf); + if (ret != 0) { + DPAAX_ERR("Unable to get device-tree memory node size."); + goto cleanup; + } + + DPAAX_DEBUG("Size of device-tree mem node: %lu", statbuf.st_size); + if (statbuf.st_size > MEM_NODE_FILE_LEN) { + DPAAX_WARN("More memory nodes available than assumed."); + DPAAX_WARN("System may not work properly!"); + } + + ret = read(fd, file_data, statbuf.st_size > MEM_NODE_FILE_LEN ? + MEM_NODE_FILE_LEN : statbuf.st_size); + if (ret <= 0) { + DPAAX_ERR("Unable to read device-tree memory node: (%d)", ret); + goto cleanup; + } + + /* The reg node should be multiple of 16 bytes, 8 bytes each for addr + * and len. + */ + *count = (statbuf.st_size / 16); + if ((*count) <= 0 || (statbuf.st_size % 16 != 0)) { + DPAAX_ERR("Invalid memory node values or count. 
(size=%lu)", + statbuf.st_size); + goto cleanup; + } + + /* each entry is of 16 bytes, and size/16 is total count of entries */ + nodes = malloc(sizeof(struct reg_node) * (*count)); + if (!nodes) { + DPAAX_ERR("Failure in allocating working memory."); + goto cleanup; + } + memset(nodes, 0, sizeof(struct reg_node) * (*count)); + + for (i = 0, j = 0; i + 16 <= ret && j < (*count); i += 16, j++) { + memcpy(&nodes[j], file_data + i, 16); + /* Rotate (ntohl) each 8 byte entry */ + rotate_8((unsigned char *)(&(nodes[j].addr))); + rotate_8((unsigned char *)(&(nodes[j].len))); + } + + DPAAX_DEBUG("Device-tree memory node data:"); + while (j-- > 0) { + DPAAX_DEBUG("\n %08" PRIx64 " %08zu", nodes[j].addr, nodes[j].len); + } + +cleanup: + close(fd); + globfree(&result); +out: + return nodes; +} + +int +dpaax_iova_table_populate(void) +{ + int ret; + unsigned int i, node_count; + size_t tot_memory_size, total_table_size; + struct reg_node *nodes; + struct dpaax_iovat_element *entry; + + /* dpaax_iova_table_p is a singleton - only one instance should be + * created. + */ + if (dpaax_iova_table_p) { + DPAAX_DEBUG("Multiple allocation attempt for IOVA Table (%p)", + dpaax_iova_table_p); + /* This can be an error case as well - some path not cleaning + * up table - but, for now, it is assumed that if IOVA Table + * pointer is valid, table is allocated. 
+ */ + return 0; + } + + nodes = read_memory_node(&node_count); + if (nodes == NULL || node_count <= 0) { + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + return -1; + } + + tot_memory_size = 0; + for (i = 0; i < node_count; i++) + tot_memory_size += nodes[i].len; + + DPAAX_DEBUG("Total available PA memory size: %zu", tot_memory_size); + + /* Total table size = meta data + tot_memory_size/8 */ + total_table_size = sizeof(struct dpaax_iova_table) + + (sizeof(struct dpaax_iovat_element) * node_count) + + ((tot_memory_size / DPAAX_MEM_SPLIT) * sizeof(uint64_t)); + + /* TODO: This memory doesn't need to shared but needs to be always + * pinned to RAM (no swap out) - using hugepage rather than malloc + */ + dpaax_iova_table_p = rte_zmalloc(NULL, total_table_size, 0); + if (dpaax_iova_table_p == NULL) { + DPAAX_WARN("Unable to allocate memory for PA->VA Table;"); + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + free(nodes); + return -1; + } + + /* Initialize table */ + dpaax_iova_table_p->count = node_count; + entry = dpaax_iova_table_p->entries; + + DPAAX_DEBUG("IOVA Table entries: (entry start = %p)", (void *)entry); + DPAAX_DEBUG("\t(entry),(start),(len),(next)"); + + for (i = 0; i < node_count; i++) { + /* dpaax_iova_table_p + * | dpaax_iova_table_p->entries + * | | + * | | + * V V + * +------+------+-------+---+----------+---------+--- + * |iova_ |entry | entry | | pages | pages | + * |table | 1 | 2 |...| entry 1 | entry2 | + * +-----'+.-----+-------+---+;---------+;--------+--- + * \ \ / / + * `~~~~~~|~~~~~>pages / + * \ / + * `~~~~~~~~~~~>pages + */ + entry[i].start = nodes[i].addr; + entry[i].len = nodes[i].len; + if (i > 0) + entry[i].pages = entry[i-1].pages + + ((entry[i-1].len/DPAAX_MEM_SPLIT)); + else + entry[i].pages = (uint64_t *)((unsigned char *)entry + + (sizeof(struct dpaax_iovat_element) * + node_count)); + + DPAAX_DEBUG("\t(%u),(%8"PRIx64"),(%8zu),(%8p)", 
+ i, entry[i].start, entry[i].len, entry[i].pages); + } + + /* Release memory associated with nodes array - not required now */ + free(nodes); + + DPAAX_DEBUG("Adding mem-event handler"); + ret = dpaax_handle_memevents(); + if (ret) { + DPAAX_ERR("Unable to add mem-event handler"); + DPAAX_WARN("Cases with non-buffer pool mem won't work!"); + } + + return 0; +} + +void +dpaax_iova_table_depopulate(void) +{ + if (dpaax_iova_table_p == NULL) + return; + + rte_free(dpaax_iova_table_p); + dpaax_iova_table_p = NULL; + + DPAAX_DEBUG("IOVA Table cleaned up"); +} + +int +dpaax_iova_table_update(phys_addr_t paddr, void *vaddr, size_t length) +{ + int found = 0; + unsigned int i; + size_t req_length = length, e_offset; + struct dpaax_iovat_element *entry; + uintptr_t align_vaddr; + phys_addr_t align_paddr; + + align_paddr = paddr & DPAAX_MEM_SPLIT_MASK; + align_vaddr = ((uintptr_t)vaddr & DPAAX_MEM_SPLIT_MASK); + + /* Check if paddr is available in table */ + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + if (align_paddr < entry[i].start) { + /* Address lower than start, but not found in previous + * iteration shouldn't exist. + */ + DPAAX_ERR("Add: Incorrect entry for PA->VA Table" + "(%"PRIu64")", paddr); + DPAAX_ERR("Add: Lowest address: %"PRIu64"", + entry[i].start); + return -1; + } + + if (align_paddr >= (entry[i].start + entry[i].len)) + continue; + + /* align_paddr >= start && align_paddr < (start + len) */ + found = 1; + + do { + e_offset = ((align_paddr - entry[i].start) / DPAAX_MEM_SPLIT); + /* TODO: What if something already exists at this + * location - is that an error? For now, ignoring the + * case. 
+ */ + entry[i].pages[e_offset] = align_vaddr; + DPAAX_DEBUG("Added: vaddr=%zu for Phy:%"PRIu64" at %zu" + " remaining len %zu", align_vaddr, + align_paddr, e_offset, req_length); + + /* Incoming request can be larger than the + * DPAAX_MEM_SPLIT size - in which case, multiple + * entries in entry->pages[] are filled up. + */ + if (req_length <= DPAAX_MEM_SPLIT) + break; + align_paddr += DPAAX_MEM_SPLIT; + align_vaddr += DPAAX_MEM_SPLIT; + req_length -= DPAAX_MEM_SPLIT; + } while (1); + + break; + } + + if (!found) { + /* There might be case where the incoming physical address is + * beyond the address discovered in the memory node of + * device-tree. Specially if some malloc'd area is used by EAL + * and the memevent handlers passes that across. But, this is + * not necessarily an error. + */ + DPAAX_DEBUG("Add: Unable to find slot for vaddr:(%p)," + " phy(%"PRIu64")", + vaddr, paddr); + return -1; + } + + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for vaddr:(%p)," + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, + vaddr, paddr, length); + return 0; +} + +/* dpaax_iova_table_dump + * Dump the table, with its entries, on screen. Only works in Debug Mode + * Not for weak hearted - the tables can get quite large + */ +void +dpaax_iova_table_dump(void) +{ + unsigned int i, j; + struct dpaax_iovat_element *entry; + + /* In case DEBUG is not enabled, some 'if' conditions might misbehave + * as they have nothing else in them except a DPAAX_DEBUG() which if + * tuned out would leave 'if' naked. 
+ */ + if (rte_log_get_global_level() < RTE_LOG_DEBUG) { + DPAAX_ERR("Set log level to Debug for PA->Table dump!"); + return; + } + + DPAAX_DEBUG(" === Start of PA->VA Translation Table ==="); + if (dpaax_iova_table_p == NULL) + DPAAX_DEBUG("\tNULL"); + + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + DPAAX_DEBUG("\t(%16i),(%16"PRIu64"),(%16zu),(%16p)", + i, entry[i].start, entry[i].len, entry[i].pages); + DPAAX_DEBUG("\t\t (PA), (VA)"); + for (j = 0; j < (entry[i].len/DPAAX_MEM_SPLIT); j++) { + if (entry[i].pages[j] == 0) + continue; + DPAAX_DEBUG("\t\t(%16"PRIx64"),(%16"PRIx64")", + (entry[i].start + ((uint64_t)j * DPAAX_MEM_SPLIT)), + entry[i].pages[j]); + } + } + DPAAX_DEBUG(" === End of PA->VA Translation Table ==="); +} + +static void +dpaax_memevent_cb(enum rte_mem_event type, const void *addr, size_t len, + void *arg __rte_unused) +{ + struct rte_memseg_list *msl; + struct rte_memseg *ms; + size_t cur_len = 0, map_len = 0; + phys_addr_t phys_addr; + void *virt_addr; + int ret = 0; + + DPAAX_DEBUG("Called with addr=%p, len=%zu", addr, len); + + msl = rte_mem_virt2memseg_list(addr); + + while (cur_len < len) { + const void *va = RTE_PTR_ADD(addr, cur_len); + + ms = rte_mem_virt2memseg(va, msl); + phys_addr = rte_mem_virt2phy(ms->addr); + virt_addr = ms->addr; + map_len = ms->len; + + DPAAX_DEBUG("Request for %s, va=%p, virt_addr=%p," + "iova=%"PRIu64", map_len=%zu", + type == RTE_MEM_EVENT_ALLOC ? + "alloc" : "dealloc", + va, virt_addr, phys_addr, map_len); + + if (type == RTE_MEM_EVENT_ALLOC) + ret = dpaax_iova_table_update(phys_addr, virt_addr, + map_len); + + if (ret != 0) { + DPAAX_ERR("PA-Table entry update failed. 
" + "Map=%d, addr=%p, len=%zu, err:(%d)", + type, va, map_len, ret); + return; + } + + cur_len += map_len; + } +} + +static int +dpaax_memevent_walk_memsegs(const struct rte_memseg_list *msl __rte_unused, + const struct rte_memseg *ms, size_t len, + void *arg __rte_unused) +{ + DPAAX_DEBUG("Walking for %p (pa=%"PRIu64") and len %zu", + ms->addr, ms->phys_addr, len); + dpaax_iova_table_update(rte_mem_virt2phy(ms->addr), ms->addr, len); + return 0; +} + +static int +dpaax_handle_memevents(void) +{ + /* First, walk through all memsegs and pin them, before installing + * handler. This assures that all memseg which have already been + * identified/allocated by EAL, are already part of PA->VA Table. This + * is especially for cases where application allocates memory before + * the EAL or this is an externally allocated memory passed to EAL. + */ + rte_memseg_contig_walk_thread_unsafe(dpaax_memevent_walk_memsegs, NULL); + + return rte_mem_event_callback_register("dpaax_memevents_cb", + dpaax_memevent_cb, NULL); +} + +RTE_INIT(dpaax_log) +{ + dpaax_logger = rte_log_register("pmd.common.dpaax"); + if (dpaax_logger >= 0) + rte_log_set_level(dpaax_logger, RTE_LOG_NOTICE); +} diff --git a/drivers/common/dpaax/dpaax_iova_table.h b/drivers/common/dpaax/dpaax_iova_table.h new file mode 100644 index 000000000..3e913ef45 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.h @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_IOVA_TABLE_H_ +#define _DPAAX_IOVA_TABLE_H_ + +#include <unistd.h> +#include <stdio.h> +#include <string.h> +#include <stdbool.h> +#include <string.h> +#include <stdlib.h> +#include <inttypes.h> +#include <sys/stat.h> +#include <sys/types.h> +#include <dirent.h> +#include <fcntl.h> +#include <glob.h> +#include <errno.h> +#include <arpa/inet.h> + +#include <rte_eal.h> +#include <rte_branch_prediction.h> +#include <rte_memory.h> +#include <rte_malloc.h> + +struct dpaax_iovat_element { + phys_addr_t 
start; /**< Start address of block of physical pages */ + size_t len; /**< Difference of end-start for quick access */ + uint64_t *pages; /**< VA for each physical page in this block */ +}; + +struct dpaax_iova_table { + unsigned int count; /**< No. of blocks of contiguous physical pages */ + struct dpaax_iovat_element entries[0]; +}; + +/* Pointer to the table, which is common for DPAA/DPAA2 and only a single + * instance is required across net/crypto/event drivers. This table is + * populated iff devices are found on the bus. + */ +extern struct dpaax_iova_table *dpaax_iova_table_p; + +/* Device tree file for memory layout is named 'memory@<addr>' where the 'addr' + * is SoC dependent, or even Uboot fixup dependent. + */ +#define MEM_NODE_PATH_GLOB "/proc/device-tree/memory[@0-9]*/reg" +/* Device file should be multiple of 16 bytes, each containing 8 byte of addr + * and its length. Assuming max of 5 entries. + */ +#define MEM_NODE_FILE_LEN ((16 * 5) + 1) + +/* Table is made up of DPAAX_MEM_SPLIT elements for each contiguous zone. This + * helps avoid separate handling for cases where more than one size of hugepage + * is supported. 
+ */ +#define DPAAX_MEM_SPLIT (1<<21) +#define DPAAX_MEM_SPLIT_MASK ~(DPAAX_MEM_SPLIT - 1) /**< Floor aligned */ +#define DPAAX_MEM_SPLIT_MASK_OFF (DPAAX_MEM_SPLIT - 1) /**< Offset */ + +/* APIs exposed */ +int dpaax_iova_table_populate(void); +void dpaax_iova_table_depopulate(void); +int dpaax_iova_table_update(phys_addr_t paddr, void *vaddr, size_t length); +void dpaax_iova_table_dump(void); + +static inline void *dpaax_iova_table_get_va(phys_addr_t paddr) __attribute__((hot)); + +static inline void * +dpaax_iova_table_get_va(phys_addr_t paddr) { + unsigned int i = 0, index; + void *vaddr = 0; + phys_addr_t paddr_align = paddr & DPAAX_MEM_SPLIT_MASK; + size_t offset = paddr & DPAAX_MEM_SPLIT_MASK_OFF; + struct dpaax_iovat_element *entry; + + entry = dpaax_iova_table_p->entries; + + do { + if (unlikely(i >= dpaax_iova_table_p->count)) + break; + + if (paddr_align < entry[i].start) { + /* Incorrect paddr; Not in memory range */ + return NULL; + } + + if (paddr_align >= (entry[i].start + entry[i].len)) { + i++; + continue; + } + + /* paddr >= entry->start && paddr < entry->(start+len) */ + index = (paddr_align - entry[i].start)/DPAAX_MEM_SPLIT; + vaddr = (void *)((uintptr_t)entry[i].pages[index] + offset); + break; + } while (1); + + return vaddr; +} + +#endif /* _DPAAX_IOVA_TABLE_H_ */ diff --git a/drivers/common/dpaax/dpaax_logs.h b/drivers/common/dpaax/dpaax_logs.h new file mode 100644 index 000000000..bf1b27cc1 --- /dev/null +++ b/drivers/common/dpaax/dpaax_logs.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_LOGS_H_ +#define _DPAAX_LOGS_H_ + +#include <rte_log.h> + +extern int dpaax_logger; + +#define DPAAX_LOG(level, fmt, args...) \ + rte_log(RTE_LOG_ ## level, dpaax_logger, "dpaax: " fmt "\n", \ + ##args) + +/* Debug logs are with Function names */ +#define DPAAX_DEBUG(fmt, args...) 
\ + rte_log(RTE_LOG_DEBUG, dpaax_logger, "dpaax: %s(): " fmt "\n", \ + __func__, ##args) + +#define DPAAX_INFO(fmt, args...) \ + DPAAX_LOG(INFO, fmt, ## args) +#define DPAAX_ERR(fmt, args...) \ + DPAAX_LOG(ERR, fmt, ## args) +#define DPAAX_WARN(fmt, args...) \ + DPAAX_LOG(WARNING, fmt, ## args) + +/* DP Logs, toggled out at compile time if level lower than current level */ +#define DPAAX_DP_LOG(level, fmt, args...) \ + RTE_LOG_DP(level, PMD, fmt, ## args) + +#define DPAAX_DP_DEBUG(fmt, args...) \ + DPAAX_DP_LOG(DEBUG, fmt, ## args) +#define DPAAX_DP_INFO(fmt, args...) \ + DPAAX_DP_LOG(INFO, fmt, ## args) +#define DPAAX_DP_WARN(fmt, args...) \ + DPAAX_DP_LOG(WARNING, fmt, ## args) + +#endif /* _DPAAX_LOGS_H_ */ diff --git a/drivers/common/dpaax/meson.build b/drivers/common/dpaax/meson.build new file mode 100644 index 000000000..98a1bdd48 --- /dev/null +++ b/drivers/common/dpaax/meson.build @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2018 NXP + +allow_experimental_apis = true + +if host_machine.system() != 'linux' + build = false +endif + +sources = files('dpaax_iova_table.c') + +cflags += ['-D_GNU_SOURCE'] diff --git a/drivers/common/dpaax/rte_common_dpaax_version.map b/drivers/common/dpaax/rte_common_dpaax_version.map new file mode 100644 index 000000000..8131c9e30 --- /dev/null +++ b/drivers/common/dpaax/rte_common_dpaax_version.map @@ -0,0 +1,11 @@ +DPDK_18.11 { + global: + + dpaax_iova_table_update; + dpaax_iova_table_depopulate; + dpaax_iova_table_dump; + dpaax_iova_table_p; + dpaax_iova_table_populate; + + local: *; +}; diff --git a/drivers/common/meson.build b/drivers/common/meson.build index f828ce7f7..0257d4d2b 100644 --- a/drivers/common/meson.build +++ b/drivers/common/meson.build @@ -2,6 +2,6 @@ # Copyright(c) 2018 Cavium, Inc std_deps = ['eal'] -drivers = ['mvep', 'octeontx', 'qat'] +drivers = ['dpaax', 'mvep', 'octeontx', 'qat'] config_flag_fmt = 'RTE_LIBRTE_@0@_COMMON' driver_name_fmt = 'rte_common_@0@' -- 2.17.1 ^ 
permalink raw reply [flat|nested] 53+ messages in thread
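For readers following the table layout in patch 3/5, a minimal standalone sketch of the same idea may help. This is not the code from the patch: the names pa_block, pa_block_init, pa_block_update, pa_table_get_va and the SPLIT constant are hypothetical, plain calloc stands in for rte_zmalloc, block start and length are assumed to be 2 MB aligned, and PA and VA are assumed to share the same offset within a 2 MB granule (as with hugepage-backed memory). Unlike the header in the patch, the sketch treats an unfilled slot as a miss and returns NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SPLIT      ((uint64_t)1 << 21)   /* 2 MB granule, mirroring DPAAX_MEM_SPLIT */
#define SPLIT_MASK (~(SPLIT - 1))        /* floor-align to a granule */

/* One contiguous physical range; pages[k] is the VA of granule k, 0 if unset. */
struct pa_block {
	uint64_t start;
	uint64_t len;
	uint64_t *pages;
};

/* Hypothetical helper: set up one block with an empty per-granule VA array. */
static int
pa_block_init(struct pa_block *b, uint64_t start, uint64_t len)
{
	assert(start % SPLIT == 0 && len % SPLIT == 0 && len > 0);
	b->start = start;
	b->len = len;
	b->pages = calloc(len / SPLIT, sizeof(uint64_t));
	return b->pages != NULL ? 0 : -1;
}

/* Record [pa, pa + len) -> [va, va + len), one slot per 2 MB granule. */
static int
pa_block_update(struct pa_block *b, uint64_t pa, uint64_t va, uint64_t len)
{
	uint64_t p = pa & SPLIT_MASK;
	uint64_t v = va & SPLIT_MASK;
	uint64_t end = pa + len;

	if (p < b->start || end > b->start + b->len)
		return -1;              /* request not covered by this block */
	for (; p < end; p += SPLIT, v += SPLIT)
		b->pages[(p - b->start) / SPLIT] = v;
	return 0;
}

/* Walk the blocks (sorted by start) and translate pa; NULL means "not known". */
static void *
pa_table_get_va(const struct pa_block *blocks, unsigned int count, uint64_t pa)
{
	unsigned int i;

	for (i = 0; i < count; i++) {
		const struct pa_block *b = &blocks[i];
		uint64_t slot;

		if (pa < b->start)
			return NULL;    /* falls in a hole before this block */
		if (pa >= b->start + b->len)
			continue;       /* past this block, try the next one */
		slot = b->pages[(pa - b->start) / SPLIT];
		if (slot == 0)
			return NULL;    /* granule never registered */
		return (void *)(uintptr_t)(slot + (pa & (SPLIT - 1)));
	}
	return NULL;
}
```

A caller would initialise one pa_block per device-tree memory range, call pa_block_update from the mempool populate hook or the memory event callback, and use pa_table_get_va on the Rx path. Returning NULL for an empty slot lets the caller fall back to a slower lookup instead of dereferencing a bogus pointer.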
* [dpdk-dev] [PATCH v2 4/5] dpaa: enable dpaax library 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain ` (2 preceding siblings ...) 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain @ 2018-10-09 11:25 ` Shreyansh Jain 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 5/5] fslmc: " Shreyansh Jain 2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-10-09 11:25 UTC (permalink / raw) To: ferruh.yigit; +Cc: anatoly.burakov, dev, Shreyansh Jain With this patch, dpaa bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa, event/dpaa and net/dpaa as they are dependent on the bus/dpaa and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/dpaa/Makefile | 1 + drivers/bus/dpaa/dpaa_bus.c | 4 ++++ drivers/bus/dpaa/meson.build | 2 +- drivers/bus/dpaa/rte_dpaa_bus.h | 6 ++++++ drivers/crypto/dpaa_sec/Makefile | 1 + drivers/crypto/dpaa_sec/dpaa_sec.c | 6 ++++++ drivers/event/dpaa/Makefile | 1 + drivers/mempool/dpaa/Makefile | 1 + drivers/mempool/dpaa/dpaa_mempool.c | 4 ++++ drivers/mempool/dpaa/dpaa_mempool.h | 4 +--- drivers/net/dpaa/Makefile | 1 + mk/rte.app.mk | 1 + 12 files changed, 28 insertions(+), 4 deletions(-) diff --git a/drivers/bus/dpaa/Makefile b/drivers/bus/dpaa/Makefile index bffaa9d92..5eb7c24db 100644 --- a/drivers/bus/dpaa/Makefile +++ b/drivers/bus/dpaa/Makefile @@ -48,5 +48,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_BUS) += \ LDLIBS += -lpthread LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c index 49cd04dbb..b373742e7 100644 --- 
a/drivers/bus/dpaa/dpaa_bus.c +++ b/drivers/bus/dpaa/dpaa_bus.c @@ -34,6 +34,7 @@ #include <rte_dpaa_bus.h> #include <rte_dpaa_logs.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -546,6 +547,9 @@ rte_dpaa_bus_probe(void) fclose(svr_file); } + /* And initialize the PA->VA translation table */ + dpaax_iova_table_populate(); + /* For each registered driver, and device, call the driver->probe */ TAILQ_FOREACH(dev, &rte_dpaa_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_dpaa_bus.driver_list, next) { diff --git a/drivers/bus/dpaa/meson.build b/drivers/bus/dpaa/meson.build index d10b62c03..42676fbc5 100644 --- a/drivers/bus/dpaa/meson.build +++ b/drivers/bus/dpaa/meson.build @@ -5,7 +5,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev'] +deps += ['common_dpaax', 'eventdev'] sources = files('base/fman/fman.c', 'base/fman/fman_hw.c', 'base/fman/netcfg_layer.c', diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h index 15dc6a4ac..1d580a000 100644 --- a/drivers/bus/dpaa/rte_dpaa_bus.h +++ b/drivers/bus/dpaa/rte_dpaa_bus.h @@ -8,6 +8,7 @@ #include <rte_bus.h> #include <rte_mempool.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -110,6 +111,11 @@ extern struct dpaa_memseg_list rte_dpaa_memsegs; static inline void *rte_dpaa_mem_ptov(phys_addr_t paddr) { struct dpaa_memseg *ms; + void *va; + + va = dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* Check if the address is already part of the memseg list internally * maintained by the dpaa driver. 
diff --git a/drivers/crypto/dpaa_sec/Makefile b/drivers/crypto/dpaa_sec/Makefile index 9be447041..674a7a398 100644 --- a/drivers/crypto/dpaa_sec/Makefile +++ b/drivers/crypto/dpaa_sec/Makefile @@ -38,5 +38,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_DPAA_SEC) += dpaa_sec.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c index 7c0459f9f..54f1913f2 100644 --- a/drivers/crypto/dpaa_sec/dpaa_sec.c +++ b/drivers/crypto/dpaa_sec/dpaa_sec.c @@ -107,6 +107,12 @@ dpaa_mem_vtop(void *vaddr) static inline void * dpaa_mem_ptov(rte_iova_t paddr) { + void *va; + + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va)) + return va; + return rte_mem_iova2virt(paddr); } diff --git a/drivers/event/dpaa/Makefile b/drivers/event/dpaa/Makefile index ddd855227..6f93e7f40 100644 --- a/drivers/event/dpaa/Makefile +++ b/drivers/event/dpaa/Makefile @@ -34,5 +34,6 @@ LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs LDLIBS += -lrte_eventdev -lrte_pmd_dpaa -lrte_bus_vdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/Makefile b/drivers/mempool/dpaa/Makefile index da8da1e90..9cf36856c 100644 --- a/drivers/mempool/dpaa/Makefile +++ b/drivers/mempool/dpaa/Makefile @@ -31,5 +31,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += dpaa_mempool.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c index 1c121223b..b05fb7b9d 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.c +++ b/drivers/mempool/dpaa/dpaa_mempool.c @@ -26,6 +26,7 @@ #include <rte_ring.h> #include <dpaa_mempool.h> +#include <dpaax_iova_table.h> /* List of all the memseg information locally 
maintained in dpaa driver. This * is to optimize the PA_to_VA searches until a better mechanism (algo) is @@ -285,6 +286,9 @@ dpaa_populate(struct rte_mempool *mp, unsigned int max_objs, return 0; } + /* Update the PA-VA Table */ + dpaax_iova_table_update(paddr, vaddr, len); + bp_info = DPAA_MEMPOOL_TO_POOL_INFO(mp); total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; diff --git a/drivers/mempool/dpaa/dpaa_mempool.h b/drivers/mempool/dpaa/dpaa_mempool.h index 092f326cb..533e1c6e2 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.h +++ b/drivers/mempool/dpaa/dpaa_mempool.h @@ -43,10 +43,8 @@ struct dpaa_bp_info { }; static inline void * -DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info, uint64_t addr) +DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info __rte_unused, uint64_t addr) { - if (bp_info->ptov_off) - return ((void *) (size_t)(addr + bp_info->ptov_off)); return rte_dpaa_mem_ptov(addr); } diff --git a/drivers/net/dpaa/Makefile b/drivers/net/dpaa/Makefile index d7a0a50c5..1c4f7d914 100644 --- a/drivers/net/dpaa/Makefile +++ b/drivers/net/dpaa/Makefile @@ -38,6 +38,7 @@ LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax # install this header file SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA_PMD)-include := rte_pmd_dpaa.h diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 32579e4b7..15097995e 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -115,6 +115,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n) _LDLIBS-$(CONFIG_RTE_DRIVER_MEMPOOL_BUCKET) += -lrte_mempool_bucket _LDLIBS-$(CONFIG_RTE_DRIVER_MEMPOOL_STACK) += -lrte_mempool_stack ifeq ($(CONFIG_RTE_LIBRTE_DPAA_BUS),y) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += -lrte_mempool_dpaa endif ifeq ($(CONFIG_RTE_EAL_VFIO)$(CONFIG_RTE_LIBRTE_FSLMC_BUS),yy) -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
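The rte_dpaa_mem_ptov() and dpaa_mem_ptov() changes in patch 4/5 both follow a "fast lookup first, slow lookup on miss" pattern. A small standalone sketch of that pattern is given here, assuming hypothetical names throughout (ptov_fn, ptov_chain, chain_ptov, and the toy translators are illustrative and not part of DPDK). The hit counters are an addition for the example only; they show how one could check that the fast path is taken for most addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical translator type: returns NULL when the address is not known. */
typedef void *(*ptov_fn)(uint64_t pa);

struct ptov_chain {
	ptov_fn fast;             /* cheap lookup, e.g. a PA->VA table */
	ptov_fn slow;             /* expensive lookup, e.g. a memseg walk */
	unsigned long fast_hits;
	unsigned long slow_hits;
};

/* Try the cheap lookup first; only a NULL result triggers the fallback. */
static void *
chain_ptov(struct ptov_chain *c, uint64_t pa)
{
	void *va;

	assert(c != NULL && c->fast != NULL && c->slow != NULL);
	va = c->fast(pa);
	if (va != NULL) {
		c->fast_hits++;
		return va;
	}
	va = c->slow(pa);
	if (va != NULL)
		c->slow_hits++;
	return va;
}

/* Toy translators for the example: the fast one knows only part of the range. */
static void *
toy_fast(uint64_t pa)
{
	return (pa >= 0x1000 && pa < 0x2000) ?
		(void *)(uintptr_t)(pa + 0x100000) : NULL;
}

static void *
toy_slow(uint64_t pa)
{
	return (pa >= 0x1000 && pa < 0x4000) ?
		(void *)(uintptr_t)(pa + 0x100000) : NULL;
}
```

The pattern only works if the fast lookup reliably reports a miss as NULL; a fast path that returns a non-NULL but wrong pointer would never reach the fallback.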
* [dpdk-dev] [PATCH v2 5/5] fslmc: enable dpaax library 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain ` (3 preceding siblings ...) 2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 4/5] dpaa: enable dpaax library Shreyansh Jain @ 2018-10-09 11:25 ` Shreyansh Jain 2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-10-09 11:25 UTC (permalink / raw) To: ferruh.yigit; +Cc: anatoly.burakov, dev, Shreyansh Jain With this patch, fslmc bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa2, event/dpaa2, net/dpaa2, raw/dpaa2_cmdif and raw/dpaa2_qdma as they are dependent on the bus/fslmc and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/fslmc/Makefile | 1 + drivers/bus/fslmc/fslmc_bus.c | 20 ++++++++++++++++ drivers/bus/fslmc/meson.build | 2 +- drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 21 ++++++++--------- drivers/crypto/dpaa2_sec/Makefile | 1 + drivers/event/dpaa2/Makefile | 1 + drivers/mempool/dpaa2/Makefile | 1 + drivers/mempool/dpaa2/dpaa2_hw_mempool.c | 29 ++++-------------------- drivers/net/dpaa2/Makefile | 1 + drivers/raw/dpaa2_cmdif/Makefile | 1 + drivers/raw/dpaa2_qdma/Makefile | 1 + mk/rte.app.mk | 1 + 12 files changed, 43 insertions(+), 37 deletions(-) diff --git a/drivers/bus/fslmc/Makefile b/drivers/bus/fslmc/Makefile index 515d0f534..c5b580a4a 100644 --- a/drivers/bus/fslmc/Makefile +++ b/drivers/bus/fslmc/Makefile @@ -19,6 +19,7 @@ CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/qbman/include CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax # versioning export map EXPORT_MAP := rte_bus_fslmc_version.map diff --git a/drivers/bus/fslmc/fslmc_bus.c 
b/drivers/bus/fslmc/fslmc_bus.c index a4f9a9eee..b6e20144d 100644 --- a/drivers/bus/fslmc/fslmc_bus.c +++ b/drivers/bus/fslmc/fslmc_bus.c @@ -20,6 +20,8 @@ #include <fslmc_vfio.h> #include "fslmc_logs.h" +#include <dpaax_iova_table.h> + int dpaa2_logtype_bus; #define VFIO_IOMMU_GROUP_PATH "/sys/kernel/iommu_groups" @@ -375,6 +377,19 @@ rte_fslmc_probe(void) probe_all = rte_fslmc_bus.bus.conf.scan_mode != RTE_BUS_SCAN_WHITELIST; + /* In case of PA, the FD addresses returned by qbman APIs are physical + * addresses, which need conversion into equivalent VA address for + * rte_mbuf. For that, a table (a serial array, in memory) is used to + * increase translation efficiency. + * This has to be done before probe as some device initialization + * during probe allocates memory (dpaa2_sec) which needs to be pinned + * to this table. + */ + ret = dpaax_iova_table_populate(); + if (ret) { + DPAA2_BUS_WARN("PA->VA Translation table not available;"); + } + TAILQ_FOREACH(dev, &rte_fslmc_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_fslmc_bus.driver_list, next) { ret = rte_fslmc_match(drv, dev); @@ -454,6 +469,11 @@ rte_fslmc_driver_unregister(struct rte_dpaa2_driver *driver) fslmc_bus = driver->fslmc_bus; + /* Clean up the PA->VA Translation table, wherever this function + * is called from. 
+ */ + dpaax_iova_table_depopulate(); + TAILQ_REMOVE(&fslmc_bus->driver_list, driver, next); /* Update Bus references */ driver->fslmc_bus = NULL; diff --git a/drivers/bus/fslmc/meson.build b/drivers/bus/fslmc/meson.build index 22a56a6fc..49d71d2ba 100644 --- a/drivers/bus/fslmc/meson.build +++ b/drivers/bus/fslmc/meson.build @@ -5,7 +5,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev', 'kvargs'] +deps += ['common_dpaax', 'eventdev', 'kvargs'] sources = files('fslmc_bus.c', 'fslmc_vfio.c', 'mc/dpbp.c', diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h index 820759360..678ee34b8 100644 --- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h +++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h @@ -9,6 +9,7 @@ #define _DPAA2_HW_PVT_H_ #include <rte_eventdev.h> +#include <dpaax_iova_table.h> #include <mc/fsl_mc_sys.h> #include <fsl_qbman_portal.h> @@ -275,28 +276,26 @@ extern struct dpaa2_memseg_list rte_dpaa2_memsegs; #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA extern uint8_t dpaa2_virt_mode; static void *dpaa2_mem_ptov(phys_addr_t paddr) __attribute__((unused)); -/* todo - this is costly, need to write a fast coversion routine */ + static void *dpaa2_mem_ptov(phys_addr_t paddr) { - struct dpaa2_memseg *ms; + void *va; if (dpaa2_virt_mode) return (void *)(size_t)paddr; - /* Check if the address is already part of the memseg list internally - * maintained by the dpaa2 driver. 
- */ - TAILQ_FOREACH(ms, &rte_dpaa2_memsegs, next) { - if (paddr >= ms->iova && paddr < - ms->iova + ms->len) - return RTE_PTR_ADD(ms->vaddr, (uintptr_t)(paddr - ms->iova)); - } + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* If not, Fallback to full memseg list searching */ - return rte_mem_iova2virt(paddr); + va = rte_mem_iova2virt(paddr); + + return va; } static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused)); + static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) { const struct rte_memseg *memseg; diff --git a/drivers/crypto/dpaa2_sec/Makefile b/drivers/crypto/dpaa2_sec/Makefile index da3d8f84f..1f951a14b 100644 --- a/drivers/crypto/dpaa2_sec/Makefile +++ b/drivers/crypto/dpaa2_sec/Makefile @@ -51,5 +51,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_cryptodev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/event/dpaa2/Makefile b/drivers/event/dpaa2/Makefile index 5e1a63200..7a71161de 100644 --- a/drivers/event/dpaa2/Makefile +++ b/drivers/event/dpaa2/Makefile @@ -21,6 +21,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal LDLIBS += -lrte_eal -lrte_eventdev LDLIBS += -lrte_bus_fslmc -lrte_mempool_dpaa2 -lrte_pmd_dpaa2 LDLIBS += -lrte_bus_vdev +LDLIBS += -lrte_common_dpaax CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2 CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2/mc diff --git a/drivers/mempool/dpaa2/Makefile b/drivers/mempool/dpaa2/Makefile index 9e4c87d79..0fc69c3bf 100644 --- a/drivers/mempool/dpaa2/Makefile +++ b/drivers/mempool/dpaa2/Makefile @@ -30,6 +30,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += dpaa2_hw_mempool.c LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL)-include := rte_dpaa2_mempool.h diff --git a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c 
index 84ff12811..c5f60c5c6 100644 --- a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c +++ b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c @@ -30,6 +30,8 @@ #include "dpaa2_hw_mempool.h" #include "dpaa2_hw_mempool_logs.h" +#include <dpaax_iova_table.h> + struct dpaa2_bp_info rte_dpaa2_bpid_info[MAX_BPID]; static struct dpaa2_bp_list *h_bp_list; @@ -393,31 +395,8 @@ dpaa2_populate(struct rte_mempool *mp, unsigned int max_objs, void *vaddr, rte_iova_t paddr, size_t len, rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg) { - struct dpaa2_memseg *ms; - - /* For each memory chunk pinned to the Mempool, a linked list of the - * contained memsegs is created for searching when PA to VA - * conversion is required. - */ - ms = rte_zmalloc(NULL, sizeof(struct dpaa2_memseg), 0); - if (!ms) { - DPAA2_MEMPOOL_ERR("Unable to allocate internal memory."); - DPAA2_MEMPOOL_WARN("Fast Physical to Virtual Addr translation would not be available."); - /* If the element is not added, it would only lead to failure - * in searching for the element and the logic would Fallback - * to traditional DPDK memseg traversal code. So, this is not - * a blocking error - but, error would be printed on screen. - */ - return 0; - } - - ms->vaddr = vaddr; - ms->iova = paddr; - ms->len = len; - /* Head insertions are generally faster than tail insertions as the - * buffers pinned are picked from rear end. 
- */ - TAILQ_INSERT_HEAD(&rte_dpaa2_memsegs, ms, next); + /* Insert entry into the PA->VA Table */ + dpaax_iova_table_update(paddr, vaddr, len); return rte_mempool_op_populate_default(mp, max_objs, vaddr, paddr, len, obj_cb, obj_cb_arg); diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile index 9b0b14331..52649a945 100644 --- a/drivers/net/dpaa2/Makefile +++ b/drivers/net/dpaa2/Makefile @@ -40,5 +40,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/raw/dpaa2_cmdif/Makefile b/drivers/raw/dpaa2_cmdif/Makefile index 9b863dda2..3c56c4b44 100644 --- a/drivers/raw/dpaa2_cmdif/Makefile +++ b/drivers/raw/dpaa2_cmdif/Makefile @@ -21,6 +21,7 @@ LDLIBS += -lrte_eal LDLIBS += -lrte_kvargs LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_cmdif_version.map diff --git a/drivers/raw/dpaa2_qdma/Makefile b/drivers/raw/dpaa2_qdma/Makefile index d88809ead..2f79a3f41 100644 --- a/drivers/raw/dpaa2_qdma/Makefile +++ b/drivers/raw/dpaa2_qdma/Makefile @@ -22,6 +22,7 @@ LDLIBS += -lrte_mempool LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev LDLIBS += -lrte_ring +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_qdma_version.map diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 15097995e..9ef18b78f 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -119,6 +119,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += -lrte_mempool_dpaa endif ifeq ($(CONFIG_RTE_EAL_VFIO)$(CONFIG_RTE_LIBRTE_FSLMC_BUS),yy) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += -lrte_mempool_dpaa2 endif -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
* [dpdk-dev] [PATCH v3 0/5] Add a PA-VA Translation table for DPAAx
  2018-10-09 11:25 ` [dpdk-dev] [PATCH v2 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
                     ` (4 preceding siblings ...)
  2018-10-09 11:25   ` [dpdk-dev] [PATCH v2 5/5] fslmc: " Shreyansh Jain
@ 2018-10-13 12:21   ` Shreyansh Jain
  2018-10-13 12:21     ` [dpdk-dev] [PATCH v3 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
                       ` (5 more replies)
  5 siblings, 6 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-13 12:21 UTC
To: ferruh.yigit, thomas; +Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain

::Background::
After the restructuring of memory in the last release(s), one of the major
impacts on the fslmc/dpaa buses (and their devices) was a performance drop
when using physical addressing. Previously, it was assumed that the physical
range was contiguous for any given request for hugepage memory. That way,
whenever a virtual address was returned, it was easy to fetch the physical
equivalent in almost constant time. But with the memory hotplug series, that
assumption was negated. Every call that device drivers made to
rte_mem_virt2iova or rte_mem_virt2phy became expensive. (Using IOVA_CONTIG is
an app dependency, which is not a practical option.)

For fslmc, working on Physical or Virtual (IOMMU supported) addresses is
optional. For the dpaa bus, it is not optional: only physical addressing is
supported. Thus, the dpaa bus was impacted the most.

::DPAAX PA-VA Table::
- A simple table containing entries for all physical memory ranges available
  on a particular SoC (in this case, NXP's LS104x and LS20xx series, which are
  handled by the dpaa and fslmc buses, respectively). As of now, fetching the
  ranges is SoC dependent.
- We populate the table either through the mempool handler (for mempool
  pinned memory) or through the memory event callbacks (for cases where
  working memory is allocated by the application).
- Though the aim is only to translate addresses for descriptors which are
  Rx'd from devices, this is a generic layer which should work in other cases
  as well (though those are not the target of current testing).

::About patches::
Patch 1: There was an issue in the existing PA/VA mode reporting done by the
         fslmc bus. This patch fixes it.
Patch 2: Common libraries/components can be a dependency for the bus, so
         common is now built before bus rather than in parallel with it.
Patch 3: Add the library in common/dpaax. This is a single patch as the
         functions are mostly inter-linked.
Patch 4~5: Add support in the dpaa and fslmc buses, respectively. It is not
         possible to unlink the bus and device drivers; thus, these patches
         make blanket changes across all drivers.

::Next Steps::
- Some optimizations are required to tune the access pattern of the table.
  These will be posted as additional patches.
- In case there is any possible split of patches, I will post another version.
  But until then, this is the layout.

::Version History::
v2->v3:
- Added back del operation (update) for mem-events, which was removed in v2
- Change IOMMU(PA) detection for FSLMC Bus (review comment: Pavan)
- Rebase on master (6673fe0ce2)

v1->v2:
- Rework of review comments on v1
- Removed dpaax_iova_table_del API - that is redundant
- Changed dpaax_iova_table_add to dpaax_iova_table_update to make it more
  relevant
- Previous patch removed an advertised API (rte_dpaa2_memsegs). This is fixed.
  A deprecation notice will now be sent for removal in the next release.
- Rebase on master (5f73c2670f); also verified on net-next/master (317f8b01f)

Shreyansh Jain (5):
  bus/fslmc: fix physical addressing check
  drivers: common as dependency for bus
  common/dpaax: add library for PA VA translation table
  dpaa: enable dpaax library
  fslmc: enable dpaax library

 config/common_base                            |   5 +
 config/common_linuxapp                        |   5 +
 drivers/Makefile                              |   1 +
 drivers/bus/dpaa/Makefile                     |   1 +
 drivers/bus/dpaa/dpaa_bus.c                   |   4 +
 drivers/bus/dpaa/meson.build                  |   2 +-
 drivers/bus/dpaa/rte_dpaa_bus.h               |   6 +
 drivers/bus/fslmc/Makefile                    |   1 +
 drivers/bus/fslmc/fslmc_bus.c                 |  24 +
 drivers/bus/fslmc/meson.build                 |   2 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h       |  21 +-
 drivers/common/Makefile                       |   4 +
 drivers/common/dpaax/Makefile                 |  31 ++
 drivers/common/dpaax/dpaax_iova_table.c       | 461 ++++++++++++++++++
 drivers/common/dpaax/dpaax_iova_table.h       | 103 ++++
 drivers/common/dpaax/dpaax_logs.h             |  39 ++
 drivers/common/dpaax/meson.build              |  12 +
 .../common/dpaax/rte_common_dpaax_version.map |  11 +
 drivers/common/meson.build                    |   2 +-
 drivers/crypto/dpaa2_sec/Makefile             |   1 +
 drivers/crypto/dpaa_sec/Makefile              |   1 +
 drivers/crypto/dpaa_sec/dpaa_sec.c            |   6 +
 drivers/event/dpaa/Makefile                   |   1 +
 drivers/event/dpaa2/Makefile                  |   1 +
 drivers/mempool/dpaa/Makefile                 |   1 +
 drivers/mempool/dpaa/dpaa_mempool.c           |   4 +
 drivers/mempool/dpaa/dpaa_mempool.h           |   4 +-
 drivers/mempool/dpaa2/Makefile                |   1 +
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c      |  29 +-
 drivers/net/dpaa/Makefile                     |   1 +
 drivers/net/dpaa2/Makefile                    |   1 +
 drivers/raw/dpaa2_cmdif/Makefile              |   1 +
 drivers/raw/dpaa2_qdma/Makefile               |   1 +
 mk/rte.app.mk                                 |   2 +
 34 files changed, 748 insertions(+), 42 deletions(-)
 create mode 100644 drivers/common/dpaax/Makefile
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.c
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.h
 create mode 100644 drivers/common/dpaax/dpaax_logs.h
 create mode 100644 drivers/common/dpaax/meson.build
 create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map

--
2.17.1
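For readers following the thread, the table idea described in the cover letter above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from patch 3/5: the `sketch_*` names and the simplified block structure are hypothetical, block start addresses are assumed to be 2 MB aligned, and a stored VA of 0 is taken to mean "no mapping". Only the 2 MB granule size is taken from the series (`DPAAX_MEM_SPLIT`).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* 2 MB granule, mirroring DPAAX_MEM_SPLIT in the series. */
#define SKETCH_SPLIT      ((uint64_t)1 << 21)
#define SKETCH_SPLIT_MASK (~(SKETCH_SPLIT - 1))

/* Hypothetical, simplified stand-in for one contiguous physical block.
 * Assumption: start is aligned to SKETCH_SPLIT.
 */
struct sketch_block {
	uint64_t start;   /* first physical address of the block */
	uint64_t len;     /* block length in bytes */
	uint64_t *pages;  /* one VA slot per granule; 0 means "not mapped" */
};

/* Return the block containing the granule-aligned address pa, or NULL. */
static struct sketch_block *
sketch_find(struct sketch_block *blocks, unsigned int count, uint64_t pa)
{
	unsigned int i;

	for (i = 0; i < count; i++) {
		if (pa >= blocks[i].start &&
		    pa < blocks[i].start + blocks[i].len)
			return &blocks[i];
	}
	return NULL;
}

/* Record vaddr for every granule in [paddr, paddr + len).
 * Passing vaddr == 0 clears the affected slots instead.
 */
static int
sketch_update(struct sketch_block *blocks, unsigned int count,
	      uint64_t paddr, uint64_t vaddr, uint64_t len)
{
	uint64_t pa = paddr & SKETCH_SPLIT_MASK;
	uint64_t va = vaddr & SKETCH_SPLIT_MASK;
	uint64_t end = paddr + len;
	struct sketch_block *b;

	assert(blocks != NULL);
	b = sketch_find(blocks, count, pa);
	if (b == NULL)
		return -1;

	for (; pa < end && pa < b->start + b->len; pa += SKETCH_SPLIT) {
		b->pages[(pa - b->start) / SKETCH_SPLIT] = va;
		if (vaddr != 0)
			va += SKETCH_SPLIT;
	}
	return 0;
}

/* Translate paddr to a VA; returns 0 when no mapping is recorded. */
static uint64_t
sketch_get_va(struct sketch_block *blocks, unsigned int count, uint64_t paddr)
{
	uint64_t pa = paddr & SKETCH_SPLIT_MASK;
	struct sketch_block *b = sketch_find(blocks, count, pa);
	uint64_t va;

	if (b == NULL)
		return 0;
	va = b->pages[(pa - b->start) / SKETCH_SPLIT];
	return va ? va + (paddr & (SKETCH_SPLIT - 1)) : 0;
}
```

The point of the layout is that a lookup costs one short scan over a handful of blocks plus one array index, instead of a walk over the memseg lists. A caller that gets 0 back would fall back to the generic translation path.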
* [dpdk-dev] [PATCH v3 1/5] bus/fslmc: fix physical addressing check
  2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
@ 2018-10-13 12:21   ` Shreyansh Jain
  2018-10-13 16:08     ` Pavan Nikhilesh
  2018-10-13 12:21   ` [dpdk-dev] [PATCH v3 2/5] drivers: common as dependency for bus Shreyansh Jain
                     ` (4 subsequent siblings)
  5 siblings, 1 reply; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-13 12:21 UTC
To: ferruh.yigit, thomas
Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain, hemant.agrawal

In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported
class is RTE_IOVA_PA.

Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA")
Cc: hemant.agrawal@nxp.com

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/bus/fslmc/fslmc_bus.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index 960f55071..19e33caf1 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -493,6 +493,10 @@ rte_dpaa2_get_iommu_class(void)
 	bool is_vfio_noiommu_enabled = 1;
 	bool has_iova_va;
 
+#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
+	return RTE_IOVA_PA;
+#endif
+
 	if (TAILQ_EMPTY(&rte_fslmc_bus.device_list))
 		return RTE_IOVA_DC;
 
--
2.17.1
* Re: [dpdk-dev] [PATCH v3 1/5] bus/fslmc: fix physical addressing check
  2018-10-13 12:21   ` [dpdk-dev] [PATCH v3 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
@ 2018-10-13 16:08     ` Pavan Nikhilesh
  2018-10-15  6:36       ` Shreyansh Jain
  0 siblings, 1 reply; 53+ messages in thread
From: Pavan Nikhilesh @ 2018-10-13 16:08 UTC
To: Shreyansh Jain, anatoly.burakov, hemant.agrawal, ferruh.yigit, thomas; +Cc: dev

On Sat, Oct 13, 2018 at 05:51:26PM +0530, Shreyansh Jain wrote:
> In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported
> class is RTE_IOVA_PA.
>
> Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA")
> Cc: hemant.agrawal@nxp.com
>
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> ---
>  drivers/bus/fslmc/fslmc_bus.c | 4 ++++
>  1 file changed, 4 insertions(+)
>
> diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
> index 960f55071..19e33caf1 100644
> --- a/drivers/bus/fslmc/fslmc_bus.c
> +++ b/drivers/bus/fslmc/fslmc_bus.c
> @@ -493,6 +493,10 @@ rte_dpaa2_get_iommu_class(void)
>  	bool is_vfio_noiommu_enabled = 1;
>  	bool has_iova_va;
>
> +#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
> +	return RTE_IOVA_PA;
> +#endif
> +

I think you forgot to move it below the device list check as discussed in
previous patchset? :).

>  	if (TAILQ_EMPTY(&rte_fslmc_bus.device_list))
>  		return RTE_IOVA_DC;
>
> --
> 2.17.1
>
* Re: [dpdk-dev] [PATCH v3 1/5] bus/fslmc: fix physical addressing check
  2018-10-13 16:08     ` Pavan Nikhilesh
@ 2018-10-15  6:36       ` Shreyansh Jain
  0 siblings, 0 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-15 6:36 UTC
To: Pavan Nikhilesh
Cc: anatoly.burakov, Hemant Agrawal, ferruh.yigit, thomas, dev

On Saturday 13 October 2018 09:38 PM, Pavan Nikhilesh wrote:
> On Sat, Oct 13, 2018 at 05:51:26PM +0530, Shreyansh Jain wrote:
>> In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported
>> class is RTE_IOVA_PA.
>>
>> Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA")
>> Cc: hemant.agrawal@nxp.com
>>
>> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
>> ---
>>  drivers/bus/fslmc/fslmc_bus.c | 4 ++++
>>  1 file changed, 4 insertions(+)
>>
>> diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
>> index 960f55071..19e33caf1 100644
>> --- a/drivers/bus/fslmc/fslmc_bus.c
>> +++ b/drivers/bus/fslmc/fslmc_bus.c
>> @@ -493,6 +493,10 @@ rte_dpaa2_get_iommu_class(void)
>>  	bool is_vfio_noiommu_enabled = 1;
>>  	bool has_iova_va;
>>
>> +#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
>> +	return RTE_IOVA_PA;
>> +#endif
>> +
>
> I think you forgot to move it below the device list check as discussed in
> previous patchset? :).

Yes, :(. Sorry. I mixed up my internal branches.

>
>>  	if (TAILQ_EMPTY(&rte_fslmc_bus.device_list))
>>  		return RTE_IOVA_DC;
>>
>> --
>> 2.17.1
>>
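The ordering point raised in this exchange can be illustrated with a small sketch. It is an assumption-laden model, not the fslmc bus code: the `sketch_*` names are hypothetical, the compile-time `RTE_LIBRTE_DPAA2_USE_PHYS_IOVA` switch is modelled as a runtime flag so the behaviour can be checked, and the final VA/PA decision is reduced to a single boolean. It shows only the order the review asks for: an empty device list yields "don't care" before the forced-PA check applies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for RTE_IOVA_DC, RTE_IOVA_PA and RTE_IOVA_VA. */
enum sketch_iova_mode {
	SKETCH_IOVA_DC,
	SKETCH_IOVA_PA,
	SKETCH_IOVA_VA
};

/* force_phys_iova models the build-time physical-addressing option;
 * devices_support_va is a simplification of the real per-device checks.
 */
static enum sketch_iova_mode
sketch_get_iommu_class(bool bus_has_devices, bool force_phys_iova,
		       bool devices_support_va)
{
	/* No devices on the bus: this bus has no preference. */
	if (!bus_has_devices)
		return SKETCH_IOVA_DC;

	/* Devices present and physical addressing forced: PA only. */
	if (force_phys_iova)
		return SKETCH_IOVA_PA;

	return devices_support_va ? SKETCH_IOVA_VA : SKETCH_IOVA_PA;
}
```

With the checks in the other order, a bus with no devices would still report PA and could needlessly constrain the mode chosen for the rest of the system.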
* [dpdk-dev] [PATCH v3 2/5] drivers: common as dependency for bus
  2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
  2018-10-13 12:21   ` [dpdk-dev] [PATCH v3 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
@ 2018-10-13 12:21   ` Shreyansh Jain
  2018-10-13 12:21   ` [dpdk-dev] [PATCH v3 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain
                     ` (3 subsequent siblings)
  5 siblings, 0 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-13 12:21 UTC
To: ferruh.yigit, thomas; +Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain

Prior to this patch, bus and common were compiled in parallel. With this
patch, bus depends on common, so common is built first. This is especially
important for the DPAA/FSLMC buses, which are going to use the
common/dpaax library.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/Makefile b/drivers/Makefile
index 75660765e..7d5da5d9f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -5,6 +5,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-y += common
 DIRS-y += bus
+DEPDIRS-bus := common
 DIRS-y += mempool
 DEPDIRS-mempool := common bus
 DIRS-y += net
--
2.17.1
* [dpdk-dev] [PATCH v3 3/5] common/dpaax: add library for PA VA translation table
  2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain
  2018-10-13 12:21   ` [dpdk-dev] [PATCH v3 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain
  2018-10-13 12:21   ` [dpdk-dev] [PATCH v3 2/5] drivers: common as dependency for bus Shreyansh Jain
@ 2018-10-13 12:21   ` Shreyansh Jain
  2018-10-13 12:21   ` [dpdk-dev] [PATCH v3 4/5] dpaa: enable dpaax library Shreyansh Jain
                     ` (2 subsequent siblings)
  5 siblings, 0 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-13 12:21 UTC
To: ferruh.yigit, thomas; +Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain

A common library, valid for dpaaX drivers, which is used to maintain
a local copy of PA->VA translations.

In physical addressing mode (one of the options for FSLMC, and the only
option for the DPAA bus), the addresses of descriptors Rx'd are physical.
These need to be converted into the equivalent VA for rte_mbuf and other
similar calls.

Using rte_mem_virt2iova or rte_mem_virt2phy is expensive. This library
is an attempt to reduce the overall cost associated with this translation.

A small table is maintained, containing consecutive entries, each
representing a contiguous physical range. Each of these entries stores
the equivalent VA, which is fed in during mempool creation or by the
memory allocation/deallocation callbacks.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- config/common_base | 5 + config/common_linuxapp | 5 + drivers/common/Makefile | 4 + drivers/common/dpaax/Makefile | 31 ++ drivers/common/dpaax/dpaax_iova_table.c | 461 ++++++++++++++++++ drivers/common/dpaax/dpaax_iova_table.h | 103 ++++ drivers/common/dpaax/dpaax_logs.h | 39 ++ drivers/common/dpaax/meson.build | 12 + .../common/dpaax/rte_common_dpaax_version.map | 11 + drivers/common/meson.build | 2 +- 10 files changed, 672 insertions(+), 1 deletion(-) create mode 100644 drivers/common/dpaax/Makefile create mode 100644 drivers/common/dpaax/dpaax_iova_table.c create mode 100644 drivers/common/dpaax/dpaax_iova_table.h create mode 100644 drivers/common/dpaax/dpaax_logs.h create mode 100644 drivers/common/dpaax/meson.build create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map diff --git a/config/common_base b/config/common_base index 8c7ead68d..7f10f7215 100644 --- a/config/common_base +++ b/config/common_base @@ -139,6 +139,11 @@ CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE=n # CONFIG_RTE_ETHDEV_TX_PREPARE_NOOP=n +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=n + # # Compile the Intel FPGA bus # diff --git a/config/common_linuxapp b/config/common_linuxapp index 485e1467d..76b884c48 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -29,6 +29,11 @@ CONFIG_RTE_PROC_INFO=y CONFIG_RTE_LIBRTE_VMBUS=y CONFIG_RTE_LIBRTE_NETVSC_PMD=y +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=y + # NXP DPAA BUS and drivers CONFIG_RTE_LIBRTE_DPAA_BUS=y CONFIG_RTE_LIBRTE_DPAA_MEMPOOL=y diff --git a/drivers/common/Makefile b/drivers/common/Makefile index b498c238f..6392a3412 100644 --- a/drivers/common/Makefile +++ b/drivers/common/Makefile @@ -14,4 +14,8 @@ ifneq (,$(findstring y,$(MVEP-y))) DIRS-y += mvep endif +ifeq ($(CONFIG_RTE_LIBRTE_COMMON_DPAAX),y) +DIRS-y += dpaax +endif + include $(RTE_SDK)/mk/rte.subdir.mk diff --git 
a/drivers/common/dpaax/Makefile b/drivers/common/dpaax/Makefile new file mode 100644 index 000000000..94d2cf0ce --- /dev/null +++ b/drivers/common/dpaax/Makefile @@ -0,0 +1,31 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright 2018 NXP +# + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# library name +# +LIB = librte_common_dpaax.a + +CFLAGS += -DALLOW_EXPERIMENTAL_API +CFLAGS += -O3 +CFLAGS += $(WERROR_FLAGS) + +# versioning export map +EXPORT_MAP := rte_common_dpaax_version.map + +# library version +LIBABIVER := 1 + +# +# all source are stored in SRCS-y +# +SRCS-y += dpaax_iova_table.c + +LDLIBS += -lrte_eal + +SYMLINK-y-include += dpaax_iova_table.h + +include $(RTE_SDK)/mk/rte.lib.mk \ No newline at end of file diff --git a/drivers/common/dpaax/dpaax_iova_table.c b/drivers/common/dpaax/dpaax_iova_table.c new file mode 100644 index 000000000..d54267bb7 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.c @@ -0,0 +1,461 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#include <rte_memory.h> + +#include "dpaax_iova_table.h" +#include "dpaax_logs.h" + +/* Global dpaax logger identifier */ +int dpaax_logger; + +/* Global table reference */ +struct dpaax_iova_table *dpaax_iova_table_p; + +static int dpaax_handle_memevents(void); + +/* A structure representing the device-tree node available in /proc/device-tree. + */ +struct reg_node { + phys_addr_t addr; + size_t len; +}; + +/* A ntohll equivalent routine + * XXX: This is only applicable for 64 bit environment. 
+ */ +static void +rotate_8(unsigned char *arr) +{ + uint32_t temp; + uint32_t *first_half; + uint32_t *second_half; + + first_half = (uint32_t *)(arr); + second_half = (uint32_t *)(arr + 4); + + temp = *first_half; + *first_half = *second_half; + *second_half = temp; + + *first_half = ntohl(*first_half); + *second_half = ntohl(*second_half); +} + +/* read_memory_nodes + * Memory layout for DPAAx platforms (LS1043, LS1046, LS1088, LS2088, LX2160) + * are populated by Uboot and available in device tree: + * /proc/device-tree/memory@<address>/reg <= register. + * Entries are of the form: + * (<8 byte start addr><8 byte length>)(..more similar blocks of start,len>).. + * + * @param count + * OUT populate number of entries found in memory node + * @return + * Pointer to array of reg_node elements, count size + */ +static struct reg_node * +read_memory_node(unsigned int *count) +{ + int fd, ret, i; + unsigned int j; + glob_t result = {0}; + struct stat statbuf = {0}; + char file_data[MEM_NODE_FILE_LEN]; + struct reg_node *nodes = NULL; + + *count = 0; + + ret = glob(MEM_NODE_PATH_GLOB, 0, NULL, &result); + if (ret != 0) { + DPAAX_ERR("Unable to glob device-tree memory node: (%s)(%d)", + MEM_NODE_PATH_GLOB, ret); + goto out; + } + + if (result.gl_pathc != 1) { + /* Either more than one memory@<addr> node found, or none. + * In either case, cannot work ahead. + */ + DPAAX_ERR("Found (%zu) entries in device-tree. 
Not supported!", + result.gl_pathc); + goto out; + } + + DPAAX_DEBUG("Opening and parsing device-tree node: (%s)", + result.gl_pathv[0]); + fd = open(result.gl_pathv[0], O_RDONLY); + if (fd < 0) { + DPAAX_ERR("Unable to open the device-tree node: (%s)(fd=%d)", + MEM_NODE_PATH_GLOB, fd); + goto cleanup; + } + + /* Stat to get the file size */ + ret = fstat(fd, &statbuf); + if (ret != 0) { + DPAAX_ERR("Unable to get device-tree memory node size."); + goto cleanup; + } + + DPAAX_DEBUG("Size of device-tree mem node: %lu", statbuf.st_size); + if (statbuf.st_size > MEM_NODE_FILE_LEN) { + DPAAX_WARN("More memory nodes available than assumed."); + DPAAX_WARN("System may not work properly!"); + } + + ret = read(fd, file_data, statbuf.st_size > MEM_NODE_FILE_LEN ? + MEM_NODE_FILE_LEN : statbuf.st_size); + if (ret <= 0) { + DPAAX_ERR("Unable to read device-tree memory node: (%d)", ret); + goto cleanup; + } + + /* The reg node should be multiple of 16 bytes, 8 bytes each for addr + * and len. + */ + *count = (statbuf.st_size / 16); + if ((*count) <= 0 || (statbuf.st_size % 16 != 0)) { + DPAAX_ERR("Invalid memory node values or count. 
(size=%lu)", + statbuf.st_size); + goto cleanup; + } + + /* each entry is of 16 bytes, and size/16 is total count of entries */ + nodes = malloc(sizeof(struct reg_node) * (*count)); + if (!nodes) { + DPAAX_ERR("Failure in allocating working memory."); + goto cleanup; + } + memset(nodes, 0, sizeof(struct reg_node) * (*count)); + + for (i = 0, j = 0; i < (statbuf.st_size) && j < (*count); i += 16, j++) { + memcpy(&nodes[j], file_data + i, 16); + /* Rotate (ntohl) each 8 byte entry */ + rotate_8((unsigned char *)(&(nodes[j].addr))); + rotate_8((unsigned char *)(&(nodes[j].len))); + } + + DPAAX_DEBUG("Device-tree memory node data:"); + do { + DPAAX_DEBUG("\n %08" PRIx64 " %08zu", nodes[j].addr, nodes[j].len); + } while (--j); + +cleanup: + close(fd); + globfree(&result); +out: + return nodes; +} + +int +dpaax_iova_table_populate(void) +{ + int ret; + unsigned int i, node_count; + size_t tot_memory_size, total_table_size; + struct reg_node *nodes; + struct dpaax_iovat_element *entry; + + /* dpaax_iova_table_p is a singleton - only one instance should be + * created. + */ + if (dpaax_iova_table_p) { + DPAAX_DEBUG("Multiple allocation attempt for IOVA Table (%p)", + dpaax_iova_table_p); + /* This can be an error case as well - some path not cleaning + * up table - but, for now, it is assumed that if IOVA Table + * pointer is valid, table is allocated. 
+ */ + return 0; + } + + nodes = read_memory_node(&node_count); + if (nodes == NULL || node_count <= 0) { + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + return -1; + } + + tot_memory_size = 0; + for (i = 0; i < node_count; i++) + tot_memory_size += nodes[i].len; + + DPAAX_DEBUG("Total available PA memory size: %zu", tot_memory_size); + + /* Total table size = meta data + tot_memory_size/8 */ + total_table_size = sizeof(struct dpaax_iova_table) + + (sizeof(struct dpaax_iovat_element) * node_count) + + ((tot_memory_size / DPAAX_MEM_SPLIT) * sizeof(uint64_t)); + + /* TODO: This memory doesn't need to shared but needs to be always + * pinned to RAM (no swap out) - using hugepage rather than malloc + */ + dpaax_iova_table_p = rte_zmalloc(NULL, total_table_size, 0); + if (dpaax_iova_table_p == NULL) { + DPAAX_WARN("Unable to allocate memory for PA->VA Table;"); + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + free(nodes); + return -1; + } + + /* Initialize table */ + dpaax_iova_table_p->count = node_count; + entry = dpaax_iova_table_p->entries; + + DPAAX_DEBUG("IOVA Table entries: (entry start = %p)", (void *)entry); + DPAAX_DEBUG("\t(entry),(start),(len),(next)"); + + for (i = 0; i < node_count; i++) { + /* dpaax_iova_table_p + * | dpaax_iova_table_p->entries + * | | + * | | + * V V + * +------+------+-------+---+----------+---------+--- + * |iova_ |entry | entry | | pages | pages | + * |table | 1 | 2 |...| entry 1 | entry2 | + * +-----'+.-----+-------+---+;---------+;--------+--- + * \ \ / / + * `~~~~~~|~~~~~>pages / + * \ / + * `~~~~~~~~~~~>pages + */ + entry[i].start = nodes[i].addr; + entry[i].len = nodes[i].len; + if (i > 0) + entry[i].pages = entry[i-1].pages + + ((entry[i-1].len/DPAAX_MEM_SPLIT)); + else + entry[i].pages = (uint64_t *)((unsigned char *)entry + + (sizeof(struct dpaax_iovat_element) * + node_count)); + + DPAAX_DEBUG("\t(%u),(%8"PRIx64"),(%8zu),(%8p)", 
+ i, entry[i].start, entry[i].len, entry[i].pages); + } + + /* Release memory associated with nodes array - not required now */ + free(nodes); + + DPAAX_DEBUG("Adding mem-event handler\n"); + ret = dpaax_handle_memevents(); + if (ret) { + DPAAX_ERR("Unable to add mem-event handler"); + DPAAX_WARN("Cases with non-buffer pool mem won't work!"); + } + + return 0; +} + +void +dpaax_iova_table_depopulate(void) +{ + if (dpaax_iova_table_p == NULL) + return; + + rte_free(dpaax_iova_table_p->entries); + dpaax_iova_table_p = NULL; + + DPAAX_DEBUG("IOVA Table cleanedup"); +} + +int +dpaax_iova_table_update(phys_addr_t paddr, void *vaddr, size_t length) +{ + int found = 0; + unsigned int i; + size_t req_length = length, e_offset; + struct dpaax_iovat_element *entry; + uintptr_t align_vaddr; + phys_addr_t align_paddr; + + align_paddr = paddr & DPAAX_MEM_SPLIT_MASK; + align_vaddr = ((uintptr_t)vaddr & DPAAX_MEM_SPLIT_MASK); + + /* Check if paddr is available in table */ + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + if (align_paddr < entry[i].start) { + /* Address lower than start, but not found in previous + * iteration shouldn't exist. + */ + DPAAX_ERR("Add: Incorrect entry for PA->VA Table" + "(%"PRIu64")", paddr); + DPAAX_ERR("Add: Lowest address: %"PRIu64"", + entry[i].start); + return -1; + } + + if (align_paddr > (entry[i].start + entry[i].len)) + continue; + + /* align_paddr >= start && align_paddr < (start + len) */ + found = 1; + + do { + e_offset = ((align_paddr - entry[i].start) / DPAAX_MEM_SPLIT); + /* TODO: Whatif something already exists at this + * location - is that an error? For now, ignoring the + * case. 
+ */ + entry[i].pages[e_offset] = align_vaddr; + DPAAX_DEBUG("Added: vaddr=%zu for Phy:%"PRIu64" at %zu" + " remaining len %zu", align_vaddr, + align_paddr, e_offset, req_length); + + /* Incoming request can be larger than the + * DPAAX_MEM_SPLIT size - in which case, multiple + * entries in entry->pages[] are filled up. + */ + if (req_length <= DPAAX_MEM_SPLIT) + break; + align_paddr += DPAAX_MEM_SPLIT; + align_vaddr += DPAAX_MEM_SPLIT; + req_length -= DPAAX_MEM_SPLIT; + } while (1); + + break; + } + + if (!found) { + /* There might be case where the incoming physical address is + * beyond the address discovered in the memory node of + * device-tree. Specially if some malloc'd area is used by EAL + * and the memevent handlers passes that across. But, this is + * not necessarily an error. + */ + DPAAX_DEBUG("Add: Unable to find slot for vaddr:(%p)," + " phy(%"PRIu64")", + vaddr, paddr); + return -1; + } + + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for vaddr:(%p)," + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, + vaddr, paddr, length); + return 0; +} + +/* dpaax_iova_table_dump + * Dump the table, with its entries, on screen. Only works in Debug Mode + * Not for weak hearted - the tables can get quite large + */ +void +dpaax_iova_table_dump(void) +{ + unsigned int i, j; + struct dpaax_iovat_element *entry; + + /* In case DEBUG is not enabled, some 'if' conditions might misbehave + * as they have nothing else in them except a DPAAX_DEBUG() which if + * tuned out would leave 'if' naked. 
+ */ + if (rte_log_get_global_level() < RTE_LOG_DEBUG) { + DPAAX_ERR("Set log level to Debug for PA->Table dump!"); + return; + } + + DPAAX_DEBUG(" === Start of PA->VA Translation Table ==="); + if (dpaax_iova_table_p == NULL) + DPAAX_DEBUG("\tNULL"); + + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + DPAAX_DEBUG("\t(%16i),(%16"PRIu64"),(%16zu),(%16p)", + i, entry[i].start, entry[i].len, entry[i].pages); + DPAAX_DEBUG("\t\t (PA), (VA)"); + for (j = 0; j < (entry->len/DPAAX_MEM_SPLIT); j++) { + if (entry[i].pages[j] == 0) + continue; + DPAAX_DEBUG("\t\t(%16"PRIx64"),(%16"PRIx64")", + (entry[i].start + (j * sizeof(uint64_t))), + entry[i].pages[j]); + } + } + DPAAX_DEBUG(" === End of PA->VA Translation Table ==="); +} + +static void +dpaax_memevent_cb(enum rte_mem_event type, const void *addr, size_t len, + void *arg __rte_unused) +{ + struct rte_memseg_list *msl; + struct rte_memseg *ms; + size_t cur_len = 0, map_len = 0; + phys_addr_t phys_addr; + void *virt_addr; + int ret; + + DPAAX_DEBUG("Called with addr=%p, len=%zu", addr, len); + + msl = rte_mem_virt2memseg_list(addr); + + while (cur_len < len) { + const void *va = RTE_PTR_ADD(addr, cur_len); + + ms = rte_mem_virt2memseg(va, msl); + phys_addr = rte_mem_virt2phy(ms->addr); + virt_addr = ms->addr; + map_len = ms->len; + + DPAAX_DEBUG("Request for %s, va=%p, virt_addr=%p," + "iova=%"PRIu64", map_len=%zu", + type == RTE_MEM_EVENT_ALLOC ? + "alloc" : "dealloc", + va, virt_addr, phys_addr, map_len); + + if (type == RTE_MEM_EVENT_ALLOC) + ret = dpaax_iova_table_update(phys_addr, virt_addr, + map_len); + else + /* In case of mem_events for MEM_EVENT_FREE, complete + * hugepage is released and its PA entry is set to 0. + */ + ret = dpaax_iova_table_update(phys_addr, 0, map_len); + + if (ret != 0) { + DPAAX_ERR("PA-Table entry update failed. 
" + "Map=%d, addr=%p, len=%zu, err:(%d)", + type, va, map_len, ret); + return; + } + + cur_len += map_len; + } +} + +static int +dpaax_memevent_walk_memsegs(const struct rte_memseg_list *msl __rte_unused, + const struct rte_memseg *ms, size_t len, + void *arg __rte_unused) +{ + DPAAX_DEBUG("Walking for %p (pa=%"PRIu64") and len %zu", + ms->addr, ms->phys_addr, len); + dpaax_iova_table_update(rte_mem_virt2phy(ms->addr), ms->addr, len); + return 0; +} + +static int +dpaax_handle_memevents(void) +{ + /* First, walk through all memsegs and pin them, before installing + * handler. This assures that all memseg which have already been + * identified/allocated by EAL, are already part of PA->VA Table. This + * is especially for cases where application allocates memory before + * the EAL or this is an externally allocated memory passed to EAL. + */ + rte_memseg_contig_walk_thread_unsafe(dpaax_memevent_walk_memsegs, NULL); + + return rte_mem_event_callback_register("dpaax_memevents_cb", + dpaax_memevent_cb, NULL); +} + +RTE_INIT(dpaax_log) +{ + dpaax_logger = rte_log_register("pmd.common.dpaax"); + if (dpaax_logger >= 0) + rte_log_set_level(dpaax_logger, RTE_LOG_NOTICE); +} diff --git a/drivers/common/dpaax/dpaax_iova_table.h b/drivers/common/dpaax/dpaax_iova_table.h new file mode 100644 index 000000000..3e913ef45 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.h @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_IOVA_TABLE_H_ +#define _DPAAX_IOVA_TABLE_H_ + +#include <unistd.h> +#include <stdio.h> +#include <string.h> +#include <stdbool.h> +#include <string.h> +#include <stdlib.h> +#include <inttypes.h> +#include <sys/stat.h> +#include <sys/types.h> +#include <dirent.h> +#include <fcntl.h> +#include <glob.h> +#include <errno.h> +#include <arpa/inet.h> + +#include <rte_eal.h> +#include <rte_branch_prediction.h> +#include <rte_memory.h> +#include <rte_malloc.h> + +struct dpaax_iovat_element { + phys_addr_t 
start; /**< Start address of block of physical pages */ + size_t len; /**< Difference of end-start for quick access */ + uint64_t *pages; /**< VA for each physical page in this block */ +}; + +struct dpaax_iova_table { + unsigned int count; /**< No. of blocks of contiguous physical pages */ + struct dpaax_iovat_element entries[0]; +}; + +/* Pointer to the table, which is common for DPAA/DPAA2 and only a single + * instance is required across net/crypto/event drivers. This table is + * populated iff devices are found on the bus. + */ +extern struct dpaax_iova_table *dpaax_iova_table_p; + +/* Device tree file for memory layout is named 'memory@<addr>' where the 'addr' + * is SoC dependent, or even Uboot fixup dependent. + */ +#define MEM_NODE_PATH_GLOB "/proc/device-tree/memory[@0-9]*/reg" +/* Device file should be multiple of 16 bytes, each containing 8 byte of addr + * and its length. Assuming max of 5 entries. + */ +#define MEM_NODE_FILE_LEN ((16 * 5) + 1) + +/* Table is made up of DPAAX_MEM_SPLIT elements for each contiguous zone. This + * helps avoid separate handling for cases where more than one size of hugepage + * is supported. 
+ */ +#define DPAAX_MEM_SPLIT (1<<21) +#define DPAAX_MEM_SPLIT_MASK ~(DPAAX_MEM_SPLIT - 1) /**< Floor aligned */ +#define DPAAX_MEM_SPLIT_MASK_OFF (DPAAX_MEM_SPLIT - 1) /**< Offset */ + +/* APIs exposed */ +int dpaax_iova_table_populate(void); +void dpaax_iova_table_depopulate(void); +int dpaax_iova_table_update(phys_addr_t paddr, void *vaddr, size_t length); +void dpaax_iova_table_dump(void); + +static inline void *dpaax_iova_table_get_va(phys_addr_t paddr) __attribute__((hot)); + +static inline void * +dpaax_iova_table_get_va(phys_addr_t paddr) { + unsigned int i = 0, index; + void *vaddr = NULL; + phys_addr_t paddr_align = paddr & DPAAX_MEM_SPLIT_MASK; + size_t offset = paddr & DPAAX_MEM_SPLIT_MASK_OFF; + struct dpaax_iovat_element *entry; + + entry = dpaax_iova_table_p->entries; + + do { + if (unlikely(i >= dpaax_iova_table_p->count)) + break; + + if (paddr_align < entry[i].start) { + /* Incorrect paddr; Not in memory range */ + return NULL; + } + + if (paddr_align >= (entry[i].start + entry[i].len)) { + i++; + continue; + } + + /* start <= paddr_align < (start + len) */ + index = (paddr_align - entry[i].start)/DPAAX_MEM_SPLIT; + vaddr = (void *)((uintptr_t)entry[i].pages[index] + offset); + break; + } while (1); + + return vaddr; +} + +#endif /* _DPAAX_IOVA_TABLE_H_ */ diff --git a/drivers/common/dpaax/dpaax_logs.h b/drivers/common/dpaax/dpaax_logs.h new file mode 100644 index 000000000..bf1b27cc1 --- /dev/null +++ b/drivers/common/dpaax/dpaax_logs.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_LOGS_H_ +#define _DPAAX_LOGS_H_ + +#include <rte_log.h> + +extern int dpaax_logger; + +#define DPAAX_LOG(level, fmt, args...) \ + rte_log(RTE_LOG_ ## level, dpaax_logger, "dpaax: " fmt "\n", \ + ##args) + +/* Debug logs are with Function names */ +#define DPAAX_DEBUG(fmt, args...) 
\ + rte_log(RTE_LOG_DEBUG, dpaax_logger, "dpaax: %s(): " fmt "\n", \ + __func__, ##args) + +#define DPAAX_INFO(fmt, args...) \ + DPAAX_LOG(INFO, fmt, ## args) +#define DPAAX_ERR(fmt, args...) \ + DPAAX_LOG(ERR, fmt, ## args) +#define DPAAX_WARN(fmt, args...) \ + DPAAX_LOG(WARNING, fmt, ## args) + +/* DP Logs, toggled out at compile time if level lower than current level */ +#define DPAAX_DP_LOG(level, fmt, args...) \ + RTE_LOG_DP(level, PMD, fmt, ## args) + +#define DPAAX_DP_DEBUG(fmt, args...) \ + DPAAX_DP_LOG(DEBUG, fmt, ## args) +#define DPAAX_DP_INFO(fmt, args...) \ + DPAAX_DP_LOG(INFO, fmt, ## args) +#define DPAAX_DP_WARN(fmt, args...) \ + DPAAX_DP_LOG(WARNING, fmt, ## args) + +#endif /* _DPAAX_LOGS_H_ */ diff --git a/drivers/common/dpaax/meson.build b/drivers/common/dpaax/meson.build new file mode 100644 index 000000000..98a1bdd48 --- /dev/null +++ b/drivers/common/dpaax/meson.build @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2018 NXP + +allow_experimental_apis = true + +if host_machine.system() != 'linux' + build = false +endif + +sources = files('dpaax_iova_table.c') + +cflags += ['-D_GNU_SOURCE'] diff --git a/drivers/common/dpaax/rte_common_dpaax_version.map b/drivers/common/dpaax/rte_common_dpaax_version.map new file mode 100644 index 000000000..8131c9e30 --- /dev/null +++ b/drivers/common/dpaax/rte_common_dpaax_version.map @@ -0,0 +1,11 @@ +DPDK_18.11 { + global: + + dpaax_iova_table_update; + dpaax_iova_table_depopulate; + dpaax_iova_table_dump; + dpaax_iova_table_p; + dpaax_iova_table_populate; + + local: *; +}; diff --git a/drivers/common/meson.build b/drivers/common/meson.build index f828ce7f7..0257d4d2b 100644 --- a/drivers/common/meson.build +++ b/drivers/common/meson.build @@ -2,6 +2,6 @@ # Copyright(c) 2018 Cavium, Inc std_deps = ['eal'] -drivers = ['mvep', 'octeontx', 'qat'] +drivers = ['dpaax', 'mvep', 'octeontx', 'qat'] config_flag_fmt = 'RTE_LIBRTE_@0@_COMMON' driver_name_fmt = 'rte_common_@0@' -- 2.17.1 ^ 
permalink raw reply [flat|nested] 53+ messages in thread
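For readers following the lookup in dpaax_iova_table_get_va() from patch 3/5, the idea can be reduced to a standalone sketch. This is not the driver code: struct pa_block, pa_to_va() and the SPLIT macros are hypothetical names that only mirror dpaax_iovat_element and DPAAX_MEM_SPLIT, and the sketch assumes blocks sorted by ascending start address. Unlike the patch, it also returns 0 when a 2 MB slot has never been filled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPLIT      ((uint64_t)1 << 21)	/* 2 MB chunk, like DPAAX_MEM_SPLIT */
#define SPLIT_MASK (~(SPLIT - 1))

/* Hypothetical stand-in for struct dpaax_iovat_element. */
struct pa_block {
	uint64_t start;		/* first physical address of the block */
	uint64_t len;		/* block length in bytes, multiple of SPLIT */
	const uint64_t *pages;	/* one VA per SPLIT-sized chunk, 0 = unset */
};

/* Translate paddr to a VA; returns 0 when no translation is recorded.
 * Blocks are assumed to be sorted by ascending start address.
 */
static uint64_t
pa_to_va(const struct pa_block *blk, unsigned int count, uint64_t paddr)
{
	uint64_t base = paddr & SPLIT_MASK;	/* chunk-aligned address */
	uint64_t off = paddr & (SPLIT - 1);	/* offset inside the chunk */
	unsigned int i;

	assert(blk != NULL || count == 0);

	for (i = 0; i < count; i++) {
		uint64_t va;

		if (base < blk[i].start)
			return 0;	/* in a hole below this block */
		if (base - blk[i].start >= blk[i].len)
			continue;	/* past this block, try the next */

		va = blk[i].pages[(base - blk[i].start) / SPLIT];
		return va != 0 ? va + off : 0;
	}
	return 0;	/* above the last block */
}
```

The cost is one pass over the (few) physical blocks plus a single array index, independent of how many memsegs EAL currently tracks.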
* [dpdk-dev] [PATCH v3 4/5] dpaa: enable dpaax library 2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain ` (2 preceding siblings ...) 2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain @ 2018-10-13 12:21 ` Shreyansh Jain 2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 5/5] fslmc: " Shreyansh Jain 2018-10-15 6:41 ` [dpdk-dev] [PATCH v4 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-10-13 12:21 UTC (permalink / raw) To: ferruh.yigit, thomas; +Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain With this patch, dpaa bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa, event/dpaa and net/dpaa as they are dependent on the bus/dpaa and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/dpaa/Makefile | 1 + drivers/bus/dpaa/dpaa_bus.c | 4 ++++ drivers/bus/dpaa/meson.build | 2 +- drivers/bus/dpaa/rte_dpaa_bus.h | 6 ++++++ drivers/crypto/dpaa_sec/Makefile | 1 + drivers/crypto/dpaa_sec/dpaa_sec.c | 6 ++++++ drivers/event/dpaa/Makefile | 1 + drivers/mempool/dpaa/Makefile | 1 + drivers/mempool/dpaa/dpaa_mempool.c | 4 ++++ drivers/mempool/dpaa/dpaa_mempool.h | 4 +--- drivers/net/dpaa/Makefile | 1 + mk/rte.app.mk | 1 + 12 files changed, 28 insertions(+), 4 deletions(-) diff --git a/drivers/bus/dpaa/Makefile b/drivers/bus/dpaa/Makefile index 9337b5f92..381a5c659 100644 --- a/drivers/bus/dpaa/Makefile +++ b/drivers/bus/dpaa/Makefile @@ -48,5 +48,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_BUS) += \ LDLIBS += -lpthread LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c index 138e0f98d..381c3b17c 100644 --- 
a/drivers/bus/dpaa/dpaa_bus.c +++ b/drivers/bus/dpaa/dpaa_bus.c @@ -34,6 +34,7 @@ #include <rte_dpaa_bus.h> #include <rte_dpaa_logs.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -548,6 +549,9 @@ rte_dpaa_bus_probe(void) fclose(svr_file); } + /* And initialize the PA->VA translation table */ + dpaax_iova_table_populate(); + /* For each registered driver, and device, call the driver->probe */ TAILQ_FOREACH(dev, &rte_dpaa_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_dpaa_bus.driver_list, next) { diff --git a/drivers/bus/dpaa/meson.build b/drivers/bus/dpaa/meson.build index 5e7705571..11a3c9499 100644 --- a/drivers/bus/dpaa/meson.build +++ b/drivers/bus/dpaa/meson.build @@ -7,7 +7,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev'] +deps += ['common_dpaax', 'eventdev'] sources = files('base/fman/fman.c', 'base/fman/fman_hw.c', 'base/fman/netcfg_layer.c', diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h index 15dc6a4ac..1d580a000 100644 --- a/drivers/bus/dpaa/rte_dpaa_bus.h +++ b/drivers/bus/dpaa/rte_dpaa_bus.h @@ -8,6 +8,7 @@ #include <rte_bus.h> #include <rte_mempool.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -110,6 +111,11 @@ extern struct dpaa_memseg_list rte_dpaa_memsegs; static inline void *rte_dpaa_mem_ptov(phys_addr_t paddr) { struct dpaa_memseg *ms; + void *va; + + va = dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* Check if the address is already part of the memseg list internally * maintained by the dpaa driver. 
diff --git a/drivers/crypto/dpaa_sec/Makefile b/drivers/crypto/dpaa_sec/Makefile index 9be447041..674a7a398 100644 --- a/drivers/crypto/dpaa_sec/Makefile +++ b/drivers/crypto/dpaa_sec/Makefile @@ -38,5 +38,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_DPAA_SEC) += dpaa_sec.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c index 7c0459f9f..54f1913f2 100644 --- a/drivers/crypto/dpaa_sec/dpaa_sec.c +++ b/drivers/crypto/dpaa_sec/dpaa_sec.c @@ -107,6 +107,12 @@ dpaa_mem_vtop(void *vaddr) static inline void * dpaa_mem_ptov(rte_iova_t paddr) { + void *va; + + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va)) + return va; + return rte_mem_iova2virt(paddr); } diff --git a/drivers/event/dpaa/Makefile b/drivers/event/dpaa/Makefile index ddd855227..6f93e7f40 100644 --- a/drivers/event/dpaa/Makefile +++ b/drivers/event/dpaa/Makefile @@ -34,5 +34,6 @@ LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs LDLIBS += -lrte_eventdev -lrte_pmd_dpaa -lrte_bus_vdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/Makefile b/drivers/mempool/dpaa/Makefile index da8da1e90..9cf36856c 100644 --- a/drivers/mempool/dpaa/Makefile +++ b/drivers/mempool/dpaa/Makefile @@ -31,5 +31,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += dpaa_mempool.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c index 1c121223b..b05fb7b9d 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.c +++ b/drivers/mempool/dpaa/dpaa_mempool.c @@ -26,6 +26,7 @@ #include <rte_ring.h> #include <dpaa_mempool.h> +#include <dpaax_iova_table.h> /* List of all the memseg information locally 
maintained in dpaa driver. This * is to optimize the PA_to_VA searches until a better mechanism (algo) is @@ -285,6 +286,9 @@ dpaa_populate(struct rte_mempool *mp, unsigned int max_objs, return 0; } + /* Update the PA-VA Table */ + dpaax_iova_table_update(paddr, vaddr, len); + bp_info = DPAA_MEMPOOL_TO_POOL_INFO(mp); total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; diff --git a/drivers/mempool/dpaa/dpaa_mempool.h b/drivers/mempool/dpaa/dpaa_mempool.h index 092f326cb..533e1c6e2 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.h +++ b/drivers/mempool/dpaa/dpaa_mempool.h @@ -43,10 +43,8 @@ struct dpaa_bp_info { }; static inline void * -DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info, uint64_t addr) +DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info __rte_unused, uint64_t addr) { - if (bp_info->ptov_off) - return ((void *) (size_t)(addr + bp_info->ptov_off)); return rte_dpaa_mem_ptov(addr); } diff --git a/drivers/net/dpaa/Makefile b/drivers/net/dpaa/Makefile index d7a0a50c5..1c4f7d914 100644 --- a/drivers/net/dpaa/Makefile +++ b/drivers/net/dpaa/Makefile @@ -38,6 +38,7 @@ LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax # install this header file SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA_PMD)-include := rte_pmd_dpaa.h diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 3ece996e8..85605e38e 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -117,6 +117,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n) _LDLIBS-$(CONFIG_RTE_DRIVER_MEMPOOL_BUCKET) += -lrte_mempool_bucket _LDLIBS-$(CONFIG_RTE_DRIVER_MEMPOOL_STACK) += -lrte_mempool_stack ifeq ($(CONFIG_RTE_LIBRTE_DPAA_BUS),y) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += -lrte_mempool_dpaa endif ifeq ($(CONFIG_RTE_EAL_VFIO)$(CONFIG_RTE_LIBRTE_FSLMC_BUS),yy) -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
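The change to DPAA_MEMPOOL_PTOV() above drops the fixed bp_info->ptov_off shortcut. A minimal sketch of why a single offset stops working once the backing memory is no longer physically contiguous is given below. All names here (struct chunk, ptov_fixed_offset, ptov_chunks) are hypothetical and the addresses are made up for illustration; nothing in it is taken from the driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one physically contiguous chunk. */
struct chunk {
	uint64_t pa;	/* physical start address */
	uint64_t len;	/* length in bytes */
	uint64_t va;	/* virtual start address */
};

/* Old-style shortcut: one fixed VA-PA offset for the whole pool. */
static uint64_t
ptov_fixed_offset(uint64_t pa, uint64_t va_minus_pa)
{
	return pa + va_minus_pa;	/* unsigned wrap-around is intended */
}

/* Per-chunk translation: find the chunk holding pa, then add the offset
 * inside that chunk. Returns 0 when pa belongs to no chunk.
 */
static uint64_t
ptov_chunks(const struct chunk *c, size_t n, uint64_t pa)
{
	size_t i;

	assert(c != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* Unsigned subtraction also rejects pa < c[i].pa. */
		if (pa - c[i].pa < c[i].len)
			return c[i].va + (pa - c[i].pa);
	}
	return 0;
}
```

With two chunks that are adjacent in VA but far apart in PA, the offset derived from the first chunk gives a wrong VA for any address in the second, while the per-chunk search stays correct.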
* [dpdk-dev] [PATCH v3 5/5] fslmc: enable dpaax library 2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain ` (3 preceding siblings ...) 2018-10-13 12:21 ` [dpdk-dev] [PATCH v3 4/5] dpaa: enable dpaax library Shreyansh Jain @ 2018-10-13 12:21 ` Shreyansh Jain 2018-10-15 6:41 ` [dpdk-dev] [PATCH v4 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-10-13 12:21 UTC (permalink / raw) To: ferruh.yigit, thomas; +Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain With this patch, fslmc bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa2, event/dpaa2, net/dpaa2, raw/dpaa2_cmdif and raw/dpaa2_qdma as they are dependent on the bus/fslmc and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/fslmc/Makefile | 1 + drivers/bus/fslmc/fslmc_bus.c | 20 ++++++++++++++++ drivers/bus/fslmc/meson.build | 2 +- drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 21 ++++++++--------- drivers/crypto/dpaa2_sec/Makefile | 1 + drivers/event/dpaa2/Makefile | 1 + drivers/mempool/dpaa2/Makefile | 1 + drivers/mempool/dpaa2/dpaa2_hw_mempool.c | 29 ++++-------------------- drivers/net/dpaa2/Makefile | 1 + drivers/raw/dpaa2_cmdif/Makefile | 1 + drivers/raw/dpaa2_qdma/Makefile | 1 + mk/rte.app.mk | 1 + 12 files changed, 43 insertions(+), 37 deletions(-) diff --git a/drivers/bus/fslmc/Makefile b/drivers/bus/fslmc/Makefile index e95551980..218d9bd28 100644 --- a/drivers/bus/fslmc/Makefile +++ b/drivers/bus/fslmc/Makefile @@ -19,6 +19,7 @@ CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/qbman/include CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax # versioning export map EXPORT_MAP := rte_bus_fslmc_version.map diff --git a/drivers/bus/fslmc/fslmc_bus.c 
b/drivers/bus/fslmc/fslmc_bus.c index 19e33caf1..f2784ce51 100644 --- a/drivers/bus/fslmc/fslmc_bus.c +++ b/drivers/bus/fslmc/fslmc_bus.c @@ -20,6 +20,8 @@ #include <fslmc_vfio.h> #include "fslmc_logs.h" +#include <dpaax_iova_table.h> + int dpaa2_logtype_bus; #define VFIO_IOMMU_GROUP_PATH "/sys/kernel/iommu_groups" @@ -377,6 +379,19 @@ rte_fslmc_probe(void) probe_all = rte_fslmc_bus.bus.conf.scan_mode != RTE_BUS_SCAN_WHITELIST; + /* In case of PA, the FD addresses returned by qbman APIs are physical + * addresses, which need conversion into equivalent VA address for + * rte_mbuf. For that, a table (a serial array, in memory) is used to + * increase translation efficiency. + * This has to be done before probe as some device initialization + * (during) probe allocate memory (dpaa2_sec) which needs to be pinned + * to this table. + */ + ret = dpaax_iova_table_populate(); + if (ret) { + DPAA2_BUS_WARN("PA->VA Translation table not available;"); + } + TAILQ_FOREACH(dev, &rte_fslmc_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_fslmc_bus.driver_list, next) { ret = rte_fslmc_match(drv, dev); @@ -456,6 +471,11 @@ rte_fslmc_driver_unregister(struct rte_dpaa2_driver *driver) fslmc_bus = driver->fslmc_bus; + /* Cleanup the PA->VA Translation table; From whereever this function + * is called from. 
+ */ + dpaax_iova_table_depopulate(); + TAILQ_REMOVE(&fslmc_bus->driver_list, driver, next); /* Update Bus references */ driver->fslmc_bus = NULL; diff --git a/drivers/bus/fslmc/meson.build b/drivers/bus/fslmc/meson.build index 54ca92d0c..18c45495b 100644 --- a/drivers/bus/fslmc/meson.build +++ b/drivers/bus/fslmc/meson.build @@ -7,7 +7,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev', 'kvargs'] +deps += ['common_dpaax', 'eventdev', 'kvargs'] sources = files('fslmc_bus.c', 'fslmc_vfio.c', 'mc/dpbp.c', diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h index 820759360..678ee34b8 100644 --- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h +++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h @@ -9,6 +9,7 @@ #define _DPAA2_HW_PVT_H_ #include <rte_eventdev.h> +#include <dpaax_iova_table.h> #include <mc/fsl_mc_sys.h> #include <fsl_qbman_portal.h> @@ -275,28 +276,26 @@ extern struct dpaa2_memseg_list rte_dpaa2_memsegs; #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA extern uint8_t dpaa2_virt_mode; static void *dpaa2_mem_ptov(phys_addr_t paddr) __attribute__((unused)); -/* todo - this is costly, need to write a fast coversion routine */ + static void *dpaa2_mem_ptov(phys_addr_t paddr) { - struct dpaa2_memseg *ms; + void *va; if (dpaa2_virt_mode) return (void *)(size_t)paddr; - /* Check if the address is already part of the memseg list internally - * maintained by the dpaa2 driver. 
- */ - TAILQ_FOREACH(ms, &rte_dpaa2_memsegs, next) { - if (paddr >= ms->iova && paddr < - ms->iova + ms->len) - return RTE_PTR_ADD(ms->vaddr, (uintptr_t)(paddr - ms->iova)); - } + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* If not, Fallback to full memseg list searching */ - return rte_mem_iova2virt(paddr); + va = rte_mem_iova2virt(paddr); + + return va; } static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused)); + static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) { const struct rte_memseg *memseg; diff --git a/drivers/crypto/dpaa2_sec/Makefile b/drivers/crypto/dpaa2_sec/Makefile index da3d8f84f..1f951a14b 100644 --- a/drivers/crypto/dpaa2_sec/Makefile +++ b/drivers/crypto/dpaa2_sec/Makefile @@ -51,5 +51,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_cryptodev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/event/dpaa2/Makefile b/drivers/event/dpaa2/Makefile index 5e1a63200..7a71161de 100644 --- a/drivers/event/dpaa2/Makefile +++ b/drivers/event/dpaa2/Makefile @@ -21,6 +21,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal LDLIBS += -lrte_eal -lrte_eventdev LDLIBS += -lrte_bus_fslmc -lrte_mempool_dpaa2 -lrte_pmd_dpaa2 LDLIBS += -lrte_bus_vdev +LDLIBS += -lrte_common_dpaax CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2 CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2/mc diff --git a/drivers/mempool/dpaa2/Makefile b/drivers/mempool/dpaa2/Makefile index 9e4c87d79..0fc69c3bf 100644 --- a/drivers/mempool/dpaa2/Makefile +++ b/drivers/mempool/dpaa2/Makefile @@ -30,6 +30,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += dpaa2_hw_mempool.c LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL)-include := rte_dpaa2_mempool.h diff --git a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c 
index 84ff12811..c5f60c5c6 100644 --- a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c +++ b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c @@ -30,6 +30,8 @@ #include "dpaa2_hw_mempool.h" #include "dpaa2_hw_mempool_logs.h" +#include <dpaax_iova_table.h> + struct dpaa2_bp_info rte_dpaa2_bpid_info[MAX_BPID]; static struct dpaa2_bp_list *h_bp_list; @@ -393,31 +395,8 @@ dpaa2_populate(struct rte_mempool *mp, unsigned int max_objs, void *vaddr, rte_iova_t paddr, size_t len, rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg) { - struct dpaa2_memseg *ms; - - /* For each memory chunk pinned to the Mempool, a linked list of the - * contained memsegs is created for searching when PA to VA - * conversion is required. - */ - ms = rte_zmalloc(NULL, sizeof(struct dpaa2_memseg), 0); - if (!ms) { - DPAA2_MEMPOOL_ERR("Unable to allocate internal memory."); - DPAA2_MEMPOOL_WARN("Fast Physical to Virtual Addr translation would not be available."); - /* If the element is not added, it would only lead to failure - * in searching for the element and the logic would Fallback - * to traditional DPDK memseg traversal code. So, this is not - * a blocking error - but, error would be printed on screen. - */ - return 0; - } - - ms->vaddr = vaddr; - ms->iova = paddr; - ms->len = len; - /* Head insertions are generally faster than tail insertions as the - * buffers pinned are picked from rear end. 
- */ - TAILQ_INSERT_HEAD(&rte_dpaa2_memsegs, ms, next); + /* Insert entry into the PA->VA Table */ + dpaax_iova_table_update(paddr, vaddr, len); return rte_mempool_op_populate_default(mp, max_objs, vaddr, paddr, len, obj_cb, obj_cb_arg); diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile index 9b0b14331..52649a945 100644 --- a/drivers/net/dpaa2/Makefile +++ b/drivers/net/dpaa2/Makefile @@ -40,5 +40,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/raw/dpaa2_cmdif/Makefile b/drivers/raw/dpaa2_cmdif/Makefile index 9b863dda2..3c56c4b44 100644 --- a/drivers/raw/dpaa2_cmdif/Makefile +++ b/drivers/raw/dpaa2_cmdif/Makefile @@ -21,6 +21,7 @@ LDLIBS += -lrte_eal LDLIBS += -lrte_kvargs LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_cmdif_version.map diff --git a/drivers/raw/dpaa2_qdma/Makefile b/drivers/raw/dpaa2_qdma/Makefile index d88809ead..2f79a3f41 100644 --- a/drivers/raw/dpaa2_qdma/Makefile +++ b/drivers/raw/dpaa2_qdma/Makefile @@ -22,6 +22,7 @@ LDLIBS += -lrte_mempool LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev LDLIBS += -lrte_ring +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_qdma_version.map diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 85605e38e..f2218df39 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -121,6 +121,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += -lrte_mempool_dpaa endif ifeq ($(CONFIG_RTE_EAL_VFIO)$(CONFIG_RTE_LIBRTE_FSLMC_BUS),yy) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += -lrte_mempool_dpaa2 endif -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
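With this patch dpaa2_populate() only calls dpaax_iova_table_update(paddr, vaddr, len). The body of that function is not part of this message, so the following is only a sketch of one way such an update could fill per-2 MB slots; it is not the library's implementation. table_update(), its parameters and the single-block layout are hypothetical, and the sketch assumes paddr and vaddr share the same offset inside a 2 MB chunk (hugepage-backed memory). Passing vaddr == 0 clears the slots, matching the dealloc path of the memory event callback.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPLIT ((uint64_t)1 << 21)	/* 2 MB chunk, like DPAAX_MEM_SPLIT */

/* Record (or clear, when vaddr == 0) the translation for the range
 * [paddr, paddr + len) inside one block that starts at blk_start and
 * holds nslots SPLIT-sized slots.
 * Returns the number of slots written, or -1 if the range does not fit.
 */
static int
table_update(uint64_t *pages, uint64_t blk_start, size_t nslots,
	     uint64_t paddr, uint64_t vaddr, uint64_t len)
{
	uint64_t pa = paddr & ~(SPLIT - 1);	/* floor to chunk boundary */
	uint64_t va = vaddr & ~(SPLIT - 1);
	uint64_t end = paddr + len;
	int written = 0;

	assert(pages != NULL);

	if (len == 0 || pa < blk_start || end > blk_start + nslots * SPLIT)
		return -1;

	/* A request larger than SPLIT fills several consecutive slots. */
	for (; pa < end; pa += SPLIT) {
		pages[(pa - blk_start) / SPLIT] = va;
		if (vaddr != 0)
			va += SPLIT;
		written++;
	}
	return written;
}
```

Because every slot covers a fixed 2 MB, the same code path serves 2 MB and larger hugepages, which is the motivation given in the DPAAX_MEM_SPLIT comment of patch 3/5.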
* [dpdk-dev] [PATCH v4 0/5] Add a PA-VA Translation table for DPAAx
From: Shreyansh Jain @ 2018-10-15 6:41 UTC
To: ferruh.yigit, thomas
Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain

::Background::
After the restructuring of memory in the last release(s), one of the major
impacts on the fslmc/dpaa buses (and their devices) was a performance drop
when using physical addressing. Previously, it was assumed that the physical
range was contiguous for any given request for hugepage memory. That way,
whenever a virtual address was returned, it was easy to fetch its physical
equivalent in almost constant time. With the memory hotplug series, that
assumption no longer holds, and every call that device drivers make to
rte_mem_virt2iova or rte_mem_virt2phy is expensive. (Using IOVA_CONTIG is an
app dependency, which is not a practical option.)

For fslmc, working on physical or virtual (IOMMU supported) addresses is
optional. For the dpaa bus, it is not optional: only physical addressing is
supported. Thus, the dpaa bus was impacted the most.

::DPAAX PA-VA Table::
- A simple table containing entries for all physical memory ranges available
  on a particular SoC (in this case, NXP's LS104x and LS20xx series, which
  are handled by the dpaa and fslmc buses, respectively). As of now, fetching
  the ranges is SoC dependent.
- We populate the table either through the mempool handler (for mempool
  pinned memory) or through the memory event callbacks (for cases where
  working memory is allocated by the application).
- Though the aim is only to translate addresses for descriptors which are
  Rx'd from devices, this is a generic layer which should work in other
  cases as well (though these are not the target of current testing).

::About patches::
Patch 1: There was an issue in the existing PA/VA mode reporting done by the
         fslmc bus. This patch fixes it.
Patch 2: Common libraries/components can be a dependency for the bus; this
         patch therefore blocks their parallel compilation.
Patch 3: Add the library in common/dpaax. This is a single patch as the
         functions are mostly inter-linked.
Patch 4~5: Add support in the dpaa and fslmc buses, respectively. It is not
         possible to unlink the bus and device drivers; thus, these patches
         have a blanket change across all drivers.

::Next Steps::
- Some optimizations are required to tune the access pattern of the table.
  These will be posted as additional patches.
- In case there is any possible split of patches, I will post another
  version. Until then, this is the layout.

::Version History::
v3->v4:
- Fixed missing rework against review comment from Pavan: shift the IOVA
  mode detection code in bus/fslmc
- Rebased over master (abe92131c92)
v2->v3:
- Added back del operation (update) for mem-events, which was removed in v2
- Change IOMMU(PA) detection for FSLMC Bus (review comment: Pavan)
- Rebase on master (6673fe0ce2)
v1->v2:
- Rework of review comments on v1
- Removed dpaax_iova_table_del API - that is redundant
- Changed dpaax_iova_table_add to dpaax_iova_table_update to make it more
  relevant
- Previous patch removed an advertised API (rte_dpaa2_memsegs). This is
  fixed. A deprecation notice would now be sent for removal in next release.
- Rebase on master (5f73c2670f); also verified on net-next/master (317f8b01f)

Shreyansh Jain (5):
  bus/fslmc: fix physical addressing check
  drivers: common as dependency for bus
  common/dpaax: add library for PA VA translation table
  dpaa: enable dpaax library
  fslmc: enable dpaax library

 config/common_base                            |   5 +
 config/common_linuxapp                        |   5 +
 drivers/Makefile                              |   1 +
 drivers/bus/dpaa/Makefile                     |   1 +
 drivers/bus/dpaa/dpaa_bus.c                   |   4 +
 drivers/bus/dpaa/meson.build                  |   2 +-
 drivers/bus/dpaa/rte_dpaa_bus.h               |   6 +
 drivers/bus/fslmc/Makefile                    |   1 +
 drivers/bus/fslmc/fslmc_bus.c                 |  24 +
 drivers/bus/fslmc/meson.build                 |   2 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h       |  21 +-
 drivers/common/Makefile                       |   4 +
 drivers/common/dpaax/Makefile                 |  31 ++
 drivers/common/dpaax/dpaax_iova_table.c       | 461 ++++++++++++++++++
 drivers/common/dpaax/dpaax_iova_table.h       | 103 ++++
 drivers/common/dpaax/dpaax_logs.h             |  39 ++
 drivers/common/dpaax/meson.build              |  12 +
 .../common/dpaax/rte_common_dpaax_version.map |  11 +
 drivers/common/meson.build                    |   2 +-
 drivers/crypto/dpaa2_sec/Makefile             |   1 +
 drivers/crypto/dpaa_sec/Makefile              |   1 +
 drivers/crypto/dpaa_sec/dpaa_sec.c            |   6 +
 drivers/event/dpaa/Makefile                   |   1 +
 drivers/event/dpaa2/Makefile                  |   1 +
 drivers/mempool/dpaa/Makefile                 |   1 +
 drivers/mempool/dpaa/dpaa_mempool.c           |   4 +
 drivers/mempool/dpaa/dpaa_mempool.h           |   4 +-
 drivers/mempool/dpaa2/Makefile                |   1 +
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c      |  29 +-
 drivers/net/dpaa/Makefile                     |   1 +
 drivers/net/dpaa2/Makefile                    |   1 +
 drivers/raw/dpaa2_cmdif/Makefile              |   1 +
 drivers/raw/dpaa2_qdma/Makefile               |   1 +
 mk/rte.app.mk                                 |   2 +
 34 files changed, 748 insertions(+), 42 deletions(-)
 create mode 100644 drivers/common/dpaax/Makefile
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.c
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.h
 create mode 100644 drivers/common/dpaax/dpaax_logs.h
 create mode 100644 drivers/common/dpaax/meson.build
 create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map

-- 
2.17.1
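The cover letter notes that fetching the physical ranges is SoC dependent. Patch 3/5 reads them from the device-tree memory node (MEM_NODE_PATH_GLOB), whose header comment describes the file as 16-byte records holding an 8-byte address and an 8-byte length. A minimal sketch of decoding such a blob follows. It assumes big-endian values, as is usual for device-tree properties, and two address cells plus two size cells per record; struct mem_range, be64_read() and parse_reg() are hypothetical names, not part of the library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded form of one (address, length) record. */
struct mem_range {
	uint64_t addr;
	uint64_t len;
};

/* Read a big-endian 64-bit value, independent of host endianness. */
static uint64_t
be64_read(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Decode up to max_out records from a 'reg' blob of 'size' bytes.
 * Returns the number of ranges decoded, or -1 when the blob is empty
 * or not a multiple of 16 bytes.
 */
static int
parse_reg(const unsigned char *blob, size_t size,
	  struct mem_range *out, size_t max_out)
{
	size_t n = size / 16, i;

	assert(blob != NULL && out != NULL);

	if (size == 0 || size % 16 != 0)
		return -1;
	if (n > max_out)
		n = max_out;

	for (i = 0; i < n; i++) {
		out[i].addr = be64_read(blob + 16 * i);
		out[i].len = be64_read(blob + 16 * i + 8);
	}
	return (int)n;
}
```

Each decoded range would then correspond to one table entry, with len / 2 MB slots reserved for its VA values.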
* [dpdk-dev] [PATCH v4 1/5] bus/fslmc: fix physical addressing check
From: Shreyansh Jain @ 2018-10-15 6:41 UTC
To: ferruh.yigit, thomas
Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain, hemant.agrawal

In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, the only supported
class is RTE_IOVA_PA.

Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA")
Cc: hemant.agrawal@nxp.com

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/bus/fslmc/fslmc_bus.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index 960f55071..2bc9457bc 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -496,6 +496,10 @@ rte_dpaa2_get_iommu_class(void)
 	if (TAILQ_EMPTY(&rte_fslmc_bus.device_list))
 		return RTE_IOVA_DC;
 
+#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
+	return RTE_IOVA_PA;
+#endif
+
 	/* check if all devices on the bus support Virtual addressing or not */
 	has_iova_va = fslmc_all_device_support_iova();
 
-- 
2.17.1
* [dpdk-dev] [PATCH v4 2/5] drivers: common as dependency for bus
From: Shreyansh Jain @ 2018-10-15 6:41 UTC
To: ferruh.yigit, thomas
Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain

Prior to this patch, bus and common were compiled in parallel. With this
patch, bus depends on common, so common is built first. This is especially
important for the DPAA/FSLMC buses, which are going to use the common/dpaax
library.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/Makefile b/drivers/Makefile
index 75660765e..7d5da5d9f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -5,6 +5,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-y += common
 DIRS-y += bus
+DEPDIRS-bus := common
 DIRS-y += mempool
 DEPDIRS-mempool := common bus
 DIRS-y += net
-- 
2.17.1
* [dpdk-dev] [PATCH v4 3/5] common/dpaax: add library for PA VA translation table
From: Shreyansh Jain @ 2018-10-15 6:42 UTC
To: ferruh.yigit, thomas
Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain

A common library, valid for dpaaX drivers, which is used to maintain a
local copy of PA->VA translations.

In physical addressing mode (one of the options for FSLMC, and the only
option for the DPAA bus), the addresses of descriptors Rx'd are physical.
These need to be converted into the equivalent VA for rte_mbuf and other
similar calls.

Using rte_mem_virt2iova or rte_mem_virt2phy is expensive. This library is
an attempt to reduce the overall cost associated with this translation.

A small table is maintained, containing consecutive entries, each
representing a contiguous physical range. Each of these entries stores the
equivalent VA, which is fed during mempool creation or through memory
allocation/deallocation callbacks.
Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- config/common_base | 5 + config/common_linuxapp | 5 + drivers/common/Makefile | 4 + drivers/common/dpaax/Makefile | 31 ++ drivers/common/dpaax/dpaax_iova_table.c | 461 ++++++++++++++++++ drivers/common/dpaax/dpaax_iova_table.h | 103 ++++ drivers/common/dpaax/dpaax_logs.h | 39 ++ drivers/common/dpaax/meson.build | 12 + .../common/dpaax/rte_common_dpaax_version.map | 11 + drivers/common/meson.build | 2 +- 10 files changed, 672 insertions(+), 1 deletion(-) create mode 100644 drivers/common/dpaax/Makefile create mode 100644 drivers/common/dpaax/dpaax_iova_table.c create mode 100644 drivers/common/dpaax/dpaax_iova_table.h create mode 100644 drivers/common/dpaax/dpaax_logs.h create mode 100644 drivers/common/dpaax/meson.build create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map diff --git a/config/common_base b/config/common_base index 8c7ead68d..7f10f7215 100644 --- a/config/common_base +++ b/config/common_base @@ -139,6 +139,11 @@ CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE=n # CONFIG_RTE_ETHDEV_TX_PREPARE_NOOP=n +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=n + # # Compile the Intel FPGA bus # diff --git a/config/common_linuxapp b/config/common_linuxapp index 485e1467d..76b884c48 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -29,6 +29,11 @@ CONFIG_RTE_PROC_INFO=y CONFIG_RTE_LIBRTE_VMBUS=y CONFIG_RTE_LIBRTE_NETVSC_PMD=y +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=y + # NXP DPAA BUS and drivers CONFIG_RTE_LIBRTE_DPAA_BUS=y CONFIG_RTE_LIBRTE_DPAA_MEMPOOL=y diff --git a/drivers/common/Makefile b/drivers/common/Makefile index b498c238f..6392a3412 100644 --- a/drivers/common/Makefile +++ b/drivers/common/Makefile @@ -14,4 +14,8 @@ ifneq (,$(findstring y,$(MVEP-y))) DIRS-y += mvep endif +ifeq ($(CONFIG_RTE_LIBRTE_COMMON_DPAAX),y) +DIRS-y += dpaax +endif + include $(RTE_SDK)/mk/rte.subdir.mk diff --git 
a/drivers/common/dpaax/Makefile b/drivers/common/dpaax/Makefile new file mode 100644 index 000000000..94d2cf0ce --- /dev/null +++ b/drivers/common/dpaax/Makefile @@ -0,0 +1,31 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright 2018 NXP +# + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# library name +# +LIB = librte_common_dpaax.a + +CFLAGS += -DALLOW_EXPERIMENTAL_API +CFLAGS += -O3 +CFLAGS += $(WERROR_FLAGS) + +# versioning export map +EXPORT_MAP := rte_common_dpaax_version.map + +# library version +LIBABIVER := 1 + +# +# all source are stored in SRCS-y +# +SRCS-y += dpaax_iova_table.c + +LDLIBS += -lrte_eal + +SYMLINK-y-include += dpaax_iova_table.h + +include $(RTE_SDK)/mk/rte.lib.mk \ No newline at end of file diff --git a/drivers/common/dpaax/dpaax_iova_table.c b/drivers/common/dpaax/dpaax_iova_table.c new file mode 100644 index 000000000..d54267bb7 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.c @@ -0,0 +1,461 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#include <rte_memory.h> + +#include "dpaax_iova_table.h" +#include "dpaax_logs.h" + +/* Global dpaax logger identifier */ +int dpaax_logger; + +/* Global table reference */ +struct dpaax_iova_table *dpaax_iova_table_p; + +static int dpaax_handle_memevents(void); + +/* A structure representing the device-tree node available in /proc/device-tree. + */ +struct reg_node { + phys_addr_t addr; + size_t len; +}; + +/* A ntohll equivalent routine + * XXX: This is only applicable for 64 bit environment. 
+ */ +static void +rotate_8(unsigned char *arr) +{ + uint32_t temp; + uint32_t *first_half; + uint32_t *second_half; + + first_half = (uint32_t *)(arr); + second_half = (uint32_t *)(arr + 4); + + temp = *first_half; + *first_half = *second_half; + *second_half = temp; + + *first_half = ntohl(*first_half); + *second_half = ntohl(*second_half); +} + +/* read_memory_nodes + * Memory layout for DPAAx platforms (LS1043, LS1046, LS1088, LS2088, LX2160) + * are populated by Uboot and available in device tree: + * /proc/device-tree/memory@<address>/reg <= register. + * Entries are of the form: + * (<8 byte start addr><8 byte length>)(..more similar blocks of start,len>).. + * + * @param count + * OUT populate number of entries found in memory node + * @return + * Pointer to array of reg_node elements, count size + */ +static struct reg_node * +read_memory_node(unsigned int *count) +{ + int fd, ret, i; + unsigned int j; + glob_t result = {0}; + struct stat statbuf = {0}; + char file_data[MEM_NODE_FILE_LEN]; + struct reg_node *nodes = NULL; + + *count = 0; + + ret = glob(MEM_NODE_PATH_GLOB, 0, NULL, &result); + if (ret != 0) { + DPAAX_ERR("Unable to glob device-tree memory node: (%s)(%d)", + MEM_NODE_PATH_GLOB, ret); + goto out; + } + + if (result.gl_pathc != 1) { + /* Either more than one memory@<addr> node found, or none. + * In either case, cannot work ahead. + */ + DPAAX_ERR("Found (%zu) entries in device-tree. 
Not supported!", + result.gl_pathc); + goto out; + } + + DPAAX_DEBUG("Opening and parsing device-tree node: (%s)", + result.gl_pathv[0]); + fd = open(result.gl_pathv[0], O_RDONLY); + if (fd < 0) { + DPAAX_ERR("Unable to open the device-tree node: (%s)(fd=%d)", + MEM_NODE_PATH_GLOB, fd); + goto cleanup; + } + + /* Stat to get the file size */ + ret = fstat(fd, &statbuf); + if (ret != 0) { + DPAAX_ERR("Unable to get device-tree memory node size."); + goto cleanup; + } + + DPAAX_DEBUG("Size of device-tree mem node: %lu", statbuf.st_size); + if (statbuf.st_size > MEM_NODE_FILE_LEN) { + DPAAX_WARN("More memory nodes available than assumed."); + DPAAX_WARN("System may not work properly!"); + } + + ret = read(fd, file_data, statbuf.st_size > MEM_NODE_FILE_LEN ? + MEM_NODE_FILE_LEN : statbuf.st_size); + if (ret <= 0) { + DPAAX_ERR("Unable to read device-tree memory node: (%d)", ret); + goto cleanup; + } + + /* The reg node should be multiple of 16 bytes, 8 bytes each for addr + * and len. + */ + *count = (statbuf.st_size / 16); + if ((*count) <= 0 || (statbuf.st_size % 16 != 0)) { + DPAAX_ERR("Invalid memory node values or count. 
(size=%lu)", + statbuf.st_size); + goto cleanup; + } + + /* each entry is of 16 bytes, and size/16 is total count of entries */ + nodes = malloc(sizeof(struct reg_node) * (*count)); + if (!nodes) { + DPAAX_ERR("Failure in allocating working memory."); + goto cleanup; + } + memset(nodes, 0, sizeof(struct reg_node) * (*count)); + + for (i = 0, j = 0; i < (statbuf.st_size) && j < (*count); i += 16, j++) { + memcpy(&nodes[j], file_data + i, 16); + /* Rotate (ntohl) each 8 byte entry */ + rotate_8((unsigned char *)(&(nodes[j].addr))); + rotate_8((unsigned char *)(&(nodes[j].len))); + } + + DPAAX_DEBUG("Device-tree memory node data:"); + do { + DPAAX_DEBUG("\n %08" PRIx64 " %08zu", nodes[j].addr, nodes[j].len); + } while (--j); + +cleanup: + close(fd); + globfree(&result); +out: + return nodes; +} + +int +dpaax_iova_table_populate(void) +{ + int ret; + unsigned int i, node_count; + size_t tot_memory_size, total_table_size; + struct reg_node *nodes; + struct dpaax_iovat_element *entry; + + /* dpaax_iova_table_p is a singleton - only one instance should be + * created. + */ + if (dpaax_iova_table_p) { + DPAAX_DEBUG("Multiple allocation attempt for IOVA Table (%p)", + dpaax_iova_table_p); + /* This can be an error case as well - some path not cleaning + * up table - but, for now, it is assumed that if IOVA Table + * pointer is valid, table is allocated. 
+ */ + return 0; + } + + nodes = read_memory_node(&node_count); + if (nodes == NULL || node_count <= 0) { + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + return -1; + } + + tot_memory_size = 0; + for (i = 0; i < node_count; i++) + tot_memory_size += nodes[i].len; + + DPAAX_DEBUG("Total available PA memory size: %zu", tot_memory_size); + + /* Total table size = meta data + tot_memory_size/8 */ + total_table_size = sizeof(struct dpaax_iova_table) + + (sizeof(struct dpaax_iovat_element) * node_count) + + ((tot_memory_size / DPAAX_MEM_SPLIT) * sizeof(uint64_t)); + + /* TODO: This memory doesn't need to shared but needs to be always + * pinned to RAM (no swap out) - using hugepage rather than malloc + */ + dpaax_iova_table_p = rte_zmalloc(NULL, total_table_size, 0); + if (dpaax_iova_table_p == NULL) { + DPAAX_WARN("Unable to allocate memory for PA->VA Table;"); + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + free(nodes); + return -1; + } + + /* Initialize table */ + dpaax_iova_table_p->count = node_count; + entry = dpaax_iova_table_p->entries; + + DPAAX_DEBUG("IOVA Table entries: (entry start = %p)", (void *)entry); + DPAAX_DEBUG("\t(entry),(start),(len),(next)"); + + for (i = 0; i < node_count; i++) { + /* dpaax_iova_table_p + * | dpaax_iova_table_p->entries + * | | + * | | + * V V + * +------+------+-------+---+----------+---------+--- + * |iova_ |entry | entry | | pages | pages | + * |table | 1 | 2 |...| entry 1 | entry2 | + * +-----'+.-----+-------+---+;---------+;--------+--- + * \ \ / / + * `~~~~~~|~~~~~>pages / + * \ / + * `~~~~~~~~~~~>pages + */ + entry[i].start = nodes[i].addr; + entry[i].len = nodes[i].len; + if (i > 0) + entry[i].pages = entry[i-1].pages + + ((entry[i-1].len/DPAAX_MEM_SPLIT)); + else + entry[i].pages = (uint64_t *)((unsigned char *)entry + + (sizeof(struct dpaax_iovat_element) * + node_count)); + + DPAAX_DEBUG("\t(%u),(%8"PRIx64"),(%8zu),(%8p)", 
+ i, entry[i].start, entry[i].len, entry[i].pages); + } + + /* Release memory associated with nodes array - not required now */ + free(nodes); + + DPAAX_DEBUG("Adding mem-event handler\n"); + ret = dpaax_handle_memevents(); + if (ret) { + DPAAX_ERR("Unable to add mem-event handler"); + DPAAX_WARN("Cases with non-buffer pool mem won't work!"); + } + + return 0; +} + +void +dpaax_iova_table_depopulate(void) +{ + if (dpaax_iova_table_p == NULL) + return; + + rte_free(dpaax_iova_table_p->entries); + dpaax_iova_table_p = NULL; + + DPAAX_DEBUG("IOVA Table cleanedup"); +} + +int +dpaax_iova_table_update(phys_addr_t paddr, void *vaddr, size_t length) +{ + int found = 0; + unsigned int i; + size_t req_length = length, e_offset; + struct dpaax_iovat_element *entry; + uintptr_t align_vaddr; + phys_addr_t align_paddr; + + align_paddr = paddr & DPAAX_MEM_SPLIT_MASK; + align_vaddr = ((uintptr_t)vaddr & DPAAX_MEM_SPLIT_MASK); + + /* Check if paddr is available in table */ + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + if (align_paddr < entry[i].start) { + /* Address lower than start, but not found in previous + * iteration shouldn't exist. + */ + DPAAX_ERR("Add: Incorrect entry for PA->VA Table" + "(%"PRIu64")", paddr); + DPAAX_ERR("Add: Lowest address: %"PRIu64"", + entry[i].start); + return -1; + } + + if (align_paddr > (entry[i].start + entry[i].len)) + continue; + + /* align_paddr >= start && align_paddr < (start + len) */ + found = 1; + + do { + e_offset = ((align_paddr - entry[i].start) / DPAAX_MEM_SPLIT); + /* TODO: Whatif something already exists at this + * location - is that an error? For now, ignoring the + * case. 
+ */ + entry[i].pages[e_offset] = align_vaddr; + DPAAX_DEBUG("Added: vaddr=%zu for Phy:%"PRIu64" at %zu" + " remaining len %zu", align_vaddr, + align_paddr, e_offset, req_length); + + /* Incoming request can be larger than the + * DPAAX_MEM_SPLIT size - in which case, multiple + * entries in entry->pages[] are filled up. + */ + if (req_length <= DPAAX_MEM_SPLIT) + break; + align_paddr += DPAAX_MEM_SPLIT; + align_vaddr += DPAAX_MEM_SPLIT; + req_length -= DPAAX_MEM_SPLIT; + } while (1); + + break; + } + + if (!found) { + /* There might be case where the incoming physical address is + * beyond the address discovered in the memory node of + * device-tree. Specially if some malloc'd area is used by EAL + * and the memevent handlers passes that across. But, this is + * not necessarily an error. + */ + DPAAX_DEBUG("Add: Unable to find slot for vaddr:(%p)," + " phy(%"PRIu64")", + vaddr, paddr); + return -1; + } + + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for vaddr:(%p)," + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, + vaddr, paddr, length); + return 0; +} + +/* dpaax_iova_table_dump + * Dump the table, with its entries, on screen. Only works in Debug Mode + * Not for weak hearted - the tables can get quite large + */ +void +dpaax_iova_table_dump(void) +{ + unsigned int i, j; + struct dpaax_iovat_element *entry; + + /* In case DEBUG is not enabled, some 'if' conditions might misbehave + * as they have nothing else in them except a DPAAX_DEBUG() which if + * tuned out would leave 'if' naked. 
+ */ + if (rte_log_get_global_level() < RTE_LOG_DEBUG) { + DPAAX_ERR("Set log level to Debug for PA->Table dump!"); + return; + } + + DPAAX_DEBUG(" === Start of PA->VA Translation Table ==="); + if (dpaax_iova_table_p == NULL) + DPAAX_DEBUG("\tNULL"); + + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + DPAAX_DEBUG("\t(%16i),(%16"PRIu64"),(%16zu),(%16p)", + i, entry[i].start, entry[i].len, entry[i].pages); + DPAAX_DEBUG("\t\t (PA), (VA)"); + for (j = 0; j < (entry->len/DPAAX_MEM_SPLIT); j++) { + if (entry[i].pages[j] == 0) + continue; + DPAAX_DEBUG("\t\t(%16"PRIx64"),(%16"PRIx64")", + (entry[i].start + (j * sizeof(uint64_t))), + entry[i].pages[j]); + } + } + DPAAX_DEBUG(" === End of PA->VA Translation Table ==="); +} + +static void +dpaax_memevent_cb(enum rte_mem_event type, const void *addr, size_t len, + void *arg __rte_unused) +{ + struct rte_memseg_list *msl; + struct rte_memseg *ms; + size_t cur_len = 0, map_len = 0; + phys_addr_t phys_addr; + void *virt_addr; + int ret; + + DPAAX_DEBUG("Called with addr=%p, len=%zu", addr, len); + + msl = rte_mem_virt2memseg_list(addr); + + while (cur_len < len) { + const void *va = RTE_PTR_ADD(addr, cur_len); + + ms = rte_mem_virt2memseg(va, msl); + phys_addr = rte_mem_virt2phy(ms->addr); + virt_addr = ms->addr; + map_len = ms->len; + + DPAAX_DEBUG("Request for %s, va=%p, virt_addr=%p," + "iova=%"PRIu64", map_len=%zu", + type == RTE_MEM_EVENT_ALLOC ? + "alloc" : "dealloc", + va, virt_addr, phys_addr, map_len); + + if (type == RTE_MEM_EVENT_ALLOC) + ret = dpaax_iova_table_update(phys_addr, virt_addr, + map_len); + else + /* In case of mem_events for MEM_EVENT_FREE, complete + * hugepage is released and its PA entry is set to 0. + */ + ret = dpaax_iova_table_update(phys_addr, 0, map_len); + + if (ret != 0) { + DPAAX_ERR("PA-Table entry update failed. 
" + "Map=%d, addr=%p, len=%zu, err:(%d)", + type, va, map_len, ret); + return; + } + + cur_len += map_len; + } +} + +static int +dpaax_memevent_walk_memsegs(const struct rte_memseg_list *msl __rte_unused, + const struct rte_memseg *ms, size_t len, + void *arg __rte_unused) +{ + DPAAX_DEBUG("Walking for %p (pa=%"PRIu64") and len %zu", + ms->addr, ms->phys_addr, len); + dpaax_iova_table_update(rte_mem_virt2phy(ms->addr), ms->addr, len); + return 0; +} + +static int +dpaax_handle_memevents(void) +{ + /* First, walk through all memsegs and pin them, before installing + * handler. This assures that all memseg which have already been + * identified/allocated by EAL, are already part of PA->VA Table. This + * is especially for cases where application allocates memory before + * the EAL or this is an externally allocated memory passed to EAL. + */ + rte_memseg_contig_walk_thread_unsafe(dpaax_memevent_walk_memsegs, NULL); + + return rte_mem_event_callback_register("dpaax_memevents_cb", + dpaax_memevent_cb, NULL); +} + +RTE_INIT(dpaax_log) +{ + dpaax_logger = rte_log_register("pmd.common.dpaax"); + if (dpaax_logger >= 0) + rte_log_set_level(dpaax_logger, RTE_LOG_NOTICE); +} diff --git a/drivers/common/dpaax/dpaax_iova_table.h b/drivers/common/dpaax/dpaax_iova_table.h new file mode 100644 index 000000000..3e913ef45 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.h @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_IOVA_TABLE_H_ +#define _DPAAX_IOVA_TABLE_H_ + +#include <unistd.h> +#include <stdio.h> +#include <string.h> +#include <stdbool.h> +#include <string.h> +#include <stdlib.h> +#include <inttypes.h> +#include <sys/stat.h> +#include <sys/types.h> +#include <dirent.h> +#include <fcntl.h> +#include <glob.h> +#include <errno.h> +#include <arpa/inet.h> + +#include <rte_eal.h> +#include <rte_branch_prediction.h> +#include <rte_memory.h> +#include <rte_malloc.h> + +struct dpaax_iovat_element { + phys_addr_t 
start; /**< Start address of block of physical pages */ + size_t len; /**< Difference of end-start for quick access */ + uint64_t *pages; /**< VA for each physical page in this block */ +}; + +struct dpaax_iova_table { + unsigned int count; /**< No. of blocks of contiguous physical pages */ + struct dpaax_iovat_element entries[0]; +}; + +/* Pointer to the table, which is common for DPAA/DPAA2 and only a single + * instance is required across net/crypto/event drivers. This table is + * populated iff devices are found on the bus. + */ +extern struct dpaax_iova_table *dpaax_iova_table_p; + +/* Device tree file for memory layout is named 'memory@<addr>' where the 'addr' + * is SoC dependent, or even Uboot fixup dependent. + */ +#define MEM_NODE_PATH_GLOB "/proc/device-tree/memory[@0-9]*/reg" +/* Device file should be multiple of 16 bytes, each containing 8 byte of addr + * and its length. Assuming max of 5 entries. + */ +#define MEM_NODE_FILE_LEN ((16 * 5) + 1) + +/* Table is made up of DPAAX_MEM_SPLIT elements for each contiguous zone. This + * helps avoid separate handling for cases where more than one size of hugepage + * is supported. 
+ */ +#define DPAAX_MEM_SPLIT (1<<21) +#define DPAAX_MEM_SPLIT_MASK ~(DPAAX_MEM_SPLIT - 1) /**< Floor aligned */ +#define DPAAX_MEM_SPLIT_MASK_OFF (DPAAX_MEM_SPLIT - 1) /**< Offset */ + +/* APIs exposed */ +int dpaax_iova_table_populate(void); +void dpaax_iova_table_depopulate(void); +int dpaax_iova_table_update(phys_addr_t paddr, void *vaddr, size_t length); +void dpaax_iova_table_dump(void); + +static inline void *dpaax_iova_table_get_va(phys_addr_t paddr) __attribute__((hot)); + +static inline void * +dpaax_iova_table_get_va(phys_addr_t paddr) { + unsigned int i = 0, index; + void *vaddr = 0; + phys_addr_t paddr_align = paddr & DPAAX_MEM_SPLIT_MASK; + size_t offset = paddr & DPAAX_MEM_SPLIT_MASK_OFF; + struct dpaax_iovat_element *entry; + + entry = dpaax_iova_table_p->entries; + + do { + if (unlikely(i > dpaax_iova_table_p->count)) + break; + + if (paddr_align < entry[i].start) { + /* Incorrect paddr; Not in memory range */ + return NULL; + } + + if (paddr_align > (entry[i].start + entry[i].len)) { + i++; + continue; + } + + /* paddr > entry->start && paddr <= entry->(start+len) */ + index = (paddr_align - entry[i].start)/DPAAX_MEM_SPLIT; + vaddr = (void *)((uintptr_t)entry[i].pages[index] + offset); + break; + } while (1); + + return vaddr; +} + +#endif /* _DPAAX_IOVA_TABLE_H_ */ diff --git a/drivers/common/dpaax/dpaax_logs.h b/drivers/common/dpaax/dpaax_logs.h new file mode 100644 index 000000000..bf1b27cc1 --- /dev/null +++ b/drivers/common/dpaax/dpaax_logs.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_LOGS_H_ +#define _DPAAX_LOGS_H_ + +#include <rte_log.h> + +extern int dpaax_logger; + +#define DPAAX_LOG(level, fmt, args...) \ + rte_log(RTE_LOG_ ## level, dpaax_logger, "dpaax: " fmt "\n", \ + ##args) + +/* Debug logs are with Function names */ +#define DPAAX_DEBUG(fmt, args...) 
\ + rte_log(RTE_LOG_DEBUG, dpaax_logger, "dpaax: %s(): " fmt "\n", \ + __func__, ##args) + +#define DPAAX_INFO(fmt, args...) \ + DPAAX_LOG(INFO, fmt, ## args) +#define DPAAX_ERR(fmt, args...) \ + DPAAX_LOG(ERR, fmt, ## args) +#define DPAAX_WARN(fmt, args...) \ + DPAAX_LOG(WARNING, fmt, ## args) + +/* DP Logs, toggled out at compile time if level lower than current level */ +#define DPAAX_DP_LOG(level, fmt, args...) \ + RTE_LOG_DP(level, PMD, fmt, ## args) + +#define DPAAX_DP_DEBUG(fmt, args...) \ + DPAAX_DP_LOG(DEBUG, fmt, ## args) +#define DPAAX_DP_INFO(fmt, args...) \ + DPAAX_DP_LOG(INFO, fmt, ## args) +#define DPAAX_DP_WARN(fmt, args...) \ + DPAAX_DP_LOG(WARNING, fmt, ## args) + +#endif /* _DPAAX_LOGS_H_ */ diff --git a/drivers/common/dpaax/meson.build b/drivers/common/dpaax/meson.build new file mode 100644 index 000000000..98a1bdd48 --- /dev/null +++ b/drivers/common/dpaax/meson.build @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2018 NXP + +allow_experimental_apis = true + +if host_machine.system() != 'linux' + build = false +endif + +sources = files('dpaax_iova_table.c') + +cflags += ['-D_GNU_SOURCE'] diff --git a/drivers/common/dpaax/rte_common_dpaax_version.map b/drivers/common/dpaax/rte_common_dpaax_version.map new file mode 100644 index 000000000..8131c9e30 --- /dev/null +++ b/drivers/common/dpaax/rte_common_dpaax_version.map @@ -0,0 +1,11 @@ +DPDK_18.11 { + global: + + dpaax_iova_table_update; + dpaax_iova_table_depopulate; + dpaax_iova_table_dump; + dpaax_iova_table_p; + dpaax_iova_table_populate; + + local: *; +}; diff --git a/drivers/common/meson.build b/drivers/common/meson.build index f828ce7f7..0257d4d2b 100644 --- a/drivers/common/meson.build +++ b/drivers/common/meson.build @@ -2,6 +2,6 @@ # Copyright(c) 2018 Cavium, Inc std_deps = ['eal'] -drivers = ['mvep', 'octeontx', 'qat'] +drivers = ['dpaax', 'mvep', 'octeontx', 'qat'] config_flag_fmt = 'RTE_LIBRTE_@0@_COMMON' driver_name_fmt = 'rte_common_@0@' -- 2.17.1 ^ 
permalink raw reply [flat|nested] 53+ messages in thread
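For readers skimming the thread, the table described in patch 3/5 can be sketched in isolation as below. This is a minimal sketch and not the code from the patch: the names pa_block, pa_block_init, pa_table_update and pa_table_get_va are hypothetical stand-ins, plain calloc() replaces rte_zmalloc(), and the range checks are stricter than in the posted version. It assumes, as the patch does, that a PA and its VA share the same offset inside a 2 MB granule.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SPLIT      ((uint64_t)1 << 21)  /* 2 MB granule, mirrors DPAAX_MEM_SPLIT */
#define SPLIT_MASK (~(SPLIT - 1))       /* floor-align to a granule */

/* Hypothetical, simplified stand-in for struct dpaax_iovat_element. */
struct pa_block {
	uint64_t start;   /* first physical address of the block */
	uint64_t len;     /* block length in bytes */
	uint64_t *pages;  /* one VA slot per granule; 0 means "not mapped" */
};

/* Allocate zeroed VA slots for one physical block; 0 on success. */
static int
pa_block_init(struct pa_block *b, uint64_t start, uint64_t len)
{
	assert(len >= SPLIT && (len % SPLIT) == 0 && (start % SPLIT) == 0);
	b->start = start;
	b->len = len;
	b->pages = calloc(len / SPLIT, sizeof(uint64_t));
	return b->pages ? 0 : -1;
}

/* Record the VA for every granule covered by [paddr, paddr + length). */
static int
pa_table_update(struct pa_block *blk, unsigned int n,
		uint64_t paddr, uint64_t vaddr, uint64_t length)
{
	uint64_t pa = paddr & SPLIT_MASK;
	uint64_t va = vaddr & SPLIT_MASK;
	uint64_t end = paddr + length;
	unsigned int i;

	for (i = 0; i < n; i++) {
		uint64_t blk_end = blk[i].start + blk[i].len;

		if (pa < blk[i].start || pa >= blk_end)
			continue;
		/* Fill one slot per granule, never past the block end. */
		for (; pa < end && pa < blk_end; pa += SPLIT, va += SPLIT)
			blk[i].pages[(pa - blk[i].start) / SPLIT] = va;
		return 0;
	}
	return -1; /* paddr is outside every known block */
}

/* Translate PA to VA; NULL if the PA is unknown or not yet mapped. */
static void *
pa_table_get_va(const struct pa_block *blk, unsigned int n, uint64_t paddr)
{
	uint64_t pa = paddr & SPLIT_MASK;
	uint64_t off = paddr & (SPLIT - 1);
	unsigned int i;

	for (i = 0; i < n; i++) {
		uint64_t va;

		if (pa < blk[i].start || pa >= blk[i].start + blk[i].len)
			continue;
		va = blk[i].pages[(pa - blk[i].start) / SPLIT];
		return va ? (void *)(uintptr_t)(va + off) : NULL;
	}
	return NULL;
}
```

The cost of a lookup is one short scan over the blocks (a handful on these SoCs) plus one array index, which is where the saving over a full memseg walk would come from. Returning NULL for an unmapped slot is a choice made in this sketch so that a caller can fall back to a slower path.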
* [dpdk-dev] [PATCH v4 4/5] dpaa: enable dpaax library 2018-10-15 6:41 ` [dpdk-dev] [PATCH v4 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain ` (2 preceding siblings ...) 2018-10-15 6:42 ` [dpdk-dev] [PATCH v4 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain @ 2018-10-15 6:42 ` Shreyansh Jain 2018-10-15 6:42 ` [dpdk-dev] [PATCH v4 5/5] fslmc: " Shreyansh Jain 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 0/5] Shreyansh Jain 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-10-15 6:42 UTC (permalink / raw) To: ferruh.yigit, thomas; +Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain With this patch, dpaa bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa, event/dpaa and net/dpaa as they are dependent on the bus/dpaa and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/dpaa/Makefile | 1 + drivers/bus/dpaa/dpaa_bus.c | 4 ++++ drivers/bus/dpaa/meson.build | 2 +- drivers/bus/dpaa/rte_dpaa_bus.h | 6 ++++++ drivers/crypto/dpaa_sec/Makefile | 1 + drivers/crypto/dpaa_sec/dpaa_sec.c | 6 ++++++ drivers/event/dpaa/Makefile | 1 + drivers/mempool/dpaa/Makefile | 1 + drivers/mempool/dpaa/dpaa_mempool.c | 4 ++++ drivers/mempool/dpaa/dpaa_mempool.h | 4 +--- drivers/net/dpaa/Makefile | 1 + mk/rte.app.mk | 1 + 12 files changed, 28 insertions(+), 4 deletions(-) diff --git a/drivers/bus/dpaa/Makefile b/drivers/bus/dpaa/Makefile index 9337b5f92..381a5c659 100644 --- a/drivers/bus/dpaa/Makefile +++ b/drivers/bus/dpaa/Makefile @@ -48,5 +48,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_BUS) += \ LDLIBS += -lpthread LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c index 138e0f98d..381c3b17c 100644 --- a/drivers/bus/dpaa/dpaa_bus.c +++ 
b/drivers/bus/dpaa/dpaa_bus.c @@ -34,6 +34,7 @@ #include <rte_dpaa_bus.h> #include <rte_dpaa_logs.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -548,6 +549,9 @@ rte_dpaa_bus_probe(void) fclose(svr_file); } + /* And initialize the PA->VA translation table */ + dpaax_iova_table_populate(); + /* For each registered driver, and device, call the driver->probe */ TAILQ_FOREACH(dev, &rte_dpaa_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_dpaa_bus.driver_list, next) { diff --git a/drivers/bus/dpaa/meson.build b/drivers/bus/dpaa/meson.build index 5e7705571..11a3c9499 100644 --- a/drivers/bus/dpaa/meson.build +++ b/drivers/bus/dpaa/meson.build @@ -7,7 +7,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev'] +deps += ['common_dpaax', 'eventdev'] sources = files('base/fman/fman.c', 'base/fman/fman_hw.c', 'base/fman/netcfg_layer.c', diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h index 15dc6a4ac..1d580a000 100644 --- a/drivers/bus/dpaa/rte_dpaa_bus.h +++ b/drivers/bus/dpaa/rte_dpaa_bus.h @@ -8,6 +8,7 @@ #include <rte_bus.h> #include <rte_mempool.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -110,6 +111,11 @@ extern struct dpaa_memseg_list rte_dpaa_memsegs; static inline void *rte_dpaa_mem_ptov(phys_addr_t paddr) { struct dpaa_memseg *ms; + void *va; + + va = dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* Check if the address is already part of the memseg list internally * maintained by the dpaa driver. 
diff --git a/drivers/crypto/dpaa_sec/Makefile b/drivers/crypto/dpaa_sec/Makefile index 9be447041..674a7a398 100644 --- a/drivers/crypto/dpaa_sec/Makefile +++ b/drivers/crypto/dpaa_sec/Makefile @@ -38,5 +38,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_DPAA_SEC) += dpaa_sec.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c index 7c0459f9f..54f1913f2 100644 --- a/drivers/crypto/dpaa_sec/dpaa_sec.c +++ b/drivers/crypto/dpaa_sec/dpaa_sec.c @@ -107,6 +107,12 @@ dpaa_mem_vtop(void *vaddr) static inline void * dpaa_mem_ptov(rte_iova_t paddr) { + void *va; + + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va)) + return va; + return rte_mem_iova2virt(paddr); } diff --git a/drivers/event/dpaa/Makefile b/drivers/event/dpaa/Makefile index ddd855227..6f93e7f40 100644 --- a/drivers/event/dpaa/Makefile +++ b/drivers/event/dpaa/Makefile @@ -34,5 +34,6 @@ LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs LDLIBS += -lrte_eventdev -lrte_pmd_dpaa -lrte_bus_vdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/Makefile b/drivers/mempool/dpaa/Makefile index da8da1e90..9cf36856c 100644 --- a/drivers/mempool/dpaa/Makefile +++ b/drivers/mempool/dpaa/Makefile @@ -31,5 +31,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += dpaa_mempool.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c index 1c121223b..b05fb7b9d 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.c +++ b/drivers/mempool/dpaa/dpaa_mempool.c @@ -26,6 +26,7 @@ #include <rte_ring.h> #include <dpaa_mempool.h> +#include <dpaax_iova_table.h> /* List of all the memseg information locally 
maintained in dpaa driver. This * is to optimize the PA_to_VA searches until a better mechanism (algo) is @@ -285,6 +286,9 @@ dpaa_populate(struct rte_mempool *mp, unsigned int max_objs, return 0; } + /* Update the PA-VA Table */ + dpaax_iova_table_update(paddr, vaddr, len); + bp_info = DPAA_MEMPOOL_TO_POOL_INFO(mp); total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; diff --git a/drivers/mempool/dpaa/dpaa_mempool.h b/drivers/mempool/dpaa/dpaa_mempool.h index 092f326cb..533e1c6e2 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.h +++ b/drivers/mempool/dpaa/dpaa_mempool.h @@ -43,10 +43,8 @@ struct dpaa_bp_info { }; static inline void * -DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info, uint64_t addr) +DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info __rte_unused, uint64_t addr) { - if (bp_info->ptov_off) - return ((void *) (size_t)(addr + bp_info->ptov_off)); return rte_dpaa_mem_ptov(addr); } diff --git a/drivers/net/dpaa/Makefile b/drivers/net/dpaa/Makefile index d7a0a50c5..1c4f7d914 100644 --- a/drivers/net/dpaa/Makefile +++ b/drivers/net/dpaa/Makefile @@ -38,6 +38,7 @@ LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax # install this header file SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA_PMD)-include := rte_pmd_dpaa.h diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 3ece996e8..85605e38e 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -117,6 +117,7 @@ ifeq ($(CONFIG_RTE_BUILD_SHARED_LIB),n) _LDLIBS-$(CONFIG_RTE_DRIVER_MEMPOOL_BUCKET) += -lrte_mempool_bucket _LDLIBS-$(CONFIG_RTE_DRIVER_MEMPOOL_STACK) += -lrte_mempool_stack ifeq ($(CONFIG_RTE_LIBRTE_DPAA_BUS),y) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += -lrte_mempool_dpaa endif ifeq ($(CONFIG_RTE_EAL_VFIO)$(CONFIG_RTE_LIBRTE_FSLMC_BUS),yy) -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
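The ptov helpers touched in patch 4/5 all follow one pattern: try the fast table first, and fall back to a slower search only on a miss. A minimal sketch of that pattern follows, under the assumption that each translator returns NULL on a miss. The names ptov_fn, ptov_chain, fast_miss and slow_identity are hypothetical and exist only for illustration; in the patch the two stages are dpaax_iova_table_get_va() followed by the driver's memseg search or rte_mem_iova2virt().

```c
#include <stddef.h>
#include <stdint.h>

/* A translator maps a physical address to a VA, or NULL on a miss. */
typedef void *(*ptov_fn)(uint64_t paddr);

/* Try each translator in order; the first non-NULL answer wins. */
static void *
ptov_chain(const ptov_fn *fns, size_t n, uint64_t paddr)
{
	size_t i;

	for (i = 0; i < n; i++) {
		void *va = fns[i](paddr);

		if (va != NULL)
			return va;
	}
	return NULL;
}

/* Stub fast path that always misses (stands in for an empty table). */
static void *
fast_miss(uint64_t paddr)
{
	(void)paddr;
	return NULL;
}

/* Stub slow path using an identity mapping, for demonstration only. */
static void *
slow_identity(uint64_t paddr)
{
	return (void *)(uintptr_t)paddr;
}
```

Ordering the chain from cheapest to most expensive keeps the common case on the hot path while preserving correctness for addresses the table has not seen.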
* [dpdk-dev] [PATCH v4 5/5] fslmc: enable dpaax library 2018-10-15 6:41 ` [dpdk-dev] [PATCH v4 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain ` (3 preceding siblings ...) 2018-10-15 6:42 ` [dpdk-dev] [PATCH v4 4/5] dpaa: enable dpaax library Shreyansh Jain @ 2018-10-15 6:42 ` Shreyansh Jain 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 0/5] Shreyansh Jain 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-10-15 6:42 UTC (permalink / raw) To: ferruh.yigit, thomas; +Cc: anatoly.burakov, pbhagavatula, dev, Shreyansh Jain With this patch, fslmc bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa2, event/dpaa2, net/dpaa2, raw/dpaa2_cmdif and raw/dpaa2_qdma as they are dependent on the bus/fslmc and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/fslmc/Makefile | 1 + drivers/bus/fslmc/fslmc_bus.c | 20 ++++++++++++++++ drivers/bus/fslmc/meson.build | 2 +- drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 21 ++++++++--------- drivers/crypto/dpaa2_sec/Makefile | 1 + drivers/event/dpaa2/Makefile | 1 + drivers/mempool/dpaa2/Makefile | 1 + drivers/mempool/dpaa2/dpaa2_hw_mempool.c | 29 ++++-------------------- drivers/net/dpaa2/Makefile | 1 + drivers/raw/dpaa2_cmdif/Makefile | 1 + drivers/raw/dpaa2_qdma/Makefile | 1 + mk/rte.app.mk | 1 + 12 files changed, 43 insertions(+), 37 deletions(-) diff --git a/drivers/bus/fslmc/Makefile b/drivers/bus/fslmc/Makefile index e95551980..218d9bd28 100644 --- a/drivers/bus/fslmc/Makefile +++ b/drivers/bus/fslmc/Makefile @@ -19,6 +19,7 @@ CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/qbman/include CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax # versioning export map EXPORT_MAP := rte_bus_fslmc_version.map diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c index 
2bc9457bc..5ba5ce96b 100644 --- a/drivers/bus/fslmc/fslmc_bus.c +++ b/drivers/bus/fslmc/fslmc_bus.c @@ -20,6 +20,8 @@ #include <fslmc_vfio.h> #include "fslmc_logs.h" +#include <dpaax_iova_table.h> + int dpaa2_logtype_bus; #define VFIO_IOMMU_GROUP_PATH "/sys/kernel/iommu_groups" @@ -377,6 +379,19 @@ rte_fslmc_probe(void) probe_all = rte_fslmc_bus.bus.conf.scan_mode != RTE_BUS_SCAN_WHITELIST; + /* In case of PA, the FD addresses returned by qbman APIs are physical + * addresses, which need conversion into equivalent VA address for + * rte_mbuf. For that, a table (a serial array, in memory) is used to + * increase translation efficiency. + * This has to be done before probe as some device initialization + * (during) probe allocate memory (dpaa2_sec) which needs to be pinned + * to this table. + */ + ret = dpaax_iova_table_populate(); + if (ret) { + DPAA2_BUS_WARN("PA->VA Translation table not available;"); + } + TAILQ_FOREACH(dev, &rte_fslmc_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_fslmc_bus.driver_list, next) { ret = rte_fslmc_match(drv, dev); @@ -456,6 +471,11 @@ rte_fslmc_driver_unregister(struct rte_dpaa2_driver *driver) fslmc_bus = driver->fslmc_bus; + /* Cleanup the PA->VA Translation table; From whereever this function + * is called from. 
+ */ + dpaax_iova_table_depopulate(); + TAILQ_REMOVE(&fslmc_bus->driver_list, driver, next); /* Update Bus references */ driver->fslmc_bus = NULL; diff --git a/drivers/bus/fslmc/meson.build b/drivers/bus/fslmc/meson.build index 54ca92d0c..18c45495b 100644 --- a/drivers/bus/fslmc/meson.build +++ b/drivers/bus/fslmc/meson.build @@ -7,7 +7,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev', 'kvargs'] +deps += ['common_dpaax', 'eventdev', 'kvargs'] sources = files('fslmc_bus.c', 'fslmc_vfio.c', 'mc/dpbp.c', diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h index 820759360..678ee34b8 100644 --- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h +++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h @@ -9,6 +9,7 @@ #define _DPAA2_HW_PVT_H_ #include <rte_eventdev.h> +#include <dpaax_iova_table.h> #include <mc/fsl_mc_sys.h> #include <fsl_qbman_portal.h> @@ -275,28 +276,26 @@ extern struct dpaa2_memseg_list rte_dpaa2_memsegs; #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA extern uint8_t dpaa2_virt_mode; static void *dpaa2_mem_ptov(phys_addr_t paddr) __attribute__((unused)); -/* todo - this is costly, need to write a fast coversion routine */ + static void *dpaa2_mem_ptov(phys_addr_t paddr) { - struct dpaa2_memseg *ms; + void *va; if (dpaa2_virt_mode) return (void *)(size_t)paddr; - /* Check if the address is already part of the memseg list internally - * maintained by the dpaa2 driver. 
- */ - TAILQ_FOREACH(ms, &rte_dpaa2_memsegs, next) { - if (paddr >= ms->iova && paddr < - ms->iova + ms->len) - return RTE_PTR_ADD(ms->vaddr, (uintptr_t)(paddr - ms->iova)); - } + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* If not, Fallback to full memseg list searching */ - return rte_mem_iova2virt(paddr); + va = rte_mem_iova2virt(paddr); + + return va; } static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused)); + static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) { const struct rte_memseg *memseg; diff --git a/drivers/crypto/dpaa2_sec/Makefile b/drivers/crypto/dpaa2_sec/Makefile index da3d8f84f..1f951a14b 100644 --- a/drivers/crypto/dpaa2_sec/Makefile +++ b/drivers/crypto/dpaa2_sec/Makefile @@ -51,5 +51,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_cryptodev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/event/dpaa2/Makefile b/drivers/event/dpaa2/Makefile index 5e1a63200..7a71161de 100644 --- a/drivers/event/dpaa2/Makefile +++ b/drivers/event/dpaa2/Makefile @@ -21,6 +21,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal LDLIBS += -lrte_eal -lrte_eventdev LDLIBS += -lrte_bus_fslmc -lrte_mempool_dpaa2 -lrte_pmd_dpaa2 LDLIBS += -lrte_bus_vdev +LDLIBS += -lrte_common_dpaax CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2 CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2/mc diff --git a/drivers/mempool/dpaa2/Makefile b/drivers/mempool/dpaa2/Makefile index 9e4c87d79..0fc69c3bf 100644 --- a/drivers/mempool/dpaa2/Makefile +++ b/drivers/mempool/dpaa2/Makefile @@ -30,6 +30,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += dpaa2_hw_mempool.c LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL)-include := rte_dpaa2_mempool.h diff --git a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c 
index 84ff12811..c5f60c5c6 100644 --- a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c +++ b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c @@ -30,6 +30,8 @@ #include "dpaa2_hw_mempool.h" #include "dpaa2_hw_mempool_logs.h" +#include <dpaax_iova_table.h> + struct dpaa2_bp_info rte_dpaa2_bpid_info[MAX_BPID]; static struct dpaa2_bp_list *h_bp_list; @@ -393,31 +395,8 @@ dpaa2_populate(struct rte_mempool *mp, unsigned int max_objs, void *vaddr, rte_iova_t paddr, size_t len, rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg) { - struct dpaa2_memseg *ms; - - /* For each memory chunk pinned to the Mempool, a linked list of the - * contained memsegs is created for searching when PA to VA - * conversion is required. - */ - ms = rte_zmalloc(NULL, sizeof(struct dpaa2_memseg), 0); - if (!ms) { - DPAA2_MEMPOOL_ERR("Unable to allocate internal memory."); - DPAA2_MEMPOOL_WARN("Fast Physical to Virtual Addr translation would not be available."); - /* If the element is not added, it would only lead to failure - * in searching for the element and the logic would Fallback - * to traditional DPDK memseg traversal code. So, this is not - * a blocking error - but, error would be printed on screen. - */ - return 0; - } - - ms->vaddr = vaddr; - ms->iova = paddr; - ms->len = len; - /* Head insertions are generally faster than tail insertions as the - * buffers pinned are picked from rear end. 
- */ - TAILQ_INSERT_HEAD(&rte_dpaa2_memsegs, ms, next); + /* Insert entry into the PA->VA Table */ + dpaax_iova_table_update(paddr, vaddr, len); return rte_mempool_op_populate_default(mp, max_objs, vaddr, paddr, len, obj_cb, obj_cb_arg); diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile index 9b0b14331..52649a945 100644 --- a/drivers/net/dpaa2/Makefile +++ b/drivers/net/dpaa2/Makefile @@ -40,5 +40,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/raw/dpaa2_cmdif/Makefile b/drivers/raw/dpaa2_cmdif/Makefile index 9b863dda2..3c56c4b44 100644 --- a/drivers/raw/dpaa2_cmdif/Makefile +++ b/drivers/raw/dpaa2_cmdif/Makefile @@ -21,6 +21,7 @@ LDLIBS += -lrte_eal LDLIBS += -lrte_kvargs LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_cmdif_version.map diff --git a/drivers/raw/dpaa2_qdma/Makefile b/drivers/raw/dpaa2_qdma/Makefile index d88809ead..2f79a3f41 100644 --- a/drivers/raw/dpaa2_qdma/Makefile +++ b/drivers/raw/dpaa2_qdma/Makefile @@ -22,6 +22,7 @@ LDLIBS += -lrte_mempool LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev LDLIBS += -lrte_ring +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_qdma_version.map diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 85605e38e..f2218df39 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -121,6 +121,7 @@ _LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += -lrte_mempool_dpaa endif ifeq ($(CONFIG_RTE_EAL_VFIO)$(CONFIG_RTE_LIBRTE_FSLMC_BUS),yy) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += -lrte_mempool_dpaa2 endif -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
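For readers following the dpaa2_mem_ptov() change in the patch above, the lookup order (virtual mode short-circuit, then the fast table, then the full memseg search) can be sketched roughly as below. This is only an illustration under stated assumptions: fast_lookup() and slow_lookup() are hypothetical stand-ins for dpaax_iova_table_get_va() and rte_mem_iova2virt(), the demo region and its base address are made up, and nothing here links against DPDK.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not DPDK symbols. */
#define REGION_PA  0x80000000ULL /* assumed base PA of the demo region */
#define REGION_LEN 4096u
#define FAST_LEN   2048u         /* only this prefix is known to the fast table */

static unsigned char region[REGION_LEN]; /* stands in for hugepage memory */
static int virt_mode;                    /* plays the role of dpaa2_virt_mode */
static int slow_path_calls;              /* counts fallbacks, for illustration */

/* Fast path: returns NULL when the address is not in the table. */
static void *fast_lookup(uint64_t pa)
{
	if (pa >= REGION_PA && pa < REGION_PA + FAST_LEN)
		return &region[pa - REGION_PA];
	return NULL;
}

/* Slow path: models the full (expensive) search over all memory. */
static void *slow_lookup(uint64_t pa)
{
	slow_path_calls++;
	if (pa >= REGION_PA && pa < REGION_PA + REGION_LEN)
		return &region[pa - REGION_PA];
	return NULL;
}

/* Same decision order as the patched helper: VA mode, table, fallback. */
static void *mem_ptov(uint64_t pa)
{
	void *va;

	if (virt_mode)
		return (void *)(uintptr_t)pa;

	va = fast_lookup(pa);
	if (va != NULL)
		return va;

	return slow_lookup(pa);
}
```

The point of this ordering is that a table miss is not an error: it only costs the slower search, which is why the probe path in the patch merely warns when the table cannot be populated.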
* [dpdk-dev] [PATCH v5 0/5] 2018-10-15 6:41 ` [dpdk-dev] [PATCH v4 0/5] Add a PA-VA Translation table for DPAAx Shreyansh Jain ` (4 preceding siblings ...) 2018-10-15 6:42 ` [dpdk-dev] [PATCH v4 5/5] fslmc: " Shreyansh Jain @ 2018-10-15 12:01 ` Shreyansh Jain 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain ` (5 more replies) 5 siblings, 6 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-15 12:01 UTC (permalink / raw)
To: thomas; +Cc: ferruh.yigit, anatoly.burakov, pbhagavatula, dev, Shreyansh Jain

::Background::

After the restructuring of memory in the last release(s), one of the major impacts on the fslmc/dpaa buses (and their devices) was the performance drop when using physical addressing. Previously, it was assumed that the physical range was contiguous for any given request for hugepage memory. That way, whenever a virtual address was returned, it was easy to fetch its physical equivalent in almost constant time. But with the memory hotplug series, that assumption was negated. Every call that device drivers made to rte_mem_virt2iova or rte_mem_virt2phy was expensive. (Using IOVA_CONTIG is an app dependency, which is not a practical option.)

For fslmc, working on Physical or Virtual (IOMMU supported) addresses is optional. For the dpaa bus, it is not optional and only physical addressing is supported. Thus, it impacted the dpaa bus the most.

::DPAAX PA-VA Table::

- A simple table containing entries for all physical memory ranges available on a particular SoC (in this case, NXP's LS104x and LS20xx series, which are handled by the dpaa and fslmc bus, respectively). As of now, fetching the range is SoC dependent.
- We populate the table either through the mempool handler (for mempool pinned memory) or through the memory event callbacks (for cases where working memory is allocated by the application).
- Though the aim is only to translate addresses for descriptors which are Rx'd from devices, this is a generic layer which should work in other cases as well (though that is not the target of current testing).

::About patches::

Patch 1: There was an issue in the existing PA/VA mode reporting done by the fslmc bus. This patch fixes it.
Patch 2: Common libraries/components can be a dependency for the bus, thus blocking parallel compilation of the two.
Patch 3: Add the library in common/dpaax. This is a single patch as the functions are mostly inter-linked.
Patch 4~5: Add support in dpaa and fslmc bus, respectively. It is not possible to unlink the bus and device drivers; thus, these patches have blanket changes across all drivers.

::Next Steps::

- Some optimizations are required to tune the access pattern of the table. These would be posted as additional patches.
- In case there is any possible split of patches, I will post another version. But until then, this is the layout.

::Version History::

v4->v5:
- Fixed a shared build error appearing on CI (couldn't reproduce locally, so moved the shared build sections in makefile)

v3->v4:
- Fixed missing rework against review comment from Pavan: shift the IOVA mode detection code in bus/fslmc
- Rebased over master (abe92131c92)

v2->v3:
- Added back del operation (update) for mem-events, which was removed in v2
- Change IOMMU(PA) detection for FSLMC Bus (review comment: Pavan)
- Rebase on master (6673fe0ce2)

v1->v2:
- Rework of review comments on v1
- Removed dpaax_iova_table_del API - that is redundant
- Changed dpaax_iova_table_add to dpaax_iova_table_update to make it more relevant
- Previous patch removed an advertised API (rte_dpaa2_memsegs). This is fixed. A deprecation notice would now be sent for removal in next release.
- Rebase on master (5f73c2670f); also verified on net-next/master (317f8b01f)

Shreyansh Jain (5):
  bus/fslmc: fix physical addressing check
  drivers: common as dependency for bus
  common/dpaax: add library for PA VA translation table
  dpaa: enable dpaax library
  fslmc: enable dpaax library

 config/common_base                            |   5 +
 config/common_linuxapp                        |   5 +
 drivers/Makefile                              |   1 +
 drivers/bus/dpaa/Makefile                     |   1 +
 drivers/bus/dpaa/dpaa_bus.c                   |   4 +
 drivers/bus/dpaa/meson.build                  |   2 +-
 drivers/bus/dpaa/rte_dpaa_bus.h               |   6 +
 drivers/bus/fslmc/Makefile                    |   1 +
 drivers/bus/fslmc/fslmc_bus.c                 |  24 +
 drivers/bus/fslmc/meson.build                 |   2 +-
 drivers/bus/fslmc/portal/dpaa2_hw_pvt.h       |  21 +-
 drivers/common/Makefile                       |   4 +
 drivers/common/dpaax/Makefile                 |  31 ++
 drivers/common/dpaax/dpaax_iova_table.c       | 461 ++++++++++++++++++
 drivers/common/dpaax/dpaax_iova_table.h       | 103 ++++
 drivers/common/dpaax/dpaax_logs.h             |  39 ++
 drivers/common/dpaax/meson.build              |  12 +
 .../common/dpaax/rte_common_dpaax_version.map |  11 +
 drivers/common/meson.build                    |   2 +-
 drivers/crypto/dpaa2_sec/Makefile             |   1 +
 drivers/crypto/dpaa_sec/Makefile              |   1 +
 drivers/crypto/dpaa_sec/dpaa_sec.c            |   6 +
 drivers/event/dpaa/Makefile                   |   1 +
 drivers/event/dpaa2/Makefile                  |   1 +
 drivers/mempool/dpaa/Makefile                 |   1 +
 drivers/mempool/dpaa/dpaa_mempool.c           |   4 +
 drivers/mempool/dpaa/dpaa_mempool.h           |   4 +-
 drivers/mempool/dpaa2/Makefile                |   1 +
 drivers/mempool/dpaa2/dpaa2_hw_mempool.c      |  29 +-
 drivers/net/dpaa/Makefile                     |   1 +
 drivers/net/dpaa2/Makefile                    |   1 +
 drivers/raw/dpaa2_cmdif/Makefile              |   1 +
 drivers/raw/dpaa2_qdma/Makefile               |   1 +
 mk/rte.app.mk                                 |   7 +
 34 files changed, 753 insertions(+), 42 deletions(-)
 create mode 100644 drivers/common/dpaax/Makefile
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.c
 create mode 100644 drivers/common/dpaax/dpaax_iova_table.h
 create mode 100644 drivers/common/dpaax/dpaax_logs.h
 create mode 100644 drivers/common/dpaax/meson.build
 create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map

-- 
2.17.1

^ permalink raw reply [flat|nested] 53+ messages in thread
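The table described in the cover letter above (one entry per contiguous physical range, each split into fixed 2 MB slots that hold a VA) can be sketched as below. This is a minimal, single-block illustration and not the code from patch 3: struct pa_block, block_update() and block_get_va() are hypothetical names, the slot array is a fixed-size stand-in for the dynamically sized pages[] area, and the range checks use half-open intervals. It also assumes that PA and VA share the same offset within a 2 MB slot, as they would for hugepage-backed memory.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SPLIT      ((uint64_t)1 << 21) /* slot size, same role as DPAAX_MEM_SPLIT */
#define SPLIT_MASK (~(SPLIT - 1))      /* floor-align to a slot boundary */
#define MAX_SLOTS  8                   /* assumed cap, for the sketch only */

/* One contiguous physical range; the caller keeps len <= MAX_SLOTS * SPLIT. */
struct pa_block {
	uint64_t start;            /* first PA covered by this block */
	uint64_t len;              /* size of the block in bytes */
	uint64_t pages[MAX_SLOTS]; /* VA of each 2 MB slot; 0 means unmapped */
};

/* Record that [pa, pa + len) maps to va onwards; passing va == 0 clears. */
static int block_update(struct pa_block *b, uint64_t pa, uint64_t va,
			uint64_t len)
{
	uint64_t a_pa = pa & SPLIT_MASK;
	uint64_t a_va = va & SPLIT_MASK;
	uint64_t end = pa + len;

	if (a_pa < b->start || end > b->start + b->len)
		return -1; /* request falls outside this block */

	/* A request larger than one slot fills several consecutive slots. */
	for (; a_pa < end; a_pa += SPLIT, a_va += SPLIT)
		b->pages[(a_pa - b->start) / SPLIT] = a_va;

	return 0;
}

/* Translate pa to its VA, or return 0 when no mapping is recorded. */
static uint64_t block_get_va(const struct pa_block *b, uint64_t pa)
{
	uint64_t a_pa = pa & SPLIT_MASK;
	uint64_t slot_va;

	if (a_pa < b->start || a_pa >= b->start + b->len)
		return 0;

	slot_va = b->pages[(a_pa - b->start) / SPLIT];
	if (slot_va == 0)
		return 0;

	return slot_va + (pa & (SPLIT - 1)); /* re-apply the in-slot offset */
}
```

With a single block the lookup is one subtraction, one division and one array read. The full table adds a short linear scan over the blocks, which stays cheap because the device tree exposes only a handful of memory ranges.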
* [dpdk-dev] [PATCH v5 1/5] bus/fslmc: fix physical addressing check 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 0/5] Shreyansh Jain @ 2018-10-15 12:01 ` Shreyansh Jain 2018-10-16 10:02 ` Thomas Monjalon 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 2/5] drivers: common as dependency for bus Shreyansh Jain ` (4 subsequent siblings) 5 siblings, 1 reply; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-15 12:01 UTC (permalink / raw)
To: thomas
Cc: ferruh.yigit, anatoly.burakov, pbhagavatula, dev, Shreyansh Jain, hemant.agrawal

In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported
class is RTE_IOVA_PA.

Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA")
Cc: hemant.agrawal@nxp.com

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/bus/fslmc/fslmc_bus.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c
index 960f55071..2bc9457bc 100644
--- a/drivers/bus/fslmc/fslmc_bus.c
+++ b/drivers/bus/fslmc/fslmc_bus.c
@@ -496,6 +496,10 @@ rte_dpaa2_get_iommu_class(void)
 	if (TAILQ_EMPTY(&rte_fslmc_bus.device_list))
 		return RTE_IOVA_DC;
 
+#ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA
+	return RTE_IOVA_PA;
+#endif
+
 	/* check if all devices on the bus support Virtual addressing or not */
 	has_iova_va = fslmc_all_device_support_iova();
 
-- 
2.17.1

^ permalink raw reply [flat|nested] 53+ messages in thread
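The decision order that the hunk above gives rte_dpaa2_get_iommu_class() can be sketched as a small pure function. This is a hedged illustration only: enum iova_class and pick_iova_class() are hypothetical names, the compile-time #ifdef is modelled as a runtime flag so that both outcomes are visible, and the last branch (VA when every device supports it, PA otherwise) is an assumption, since the hunk does not show how has_iova_va is consumed.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the three IOVA classes; not the DPDK enum. */
enum iova_class { IOVA_DC, IOVA_PA, IOVA_VA };

static enum iova_class
pick_iova_class(bool bus_has_devices, bool phys_iova_build,
		bool all_devices_support_va)
{
	/* Nothing on the bus: no preference ("don't care"). */
	if (!bus_has_devices)
		return IOVA_DC;

	/* Physical-IOVA build: PA is the only supported class. */
	if (phys_iova_build)
		return IOVA_PA;

	/* Assumed tail: VA only if every device can use it. */
	return all_devices_support_va ? IOVA_VA : IOVA_PA;
}
```

Placing the physical-IOVA check before the per-device query keeps a PA-only build from ever reporting VA, which is the misreport the patch description refers to.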
* Re: [dpdk-dev] [PATCH v5 1/5] bus/fslmc: fix physical addressing check 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain @ 2018-10-16 10:02 ` Thomas Monjalon 0 siblings, 0 replies; 53+ messages in thread
From: Thomas Monjalon @ 2018-10-16 10:02 UTC (permalink / raw)
To: Shreyansh Jain
Cc: dev, ferruh.yigit, anatoly.burakov, pbhagavatula, hemant.agrawal, stable

15/10/2018 14:01, Shreyansh Jain:
> In case RTE_LIBRTE_DPAA2_USE_PHYS_IOVA is enabled, only supported
> class is RTE_IOVA_PA.
>
> Fixes: f7768afac101 ("bus/fslmc: support dynamic IOVA")
> Cc: hemant.agrawal@nxp.com

+ Cc: stable@dpdk.org

^ permalink raw reply [flat|nested] 53+ messages in thread
* [dpdk-dev] [PATCH v5 2/5] drivers: common as dependency for bus 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 0/5] Shreyansh Jain 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain @ 2018-10-15 12:01 ` Shreyansh Jain 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain ` (3 subsequent siblings) 5 siblings, 0 replies; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-15 12:01 UTC (permalink / raw)
To: thomas; +Cc: ferruh.yigit, anatoly.burakov, pbhagavatula, dev, Shreyansh Jain

Prior to this patch, bus and common were compiled in parallel. With
this patch, bus depends on common. This is especially important for
the DPAA/FSLMC buses, which are going to use the common/dpaax library.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
---
 drivers/Makefile | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/Makefile b/drivers/Makefile
index 75660765e..7d5da5d9f 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -5,6 +5,7 @@ include $(RTE_SDK)/mk/rte.vars.mk
 
 DIRS-y += common
 DIRS-y += bus
+DEPDIRS-bus := common
 DIRS-y += mempool
 DEPDIRS-mempool := common bus
 DIRS-y += net
-- 
2.17.1

^ permalink raw reply [flat|nested] 53+ messages in thread
* [dpdk-dev] [PATCH v5 3/5] common/dpaax: add library for PA VA translation table 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 0/5] Shreyansh Jain 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 1/5] bus/fslmc: fix physical addressing check Shreyansh Jain 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 2/5] drivers: common as dependency for bus Shreyansh Jain @ 2018-10-15 12:01 ` Shreyansh Jain 2018-10-15 23:17 ` Thomas Monjalon 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 4/5] dpaa: enable dpaax library Shreyansh Jain ` (2 subsequent siblings) 5 siblings, 1 reply; 53+ messages in thread
From: Shreyansh Jain @ 2018-10-15 12:01 UTC (permalink / raw)
To: thomas; +Cc: ferruh.yigit, anatoly.burakov, pbhagavatula, dev, Shreyansh Jain

A common library, valid for dpaaX drivers, which is used to maintain a local copy of PA->VA translations.

In case of physical addressing mode (one of the options for FSLMC, and the only option for DPAA bus), the addresses of descriptors Rx'd are physical. These need to be converted into equivalent VA for rte_mbuf and other similar calls. Using rte_mem_virt2iova or rte_mem_virt2phy is expensive. This library is an attempt to reduce the overall cost associated with this translation.

A small table is maintained, containing contiguous entries, each representing a contiguous physical range. Each of these entries stores the equivalent VA, which is fed during mempool creation, or memory allocation/deallocation callbacks.

Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- config/common_base | 5 + config/common_linuxapp | 5 + drivers/common/Makefile | 4 + drivers/common/dpaax/Makefile | 31 ++ drivers/common/dpaax/dpaax_iova_table.c | 461 ++++++++++++++++++ drivers/common/dpaax/dpaax_iova_table.h | 103 ++++ drivers/common/dpaax/dpaax_logs.h | 39 ++ drivers/common/dpaax/meson.build | 12 + .../common/dpaax/rte_common_dpaax_version.map | 11 + drivers/common/meson.build | 2 +- 10 files changed, 672 insertions(+), 1 deletion(-) create mode 100644 drivers/common/dpaax/Makefile create mode 100644 drivers/common/dpaax/dpaax_iova_table.c create mode 100644 drivers/common/dpaax/dpaax_iova_table.h create mode 100644 drivers/common/dpaax/dpaax_logs.h create mode 100644 drivers/common/dpaax/meson.build create mode 100644 drivers/common/dpaax/rte_common_dpaax_version.map diff --git a/config/common_base b/config/common_base index 8c7ead68d..7f10f7215 100644 --- a/config/common_base +++ b/config/common_base @@ -139,6 +139,11 @@ CONFIG_RTE_ETHDEV_PROFILE_WITH_VTUNE=n # CONFIG_RTE_ETHDEV_TX_PREPARE_NOOP=n +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=n + # # Compile the Intel FPGA bus # diff --git a/config/common_linuxapp b/config/common_linuxapp index 485e1467d..76b884c48 100644 --- a/config/common_linuxapp +++ b/config/common_linuxapp @@ -29,6 +29,11 @@ CONFIG_RTE_PROC_INFO=y CONFIG_RTE_LIBRTE_VMBUS=y CONFIG_RTE_LIBRTE_NETVSC_PMD=y +# +# Common libraries, before Bus/PMDs +# +CONFIG_RTE_LIBRTE_COMMON_DPAAX=y + # NXP DPAA BUS and drivers CONFIG_RTE_LIBRTE_DPAA_BUS=y CONFIG_RTE_LIBRTE_DPAA_MEMPOOL=y diff --git a/drivers/common/Makefile b/drivers/common/Makefile index b498c238f..6392a3412 100644 --- a/drivers/common/Makefile +++ b/drivers/common/Makefile @@ -14,4 +14,8 @@ ifneq (,$(findstring y,$(MVEP-y))) DIRS-y += mvep endif +ifeq ($(CONFIG_RTE_LIBRTE_COMMON_DPAAX),y) +DIRS-y += dpaax +endif + include $(RTE_SDK)/mk/rte.subdir.mk diff --git 
a/drivers/common/dpaax/Makefile b/drivers/common/dpaax/Makefile new file mode 100644 index 000000000..94d2cf0ce --- /dev/null +++ b/drivers/common/dpaax/Makefile @@ -0,0 +1,31 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright 2018 NXP +# + +include $(RTE_SDK)/mk/rte.vars.mk + +# +# library name +# +LIB = librte_common_dpaax.a + +CFLAGS += -DALLOW_EXPERIMENTAL_API +CFLAGS += -O3 +CFLAGS += $(WERROR_FLAGS) + +# versioning export map +EXPORT_MAP := rte_common_dpaax_version.map + +# library version +LIBABIVER := 1 + +# +# all source are stored in SRCS-y +# +SRCS-y += dpaax_iova_table.c + +LDLIBS += -lrte_eal + +SYMLINK-y-include += dpaax_iova_table.h + +include $(RTE_SDK)/mk/rte.lib.mk \ No newline at end of file diff --git a/drivers/common/dpaax/dpaax_iova_table.c b/drivers/common/dpaax/dpaax_iova_table.c new file mode 100644 index 000000000..d54267bb7 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.c @@ -0,0 +1,461 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#include <rte_memory.h> + +#include "dpaax_iova_table.h" +#include "dpaax_logs.h" + +/* Global dpaax logger identifier */ +int dpaax_logger; + +/* Global table reference */ +struct dpaax_iova_table *dpaax_iova_table_p; + +static int dpaax_handle_memevents(void); + +/* A structure representing the device-tree node available in /proc/device-tree. + */ +struct reg_node { + phys_addr_t addr; + size_t len; +}; + +/* A ntohll equivalent routine + * XXX: This is only applicable for 64 bit environment. 
+ */ +static void +rotate_8(unsigned char *arr) +{ + uint32_t temp; + uint32_t *first_half; + uint32_t *second_half; + + first_half = (uint32_t *)(arr); + second_half = (uint32_t *)(arr + 4); + + temp = *first_half; + *first_half = *second_half; + *second_half = temp; + + *first_half = ntohl(*first_half); + *second_half = ntohl(*second_half); +} + +/* read_memory_nodes + * Memory layout for DPAAx platforms (LS1043, LS1046, LS1088, LS2088, LX2160) + * are populated by Uboot and available in device tree: + * /proc/device-tree/memory@<address>/reg <= register. + * Entries are of the form: + * (<8 byte start addr><8 byte length>)(..more similar blocks of start,len>).. + * + * @param count + * OUT populate number of entries found in memory node + * @return + * Pointer to array of reg_node elements, count size + */ +static struct reg_node * +read_memory_node(unsigned int *count) +{ + int fd, ret, i; + unsigned int j; + glob_t result = {0}; + struct stat statbuf = {0}; + char file_data[MEM_NODE_FILE_LEN]; + struct reg_node *nodes = NULL; + + *count = 0; + + ret = glob(MEM_NODE_PATH_GLOB, 0, NULL, &result); + if (ret != 0) { + DPAAX_ERR("Unable to glob device-tree memory node: (%s)(%d)", + MEM_NODE_PATH_GLOB, ret); + goto out; + } + + if (result.gl_pathc != 1) { + /* Either more than one memory@<addr> node found, or none. + * In either case, cannot work ahead. + */ + DPAAX_ERR("Found (%zu) entries in device-tree. 
Not supported!", + result.gl_pathc); + goto out; + } + + DPAAX_DEBUG("Opening and parsing device-tree node: (%s)", + result.gl_pathv[0]); + fd = open(result.gl_pathv[0], O_RDONLY); + if (fd < 0) { + DPAAX_ERR("Unable to open the device-tree node: (%s)(fd=%d)", + MEM_NODE_PATH_GLOB, fd); + goto cleanup; + } + + /* Stat to get the file size */ + ret = fstat(fd, &statbuf); + if (ret != 0) { + DPAAX_ERR("Unable to get device-tree memory node size."); + goto cleanup; + } + + DPAAX_DEBUG("Size of device-tree mem node: %lu", statbuf.st_size); + if (statbuf.st_size > MEM_NODE_FILE_LEN) { + DPAAX_WARN("More memory nodes available than assumed."); + DPAAX_WARN("System may not work properly!"); + } + + ret = read(fd, file_data, statbuf.st_size > MEM_NODE_FILE_LEN ? + MEM_NODE_FILE_LEN : statbuf.st_size); + if (ret <= 0) { + DPAAX_ERR("Unable to read device-tree memory node: (%d)", ret); + goto cleanup; + } + + /* The reg node should be multiple of 16 bytes, 8 bytes each for addr + * and len. + */ + *count = (statbuf.st_size / 16); + if ((*count) <= 0 || (statbuf.st_size % 16 != 0)) { + DPAAX_ERR("Invalid memory node values or count. 
(size=%lu)", + statbuf.st_size); + goto cleanup; + } + + /* each entry is of 16 bytes, and size/16 is total count of entries */ + nodes = malloc(sizeof(struct reg_node) * (*count)); + if (!nodes) { + DPAAX_ERR("Failure in allocating working memory."); + goto cleanup; + } + memset(nodes, 0, sizeof(struct reg_node) * (*count)); + + for (i = 0, j = 0; i < (statbuf.st_size) && j < (*count); i += 16, j++) { + memcpy(&nodes[j], file_data + i, 16); + /* Rotate (ntohl) each 8 byte entry */ + rotate_8((unsigned char *)(&(nodes[j].addr))); + rotate_8((unsigned char *)(&(nodes[j].len))); + } + + DPAAX_DEBUG("Device-tree memory node data:"); + do { + DPAAX_DEBUG("\n %08" PRIx64 " %08zu", nodes[j].addr, nodes[j].len); + } while (--j); + +cleanup: + close(fd); + globfree(&result); +out: + return nodes; +} + +int +dpaax_iova_table_populate(void) +{ + int ret; + unsigned int i, node_count; + size_t tot_memory_size, total_table_size; + struct reg_node *nodes; + struct dpaax_iovat_element *entry; + + /* dpaax_iova_table_p is a singleton - only one instance should be + * created. + */ + if (dpaax_iova_table_p) { + DPAAX_DEBUG("Multiple allocation attempt for IOVA Table (%p)", + dpaax_iova_table_p); + /* This can be an error case as well - some path not cleaning + * up table - but, for now, it is assumed that if IOVA Table + * pointer is valid, table is allocated. 
+ */ + return 0; + } + + nodes = read_memory_node(&node_count); + if (nodes == NULL || node_count <= 0) { + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + return -1; + } + + tot_memory_size = 0; + for (i = 0; i < node_count; i++) + tot_memory_size += nodes[i].len; + + DPAAX_DEBUG("Total available PA memory size: %zu", tot_memory_size); + + /* Total table size = meta data + tot_memory_size/8 */ + total_table_size = sizeof(struct dpaax_iova_table) + + (sizeof(struct dpaax_iovat_element) * node_count) + + ((tot_memory_size / DPAAX_MEM_SPLIT) * sizeof(uint64_t)); + + /* TODO: This memory doesn't need to shared but needs to be always + * pinned to RAM (no swap out) - using hugepage rather than malloc + */ + dpaax_iova_table_p = rte_zmalloc(NULL, total_table_size, 0); + if (dpaax_iova_table_p == NULL) { + DPAAX_WARN("Unable to allocate memory for PA->VA Table;"); + DPAAX_WARN("PA->VA translation not available;"); + DPAAX_WARN("Expect performance impact."); + free(nodes); + return -1; + } + + /* Initialize table */ + dpaax_iova_table_p->count = node_count; + entry = dpaax_iova_table_p->entries; + + DPAAX_DEBUG("IOVA Table entries: (entry start = %p)", (void *)entry); + DPAAX_DEBUG("\t(entry),(start),(len),(next)"); + + for (i = 0; i < node_count; i++) { + /* dpaax_iova_table_p + * | dpaax_iova_table_p->entries + * | | + * | | + * V V + * +------+------+-------+---+----------+---------+--- + * |iova_ |entry | entry | | pages | pages | + * |table | 1 | 2 |...| entry 1 | entry2 | + * +-----'+.-----+-------+---+;---------+;--------+--- + * \ \ / / + * `~~~~~~|~~~~~>pages / + * \ / + * `~~~~~~~~~~~>pages + */ + entry[i].start = nodes[i].addr; + entry[i].len = nodes[i].len; + if (i > 0) + entry[i].pages = entry[i-1].pages + + ((entry[i-1].len/DPAAX_MEM_SPLIT)); + else + entry[i].pages = (uint64_t *)((unsigned char *)entry + + (sizeof(struct dpaax_iovat_element) * + node_count)); + + DPAAX_DEBUG("\t(%u),(%8"PRIx64"),(%8zu),(%8p)", 
+ i, entry[i].start, entry[i].len, entry[i].pages); + } + + /* Release memory associated with nodes array - not required now */ + free(nodes); + + DPAAX_DEBUG("Adding mem-event handler\n"); + ret = dpaax_handle_memevents(); + if (ret) { + DPAAX_ERR("Unable to add mem-event handler"); + DPAAX_WARN("Cases with non-buffer pool mem won't work!"); + } + + return 0; +} + +void +dpaax_iova_table_depopulate(void) +{ + if (dpaax_iova_table_p == NULL) + return; + + rte_free(dpaax_iova_table_p->entries); + dpaax_iova_table_p = NULL; + + DPAAX_DEBUG("IOVA Table cleanedup"); +} + +int +dpaax_iova_table_update(phys_addr_t paddr, void *vaddr, size_t length) +{ + int found = 0; + unsigned int i; + size_t req_length = length, e_offset; + struct dpaax_iovat_element *entry; + uintptr_t align_vaddr; + phys_addr_t align_paddr; + + align_paddr = paddr & DPAAX_MEM_SPLIT_MASK; + align_vaddr = ((uintptr_t)vaddr & DPAAX_MEM_SPLIT_MASK); + + /* Check if paddr is available in table */ + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + if (align_paddr < entry[i].start) { + /* Address lower than start, but not found in previous + * iteration shouldn't exist. + */ + DPAAX_ERR("Add: Incorrect entry for PA->VA Table" + "(%"PRIu64")", paddr); + DPAAX_ERR("Add: Lowest address: %"PRIu64"", + entry[i].start); + return -1; + } + + if (align_paddr > (entry[i].start + entry[i].len)) + continue; + + /* align_paddr >= start && align_paddr < (start + len) */ + found = 1; + + do { + e_offset = ((align_paddr - entry[i].start) / DPAAX_MEM_SPLIT); + /* TODO: Whatif something already exists at this + * location - is that an error? For now, ignoring the + * case. 
+ */ + entry[i].pages[e_offset] = align_vaddr; + DPAAX_DEBUG("Added: vaddr=%zu for Phy:%"PRIu64" at %zu" + " remaining len %zu", align_vaddr, + align_paddr, e_offset, req_length); + + /* Incoming request can be larger than the + * DPAAX_MEM_SPLIT size - in which case, multiple + * entries in entry->pages[] are filled up. + */ + if (req_length <= DPAAX_MEM_SPLIT) + break; + align_paddr += DPAAX_MEM_SPLIT; + align_vaddr += DPAAX_MEM_SPLIT; + req_length -= DPAAX_MEM_SPLIT; + } while (1); + + break; + } + + if (!found) { + /* There might be case where the incoming physical address is + * beyond the address discovered in the memory node of + * device-tree. Specially if some malloc'd area is used by EAL + * and the memevent handlers passes that across. But, this is + * not necessarily an error. + */ + DPAAX_DEBUG("Add: Unable to find slot for vaddr:(%p)," + " phy(%"PRIu64")", + vaddr, paddr); + return -1; + } + + DPAAX_DEBUG("Add: Found slot at (%"PRIu64")[(%zu)] for vaddr:(%p)," + " phy(%"PRIu64"), len(%zu)", entry[i].start, e_offset, + vaddr, paddr, length); + return 0; +} + +/* dpaax_iova_table_dump + * Dump the table, with its entries, on screen. Only works in Debug Mode + * Not for weak hearted - the tables can get quite large + */ +void +dpaax_iova_table_dump(void) +{ + unsigned int i, j; + struct dpaax_iovat_element *entry; + + /* In case DEBUG is not enabled, some 'if' conditions might misbehave + * as they have nothing else in them except a DPAAX_DEBUG() which if + * tuned out would leave 'if' naked. 
+ */ + if (rte_log_get_global_level() < RTE_LOG_DEBUG) { + DPAAX_ERR("Set log level to Debug for PA->Table dump!"); + return; + } + + DPAAX_DEBUG(" === Start of PA->VA Translation Table ==="); + if (dpaax_iova_table_p == NULL) + DPAAX_DEBUG("\tNULL"); + + entry = dpaax_iova_table_p->entries; + for (i = 0; i < dpaax_iova_table_p->count; i++) { + DPAAX_DEBUG("\t(%16i),(%16"PRIu64"),(%16zu),(%16p)", + i, entry[i].start, entry[i].len, entry[i].pages); + DPAAX_DEBUG("\t\t (PA), (VA)"); + for (j = 0; j < (entry->len/DPAAX_MEM_SPLIT); j++) { + if (entry[i].pages[j] == 0) + continue; + DPAAX_DEBUG("\t\t(%16"PRIx64"),(%16"PRIx64")", + (entry[i].start + (j * sizeof(uint64_t))), + entry[i].pages[j]); + } + } + DPAAX_DEBUG(" === End of PA->VA Translation Table ==="); +} + +static void +dpaax_memevent_cb(enum rte_mem_event type, const void *addr, size_t len, + void *arg __rte_unused) +{ + struct rte_memseg_list *msl; + struct rte_memseg *ms; + size_t cur_len = 0, map_len = 0; + phys_addr_t phys_addr; + void *virt_addr; + int ret; + + DPAAX_DEBUG("Called with addr=%p, len=%zu", addr, len); + + msl = rte_mem_virt2memseg_list(addr); + + while (cur_len < len) { + const void *va = RTE_PTR_ADD(addr, cur_len); + + ms = rte_mem_virt2memseg(va, msl); + phys_addr = rte_mem_virt2phy(ms->addr); + virt_addr = ms->addr; + map_len = ms->len; + + DPAAX_DEBUG("Request for %s, va=%p, virt_addr=%p," + "iova=%"PRIu64", map_len=%zu", + type == RTE_MEM_EVENT_ALLOC ? + "alloc" : "dealloc", + va, virt_addr, phys_addr, map_len); + + if (type == RTE_MEM_EVENT_ALLOC) + ret = dpaax_iova_table_update(phys_addr, virt_addr, + map_len); + else + /* In case of mem_events for MEM_EVENT_FREE, complete + * hugepage is released and its PA entry is set to 0. + */ + ret = dpaax_iova_table_update(phys_addr, 0, map_len); + + if (ret != 0) { + DPAAX_ERR("PA-Table entry update failed. 
" + "Map=%d, addr=%p, len=%zu, err:(%d)", + type, va, map_len, ret); + return; + } + + cur_len += map_len; + } +} + +static int +dpaax_memevent_walk_memsegs(const struct rte_memseg_list *msl __rte_unused, + const struct rte_memseg *ms, size_t len, + void *arg __rte_unused) +{ + DPAAX_DEBUG("Walking for %p (pa=%"PRIu64") and len %zu", + ms->addr, ms->phys_addr, len); + dpaax_iova_table_update(rte_mem_virt2phy(ms->addr), ms->addr, len); + return 0; +} + +static int +dpaax_handle_memevents(void) +{ + /* First, walk through all memsegs and pin them, before installing + * handler. This assures that all memseg which have already been + * identified/allocated by EAL, are already part of PA->VA Table. This + * is especially for cases where application allocates memory before + * the EAL or this is an externally allocated memory passed to EAL. + */ + rte_memseg_contig_walk_thread_unsafe(dpaax_memevent_walk_memsegs, NULL); + + return rte_mem_event_callback_register("dpaax_memevents_cb", + dpaax_memevent_cb, NULL); +} + +RTE_INIT(dpaax_log) +{ + dpaax_logger = rte_log_register("pmd.common.dpaax"); + if (dpaax_logger >= 0) + rte_log_set_level(dpaax_logger, RTE_LOG_NOTICE); +} diff --git a/drivers/common/dpaax/dpaax_iova_table.h b/drivers/common/dpaax/dpaax_iova_table.h new file mode 100644 index 000000000..3e913ef45 --- /dev/null +++ b/drivers/common/dpaax/dpaax_iova_table.h @@ -0,0 +1,103 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_IOVA_TABLE_H_ +#define _DPAAX_IOVA_TABLE_H_ + +#include <unistd.h> +#include <stdio.h> +#include <string.h> +#include <stdbool.h> +#include <string.h> +#include <stdlib.h> +#include <inttypes.h> +#include <sys/stat.h> +#include <sys/types.h> +#include <dirent.h> +#include <fcntl.h> +#include <glob.h> +#include <errno.h> +#include <arpa/inet.h> + +#include <rte_eal.h> +#include <rte_branch_prediction.h> +#include <rte_memory.h> +#include <rte_malloc.h> + +struct dpaax_iovat_element { + phys_addr_t 
start; /**< Start address of block of physical pages */ + size_t len; /**< Difference of end-start for quick access */ + uint64_t *pages; /**< VA for each physical page in this block */ +}; + +struct dpaax_iova_table { + unsigned int count; /**< No. of blocks of contiguous physical pages */ + struct dpaax_iovat_element entries[0]; +}; + +/* Pointer to the table, which is common for DPAA/DPAA2 and only a single + * instance is required across net/crypto/event drivers. This table is + * populated iff devices are found on the bus. + */ +extern struct dpaax_iova_table *dpaax_iova_table_p; + +/* Device tree file for memory layout is named 'memory@<addr>' where the 'addr' + * is SoC dependent, or even Uboot fixup dependent. + */ +#define MEM_NODE_PATH_GLOB "/proc/device-tree/memory[@0-9]*/reg" +/* Device file should be multiple of 16 bytes, each containing 8 byte of addr + * and its length. Assuming max of 5 entries. + */ +#define MEM_NODE_FILE_LEN ((16 * 5) + 1) + +/* Table is made up of DPAAX_MEM_SPLIT elements for each contiguous zone. This + * helps avoid separate handling for cases where more than one size of hugepage + * is supported. 
+ */ +#define DPAAX_MEM_SPLIT (1<<21) +#define DPAAX_MEM_SPLIT_MASK ~(DPAAX_MEM_SPLIT - 1) /**< Floor aligned */ +#define DPAAX_MEM_SPLIT_MASK_OFF (DPAAX_MEM_SPLIT - 1) /**< Offset */ + +/* APIs exposed */ +int dpaax_iova_table_populate(void); +void dpaax_iova_table_depopulate(void); +int dpaax_iova_table_update(phys_addr_t paddr, void *vaddr, size_t length); +void dpaax_iova_table_dump(void); + +static inline void *dpaax_iova_table_get_va(phys_addr_t paddr) __attribute__((hot)); + +static inline void * +dpaax_iova_table_get_va(phys_addr_t paddr) { + unsigned int i = 0, index; + void *vaddr = 0; + phys_addr_t paddr_align = paddr & DPAAX_MEM_SPLIT_MASK; + size_t offset = paddr & DPAAX_MEM_SPLIT_MASK_OFF; + struct dpaax_iovat_element *entry; + + entry = dpaax_iova_table_p->entries; + + do { + if (unlikely(i > dpaax_iova_table_p->count)) + break; + + if (paddr_align < entry[i].start) { + /* Incorrect paddr; Not in memory range */ + return NULL; + } + + if (paddr_align > (entry[i].start + entry[i].len)) { + i++; + continue; + } + + /* paddr > entry->start && paddr <= entry->(start+len) */ + index = (paddr_align - entry[i].start)/DPAAX_MEM_SPLIT; + vaddr = (void *)((uintptr_t)entry[i].pages[index] + offset); + break; + } while (1); + + return vaddr; +} + +#endif /* _DPAAX_IOVA_TABLE_H_ */ diff --git a/drivers/common/dpaax/dpaax_logs.h b/drivers/common/dpaax/dpaax_logs.h new file mode 100644 index 000000000..bf1b27cc1 --- /dev/null +++ b/drivers/common/dpaax/dpaax_logs.h @@ -0,0 +1,39 @@ +/* SPDX-License-Identifier: BSD-3-Clause + * Copyright 2018 NXP + */ + +#ifndef _DPAAX_LOGS_H_ +#define _DPAAX_LOGS_H_ + +#include <rte_log.h> + +extern int dpaax_logger; + +#define DPAAX_LOG(level, fmt, args...) \ + rte_log(RTE_LOG_ ## level, dpaax_logger, "dpaax: " fmt "\n", \ + ##args) + +/* Debug logs are with Function names */ +#define DPAAX_DEBUG(fmt, args...) 
\ + rte_log(RTE_LOG_DEBUG, dpaax_logger, "dpaax: %s(): " fmt "\n", \ + __func__, ##args) + +#define DPAAX_INFO(fmt, args...) \ + DPAAX_LOG(INFO, fmt, ## args) +#define DPAAX_ERR(fmt, args...) \ + DPAAX_LOG(ERR, fmt, ## args) +#define DPAAX_WARN(fmt, args...) \ + DPAAX_LOG(WARNING, fmt, ## args) + +/* DP Logs, toggled out at compile time if level lower than current level */ +#define DPAAX_DP_LOG(level, fmt, args...) \ + RTE_LOG_DP(level, PMD, fmt, ## args) + +#define DPAAX_DP_DEBUG(fmt, args...) \ + DPAAX_DP_LOG(DEBUG, fmt, ## args) +#define DPAAX_DP_INFO(fmt, args...) \ + DPAAX_DP_LOG(INFO, fmt, ## args) +#define DPAAX_DP_WARN(fmt, args...) \ + DPAAX_DP_LOG(WARNING, fmt, ## args) + +#endif /* _DPAAX_LOGS_H_ */ diff --git a/drivers/common/dpaax/meson.build b/drivers/common/dpaax/meson.build new file mode 100644 index 000000000..98a1bdd48 --- /dev/null +++ b/drivers/common/dpaax/meson.build @@ -0,0 +1,12 @@ +# SPDX-License-Identifier: BSD-3-Clause +# Copyright(c) 2018 NXP + +allow_experimental_apis = true + +if host_machine.system() != 'linux' + build = false +endif + +sources = files('dpaax_iova_table.c') + +cflags += ['-D_GNU_SOURCE'] diff --git a/drivers/common/dpaax/rte_common_dpaax_version.map b/drivers/common/dpaax/rte_common_dpaax_version.map new file mode 100644 index 000000000..8131c9e30 --- /dev/null +++ b/drivers/common/dpaax/rte_common_dpaax_version.map @@ -0,0 +1,11 @@ +DPDK_18.11 { + global: + + dpaax_iova_table_update; + dpaax_iova_table_depopulate; + dpaax_iova_table_dump; + dpaax_iova_table_p; + dpaax_iova_table_populate; + + local: *; +}; diff --git a/drivers/common/meson.build b/drivers/common/meson.build index f828ce7f7..0257d4d2b 100644 --- a/drivers/common/meson.build +++ b/drivers/common/meson.build @@ -2,6 +2,6 @@ # Copyright(c) 2018 Cavium, Inc std_deps = ['eal'] -drivers = ['mvep', 'octeontx', 'qat'] +drivers = ['dpaax', 'mvep', 'octeontx', 'qat'] config_flag_fmt = 'RTE_LIBRTE_@0@_COMMON' driver_name_fmt = 'rte_common_@0@' -- 2.17.1 ^ 
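A minimal, standalone sketch of the table described in this patch: each block of physical memory is split into 2 MB chunks, and one VA is stored per chunk. All sketch_* names are hypothetical and are not part of the dpaax library; only the 2 MB split and the mask arithmetic follow DPAAX_MEM_SPLIT and its masks. The sketch assumes 2 MB aligned block starts, blocks sorted by start address, an exclusive end of block, and a return of 0 for unmapped chunks. Those are assumptions about the intended behaviour, not a description of what the posted code does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical names; only the 2 MB split mirrors the patch. */
#define SKETCH_SPLIT      ((uint64_t)1 << 21)
#define SKETCH_SPLIT_MASK (~(SKETCH_SPLIT - 1)) /* floor-align to 2 MB */
#define SKETCH_SPLIT_OFF  (SKETCH_SPLIT - 1)    /* offset inside a chunk */

struct sketch_block {
	uint64_t start;  /* first physical address, assumed 2 MB aligned */
	uint64_t len;    /* length in bytes, a multiple of 2 MB */
	uint64_t *pages; /* one VA per 2 MB chunk; 0 means "not mapped" */
};

/* Set up one block covering [start, start + len); returns 0 on success. */
static int
sketch_block_init(struct sketch_block *b, uint64_t start, uint64_t len)
{
	assert(b != NULL && len != 0 && (len & SKETCH_SPLIT_OFF) == 0);
	b->pages = calloc((size_t)(len / SKETCH_SPLIT), sizeof(*b->pages));
	if (b->pages == NULL)
		return -1;
	b->start = start;
	b->len = len;
	return 0;
}

/* Record the VA of every chunk in [pa, pa + len), or clear it if va == 0. */
static int
sketch_update(struct sketch_block *blocks, unsigned int count,
	      uint64_t pa, uint64_t va, uint64_t len)
{
	unsigned int i;
	uint64_t k, idx, chunks;

	if ((pa & SKETCH_SPLIT_OFF) != 0 || (len & SKETCH_SPLIT_OFF) != 0)
		return -1; /* the sketch only accepts 2 MB aligned ranges */

	for (i = 0; i < count; i++) {
		if (pa < blocks[i].start ||
		    pa - blocks[i].start >= blocks[i].len)
			continue;
		if (len > blocks[i].len - (pa - blocks[i].start))
			return -1; /* range runs past the end of the block */
		idx = (pa - blocks[i].start) / SKETCH_SPLIT;
		chunks = len / SKETCH_SPLIT;
		for (k = 0; k < chunks; k++)
			blocks[i].pages[idx + k] =
				va ? va + k * SKETCH_SPLIT : 0;
		return 0;
	}
	return -1; /* pa is not inside any known block */
}

/* Translate pa to a VA; returns 0 when no mapping is recorded. */
static uint64_t
sketch_get_va(const struct sketch_block *blocks, unsigned int count,
	      uint64_t pa)
{
	uint64_t aligned = pa & SKETCH_SPLIT_MASK;
	uint64_t offset = pa & SKETCH_SPLIT_OFF;
	unsigned int i;

	for (i = 0; i < count; i++) {
		uint64_t va;

		if (aligned < blocks[i].start)
			return 0; /* blocks are sorted, so pa is in a hole */
		if (aligned - blocks[i].start >= blocks[i].len)
			continue; /* beyond this block, try the next one */
		va = blocks[i].pages[(aligned - blocks[i].start) /
				     SKETCH_SPLIT];
		return va ? va + offset : 0;
	}
	return 0;
}
```

The cost of a lookup is one pass over the (few) blocks plus one array index, which is the point of the design: no memseg walk is needed on the data path. Passing va == 0 to the update call models the dealloc event, where the chunk entries are reset.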
* Re: [dpdk-dev] [PATCH v5 3/5] common/dpaax: add library for PA VA translation table
From: Thomas Monjalon @ 2018-10-15 23:17 UTC
To: Shreyansh Jain; +Cc: dev, ferruh.yigit, anatoly.burakov, pbhagavatula

15/10/2018 14:01, Shreyansh Jain:
> A common library, valid for dpaaX drivers, which is used to maintain
> a local copy of PA->VA translations.
>
> In case of physical addressing mode (one of the options for FSLMC, and
> the only option for DPAA bus), the addresses of descriptors Rx'd are
> physical. These need to be converted into equivalent VA for rte_mbuf
> and other similar calls.
>
> Using the rte_mem_virt2iova or rte_mem_virt2phy is expensive. This
> library is an attempt to reduce the overall cost associated with
> this translation.
>
> A small table is maintained, containing continuous entries
> representing a contiguous physical range. Each of these entries
> stores the equivalent VA, which is fed during mempool creation, or
> memory allocation/deallocation callbacks.
>
> Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com>
> ---
>  config/common_base                            |   5 +
>  config/common_linuxapp                        |   5 +
>  drivers/common/Makefile                       |   4 +
>  drivers/common/dpaax/Makefile                 |  31 ++
>  drivers/common/dpaax/dpaax_iova_table.c       | 461 ++++++++++++++++++
>  drivers/common/dpaax/dpaax_iova_table.h       | 103 ++++
>  drivers/common/dpaax/dpaax_logs.h             |  39 ++
>  drivers/common/dpaax/meson.build              |  12 +
>  .../common/dpaax/rte_common_dpaax_version.map |  11 +
>  drivers/common/meson.build                    |   2 +-
>  10 files changed, 672 insertions(+), 1 deletion(-)

I will add this change when applying:

 NXP buses
 M: Hemant Agrawal <hemant.agrawal@nxp.com>
 M: Shreyansh Jain <shreyansh.jain@nxp.com>
+F: drivers/common/dpaax/
 F: drivers/bus/dpaa/
 F: drivers/bus/fslmc/
* [dpdk-dev] [PATCH v5 4/5] dpaa: enable dpaax library 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 0/5] Shreyansh Jain ` (2 preceding siblings ...) 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 3/5] common/dpaax: add library for PA VA translation table Shreyansh Jain @ 2018-10-15 12:01 ` Shreyansh Jain 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 5/5] fslmc: " Shreyansh Jain 2018-10-16 10:18 ` [dpdk-dev] [PATCH v5 0/5] Thomas Monjalon 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-10-15 12:01 UTC (permalink / raw) To: thomas; +Cc: ferruh.yigit, anatoly.burakov, pbhagavatula, dev, Shreyansh Jain With this patch, dpaa bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa, event/dpaa and net/dpaa as they are dependent on the bus/dpaa and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/dpaa/Makefile | 1 + drivers/bus/dpaa/dpaa_bus.c | 4 ++++ drivers/bus/dpaa/meson.build | 2 +- drivers/bus/dpaa/rte_dpaa_bus.h | 6 ++++++ drivers/crypto/dpaa_sec/Makefile | 1 + drivers/crypto/dpaa_sec/dpaa_sec.c | 6 ++++++ drivers/event/dpaa/Makefile | 1 + drivers/mempool/dpaa/Makefile | 1 + drivers/mempool/dpaa/dpaa_mempool.c | 4 ++++ drivers/mempool/dpaa/dpaa_mempool.h | 4 +--- drivers/net/dpaa/Makefile | 1 + mk/rte.app.mk | 4 ++++ 12 files changed, 31 insertions(+), 4 deletions(-) diff --git a/drivers/bus/dpaa/Makefile b/drivers/bus/dpaa/Makefile index 9337b5f92..381a5c659 100644 --- a/drivers/bus/dpaa/Makefile +++ b/drivers/bus/dpaa/Makefile @@ -48,5 +48,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_BUS) += \ LDLIBS += -lpthread LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/bus/dpaa/dpaa_bus.c b/drivers/bus/dpaa/dpaa_bus.c index 138e0f98d..381c3b17c 100644 --- a/drivers/bus/dpaa/dpaa_bus.c +++ b/drivers/bus/dpaa/dpaa_bus.c @@ -34,6 +34,7 @@ 
#include <rte_dpaa_bus.h> #include <rte_dpaa_logs.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -548,6 +549,9 @@ rte_dpaa_bus_probe(void) fclose(svr_file); } + /* And initialize the PA->VA translation table */ + dpaax_iova_table_populate(); + /* For each registered driver, and device, call the driver->probe */ TAILQ_FOREACH(dev, &rte_dpaa_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_dpaa_bus.driver_list, next) { diff --git a/drivers/bus/dpaa/meson.build b/drivers/bus/dpaa/meson.build index 5e7705571..11a3c9499 100644 --- a/drivers/bus/dpaa/meson.build +++ b/drivers/bus/dpaa/meson.build @@ -7,7 +7,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev'] +deps += ['common_dpaax', 'eventdev'] sources = files('base/fman/fman.c', 'base/fman/fman_hw.c', 'base/fman/netcfg_layer.c', diff --git a/drivers/bus/dpaa/rte_dpaa_bus.h b/drivers/bus/dpaa/rte_dpaa_bus.h index 15dc6a4ac..1d580a000 100644 --- a/drivers/bus/dpaa/rte_dpaa_bus.h +++ b/drivers/bus/dpaa/rte_dpaa_bus.h @@ -8,6 +8,7 @@ #include <rte_bus.h> #include <rte_mempool.h> +#include <dpaax_iova_table.h> #include <fsl_usd.h> #include <fsl_qman.h> @@ -110,6 +111,11 @@ extern struct dpaa_memseg_list rte_dpaa_memsegs; static inline void *rte_dpaa_mem_ptov(phys_addr_t paddr) { struct dpaa_memseg *ms; + void *va; + + va = dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* Check if the address is already part of the memseg list internally * maintained by the dpaa driver. 
diff --git a/drivers/crypto/dpaa_sec/Makefile b/drivers/crypto/dpaa_sec/Makefile index 9be447041..674a7a398 100644 --- a/drivers/crypto/dpaa_sec/Makefile +++ b/drivers/crypto/dpaa_sec/Makefile @@ -38,5 +38,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_DPAA_SEC) += dpaa_sec.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/crypto/dpaa_sec/dpaa_sec.c b/drivers/crypto/dpaa_sec/dpaa_sec.c index 7c0459f9f..54f1913f2 100644 --- a/drivers/crypto/dpaa_sec/dpaa_sec.c +++ b/drivers/crypto/dpaa_sec/dpaa_sec.c @@ -107,6 +107,12 @@ dpaa_mem_vtop(void *vaddr) static inline void * dpaa_mem_ptov(rte_iova_t paddr) { + void *va; + + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va)) + return va; + return rte_mem_iova2virt(paddr); } diff --git a/drivers/event/dpaa/Makefile b/drivers/event/dpaa/Makefile index ddd855227..6f93e7f40 100644 --- a/drivers/event/dpaa/Makefile +++ b/drivers/event/dpaa/Makefile @@ -34,5 +34,6 @@ LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs LDLIBS += -lrte_eventdev -lrte_pmd_dpaa -lrte_bus_vdev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/Makefile b/drivers/mempool/dpaa/Makefile index da8da1e90..9cf36856c 100644 --- a/drivers/mempool/dpaa/Makefile +++ b/drivers/mempool/dpaa/Makefile @@ -31,5 +31,6 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA_MEMPOOL) += dpaa_mempool.c LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/mempool/dpaa/dpaa_mempool.c b/drivers/mempool/dpaa/dpaa_mempool.c index 1c121223b..b05fb7b9d 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.c +++ b/drivers/mempool/dpaa/dpaa_mempool.c @@ -26,6 +26,7 @@ #include <rte_ring.h> #include <dpaa_mempool.h> +#include <dpaax_iova_table.h> /* List of all the memseg information locally 
maintained in dpaa driver. This * is to optimize the PA_to_VA searches until a better mechanism (algo) is @@ -285,6 +286,9 @@ dpaa_populate(struct rte_mempool *mp, unsigned int max_objs, return 0; } + /* Update the PA-VA Table */ + dpaax_iova_table_update(paddr, vaddr, len); + bp_info = DPAA_MEMPOOL_TO_POOL_INFO(mp); total_elt_sz = mp->header_size + mp->elt_size + mp->trailer_size; diff --git a/drivers/mempool/dpaa/dpaa_mempool.h b/drivers/mempool/dpaa/dpaa_mempool.h index 092f326cb..533e1c6e2 100644 --- a/drivers/mempool/dpaa/dpaa_mempool.h +++ b/drivers/mempool/dpaa/dpaa_mempool.h @@ -43,10 +43,8 @@ struct dpaa_bp_info { }; static inline void * -DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info, uint64_t addr) +DPAA_MEMPOOL_PTOV(struct dpaa_bp_info *bp_info __rte_unused, uint64_t addr) { - if (bp_info->ptov_off) - return ((void *) (size_t)(addr + bp_info->ptov_off)); return rte_dpaa_mem_ptov(addr); } diff --git a/drivers/net/dpaa/Makefile b/drivers/net/dpaa/Makefile index d7a0a50c5..1c4f7d914 100644 --- a/drivers/net/dpaa/Makefile +++ b/drivers/net/dpaa/Makefile @@ -38,6 +38,7 @@ LDLIBS += -lrte_bus_dpaa LDLIBS += -lrte_mempool_dpaa LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax # install this header file SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA_PMD)-include := rte_pmd_dpaa.h diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 3ece996e8..4c70a408a 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -104,6 +104,10 @@ ifneq (,$(findstring y,$(MVEP-y))) _LDLIBS-y += -lrte_common_mvep -L$(LIBMUSDK_PATH)/lib -lmusdk endif +ifeq ($(CONFIG_RTE_LIBRTE_DPAA_BUS),y) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax +endif + _LDLIBS-$(CONFIG_RTE_LIBRTE_PCI_BUS) += -lrte_bus_pci _LDLIBS-$(CONFIG_RTE_LIBRTE_VDEV_BUS) += -lrte_bus_vdev _LDLIBS-$(CONFIG_RTE_LIBRTE_DPAA_BUS) += -lrte_bus_dpaa -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
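The change to rte_dpaa_mem_ptov() and dpaa_mem_ptov() in this patch follows a "fast table first, slower search second" pattern. The standalone sketch below illustrates only that control flow. Every sketch_* name is hypothetical: sketch_table_stub() stands in for the table lookup and the segment array stands in for the driver's internally maintained memseg list; neither is DPDK code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical segment record, standing in for a driver's memseg list. */
struct sketch_seg {
	uint64_t iova; /* physical/IO start address of the segment */
	uint64_t len;  /* segment length in bytes */
	void *vaddr;   /* virtual address matching iova */
};

/* Signature of the fast lookup; returns NULL when it has no mapping. */
typedef void *(*sketch_fast_fn)(uint64_t pa);

/* Stand-in for the table lookup: it knows exactly one small window. */
#define SKETCH_WINDOW_PA 0x1000u
static char sketch_window[64];

static void *
sketch_table_stub(uint64_t pa)
{
	if (pa >= SKETCH_WINDOW_PA &&
	    pa - SKETCH_WINDOW_PA < sizeof(sketch_window))
		return &sketch_window[pa - SKETCH_WINDOW_PA];
	return NULL;
}

/* Slow path: linear scan over every known segment. */
static void *
sketch_slow_ptov(const struct sketch_seg *segs, size_t n, uint64_t pa)
{
	size_t i;

	assert(segs != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (pa >= segs[i].iova && pa - segs[i].iova < segs[i].len)
			return (char *)segs[i].vaddr + (pa - segs[i].iova);
	}
	return NULL;
}

/* Try the fast lookup first; fall back to the scan only on a miss. */
static void *
sketch_ptov(sketch_fast_fn fast, const struct sketch_seg *segs, size_t n,
	    uint64_t pa)
{
	void *va = (fast != NULL) ? fast(pa) : NULL;

	if (va != NULL)
		return va;
	return sketch_slow_ptov(segs, n, pa);
}
```

Keeping the fallback means a miss in the table costs extra time but does not change the result, which is presumably why the patch can enable the table without removing the older search.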
* [dpdk-dev] [PATCH v5 5/5] fslmc: enable dpaax library 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 0/5] Shreyansh Jain ` (3 preceding siblings ...) 2018-10-15 12:01 ` [dpdk-dev] [PATCH v5 4/5] dpaa: enable dpaax library Shreyansh Jain @ 2018-10-15 12:01 ` Shreyansh Jain 2018-10-16 10:18 ` [dpdk-dev] [PATCH v5 0/5] Thomas Monjalon 5 siblings, 0 replies; 53+ messages in thread From: Shreyansh Jain @ 2018-10-15 12:01 UTC (permalink / raw) To: thomas; +Cc: ferruh.yigit, anatoly.burakov, pbhagavatula, dev, Shreyansh Jain With this patch, fslmc bus and ethernet devices on this bus would start using the physical-virtual library interfaces. This patch impacts mempool/dpaa2, event/dpaa2, net/dpaa2, raw/dpaa2_cmdif and raw/dpaa2_qdma as they are dependent on the bus/fslmc and thus impact linkage of libraries. Signed-off-by: Shreyansh Jain <shreyansh.jain@nxp.com> --- drivers/bus/fslmc/Makefile | 1 + drivers/bus/fslmc/fslmc_bus.c | 20 ++++++++++++++++ drivers/bus/fslmc/meson.build | 2 +- drivers/bus/fslmc/portal/dpaa2_hw_pvt.h | 21 ++++++++--------- drivers/crypto/dpaa2_sec/Makefile | 1 + drivers/event/dpaa2/Makefile | 1 + drivers/mempool/dpaa2/Makefile | 1 + drivers/mempool/dpaa2/dpaa2_hw_mempool.c | 29 ++++-------------------- drivers/net/dpaa2/Makefile | 1 + drivers/raw/dpaa2_cmdif/Makefile | 1 + drivers/raw/dpaa2_qdma/Makefile | 1 + mk/rte.app.mk | 3 +++ 12 files changed, 45 insertions(+), 37 deletions(-) diff --git a/drivers/bus/fslmc/Makefile b/drivers/bus/fslmc/Makefile index e95551980..218d9bd28 100644 --- a/drivers/bus/fslmc/Makefile +++ b/drivers/bus/fslmc/Makefile @@ -19,6 +19,7 @@ CFLAGS += -I$(RTE_SDK)/drivers/bus/fslmc/qbman/include CFLAGS += -I$(RTE_SDK)/lib/librte_eal/common LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev +LDLIBS += -lrte_common_dpaax # versioning export map EXPORT_MAP := rte_bus_fslmc_version.map diff --git a/drivers/bus/fslmc/fslmc_bus.c b/drivers/bus/fslmc/fslmc_bus.c index 2bc9457bc..5ba5ce96b 100644 --- 
a/drivers/bus/fslmc/fslmc_bus.c +++ b/drivers/bus/fslmc/fslmc_bus.c @@ -20,6 +20,8 @@ #include <fslmc_vfio.h> #include "fslmc_logs.h" +#include <dpaax_iova_table.h> + int dpaa2_logtype_bus; #define VFIO_IOMMU_GROUP_PATH "/sys/kernel/iommu_groups" @@ -377,6 +379,19 @@ rte_fslmc_probe(void) probe_all = rte_fslmc_bus.bus.conf.scan_mode != RTE_BUS_SCAN_WHITELIST; + /* In case of PA, the FD addresses returned by qbman APIs are physical + * addresses, which need conversion into equivalent VA address for + * rte_mbuf. For that, a table (a serial array, in memory) is used to + * increase translation efficiency. + * This has to be done before probe as some device initialization + * (during) probe allocate memory (dpaa2_sec) which needs to be pinned + * to this table. + */ + ret = dpaax_iova_table_populate(); + if (ret) { + DPAA2_BUS_WARN("PA->VA Translation table not available;"); + } + TAILQ_FOREACH(dev, &rte_fslmc_bus.device_list, next) { TAILQ_FOREACH(drv, &rte_fslmc_bus.driver_list, next) { ret = rte_fslmc_match(drv, dev); @@ -456,6 +471,11 @@ rte_fslmc_driver_unregister(struct rte_dpaa2_driver *driver) fslmc_bus = driver->fslmc_bus; + /* Cleanup the PA->VA Translation table; From whereever this function + * is called from. 
+ */ + dpaax_iova_table_depopulate(); + TAILQ_REMOVE(&fslmc_bus->driver_list, driver, next); /* Update Bus references */ driver->fslmc_bus = NULL; diff --git a/drivers/bus/fslmc/meson.build b/drivers/bus/fslmc/meson.build index 54ca92d0c..18c45495b 100644 --- a/drivers/bus/fslmc/meson.build +++ b/drivers/bus/fslmc/meson.build @@ -7,7 +7,7 @@ if host_machine.system() != 'linux' build = false endif -deps += ['eventdev', 'kvargs'] +deps += ['common_dpaax', 'eventdev', 'kvargs'] sources = files('fslmc_bus.c', 'fslmc_vfio.c', 'mc/dpbp.c', diff --git a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h index 820759360..678ee34b8 100644 --- a/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h +++ b/drivers/bus/fslmc/portal/dpaa2_hw_pvt.h @@ -9,6 +9,7 @@ #define _DPAA2_HW_PVT_H_ #include <rte_eventdev.h> +#include <dpaax_iova_table.h> #include <mc/fsl_mc_sys.h> #include <fsl_qbman_portal.h> @@ -275,28 +276,26 @@ extern struct dpaa2_memseg_list rte_dpaa2_memsegs; #ifdef RTE_LIBRTE_DPAA2_USE_PHYS_IOVA extern uint8_t dpaa2_virt_mode; static void *dpaa2_mem_ptov(phys_addr_t paddr) __attribute__((unused)); -/* todo - this is costly, need to write a fast coversion routine */ + static void *dpaa2_mem_ptov(phys_addr_t paddr) { - struct dpaa2_memseg *ms; + void *va; if (dpaa2_virt_mode) return (void *)(size_t)paddr; - /* Check if the address is already part of the memseg list internally - * maintained by the dpaa2 driver. 
- */ - TAILQ_FOREACH(ms, &rte_dpaa2_memsegs, next) { - if (paddr >= ms->iova && paddr < - ms->iova + ms->len) - return RTE_PTR_ADD(ms->vaddr, (uintptr_t)(paddr - ms->iova)); - } + va = (void *)dpaax_iova_table_get_va(paddr); + if (likely(va != NULL)) + return va; /* If not, Fallback to full memseg list searching */ - return rte_mem_iova2virt(paddr); + va = rte_mem_iova2virt(paddr); + + return va; } static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) __attribute__((unused)); + static phys_addr_t dpaa2_mem_vtop(uint64_t vaddr) { const struct rte_memseg *memseg; diff --git a/drivers/crypto/dpaa2_sec/Makefile b/drivers/crypto/dpaa2_sec/Makefile index da3d8f84f..1f951a14b 100644 --- a/drivers/crypto/dpaa2_sec/Makefile +++ b/drivers/crypto/dpaa2_sec/Makefile @@ -51,5 +51,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_cryptodev +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/event/dpaa2/Makefile b/drivers/event/dpaa2/Makefile index 5e1a63200..7a71161de 100644 --- a/drivers/event/dpaa2/Makefile +++ b/drivers/event/dpaa2/Makefile @@ -21,6 +21,7 @@ CFLAGS += -I$(RTE_SDK)/lib/librte_eal/linuxapp/eal LDLIBS += -lrte_eal -lrte_eventdev LDLIBS += -lrte_bus_fslmc -lrte_mempool_dpaa2 -lrte_pmd_dpaa2 LDLIBS += -lrte_bus_vdev +LDLIBS += -lrte_common_dpaax CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2 CFLAGS += -I$(RTE_SDK)/drivers/net/dpaa2/mc diff --git a/drivers/mempool/dpaa2/Makefile b/drivers/mempool/dpaa2/Makefile index 9e4c87d79..0fc69c3bf 100644 --- a/drivers/mempool/dpaa2/Makefile +++ b/drivers/mempool/dpaa2/Makefile @@ -30,6 +30,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL) += dpaa2_hw_mempool.c LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_eal -lrte_mempool -lrte_ring +LDLIBS += -lrte_common_dpaax SYMLINK-$(CONFIG_RTE_LIBRTE_DPAA2_MEMPOOL)-include := rte_dpaa2_mempool.h diff --git a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c 
index 84ff12811..c5f60c5c6 100644 --- a/drivers/mempool/dpaa2/dpaa2_hw_mempool.c +++ b/drivers/mempool/dpaa2/dpaa2_hw_mempool.c @@ -30,6 +30,8 @@ #include "dpaa2_hw_mempool.h" #include "dpaa2_hw_mempool_logs.h" +#include <dpaax_iova_table.h> + struct dpaa2_bp_info rte_dpaa2_bpid_info[MAX_BPID]; static struct dpaa2_bp_list *h_bp_list; @@ -393,31 +395,8 @@ dpaa2_populate(struct rte_mempool *mp, unsigned int max_objs, void *vaddr, rte_iova_t paddr, size_t len, rte_mempool_populate_obj_cb_t *obj_cb, void *obj_cb_arg) { - struct dpaa2_memseg *ms; - - /* For each memory chunk pinned to the Mempool, a linked list of the - * contained memsegs is created for searching when PA to VA - * conversion is required. - */ - ms = rte_zmalloc(NULL, sizeof(struct dpaa2_memseg), 0); - if (!ms) { - DPAA2_MEMPOOL_ERR("Unable to allocate internal memory."); - DPAA2_MEMPOOL_WARN("Fast Physical to Virtual Addr translation would not be available."); - /* If the element is not added, it would only lead to failure - * in searching for the element and the logic would Fallback - * to traditional DPDK memseg traversal code. So, this is not - * a blocking error - but, error would be printed on screen. - */ - return 0; - } - - ms->vaddr = vaddr; - ms->iova = paddr; - ms->len = len; - /* Head insertions are generally faster than tail insertions as the - * buffers pinned are picked from rear end. 
- */ - TAILQ_INSERT_HEAD(&rte_dpaa2_memsegs, ms, next); + /* Insert entry into the PA->VA Table */ + dpaax_iova_table_update(paddr, vaddr, len); return rte_mempool_op_populate_default(mp, max_objs, vaddr, paddr, len, obj_cb, obj_cb_arg); diff --git a/drivers/net/dpaa2/Makefile b/drivers/net/dpaa2/Makefile index 9b0b14331..52649a945 100644 --- a/drivers/net/dpaa2/Makefile +++ b/drivers/net/dpaa2/Makefile @@ -40,5 +40,6 @@ LDLIBS += -lrte_bus_fslmc LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs +LDLIBS += -lrte_common_dpaax include $(RTE_SDK)/mk/rte.lib.mk diff --git a/drivers/raw/dpaa2_cmdif/Makefile b/drivers/raw/dpaa2_cmdif/Makefile index 9b863dda2..3c56c4b44 100644 --- a/drivers/raw/dpaa2_cmdif/Makefile +++ b/drivers/raw/dpaa2_cmdif/Makefile @@ -21,6 +21,7 @@ LDLIBS += -lrte_eal LDLIBS += -lrte_kvargs LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_cmdif_version.map diff --git a/drivers/raw/dpaa2_qdma/Makefile b/drivers/raw/dpaa2_qdma/Makefile index d88809ead..2f79a3f41 100644 --- a/drivers/raw/dpaa2_qdma/Makefile +++ b/drivers/raw/dpaa2_qdma/Makefile @@ -22,6 +22,7 @@ LDLIBS += -lrte_mempool LDLIBS += -lrte_mempool_dpaa2 LDLIBS += -lrte_rawdev LDLIBS += -lrte_ring +LDLIBS += -lrte_common_dpaax EXPORT_MAP := rte_pmd_dpaa2_qdma_version.map diff --git a/mk/rte.app.mk b/mk/rte.app.mk index 4c70a408a..06a457d62 100644 --- a/mk/rte.app.mk +++ b/mk/rte.app.mk @@ -107,6 +107,9 @@ endif ifeq ($(CONFIG_RTE_LIBRTE_DPAA_BUS),y) _LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax endif +ifeq ($(CONFIG_RTE_LIBRTE_FSLMC_BUS),y) +_LDLIBS-$(CONFIG_RTE_LIBRTE_COMMON_DPAAX) += -lrte_common_dpaax +endif _LDLIBS-$(CONFIG_RTE_LIBRTE_PCI_BUS) += -lrte_bus_pci _LDLIBS-$(CONFIG_RTE_LIBRTE_VDEV_BUS) += -lrte_bus_vdev -- 2.17.1 ^ permalink raw reply [flat|nested] 53+ messages in thread
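In this patch the table is populated once in rte_fslmc_probe() and depopulated from rte_fslmc_driver_unregister(), which may run once per registered driver. The standalone sketch below shows one way such a populate/depopulate pair can be made safe to call repeatedly. It is an illustration under assumptions: all sketch_* names are hypothetical, the range list is passed in rather than read from the device tree, and whether the posted dpaax functions guard against repeated calls in this way is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_SPLIT ((uint64_t)1 << 21) /* 2 MB chunk, as in the patch */

/* Hypothetical input: one physical memory range of the SoC. */
struct sketch_range {
	uint64_t start;
	uint64_t len;
};

struct sketch_entry {
	uint64_t start;
	uint64_t len;
	uint64_t *pages; /* one VA slot per 2 MB chunk */
};

struct sketch_table {
	unsigned int count;
	struct sketch_entry entries[]; /* C99 flexible array member */
};

static struct sketch_table *sketch_table_p; /* NULL until populated */

/* Allocate the table once; later calls are no-ops returning success. */
static int
sketch_populate(const struct sketch_range *ranges, unsigned int count)
{
	struct sketch_table *t;
	unsigned int i;

	assert(ranges != NULL || count == 0);
	if (sketch_table_p != NULL)
		return 0; /* already populated, e.g. by another bus */

	t = calloc(1, sizeof(*t) + count * sizeof(t->entries[0]));
	if (t == NULL)
		return -1;

	for (i = 0; i < count; i++) {
		uint64_t chunks = ranges[i].len / SKETCH_SPLIT;

		t->entries[i].start = ranges[i].start;
		t->entries[i].len = ranges[i].len;
		t->entries[i].pages = chunks ?
			calloc((size_t)chunks, sizeof(uint64_t)) : NULL;
		if (t->entries[i].pages == NULL) {
			/* Undo the partial allocation before failing. */
			while (i-- > 0)
				free(t->entries[i].pages);
			free(t);
			return -1;
		}
	}
	t->count = count;
	sketch_table_p = t;
	return 0;
}

/* Free the table; safe to call when it was never (or already) freed. */
static void
sketch_depopulate(void)
{
	unsigned int i;

	if (sketch_table_p == NULL)
		return;
	for (i = 0; i < sketch_table_p->count; i++)
		free(sketch_table_p->entries[i].pages);
	free(sketch_table_p);
	sketch_table_p = NULL;
}
```

Resetting the global pointer to NULL after freeing is what makes the second depopulate call harmless; lookups that check the pointer can then fall back to the slower path.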
* Re: [dpdk-dev] [PATCH v5 0/5]
From: Thomas Monjalon @ 2018-10-16 10:18 UTC
To: Shreyansh Jain; +Cc: dev, ferruh.yigit, anatoly.burakov, pbhagavatula

15/10/2018 14:01, Shreyansh Jain:
> Shreyansh Jain (5):
>   bus/fslmc: fix physical addressing check
>   drivers: common as dependency for bus
>   common/dpaax: add library for PA VA translation table
>   dpaa: enable dpaax library
>   fslmc: enable dpaax library

Applied with a couple of fixes as noticed in this thread, thanks.
* [dpdk-dev] [PATCH v4 0/4] eventdev: add attribute based get APIs
From: Harry van Haaren @ 2017-09-14 16:08 UTC
To: dev; +Cc: jerin.jacob, Harry van Haaren

This patchset refactors the eventdev API to be more flexible
and capable. In particular, the API is capable of returning an
error value if an invalid device, port or attribute ID is passed
in, which was not possible with the previous APIs.

The implementation of this patchset is based on a v1 patch[1],
and after some discussion this API was seen as the best solution.

In terms of flexibility, the attribute id allows addition of new
common eventdev layer attributes without breaking ABI or adding
new functions. Note that these attributes are not data-path, and
that PMDs should continue to use the xstats API for reporting any
unique PMD statistics that are available.

Regarding API/ABI compatibility, I have removed the functions from
the .map files - please review the .map file changes for ABI issues
carefully.

The last patch of this series adds a started attribute to the device,
allowing the application to query if a device is currently running.

-Harry

[1] http://dpdk.org/dev/patchwork/patch/27152/

---

v4:
- Rework based on review by Jerin
- default: cases into switches
- Remove old functions from .map file
- Remove /* out */ parameters
- Rework header file definitions to match logical order
- Rework patch split
- Cleaner removal of queue_count() function

v3:
- Fix checkpatch issues... somehow I broke my checkpatch script :/

v2:
- New APIs design based on discussion of initial patch.

Harry van Haaren (4):
  eventdev: add port attribute function
  eventdev: add dev attribute get function
  eventdev: add queue attribute function
  eventdev: add device started attribute

 lib/librte_eventdev/rte_eventdev.c           |  97 ++++++++++++------
 lib/librte_eventdev/rte_eventdev.h           | 115 +++++++++++----------
 lib/librte_eventdev/rte_eventdev_version.map |  14 ++-
 test/test/test_eventdev.c                    | 132 +++++++++++++++++++------
 test/test/test_eventdev_octeontx.c           | 143 ++++++++++++++++++++-------
 5 files changed, 345 insertions(+), 156 deletions(-)

--
2.7.4
* [dpdk-dev] [PATCH v5 0/5]
From: Harry van Haaren @ 2017-09-20 13:35 UTC
To: dev; +Cc: jerin.jacob, Harry van Haaren

This patchset refactors the eventdev API to be more flexible
and capable. In particular, the API is capable of returning an
error value if an invalid device, port or attribute ID is passed
in, which was not possible with the previous APIs.

The implementation of this patchset is based on a v1 patch[1],
and after some discussion this API was seen as the best solution.

In terms of flexibility, the attribute id allows addition of new
common eventdev layer attributes without breaking ABI or adding
new functions. Note that these attributes are not data-path, and
that PMDs should continue to use the xstats API for reporting any
unique PMD statistics that are available.

Regarding API/ABI compatibility, I have removed the functions from
the .map files - please review the .map file changes for ABI issues
carefully.

The last patch of this series adds a started attribute to the device,
allowing the application to query if a device is currently running.

-Harry

[1] http://dpdk.org/dev/patchwork/patch/27152/

---

v5:
- Bump library version of Eventdev (Jerin)
- http://dpdk.org/ml/archives/dev/2017-September/075551.html

v4:
- Rework based on review by Jerin
- default: cases into switches
- Remove old functions from .map file
- Remove /* out */ parameters
- Rework header file definitions to match logical order
- Rework patch split
- Cleaner removal of queue_count() function

v3:
- Fix checkpatch issues... somehow I broke my checkpatch script :/

v2:
- New APIs design based on discussion of initial patch.

Harry van Haaren (5):
  eventdev: add port attribute function
  eventdev: add dev attribute get function
  eventdev: add queue attribute function
  eventdev: add device started attribute
  eventdev: bump library version

 doc/guides/rel_notes/release_17_11.rst       |   2 +-
 lib/librte_eventdev/Makefile                 |   2 +-
 lib/librte_eventdev/rte_eventdev.c           |  97 ++++++++++++------
 lib/librte_eventdev/rte_eventdev.h           | 115 +++++++++++----------
 lib/librte_eventdev/rte_eventdev_version.map |  14 ++-
 test/test/test_eventdev.c                    | 132 +++++++++++++++++++------
 test/test/test_eventdev_octeontx.c           | 143 ++++++++++++++++++++-------
 7 files changed, 347 insertions(+), 158 deletions(-)

--
2.7.4
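The cover letter above describes an attribute-based getter that can report an error for an invalid device or attribute ID and that can grow new attributes without new functions. The standalone sketch below illustrates that shape only. The sketch_* names, the attribute enum and the device array are hypothetical and are not the eventdev API; the use of -EINVAL for every invalid input is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical attribute IDs; new ones can be appended without new calls. */
enum sketch_attr {
	SKETCH_ATTR_PORT_COUNT,
	SKETCH_ATTR_QUEUE_COUNT,
	SKETCH_ATTR_STARTED,
};

/* Hypothetical per-device state. */
struct sketch_dev {
	int attached;
	uint32_t nb_ports;
	uint32_t nb_queues;
	int started;
};

#define SKETCH_MAX_DEVS 4
static struct sketch_dev sketch_devs[SKETCH_MAX_DEVS];

/* Test helper: mark a device slot as present with the given counts. */
static void
sketch_dev_attach(uint8_t dev_id, uint32_t nb_ports, uint32_t nb_queues)
{
	assert(dev_id < SKETCH_MAX_DEVS);
	sketch_devs[dev_id].attached = 1;
	sketch_devs[dev_id].nb_ports = nb_ports;
	sketch_devs[dev_id].nb_queues = nb_queues;
	sketch_devs[dev_id].started = 0;
}

/*
 * Single getter for all attributes. The value travels through the pointer,
 * so the return code is free to signal an invalid device or attribute ID.
 */
static int
sketch_dev_attr_get(uint8_t dev_id, uint32_t attr_id, uint32_t *value)
{
	const struct sketch_dev *dev;

	if (dev_id >= SKETCH_MAX_DEVS || value == NULL)
		return -EINVAL;
	dev = &sketch_devs[dev_id];
	if (!dev->attached)
		return -EINVAL;

	switch (attr_id) {
	case SKETCH_ATTR_PORT_COUNT:
		*value = dev->nb_ports;
		break;
	case SKETCH_ATTR_QUEUE_COUNT:
		*value = dev->nb_queues;
		break;
	case SKETCH_ATTR_STARTED:
		*value = dev->started ? 1 : 0;
		break;
	default:
		return -EINVAL; /* unknown attribute ID */
	}
	return 0;
}
```

A getter that returns the count directly has no spare value left to mean "no such device", which is the limitation of the previous APIs that the cover letter mentions.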