On Feb 6, 2024, at 9:24 PM, Morten Brørup wrote:

> From: Akhil Goyal [mailto:gakhil@marvell.com]
> Sent: Tuesday, 6 February 2024 15.25
>
>>> Cache the most recent VA -> PA mapping found so that we can skip
>>> most of the system calls. With 4K pages this reduces pool create
>>> time by about 90%.
>>>
>>> Signed-off-by: Andrew Boyer
>>
>> I believe there should be a generic solution for this in mempool if
>> it is not there already. Here, you are adding cache in mempool priv
>> which does not seem to be a correct place. This optimization would
>> be needed across all types of mempools.
>>
>> Adding more people for comments.
>>
>>> ---
>>>  lib/cryptodev/rte_crypto.h    |  5 +++++
>>>  lib/cryptodev/rte_cryptodev.c | 23 ++++++++++++++++++++++-
>>>  2 files changed, 27 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/cryptodev/rte_crypto.h b/lib/cryptodev/rte_crypto.h
>>> index dbc2700da5..ee6aa1e40e 100644
>>> --- a/lib/cryptodev/rte_crypto.h
>>> +++ b/lib/cryptodev/rte_crypto.h
>>> @@ -220,6 +220,11 @@ struct rte_crypto_op_pool_private {
>>>  	/**< Crypto op pool type operation. */
>>>  	uint16_t priv_size;
>>>  	/**< Size of private area in each crypto operation. */
>>> +
>>> +	unsigned long vp_cache;
>>> +	/* Virtual page address of previous op. */
>>> +	rte_iova_t iovp_cache;
>>> +	/* I/O virtual page address of previous op. */
>>>  };
>>>
>>> diff --git a/lib/cryptodev/rte_cryptodev.c b/lib/cryptodev/rte_cryptodev.c
>>> index b233c0ecd7..d596f85a57 100644
>>> --- a/lib/cryptodev/rte_cryptodev.c
>>> +++ b/lib/cryptodev/rte_cryptodev.c
>>> @@ -10,6 +10,7 @@
>>>  #include
>>>  #include
>>>  #include
>>> +#include
>>>  #include
>>>  #include
>>>
>>> @@ -2568,12 +2569,32 @@ rte_crypto_op_init(struct rte_mempool *mempool,
>>>  {
>>>  	struct rte_crypto_op *op = _op_data;
>>>  	enum rte_crypto_op_type type = *(enum rte_crypto_op_type *)opaque_arg;
>>> +	struct rte_crypto_op_pool_private *priv;
>>> +	unsigned long virt_addr = (unsigned long)(uintptr_t)_op_data;
>>> +#ifdef RTE_EXEC_ENV_WINDOWS
>>> +	unsigned long page_mask = 4095;
>>> +#else
>>> +	unsigned long page_mask = sysconf(_SC_PAGESIZE) - 1;
>>> +#endif
>>> +	unsigned long virt_page = virt_addr & ~page_mask;
>>>
>>>  	memset(_op_data, 0, mempool->elt_size);
>>>
>>>  	__rte_crypto_op_reset(op, type);
>>>
>>> -	op->phys_addr = rte_mem_virt2iova(_op_data);
>
> This optimization is for rte_mem_virt2iova(_op_data) being slow.
>
> If I'm not mistaken, _op_data is an object in a mempool, where the
> mempool object headers have already been initialized. In this case,
> it could simply be optimized as this:
>
> -	op->phys_addr = rte_mem_virt2iova(_op_data);
> +	op->phys_addr = rte_mempool_virt2iova(_op_data);

That certainly is shorter! Thanks, I was not aware of this function.

-Andrew
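
For reference, the page-caching idea described in the commit message can be sketched in plain C roughly as below. This is an illustration only, not the code from the patch: slow_virt2iova() is a hypothetical stand-in for an expensive lookup such as rte_mem_virt2iova(), struct page_cache and cached_virt2iova() are made-up names, and the 4 KiB page size and fixed-offset fake mapping are assumptions so the sketch builds without DPDK.

```c
#include <stdint.h>

#define PAGE_SIZE 4096UL           /* assumed page size for this sketch */
#define PAGE_MASK (PAGE_SIZE - 1)

/* Counts how often the expensive path is taken. */
static unsigned long slow_lookups;

/*
 * Hypothetical stand-in for an expensive VA -> IOVA translation.
 * The fixed offset is a fake mapping, used only for illustration.
 */
static uint64_t slow_virt2iova(uintptr_t va)
{
	slow_lookups++;
	return (uint64_t)va + 0x100000000ULL;
}

/* Single-entry cache holding the most recent page translation. */
struct page_cache {
	uintptr_t virt_page; /* virtual page address of the last lookup */
	uint64_t iova_page;  /* matching I/O virtual page address */
	int valid;           /* non-zero once the cache holds an entry */
};

/*
 * Translate va, calling the slow path only when va falls on a
 * different page than the previous call. Addresses within one page
 * share a translation, so the in-page offset is simply added back.
 */
static uint64_t cached_virt2iova(struct page_cache *c, uintptr_t va)
{
	uintptr_t vpage = va & ~(uintptr_t)PAGE_MASK;

	if (!c->valid || c->virt_page != vpage) {
		c->iova_page = slow_virt2iova(vpage);
		c->virt_page = vpage;
		c->valid = 1;
	}
	return c->iova_page + (va & PAGE_MASK);
}
```

When many small objects are initialized in address order, most of them share a page with the previous object, so the slow path runs about once per page instead of once per object. The rte_mempool_virt2iova() suggestion above avoids the lookup entirely for mempool objects, which is why it makes a cache like this unnecessary in that case.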