DPDK patches and discussions
* [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process
@ 2019-09-03 15:40 Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API Fan Zhang
                   ` (9 more replies)
  0 siblings, 10 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

This RFC patch adds a way for rte_security to process symmetric crypto
workloads in bulk, synchronously, for SW crypto devices.

Originally both SW and HW crypto PMDs work under rte_cryptodev and process
crypto workloads asynchronously. This provides uniformity across both PMD
types, but it also introduces unnecessary performance penalties for SW PMDs,
such as the extra SW ring enqueue/dequeue steps needed to "simulate"
asynchronous operation, and needless HW address computation.

We introduce a new path for SW crypto devices that performs the crypto
operation synchronously, taking only the fields required for the computation
as input. The proof-of-concept AESNI-GCM and AESNI-MB SW PMDs are updated to
support this new method. To demonstrate the performance gain, two simple
evaluation apps are added under unit test:
"app/test: security_aesni_gcm_perftest/security_aesni_mb_perftest". Users
can freely compare their results against those of the crypto perf
application.

Fan Zhang (9):
  security: introduce CPU Crypto action type and API
  crypto/aesni_gcm: add rte_security handler
  app/test: add security cpu crypto autotest
  app/test: add security cpu crypto perftest
  crypto/aesni_mb: add rte_security handler
  app/test: add aesni_mb security cpu crypto autotest
  app/test: add aesni_mb security cpu crypto perftest
  ipsec: add rte_security cpu_crypto action support
  examples/ipsec-secgw: add security cpu_crypto action support

 app/test/Makefile                                  |    1 +
 app/test/meson.build                               |    1 +
 app/test/test_security_cpu_crypto.c                | 1326 ++++++++++++++++++++
 drivers/crypto/aesni_gcm/aesni_gcm_pmd.c           |   91 +-
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c       |   95 ++
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h   |   23 +
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c         |  291 ++++-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c     |   91 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |   21 +-
 examples/ipsec-secgw/ipsec.c                       |   22 +
 examples/ipsec-secgw/ipsec_process.c               |    4 +-
 examples/ipsec-secgw/sa.c                          |   13 +-
 examples/ipsec-secgw/test/run_test.sh              |   10 +
 .../test/trs_3descbc_sha1_cpu_crypto_defs.sh       |    5 +
 .../test/trs_aescbc_sha1_cpu_crypto_defs.sh        |    5 +
 .../test/trs_aesctr_sha1_cpu_crypto_defs.sh        |    5 +
 .../ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh |    5 +
 .../test/trs_aesgcm_mb_cpu_crypto_defs.sh          |    7 +
 .../test/tun_3descbc_sha1_cpu_crypto_defs.sh       |    5 +
 .../test/tun_aescbc_sha1_cpu_crypto_defs.sh        |    5 +
 .../test/tun_aesctr_sha1_cpu_crypto_defs.sh        |    5 +
 .../ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh |    5 +
 .../test/tun_aesgcm_mb_cpu_crypto_defs.sh          |    7 +
 lib/librte_ipsec/esp_inb.c                         |  174 ++-
 lib/librte_ipsec/esp_outb.c                        |  290 ++++-
 lib/librte_ipsec/sa.c                              |   53 +-
 lib/librte_ipsec/sa.h                              |   29 +
 lib/librte_ipsec/ses.c                             |    4 +-
 lib/librte_security/rte_security.c                 |   16 +
 lib/librte_security/rte_security.h                 |   51 +-
 lib/librte_security/rte_security_driver.h          |   19 +
 lib/librte_security/rte_security_version.map       |    1 +
 32 files changed, 2658 insertions(+), 22 deletions(-)
 create mode 100644 app/test/test_security_cpu_crypto.c
 create mode 100644 examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh

-- 
2.14.5



* [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
@ 2019-09-03 15:40 ` Fan Zhang
  2019-09-04 10:32   ` Akhil Goyal
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 2/9] crypto/aesni_gcm: add rte_security handler Fan Zhang
                   ` (8 subsequent siblings)
  9 siblings, 1 reply; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

This patch introduces the new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action type
to the security library. The type represents performing the crypto operation
with CPU cycles. The patch also includes a new API to process crypto
operations in bulk, plus the corresponding function pointers for PMDs.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 lib/librte_security/rte_security.c           | 16 +++++++++
 lib/librte_security/rte_security.h           | 51 +++++++++++++++++++++++++++-
 lib/librte_security/rte_security_driver.h    | 19 +++++++++++
 lib/librte_security/rte_security_version.map |  1 +
 4 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/lib/librte_security/rte_security.c b/lib/librte_security/rte_security.c
index bc81ce15d..0f85c1b59 100644
--- a/lib/librte_security/rte_security.c
+++ b/lib/librte_security/rte_security.c
@@ -141,3 +141,19 @@ rte_security_capability_get(struct rte_security_ctx *instance,
 
 	return NULL;
 }
+
+void
+rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
+		struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num)
+{
+	uint32_t i;
+
+	for (i = 0; i < num; i++)
+		status[i] = -1;
+
+	RTE_FUNC_PTR_OR_RET(*instance->ops->process_cpu_crypto_bulk);
+	instance->ops->process_cpu_crypto_bulk(sess, buf, iv,
+			aad, digest, status, num);
+}
diff --git a/lib/librte_security/rte_security.h b/lib/librte_security/rte_security.h
index 96806e3a2..5a0f8901b 100644
--- a/lib/librte_security/rte_security.h
+++ b/lib/librte_security/rte_security.h
@@ -18,6 +18,7 @@ extern "C" {
 #endif
 
 #include <sys/types.h>
+#include <sys/uio.h>
 
 #include <netinet/in.h>
 #include <netinet/ip.h>
@@ -272,6 +273,20 @@ struct rte_security_pdcp_xform {
 	uint32_t hfn_threshold;
 };
 
+struct rte_security_cpu_crypto_xform {
+	/** For a cipher/authentication crypto operation the authentication
+	 * may cover more content than the cipher. E.g., for IPsec ESP
+	 * encryption with AES-CBC and SHA1-HMAC, the encryption starts after
+	 * the ESP header, but the whole packet (apart from the MAC header)
+	 * is authenticated. The cipher_offset field is used to derive the
+	 * cipher data pointer from the start of the buffer to be processed.
+	 *
+	 * NOTE: this parameter shall be ignored by AEAD algorithms, since
+	 * they use the same offset for cipher and authentication.
+	 */
+	int32_t cipher_offset;
+};
+
 /**
  * Security session action type.
  */
@@ -286,10 +301,14 @@ enum rte_security_session_action_type {
 	/**< All security protocol processing is performed inline during
 	 * transmission
 	 */
-	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
+	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
 	/**< All security protocol processing including crypto is performed
 	 * on a lookaside accelerator
 	 */
+	RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
+	/**< Crypto processing for security protocol is processed by CPU
+	 * synchronously
+	 */
 };
 
 /** Security session protocol definition */
@@ -315,6 +334,7 @@ struct rte_security_session_conf {
 		struct rte_security_ipsec_xform ipsec;
 		struct rte_security_macsec_xform macsec;
 		struct rte_security_pdcp_xform pdcp;
+		struct rte_security_cpu_crypto_xform cpucrypto;
 	};
 	/**< Configuration parameters for security session */
 	struct rte_crypto_sym_xform *crypto_xform;
@@ -639,6 +659,35 @@ const struct rte_security_capability *
 rte_security_capability_get(struct rte_security_ctx *instance,
 			    struct rte_security_capability_idx *idx);
 
+/**
+ * Security vector structure, contains pointer to vector array and the length
+ * of the array
+ */
+struct rte_security_vec {
+	struct iovec *vec;
+	uint32_t num;
+};
+
+/**
+ * Process a bulk crypto workload with the CPU.
+ *
+ * @param	instance	security instance.
+ * @param	sess		security session
+ * @param	buf		array of buffer SGL vectors
+ * @param	iv		array of IV pointers
+ * @param	aad		array of AAD pointers
+ * @param	digest		array of digest pointers
+ * @param	status		array of status for the function to return
+ * @param	num		number of elements in each array
+ *
+ */
+__rte_experimental
+void
+rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
+		struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_security/rte_security_driver.h b/lib/librte_security/rte_security_driver.h
index 1b561f852..70fcb0c26 100644
--- a/lib/librte_security/rte_security_driver.h
+++ b/lib/librte_security/rte_security_driver.h
@@ -132,6 +132,23 @@ typedef int (*security_get_userdata_t)(void *device,
 typedef const struct rte_security_capability *(*security_capabilities_get_t)(
 		void *device);
 
+/**
+ * Process security operations in bulk using CPU accelerated method.
+ *
+ * @param	sess		Security session structure.
+ * @param	buf		Buffer to the vectors to be processed.
+ * @param	iv		IV pointers.
+ * @param	aad		AAD pointers.
+ * @param	digest		Digest pointers.
+ * @param	status		Array of status value.
+ * @param	num		Number of elements in each array.
+ */
+
+typedef void (*security_process_cpu_crypto_bulk_t)(
+		struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
 /** Security operations function pointer table */
 struct rte_security_ops {
 	security_session_create_t session_create;
@@ -150,6 +167,8 @@ struct rte_security_ops {
 	/**< Get userdata associated with session which processed the packet. */
 	security_capabilities_get_t capabilities_get;
 	/**< Get security capabilities. */
+	security_process_cpu_crypto_bulk_t process_cpu_crypto_bulk;
+	/**< Process data in bulk. */
 };
 
 #ifdef __cplusplus
diff --git a/lib/librte_security/rte_security_version.map b/lib/librte_security/rte_security_version.map
index 53267bf3c..2132e7a00 100644
--- a/lib/librte_security/rte_security_version.map
+++ b/lib/librte_security/rte_security_version.map
@@ -18,4 +18,5 @@ EXPERIMENTAL {
 	rte_security_get_userdata;
 	rte_security_session_stats_get;
 	rte_security_session_update;
+	rte_security_process_cpu_crypto_bulk;
 };
-- 
2.14.5



* [dpdk-dev] [RFC PATCH 2/9] crypto/aesni_gcm: add rte_security handler
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API Fan Zhang
@ 2019-09-03 15:40 ` Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 3/9] app/test: add security cpu crypto autotest Fan Zhang
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

This patch adds rte_security support to the AESNI-GCM PMD. The PMD now
initializes a security context instance, creates/deletes PMD-specific
security sessions, and processes crypto workloads in synchronous mode, with
scatter-gather list buffers supported.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 drivers/crypto/aesni_gcm/aesni_gcm_pmd.c         | 91 ++++++++++++++++++++++-
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c     | 95 ++++++++++++++++++++++++
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h | 23 ++++++
 3 files changed, 208 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
index 1006a5c4d..0a346eddd 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
@@ -6,6 +6,7 @@
 #include <rte_hexdump.h>
 #include <rte_cryptodev.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security_driver.h>
 #include <rte_bus_vdev.h>
 #include <rte_malloc.h>
 #include <rte_cpuflags.h>
@@ -174,6 +175,56 @@ aesni_gcm_get_session(struct aesni_gcm_qp *qp, struct rte_crypto_op *op)
 	return sess;
 }
 
+static __rte_always_inline int
+process_gcm_security_sgl_buf(struct aesni_gcm_security_session *sess,
+		struct rte_security_vec *buf, uint8_t *iv,
+		uint8_t *aad, uint8_t *digest)
+{
+	struct aesni_gcm_session *session = &sess->sess;
+	uint8_t *tag;
+	uint32_t i;
+
+	sess->init(&session->gdata_key, &sess->gdata_ctx, iv, aad,
+			(uint64_t)session->aad_length);
+
+	for (i = 0; i < buf->num; i++) {
+		struct iovec *vec = &buf->vec[i];
+
+		sess->update(&session->gdata_key, &sess->gdata_ctx,
+				vec->iov_base, vec->iov_base, vec->iov_len);
+	}
+
+	switch (session->op) {
+	case AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION:
+		if (session->req_digest_length != session->gen_digest_length)
+			tag = sess->temp_digest;
+		else
+			tag = digest;
+
+		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
+				session->gen_digest_length);
+
+		if (session->req_digest_length != session->gen_digest_length)
+			memcpy(digest, sess->temp_digest,
+					session->req_digest_length);
+		break;
+
+	case AESNI_GCM_OP_AUTHENTICATED_DECRYPTION:
+		tag = sess->temp_digest;
+
+		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
+				session->gen_digest_length);
+
+		if (memcmp(tag, digest,	session->req_digest_length) != 0)
+			return -1;
+		break;
+	default:
+		return -1;
+	}
+
+	return 0;
+}
+
 /**
  * Process a crypto operation, calling
  * the GCM API from the multi buffer library.
@@ -488,8 +539,10 @@ aesni_gcm_create(const char *name,
 {
 	struct rte_cryptodev *dev;
 	struct aesni_gcm_private *internals;
+	struct rte_security_ctx *sec_ctx;
 	enum aesni_gcm_vector_mode vector_mode;
 	MB_MGR *mb_mgr;
+	char sec_name[RTE_DEV_NAME_MAX_LEN];
 
 	/* Check CPU for support for AES instruction set */
 	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
@@ -524,7 +577,8 @@ aesni_gcm_create(const char *name,
 			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
 			RTE_CRYPTODEV_FF_CPU_AESNI |
 			RTE_CRYPTODEV_FF_OOP_SGL_IN_LB_OUT |
-			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
+			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
+			RTE_CRYPTODEV_FF_SECURITY;
 
 	mb_mgr = alloc_mb_mgr(0);
 	if (mb_mgr == NULL)
@@ -587,6 +641,21 @@ aesni_gcm_create(const char *name,
 
 	internals->max_nb_queue_pairs = init_params->max_nb_queue_pairs;
 
+	/* setup security operations */
+	snprintf(sec_name, sizeof(sec_name) - 1, "aes_gcm_sec_%u",
+			dev->driver_id);
+	sec_ctx = rte_zmalloc_socket(sec_name,
+			sizeof(struct rte_security_ctx),
+			RTE_CACHE_LINE_SIZE, init_params->socket_id);
+	if (sec_ctx == NULL) {
+		AESNI_GCM_LOG(ERR, "memory allocation failed\n");
+		goto error_exit;
+	}
+
+	sec_ctx->device = (void *)dev;
+	sec_ctx->ops = rte_aesni_gcm_pmd_security_ops;
+	dev->security_ctx = sec_ctx;
+
 #if IMB_VERSION_NUM >= IMB_VERSION(0, 50, 0)
 	AESNI_GCM_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
 			imb_get_version_str());
@@ -641,6 +710,8 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
 	if (cryptodev == NULL)
 		return -ENODEV;
 
+	rte_free(cryptodev->security_ctx);
+
 	internals = cryptodev->data->dev_private;
 
 	free_mb_mgr(internals->mb_mgr);
@@ -648,6 +719,24 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
 	return rte_cryptodev_pmd_destroy(cryptodev);
 }
 
+void
+aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num)
+{
+	struct aesni_gcm_security_session *session =
+			get_sec_session_private_data(sess);
+	uint32_t i;
+
+	if (unlikely(!session))
+		return;
+
+	for (i = 0; i < num; i++)
+		status[i] = process_gcm_security_sgl_buf(session, &buf[i],
+				(uint8_t *)iv[i], (uint8_t *)aad[i],
+				(uint8_t *)digest[i]);
+}
+
 static struct rte_vdev_driver aesni_gcm_pmd_drv = {
 	.probe = aesni_gcm_probe,
 	.remove = aesni_gcm_remove
diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
index 2f66c7c58..cc71dbd60 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
@@ -7,6 +7,7 @@
 #include <rte_common.h>
 #include <rte_malloc.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security_driver.h>
 
 #include "aesni_gcm_pmd_private.h"
 
@@ -316,6 +317,85 @@ aesni_gcm_pmd_sym_session_clear(struct rte_cryptodev *dev,
 	}
 }
 
+static int
+aesni_gcm_security_session_create(void *dev,
+		struct rte_security_session_conf *conf,
+		struct rte_security_session *sess,
+		struct rte_mempool *mempool)
+{
+	struct rte_cryptodev *cdev = dev;
+	struct aesni_gcm_private *internals = cdev->data->dev_private;
+	struct aesni_gcm_security_session *sess_priv;
+	int ret;
+
+	if (!conf->crypto_xform) {
+		AESNI_GCM_LOG(ERR, "Invalid security session conf");
+		return -EINVAL;
+	}
+
+	if (conf->crypto_xform->type == RTE_CRYPTO_SYM_XFORM_AUTH) {
+		AESNI_GCM_LOG(ERR, "GMAC is not supported in security session");
+		return -EINVAL;
+	}
+
+
+	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
+		AESNI_GCM_LOG(ERR,
+				"Couldn't get object from session mempool");
+		return -ENOMEM;
+	}
+
+	ret = aesni_gcm_set_session_parameters(internals->ops,
+				&sess_priv->sess, conf->crypto_xform);
+	if (ret != 0) {
+		AESNI_GCM_LOG(ERR, "Failed configure session parameters");
+
+		/* Return session to mempool */
+		rte_mempool_put(mempool, (void *)sess_priv);
+		return ret;
+	}
+
+	sess_priv->pre = internals->ops[sess_priv->sess.key].pre;
+	sess_priv->init = internals->ops[sess_priv->sess.key].init;
+	if (sess_priv->sess.op == AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION) {
+		sess_priv->update =
+			internals->ops[sess_priv->sess.key].update_enc;
+		sess_priv->finalize =
+			internals->ops[sess_priv->sess.key].finalize_enc;
+	} else {
+		sess_priv->update =
+			internals->ops[sess_priv->sess.key].update_dec;
+		sess_priv->finalize =
+			internals->ops[sess_priv->sess.key].finalize_dec;
+	}
+
+	sess->sess_private_data = sess_priv;
+
+	return 0;
+}
+
+static int
+aesni_gcm_security_session_destroy(void *dev __rte_unused,
+		struct rte_security_session *sess)
+{
+	void *sess_priv = get_sec_session_private_data(sess);
+
+	if (sess_priv) {
+		struct rte_mempool *sess_mp = rte_mempool_from_obj(sess_priv);
+
+		memset(sess_priv, 0, sizeof(struct aesni_gcm_security_session));
+		set_sec_session_private_data(sess, NULL);
+		rte_mempool_put(sess_mp, sess_priv);
+	}
+	return 0;
+}
+
+static unsigned int
+aesni_gcm_sec_session_get_size(__rte_unused void *device)
+{
+	return sizeof(struct aesni_gcm_security_session);
+}
+
 struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
 		.dev_configure		= aesni_gcm_pmd_config,
 		.dev_start		= aesni_gcm_pmd_start,
@@ -336,4 +416,19 @@ struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
 		.sym_session_clear	= aesni_gcm_pmd_sym_session_clear
 };
 
+static struct rte_security_ops aesni_gcm_security_ops = {
+		.session_create = aesni_gcm_security_session_create,
+		.session_get_size = aesni_gcm_sec_session_get_size,
+		.session_update = NULL,
+		.session_stats_get = NULL,
+		.session_destroy = aesni_gcm_security_session_destroy,
+		.set_pkt_metadata = NULL,
+		.capabilities_get = NULL,
+		.process_cpu_crypto_bulk =
+				aesni_gcm_sec_crypto_process_bulk,
+};
+
 struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops = &aesni_gcm_pmd_ops;
+
+struct rte_security_ops *rte_aesni_gcm_pmd_security_ops =
+		&aesni_gcm_security_ops;
diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
index 56b29e013..8e490b6ce 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
@@ -114,5 +114,28 @@ aesni_gcm_set_session_parameters(const struct aesni_gcm_ops *ops,
  * Device specific operations function pointer structure */
 extern struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops;
 
+/**
+ * Security session structure.
+ */
+struct aesni_gcm_security_session {
+	/** Temp digest for decryption */
+	uint8_t temp_digest[DIGEST_LENGTH_MAX];
+	/** GCM operations */
+	aesni_gcm_pre_t pre;
+	aesni_gcm_init_t init;
+	aesni_gcm_update_t update;
+	aesni_gcm_finalize_t finalize;
+	/** AESNI-GCM session */
+	struct aesni_gcm_session sess;
+	/** AESNI-GCM context */
+	struct gcm_context_data gdata_ctx;
+};
+
+extern void
+aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
+extern struct rte_security_ops *rte_aesni_gcm_pmd_security_ops;
 
 #endif /* _RTE_AESNI_GCM_PMD_PRIVATE_H_ */
-- 
2.14.5



* [dpdk-dev] [RFC PATCH 3/9] app/test: add security cpu crypto autotest
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 2/9] crypto/aesni_gcm: add rte_security handler Fan Zhang
@ 2019-09-03 15:40 ` Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 4/9] app/test: add security cpu crypto perftest Fan Zhang
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

This patch adds a CPU crypto unit test for the AESNI-GCM PMD.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/Makefile                   |   1 +
 app/test/meson.build                |   1 +
 app/test/test_security_cpu_crypto.c | 564 ++++++++++++++++++++++++++++++++++++
 3 files changed, 566 insertions(+)
 create mode 100644 app/test/test_security_cpu_crypto.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 26ba6fe2b..090c55746 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -196,6 +196,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_RING) += test_pmd_ring_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_blockcipher.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_asym.c
+SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_security_cpu_crypto.c
 
 SRCS-$(CONFIG_RTE_LIBRTE_METRICS) += test_metrics.c
 
diff --git a/app/test/meson.build b/app/test/meson.build
index ec40943bd..b7834ff21 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -103,6 +103,7 @@ test_sources = files('commands.c',
 	'test_ring_perf.c',
 	'test_rwlock.c',
 	'test_sched.c',
+	'test_security_cpu_crypto.c',
 	'test_service_cores.c',
 	'test_spinlock.c',
 	'test_stack.c',
diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
new file mode 100644
index 000000000..d345922b2
--- /dev/null
+++ b/app/test/test_security_cpu_crypto.c
@@ -0,0 +1,564 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation
+ */
+
+#include <rte_common.h>
+#include <rte_hexdump.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+#include <rte_memcpy.h>
+#include <rte_pause.h>
+#include <rte_bus_vdev.h>
+#include <rte_random.h>
+
+#include <rte_security.h>
+
+#include <rte_crypto.h>
+#include <rte_cryptodev.h>
+#include <rte_cryptodev_pmd.h>
+
+#include "test.h"
+#include "test_cryptodev.h"
+#include "test_cryptodev_aead_test_vectors.h"
+
+#define CPU_CRYPTO_TEST_MAX_AAD_LENGTH	16
+#define MAX_NB_SIGMENTS			4
+
+enum buffer_assemble_option {
+	SGL_MAX_SEG,
+	SGL_ONE_SEG,
+};
+
+struct cpu_crypto_test_case {
+	struct {
+		uint8_t seg[MBUF_DATAPAYLOAD_SIZE];
+		uint32_t seg_len;
+	} seg_buf[MAX_NB_SIGMENTS];
+	uint8_t iv[MAXIMUM_IV_LENGTH];
+	uint8_t aad[CPU_CRYPTO_TEST_MAX_AAD_LENGTH];
+	uint8_t digest[DIGEST_BYTE_LENGTH_SHA512];
+} __rte_cache_aligned;
+
+struct cpu_crypto_test_obj {
+	struct iovec vec[MAX_NUM_OPS_INFLIGHT][MAX_NB_SIGMENTS];
+	struct rte_security_vec sec_buf[MAX_NUM_OPS_INFLIGHT];
+	void *iv[MAX_NUM_OPS_INFLIGHT];
+	void *digest[MAX_NUM_OPS_INFLIGHT];
+	void *aad[MAX_NUM_OPS_INFLIGHT];
+	int status[MAX_NUM_OPS_INFLIGHT];
+};
+
+struct cpu_crypto_testsuite_params {
+	struct rte_mempool *buf_pool;
+	struct rte_mempool *session_priv_mpool;
+	struct rte_security_ctx *ctx;
+};
+
+struct cpu_crypto_unittest_params {
+	struct rte_security_session *sess;
+	void *test_datas[MAX_NUM_OPS_INFLIGHT];
+	struct cpu_crypto_test_obj test_obj;
+	uint32_t nb_bufs;
+};
+
+static struct cpu_crypto_testsuite_params testsuite_params = { NULL };
+static struct cpu_crypto_unittest_params unittest_params;
+
+static int gbl_driver_id;
+
+static int
+testsuite_setup(void)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct rte_cryptodev_info info;
+	uint32_t i;
+	uint32_t nb_devs;
+	uint32_t sess_sz;
+	int ret;
+
+	memset(ts_params, 0, sizeof(*ts_params));
+
+	ts_params->buf_pool = rte_mempool_lookup("CPU_CRYPTO_MBUFPOOL");
+	if (ts_params->buf_pool == NULL) {
+		/* Not already created so create */
+		ts_params->buf_pool = rte_pktmbuf_pool_create(
+				"CPU_CRYPTO_MBUFPOOL",
+				NUM_MBUFS, MBUF_CACHE_SIZE, 0,
+				sizeof(struct cpu_crypto_test_case),
+				rte_socket_id());
+		if (ts_params->buf_pool == NULL) {
+			RTE_LOG(ERR, USER1,
+				"Can't create CPU_CRYPTO_MBUFPOOL\n");
+			return TEST_FAILED;
+		}
+	}
+
+	/* Create an AESNI MB device if required */
+	if (gbl_driver_id == rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD))) {
+		nb_devs = rte_cryptodev_device_count_by_driver(
+				rte_cryptodev_driver_id_get(
+				RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD)));
+		if (nb_devs < 1) {
+			ret = rte_vdev_init(
+				RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD), NULL);
+
+			TEST_ASSERT(ret == 0,
+				"Failed to create instance of"
+				" pmd : %s",
+				RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD));
+		}
+	}
+
+	/* Create an AESNI GCM device if required */
+	if (gbl_driver_id == rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD))) {
+		nb_devs = rte_cryptodev_device_count_by_driver(
+				rte_cryptodev_driver_id_get(
+				RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD)));
+		if (nb_devs < 1) {
+			TEST_ASSERT_SUCCESS(rte_vdev_init(
+				RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD), NULL),
+				"Failed to create instance of"
+				" pmd : %s",
+				RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD));
+		}
+	}
+
+	nb_devs = rte_cryptodev_count();
+	if (nb_devs < 1) {
+		RTE_LOG(ERR, USER1, "No crypto devices found?\n");
+		return TEST_FAILED;
+	}
+
+	/* Get security context */
+	for (i = 0; i < nb_devs; i++) {
+		rte_cryptodev_info_get(i, &info);
+		if (info.driver_id != gbl_driver_id)
+			continue;
+
+		ts_params->ctx = rte_cryptodev_get_sec_ctx(i);
+		if (!ts_params->ctx) {
+			RTE_LOG(ERR, USER1, "Rte_security is not supported\n");
+			return TEST_FAILED;
+		}
+	}
+
+	sess_sz = rte_security_session_get_size(ts_params->ctx);
+	ts_params->session_priv_mpool = rte_mempool_create(
+			"cpu_crypto_test_sess_mp", 2, sess_sz, 0, 0,
+			NULL, NULL, NULL, NULL,
+			SOCKET_ID_ANY, 0);
+	if (!ts_params->session_priv_mpool) {
+		RTE_LOG(ERR, USER1, "Not enough memory\n");
+		return TEST_FAILED;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static void
+testsuite_teardown(void)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+
+	if (ts_params->buf_pool)
+		rte_mempool_free(ts_params->buf_pool);
+
+	if (ts_params->session_priv_mpool)
+		rte_mempool_free(ts_params->session_priv_mpool);
+}
+
+static int
+ut_setup(void)
+{
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+
+	memset(ut_params, 0, sizeof(*ut_params));
+	return TEST_SUCCESS;
+}
+
+static void
+ut_teardown(void)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+
+	if (ut_params->sess)
+		rte_security_session_destroy(ts_params->ctx, ut_params->sess);
+
+	if (ut_params->nb_bufs) {
+		uint32_t i;
+
+		for (i = 0; i < ut_params->nb_bufs; i++)
+			memset(ut_params->test_datas[i], 0,
+				sizeof(struct cpu_crypto_test_case));
+
+		rte_mempool_put_bulk(ts_params->buf_pool, ut_params->test_datas,
+				ut_params->nb_bufs);
+	}
+}
+
+static int
+allocate_buf(uint32_t n)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	int ret;
+
+	ret = rte_mempool_get_bulk(ts_params->buf_pool, ut_params->test_datas,
+			n);
+
+	if (ret == 0)
+		ut_params->nb_bufs = n;
+
+	return ret;
+}
+
+static int
+check_status(struct cpu_crypto_test_obj *obj, uint32_t n)
+{
+	uint32_t i;
+
+	for (i = 0; i < n; i++)
+		if (obj->status[i] < 0)
+			return -1;
+
+	return 0;
+}
+
+static struct rte_security_session *
+create_aead_session(struct rte_security_ctx *ctx,
+		struct rte_mempool *sess_mp,
+		enum rte_crypto_aead_operation op,
+		const struct aead_test_data *test_data,
+		uint32_t is_unit_test)
+{
+	struct rte_security_session_conf sess_conf = {0};
+	struct rte_crypto_sym_xform xform = {0};
+
+	if (is_unit_test)
+		debug_hexdump(stdout, "key:", test_data->key.data,
+				test_data->key.len);
+
+	/* Setup AEAD Parameters */
+	xform.type = RTE_CRYPTO_SYM_XFORM_AEAD;
+	xform.next = NULL;
+	xform.aead.algo = test_data->algo;
+	xform.aead.op = op;
+	xform.aead.key.data = test_data->key.data;
+	xform.aead.key.length = test_data->key.len;
+	xform.aead.iv.offset = 0;
+	xform.aead.iv.length = test_data->iv.len;
+	xform.aead.digest_length = test_data->auth_tag.len;
+	xform.aead.aad_length = test_data->aad.len;
+
+	sess_conf.action_type = RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO;
+	sess_conf.crypto_xform = &xform;
+
+	return rte_security_session_create(ctx, &sess_conf, sess_mp);
+}
+
+static inline int
+assemble_aead_buf(struct cpu_crypto_test_case *data,
+		struct cpu_crypto_test_obj *obj,
+		uint32_t obj_idx,
+		enum rte_crypto_aead_operation op,
+		const struct aead_test_data *test_data,
+		enum buffer_assemble_option sgl_option,
+		uint32_t is_unit_test)
+{
+	const uint8_t *src;
+	uint32_t src_len;
+	uint32_t seg_idx;
+	uint32_t bytes_per_seg;
+	uint32_t left;
+
+	if (op == RTE_CRYPTO_AEAD_OP_ENCRYPT) {
+		src = test_data->plaintext.data;
+		src_len = test_data->plaintext.len;
+		if (is_unit_test)
+			debug_hexdump(stdout, "plaintext:", src, src_len);
+	} else {
+		src = test_data->ciphertext.data;
+		src_len = test_data->ciphertext.len;
+		memcpy(data->digest, test_data->auth_tag.data,
+				test_data->auth_tag.len);
+		if (is_unit_test) {
+			debug_hexdump(stdout, "ciphertext:", src, src_len);
+			debug_hexdump(stdout, "digest:",
+					test_data->auth_tag.data,
+					test_data->auth_tag.len);
+		}
+	}
+
+	if (src_len > MBUF_DATAPAYLOAD_SIZE)
+		return -ENOMEM;
+
+	switch (sgl_option) {
+	case SGL_MAX_SEG:
+		seg_idx = 0;
+		bytes_per_seg = src_len / MAX_NB_SIGMENTS + 1;
+		left = src_len;
+
+		if (bytes_per_seg > (MBUF_DATAPAYLOAD_SIZE / MAX_NB_SIGMENTS))
+			return -ENOMEM;
+
+		while (left) {
+			uint32_t cp_len = RTE_MIN(left, bytes_per_seg);
+			memcpy(data->seg_buf[seg_idx].seg, src, cp_len);
+			data->seg_buf[seg_idx].seg_len = cp_len;
+			obj->vec[obj_idx][seg_idx].iov_base =
+					(void *)data->seg_buf[seg_idx].seg;
+			obj->vec[obj_idx][seg_idx].iov_len = cp_len;
+			src += cp_len;
+			left -= cp_len;
+			seg_idx++;
+		}
+
+		if (left)
+			return -ENOMEM;
+
+		obj->sec_buf[obj_idx].vec = obj->vec[obj_idx];
+		obj->sec_buf[obj_idx].num = seg_idx;
+
+		break;
+	case SGL_ONE_SEG:
+		memcpy(data->seg_buf[0].seg, src, src_len);
+		data->seg_buf[0].seg_len = src_len;
+		obj->vec[obj_idx][0].iov_base =
+				(void *)data->seg_buf[0].seg;
+		obj->vec[obj_idx][0].iov_len = src_len;
+
+		obj->sec_buf[obj_idx].vec = obj->vec[obj_idx];
+		obj->sec_buf[obj_idx].num = 1;
+		break;
+	default:
+		return -1;
+	}
+
+	if (test_data->algo == RTE_CRYPTO_AEAD_AES_CCM) {
+		memcpy(data->iv + 1, test_data->iv.data, test_data->iv.len);
+		memcpy(data->aad + 18, test_data->aad.data, test_data->aad.len);
+	} else {
+		memcpy(data->iv, test_data->iv.data, test_data->iv.len);
+		memcpy(data->aad, test_data->aad.data, test_data->aad.len);
+	}
+
+	if (is_unit_test) {
+		debug_hexdump(stdout, "iv:", test_data->iv.data,
+				test_data->iv.len);
+		debug_hexdump(stdout, "aad:", test_data->aad.data,
+				test_data->aad.len);
+	}
+
+	obj->iv[obj_idx] = (void *)data->iv;
+	obj->digest[obj_idx] = (void *)data->digest;
+	obj->aad[obj_idx] = (void *)data->aad;
+
+	return 0;
+}
+
+#define CPU_CRYPTO_ERR_EXP_CT	"expect ciphertext:"
+#define CPU_CRYPTO_ERR_GEN_CT	"gen ciphertext:"
+#define CPU_CRYPTO_ERR_EXP_PT	"expect plaintext:"
+#define CPU_CRYPTO_ERR_GEN_PT	"gen plaintext:"
+
+static int
+check_aead_result(struct cpu_crypto_test_case *tcase,
+		enum rte_crypto_aead_operation op,
+		const struct aead_test_data *tdata)
+{
+	const char *err_msg1, *err_msg2;
+	const uint8_t *src_pt_ct;
+	const uint8_t *tmp_src;
+	uint32_t src_len;
+	uint32_t left;
+	uint32_t i = 0;
+	int ret;
+
+	if (op == RTE_CRYPTO_AEAD_OP_ENCRYPT) {
+		err_msg1 = CPU_CRYPTO_ERR_EXP_CT;
+		err_msg2 = CPU_CRYPTO_ERR_GEN_CT;
+
+		src_pt_ct = tdata->ciphertext.data;
+		src_len = tdata->ciphertext.len;
+
+		ret = memcmp(tcase->digest, tdata->auth_tag.data,
+				tdata->auth_tag.len);
+		if (ret != 0) {
+			debug_hexdump(stdout, "expect digest:",
+					tdata->auth_tag.data,
+					tdata->auth_tag.len);
+			debug_hexdump(stdout, "gen digest:",
+					tcase->digest,
+					tdata->auth_tag.len);
+			return -1;
+		}
+	} else {
+		src_pt_ct = tdata->plaintext.data;
+		src_len = tdata->plaintext.len;
+		err_msg1 = CPU_CRYPTO_ERR_EXP_PT;
+		err_msg2 = CPU_CRYPTO_ERR_GEN_PT;
+	}
+
+	tmp_src = src_pt_ct;
+	left = src_len;
+
+	while (left && i < MAX_NB_SIGMENTS) {
+		ret = memcmp(tcase->seg_buf[i].seg, tmp_src,
+				tcase->seg_buf[i].seg_len);
+		if (ret != 0)
+			goto sgl_err_dump;
+		tmp_src += tcase->seg_buf[i].seg_len;
+		left -= tcase->seg_buf[i].seg_len;
+		i++;
+	}
+
+	if (left) {
+		ret = -ENOMEM;
+		goto sgl_err_dump;
+	}
+
+	return 0;
+
+sgl_err_dump:
+	left = src_len;
+	i = 0;
+
+	debug_hexdump(stdout, err_msg1,
+			tdata->ciphertext.data,
+			tdata->ciphertext.len);
+
+	while (left && i < MAX_NB_SIGMENTS) {
+		debug_hexdump(stdout, err_msg2,
+				tcase->seg_buf[i].seg,
+				tcase->seg_buf[i].seg_len);
+		left -= tcase->seg_buf[i].seg_len;
+		i++;
+	}
+	return ret;
+}
+
+static inline void
+run_test(struct rte_security_ctx *ctx, struct rte_security_session *sess,
+		struct cpu_crypto_test_obj *obj, uint32_t n)
+{
+	rte_security_process_cpu_crypto_bulk(ctx, sess, obj->sec_buf,
+			obj->iv, obj->aad, obj->digest, obj->status, n);
+}
+
+static int
+cpu_crypto_test_aead(const struct aead_test_data *tdata,
+		enum rte_crypto_aead_operation dir,
+		enum buffer_assemble_option sgl_option)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	int ret;
+
+	ut_params->sess = create_aead_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			dir,
+			tdata,
+			1);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(1);
+	if (ret)
+		return ret;
+
+	tcase = ut_params->test_datas[0];
+	ret = assemble_aead_buf(tcase, obj, 0, dir, tdata, sgl_option, 1);
+	if (ret < 0) {
+		printf("Test is not supported by the driver\n");
+		return ret;
+	}
+
+	run_test(ts_params->ctx, ut_params->sess, obj, 1);
+
+	ret = check_status(obj, 1);
+	if (ret < 0)
+		return ret;
+
+	ret = check_aead_result(tcase, dir, tdata);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+/* test-vector/sgl-option */
+#define all_gcm_unit_test_cases(type)		\
+	TEST_EXPAND(gcm_test_case_1, type)	\
+	TEST_EXPAND(gcm_test_case_2, type)	\
+	TEST_EXPAND(gcm_test_case_3, type)	\
+	TEST_EXPAND(gcm_test_case_4, type)	\
+	TEST_EXPAND(gcm_test_case_5, type)	\
+	TEST_EXPAND(gcm_test_case_6, type)	\
+	TEST_EXPAND(gcm_test_case_7, type)	\
+	TEST_EXPAND(gcm_test_case_8, type)	\
+	TEST_EXPAND(gcm_test_case_192_1, type)	\
+	TEST_EXPAND(gcm_test_case_192_2, type)	\
+	TEST_EXPAND(gcm_test_case_192_3, type)	\
+	TEST_EXPAND(gcm_test_case_192_4, type)	\
+	TEST_EXPAND(gcm_test_case_192_5, type)	\
+	TEST_EXPAND(gcm_test_case_192_6, type)	\
+	TEST_EXPAND(gcm_test_case_192_7, type)	\
+	TEST_EXPAND(gcm_test_case_256_1, type)	\
+	TEST_EXPAND(gcm_test_case_256_2, type)	\
+	TEST_EXPAND(gcm_test_case_256_3, type)	\
+	TEST_EXPAND(gcm_test_case_256_4, type)	\
+	TEST_EXPAND(gcm_test_case_256_5, type)	\
+	TEST_EXPAND(gcm_test_case_256_6, type)	\
+	TEST_EXPAND(gcm_test_case_256_7, type)
+
+
+#define TEST_EXPAND(t, o)						\
+static int								\
+cpu_crypto_aead_enc_test_##t##_##o(void)				\
+{									\
+	return cpu_crypto_test_aead(&t, RTE_CRYPTO_AEAD_OP_ENCRYPT, o);	\
+}									\
+static int								\
+cpu_crypto_aead_dec_test_##t##_##o(void)				\
+{									\
+	return cpu_crypto_test_aead(&t, RTE_CRYPTO_AEAD_OP_DECRYPT, o);	\
+}									\
+
+all_gcm_unit_test_cases(SGL_ONE_SEG)
+all_gcm_unit_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesgcm_testsuite  = {
+	.suite_name = "Security CPU Crypto AESNI-GCM Unit Test Suite",
+	.setup = testsuite_setup,
+	.teardown = testsuite_teardown,
+	.unit_test_cases = {
+#define TEST_EXPAND(t, o)						\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_enc_test_##t##_##o),		\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_dec_test_##t##_##o),		\
+
+	all_gcm_unit_test_cases(SGL_ONE_SEG)
+	all_gcm_unit_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+	TEST_CASES_END() /**< NULL terminate unit test array */
+	},
+};
+
+static int
+test_security_cpu_crypto_aesni_gcm(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD));
+
+	return unit_test_suite_runner(&security_cpu_crypto_aesgcm_testsuite);
+}
+
+REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
+		test_security_cpu_crypto_aesni_gcm);
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [RFC PATCH 4/9] app/test: add security cpu crypto perftest
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
                   ` (2 preceding siblings ...)
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 3/9] app/test: add security cpu crypto autotest Fan Zhang
@ 2019-09-03 15:40 ` Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 5/9] crypto/aesni_mb: add rte_security handler Fan Zhang
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

Since the crypto perf application does not support rte_security, this patch
adds a simple GCM CPU crypto performance test to the crypto unittest
application. The test covers different key and data sizes with single-buffer
and SGL-buffer test items, and displays the throughput as well as cycle-count
performance information.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/test_security_cpu_crypto.c | 201 ++++++++++++++++++++++++++++++++++++
 1 file changed, 201 insertions(+)

diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
index d345922b2..ca9a8dae6 100644
--- a/app/test/test_security_cpu_crypto.c
+++ b/app/test/test_security_cpu_crypto.c
@@ -23,6 +23,7 @@
 
 #define CPU_CRYPTO_TEST_MAX_AAD_LENGTH	16
 #define MAX_NB_SIGMENTS			4
+#define CACHE_WARM_ITER			2048
 
 enum buffer_assemble_option {
 	SGL_MAX_SEG,
@@ -560,5 +561,205 @@ test_security_cpu_crypto_aesni_gcm(void)
 	return unit_test_suite_runner(&security_cpu_crypto_aesgcm_testsuite);
 }
 
+
+static inline void
+gen_rand(uint8_t *data, uint32_t len)
+{
+	uint32_t i;
+
+	for (i = 0; i < len; i++)
+		data[i] = (uint8_t)rte_rand();
+}
+
+static inline void
+switch_aead_enc_to_dec(struct aead_test_data *tdata,
+		struct cpu_crypto_test_case *tcase,
+		enum buffer_assemble_option sgl_option)
+{
+	uint32_t i;
+	uint8_t *dst = tdata->ciphertext.data;
+
+	switch (sgl_option) {
+	case SGL_ONE_SEG:
+		memcpy(dst, tcase->seg_buf[0].seg, tcase->seg_buf[0].seg_len);
+		tdata->ciphertext.len = tcase->seg_buf[0].seg_len;
+		break;
+	case SGL_MAX_SEG:
+		tdata->ciphertext.len = 0;
+		for (i = 0; i < MAX_NB_SIGMENTS; i++) {
+			memcpy(dst, tcase->seg_buf[i].seg,
+					tcase->seg_buf[i].seg_len);
+			tdata->ciphertext.len += tcase->seg_buf[i].seg_len;
+		}
+		break;
+	}
+
+	memcpy(tdata->auth_tag.data, tcase->digest, tdata->auth_tag.len);
+}
+
+static int
+cpu_crypto_test_aead_perf(enum buffer_assemble_option sgl_option,
+		uint32_t key_sz)
+{
+	struct aead_test_data tdata = {0};
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	uint64_t hz = rte_get_tsc_hz(), time_start, time_now;
+	double rate, cycles_per_buf;
+	uint32_t test_data_szs[] = {64, 128, 256, 512, 1024, 2048};
+	uint32_t i, j;
+	uint8_t aad[16];
+	int ret;
+
+	tdata.key.len = key_sz;
+	gen_rand(tdata.key.data, tdata.key.len);
+	tdata.algo = RTE_CRYPTO_AEAD_AES_GCM;
+	tdata.aad.data = aad;
+
+	ut_params->sess = create_aead_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			RTE_CRYPTO_AEAD_OP_DECRYPT,
+			&tdata,
+			0);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(MAX_NUM_OPS_INFLIGHT);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < RTE_DIM(test_data_szs); i++) {
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tdata.plaintext.len = test_data_szs[i];
+			gen_rand(tdata.plaintext.data,
+					tdata.plaintext.len);
+
+			tdata.aad.len = 12;
+			gen_rand(tdata.aad.data, tdata.aad.len);
+
+			tdata.auth_tag.len = 16;
+
+			tdata.iv.len = 16;
+			gen_rand(tdata.iv.data, tdata.iv.len);
+
+			tcase = ut_params->test_datas[j];
+			ret = assemble_aead_buf(tcase, obj, j,
+					RTE_CRYPTO_AEAD_OP_ENCRYPT,
+					&tdata, sgl_option, 0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		/* warm up cache */
+		for (j = 0; j < CACHE_WARM_ITER; j++)
+			run_test(ts_params->ctx, ut_params->sess, obj,
+					MAX_NUM_OPS_INFLIGHT);
+
+		time_start = rte_rdtsc();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_rdtsc();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("AES-GCM-%u(%4uB) Enc %03.3fMpps (%03.3fGbps) ",
+				key_sz * 8, test_data_szs[i], rate,
+				rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+				cycles_per_buf,
+				cycles_per_buf / test_data_szs[i]);
+
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tcase = ut_params->test_datas[j];
+
+			switch_aead_enc_to_dec(&tdata, tcase, sgl_option);
+			ret = assemble_aead_buf(tcase, obj, j,
+					RTE_CRYPTO_AEAD_OP_DECRYPT,
+					&tdata, sgl_option, 0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		time_start = rte_get_timer_cycles();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_get_timer_cycles();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("AES-GCM-%u(%4uB) Dec %03.3fMpps (%03.3fGbps) ",
+				key_sz * 8, test_data_szs[i], rate,
+				rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+				cycles_per_buf,
+				cycles_per_buf / test_data_szs[i]);
+	}
+
+	return 0;
+}
+
+/* test-prefix/key-size/sgl-type */
+#define all_gcm_perf_test_cases(type)					\
+	TEST_EXPAND(_128, 16, type)					\
+	TEST_EXPAND(_192, 24, type)					\
+	TEST_EXPAND(_256, 32, type)
+
+#define TEST_EXPAND(a, b, c)						\
+static int								\
+cpu_crypto_gcm_perf##a##_##c(void)					\
+{									\
+	return cpu_crypto_test_aead_perf(c, b);				\
+}									\
+
+all_gcm_perf_test_cases(SGL_ONE_SEG)
+all_gcm_perf_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesgcm_perf_testsuite  = {
+		.suite_name = "Security CPU Crypto AESNI-GCM Perf Test Suite",
+		.setup = testsuite_setup,
+		.teardown = testsuite_teardown,
+		.unit_test_cases = {
+#define TEST_EXPAND(a, b, c)						\
+		TEST_CASE_ST(ut_setup, ut_teardown,			\
+				cpu_crypto_gcm_perf##a##_##c),		\
+
+		all_gcm_perf_test_cases(SGL_ONE_SEG)
+		all_gcm_perf_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+		TEST_CASES_END() /**< NULL terminate unit test array */
+		},
+};
+
+static int
+test_security_cpu_crypto_aesni_gcm_perf(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD));
+
+	return unit_test_suite_runner(
+			&security_cpu_crypto_aesgcm_perf_testsuite);
+}
+
 REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
 		test_security_cpu_crypto_aesni_gcm);
+
+REGISTER_TEST_COMMAND(security_aesni_gcm_perftest,
+		test_security_cpu_crypto_aesni_gcm_perf);
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [RFC PATCH 5/9] crypto/aesni_mb: add rte_security handler
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
                   ` (3 preceding siblings ...)
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 4/9] app/test: add security cpu crypto perftest Fan Zhang
@ 2019-09-03 15:40 ` Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 6/9] app/test: add aesni_mb security cpu crypto autotest Fan Zhang
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

This patch adds rte_security support to the AESNI-MB PMD. The PMD now
initializes a security context instance, creates/deletes PMD-specific
security sessions, and processes crypto workloads in synchronous mode.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c         | 291 ++++++++++++++++++++-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c     |  91 ++++++-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |  21 +-
 3 files changed, 398 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
index b495a9679..68767c04e 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
@@ -8,6 +8,8 @@
 #include <rte_hexdump.h>
 #include <rte_cryptodev.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security.h>
+#include <rte_security_driver.h>
 #include <rte_bus_vdev.h>
 #include <rte_malloc.h>
 #include <rte_cpuflags.h>
@@ -789,6 +791,167 @@ auth_start_offset(struct rte_crypto_op *op, struct aesni_mb_session *session,
 			(UINT64_MAX - u_src + u_dst + 1);
 }
 
+union sec_userdata_field {
+	int status;
+	struct {
+		uint16_t is_gen_digest;
+		uint16_t digest_len;
+	};
+};
+
+struct sec_udata_digest_field {
+	uint32_t is_digest_gen;
+	uint32_t digest_len;
+};
+
+static inline int
+set_mb_job_params_sec(JOB_AES_HMAC *job, struct aesni_mb_sec_session *sec_sess,
+		void *buf, uint32_t buf_len, void *iv, void *aad, void *digest,
+		int *status, uint8_t *digest_idx)
+{
+	struct aesni_mb_session *session = &sec_sess->sess;
+	uint32_t cipher_offset = sec_sess->cipher_offset;
+	void *user_digest = NULL;
+	union sec_userdata_field udata;
+
+	if (unlikely(cipher_offset > buf_len))
+		return -EINVAL;
+
+	/* Set crypto operation */
+	job->chain_order = session->chain_order;
+
+	/* Set cipher parameters */
+	job->cipher_direction = session->cipher.direction;
+	job->cipher_mode = session->cipher.mode;
+
+	job->aes_key_len_in_bytes = session->cipher.key_length_in_bytes;
+
+	/* Set authentication parameters */
+	job->hash_alg = session->auth.algo;
+	job->iv = iv;
+
+	switch (job->hash_alg) {
+	case AES_XCBC:
+		job->u.XCBC._k1_expanded = session->auth.xcbc.k1_expanded;
+		job->u.XCBC._k2 = session->auth.xcbc.k2;
+		job->u.XCBC._k3 = session->auth.xcbc.k3;
+
+		job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+		job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		break;
+
+	case AES_CCM:
+		job->u.CCM.aad = (uint8_t *)aad + 18;
+		job->u.CCM.aad_len_in_bytes = session->aead.aad_len;
+		job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+		job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		job->iv++;
+		break;
+
+	case AES_CMAC:
+		job->u.CMAC._key_expanded = session->auth.cmac.expkey;
+		job->u.CMAC._skey1 = session->auth.cmac.skey1;
+		job->u.CMAC._skey2 = session->auth.cmac.skey2;
+		job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+		job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		break;
+
+	case AES_GMAC:
+		if (session->cipher.mode == GCM) {
+			job->u.GCM.aad = aad;
+			job->u.GCM.aad_len_in_bytes = session->aead.aad_len;
+		} else {
+			/* For GMAC */
+			job->u.GCM.aad = aad;
+			job->u.GCM.aad_len_in_bytes = buf_len;
+			job->cipher_mode = GCM;
+		}
+		job->aes_enc_key_expanded = &session->cipher.gcm_key;
+		job->aes_dec_key_expanded = &session->cipher.gcm_key;
+		break;
+
+	default:
+		job->u.HMAC._hashed_auth_key_xor_ipad =
+				session->auth.pads.inner;
+		job->u.HMAC._hashed_auth_key_xor_opad =
+				session->auth.pads.outer;
+
+		if (job->cipher_mode == DES3) {
+			job->aes_enc_key_expanded =
+				session->cipher.exp_3des_keys.ks_ptr;
+			job->aes_dec_key_expanded =
+				session->cipher.exp_3des_keys.ks_ptr;
+		} else {
+			job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+			job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		}
+	}
+
+	/* Set digest output location */
+	if (job->hash_alg != NULL_HASH &&
+			session->auth.operation == RTE_CRYPTO_AUTH_OP_VERIFY) {
+		job->auth_tag_output = sec_sess->temp_digests[*digest_idx];
+		*digest_idx = (*digest_idx + 1) % MAX_JOBS;
+
+		udata.is_gen_digest = 0;
+		udata.digest_len = session->auth.req_digest_len;
+		user_digest = (void *)digest;
+	} else {
+		udata.is_gen_digest = 1;
+		udata.digest_len = session->auth.req_digest_len;
+
+		if (session->auth.req_digest_len !=
+				session->auth.gen_digest_len) {
+			job->auth_tag_output =
+					sec_sess->temp_digests[*digest_idx];
+			*digest_idx = (*digest_idx + 1) % MAX_JOBS;
+
+			user_digest = (void *)digest;
+		} else
+			job->auth_tag_output = digest;
+
+		/* A bit of a hack here: the job structure only supports
+		 * 2 user data fields while we need 4 params to be passed
+		 * (status, direction, digest for verify, and length of
+		 * digest), so we temporarily store the digest length and
+		 * direction in the status value to avoid allocating a
+		 * longer buffer to hold all 4 params.
+		 */
+		*status = udata.status;
+	}
+	/*
+	 * The multi-buffer library currently only supports returning a
+	 * truncated digest length as specified in the relevant IPsec RFCs.
+	 */
+
+	/* Set digest length */
+	job->auth_tag_output_len_in_bytes = session->auth.gen_digest_len;
+
+	/* Set IV parameters */
+	job->iv_len_in_bytes = session->iv.length;
+
+	/* Data Parameters */
+	job->src = buf;
+	job->dst = buf;
+	job->cipher_start_src_offset_in_bytes = cipher_offset;
+	job->msg_len_to_cipher_in_bytes = buf_len - cipher_offset;
+	job->hash_start_src_offset_in_bytes = 0;
+	job->msg_len_to_hash_in_bytes = buf_len;
+
+	job->user_data = (void *)status;
+	job->user_data2 = user_digest;
+
+	return 0;
+}
+
 /**
  * Process a crypto operation and complete a JOB_AES_HMAC job structure for
  * submission to the multi buffer library for processing.
@@ -1081,6 +1244,37 @@ post_process_mb_job(struct aesni_mb_qp *qp, JOB_AES_HMAC *job)
 	return op;
 }
 
+static inline void
+post_process_mb_sec_job(JOB_AES_HMAC *job)
+{
+	void *user_digest = job->user_data2;
+	int *status = job->user_data;
+	union sec_userdata_field udata;
+
+	switch (job->status) {
+	case STS_COMPLETED:
+		if (user_digest) {
+			udata.status = *status;
+
+			if (udata.is_gen_digest) {
+				*status = RTE_CRYPTO_OP_STATUS_SUCCESS;
+				memcpy(user_digest, job->auth_tag_output,
+						udata.digest_len);
+			} else {
+				verify_digest(job, user_digest,
+					udata.digest_len, (uint8_t *)status);
+
+				if (*status == RTE_CRYPTO_OP_STATUS_AUTH_FAILED)
+					*status = -1;
+			}
+		} else
+			*status = RTE_CRYPTO_OP_STATUS_SUCCESS;
+		break;
+	default:
+		*status = RTE_CRYPTO_OP_STATUS_ERROR;
+	}
+}
+
 /**
  * Process a completed JOB_AES_HMAC job and keep processing jobs until
  * get_completed_job return NULL
@@ -1117,6 +1311,32 @@ handle_completed_jobs(struct aesni_mb_qp *qp, JOB_AES_HMAC *job,
 	return processed_jobs;
 }
 
+static inline uint32_t
+handle_completed_sec_jobs(JOB_AES_HMAC *job, MB_MGR *mb_mgr)
+{
+	uint32_t processed = 0;
+
+	while (job != NULL) {
+		post_process_mb_sec_job(job);
+		job = IMB_GET_COMPLETED_JOB(mb_mgr);
+		processed++;
+	}
+
+	return processed;
+}
+
+static inline uint32_t
+flush_mb_sec_mgr(MB_MGR *mb_mgr)
+{
+	JOB_AES_HMAC *job = IMB_FLUSH_JOB(mb_mgr);
+	uint32_t processed = 0;
+
+	if (job)
+		processed = handle_completed_sec_jobs(job, mb_mgr);
+
+	return processed;
+}
+
 static inline uint16_t
 flush_mb_mgr(struct aesni_mb_qp *qp, struct rte_crypto_op **ops,
 		uint16_t nb_ops)
@@ -1220,6 +1440,55 @@ aesni_mb_pmd_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
 	return processed_jobs;
 }
 
+void
+aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num)
+{
+	struct aesni_mb_sec_session *sec_sess = sess->sess_private_data;
+	JOB_AES_HMAC *job;
+	uint8_t digest_idx = sec_sess->digest_idx;
+	uint32_t i, processed = 0;
+	int ret;
+
+	for (i = 0; i < num; i++) {
+		void *seg_buf = buf[i].vec[0].iov_base;
+		uint32_t buf_len = buf[i].vec[0].iov_len;
+
+		job = IMB_GET_NEXT_JOB(sec_sess->mb_mgr);
+		if (unlikely(job == NULL)) {
+			processed += flush_mb_sec_mgr(sec_sess->mb_mgr);
+
+			job = IMB_GET_NEXT_JOB(sec_sess->mb_mgr);
+			if (!job)
+				return;
+		}
+
+		ret = set_mb_job_params_sec(job, sec_sess, seg_buf, buf_len,
+				iv[i], aad[i], digest[i], &status[i],
+				&digest_idx);
+		if (ret) {
+			processed++;
+			status[i] = ret;
+			continue;
+		}
+
+		/* Submit job to multi-buffer for processing */
+#ifdef RTE_LIBRTE_PMD_AESNI_MB_DEBUG
+		job = IMB_SUBMIT_JOB(sec_sess->mb_mgr);
+#else
+		job = IMB_SUBMIT_JOB_NOCHECK(sec_sess->mb_mgr);
+#endif
+
+		if (job)
+			processed += handle_completed_sec_jobs(job,
+					sec_sess->mb_mgr);
+	}
+
+	while (processed < num)
+		processed += flush_mb_sec_mgr(sec_sess->mb_mgr);
+}
+
 static int cryptodev_aesni_mb_remove(struct rte_vdev_device *vdev);
 
 static int
@@ -1229,8 +1498,10 @@ cryptodev_aesni_mb_create(const char *name,
 {
 	struct rte_cryptodev *dev;
 	struct aesni_mb_private *internals;
+	struct rte_security_ctx *sec_ctx = NULL;
 	enum aesni_mb_vector_mode vector_mode;
 	MB_MGR *mb_mgr;
+	char sec_name[RTE_DEV_NAME_MAX_LEN];
 
 	/* Check CPU for support for AES instruction set */
 	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
@@ -1264,7 +1535,8 @@ cryptodev_aesni_mb_create(const char *name,
 	dev->feature_flags = RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO |
 			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
 			RTE_CRYPTODEV_FF_CPU_AESNI |
-			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
+			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
+			RTE_CRYPTODEV_FF_SECURITY;
 
 
 	mb_mgr = alloc_mb_mgr(0);
@@ -1303,11 +1575,28 @@ cryptodev_aesni_mb_create(const char *name,
 	AESNI_MB_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
 			imb_get_version_str());
 
+	/* setup security operations */
+	snprintf(sec_name, sizeof(sec_name) - 1, "aes_mb_sec_%u",
+			dev->driver_id);
+	sec_ctx = rte_zmalloc_socket(sec_name,
+			sizeof(struct rte_security_ctx),
+			RTE_CACHE_LINE_SIZE, init_params->socket_id);
+	if (sec_ctx == NULL) {
+		AESNI_MB_LOG(ERR, "memory allocation failed\n");
+		goto error_exit;
+	}
+
+	sec_ctx->device = (void *)dev;
+	sec_ctx->ops = rte_aesni_mb_pmd_security_ops;
+	dev->security_ctx = sec_ctx;
+
 	return 0;
 
 error_exit:
 	if (mb_mgr)
 		free_mb_mgr(mb_mgr);
+	if (sec_ctx)
+		rte_free(sec_ctx);
 
 	rte_cryptodev_pmd_destroy(dev);
 
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
index 8d15b99d4..ca6cea775 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
@@ -8,6 +8,7 @@
 #include <rte_common.h>
 #include <rte_malloc.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security_driver.h>
 
 #include "rte_aesni_mb_pmd_private.h"
 
@@ -732,7 +733,8 @@ aesni_mb_pmd_qp_count(struct rte_cryptodev *dev)
 static unsigned
 aesni_mb_pmd_sym_session_get_size(struct rte_cryptodev *dev __rte_unused)
 {
-	return sizeof(struct aesni_mb_session);
+	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_session),
+			RTE_CACHE_LINE_SIZE);
 }
 
 /** Configure a aesni multi-buffer session from a crypto xform chain */
@@ -810,4 +812,91 @@ struct rte_cryptodev_ops aesni_mb_pmd_ops = {
 		.sym_session_clear	= aesni_mb_pmd_sym_session_clear
 };
 
+/** Configure an AESNI-MB security session from a crypto xform chain */
+
+static int
+aesni_mb_security_session_create(void *dev,
+		struct rte_security_session_conf *conf,
+		struct rte_security_session *sess,
+		struct rte_mempool *mempool)
+{
+	struct rte_cryptodev *cdev = dev;
+	struct aesni_mb_private *internals = cdev->data->dev_private;
+	struct aesni_mb_sec_session *sess_priv;
+	int ret;
+
+	if (!conf->crypto_xform) {
+		AESNI_MB_LOG(ERR, "Invalid security session conf");
+		return -EINVAL;
+	}
+
+	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
+		AESNI_MB_LOG(ERR,
+				"Couldn't get object from session mempool");
+		return -ENOMEM;
+	}
+
+	sess_priv->mb_mgr = internals->mb_mgr;
+	if (sess_priv->mb_mgr == NULL) {
+		rte_mempool_put(mempool, sess_priv);
+		return -ENOMEM;
+	}
+
+	sess_priv->cipher_offset = conf->cpucrypto.cipher_offset;
+
+	ret = aesni_mb_set_session_parameters(sess_priv->mb_mgr,
+			&sess_priv->sess, conf->crypto_xform);
+	if (ret != 0) {
+		AESNI_MB_LOG(ERR, "failed to configure session parameters");
+
+		rte_mempool_put(mempool, sess_priv);
+	}
+
+	sess->sess_private_data = (void *)sess_priv;
+
+	return ret;
+}
+
+static int
+aesni_mb_security_session_destroy(void *dev __rte_unused,
+		struct rte_security_session *sess)
+{
+	struct aesni_mb_sec_session *sess_priv =
+			get_sec_session_private_data(sess);
+
+	if (sess_priv) {
+		struct rte_mempool *sess_mp = rte_mempool_from_obj(
+				(void *)sess_priv);
+
+		memset(sess_priv, 0, sizeof(struct aesni_mb_sec_session));
+		set_sec_session_private_data(sess, NULL);
+
+		if (sess_mp == NULL) {
+			AESNI_MB_LOG(ERR, "failed to fetch session mempool");
+			return -EINVAL;
+		}
+
+		rte_mempool_put(sess_mp, sess_priv);
+	}
+
+	return 0;
+}
+
+static unsigned int
+aesni_mb_sec_session_get_size(__rte_unused void *device)
+{
+	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_sec_session),
+			RTE_CACHE_LINE_SIZE);
+}
+
+static struct rte_security_ops aesni_mb_security_ops = {
+		.session_create = aesni_mb_security_session_create,
+		.session_get_size = aesni_mb_sec_session_get_size,
+		.session_update = NULL,
+		.session_stats_get = NULL,
+		.session_destroy = aesni_mb_security_session_destroy,
+		.set_pkt_metadata = NULL,
+		.capabilities_get = NULL,
+		.process_cpu_crypto_bulk = aesni_mb_sec_crypto_process_bulk,
+};
+
 struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops = &aesni_mb_pmd_ops;
+struct rte_security_ops *rte_aesni_mb_pmd_security_ops = &aesni_mb_security_ops;
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
index b794d4bc1..d1cf416ab 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
@@ -176,7 +176,6 @@ struct aesni_mb_qp {
 	 */
 } __rte_cache_aligned;
 
-/** AES-NI multi-buffer private session structure */
 struct aesni_mb_session {
 	JOB_CHAIN_ORDER chain_order;
 	struct {
@@ -265,16 +264,32 @@ struct aesni_mb_session {
 		/** AAD data length */
 		uint16_t aad_len;
 	} aead;
-} __rte_cache_aligned;
+};
+
+/** AES-NI multi-buffer private security session structure */
+struct aesni_mb_sec_session {
+	/** Symmetric crypto session parameters */
+	struct aesni_mb_session sess;
+	uint8_t temp_digests[MAX_JOBS][DIGEST_LENGTH_MAX];
+	uint16_t digest_idx;
+	uint32_t cipher_offset;
+	MB_MGR *mb_mgr;
+};
 
 extern int
 aesni_mb_set_session_parameters(const MB_MGR *mb_mgr,
 		struct aesni_mb_session *sess,
 		const struct rte_crypto_sym_xform *xform);
 
+extern void
+aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
 /** device specific operations function pointer structure */
 extern struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops;
 
-
+/** device specific operations function pointer structure for rte_security */
+extern struct rte_security_ops *rte_aesni_mb_pmd_security_ops;
 
 #endif /* _RTE_AESNI_MB_PMD_PRIVATE_H_ */
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [RFC PATCH 6/9] app/test: add aesni_mb security cpu crypto autotest
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
                   ` (4 preceding siblings ...)
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 5/9] crypto/aesni_mb: add rte_security handler Fan Zhang
@ 2019-09-03 15:40 ` Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 7/9] app/test: add aesni_mb security cpu crypto perftest Fan Zhang
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

This patch adds a CPU crypto unit test for the AESNI-MB PMD.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/test_security_cpu_crypto.c | 367 ++++++++++++++++++++++++++++++++++++
 1 file changed, 367 insertions(+)

diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
index ca9a8dae6..0ea406390 100644
--- a/app/test/test_security_cpu_crypto.c
+++ b/app/test/test_security_cpu_crypto.c
@@ -19,12 +19,23 @@
 
 #include "test.h"
 #include "test_cryptodev.h"
+#include "test_cryptodev_blockcipher.h"
+#include "test_cryptodev_aes_test_vectors.h"
 #include "test_cryptodev_aead_test_vectors.h"
+#include "test_cryptodev_des_test_vectors.h"
+#include "test_cryptodev_hash_test_vectors.h"
 
 #define CPU_CRYPTO_TEST_MAX_AAD_LENGTH	16
 #define MAX_NB_SIGMENTS			4
 #define CACHE_WARM_ITER			2048
 
+#define TOP_ENC		BLOCKCIPHER_TEST_OP_ENCRYPT
+#define TOP_DEC		BLOCKCIPHER_TEST_OP_DECRYPT
+#define TOP_AUTH_GEN	BLOCKCIPHER_TEST_OP_AUTH_GEN
+#define TOP_AUTH_VER	BLOCKCIPHER_TEST_OP_AUTH_VERIFY
+#define TOP_ENC_AUTH	BLOCKCIPHER_TEST_OP_ENC_AUTH_GEN
+#define TOP_AUTH_DEC	BLOCKCIPHER_TEST_OP_AUTH_VERIFY_DEC
+
 enum buffer_assemble_option {
 	SGL_MAX_SEG,
 	SGL_ONE_SEG,
@@ -516,6 +527,11 @@ cpu_crypto_test_aead(const struct aead_test_data *tdata,
 	TEST_EXPAND(gcm_test_case_256_6, type)	\
 	TEST_EXPAND(gcm_test_case_256_7, type)
 
+/* test-vector/sgl-option */
+#define all_ccm_unit_test_cases \
+	TEST_EXPAND(ccm_test_case_128_1, SGL_ONE_SEG) \
+	TEST_EXPAND(ccm_test_case_128_2, SGL_ONE_SEG) \
+	TEST_EXPAND(ccm_test_case_128_3, SGL_ONE_SEG)
 
 #define TEST_EXPAND(t, o)						\
 static int								\
@@ -531,6 +547,7 @@ cpu_crypto_aead_dec_test_##t##_##o(void)				\
 
 all_gcm_unit_test_cases(SGL_ONE_SEG)
 all_gcm_unit_test_cases(SGL_MAX_SEG)
+all_ccm_unit_test_cases
 #undef TEST_EXPAND
 
 static struct unit_test_suite security_cpu_crypto_aesgcm_testsuite  = {
@@ -758,8 +775,358 @@ test_security_cpu_crypto_aesni_gcm_perf(void)
 			&security_cpu_crypto_aesgcm_perf_testsuite);
 }
 
+static struct rte_security_session *
+create_blockcipher_session(struct rte_security_ctx *ctx,
+		struct rte_mempool *sess_mp,
+		uint32_t op_mask,
+		const struct blockcipher_test_data *test_data,
+		uint32_t is_unit_test)
+{
+	struct rte_security_session_conf sess_conf = {0};
+	struct rte_crypto_sym_xform xforms[2] = { {0} };
+	struct rte_crypto_sym_xform *cipher_xform = NULL;
+	struct rte_crypto_sym_xform *auth_xform = NULL;
+	struct rte_crypto_sym_xform *xform;
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_CIPHER) {
+		cipher_xform = &xforms[0];
+		cipher_xform->type = RTE_CRYPTO_SYM_XFORM_CIPHER;
+
+		if (op_mask & TOP_ENC)
+			cipher_xform->cipher.op =
+				RTE_CRYPTO_CIPHER_OP_ENCRYPT;
+		else
+			cipher_xform->cipher.op =
+				RTE_CRYPTO_CIPHER_OP_DECRYPT;
+
+		cipher_xform->cipher.algo = test_data->crypto_algo;
+		cipher_xform->cipher.key.data = test_data->cipher_key.data;
+		cipher_xform->cipher.key.length = test_data->cipher_key.len;
+		cipher_xform->cipher.iv.offset = 0;
+		cipher_xform->cipher.iv.length = test_data->iv.len;
+
+		if (is_unit_test)
+			debug_hexdump(stdout, "cipher key:",
+					test_data->cipher_key.data,
+					test_data->cipher_key.len);
+	}
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_AUTH) {
+		auth_xform = &xforms[1];
+		auth_xform->type = RTE_CRYPTO_SYM_XFORM_AUTH;
+
+		if (op_mask & TOP_AUTH_GEN)
+			auth_xform->auth.op = RTE_CRYPTO_AUTH_OP_GENERATE;
+		else
+			auth_xform->auth.op = RTE_CRYPTO_AUTH_OP_VERIFY;
+
+		auth_xform->auth.algo = test_data->auth_algo;
+		auth_xform->auth.key.length = test_data->auth_key.len;
+		auth_xform->auth.key.data = test_data->auth_key.data;
+		auth_xform->auth.digest_length = test_data->digest.len;
+
+		if (is_unit_test)
+			debug_hexdump(stdout, "auth key:",
+					test_data->auth_key.data,
+					test_data->auth_key.len);
+	}
+
+	if (op_mask == TOP_ENC ||
+			op_mask == TOP_DEC)
+		xform = cipher_xform;
+	else if (op_mask == TOP_AUTH_GEN ||
+			op_mask == TOP_AUTH_VER)
+		xform = auth_xform;
+	else if (op_mask == TOP_ENC_AUTH) {
+		xform = cipher_xform;
+		xform->next = auth_xform;
+	} else if (op_mask == TOP_AUTH_DEC) {
+		xform = auth_xform;
+		xform->next = cipher_xform;
+	} else
+		return NULL;
+
+	if (test_data->cipher_offset < test_data->auth_offset)
+		return NULL;
+
+	sess_conf.action_type = RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO;
+	sess_conf.crypto_xform = xform;
+	sess_conf.cpucrypto.cipher_offset = test_data->cipher_offset -
+			test_data->auth_offset;
+
+	return rte_security_session_create(ctx, &sess_conf, sess_mp);
+}
+
+static inline int
+assemble_blockcipher_buf(struct cpu_crypto_test_case *data,
+		struct cpu_crypto_test_obj *obj,
+		uint32_t obj_idx,
+		uint32_t op_mask,
+		const struct blockcipher_test_data *test_data,
+		uint32_t is_unit_test)
+{
+	const uint8_t *src;
+	uint32_t src_len;
+	uint32_t offset;
+
+	if (op_mask == TOP_ENC_AUTH ||
+			op_mask == TOP_AUTH_GEN ||
+			op_mask == TOP_AUTH_VER)
+		offset = test_data->auth_offset;
+	else
+		offset = test_data->cipher_offset;
+
+	if (op_mask & TOP_ENC_AUTH) {
+		src = test_data->plaintext.data;
+		src_len = test_data->plaintext.len;
+		if (is_unit_test)
+			debug_hexdump(stdout, "plaintext:", src, src_len);
+	} else {
+		src = test_data->ciphertext.data;
+		src_len = test_data->ciphertext.len;
+		memcpy(data->digest, test_data->digest.data,
+				test_data->digest.len);
+		if (is_unit_test) {
+			debug_hexdump(stdout, "ciphertext:", src, src_len);
+			debug_hexdump(stdout, "digest:", test_data->digest.data,
+					test_data->digest.len);
+		}
+	}
+
+	if (src_len > MBUF_DATAPAYLOAD_SIZE)
+		return -ENOMEM;
+
+	memcpy(data->seg_buf[0].seg, src, src_len);
+	data->seg_buf[0].seg_len = src_len;
+	obj->vec[obj_idx][0].iov_base =
+			(void *)(data->seg_buf[0].seg + offset);
+	obj->vec[obj_idx][0].iov_len = src_len - offset;
+
+	obj->sec_buf[obj_idx].vec = obj->vec[obj_idx];
+	obj->sec_buf[obj_idx].num = 1;
+
+	memcpy(data->iv, test_data->iv.data, test_data->iv.len);
+	if (is_unit_test)
+		debug_hexdump(stdout, "iv:", test_data->iv.data,
+				test_data->iv.len);
+
+	obj->iv[obj_idx] = (void *)data->iv;
+	obj->digest[obj_idx] = (void *)data->digest;
+
+	return 0;
+}
+
+static int
+check_blockcipher_result(struct cpu_crypto_test_case *tcase,
+		uint32_t op_mask,
+		const struct blockcipher_test_data *test_data)
+{
+	int ret;
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_CIPHER) {
+		const char *err_msg1, *err_msg2;
+		const uint8_t *src_pt_ct;
+		uint32_t src_len;
+
+		if (op_mask & TOP_ENC) {
+			src_pt_ct = test_data->ciphertext.data;
+			src_len = test_data->ciphertext.len;
+			err_msg1 = CPU_CRYPTO_ERR_EXP_CT;
+			err_msg2 = CPU_CRYPTO_ERR_GEN_CT;
+		} else {
+			src_pt_ct = test_data->plaintext.data;
+			src_len = test_data->plaintext.len;
+			err_msg1 = CPU_CRYPTO_ERR_EXP_PT;
+			err_msg2 = CPU_CRYPTO_ERR_GEN_PT;
+		}
+
+		ret = memcmp(tcase->seg_buf[0].seg, src_pt_ct, src_len);
+		if (ret != 0) {
+			debug_hexdump(stdout, err_msg1, src_pt_ct, src_len);
+			debug_hexdump(stdout, err_msg2,
+					tcase->seg_buf[0].seg,
+					src_len);
+			return -1;
+		}
+	}
+
+	if (op_mask & TOP_AUTH_GEN) {
+		ret = memcmp(tcase->digest, test_data->digest.data,
+				test_data->digest.len);
+		if (ret != 0) {
+			debug_hexdump(stdout, "expect digest:",
+					test_data->digest.data,
+					test_data->digest.len);
+			debug_hexdump(stdout, "gen digest:",
+					tcase->digest,
+					test_data->digest.len);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
+cpu_crypto_test_blockcipher(const struct blockcipher_test_data *tdata,
+		uint32_t op_mask)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	int ret;
+
+	ut_params->sess = create_blockcipher_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			op_mask,
+			tdata,
+			1);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(1);
+	if (ret)
+		return ret;
+
+	tcase = ut_params->test_datas[0];
+	ret = assemble_blockcipher_buf(tcase, obj, 0, op_mask, tdata, 1);
+	if (ret < 0) {
+		printf("Failed to assemble the test buffer\n");
+		return ret;
+	}
+
+	run_test(ts_params->ctx, ut_params->sess, obj, 1);
+
+	ret = check_status(obj, 1);
+	if (ret < 0)
+		return ret;
+
+	ret = check_blockcipher_result(tcase, op_mask, tdata);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+/* Macro to reduce boilerplate when defining blockcipher test cases */
+/* test-vector-name/op */
+#define all_blockcipher_test_cases \
+	TEST_EXPAND(aes_test_data_1, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_1, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_1, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_1, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_2, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_2, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_2, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_2, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_3, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_3, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_3, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_3, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_4, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_4, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_4, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_4, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_5, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_5, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_5, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_5, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_6, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_6, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_6, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_6, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_7, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_7, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_7, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_7, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_8, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_8, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_8, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_8, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_9, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_9, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_9, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_9, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_10, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_10, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_11, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_11, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_12, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_12, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_12, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_12, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_13, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_13, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_13, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_13, TOP_AUTH_DEC) \
+	TEST_EXPAND(des_test_data_1, TOP_ENC) \
+	TEST_EXPAND(des_test_data_1, TOP_DEC) \
+	TEST_EXPAND(des_test_data_2, TOP_ENC) \
+	TEST_EXPAND(des_test_data_2, TOP_DEC) \
+	TEST_EXPAND(des_test_data_3, TOP_ENC) \
+	TEST_EXPAND(des_test_data_3, TOP_DEC) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_DEC) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_ENC_AUTH) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_AUTH_DEC) \
+	TEST_EXPAND(triple_des64cbc_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des64cbc_test_vector, TOP_DEC) \
+	TEST_EXPAND(triple_des128cbc_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des128cbc_test_vector, TOP_DEC) \
+	TEST_EXPAND(triple_des192cbc_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des192cbc_test_vector, TOP_DEC)
+
+#define TEST_EXPAND(t, o)						\
+static int								\
+cpu_crypto_blockcipher_test_##t##_##o(void)				\
+{									\
+	return cpu_crypto_test_blockcipher(&t, o);			\
+}
+
+all_blockcipher_test_cases
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesni_mb_testsuite  = {
+	.suite_name = "Security CPU Crypto AESNI-MB Unit Test Suite",
+	.setup = testsuite_setup,
+	.teardown = testsuite_teardown,
+	.unit_test_cases = {
+#define TEST_EXPAND(t, o)						\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_enc_test_##t##_##o),		\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_dec_test_##t##_##o),		\
+
+	all_gcm_unit_test_cases(SGL_ONE_SEG)
+	all_ccm_unit_test_cases
+#undef TEST_EXPAND
+
+#define TEST_EXPAND(t, o)						\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_blockcipher_test_##t##_##o),		\
+
+	all_blockcipher_test_cases
+#undef TEST_EXPAND
+
+	TEST_CASES_END() /**< NULL terminate unit test array */
+	},
+};
+
+static int
+test_security_cpu_crypto_aesni_mb(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD));
+
+	return unit_test_suite_runner(&security_cpu_crypto_aesni_mb_testsuite);
+}
+
 REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
 		test_security_cpu_crypto_aesni_gcm);
 
 REGISTER_TEST_COMMAND(security_aesni_gcm_perftest,
 		test_security_cpu_crypto_aesni_gcm_perf);
+
+REGISTER_TEST_COMMAND(security_aesni_mb_autotest,
+		test_security_cpu_crypto_aesni_mb);
-- 
2.14.5



* [dpdk-dev] [RFC PATCH 7/9] app/test: add aesni_mb security cpu crypto perftest
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
                   ` (5 preceding siblings ...)
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 6/9] app/test: add aesni_mb security cpu crypto autotest Fan Zhang
@ 2019-09-03 15:40 ` Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 8/9] ipsec: add rte_security cpu_crypto action support Fan Zhang
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

Since the crypto perf application does not support rte_security, this
patch adds a simple AES-CBC-SHA1-HMAC CPU crypto performance test to the
crypto unittest application. The test covers different key and data
sizes with single-buffer test items, and reports throughput as well as
cycle-count performance information.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/test_security_cpu_crypto.c | 194 ++++++++++++++++++++++++++++++++++++
 1 file changed, 194 insertions(+)

diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
index 0ea406390..6e012672e 100644
--- a/app/test/test_security_cpu_crypto.c
+++ b/app/test/test_security_cpu_crypto.c
@@ -1122,6 +1122,197 @@ test_security_cpu_crypto_aesni_mb(void)
 	return unit_test_suite_runner(&security_cpu_crypto_aesni_mb_testsuite);
 }
 
+static inline void
+switch_blockcipher_enc_to_dec(struct blockcipher_test_data *tdata,
+		struct cpu_crypto_test_case *tcase, uint8_t *dst)
+{
+	memcpy(dst, tcase->seg_buf[0].seg, tcase->seg_buf[0].seg_len);
+	tdata->ciphertext.len = tcase->seg_buf[0].seg_len;
+	memcpy(tdata->digest.data, tcase->digest, tdata->digest.len);
+}
+
+static int
+cpu_crypto_test_blockcipher_perf(
+		const enum rte_crypto_cipher_algorithm cipher_algo,
+		uint32_t cipher_key_sz,
+		const enum rte_crypto_auth_algorithm auth_algo,
+		uint32_t auth_key_sz, uint32_t digest_sz,
+		uint32_t op_mask)
+{
+	struct blockcipher_test_data tdata = {0};
+	uint8_t plaintext[3000], ciphertext[3000];
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	uint64_t hz = rte_get_tsc_hz(), time_start, time_now;
+	double rate, cycles_per_buf;
+	uint32_t test_data_szs[] = {64, 128, 256, 512, 1024, 2048};
+	uint32_t i, j;
+	uint32_t op_mask_opp = 0;
+	int ret;
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_CIPHER)
+		op_mask_opp |= (~op_mask & BLOCKCIPHER_TEST_OP_CIPHER);
+	if (op_mask & BLOCKCIPHER_TEST_OP_AUTH)
+		op_mask_opp |= (~op_mask & BLOCKCIPHER_TEST_OP_AUTH);
+
+	tdata.plaintext.data = plaintext;
+	tdata.ciphertext.data = ciphertext;
+
+	tdata.cipher_key.len = cipher_key_sz;
+	tdata.auth_key.len = auth_key_sz;
+
+	gen_rand(tdata.cipher_key.data, cipher_key_sz / 8);
+	gen_rand(tdata.auth_key.data, auth_key_sz / 8);
+
+	tdata.crypto_algo = cipher_algo;
+	tdata.auth_algo = auth_algo;
+
+	tdata.digest.len = digest_sz;
+
+	ut_params->sess = create_blockcipher_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			op_mask,
+			&tdata,
+			0);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(MAX_NUM_OPS_INFLIGHT);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < RTE_DIM(test_data_szs); i++) {
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tdata.plaintext.len = test_data_szs[i];
+			gen_rand(plaintext, tdata.plaintext.len);
+
+			tdata.iv.len = 16;
+			gen_rand(tdata.iv.data, tdata.iv.len);
+
+			tcase = ut_params->test_datas[j];
+			ret = assemble_blockcipher_buf(tcase, obj, j,
+					op_mask,
+					&tdata,
+					0);
+			if (ret < 0) {
+				printf("Failed to assemble the test buffer\n");
+				return ret;
+			}
+		}
+
+		/* warm up cache */
+		for (j = 0; j < CACHE_WARM_ITER; j++)
+			run_test(ts_params->ctx, ut_params->sess, obj,
+					MAX_NUM_OPS_INFLIGHT);
+
+		time_start = rte_rdtsc();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_rdtsc();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("%s-%u-%s(%4uB) Enc %03.3fMpps (%03.3fGbps) ",
+			rte_crypto_cipher_algorithm_strings[cipher_algo],
+			cipher_key_sz * 8,
+			rte_crypto_auth_algorithm_strings[auth_algo],
+			test_data_szs[i],
+			rate, rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+			cycles_per_buf, cycles_per_buf / test_data_szs[i]);
+
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tcase = ut_params->test_datas[j];
+
+			switch_blockcipher_enc_to_dec(&tdata, tcase,
+					ciphertext);
+			ret = assemble_blockcipher_buf(tcase, obj, j,
+					op_mask_opp,
+					&tdata,
+					0);
+			if (ret < 0) {
+				printf("Failed to assemble the test buffer\n");
+				return ret;
+			}
+		}
+
+		time_start = rte_get_timer_cycles();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_get_timer_cycles();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("%s-%u-%s(%4uB) Dec %03.3fMpps (%03.3fGbps) ",
+			rte_crypto_cipher_algorithm_strings[cipher_algo],
+			cipher_key_sz * 8,
+			rte_crypto_auth_algorithm_strings[auth_algo],
+			test_data_szs[i],
+			rate, rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+				cycles_per_buf,
+				cycles_per_buf / test_data_szs[i]);
+	}
+
+	return 0;
+}
+
+/* cipher-algo/cipher-key-len/auth-algo/auth-key-len/digest-len/op */
+#define all_block_cipher_perf_test_cases				\
+	TEST_EXPAND(_AES_CBC, 128, _NULL, 0, 0, TOP_ENC)		\
+	TEST_EXPAND(_NULL, 0, _SHA1_HMAC, 160, 20, TOP_AUTH_GEN)	\
+	TEST_EXPAND(_AES_CBC, 128, _SHA1_HMAC, 160, 20, TOP_ENC_AUTH)
+
+#define TEST_EXPAND(a, b, c, d, e, f)					\
+static int								\
+cpu_crypto_blockcipher_perf##a##_##b##c##_##f(void)			\
+{									\
+	return cpu_crypto_test_blockcipher_perf(RTE_CRYPTO_CIPHER##a,	\
+			b / 8, RTE_CRYPTO_AUTH##c, d / 8, e, f);	\
+}									\
+
+all_block_cipher_perf_test_cases
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesni_mb_perf_testsuite  = {
+	.suite_name = "Security CPU Crypto AESNI-MB Perf Test Suite",
+	.setup = testsuite_setup,
+	.teardown = testsuite_teardown,
+	.unit_test_cases = {
+#define TEST_EXPAND(a, b, c, d, e, f)					\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+		cpu_crypto_blockcipher_perf##a##_##b##c##_##f),	\
+
+	all_block_cipher_perf_test_cases
+#undef TEST_EXPAND
+
+	TEST_CASES_END() /**< NULL terminate unit test array */
+	},
+};
+
+static int
+test_security_cpu_crypto_aesni_mb_perf(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD));
+
+	return unit_test_suite_runner(
+			&security_cpu_crypto_aesni_mb_perf_testsuite);
+}
+
 REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
 		test_security_cpu_crypto_aesni_gcm);
 
@@ -1130,3 +1321,6 @@ REGISTER_TEST_COMMAND(security_aesni_gcm_perftest,
 
 REGISTER_TEST_COMMAND(security_aesni_mb_autotest,
 		test_security_cpu_crypto_aesni_mb);
+
+REGISTER_TEST_COMMAND(security_aesni_mb_perftest,
+		test_security_cpu_crypto_aesni_mb_perf);
-- 
2.14.5



* [dpdk-dev] [RFC PATCH 8/9] ipsec: add rte_security cpu_crypto action support
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
                   ` (6 preceding siblings ...)
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 7/9] app/test: add aesni_mb security cpu crypto perftest Fan Zhang
@ 2019-09-03 15:40 ` Fan Zhang
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 9/9] examples/ipsec-secgw: add security " Fan Zhang
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

This patch updates the ipsec library to handle the newly introduced
RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 lib/librte_ipsec/esp_inb.c  | 174 +++++++++++++++++++++++++-
 lib/librte_ipsec/esp_outb.c | 290 +++++++++++++++++++++++++++++++++++++++++++-
 lib/librte_ipsec/sa.c       |  53 ++++++--
 lib/librte_ipsec/sa.h       |  29 +++++
 lib/librte_ipsec/ses.c      |   4 +-
 5 files changed, 539 insertions(+), 11 deletions(-)

diff --git a/lib/librte_ipsec/esp_inb.c b/lib/librte_ipsec/esp_inb.c
index 8e3ecbc64..2220df0f6 100644
--- a/lib/librte_ipsec/esp_inb.c
+++ b/lib/librte_ipsec/esp_inb.c
@@ -105,6 +105,73 @@ inb_cop_prepare(struct rte_crypto_op *cop,
 	}
 }
 
+static inline int
+inb_sync_crypto_proc_prepare(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb,
+	const union sym_op_data *icv, uint32_t pofs, uint32_t plen,
+	struct rte_security_vec *buf, struct iovec *cur_vec,
+	void *iv, void **aad, void **digest)
+{
+	struct rte_mbuf *ms;
+	struct iovec *vec = cur_vec;
+	struct aead_gcm_iv *gcm;
+	struct aesctr_cnt_blk *ctr;
+	uint64_t *ivp;
+	uint32_t algo, left, off = 0, n_seg = 0;
+
+	ivp = rte_pktmbuf_mtod_offset(mb, uint64_t *,
+		pofs + sizeof(struct rte_esp_hdr));
+	algo = sa->algo_type;
+
+	switch (algo) {
+	case ALGO_TYPE_AES_GCM:
+		gcm = (struct aead_gcm_iv *)iv;
+		aead_gcm_iv_fill(gcm, ivp[0], sa->salt);
+		*aad = icv->va + sa->icv_len;
+		off = sa->ctp.cipher.offset + pofs;
+		break;
+	case ALGO_TYPE_AES_CBC:
+	case ALGO_TYPE_3DES_CBC:
+		off = sa->ctp.auth.offset + pofs;
+		break;
+	case ALGO_TYPE_AES_CTR:
+		off = sa->ctp.auth.offset + pofs;
+		ctr = (struct aesctr_cnt_blk *)iv;
+		aes_ctr_cnt_blk_fill(ctr, ivp[0], sa->salt);
+		break;
+	case ALGO_TYPE_NULL:
+		break;
+	}
+
+	*digest = icv->va;
+
+	left = plen - sa->ctp.cipher.length;
+
+	ms = mbuf_get_seg_ofs(mb, &off);
+	if (!ms)
+		return -1;
+
+	while (n_seg < RTE_LIBRTE_IP_FRAG_MAX_FRAG && left && ms) {
+		uint32_t len = RTE_MIN(left, ms->data_len - off);
+
+		vec->iov_base = rte_pktmbuf_mtod_offset(ms, void *, off);
+		vec->iov_len = len;
+
+		left -= len;
+		vec++;
+		n_seg++;
+		ms = ms->next;
+		off = 0;
+	}
+
+	if (left)
+		return -1;
+
+	buf->vec = cur_vec;
+	buf->num = n_seg;
+
+	return n_seg;
+}
+
 /*
  * Helper function for prepare() to deal with situation when
  * ICV is spread by two segments. Tries to move ICV completely into the
@@ -512,7 +579,6 @@ tun_process(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb[],
 	return k;
 }
 
-
 /*
  * *process* function for tunnel packets
  */
@@ -625,6 +691,112 @@ esp_inb_pkt_process(struct rte_ipsec_sa *sa, struct rte_mbuf *mb[],
 	return n;
 }
 
+/*
+ * process packets using sync crypto engine
+ */
+static uint16_t
+esp_inb_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num, uint8_t sqh_len,
+		esp_inb_process_t process)
+{
+	int32_t rc;
+	uint32_t i, k, hl, n, p;
+	struct rte_ipsec_sa *sa;
+	struct replay_sqn *rsn;
+	union sym_op_data icv;
+	uint32_t sqn[num];
+	uint32_t dr[num];
+	struct rte_security_vec buf[num];
+	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
+	uint32_t vec_idx = 0;
+	uint8_t ivs[num][IPSEC_MAX_IV_SIZE];
+	void *iv[num];
+	void *aad[num];
+	void *digest[num];
+	int status[num];
+
+	sa = ss->sa;
+	rsn = rsn_acquire(sa);
+
+	k = 0;
+	for (i = 0; i != num; i++) {
+		hl = mb[i]->l2_len + mb[i]->l3_len;
+		rc = inb_pkt_prepare(sa, rsn, mb[i], hl, &icv);
+		if (rc >= 0) {
+			iv[k] = (void *)ivs[k];
+			rc = inb_sync_crypto_proc_prepare(sa, mb[i], &icv, hl,
+					rc, buf + k, vec + vec_idx, iv + k,
+					&aad[k], &digest[k]);
+			if (rc < 0) {
+				dr[i - k] = i;
+				continue;
+			}
+
+			vec_idx += rc;
+			k++;
+		} else
+			dr[i - k] = i;
+	}
+
+	/* move mbufs that failed preparation beyond the good ones */
+	if (k != num) {
+		rte_errno = EBADMSG;
+
+		if (unlikely(k == 0))
+			return 0;
+
+		move_bad_mbufs(mb, dr, num, num - k);
+	}
+
+	/* process the packets */
+	n = 0;
+	rte_security_process_cpu_crypto_bulk(ss->security.ctx,
+			ss->security.ses, buf, iv, aad, digest, status,
+			k);
+	/* record packets that failed processing in the death row */
+	for (i = 0; i < k; i++) {
+		if (status[i]) {
+			dr[n++] = i;
+			rte_errno = EBADMSG;
+		}
+	}
+
+	/* move bad packets to the back */
+	if (n)
+		move_bad_mbufs(mb, dr, k, n);
+
+	/* process packets */
+	p = process(sa, mb, sqn, dr, k - n, sqh_len);
+
+	if (p != k - n && p != 0)
+		move_bad_mbufs(mb, dr, k - n, k - n - p);
+
+	if (p != num)
+		rte_errno = EBADMSG;
+
+	return p;
+}
+
+uint16_t
+esp_inb_tun_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	struct rte_ipsec_sa *sa = ss->sa;
+
+	return esp_inb_sync_crypto_pkt_process(ss, mb, num, sa->sqh_len,
+			tun_process);
+}
+
+uint16_t
+esp_inb_trs_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	struct rte_ipsec_sa *sa = ss->sa;
+
+	return esp_inb_sync_crypto_pkt_process(ss, mb, num, sa->sqh_len,
+			trs_process);
+}
+
 /*
  * process group of ESP inbound tunnel packets.
  */
diff --git a/lib/librte_ipsec/esp_outb.c b/lib/librte_ipsec/esp_outb.c
index 55799a867..a3d18eefd 100644
--- a/lib/librte_ipsec/esp_outb.c
+++ b/lib/librte_ipsec/esp_outb.c
@@ -403,6 +403,292 @@ esp_outb_trs_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 	return k;
 }
 
+static inline int
+outb_sync_crypto_proc_prepare(struct rte_mbuf *m, const struct rte_ipsec_sa *sa,
+		const uint64_t ivp[IPSEC_MAX_IV_QWORD],
+		const union sym_op_data *icv, uint32_t hlen, uint32_t plen,
+		struct rte_security_vec *buf, struct iovec *cur_vec, void *iv,
+		void **aad, void **digest)
+{
+	struct rte_mbuf *ms;
+	struct aead_gcm_iv *gcm;
+	struct aesctr_cnt_blk *ctr;
+	struct iovec *vec = cur_vec;
+	uint32_t left, off = 0, n_seg = 0;
+	uint32_t algo;
+
+	algo = sa->algo_type;
+
+	switch (algo) {
+	case ALGO_TYPE_AES_GCM:
+		gcm = iv;
+		aead_gcm_iv_fill(gcm, ivp[0], sa->salt);
+		*aad = (void *)(icv->va + sa->icv_len);
+		off = sa->ctp.cipher.offset + hlen;
+		break;
+	case ALGO_TYPE_AES_CBC:
+	case ALGO_TYPE_3DES_CBC:
+		off = sa->ctp.auth.offset + hlen;
+		break;
+	case ALGO_TYPE_AES_CTR:
+		ctr = iv;
+		aes_ctr_cnt_blk_fill(ctr, ivp[0], sa->salt);
+		break;
+	case ALGO_TYPE_NULL:
+		break;
+	}
+
+	*digest = (void *)icv->va;
+
+	left = sa->ctp.cipher.length + plen;
+
+	ms = mbuf_get_seg_ofs(m, &off);
+	if (!ms)
+		return -1;
+
+	while (n_seg < RTE_LIBRTE_IP_FRAG_MAX_FRAG && left && ms) {
+		uint32_t len = RTE_MIN(left, ms->data_len - off);
+
+		vec->iov_base = rte_pktmbuf_mtod_offset(ms, void *, off);
+		vec->iov_len = len;
+
+		left -= len;
+		vec++;
+		n_seg++;
+		ms = ms->next;
+		off = 0;
+	}
+
+	if (left)
+		return -1;
+
+	buf->vec = cur_vec;
+	buf->num = n_seg;
+
+	return n_seg;
+}
+
+/**
+ * Local post-process function prototype; same signature as the
+ * process() member of rte_ipsec_sa_pkt_func.
+ */
+typedef uint16_t (*sync_crypto_post_process)(const struct rte_ipsec_session *ss,
+				struct rte_mbuf *mb[],
+				uint16_t num);
+
+static uint16_t
+esp_outb_tun_sync_crypto_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num,
+		sync_crypto_post_process post_process)
+{
+	uint64_t sqn;
+	rte_be64_t sqc;
+	struct rte_ipsec_sa *sa;
+	struct rte_security_ctx *ctx;
+	struct rte_security_session *rss;
+	union sym_op_data icv;
+	struct rte_security_vec buf[num];
+	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
+	uint32_t vec_idx = 0;
+	void *aad[num];
+	void *digest[num];
+	void *iv[num];
+	uint8_t ivs[num][IPSEC_MAX_IV_SIZE];
+	uint64_t ivp[IPSEC_MAX_IV_QWORD];
+	int status[num];
+	uint32_t dr[num];
+	uint32_t i, n, k;
+	int32_t rc;
+
+	sa = ss->sa;
+	ctx = ss->security.ctx;
+	rss = ss->security.ses;
+
+	k = 0;
+	n = num;
+	sqn = esn_outb_update_sqn(sa, &n);
+	if (n != num)
+		rte_errno = EOVERFLOW;
+
+	for (i = 0; i != n; i++) {
+		sqc = rte_cpu_to_be_64(sqn + i);
+		gen_iv(ivp, sqc);
+
+		/* try to update the packet itself */
+		rc = outb_tun_pkt_prepare(sa, sqc, ivp, mb[i], &icv,
+				sa->sqh_len);
+
+		/* success, setup crypto op */
+		if (rc >= 0) {
+			outb_pkt_xprepare(sa, sqc, &icv);
+
+			iv[k] = (void *)ivs[k];
+			rc = outb_sync_crypto_proc_prepare(mb[i], sa, ivp, &icv,
+					0, rc, buf + k, vec + vec_idx, iv + k,
+					&aad[k], &digest[k]);
+			if (rc < 0) {
+				dr[i - k] = i;
+				rte_errno = -rc;
+				continue;
+			}
+
+			vec_idx += rc;
+			k++;
+		/* failure, put packet into the death-row */
+		} else {
+			dr[i - k] = i;
+			rte_errno = -rc;
+		}
+	}
+
+	/* move mbufs that failed preparation beyond the good ones */
+	if (k != n && k != 0)
+		move_bad_mbufs(mb, dr, n, n - k);
+
+	if (unlikely(k == 0)) {
+		rte_errno = EBADMSG;
+		return 0;
+	}
+
+	/* process the packets */
+	n = 0;
+	rte_security_process_cpu_crypto_bulk(ctx, rss, buf, (void **)iv,
+			(void **)aad, (void **)digest, status, k);
+	/* record packets that failed processing in the death row */
+	for (i = 0; i < k; i++) {
+		if (status[i])
+			dr[n++] = i;
+	}
+
+	if (n)
+		move_bad_mbufs(mb, dr, k, n);
+
+	return post_process(ss, mb, k - n);
+}
+
+static uint16_t
+esp_outb_trs_sync_crypto_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num,
+		sync_crypto_post_process post_process)
+
+{
+	uint64_t sqn;
+	rte_be64_t sqc;
+	struct rte_ipsec_sa *sa;
+	struct rte_security_ctx *ctx;
+	struct rte_security_session *rss;
+	union sym_op_data icv;
+	struct rte_security_vec buf[num];
+	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
+	uint32_t vec_idx = 0;
+	void *aad[num];
+	void *digest[num];
+	uint8_t ivs[num][IPSEC_MAX_IV_SIZE];
+	void *iv[num];
+	int status[num];
+	uint64_t ivp[IPSEC_MAX_IV_QWORD];
+	uint32_t dr[num];
+	uint32_t i, n, k;
+	uint32_t l2, l3;
+	int32_t rc;
+
+	sa = ss->sa;
+	ctx = ss->security.ctx;
+	rss = ss->security.ses;
+
+	k = 0;
+	n = num;
+	sqn = esn_outb_update_sqn(sa, &n);
+	if (n != num)
+		rte_errno = EOVERFLOW;
+
+	for (i = 0; i != n; i++) {
+		l2 = mb[i]->l2_len;
+		l3 = mb[i]->l3_len;
+
+		sqc = rte_cpu_to_be_64(sqn + i);
+		gen_iv(ivp, sqc);
+
+		/* try to update the packet itself */
+		rc = outb_trs_pkt_prepare(sa, sqc, ivp, mb[i], l2, l3, &icv,
+				sa->sqh_len);
+
+		/* success, setup crypto op */
+		if (rc >= 0) {
+			outb_pkt_xprepare(sa, sqc, &icv);
+
+			iv[k] = (void *)ivs[k];
+
+			rc = outb_sync_crypto_proc_prepare(mb[i], sa, ivp, &icv,
+					l2 + l3, rc, buf + k, vec + vec_idx,
+					iv + k, &aad[k], &digest[k]);
+			if (rc < 0) {
+				dr[i - k] = i;
+				rte_errno = -rc;
+				continue;
+			}
+
+			vec_idx += rc;
+			k++;
+		/* failure, put packet into the death-row */
+		} else {
+			dr[i - k] = i;
+			rte_errno = -rc;
+		}
+	}
+
+	/* move mbufs that failed preparation beyond the good ones */
+	if (k != n && k != 0)
+		move_bad_mbufs(mb, dr, n, n - k);
+
+	/* process the packets */
+	n = 0;
+	rte_security_process_cpu_crypto_bulk(ctx, rss, buf, (void **)iv,
+			(void **)aad, (void **)digest, status, k);
+	/* move failed process packets to dr */
+	for (i = 0; i < k; i++) {
+		if (status[i])
+			dr[n++] = i;
+	}
+
+	if (n)
+		move_bad_mbufs(mb, dr, k, n);
+
+	return post_process(ss, mb, k - n);
+}
+
+uint16_t
+esp_outb_tun_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_tun_sync_crypto_process(ss, mb, num,
+			esp_outb_sqh_process);
+}
+
+uint16_t
+esp_outb_tun_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_tun_sync_crypto_process(ss, mb, num,
+			esp_outb_pkt_flag_process);
+}
+
+uint16_t
+esp_outb_trs_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_trs_sync_crypto_process(ss, mb, num,
+			esp_outb_sqh_process);
+}
+
+uint16_t
+esp_outb_trs_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_trs_sync_crypto_process(ss, mb, num,
+			esp_outb_pkt_flag_process);
+}
+
 /*
  * process outbound packets for SA with ESN support,
  * for algorithms that require SQN.hibits to be implictly included
@@ -410,8 +696,8 @@ esp_outb_trs_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
  * In that case we have to move ICV bytes back to their proper place.
  */
 uint16_t
-esp_outb_sqh_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
-	uint16_t num)
+esp_outb_sqh_process(const struct rte_ipsec_session *ss,
+	struct rte_mbuf *mb[], uint16_t num)
 {
 	uint32_t i, k, icv_len, *icv;
 	struct rte_mbuf *ml;
diff --git a/lib/librte_ipsec/sa.c b/lib/librte_ipsec/sa.c
index 23d394b46..31ffbce2c 100644
--- a/lib/librte_ipsec/sa.c
+++ b/lib/librte_ipsec/sa.c
@@ -544,9 +544,9 @@ lksd_proto_prepare(const struct rte_ipsec_session *ss,
  * - inbound/outbound for RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
  * - outbound for RTE_SECURITY_ACTION_TYPE_NONE when ESN is disabled
  */
-static uint16_t
-pkt_flag_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
-	uint16_t num)
+uint16_t
+esp_outb_pkt_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
 {
 	uint32_t i, k;
 	uint32_t dr[num];
@@ -599,12 +599,48 @@ lksd_none_pkt_func_select(const struct rte_ipsec_sa *sa,
 	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV6):
 		pf->prepare = esp_outb_tun_prepare;
 		pf->process = (sa->sqh_len != 0) ?
-			esp_outb_sqh_process : pkt_flag_process;
+			esp_outb_sqh_process : esp_outb_pkt_flag_process;
 		break;
 	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TRANS):
 		pf->prepare = esp_outb_trs_prepare;
 		pf->process = (sa->sqh_len != 0) ?
-			esp_outb_sqh_process : pkt_flag_process;
+			esp_outb_sqh_process : esp_outb_pkt_flag_process;
+		break;
+	default:
+		rc = -ENOTSUP;
+	}
+
+	return rc;
+}
+
+static int
+lksd_sync_crypto_pkt_func_select(const struct rte_ipsec_sa *sa,
+		struct rte_ipsec_sa_pkt_func *pf)
+{
+	int32_t rc;
+
+	static const uint64_t msk = RTE_IPSEC_SATP_DIR_MASK |
+			RTE_IPSEC_SATP_MODE_MASK;
+
+	rc = 0;
+	switch (sa->type & msk) {
+	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TUNLV4):
+	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TUNLV6):
+		pf->process = esp_inb_tun_sync_crypto_pkt_process;
+		break;
+	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TRANS):
+		pf->process = esp_inb_trs_sync_crypto_pkt_process;
+		break;
+	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV4):
+	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV6):
+		pf->process = (sa->sqh_len != 0) ?
+			esp_outb_tun_sync_crpyto_sqh_process :
+			esp_outb_tun_sync_crpyto_flag_process;
+		break;
+	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TRANS):
+		pf->process = (sa->sqh_len != 0) ?
+			esp_outb_trs_sync_crpyto_sqh_process :
+			esp_outb_trs_sync_crpyto_flag_process;
 		break;
 	default:
 		rc = -ENOTSUP;
@@ -672,13 +708,16 @@ ipsec_sa_pkt_func_select(const struct rte_ipsec_session *ss,
 	case RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL:
 		if ((sa->type & RTE_IPSEC_SATP_DIR_MASK) ==
 				RTE_IPSEC_SATP_DIR_IB)
-			pf->process = pkt_flag_process;
+			pf->process = esp_outb_pkt_flag_process;
 		else
 			pf->process = inline_proto_outb_pkt_process;
 		break;
 	case RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL:
 		pf->prepare = lksd_proto_prepare;
-		pf->process = pkt_flag_process;
+		pf->process = esp_outb_pkt_flag_process;
+		break;
+	case RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
+		rc = lksd_sync_crypto_pkt_func_select(sa, pf);
 		break;
 	default:
 		rc = -ENOTSUP;
diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
index 51e69ad05..02c7abc60 100644
--- a/lib/librte_ipsec/sa.h
+++ b/lib/librte_ipsec/sa.h
@@ -156,6 +156,14 @@ uint16_t
 inline_inb_trs_pkt_process(const struct rte_ipsec_session *ss,
 	struct rte_mbuf *mb[], uint16_t num);
 
+uint16_t
+esp_inb_tun_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_inb_trs_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
 /* outbound processing */
 
 uint16_t
@@ -170,6 +178,10 @@ uint16_t
 esp_outb_sqh_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 	uint16_t num);
 
+uint16_t
+esp_outb_pkt_flag_process(const struct rte_ipsec_session *ss,
+	struct rte_mbuf *mb[], uint16_t num);
+
 uint16_t
 inline_outb_tun_pkt_process(const struct rte_ipsec_session *ss,
 	struct rte_mbuf *mb[], uint16_t num);
@@ -182,4 +194,21 @@ uint16_t
 inline_proto_outb_pkt_process(const struct rte_ipsec_session *ss,
 	struct rte_mbuf *mb[], uint16_t num);
 
+uint16_t
+esp_outb_tun_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_outb_tun_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_outb_trs_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_outb_trs_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+
 #endif /* _SA_H_ */
diff --git a/lib/librte_ipsec/ses.c b/lib/librte_ipsec/ses.c
index 82c765a33..eaa8c17b7 100644
--- a/lib/librte_ipsec/ses.c
+++ b/lib/librte_ipsec/ses.c
@@ -19,7 +19,9 @@ session_check(struct rte_ipsec_session *ss)
 			return -EINVAL;
 		if ((ss->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO ||
 				ss->type ==
-				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) &&
+				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL ||
+				ss->type ==
+				RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) &&
 				ss->security.ctx == NULL)
 			return -EINVAL;
 	}
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [RFC PATCH 9/9] examples/ipsec-secgw: add security cpu_crypto action support
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
                   ` (7 preceding siblings ...)
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 8/9] ipsec: add rte_security cpu_crypto action support Fan Zhang
@ 2019-09-03 15:40 ` " Fan Zhang
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-03 15:40 UTC (permalink / raw)
  To: dev
  Cc: akhil.goyal, konstantin.ananyev, declan.doherty,
	pablo.de.lara.guarch, Fan Zhang

Since the ipsec library has added cpu_crypto security action type support,
this patch updates the ipsec-secgw sample application with the new action
type "cpu-crypto". The patch also includes a number of test scripts to
verify the correctness of the implementation.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 examples/ipsec-secgw/ipsec.c                       | 22 ++++++++++++++++++++++
 examples/ipsec-secgw/ipsec_process.c               |  4 ++--
 examples/ipsec-secgw/sa.c                          | 13 +++++++++++--
 examples/ipsec-secgw/test/run_test.sh              | 10 ++++++++++
 .../test/trs_3descbc_sha1_cpu_crypto_defs.sh       |  5 +++++
 .../test/trs_aescbc_sha1_cpu_crypto_defs.sh        |  5 +++++
 .../test/trs_aesctr_sha1_cpu_crypto_defs.sh        |  5 +++++
 .../ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh |  5 +++++
 .../test/trs_aesgcm_mb_cpu_crypto_defs.sh          |  7 +++++++
 .../test/tun_3descbc_sha1_cpu_crypto_defs.sh       |  5 +++++
 .../test/tun_aescbc_sha1_cpu_crypto_defs.sh        |  5 +++++
 .../test/tun_aesctr_sha1_cpu_crypto_defs.sh        |  5 +++++
 .../ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh |  5 +++++
 .../test/tun_aesgcm_mb_cpu_crypto_defs.sh          |  7 +++++++
 14 files changed, 99 insertions(+), 4 deletions(-)
 create mode 100644 examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh

diff --git a/examples/ipsec-secgw/ipsec.c b/examples/ipsec-secgw/ipsec.c
index dc85adfe5..4c39a7de6 100644
--- a/examples/ipsec-secgw/ipsec.c
+++ b/examples/ipsec-secgw/ipsec.c
@@ -10,6 +10,7 @@
 #include <rte_crypto.h>
 #include <rte_security.h>
 #include <rte_cryptodev.h>
+#include <rte_ipsec.h>
 #include <rte_ethdev.h>
 #include <rte_mbuf.h>
 #include <rte_hash.h>
@@ -105,6 +106,26 @@ create_lookaside_session(struct ipsec_ctx *ipsec_ctx, struct ipsec_sa *sa)
 				"SEC Session init failed: err: %d\n", ret);
 				return -1;
 			}
+		} else if (sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
+			struct rte_security_ctx *ctx =
+				(struct rte_security_ctx *)
+				rte_cryptodev_get_sec_ctx(
+					ipsec_ctx->tbl[cdev_id_qp].id);
+			int32_t offset = sizeof(struct rte_esp_hdr) +
+					sa->iv_len;
+
+			/* Set IPsec parameters in conf */
+			sess_conf.cpucrypto.cipher_offset = offset;
+
+			set_ipsec_conf(sa, &(sess_conf.ipsec));
+			sa->security_ctx = ctx;
+			sa->sec_session = rte_security_session_create(ctx,
+				&sess_conf, ipsec_ctx->session_priv_pool);
+			if (sa->sec_session == NULL) {
+				RTE_LOG(ERR, IPSEC,
+				"SEC Session init failed: err: %d\n", ret);
+				return -1;
+			}
 		} else {
 			RTE_LOG(ERR, IPSEC, "Inline not supported\n");
 			return -1;
@@ -473,6 +494,7 @@ ipsec_enqueue(ipsec_xform_fn xform_func, struct ipsec_ctx *ipsec_ctx,
 						sa->sec_session, pkts[i], NULL);
 			continue;
 		case RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO:
+		case RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
 			RTE_ASSERT(sa->sec_session != NULL);
 			priv->cop.type = RTE_CRYPTO_OP_TYPE_SYMMETRIC;
 			priv->cop.status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
diff --git a/examples/ipsec-secgw/ipsec_process.c b/examples/ipsec-secgw/ipsec_process.c
index 868f1a28d..73bfb314e 100644
--- a/examples/ipsec-secgw/ipsec_process.c
+++ b/examples/ipsec-secgw/ipsec_process.c
@@ -227,8 +227,8 @@ ipsec_process(struct ipsec_ctx *ctx, struct ipsec_traffic *trf)
 
 		/* process packets inline */
 		else if (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO ||
-				sa->type ==
-				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) {
+			sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL ||
+			sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
 
 			satp = rte_ipsec_sa_type(ips->sa);
 
diff --git a/examples/ipsec-secgw/sa.c b/examples/ipsec-secgw/sa.c
index c3cf3bd1f..ba773346f 100644
--- a/examples/ipsec-secgw/sa.c
+++ b/examples/ipsec-secgw/sa.c
@@ -570,6 +570,9 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
 				RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL;
 			else if (strcmp(tokens[ti], "no-offload") == 0)
 				rule->type = RTE_SECURITY_ACTION_TYPE_NONE;
+			else if (strcmp(tokens[ti], "cpu-crypto") == 0)
+				rule->type =
+					RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO;
 			else {
 				APP_CHECK(0, status, "Invalid input \"%s\"",
 						tokens[ti]);
@@ -624,10 +627,13 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
 	if (status->status < 0)
 		return;
 
-	if ((rule->type != RTE_SECURITY_ACTION_TYPE_NONE) && (portid_p == 0))
+	if ((rule->type != RTE_SECURITY_ACTION_TYPE_NONE && rule->type !=
+			RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) &&
+			(portid_p == 0))
 		printf("Missing portid option, falling back to non-offload\n");
 
-	if (!type_p || !portid_p) {
+	if (!type_p || (!portid_p && rule->type !=
+			RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO)) {
 		rule->type = RTE_SECURITY_ACTION_TYPE_NONE;
 		rule->portid = -1;
 	}
@@ -709,6 +715,9 @@ print_one_sa_rule(const struct ipsec_sa *sa, int inbound)
 	case RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL:
 		printf("lookaside-protocol-offload ");
 		break;
+	case RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
+		printf("cpu-crypto-accelerated");
+		break;
 	}
 	printf("\n");
 }
diff --git a/examples/ipsec-secgw/test/run_test.sh b/examples/ipsec-secgw/test/run_test.sh
index 8055a4c04..f322aa785 100755
--- a/examples/ipsec-secgw/test/run_test.sh
+++ b/examples/ipsec-secgw/test/run_test.sh
@@ -32,15 +32,21 @@ usage()
 }
 
 LINUX_TEST="tun_aescbc_sha1 \
+tun_aescbc_sha1_cpu_crypto \
 tun_aescbc_sha1_esn \
 tun_aescbc_sha1_esn_atom \
 tun_aesgcm \
+tun_aesgcm_cpu_crypto \
+tun_aesgcm_mb_cpu_crypto \
 tun_aesgcm_esn \
 tun_aesgcm_esn_atom \
 trs_aescbc_sha1 \
+trs_aescbc_sha1_cpu_crypto \
 trs_aescbc_sha1_esn \
 trs_aescbc_sha1_esn_atom \
 trs_aesgcm \
+trs_aesgcm_cpu_crypto \
+trs_aesgcm_mb_cpu_crypto \
 trs_aesgcm_esn \
 trs_aesgcm_esn_atom \
 tun_aescbc_sha1_old \
@@ -49,17 +55,21 @@ trs_aescbc_sha1_old \
 trs_aesgcm_old \
 tun_aesctr_sha1 \
 tun_aesctr_sha1_old \
tun_aesctr_sha1_cpu_crypto \
 tun_aesctr_sha1_esn \
 tun_aesctr_sha1_esn_atom \
 trs_aesctr_sha1 \
+trs_aesctr_sha1_cpu_crypto \
 trs_aesctr_sha1_old \
 trs_aesctr_sha1_esn \
 trs_aesctr_sha1_esn_atom \
 tun_3descbc_sha1 \
+tun_3descbc_sha1_cpu_crypto \
 tun_3descbc_sha1_old \
 tun_3descbc_sha1_esn \
 tun_3descbc_sha1_esn_atom \
 trs_3descbc_sha1 \
+trs_3descbc_sha1_cpu_crypto \
 trs_3descbc_sha1_old \
 trs_3descbc_sha1_esn \
 trs_3descbc_sha1_esn_atom"
diff --git a/examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..a864a8886
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_3descbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..a4d83e9c4
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_aescbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..745a2a02b
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_aesctr_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
new file mode 100644
index 000000000..8917122da
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_aesgcm_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
new file mode 100644
index 000000000..26943321f
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
@@ -0,0 +1,7 @@
+#! /bin/bash
+
+. ${DIR}/trs_aesgcm_defs.sh
+
+CRYPTO_DEV=${CRYPTO_DEV:-'--vdev="crypto_aesni_mb0"'}
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..747141f62
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_3descbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..56076fa50
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_aescbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..3af680533
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_aesctr_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
new file mode 100644
index 000000000..5bf1c0ae5
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_aesgcm_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh
new file mode 100644
index 000000000..039b8095e
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh
@@ -0,0 +1,7 @@
+#! /bin/bash
+
+. ${DIR}/tun_aesgcm_defs.sh
+
+CRYPTO_DEV=${CRYPTO_DEV:-'--vdev="crypto_aesni_mb0"'}
+
+SGW_CFG_XPRM='type cpu-crypto'
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API Fan Zhang
@ 2019-09-04 10:32   ` Akhil Goyal
  2019-09-04 13:06     ` Zhang, Roy Fan
  0 siblings, 1 reply; 84+ messages in thread
From: Akhil Goyal @ 2019-09-04 10:32 UTC (permalink / raw)
  To: Fan Zhang, dev; +Cc: konstantin.ananyev, declan.doherty, pablo.de.lara.guarch

Hi Fan,

> 
> This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action
> type to
> security library. The type represents performing crypto operation with CPU
> cycles. The patch also includes a new API to process crypto operations in
> bulk and the function pointers for PMDs.
> 
I am not able to follow the flow of execution for this action type. Could you please
elaborate the flow in the documentation? If it is not in the documentation right now,
then please elaborate the flow in the cover letter.
Also, I see that there are new APIs for processing crypto operations in bulk.
What does that mean, and how are they different from the existing APIs, which
also handle bulk crypto ops depending on the budget?


-Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-04 10:32   ` Akhil Goyal
@ 2019-09-04 13:06     ` Zhang, Roy Fan
  2019-09-06  9:01       ` Akhil Goyal
  0 siblings, 1 reply; 84+ messages in thread
From: Zhang, Roy Fan @ 2019-09-04 13:06 UTC (permalink / raw)
  To: Akhil Goyal, dev
  Cc: Ananyev, Konstantin, Doherty, Declan, De Lara Guarch, Pablo

Hi Akhil,

This action type allows a burst of symmetric crypto workloads using the same
algorithm, key, and direction to be processed by CPU cycles synchronously.
This flexible action type does not require external hardware involvement and
is more performant than a Cryptodev SW PMD, thanks to the cycles saved by
removing the "async mode simulation" as well as the 3-cacheline access of the
crypto ops.

AESNI-GCM and AESNI-MB PMDs are updated with this support. There is a small
performance test app under app/test/security_aesni_gcm(mb)_perftest to
prove.

For the new API:
The packet is sent to the crypto device for symmetric crypto processing. The
device will encrypt or decrypt the buffer based on the session data specified
and preprocessed in the security session. Different from the inline or
lookaside modes, when the function exits the user can expect each buffer to be
either processed successfully or to have an error number assigned to the
corresponding index of the status array.

Will update the program's guide in the v1 patch.

Regards,
Fan

> -----Original Message-----
> From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> Sent: Wednesday, September 4, 2019 11:33 AM
> To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty, Declan
> <declan.doherty@intel.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>
> Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action type and
> API
> 
> Hi Fan,
> 
> >
> > This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> action
> > type to security library. The type represents performing crypto
> > operation with CPU cycles. The patch also includes a new API to
> > process crypto operations in bulk and the function pointers for PMDs.
> >
> I am not able to get the flow of execution for this action type. Could you
> please elaborate the flow in the documentation. If not in documentation
> right now, then please elaborate the flow in cover letter.
> Also I see that there are new APIs for processing crypto operations in bulk.
> What does that mean. How are they different from the existing APIs which
> are also handling bulk crypto ops depending on the budget.
> 
> 
> -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-04 13:06     ` Zhang, Roy Fan
@ 2019-09-06  9:01       ` Akhil Goyal
  2019-09-06 13:12         ` Zhang, Roy Fan
  2019-09-06 13:27         ` Ananyev, Konstantin
  0 siblings, 2 replies; 84+ messages in thread
From: Akhil Goyal @ 2019-09-06  9:01 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev
  Cc: Ananyev, Konstantin, Doherty, Declan, De Lara Guarch, Pablo


Hi Fan,
> 
> Hi Akhil,
> 
> This action type allows the burst of symmetric crypto workload using the same
> algorithm, key, and direction being processed by CPU cycles synchronously.
> This flexible action type does not require external hardware involvement,
> having the crypto workload processed synchronously, and is more performant
> than Cryptodev SW PMD due to the saved cycles on removed "async mode
> simulation" as well as 3 cacheline access of the crypto ops.

Does that mean the application will not call cryptodev_enqueue_burst and the
corresponding dequeue burst? Would it be a new API, something like
process_packets, that returns the crypto-processed packets on exit?

I still do not understand why we cannot do this with the conventional crypto
lib only. As far as I can understand, you are not doing any protocol
processing or adding any value to the crypto processing. IMO, you just need a
synchronous crypto processing API, which can be defined in cryptodev; you
don't need to re-create a crypto session in the name of a security session in
the driver just to do synchronous processing.

> 
> AESNI-GCM and AESNI-MB PMDs are updated with this support. There is a small
> performance test app under app/test/security_aesni_gcm(mb)_perftest to
> prove.
> 
> For the new API
> The packet is sent to the crypto device for symmetric crypto
> processing. The device will encrypt or decrypt the buffer based on the session
> data specified and preprocessed in the security session. Different
> than the inline or lookaside modes, when the function exits, the user will
> expect the buffers are either processed successfully, or having the error number
> assigned to the appropriate index of the status array.
> 
> Will update the program's guide in the v1 patch.
> 
> Regards,
> Fan
> 
> > -----Original Message-----
> > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > Sent: Wednesday, September 4, 2019 11:33 AM
> > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty, Declan
> > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > <pablo.de.lara.guarch@intel.com>
> > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action type and
> > API
> >
> > Hi Fan,
> >
> > >
> > > This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > action
> > > type to security library. The type represents performing crypto
> > > operation with CPU cycles. The patch also includes a new API to
> > > process crypto operations in bulk and the function pointers for PMDs.
> > >
> > I am not able to get the flow of execution for this action type. Could you
> > please elaborate the flow in the documentation. If not in documentation
> > right now, then please elaborate the flow in cover letter.
> > Also I see that there are new APIs for processing crypto operations in bulk.
> > What does that mean. How are they different from the existing APIs which
> > are also handling bulk crypto ops depending on the budget.
> >
> >
> > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-06  9:01       ` Akhil Goyal
@ 2019-09-06 13:12         ` Zhang, Roy Fan
  2019-09-10 11:25           ` Akhil Goyal
  2019-09-06 13:27         ` Ananyev, Konstantin
  1 sibling, 1 reply; 84+ messages in thread
From: Zhang, Roy Fan @ 2019-09-06 13:12 UTC (permalink / raw)
  To: Akhil Goyal, dev
  Cc: Ananyev, Konstantin, Doherty, Declan, De Lara Guarch, Pablo

Hi Akhil,

You are right, the new API will process the crypto workload directly, with no
heavy enqueue/dequeue operations required.

Cryptodev tends to support multiple crypto devices, including HW and SW.
The 3-cacheline access, the IOVA address computation and assignment, the
simulation of async enqueue/dequeue operations, the allocation and freeing of
crypto ops, and even the mbuf linked list for scatter-gather buffers are all
too heavy for SW crypto PMDs.

Creating this new synchronous API in cryptodev cannot avoid the problems
listed above: first, the API should not serve only a subset of the crypto (SW)
PMDs - as you know, it is Cryptodev. Users may expect some PMDs to support
only part of the overall algorithms, but not the workload-processing API.

Another reason is the assumptions baked into crypto ops: when creating a
crypto op we have to allocate memory to hold crypto op + sym op + IV - we
cannot simply declare an array of crypto ops at run time and discard it when
processing is done. We also need to fill in the AAD and digest HW addresses,
which is not required for SW at all.

Bottom line: using crypto ops will still incur the 3-cacheline-access
performance problem.

So if we were to create the new API in Cryptodev instead of rte_security, we
would need to create a new crypto op structure just for the SW PMDs, carefully
document it to avoid confusion with the existing cryptodev APIs, add new
device feature flags to indicate which PMDs do not support the API, and again
carefully document those device feature flags.

Pushing these changes into rte_security instead resolves the problems above,
and the performance improvement from this change is significant for smaller
packets - I attached a performance test app in the patchset.

For rte_security, we already have the inline-crypto type that works quite
similarly to this new API; the only difference is that the workload is
processed by CPU cycles. As you may have seen, the ipsec library has wrapped
these changes, and ipsec-secgw needs only minimal updates to adopt them too.
So for end users of IPsec, this patchset can be enabled seamlessly with just a
command-line update when creating an SA.
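
As a hedged illustration of that command-line change (the exact SA rule
grammar is defined by the ipsec-secgw sample application guide; the keys and
addresses below are placeholders), selecting the new action type is just a
matter of the `type` token in the SA configuration:

```
# illustrative ipsec-secgw SA rule: 'type cpu-crypto' selects the new
# RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action; no port_id is needed
sa out 5 cipher_algo aes-128-cbc cipher_key 0:1:2:3:4:5:6:7:8:9:a:b:c:d:e:f \
auth_algo sha1-hmac auth_key 0:1:2:3:4:5:6:7:8:9:a:b:c:d:e:f:10:11:12:13 \
mode ipv4-tunnel src 172.16.1.5 dst 172.16.2.5 type cpu-crypto
```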

Regards,
Fan
 

> -----Original Message-----
> From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> Sent: Friday, September 6, 2019 10:01 AM
> To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty, Declan
> <declan.doherty@intel.com>; De Lara Guarch, Pablo
> <pablo.de.lara.guarch@intel.com>
> Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action type and
> API
> 
> 
> Hi Fan,
> >
> > Hi Akhil,
> >
> > This action type allows the burst of symmetric crypto workload using
> > the same algorithm, key, and direction being processed by CPU cycles
> synchronously.
> > This flexible action type does not require external hardware
> > involvement, having the crypto workload processed synchronously, and
> > is more performant than Cryptodev SW PMD due to the saved cycles on
> > removed "async mode simulation" as well as 3 cacheline access of the
> crypto ops.
> 
> Does that mean application will not call the cryptodev_enqueue_burst and
> corresponding dequeue burst.
> It would be a new API something like process_packets and it will have the
> crypto processed packets while returning from the API?
> 
> I still do not understand why we cannot do with the conventional crypto lib
> only.
> As far as I can understand, you are not doing any protocol processing or any
> value add To the crypto processing. IMO, you just need a synchronous crypto
> processing API which Can be defined in cryptodev, you don't need to re-
> create a crypto session in the name of Security session in the driver just to do
> a synchronous processing.
> 
> >
> > AESNI-GCM and AESNI-MB PMDs are updated with this support. There is a
> > small performance test app under
> > app/test/security_aesni_gcm(mb)_perftest to prove.
> >
> > For the new API
> > The packet is sent to the crypto device for symmetric crypto
> > processing. The device will encrypt or decrypt the buffer based on the
> > session data specified and preprocessed in the security session.
> > Different than the inline or lookaside modes, when the function exits,
> > the user will expect the buffers are either processed successfully, or
> > having the error number assigned to the appropriate index of the status
> array.
> >
> > Will update the program's guide in the v1 patch.
> >
> > Regards,
> > Fan
> >
> > > -----Original Message-----
> > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty,
> > > Declan <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > <pablo.de.lara.guarch@intel.com>
> > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action
> > > type and API
> > >
> > > Hi Fan,
> > >
> > > >
> > > > This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > action
> > > > type to security library. The type represents performing crypto
> > > > operation with CPU cycles. The patch also includes a new API to
> > > > process crypto operations in bulk and the function pointers for PMDs.
> > > >
> > > I am not able to get the flow of execution for this action type.
> > > Could you please elaborate the flow in the documentation. If not in
> > > documentation right now, then please elaborate the flow in cover letter.
> > > Also I see that there are new APIs for processing crypto operations in
> bulk.
> > > What does that mean. How are they different from the existing APIs
> > > which are also handling bulk crypto ops depending on the budget.
> > >
> > >
> > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process
  2019-09-03 15:40 [dpdk-dev] [RFC PATCH 0/9] security: add software synchronous crypto process Fan Zhang
                   ` (8 preceding siblings ...)
  2019-09-03 15:40 ` [dpdk-dev] [RFC PATCH 9/9] examples/ipsec-secgw: add security " Fan Zhang
@ 2019-09-06 13:13 ` Fan Zhang
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API Fan Zhang
                     ` (11 more replies)
  9 siblings, 12 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch set adds a way for rte_security to process symmetric crypto
workloads in bulk synchronously on SW crypto devices.

Originally both SW and HW crypto PMDs work under rte_cryptodev to
process the crypto workload asynchronously. This approach provides uniformity
across both PMD types, but it also introduces an unnecessary performance
penalty to SW PMDs, such as the extra SW ring enqueue/dequeue steps needed to
"simulate" asynchronous operation, and unnecessary HW address computation.

We introduce a new way for SW crypto devices to perform crypto operations
synchronously, taking only the fields required for the computation as input.

In rte_security, a new action type "RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO"
is introduced. This action type allows a burst of symmetric crypto
workloads using the same algorithm, key, and direction to be processed
synchronously by CPU cycles. This flexible action type requires no
external hardware involvement.

This patch set also announces a new API,
"rte_security_process_cpu_crypto_bulk". With this API the packets are sent
to the crypto device for symmetric crypto processing. The device encrypts
or decrypts the buffers based on the session data specified and preprocessed
in the security session. Unlike the inline or lookaside modes, when the
function returns, the user can expect that every buffer has either been
processed successfully or has an error number assigned to the corresponding
index of the status array.

The proof-of-concept AESNI-GCM and AESNI-MB SW PMDs are updated to support
this new method. To demonstrate the performance gain, two simple performance
evaluation apps are added under the unit tests:
"app/test: security_aesni_gcm_perftest/security_aesni_mb_perftest". Users
can freely compare their results against those of the crypto perf
application.

Finally, the ipsec library and the ipsec-secgw sample application are
updated to support this feature. Several test scripts are added to the
ipsec-secgw test suite to prove the correctness of the implementation.
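
The completion contract described above (on return, every buffer has either
been processed successfully or carries an error number in its slot of the
status array) suggests a simple caller-side pattern. The helper below is a
hypothetical illustration of how an application might consume that status
array; it is not part of this patch set.

```c
#include <stdint.h>

/* Hypothetical caller-side helper: after the bulk call returns, scan the
 * status array it filled in and report the index of the first failed
 * buffer, or -1 if every buffer was processed successfully. */
static inline int
first_failed_buf(const int status[], uint32_t num)
{
	uint32_t i;

	for (i = 0; i < num; i++)
		if (status[i] < 0)
			return (int)i;
	return -1;
}
```

An application could use this to decide which packets of a burst to drop
while forwarding the rest.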

Fan Zhang (10):
  security: introduce CPU Crypto action type and API
  crypto/aesni_gcm: add rte_security handler
  app/test: add security cpu crypto autotest
  app/test: add security cpu crypto perftest
  crypto/aesni_mb: add rte_security handler
  app/test: add aesni_mb security cpu crypto autotest
  app/test: add aesni_mb security cpu crypto perftest
  ipsec: add rte_security cpu_crypto action support
  examples/ipsec-secgw: add security cpu_crypto action support
  doc: update security cpu process description

 app/test/Makefile                                  |    1 +
 app/test/meson.build                               |    1 +
 app/test/test_security_cpu_crypto.c                | 1326 ++++++++++++++++++++
 doc/guides/cryptodevs/aesni_gcm.rst                |    6 +
 doc/guides/cryptodevs/aesni_mb.rst                 |    7 +
 doc/guides/prog_guide/rte_security.rst             |  112 +-
 doc/guides/rel_notes/release_19_11.rst             |    7 +
 drivers/crypto/aesni_gcm/aesni_gcm_pmd.c           |   91 +-
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c       |   95 ++
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h   |   23 +
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c         |  291 ++++-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c     |   91 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |   21 +-
 examples/ipsec-secgw/ipsec.c                       |   22 +
 examples/ipsec-secgw/ipsec_process.c               |    7 +-
 examples/ipsec-secgw/sa.c                          |   13 +-
 examples/ipsec-secgw/test/run_test.sh              |   10 +
 .../test/trs_3descbc_sha1_cpu_crypto_defs.sh       |    5 +
 .../test/trs_aescbc_sha1_cpu_crypto_defs.sh        |    5 +
 .../test/trs_aesctr_sha1_cpu_crypto_defs.sh        |    5 +
 .../ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh |    5 +
 .../test/trs_aesgcm_mb_cpu_crypto_defs.sh          |    7 +
 .../test/tun_3descbc_sha1_cpu_crypto_defs.sh       |    5 +
 .../test/tun_aescbc_sha1_cpu_crypto_defs.sh        |    5 +
 .../test/tun_aesctr_sha1_cpu_crypto_defs.sh        |    5 +
 .../ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh |    5 +
 .../test/tun_aesgcm_mb_cpu_crypto_defs.sh          |    7 +
 lib/librte_ipsec/esp_inb.c                         |  174 ++-
 lib/librte_ipsec/esp_outb.c                        |  290 ++++-
 lib/librte_ipsec/sa.c                              |   53 +-
 lib/librte_ipsec/sa.h                              |   29 +
 lib/librte_ipsec/ses.c                             |    4 +-
 lib/librte_security/rte_security.c                 |   16 +
 lib/librte_security/rte_security.h                 |   51 +-
 lib/librte_security/rte_security_driver.h          |   19 +
 lib/librte_security/rte_security_version.map       |    1 +
 36 files changed, 2791 insertions(+), 24 deletions(-)
 create mode 100644 app/test/test_security_cpu_crypto.c
 create mode 100644 examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh

-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
@ 2019-09-06 13:13   ` Fan Zhang
  2019-09-18 12:45     ` Ananyev, Konstantin
  2019-09-29  6:00     ` Hemant Agrawal
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 02/10] crypto/aesni_gcm: add rte_security handler Fan Zhang
                     ` (10 subsequent siblings)
  11 siblings, 2 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch introduces a new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action type
to the security library. The type represents performing crypto operations
with CPU cycles. The patch also includes a new API to process crypto
operations in bulk, along with the corresponding function pointers for PMDs.
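
Each buffer handed to the new bulk API is described by a scatter-gather
vector of iovec segments. As a sketch of how those arguments line up, the
snippet below assembles the degenerate single-segment case; the struct
mirrors the one added by this patch, while the assembly helper itself is a
hypothetical illustration.

```c
#include <stdint.h>
#include <sys/uio.h>

/* Mirrors the structure introduced by this patch. */
struct rte_security_vec {
	struct iovec *vec;
	uint32_t num;
};

/* Hypothetical helper: describe one contiguous buffer as a single-segment
 * scatter-gather list, the simplest form the bulk API accepts. */
static inline void
sec_vec_init_linear(struct rte_security_vec *buf, struct iovec *seg,
		void *data, size_t len)
{
	seg->iov_base = data;
	seg->iov_len = len;
	buf->vec = seg;
	buf->num = 1;
}
```

Multi-segment buffers would instead fill an iovec array and set `num`
accordingly.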

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 lib/librte_security/rte_security.c           | 16 +++++++++
 lib/librte_security/rte_security.h           | 51 +++++++++++++++++++++++++++-
 lib/librte_security/rte_security_driver.h    | 19 +++++++++++
 lib/librte_security/rte_security_version.map |  1 +
 4 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/lib/librte_security/rte_security.c b/lib/librte_security/rte_security.c
index bc81ce15d..0f85c1b59 100644
--- a/lib/librte_security/rte_security.c
+++ b/lib/librte_security/rte_security.c
@@ -141,3 +141,19 @@ rte_security_capability_get(struct rte_security_ctx *instance,
 
 	return NULL;
 }
+
+void
+rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
+		struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num)
+{
+	uint32_t i;
+
+	for (i = 0; i < num; i++)
+		status[i] = -1;
+
+	RTE_FUNC_PTR_OR_RET(*instance->ops->process_cpu_crypto_bulk);
+	instance->ops->process_cpu_crypto_bulk(sess, buf, iv,
+			aad, digest, status, num);
+}
diff --git a/lib/librte_security/rte_security.h b/lib/librte_security/rte_security.h
index 96806e3a2..5a0f8901b 100644
--- a/lib/librte_security/rte_security.h
+++ b/lib/librte_security/rte_security.h
@@ -18,6 +18,7 @@ extern "C" {
 #endif
 
 #include <sys/types.h>
+#include <sys/uio.h>
 
 #include <netinet/in.h>
 #include <netinet/ip.h>
@@ -272,6 +273,20 @@ struct rte_security_pdcp_xform {
 	uint32_t hfn_threshold;
 };
 
+struct rte_security_cpu_crypto_xform {
+	/** For a cipher/authentication crypto operation the authentication may
+	 * cover more content than the cipher. E.g., for IPsec ESP encryption
+	 * with AES-CBC and SHA1-HMAC, the encryption happens after the ESP
+	 * header, but the whole packet (apart from the MAC header) is
+	 * authenticated. The cipher_offset field is used to derive the cipher
+	 * data pointer from the buffer to be processed.
+	 *
+	 * NOTE: this parameter shall be ignored by AEAD algorithms, since they
+	 * use the same offset for cipher and authentication.
+	 */
+	int32_t cipher_offset;
+};
+
 /**
  * Security session action type.
  */
@@ -286,10 +301,14 @@ enum rte_security_session_action_type {
 	/**< All security protocol processing is performed inline during
 	 * transmission
 	 */
-	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
+	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
 	/**< All security protocol processing including crypto is performed
 	 * on a lookaside accelerator
 	 */
+	RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
+	/**< Crypto processing for the security protocol is performed by the
+	 * CPU synchronously
+	 */
 };
 
 /** Security session protocol definition */
@@ -315,6 +334,7 @@ struct rte_security_session_conf {
 		struct rte_security_ipsec_xform ipsec;
 		struct rte_security_macsec_xform macsec;
 		struct rte_security_pdcp_xform pdcp;
+		struct rte_security_cpu_crypto_xform cpucrypto;
 	};
 	/**< Configuration parameters for security session */
 	struct rte_crypto_sym_xform *crypto_xform;
@@ -639,6 +659,35 @@ const struct rte_security_capability *
 rte_security_capability_get(struct rte_security_ctx *instance,
 			    struct rte_security_capability_idx *idx);
 
+/**
+ * Security vector structure, contains pointer to vector array and the length
+ * of the array
+ */
+struct rte_security_vec {
+	struct iovec *vec;
+	uint32_t num;
+};
+
+/**
+ * Process a bulk crypto workload with the CPU
+ *
+ * @param	instance	security instance.
+ * @param	sess		security session
+ * @param	buf		array of buffer SGL vectors
+ * @param	iv		array of IV pointers
+ * @param	aad		array of AAD pointers
+ * @param	digest		array of digest pointers
+ * @param	status		array of status for the function to return
+ * @param	num		number of elements in each array
+ *
+ */
+__rte_experimental
+void
+rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
+		struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_security/rte_security_driver.h b/lib/librte_security/rte_security_driver.h
index 1b561f852..70fcb0c26 100644
--- a/lib/librte_security/rte_security_driver.h
+++ b/lib/librte_security/rte_security_driver.h
@@ -132,6 +132,23 @@ typedef int (*security_get_userdata_t)(void *device,
 typedef const struct rte_security_capability *(*security_capabilities_get_t)(
 		void *device);
 
+/**
+ * Process security operations in bulk using CPU accelerated method.
+ *
+ * @param	sess		Security session structure.
+ * @param	buf		Buffer to the vectors to be processed.
+ * @param	iv		IV pointers.
+ * @param	aad		AAD pointers.
+ * @param	digest		Digest pointers.
+ * @param	status		Array of status value.
+ * @param	num		Number of elements in each array.
+ */
+
+typedef void (*security_process_cpu_crypto_bulk_t)(
+		struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
 /** Security operations function pointer table */
 struct rte_security_ops {
 	security_session_create_t session_create;
@@ -150,6 +167,8 @@ struct rte_security_ops {
 	/**< Get userdata associated with session which processed the packet. */
 	security_capabilities_get_t capabilities_get;
 	/**< Get security capabilities. */
+	security_process_cpu_crypto_bulk_t process_cpu_crypto_bulk;
+	/**< Process data in bulk. */
 };
 
 #ifdef __cplusplus
diff --git a/lib/librte_security/rte_security_version.map b/lib/librte_security/rte_security_version.map
index 53267bf3c..2132e7a00 100644
--- a/lib/librte_security/rte_security_version.map
+++ b/lib/librte_security/rte_security_version.map
@@ -18,4 +18,5 @@ EXPERIMENTAL {
 	rte_security_get_userdata;
 	rte_security_session_stats_get;
 	rte_security_session_update;
+	rte_security_process_cpu_crypto_bulk;
 };
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 02/10] crypto/aesni_gcm: add rte_security handler
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API Fan Zhang
@ 2019-09-06 13:13   ` Fan Zhang
  2019-09-18 10:24     ` Ananyev, Konstantin
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 03/10] app/test: add security cpu crypto autotest Fan Zhang
                     ` (9 subsequent siblings)
  11 siblings, 1 reply; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch adds rte_security support to the AESNI-GCM PMD. The PMD now
initializes a security context instance, creates/deletes PMD-specific
security sessions, and processes crypto workloads in synchronous mode with
scatter-gather list buffers supported.
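
The synchronous path walks each scatter-gather segment through an
incremental GCM update call and finalizes once at the end, which relies on
per-segment processing producing the same result as a one-shot pass over the
linear data. The stand-alone sketch below demonstrates that property with a
trivial additive checksum standing in for the real GCM update; every name
here is a hypothetical illustration, not part of the patch.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/uio.h>

/* Stand-in for the per-segment update call: a running byte sum. */
static void
toy_update(uint32_t *ctx, const uint8_t *data, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++)
		*ctx += data[i];
}

/* Process a buffer segment by segment, as the PMD's SGL loop does. */
static uint32_t
toy_process_sgl(const struct iovec *vec, uint32_t num)
{
	uint32_t ctx = 0;
	uint32_t i;

	for (i = 0; i < num; i++)
		toy_update(&ctx, vec[i].iov_base, vec[i].iov_len);
	return ctx;
}
```

Splitting a buffer into any number of segments leaves the final "digest"
unchanged, which is exactly what lets the PMD support SGL input without
copying the data into one contiguous buffer.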

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 drivers/crypto/aesni_gcm/aesni_gcm_pmd.c         | 91 ++++++++++++++++++++++-
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c     | 95 ++++++++++++++++++++++++
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h | 23 ++++++
 3 files changed, 208 insertions(+), 1 deletion(-)

diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
index 1006a5c4d..0a346eddd 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
@@ -6,6 +6,7 @@
 #include <rte_hexdump.h>
 #include <rte_cryptodev.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security_driver.h>
 #include <rte_bus_vdev.h>
 #include <rte_malloc.h>
 #include <rte_cpuflags.h>
@@ -174,6 +175,56 @@ aesni_gcm_get_session(struct aesni_gcm_qp *qp, struct rte_crypto_op *op)
 	return sess;
 }
 
+static __rte_always_inline int
+process_gcm_security_sgl_buf(struct aesni_gcm_security_session *sess,
+		struct rte_security_vec *buf, uint8_t *iv,
+		uint8_t *aad, uint8_t *digest)
+{
+	struct aesni_gcm_session *session = &sess->sess;
+	uint8_t *tag;
+	uint32_t i;
+
+	sess->init(&session->gdata_key, &sess->gdata_ctx, iv, aad,
+			(uint64_t)session->aad_length);
+
+	for (i = 0; i < buf->num; i++) {
+		struct iovec *vec = &buf->vec[i];
+
+		sess->update(&session->gdata_key, &sess->gdata_ctx,
+				vec->iov_base, vec->iov_base, vec->iov_len);
+	}
+
+	switch (session->op) {
+	case AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION:
+		if (session->req_digest_length != session->gen_digest_length)
+			tag = sess->temp_digest;
+		else
+			tag = digest;
+
+		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
+				session->gen_digest_length);
+
+		if (session->req_digest_length != session->gen_digest_length)
+			memcpy(digest, sess->temp_digest,
+					session->req_digest_length);
+		break;
+
+	case AESNI_GCM_OP_AUTHENTICATED_DECRYPTION:
+		tag = sess->temp_digest;
+
+		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
+				session->gen_digest_length);
+
+		if (memcmp(tag, digest,	session->req_digest_length) != 0)
+			return -1;
+		break;
+	default:
+		return -1;
+	}
+
+	return 0;
+}
+
 /**
  * Process a crypto operation, calling
  * the GCM API from the multi buffer library.
@@ -488,8 +539,10 @@ aesni_gcm_create(const char *name,
 {
 	struct rte_cryptodev *dev;
 	struct aesni_gcm_private *internals;
+	struct rte_security_ctx *sec_ctx;
 	enum aesni_gcm_vector_mode vector_mode;
 	MB_MGR *mb_mgr;
+	char sec_name[RTE_DEV_NAME_MAX_LEN];
 
 	/* Check CPU for support for AES instruction set */
 	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
@@ -524,7 +577,8 @@ aesni_gcm_create(const char *name,
 			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
 			RTE_CRYPTODEV_FF_CPU_AESNI |
 			RTE_CRYPTODEV_FF_OOP_SGL_IN_LB_OUT |
-			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
+			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
+			RTE_CRYPTODEV_FF_SECURITY;
 
 	mb_mgr = alloc_mb_mgr(0);
 	if (mb_mgr == NULL)
@@ -587,6 +641,21 @@ aesni_gcm_create(const char *name,
 
 	internals->max_nb_queue_pairs = init_params->max_nb_queue_pairs;
 
+	/* setup security operations */
+	snprintf(sec_name, sizeof(sec_name) - 1, "aes_gcm_sec_%u",
+			dev->driver_id);
+	sec_ctx = rte_zmalloc_socket(sec_name,
+			sizeof(struct rte_security_ctx),
+			RTE_CACHE_LINE_SIZE, init_params->socket_id);
+	if (sec_ctx == NULL) {
+		AESNI_GCM_LOG(ERR, "memory allocation failed\n");
+		goto error_exit;
+	}
+
+	sec_ctx->device = (void *)dev;
+	sec_ctx->ops = rte_aesni_gcm_pmd_security_ops;
+	dev->security_ctx = sec_ctx;
+
 #if IMB_VERSION_NUM >= IMB_VERSION(0, 50, 0)
 	AESNI_GCM_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
 			imb_get_version_str());
@@ -641,6 +710,8 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
 	if (cryptodev == NULL)
 		return -ENODEV;
 
+	rte_free(cryptodev->security_ctx);
+
 	internals = cryptodev->data->dev_private;
 
 	free_mb_mgr(internals->mb_mgr);
@@ -648,6 +719,24 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
 	return rte_cryptodev_pmd_destroy(cryptodev);
 }
 
+void
+aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num)
+{
+	struct aesni_gcm_security_session *session =
+			get_sec_session_private_data(sess);
+	uint32_t i;
+
+	if (unlikely(!session))
+		return;
+
+	for (i = 0; i < num; i++)
+		status[i] = process_gcm_security_sgl_buf(session, &buf[i],
+				(uint8_t *)iv[i], (uint8_t *)aad[i],
+				(uint8_t *)digest[i]);
+}
+
 static struct rte_vdev_driver aesni_gcm_pmd_drv = {
 	.probe = aesni_gcm_probe,
 	.remove = aesni_gcm_remove
diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
index 2f66c7c58..cc71dbd60 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
@@ -7,6 +7,7 @@
 #include <rte_common.h>
 #include <rte_malloc.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security_driver.h>
 
 #include "aesni_gcm_pmd_private.h"
 
@@ -316,6 +317,85 @@ aesni_gcm_pmd_sym_session_clear(struct rte_cryptodev *dev,
 	}
 }
 
+static int
+aesni_gcm_security_session_create(void *dev,
+		struct rte_security_session_conf *conf,
+		struct rte_security_session *sess,
+		struct rte_mempool *mempool)
+{
+	struct rte_cryptodev *cdev = dev;
+	struct aesni_gcm_private *internals = cdev->data->dev_private;
+	struct aesni_gcm_security_session *sess_priv;
+	int ret;
+
+	if (!conf->crypto_xform) {
+		AESNI_GCM_LOG(ERR, "Invalid security session conf");
+		return -EINVAL;
+	}
+
+	if (conf->crypto_xform->type == RTE_CRYPTO_SYM_XFORM_AUTH) {
+		AESNI_GCM_LOG(ERR, "GMAC is not supported in security session");
+		return -EINVAL;
+	}
+
+
+	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
+		AESNI_GCM_LOG(ERR,
+				"Couldn't get object from session mempool");
+		return -ENOMEM;
+	}
+
+	ret = aesni_gcm_set_session_parameters(internals->ops,
+				&sess_priv->sess, conf->crypto_xform);
+	if (ret != 0) {
+		AESNI_GCM_LOG(ERR, "Failed to configure session parameters");
+
+		/* Return session to mempool */
+		rte_mempool_put(mempool, (void *)sess_priv);
+		return ret;
+	}
+
+	sess_priv->pre = internals->ops[sess_priv->sess.key].pre;
+	sess_priv->init = internals->ops[sess_priv->sess.key].init;
+	if (sess_priv->sess.op == AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION) {
+		sess_priv->update =
+			internals->ops[sess_priv->sess.key].update_enc;
+		sess_priv->finalize =
+			internals->ops[sess_priv->sess.key].finalize_enc;
+	} else {
+		sess_priv->update =
+			internals->ops[sess_priv->sess.key].update_dec;
+		sess_priv->finalize =
+			internals->ops[sess_priv->sess.key].finalize_dec;
+	}
+
+	sess->sess_private_data = sess_priv;
+
+	return 0;
+}
+
+static int
+aesni_gcm_security_session_destroy(void *dev __rte_unused,
+		struct rte_security_session *sess)
+{
+	void *sess_priv = get_sec_session_private_data(sess);
+
+	if (sess_priv) {
+		struct rte_mempool *sess_mp = rte_mempool_from_obj(sess_priv);
+
+		memset(sess_priv, 0, sizeof(struct aesni_gcm_security_session));
+		set_sec_session_private_data(sess, NULL);
+		rte_mempool_put(sess_mp, sess_priv);
+	}
+	return 0;
+}
+
+static unsigned int
+aesni_gcm_sec_session_get_size(__rte_unused void *device)
+{
+	return sizeof(struct aesni_gcm_security_session);
+}
+
 struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
 		.dev_configure		= aesni_gcm_pmd_config,
 		.dev_start		= aesni_gcm_pmd_start,
@@ -336,4 +416,19 @@ struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
 		.sym_session_clear	= aesni_gcm_pmd_sym_session_clear
 };
 
+static struct rte_security_ops aesni_gcm_security_ops = {
+		.session_create = aesni_gcm_security_session_create,
+		.session_get_size = aesni_gcm_sec_session_get_size,
+		.session_update = NULL,
+		.session_stats_get = NULL,
+		.session_destroy = aesni_gcm_security_session_destroy,
+		.set_pkt_metadata = NULL,
+		.capabilities_get = NULL,
+		.process_cpu_crypto_bulk =
+				aesni_gcm_sec_crypto_process_bulk,
+};
+
 struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops = &aesni_gcm_pmd_ops;
+
+struct rte_security_ops *rte_aesni_gcm_pmd_security_ops =
+		&aesni_gcm_security_ops;
diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
index 56b29e013..8e490b6ce 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
@@ -114,5 +114,28 @@ aesni_gcm_set_session_parameters(const struct aesni_gcm_ops *ops,
  * Device specific operations function pointer structure */
 extern struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops;
 
+/**
+ * Security session structure.
+ */
+struct aesni_gcm_security_session {
+	/** Temp digest for decryption */
+	uint8_t temp_digest[DIGEST_LENGTH_MAX];
+	/** GCM operations */
+	aesni_gcm_pre_t pre;
+	aesni_gcm_init_t init;
+	aesni_gcm_update_t update;
+	aesni_gcm_finalize_t finalize;
+	/** AESNI-GCM session */
+	struct aesni_gcm_session sess;
+	/** AESNI-GCM context */
+	struct gcm_context_data gdata_ctx;
+};
+
+extern void
+aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
+extern struct rte_security_ops *rte_aesni_gcm_pmd_security_ops;
 
 #endif /* _RTE_AESNI_GCM_PMD_PRIVATE_H_ */
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 03/10] app/test: add security cpu crypto autotest
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API Fan Zhang
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 02/10] crypto/aesni_gcm: add rte_security handler Fan Zhang
@ 2019-09-06 13:13   ` Fan Zhang
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 04/10] app/test: add security cpu crypto perftest Fan Zhang
                     ` (8 subsequent siblings)
  11 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch adds a CPU crypto unit test for the AESNI-GCM PMD.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/Makefile                   |   1 +
 app/test/meson.build                |   1 +
 app/test/test_security_cpu_crypto.c | 564 ++++++++++++++++++++++++++++++++++++
 3 files changed, 566 insertions(+)
 create mode 100644 app/test/test_security_cpu_crypto.c

diff --git a/app/test/Makefile b/app/test/Makefile
index 26ba6fe2b..090c55746 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -196,6 +196,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_PMD_RING) += test_pmd_ring_perf.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_blockcipher.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_asym.c
+SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_security_cpu_crypto.c
 
 SRCS-$(CONFIG_RTE_LIBRTE_METRICS) += test_metrics.c
 
diff --git a/app/test/meson.build b/app/test/meson.build
index ec40943bd..b7834ff21 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -103,6 +103,7 @@ test_sources = files('commands.c',
 	'test_ring_perf.c',
 	'test_rwlock.c',
 	'test_sched.c',
+	'test_security_cpu_crypto.c',
 	'test_service_cores.c',
 	'test_spinlock.c',
 	'test_stack.c',
diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
new file mode 100644
index 000000000..d345922b2
--- /dev/null
+++ b/app/test/test_security_cpu_crypto.c
@@ -0,0 +1,564 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation
+ */
+
+#include <rte_common.h>
+#include <rte_hexdump.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+#include <rte_memcpy.h>
+#include <rte_pause.h>
+#include <rte_bus_vdev.h>
+#include <rte_random.h>
+
+#include <rte_security.h>
+
+#include <rte_crypto.h>
+#include <rte_cryptodev.h>
+#include <rte_cryptodev_pmd.h>
+
+#include "test.h"
+#include "test_cryptodev.h"
+#include "test_cryptodev_aead_test_vectors.h"
+
+#define CPU_CRYPTO_TEST_MAX_AAD_LENGTH	16
+#define MAX_NB_SIGMENTS			4
+
+enum buffer_assemble_option {
+	SGL_MAX_SEG,
+	SGL_ONE_SEG,
+};
+
+struct cpu_crypto_test_case {
+	struct {
+		uint8_t seg[MBUF_DATAPAYLOAD_SIZE];
+		uint32_t seg_len;
+	} seg_buf[MAX_NB_SIGMENTS];
+	uint8_t iv[MAXIMUM_IV_LENGTH];
+	uint8_t aad[CPU_CRYPTO_TEST_MAX_AAD_LENGTH];
+	uint8_t digest[DIGEST_BYTE_LENGTH_SHA512];
+} __rte_cache_aligned;
+
+struct cpu_crypto_test_obj {
+	struct iovec vec[MAX_NUM_OPS_INFLIGHT][MAX_NB_SIGMENTS];
+	struct rte_security_vec sec_buf[MAX_NUM_OPS_INFLIGHT];
+	void *iv[MAX_NUM_OPS_INFLIGHT];
+	void *digest[MAX_NUM_OPS_INFLIGHT];
+	void *aad[MAX_NUM_OPS_INFLIGHT];
+	int status[MAX_NUM_OPS_INFLIGHT];
+};
+
+struct cpu_crypto_testsuite_params {
+	struct rte_mempool *buf_pool;
+	struct rte_mempool *session_priv_mpool;
+	struct rte_security_ctx *ctx;
+};
+
+struct cpu_crypto_unittest_params {
+	struct rte_security_session *sess;
+	void *test_datas[MAX_NUM_OPS_INFLIGHT];
+	struct cpu_crypto_test_obj test_obj;
+	uint32_t nb_bufs;
+};
+
+static struct cpu_crypto_testsuite_params testsuite_params = { NULL };
+static struct cpu_crypto_unittest_params unittest_params;
+
+static int gbl_driver_id;
+
+static int
+testsuite_setup(void)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct rte_cryptodev_info info;
+	uint32_t i;
+	uint32_t nb_devs;
+	uint32_t sess_sz;
+	int ret;
+
+	memset(ts_params, 0, sizeof(*ts_params));
+
+	ts_params->buf_pool = rte_mempool_lookup("CPU_CRYPTO_MBUFPOOL");
+	if (ts_params->buf_pool == NULL) {
+		/* Not already created so create */
+		ts_params->buf_pool = rte_pktmbuf_pool_create(
+				"CPU_CRYPTO_MBUFPOOL",
+				NUM_MBUFS, MBUF_CACHE_SIZE, 0,
+				sizeof(struct cpu_crypto_test_case),
+				rte_socket_id());
+		if (ts_params->buf_pool == NULL) {
+			RTE_LOG(ERR, USER1, "Can't create CPU_CRYPTO_MBUFPOOL\n");
+			return TEST_FAILED;
+		}
+	}
+
+	/* Create an AESNI MB device if required */
+	if (gbl_driver_id == rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD))) {
+		nb_devs = rte_cryptodev_device_count_by_driver(
+				rte_cryptodev_driver_id_get(
+				RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD)));
+		if (nb_devs < 1) {
+			ret = rte_vdev_init(
+				RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD), NULL);
+
+			TEST_ASSERT(ret == 0,
+				"Failed to create instance of"
+				" pmd : %s",
+				RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD));
+		}
+	}
+
+	/* Create an AESNI GCM device if required */
+	if (gbl_driver_id == rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD))) {
+		nb_devs = rte_cryptodev_device_count_by_driver(
+				rte_cryptodev_driver_id_get(
+				RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD)));
+		if (nb_devs < 1) {
+			TEST_ASSERT_SUCCESS(rte_vdev_init(
+				RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD), NULL),
+				"Failed to create instance of"
+				" pmd : %s",
+				RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD));
+		}
+	}
+
+	nb_devs = rte_cryptodev_count();
+	if (nb_devs < 1) {
+		RTE_LOG(ERR, USER1, "No crypto devices found?\n");
+		return TEST_FAILED;
+	}
+
+	/* Get security context */
+	for (i = 0; i < nb_devs; i++) {
+		rte_cryptodev_info_get(i, &info);
+		if (info.driver_id != gbl_driver_id)
+			continue;
+
+		ts_params->ctx = rte_cryptodev_get_sec_ctx(i);
+		if (!ts_params->ctx) {
+			RTE_LOG(ERR, USER1, "Rte_security is not supported\n");
+			return TEST_FAILED;
+		}
+	}
+
+	sess_sz = rte_security_session_get_size(ts_params->ctx);
+	ts_params->session_priv_mpool = rte_mempool_create(
+			"cpu_crypto_test_sess_mp", 2, sess_sz, 0, 0,
+			NULL, NULL, NULL, NULL,
+			SOCKET_ID_ANY, 0);
+	if (!ts_params->session_priv_mpool) {
+		RTE_LOG(ERR, USER1, "Not enough memory\n");
+		return TEST_FAILED;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static void
+testsuite_teardown(void)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+
+	if (ts_params->buf_pool)
+		rte_mempool_free(ts_params->buf_pool);
+
+	if (ts_params->session_priv_mpool)
+		rte_mempool_free(ts_params->session_priv_mpool);
+}
+
+static int
+ut_setup(void)
+{
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+
+	memset(ut_params, 0, sizeof(*ut_params));
+	return TEST_SUCCESS;
+}
+
+static void
+ut_teardown(void)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+
+	if (ut_params->sess)
+		rte_security_session_destroy(ts_params->ctx, ut_params->sess);
+
+	if (ut_params->nb_bufs) {
+		uint32_t i;
+
+		for (i = 0; i < ut_params->nb_bufs; i++)
+			memset(ut_params->test_datas[i], 0,
+				sizeof(struct cpu_crypto_test_case));
+
+		rte_mempool_put_bulk(ts_params->buf_pool, ut_params->test_datas,
+				ut_params->nb_bufs);
+	}
+}
+
+static int
+allocate_buf(uint32_t n)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	int ret;
+
+	ret = rte_mempool_get_bulk(ts_params->buf_pool, ut_params->test_datas,
+			n);
+
+	if (ret == 0)
+		ut_params->nb_bufs = n;
+
+	return ret;
+}
+
+static int
+check_status(struct cpu_crypto_test_obj *obj, uint32_t n)
+{
+	uint32_t i;
+
+	for (i = 0; i < n; i++)
+		if (obj->status[i] < 0)
+			return -1;
+
+	return 0;
+}
+
+static struct rte_security_session *
+create_aead_session(struct rte_security_ctx *ctx,
+		struct rte_mempool *sess_mp,
+		enum rte_crypto_aead_operation op,
+		const struct aead_test_data *test_data,
+		uint32_t is_unit_test)
+{
+	struct rte_security_session_conf sess_conf = {0};
+	struct rte_crypto_sym_xform xform = {0};
+
+	if (is_unit_test)
+		debug_hexdump(stdout, "key:", test_data->key.data,
+				test_data->key.len);
+
+	/* Setup AEAD Parameters */
+	xform.type = RTE_CRYPTO_SYM_XFORM_AEAD;
+	xform.next = NULL;
+	xform.aead.algo = test_data->algo;
+	xform.aead.op = op;
+	xform.aead.key.data = test_data->key.data;
+	xform.aead.key.length = test_data->key.len;
+	xform.aead.iv.offset = 0;
+	xform.aead.iv.length = test_data->iv.len;
+	xform.aead.digest_length = test_data->auth_tag.len;
+	xform.aead.aad_length = test_data->aad.len;
+
+	sess_conf.action_type = RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO;
+	sess_conf.crypto_xform = &xform;
+
+	return rte_security_session_create(ctx, &sess_conf, sess_mp);
+}
+
+static inline int
+assemble_aead_buf(struct cpu_crypto_test_case *data,
+		struct cpu_crypto_test_obj *obj,
+		uint32_t obj_idx,
+		enum rte_crypto_aead_operation op,
+		const struct aead_test_data *test_data,
+		enum buffer_assemble_option sgl_option,
+		uint32_t is_unit_test)
+{
+	const uint8_t *src;
+	uint32_t src_len;
+	uint32_t seg_idx;
+	uint32_t bytes_per_seg;
+	uint32_t left;
+
+	if (op == RTE_CRYPTO_AEAD_OP_ENCRYPT) {
+		src = test_data->plaintext.data;
+		src_len = test_data->plaintext.len;
+		if (is_unit_test)
+			debug_hexdump(stdout, "plaintext:", src, src_len);
+	} else {
+		src = test_data->ciphertext.data;
+		src_len = test_data->ciphertext.len;
+		memcpy(data->digest, test_data->auth_tag.data,
+				test_data->auth_tag.len);
+		if (is_unit_test) {
+			debug_hexdump(stdout, "ciphertext:", src, src_len);
+			debug_hexdump(stdout, "digest:",
+					test_data->auth_tag.data,
+					test_data->auth_tag.len);
+		}
+	}
+
+	if (src_len > MBUF_DATAPAYLOAD_SIZE)
+		return -ENOMEM;
+
+	switch (sgl_option) {
+	case SGL_MAX_SEG:
+		seg_idx = 0;
+		bytes_per_seg = src_len / MAX_NB_SIGMENTS + 1;
+		left = src_len;
+
+		if (bytes_per_seg > (MBUF_DATAPAYLOAD_SIZE / MAX_NB_SIGMENTS))
+			return -ENOMEM;
+
+		while (left) {
+			uint32_t cp_len = RTE_MIN(left, bytes_per_seg);
+			memcpy(data->seg_buf[seg_idx].seg, src, cp_len);
+			data->seg_buf[seg_idx].seg_len = cp_len;
+			obj->vec[obj_idx][seg_idx].iov_base =
+					(void *)data->seg_buf[seg_idx].seg;
+			obj->vec[obj_idx][seg_idx].iov_len = cp_len;
+			src += cp_len;
+			left -= cp_len;
+			seg_idx++;
+		}
+
+		if (left)
+			return -ENOMEM;
+
+		obj->sec_buf[obj_idx].vec = obj->vec[obj_idx];
+		obj->sec_buf[obj_idx].num = seg_idx;
+
+		break;
+	case SGL_ONE_SEG:
+		memcpy(data->seg_buf[0].seg, src, src_len);
+		data->seg_buf[0].seg_len = src_len;
+		obj->vec[obj_idx][0].iov_base =
+				(void *)data->seg_buf[0].seg;
+		obj->vec[obj_idx][0].iov_len = src_len;
+
+		obj->sec_buf[obj_idx].vec = obj->vec[obj_idx];
+		obj->sec_buf[obj_idx].num = 1;
+		break;
+	default:
+		return -1;
+	}
+
+	if (test_data->algo == RTE_CRYPTO_AEAD_AES_CCM) {
+		memcpy(data->iv + 1, test_data->iv.data, test_data->iv.len);
+		memcpy(data->aad + 18, test_data->aad.data, test_data->aad.len);
+	} else {
+		memcpy(data->iv, test_data->iv.data, test_data->iv.len);
+		memcpy(data->aad, test_data->aad.data, test_data->aad.len);
+	}
+
+	if (is_unit_test) {
+		debug_hexdump(stdout, "iv:", test_data->iv.data,
+				test_data->iv.len);
+		debug_hexdump(stdout, "aad:", test_data->aad.data,
+				test_data->aad.len);
+	}
+
+	obj->iv[obj_idx] = (void *)data->iv;
+	obj->digest[obj_idx] = (void *)data->digest;
+	obj->aad[obj_idx] = (void *)data->aad;
+
+	return 0;
+}
+
+#define CPU_CRYPTO_ERR_EXP_CT	"expect ciphertext:"
+#define CPU_CRYPTO_ERR_GEN_CT	"gen ciphertext:"
+#define CPU_CRYPTO_ERR_EXP_PT	"expect plaintext:"
+#define CPU_CRYPTO_ERR_GEN_PT	"gen plaintext:"
+
+static int
+check_aead_result(struct cpu_crypto_test_case *tcase,
+		enum rte_crypto_aead_operation op,
+		const struct aead_test_data *tdata)
+{
+	const char *err_msg1, *err_msg2;
+	const uint8_t *src_pt_ct;
+	const uint8_t *tmp_src;
+	uint32_t src_len;
+	uint32_t left;
+	uint32_t i = 0;
+	int ret;
+
+	if (op == RTE_CRYPTO_AEAD_OP_ENCRYPT) {
+		err_msg1 = CPU_CRYPTO_ERR_EXP_CT;
+		err_msg2 = CPU_CRYPTO_ERR_GEN_CT;
+
+		src_pt_ct = tdata->ciphertext.data;
+		src_len = tdata->ciphertext.len;
+
+		ret = memcmp(tcase->digest, tdata->auth_tag.data,
+				tdata->auth_tag.len);
+		if (ret != 0) {
+			debug_hexdump(stdout, "expect digest:",
+					tdata->auth_tag.data,
+					tdata->auth_tag.len);
+			debug_hexdump(stdout, "gen digest:",
+					tcase->digest,
+					tdata->auth_tag.len);
+			return -1;
+		}
+	} else {
+		src_pt_ct = tdata->plaintext.data;
+		src_len = tdata->plaintext.len;
+		err_msg1 = CPU_CRYPTO_ERR_EXP_PT;
+		err_msg2 = CPU_CRYPTO_ERR_GEN_PT;
+	}
+
+	tmp_src = src_pt_ct;
+	left = src_len;
+
+	while (left && i < MAX_NB_SIGMENTS) {
+		ret = memcmp(tcase->seg_buf[i].seg, tmp_src,
+				tcase->seg_buf[i].seg_len);
+		if (ret != 0)
+			goto sgl_err_dump;
+		tmp_src += tcase->seg_buf[i].seg_len;
+		left -= tcase->seg_buf[i].seg_len;
+		i++;
+	}
+
+	if (left) {
+		ret = -ENOMEM;
+		goto sgl_err_dump;
+	}
+
+	return 0;
+
+sgl_err_dump:
+	left = src_len;
+	i = 0;
+
+	debug_hexdump(stdout, err_msg1,
+			src_pt_ct,
+			src_len);
+
+	while (left && i < MAX_NB_SIGMENTS) {
+		debug_hexdump(stdout, err_msg2,
+				tcase->seg_buf[i].seg,
+				tcase->seg_buf[i].seg_len);
+		left -= tcase->seg_buf[i].seg_len;
+		i++;
+	}
+	return ret;
+}
+
+static inline void
+run_test(struct rte_security_ctx *ctx, struct rte_security_session *sess,
+		struct cpu_crypto_test_obj *obj, uint32_t n)
+{
+	rte_security_process_cpu_crypto_bulk(ctx, sess, obj->sec_buf,
+			obj->iv, obj->aad, obj->digest, obj->status, n);
+}
+
+static int
+cpu_crypto_test_aead(const struct aead_test_data *tdata,
+		enum rte_crypto_aead_operation dir,
+		enum buffer_assemble_option sgl_option)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	int ret;
+
+	ut_params->sess = create_aead_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			dir,
+			tdata,
+			1);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(1);
+	if (ret)
+		return ret;
+
+	tcase = ut_params->test_datas[0];
+	ret = assemble_aead_buf(tcase, obj, 0, dir, tdata, sgl_option, 1);
+	if (ret < 0) {
+		printf("Test is not supported by the driver\n");
+		return ret;
+	}
+
+	run_test(ts_params->ctx, ut_params->sess, obj, 1);
+
+	ret = check_status(obj, 1);
+	if (ret < 0)
+		return ret;
+
+	ret = check_aead_result(tcase, dir, tdata);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+/* test-vector/sgl-option */
+#define all_gcm_unit_test_cases(type)		\
+	TEST_EXPAND(gcm_test_case_1, type)	\
+	TEST_EXPAND(gcm_test_case_2, type)	\
+	TEST_EXPAND(gcm_test_case_3, type)	\
+	TEST_EXPAND(gcm_test_case_4, type)	\
+	TEST_EXPAND(gcm_test_case_5, type)	\
+	TEST_EXPAND(gcm_test_case_6, type)	\
+	TEST_EXPAND(gcm_test_case_7, type)	\
+	TEST_EXPAND(gcm_test_case_8, type)	\
+	TEST_EXPAND(gcm_test_case_192_1, type)	\
+	TEST_EXPAND(gcm_test_case_192_2, type)	\
+	TEST_EXPAND(gcm_test_case_192_3, type)	\
+	TEST_EXPAND(gcm_test_case_192_4, type)	\
+	TEST_EXPAND(gcm_test_case_192_5, type)	\
+	TEST_EXPAND(gcm_test_case_192_6, type)	\
+	TEST_EXPAND(gcm_test_case_192_7, type)	\
+	TEST_EXPAND(gcm_test_case_256_1, type)	\
+	TEST_EXPAND(gcm_test_case_256_2, type)	\
+	TEST_EXPAND(gcm_test_case_256_3, type)	\
+	TEST_EXPAND(gcm_test_case_256_4, type)	\
+	TEST_EXPAND(gcm_test_case_256_5, type)	\
+	TEST_EXPAND(gcm_test_case_256_6, type)	\
+	TEST_EXPAND(gcm_test_case_256_7, type)
+
+
+#define TEST_EXPAND(t, o)						\
+static int								\
+cpu_crypto_aead_enc_test_##t##_##o(void)				\
+{									\
+	return cpu_crypto_test_aead(&t, RTE_CRYPTO_AEAD_OP_ENCRYPT, o);	\
+}									\
+static int								\
+cpu_crypto_aead_dec_test_##t##_##o(void)				\
+{									\
+	return cpu_crypto_test_aead(&t, RTE_CRYPTO_AEAD_OP_DECRYPT, o);	\
+}									\
+
+all_gcm_unit_test_cases(SGL_ONE_SEG)
+all_gcm_unit_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesgcm_testsuite  = {
+	.suite_name = "Security CPU Crypto AESNI-GCM Unit Test Suite",
+	.setup = testsuite_setup,
+	.teardown = testsuite_teardown,
+	.unit_test_cases = {
+#define TEST_EXPAND(t, o)						\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_enc_test_##t##_##o),		\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_dec_test_##t##_##o),		\
+
+	all_gcm_unit_test_cases(SGL_ONE_SEG)
+	all_gcm_unit_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+	TEST_CASES_END() /**< NULL terminate unit test array */
+	},
+};
+
+static int
+test_security_cpu_crypto_aesni_gcm(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD));
+
+	return unit_test_suite_runner(&security_cpu_crypto_aesgcm_testsuite);
+}
+
+REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
+		test_security_cpu_crypto_aesni_gcm);
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 04/10] app/test: add security cpu crypto perftest
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
                     ` (2 preceding siblings ...)
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 03/10] app/test: add security cpu crypto autotest Fan Zhang
@ 2019-09-06 13:13   ` Fan Zhang
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 05/10] crypto/aesni_mb: add rte_security handler Fan Zhang
                     ` (7 subsequent siblings)
  11 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

Since the crypto perf application does not support rte_security, this
patch adds a simple GCM CPU crypto performance test to the crypto
unit-test application. The test covers several key and data sizes, with
both single-buffer and SGL-buffer test items, and displays throughput
as well as cycle-count performance information.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/test_security_cpu_crypto.c | 201 ++++++++++++++++++++++++++++++++++++
 1 file changed, 201 insertions(+)

diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
index d345922b2..ca9a8dae6 100644
--- a/app/test/test_security_cpu_crypto.c
+++ b/app/test/test_security_cpu_crypto.c
@@ -23,6 +23,7 @@
 
 #define CPU_CRYPTO_TEST_MAX_AAD_LENGTH	16
 #define MAX_NB_SIGMENTS			4
+#define CACHE_WARM_ITER			2048
 
 enum buffer_assemble_option {
 	SGL_MAX_SEG,
@@ -560,5 +561,205 @@ test_security_cpu_crypto_aesni_gcm(void)
 	return unit_test_suite_runner(&security_cpu_crypto_aesgcm_testsuite);
 }
 
+
+static inline void
+gen_rand(uint8_t *data, uint32_t len)
+{
+	uint32_t i;
+
+	for (i = 0; i < len; i++)
+		data[i] = (uint8_t)rte_rand();
+}
+
+static inline void
+switch_aead_enc_to_dec(struct aead_test_data *tdata,
+		struct cpu_crypto_test_case *tcase,
+		enum buffer_assemble_option sgl_option)
+{
+	uint32_t i;
+	uint8_t *dst = tdata->ciphertext.data;
+
+	switch (sgl_option) {
+	case SGL_ONE_SEG:
+		memcpy(dst, tcase->seg_buf[0].seg, tcase->seg_buf[0].seg_len);
+		tdata->ciphertext.len = tcase->seg_buf[0].seg_len;
+		break;
+	case SGL_MAX_SEG:
+		tdata->ciphertext.len = 0;
+		for (i = 0; i < MAX_NB_SIGMENTS; i++) {
+			memcpy(dst, tcase->seg_buf[i].seg,
+					tcase->seg_buf[i].seg_len);
+			dst += tcase->seg_buf[i].seg_len;
+			tdata->ciphertext.len += tcase->seg_buf[i].seg_len;
+		}
+		break;
+	}
+
+	memcpy(tdata->auth_tag.data, tcase->digest, tdata->auth_tag.len);
+}
+
+static int
+cpu_crypto_test_aead_perf(enum buffer_assemble_option sgl_option,
+		uint32_t key_sz)
+{
+	struct aead_test_data tdata = {0};
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	uint64_t hz = rte_get_tsc_hz(), time_start, time_now;
+	double rate, cycles_per_buf;
+	uint32_t test_data_szs[] = {64, 128, 256, 512, 1024, 2048};
+	uint32_t i, j;
+	uint8_t aad[16];
+	int ret;
+
+	tdata.key.len = key_sz;
+	gen_rand(tdata.key.data, tdata.key.len);
+	tdata.algo = RTE_CRYPTO_AEAD_AES_GCM;
+	tdata.aad.data = aad;
+
+	ut_params->sess = create_aead_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			RTE_CRYPTO_AEAD_OP_DECRYPT,
+			&tdata,
+			0);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(MAX_NUM_OPS_INFLIGHT);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < RTE_DIM(test_data_szs); i++) {
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tdata.plaintext.len = test_data_szs[i];
+			gen_rand(tdata.plaintext.data,
+					tdata.plaintext.len);
+
+			tdata.aad.len = 12;
+			gen_rand(tdata.aad.data, tdata.aad.len);
+
+			tdata.auth_tag.len = 16;
+
+			tdata.iv.len = 16;
+			gen_rand(tdata.iv.data, tdata.iv.len);
+
+			tcase = ut_params->test_datas[j];
+			ret = assemble_aead_buf(tcase, obj, j,
+					RTE_CRYPTO_AEAD_OP_ENCRYPT,
+					&tdata, sgl_option, 0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		/* warm up cache */
+		for (j = 0; j < CACHE_WARM_ITER; j++)
+			run_test(ts_params->ctx, ut_params->sess, obj,
+					MAX_NUM_OPS_INFLIGHT);
+
+		time_start = rte_rdtsc();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_rdtsc();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("AES-GCM-%u(%4uB) Enc %03.3fMpps (%03.3fGbps) ",
+				key_sz * 8, test_data_szs[i], rate,
+				rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+				cycles_per_buf,
+				cycles_per_buf / test_data_szs[i]);
+
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tcase = ut_params->test_datas[j];
+
+			switch_aead_enc_to_dec(&tdata, tcase, sgl_option);
+			ret = assemble_aead_buf(tcase, obj, j,
+					RTE_CRYPTO_AEAD_OP_DECRYPT,
+					&tdata, sgl_option, 0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		time_start = rte_rdtsc();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_rdtsc();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("AES-GCM-%u(%4uB) Dec %03.3fMpps (%03.3fGbps) ",
+				key_sz * 8, test_data_szs[i], rate,
+				rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+				cycles_per_buf,
+				cycles_per_buf / test_data_szs[i]);
+	}
+
+	return 0;
+}
+
+/* test-prefix/key-size/sgl-type */
+#define all_gcm_perf_test_cases(type)					\
+	TEST_EXPAND(_128, 16, type)					\
+	TEST_EXPAND(_192, 24, type)					\
+	TEST_EXPAND(_256, 32, type)
+
+#define TEST_EXPAND(a, b, c)						\
+static int								\
+cpu_crypto_gcm_perf##a##_##c(void)					\
+{									\
+	return cpu_crypto_test_aead_perf(c, b);				\
+}									\
+
+all_gcm_perf_test_cases(SGL_ONE_SEG)
+all_gcm_perf_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesgcm_perf_testsuite  = {
+		.suite_name = "Security CPU Crypto AESNI-GCM Perf Test Suite",
+		.setup = testsuite_setup,
+		.teardown = testsuite_teardown,
+		.unit_test_cases = {
+#define TEST_EXPAND(a, b, c)						\
+		TEST_CASE_ST(ut_setup, ut_teardown,			\
+				cpu_crypto_gcm_perf##a##_##c),		\
+
+		all_gcm_perf_test_cases(SGL_ONE_SEG)
+		all_gcm_perf_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+		TEST_CASES_END() /**< NULL terminate unit test array */
+		},
+};
+
+static int
+test_security_cpu_crypto_aesni_gcm_perf(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD));
+
+	return unit_test_suite_runner(
+			&security_cpu_crypto_aesgcm_perf_testsuite);
+}
+
 REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
 		test_security_cpu_crypto_aesni_gcm);
+
+REGISTER_TEST_COMMAND(security_aesni_gcm_perftest,
+		test_security_cpu_crypto_aesni_gcm_perf);
-- 
2.14.5



* [dpdk-dev] [PATCH 05/10] crypto/aesni_mb: add rte_security handler
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
                     ` (3 preceding siblings ...)
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 04/10] app/test: add security cpu crypto perftest Fan Zhang
@ 2019-09-06 13:13   ` Fan Zhang
  2019-09-18 15:20     ` Ananyev, Konstantin
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 06/10] app/test: add aesni_mb security cpu crypto autotest Fan Zhang
                     ` (6 subsequent siblings)
  11 siblings, 1 reply; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch adds rte_security support to the AESNI-MB PMD. The PMD now
initializes a security context instance, creates/deletes PMD-specific
security sessions, and processes crypto workloads in synchronous mode.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c         | 291 ++++++++++++++++++++-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c     |  91 ++++++-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |  21 +-
 3 files changed, 398 insertions(+), 5 deletions(-)

diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
index b495a9679..68767c04e 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
@@ -8,6 +8,8 @@
 #include <rte_hexdump.h>
 #include <rte_cryptodev.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security.h>
+#include <rte_security_driver.h>
 #include <rte_bus_vdev.h>
 #include <rte_malloc.h>
 #include <rte_cpuflags.h>
@@ -789,6 +791,167 @@ auth_start_offset(struct rte_crypto_op *op, struct aesni_mb_session *session,
 			(UINT64_MAX - u_src + u_dst + 1);
 }
 
+union sec_userdata_field {
+	int status;
+	struct {
+		uint16_t is_gen_digest;
+		uint16_t digest_len;
+	};
+};
+
+struct sec_udata_digest_field {
+	uint32_t is_digest_gen;
+	uint32_t digest_len;
+};
+
+static inline int
+set_mb_job_params_sec(JOB_AES_HMAC *job, struct aesni_mb_sec_session *sec_sess,
+		void *buf, uint32_t buf_len, void *iv, void *aad, void *digest,
+		int *status, uint8_t *digest_idx)
+{
+	struct aesni_mb_session *session = &sec_sess->sess;
+	uint32_t cipher_offset = sec_sess->cipher_offset;
+	void *user_digest = NULL;
+	union sec_userdata_field udata;
+
+	if (unlikely(cipher_offset > buf_len))
+		return -EINVAL;
+
+	/* Set crypto operation */
+	job->chain_order = session->chain_order;
+
+	/* Set cipher parameters */
+	job->cipher_direction = session->cipher.direction;
+	job->cipher_mode = session->cipher.mode;
+
+	job->aes_key_len_in_bytes = session->cipher.key_length_in_bytes;
+
+	/* Set authentication parameters */
+	job->hash_alg = session->auth.algo;
+	job->iv = iv;
+
+	switch (job->hash_alg) {
+	case AES_XCBC:
+		job->u.XCBC._k1_expanded = session->auth.xcbc.k1_expanded;
+		job->u.XCBC._k2 = session->auth.xcbc.k2;
+		job->u.XCBC._k3 = session->auth.xcbc.k3;
+
+		job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+		job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		break;
+
+	case AES_CCM:
+		job->u.CCM.aad = (uint8_t *)aad + 18;
+		job->u.CCM.aad_len_in_bytes = session->aead.aad_len;
+		job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+		job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		job->iv++;
+		break;
+
+	case AES_CMAC:
+		job->u.CMAC._key_expanded = session->auth.cmac.expkey;
+		job->u.CMAC._skey1 = session->auth.cmac.skey1;
+		job->u.CMAC._skey2 = session->auth.cmac.skey2;
+		job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+		job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		break;
+
+	case AES_GMAC:
+		if (session->cipher.mode == GCM) {
+			job->u.GCM.aad = aad;
+			job->u.GCM.aad_len_in_bytes = session->aead.aad_len;
+		} else {
+			/* For GMAC */
+			job->u.GCM.aad = aad;
+			job->u.GCM.aad_len_in_bytes = buf_len;
+			job->cipher_mode = GCM;
+		}
+		job->aes_enc_key_expanded = &session->cipher.gcm_key;
+		job->aes_dec_key_expanded = &session->cipher.gcm_key;
+		break;
+
+	default:
+		job->u.HMAC._hashed_auth_key_xor_ipad =
+				session->auth.pads.inner;
+		job->u.HMAC._hashed_auth_key_xor_opad =
+				session->auth.pads.outer;
+
+		if (job->cipher_mode == DES3) {
+			job->aes_enc_key_expanded =
+				session->cipher.exp_3des_keys.ks_ptr;
+			job->aes_dec_key_expanded =
+				session->cipher.exp_3des_keys.ks_ptr;
+		} else {
+			job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+			job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		}
+	}
+
+	/* Set digest output location */
+	if (job->hash_alg != NULL_HASH &&
+			session->auth.operation == RTE_CRYPTO_AUTH_OP_VERIFY) {
+		job->auth_tag_output = sec_sess->temp_digests[*digest_idx];
+		*digest_idx = (*digest_idx + 1) % MAX_JOBS;
+
+		udata.is_gen_digest = 0;
+		udata.digest_len = session->auth.req_digest_len;
+		user_digest = (void *)digest;
+	} else {
+		udata.is_gen_digest = 1;
+		udata.digest_len = session->auth.req_digest_len;
+
+		if (session->auth.req_digest_len !=
+				session->auth.gen_digest_len) {
+			job->auth_tag_output =
+					sec_sess->temp_digests[*digest_idx];
+			*digest_idx = (*digest_idx + 1) % MAX_JOBS;
+
+			user_digest = (void *)digest;
+		} else
+			job->auth_tag_output = digest;
+
+		/* A bit of a hack here: since the job structure only
+		 * supports 2 user data fields and we need 4 params to be
+		 * passed (status, direction, digest for verify, and digest
+		 * length), we temporarily store the digest length +
+		 * direction in the status value to avoid allocating a
+		 * longer buffer for all 4 params.
+		 */
+		*status = udata.status;
+	}
+	/*
+	 * The multi-buffer library currently only supports returning a
+	 * truncated digest length as specified in the relevant IPsec RFCs.
+	 */
+
+	/* Set digest length */
+	job->auth_tag_output_len_in_bytes = session->auth.gen_digest_len;
+
+	/* Set IV parameters */
+	job->iv_len_in_bytes = session->iv.length;
+
+	/* Data Parameters */
+	job->src = buf;
+	job->dst = buf;
+	job->cipher_start_src_offset_in_bytes = cipher_offset;
+	job->msg_len_to_cipher_in_bytes = buf_len - cipher_offset;
+	job->hash_start_src_offset_in_bytes = 0;
+	job->msg_len_to_hash_in_bytes = buf_len;
+
+	job->user_data = (void *)status;
+	job->user_data2 = user_digest;
+
+	return 0;
+}
+
 /**
  * Process a crypto operation and complete a JOB_AES_HMAC job structure for
  * submission to the multi buffer library for processing.
@@ -1081,6 +1244,37 @@ post_process_mb_job(struct aesni_mb_qp *qp, JOB_AES_HMAC *job)
 	return op;
 }
 
+static inline void
+post_process_mb_sec_job(JOB_AES_HMAC *job)
+{
+	void *user_digest = job->user_data2;
+	int *status = job->user_data;
+	union sec_userdata_field udata;
+
+	switch (job->status) {
+	case STS_COMPLETED:
+		if (user_digest) {
+			udata.status = *status;
+
+			if (udata.is_gen_digest) {
+				*status = RTE_CRYPTO_OP_STATUS_SUCCESS;
+				memcpy(user_digest, job->auth_tag_output,
+						udata.digest_len);
+			} else {
+				verify_digest(job, user_digest,
+					udata.digest_len, (uint8_t *)status);
+
+				if (*status == RTE_CRYPTO_OP_STATUS_AUTH_FAILED)
+					*status = -1;
+			}
+		} else
+			*status = RTE_CRYPTO_OP_STATUS_SUCCESS;
+		break;
+	default:
+		*status = RTE_CRYPTO_OP_STATUS_ERROR;
+	}
+}
+
 /**
  * Process a completed JOB_AES_HMAC job and keep processing jobs until
  * get_completed_job return NULL
@@ -1117,6 +1311,32 @@ handle_completed_jobs(struct aesni_mb_qp *qp, JOB_AES_HMAC *job,
 	return processed_jobs;
 }
 
+static inline uint32_t
+handle_completed_sec_jobs(JOB_AES_HMAC *job, MB_MGR *mb_mgr)
+{
+	uint32_t processed = 0;
+
+	while (job != NULL) {
+		post_process_mb_sec_job(job);
+		job = IMB_GET_COMPLETED_JOB(mb_mgr);
+		processed++;
+	}
+
+	return processed;
+}
+
+static inline uint32_t
+flush_mb_sec_mgr(MB_MGR *mb_mgr)
+{
+	JOB_AES_HMAC *job = IMB_FLUSH_JOB(mb_mgr);
+	uint32_t processed = 0;
+
+	if (job)
+		processed = handle_completed_sec_jobs(job, mb_mgr);
+
+	return processed;
+}
+
 static inline uint16_t
 flush_mb_mgr(struct aesni_mb_qp *qp, struct rte_crypto_op **ops,
 		uint16_t nb_ops)
@@ -1220,6 +1440,55 @@ aesni_mb_pmd_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
 	return processed_jobs;
 }
 
+void
+aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num)
+{
+	struct aesni_mb_sec_session *sec_sess = sess->sess_private_data;
+	JOB_AES_HMAC *job;
+	uint8_t digest_idx = sec_sess->digest_idx;
+	uint32_t i, processed = 0;
+	int ret;
+
+	for (i = 0; i < num; i++) {
+		void *seg_buf = buf[i].vec[0].iov_base;
+		uint32_t buf_len = buf[i].vec[0].iov_len;
+
+		job = IMB_GET_NEXT_JOB(sec_sess->mb_mgr);
+		if (unlikely(job == NULL)) {
+			processed += flush_mb_sec_mgr(sec_sess->mb_mgr);
+
+			job = IMB_GET_NEXT_JOB(sec_sess->mb_mgr);
+			if (!job)
+				return;
+		}
+
+		ret = set_mb_job_params_sec(job, sec_sess, seg_buf, buf_len,
+				iv[i], aad[i], digest[i], &status[i],
+				&digest_idx);
+		if (ret) {
+			processed++;
+			status[i] = ret;
+			continue;
+		}
+
+		/* Submit job to multi-buffer for processing */
+#ifdef RTE_LIBRTE_PMD_AESNI_MB_DEBUG
+		job = IMB_SUBMIT_JOB(sec_sess->mb_mgr);
+#else
+		job = IMB_SUBMIT_JOB_NOCHECK(sec_sess->mb_mgr);
+#endif
+
+		if (job)
+			processed += handle_completed_sec_jobs(job,
+					sec_sess->mb_mgr);
+	}
+
+	while (processed < num)
+		processed += flush_mb_sec_mgr(sec_sess->mb_mgr);
+}
+
 static int cryptodev_aesni_mb_remove(struct rte_vdev_device *vdev);
 
 static int
@@ -1229,8 +1498,10 @@ cryptodev_aesni_mb_create(const char *name,
 {
 	struct rte_cryptodev *dev;
 	struct aesni_mb_private *internals;
+	struct rte_security_ctx *sec_ctx = NULL;
 	enum aesni_mb_vector_mode vector_mode;
 	MB_MGR *mb_mgr;
+	char sec_name[RTE_DEV_NAME_MAX_LEN];
 
 	/* Check CPU for support for AES instruction set */
 	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
@@ -1264,7 +1535,8 @@ cryptodev_aesni_mb_create(const char *name,
 	dev->feature_flags = RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO |
 			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
 			RTE_CRYPTODEV_FF_CPU_AESNI |
-			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
+			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
+			RTE_CRYPTODEV_FF_SECURITY;
 
 
 	mb_mgr = alloc_mb_mgr(0);
@@ -1303,11 +1575,28 @@ cryptodev_aesni_mb_create(const char *name,
 	AESNI_MB_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
 			imb_get_version_str());
 
+	/* setup security operations */
+	snprintf(sec_name, sizeof(sec_name) - 1, "aes_mb_sec_%u",
+			dev->driver_id);
+	sec_ctx = rte_zmalloc_socket(sec_name,
+			sizeof(struct rte_security_ctx),
+			RTE_CACHE_LINE_SIZE, init_params->socket_id);
+	if (sec_ctx == NULL) {
+		AESNI_MB_LOG(ERR, "memory allocation failed\n");
+		goto error_exit;
+	}
+
+	sec_ctx->device = (void *)dev;
+	sec_ctx->ops = rte_aesni_mb_pmd_security_ops;
+	dev->security_ctx = sec_ctx;
+
 	return 0;
 
 error_exit:
 	if (mb_mgr)
 		free_mb_mgr(mb_mgr);
+	if (sec_ctx)
+		rte_free(sec_ctx);
 
 	rte_cryptodev_pmd_destroy(dev);
 
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
index 8d15b99d4..ca6cea775 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
@@ -8,6 +8,7 @@
 #include <rte_common.h>
 #include <rte_malloc.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security_driver.h>
 
 #include "rte_aesni_mb_pmd_private.h"
 
@@ -732,7 +733,8 @@ aesni_mb_pmd_qp_count(struct rte_cryptodev *dev)
 static unsigned
 aesni_mb_pmd_sym_session_get_size(struct rte_cryptodev *dev __rte_unused)
 {
-	return sizeof(struct aesni_mb_session);
+	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_session),
+			RTE_CACHE_LINE_SIZE);
 }
 
 /** Configure a aesni multi-buffer session from a crypto xform chain */
@@ -810,4 +812,91 @@ struct rte_cryptodev_ops aesni_mb_pmd_ops = {
 		.sym_session_clear	= aesni_mb_pmd_sym_session_clear
 };
 
+/** Set session authentication parameters */
+
+static int
+aesni_mb_security_session_create(void *dev,
+		struct rte_security_session_conf *conf,
+		struct rte_security_session *sess,
+		struct rte_mempool *mempool)
+{
+	struct rte_cryptodev *cdev = dev;
+	struct aesni_mb_private *internals = cdev->data->dev_private;
+	struct aesni_mb_sec_session *sess_priv;
+	int ret;
+
+	if (!conf->crypto_xform) {
+		AESNI_MB_LOG(ERR, "Invalid security session conf");
+		return -EINVAL;
+	}
+
+	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
+		AESNI_MB_LOG(ERR,
+				"Couldn't get object from session mempool");
+		return -ENOMEM;
+	}
+
+	sess_priv->mb_mgr = internals->mb_mgr;
+	if (sess_priv->mb_mgr == NULL) {
+		rte_mempool_put(mempool, sess_priv);
+		return -ENOMEM;
+	}
+
+	sess_priv->cipher_offset = conf->cpucrypto.cipher_offset;
+
+	ret = aesni_mb_set_session_parameters(sess_priv->mb_mgr,
+			&sess_priv->sess, conf->crypto_xform);
+	if (ret != 0) {
+		AESNI_MB_LOG(ERR, "failed to configure session parameters");
+		rte_mempool_put(mempool, sess_priv);
+		return ret;
+	}
+
+	sess->sess_private_data = (void *)sess_priv;
+
+	return 0;
+}
+
+static int
+aesni_mb_security_session_destroy(void *dev __rte_unused,
+		struct rte_security_session *sess)
+{
+	struct aesni_mb_sec_session *sess_priv =
+			get_sec_session_private_data(sess);
+
+	if (sess_priv) {
+		struct rte_mempool *sess_mp = rte_mempool_from_obj(
+				(void *)sess_priv);
+
+		memset(sess_priv, 0, sizeof(struct aesni_mb_sec_session));
+		set_sec_session_private_data(sess, NULL);
+
+		if (sess_mp == NULL) {
+			AESNI_MB_LOG(ERR, "failed to fetch session mempool");
+			return -EINVAL;
+		}
+
+		rte_mempool_put(sess_mp, sess_priv);
+	}
+
+	return 0;
+}
+
+static unsigned int
+aesni_mb_sec_session_get_size(__rte_unused void *device)
+{
+	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_sec_session),
+			RTE_CACHE_LINE_SIZE);
+}
+
+static struct rte_security_ops aesni_mb_security_ops = {
+		.session_create = aesni_mb_security_session_create,
+		.session_get_size = aesni_mb_sec_session_get_size,
+		.session_update = NULL,
+		.session_stats_get = NULL,
+		.session_destroy = aesni_mb_security_session_destroy,
+		.set_pkt_metadata = NULL,
+		.capabilities_get = NULL,
+		.process_cpu_crypto_bulk = aesni_mb_sec_crypto_process_bulk,
+};
+
 struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops = &aesni_mb_pmd_ops;
+struct rte_security_ops *rte_aesni_mb_pmd_security_ops = &aesni_mb_security_ops;
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
index b794d4bc1..d1cf416ab 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
@@ -176,7 +176,6 @@ struct aesni_mb_qp {
 	 */
 } __rte_cache_aligned;
 
-/** AES-NI multi-buffer private session structure */
 struct aesni_mb_session {
 	JOB_CHAIN_ORDER chain_order;
 	struct {
@@ -265,16 +264,32 @@ struct aesni_mb_session {
 		/** AAD data length */
 		uint16_t aad_len;
 	} aead;
-} __rte_cache_aligned;
+};
+
+/** AES-NI multi-buffer private security session structure */
+struct aesni_mb_sec_session {
+	/** Inner cryptodev session data */
+	struct aesni_mb_session sess;
+	uint8_t temp_digests[MAX_JOBS][DIGEST_LENGTH_MAX];
+	uint16_t digest_idx;
+	uint32_t cipher_offset;
+	MB_MGR *mb_mgr;
+};
 
 extern int
 aesni_mb_set_session_parameters(const MB_MGR *mb_mgr,
 		struct aesni_mb_session *sess,
 		const struct rte_crypto_sym_xform *xform);
 
+extern void
+aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
 /** device specific operations function pointer structure */
 extern struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops;
 
-
+/** device specific operations function pointer structure for rte_security */
+extern struct rte_security_ops *rte_aesni_mb_pmd_security_ops;
 
 #endif /* _RTE_AESNI_MB_PMD_PRIVATE_H_ */
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 06/10] app/test: add aesni_mb security cpu crypto autotest
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
                     ` (4 preceding siblings ...)
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 05/10] crypto/aesni_mb: add rte_security handler Fan Zhang
@ 2019-09-06 13:13   ` Fan Zhang
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 07/10] app/test: add aesni_mb security cpu crypto perftest Fan Zhang
                     ` (5 subsequent siblings)
  11 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch adds a CPU crypto unit test for the AESNI_MB PMD.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/test_security_cpu_crypto.c | 367 ++++++++++++++++++++++++++++++++++++
 1 file changed, 367 insertions(+)

diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
index ca9a8dae6..0ea406390 100644
--- a/app/test/test_security_cpu_crypto.c
+++ b/app/test/test_security_cpu_crypto.c
@@ -19,12 +19,23 @@
 
 #include "test.h"
 #include "test_cryptodev.h"
+#include "test_cryptodev_blockcipher.h"
+#include "test_cryptodev_aes_test_vectors.h"
 #include "test_cryptodev_aead_test_vectors.h"
+#include "test_cryptodev_des_test_vectors.h"
+#include "test_cryptodev_hash_test_vectors.h"
 
 #define CPU_CRYPTO_TEST_MAX_AAD_LENGTH	16
 #define MAX_NB_SIGMENTS			4
 #define CACHE_WARM_ITER			2048
 
+#define TOP_ENC		BLOCKCIPHER_TEST_OP_ENCRYPT
+#define TOP_DEC		BLOCKCIPHER_TEST_OP_DECRYPT
+#define TOP_AUTH_GEN	BLOCKCIPHER_TEST_OP_AUTH_GEN
+#define TOP_AUTH_VER	BLOCKCIPHER_TEST_OP_AUTH_VERIFY
+#define TOP_ENC_AUTH	BLOCKCIPHER_TEST_OP_ENC_AUTH_GEN
+#define TOP_AUTH_DEC	BLOCKCIPHER_TEST_OP_AUTH_VERIFY_DEC
+
 enum buffer_assemble_option {
 	SGL_MAX_SEG,
 	SGL_ONE_SEG,
@@ -516,6 +527,11 @@ cpu_crypto_test_aead(const struct aead_test_data *tdata,
 	TEST_EXPAND(gcm_test_case_256_6, type)	\
 	TEST_EXPAND(gcm_test_case_256_7, type)
 
+/* test-vector/sgl-option */
+#define all_ccm_unit_test_cases \
+	TEST_EXPAND(ccm_test_case_128_1, SGL_ONE_SEG) \
+	TEST_EXPAND(ccm_test_case_128_2, SGL_ONE_SEG) \
+	TEST_EXPAND(ccm_test_case_128_3, SGL_ONE_SEG)
 
 #define TEST_EXPAND(t, o)						\
 static int								\
@@ -531,6 +547,7 @@ cpu_crypto_aead_dec_test_##t##_##o(void)				\
 
 all_gcm_unit_test_cases(SGL_ONE_SEG)
 all_gcm_unit_test_cases(SGL_MAX_SEG)
+all_ccm_unit_test_cases
 #undef TEST_EXPAND
 
 static struct unit_test_suite security_cpu_crypto_aesgcm_testsuite  = {
@@ -758,8 +775,358 @@ test_security_cpu_crypto_aesni_gcm_perf(void)
 			&security_cpu_crypto_aesgcm_perf_testsuite);
 }
 
+static struct rte_security_session *
+create_blockcipher_session(struct rte_security_ctx *ctx,
+		struct rte_mempool *sess_mp,
+		uint32_t op_mask,
+		const struct blockcipher_test_data *test_data,
+		uint32_t is_unit_test)
+{
+	struct rte_security_session_conf sess_conf = {0};
+	struct rte_crypto_sym_xform xforms[2] = { {0} };
+	struct rte_crypto_sym_xform *cipher_xform = NULL;
+	struct rte_crypto_sym_xform *auth_xform = NULL;
+	struct rte_crypto_sym_xform *xform;
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_CIPHER) {
+		cipher_xform = &xforms[0];
+		cipher_xform->type = RTE_CRYPTO_SYM_XFORM_CIPHER;
+
+		if (op_mask & TOP_ENC)
+			cipher_xform->cipher.op =
+				RTE_CRYPTO_CIPHER_OP_ENCRYPT;
+		else
+			cipher_xform->cipher.op =
+				RTE_CRYPTO_CIPHER_OP_DECRYPT;
+
+		cipher_xform->cipher.algo = test_data->crypto_algo;
+		cipher_xform->cipher.key.data = test_data->cipher_key.data;
+		cipher_xform->cipher.key.length = test_data->cipher_key.len;
+		cipher_xform->cipher.iv.offset = 0;
+		cipher_xform->cipher.iv.length = test_data->iv.len;
+
+		if (is_unit_test)
+			debug_hexdump(stdout, "cipher key:",
+					test_data->cipher_key.data,
+					test_data->cipher_key.len);
+	}
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_AUTH) {
+		auth_xform = &xforms[1];
+		auth_xform->type = RTE_CRYPTO_SYM_XFORM_AUTH;
+
+		if (op_mask & TOP_AUTH_GEN)
+			auth_xform->auth.op = RTE_CRYPTO_AUTH_OP_GENERATE;
+		else
+			auth_xform->auth.op = RTE_CRYPTO_AUTH_OP_VERIFY;
+
+		auth_xform->auth.algo = test_data->auth_algo;
+		auth_xform->auth.key.length = test_data->auth_key.len;
+		auth_xform->auth.key.data = test_data->auth_key.data;
+		auth_xform->auth.digest_length = test_data->digest.len;
+
+		if (is_unit_test)
+			debug_hexdump(stdout, "auth key:",
+					test_data->auth_key.data,
+					test_data->auth_key.len);
+	}
+
+	if (op_mask == TOP_ENC ||
+			op_mask == TOP_DEC)
+		xform = cipher_xform;
+	else if (op_mask == TOP_AUTH_GEN ||
+			op_mask == TOP_AUTH_VER)
+		xform = auth_xform;
+	else if (op_mask == TOP_ENC_AUTH) {
+		xform = cipher_xform;
+		xform->next = auth_xform;
+	} else if (op_mask == TOP_AUTH_DEC) {
+		xform = auth_xform;
+		xform->next = cipher_xform;
+	} else
+		return NULL;
+
+	if (test_data->cipher_offset < test_data->auth_offset)
+		return NULL;
+
+	sess_conf.action_type = RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO;
+	sess_conf.crypto_xform = xform;
+	sess_conf.cpucrypto.cipher_offset = test_data->cipher_offset -
+			test_data->auth_offset;
+
+	return rte_security_session_create(ctx, &sess_conf, sess_mp);
+}
+
+static inline int
+assemble_blockcipher_buf(struct cpu_crypto_test_case *data,
+		struct cpu_crypto_test_obj *obj,
+		uint32_t obj_idx,
+		uint32_t op_mask,
+		const struct blockcipher_test_data *test_data,
+		uint32_t is_unit_test)
+{
+	const uint8_t *src;
+	uint32_t src_len;
+	uint32_t offset;
+
+	if (op_mask == TOP_ENC_AUTH ||
+			op_mask == TOP_AUTH_GEN ||
+			op_mask == TOP_AUTH_VER)
+		offset = test_data->auth_offset;
+	else
+		offset = test_data->cipher_offset;
+
+	if (op_mask & TOP_ENC_AUTH) {
+		src = test_data->plaintext.data;
+		src_len = test_data->plaintext.len;
+		if (is_unit_test)
+			debug_hexdump(stdout, "plaintext:", src, src_len);
+	} else {
+		src = test_data->ciphertext.data;
+		src_len = test_data->ciphertext.len;
+		memcpy(data->digest, test_data->digest.data,
+				test_data->digest.len);
+		if (is_unit_test) {
+			debug_hexdump(stdout, "ciphertext:", src, src_len);
+			debug_hexdump(stdout, "digest:", test_data->digest.data,
+					test_data->digest.len);
+		}
+	}
+
+	if (src_len > MBUF_DATAPAYLOAD_SIZE)
+		return -ENOMEM;
+
+	memcpy(data->seg_buf[0].seg, src, src_len);
+	data->seg_buf[0].seg_len = src_len;
+	obj->vec[obj_idx][0].iov_base =
+			(void *)(data->seg_buf[0].seg + offset);
+	obj->vec[obj_idx][0].iov_len = src_len - offset;
+
+	obj->sec_buf[obj_idx].vec = obj->vec[obj_idx];
+	obj->sec_buf[obj_idx].num = 1;
+
+	memcpy(data->iv, test_data->iv.data, test_data->iv.len);
+	if (is_unit_test)
+		debug_hexdump(stdout, "iv:", test_data->iv.data,
+				test_data->iv.len);
+
+	obj->iv[obj_idx] = (void *)data->iv;
+	obj->digest[obj_idx] = (void *)data->digest;
+
+	return 0;
+}
+
+static int
+check_blockcipher_result(struct cpu_crypto_test_case *tcase,
+		uint32_t op_mask,
+		const struct blockcipher_test_data *test_data)
+{
+	int ret;
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_CIPHER) {
+		const char *err_msg1, *err_msg2;
+		const uint8_t *src_pt_ct;
+		uint32_t src_len;
+
+		if (op_mask & TOP_ENC) {
+			src_pt_ct = test_data->ciphertext.data;
+			src_len = test_data->ciphertext.len;
+			err_msg1 = CPU_CRYPTO_ERR_EXP_CT;
+			err_msg2 = CPU_CRYPTO_ERR_GEN_CT;
+		} else {
+			src_pt_ct = test_data->plaintext.data;
+			src_len = test_data->plaintext.len;
+			err_msg1 = CPU_CRYPTO_ERR_EXP_PT;
+			err_msg2 = CPU_CRYPTO_ERR_GEN_PT;
+		}
+
+		ret = memcmp(tcase->seg_buf[0].seg, src_pt_ct, src_len);
+		if (ret != 0) {
+			debug_hexdump(stdout, err_msg1, src_pt_ct, src_len);
+			debug_hexdump(stdout, err_msg2,
+					tcase->seg_buf[0].seg,
+					src_len);
+			return -1;
+		}
+	}
+
+	if (op_mask & TOP_AUTH_GEN) {
+		ret = memcmp(tcase->digest, test_data->digest.data,
+				test_data->digest.len);
+		if (ret != 0) {
+			debug_hexdump(stdout, "expect digest:",
+					test_data->digest.data,
+					test_data->digest.len);
+			debug_hexdump(stdout, "gen digest:",
+					tcase->digest,
+					test_data->digest.len);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
+cpu_crypto_test_blockcipher(const struct blockcipher_test_data *tdata,
+		uint32_t op_mask)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	int ret;
+
+	ut_params->sess = create_blockcipher_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			op_mask,
+			tdata,
+			1);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(1);
+	if (ret)
+		return ret;
+
+	tcase = ut_params->test_datas[0];
+	ret = assemble_blockcipher_buf(tcase, obj, 0, op_mask, tdata, 1);
+	if (ret < 0) {
+		printf("Test is not supported by the driver\n");
+		return ret;
+	}
+
+	run_test(ts_params->ctx, ut_params->sess, obj, 1);
+
+	ret = check_status(obj, 1);
+	if (ret < 0)
+		return ret;
+
+	ret = check_blockcipher_result(tcase, op_mask, tdata);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+/* Macro to save code for defining BlockCipher test cases */
+/* test-vector-name/op */
+#define all_blockcipher_test_cases \
+	TEST_EXPAND(aes_test_data_1, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_1, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_1, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_1, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_2, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_2, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_2, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_2, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_3, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_3, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_3, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_3, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_4, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_4, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_4, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_4, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_5, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_5, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_5, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_5, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_6, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_6, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_6, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_6, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_7, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_7, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_7, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_7, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_8, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_8, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_8, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_8, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_9, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_9, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_9, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_9, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_10, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_10, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_11, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_11, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_12, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_12, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_12, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_12, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_13, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_13, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_13, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_13, TOP_AUTH_DEC) \
+	TEST_EXPAND(des_test_data_1, TOP_ENC) \
+	TEST_EXPAND(des_test_data_1, TOP_DEC) \
+	TEST_EXPAND(des_test_data_2, TOP_ENC) \
+	TEST_EXPAND(des_test_data_2, TOP_DEC) \
+	TEST_EXPAND(des_test_data_3, TOP_ENC) \
+	TEST_EXPAND(des_test_data_3, TOP_DEC) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_DEC) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_ENC_AUTH) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_AUTH_DEC) \
+	TEST_EXPAND(triple_des64cbc_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des64cbc_test_vector, TOP_DEC) \
+	TEST_EXPAND(triple_des128cbc_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des128cbc_test_vector, TOP_DEC) \
+	TEST_EXPAND(triple_des192cbc_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des192cbc_test_vector, TOP_DEC)
+
+#define TEST_EXPAND(t, o)						\
+static int								\
+cpu_crypto_blockcipher_test_##t##_##o(void)				\
+{									\
+	return cpu_crypto_test_blockcipher(&t, o);			\
+}
+
+all_blockcipher_test_cases
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesni_mb_testsuite  = {
+	.suite_name = "Security CPU Crypto AESNI-MB Unit Test Suite",
+	.setup = testsuite_setup,
+	.teardown = testsuite_teardown,
+	.unit_test_cases = {
+#define TEST_EXPAND(t, o)						\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_enc_test_##t##_##o),		\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_dec_test_##t##_##o),		\
+
+	all_gcm_unit_test_cases(SGL_ONE_SEG)
+	all_ccm_unit_test_cases
+#undef TEST_EXPAND
+
+#define TEST_EXPAND(t, o)						\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_blockcipher_test_##t##_##o),		\
+
+	all_blockcipher_test_cases
+#undef TEST_EXPAND
+
+	TEST_CASES_END() /**< NULL terminate unit test array */
+	},
+};
+
+static int
+test_security_cpu_crypto_aesni_mb(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD));
+
+	return unit_test_suite_runner(&security_cpu_crypto_aesni_mb_testsuite);
+}
+
 REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
 		test_security_cpu_crypto_aesni_gcm);
 
 REGISTER_TEST_COMMAND(security_aesni_gcm_perftest,
 		test_security_cpu_crypto_aesni_gcm_perf);
+
+REGISTER_TEST_COMMAND(security_aesni_mb_autotest,
+		test_security_cpu_crypto_aesni_mb);
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 07/10] app/test: add aesni_mb security cpu crypto perftest
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
                     ` (5 preceding siblings ...)
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 06/10] app/test: add aesni_mb security cpu crypto autotest Fan Zhang
@ 2019-09-06 13:13   ` Fan Zhang
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 08/10] ipsec: add rte_security cpu_crypto action support Fan Zhang
                     ` (4 subsequent siblings)
  11 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

Since the crypto perf application does not support rte_security, this patch
adds a simple AES-CBC-SHA1-HMAC CPU crypto performance test to the crypto
unit test application. The test covers different key and data sizes with
single-buffer test items, and reports throughput as well as cycle-count
performance information.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/test_security_cpu_crypto.c | 194 ++++++++++++++++++++++++++++++++++++
 1 file changed, 194 insertions(+)

diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
index 0ea406390..6e012672e 100644
--- a/app/test/test_security_cpu_crypto.c
+++ b/app/test/test_security_cpu_crypto.c
@@ -1122,6 +1122,197 @@ test_security_cpu_crypto_aesni_mb(void)
 	return unit_test_suite_runner(&security_cpu_crypto_aesni_mb_testsuite);
 }
 
+static inline void
+switch_blockcipher_enc_to_dec(struct blockcipher_test_data *tdata,
+		struct cpu_crypto_test_case *tcase, uint8_t *dst)
+{
+	memcpy(dst, tcase->seg_buf[0].seg, tcase->seg_buf[0].seg_len);
+	tdata->ciphertext.len = tcase->seg_buf[0].seg_len;
+	memcpy(tdata->digest.data, tcase->digest, tdata->digest.len);
+}
+
+static int
+cpu_crypto_test_blockcipher_perf(
+		const enum rte_crypto_cipher_algorithm cipher_algo,
+		uint32_t cipher_key_sz,
+		const enum rte_crypto_auth_algorithm auth_algo,
+		uint32_t auth_key_sz, uint32_t digest_sz,
+		uint32_t op_mask)
+{
+	struct blockcipher_test_data tdata = {0};
+	uint8_t plaintext[3000], ciphertext[3000];
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	uint64_t hz = rte_get_tsc_hz(), time_start, time_now;
+	double rate, cycles_per_buf;
+	uint32_t test_data_szs[] = {64, 128, 256, 512, 1024, 2048};
+	uint32_t i, j;
+	uint32_t op_mask_opp = 0;
+	int ret;
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_CIPHER)
+		op_mask_opp |= (~op_mask & BLOCKCIPHER_TEST_OP_CIPHER);
+	if (op_mask & BLOCKCIPHER_TEST_OP_AUTH)
+		op_mask_opp |= (~op_mask & BLOCKCIPHER_TEST_OP_AUTH);
+
+	tdata.plaintext.data = plaintext;
+	tdata.ciphertext.data = ciphertext;
+
+	tdata.cipher_key.len = cipher_key_sz;
+	tdata.auth_key.len = auth_key_sz;
+
+	gen_rand(tdata.cipher_key.data, cipher_key_sz / 8);
+	gen_rand(tdata.auth_key.data, auth_key_sz / 8);
+
+	tdata.crypto_algo = cipher_algo;
+	tdata.auth_algo = auth_algo;
+
+	tdata.digest.len = digest_sz;
+
+	ut_params->sess = create_blockcipher_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			op_mask,
+			&tdata,
+			0);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(MAX_NUM_OPS_INFLIGHT);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < RTE_DIM(test_data_szs); i++) {
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tdata.plaintext.len = test_data_szs[i];
+			gen_rand(plaintext, tdata.plaintext.len);
+
+			tdata.iv.len = 16;
+			gen_rand(tdata.iv.data, tdata.iv.len);
+
+			tcase = ut_params->test_datas[j];
+			ret = assemble_blockcipher_buf(tcase, obj, j,
+					op_mask,
+					&tdata,
+					0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		/* warm up cache */
+		for (j = 0; j < CACHE_WARM_ITER; j++)
+			run_test(ts_params->ctx, ut_params->sess, obj,
+					MAX_NUM_OPS_INFLIGHT);
+
+		time_start = rte_rdtsc();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_rdtsc();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("%s-%u-%s(%4uB) Enc %03.3fMpps (%03.3fGbps) ",
+			rte_crypto_cipher_algorithm_strings[cipher_algo],
+			cipher_key_sz * 8,
+			rte_crypto_auth_algorithm_strings[auth_algo],
+			test_data_szs[i],
+			rate, rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+			cycles_per_buf, cycles_per_buf / test_data_szs[i]);
+
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tcase = ut_params->test_datas[j];
+
+			switch_blockcipher_enc_to_dec(&tdata, tcase,
+					ciphertext);
+			ret = assemble_blockcipher_buf(tcase, obj, j,
+					op_mask_opp,
+					&tdata,
+					0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		time_start = rte_get_timer_cycles();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_get_timer_cycles();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("%s-%u-%s(%4uB) Dec %03.3fMpps (%03.3fGbps) ",
+			rte_crypto_cipher_algorithm_strings[cipher_algo],
+			cipher_key_sz * 8,
+			rte_crypto_auth_algorithm_strings[auth_algo],
+			test_data_szs[i],
+			rate, rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+				cycles_per_buf,
+				cycles_per_buf / test_data_szs[i]);
+	}
+
+	return 0;
+}
+
+/* cipher-algo/cipher-key-len/auth-algo/auth-key-len/digest-len/op */
+#define all_block_cipher_perf_test_cases				\
+	TEST_EXPAND(_AES_CBC, 128, _NULL, 0, 0, TOP_ENC)		\
+	TEST_EXPAND(_NULL, 0, _SHA1_HMAC, 160, 20, TOP_AUTH_GEN)	\
+	TEST_EXPAND(_AES_CBC, 128, _SHA1_HMAC, 160, 20, TOP_ENC_AUTH)
+
+#define TEST_EXPAND(a, b, c, d, e, f)					\
+static int								\
+cpu_crypto_blockcipher_perf##a##_##b##c##_##f(void)			\
+{									\
+	return cpu_crypto_test_blockcipher_perf(RTE_CRYPTO_CIPHER##a,	\
+			b / 8, RTE_CRYPTO_AUTH##c, d / 8, e, f);	\
+}									\
+
+all_block_cipher_perf_test_cases
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesni_mb_perf_testsuite  = {
+	.suite_name = "Security CPU Crypto AESNI-MB Perf Test Suite",
+	.setup = testsuite_setup,
+	.teardown = testsuite_teardown,
+	.unit_test_cases = {
+#define TEST_EXPAND(a, b, c, d, e, f)					\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+		cpu_crypto_blockcipher_perf##a##_##b##c##_##f),	\
+
+	all_block_cipher_perf_test_cases
+#undef TEST_EXPAND
+
+	TEST_CASES_END() /**< NULL terminate unit test array */
+	},
+};
+
+static int
+test_security_cpu_crypto_aesni_mb_perf(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD));
+
+	return unit_test_suite_runner(
+			&security_cpu_crypto_aesni_mb_perf_testsuite);
+}
+
+
 REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
 		test_security_cpu_crypto_aesni_gcm);
 
@@ -1130,3 +1321,6 @@ REGISTER_TEST_COMMAND(security_aesni_gcm_perftest,
 
 REGISTER_TEST_COMMAND(security_aesni_mb_autotest,
 		test_security_cpu_crypto_aesni_mb);
+
+REGISTER_TEST_COMMAND(security_aesni_mb_perftest,
+		test_security_cpu_crypto_aesni_mb_perf);
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 08/10] ipsec: add rte_security cpu_crypto action support
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
                     ` (6 preceding siblings ...)
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 07/10] app/test: add aesni_mb security cpu crypto perftest Fan Zhang
@ 2019-09-06 13:13   ` Fan Zhang
  2019-09-26 23:20     ` Ananyev, Konstantin
  2019-09-27 10:38     ` Ananyev, Konstantin
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 09/10] examples/ipsec-secgw: add security " Fan Zhang
                     ` (3 subsequent siblings)
  11 siblings, 2 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch updates the ipsec library to handle the newly introduced
RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 lib/librte_ipsec/esp_inb.c  | 174 +++++++++++++++++++++++++-
 lib/librte_ipsec/esp_outb.c | 290 +++++++++++++++++++++++++++++++++++++++++++-
 lib/librte_ipsec/sa.c       |  53 ++++++--
 lib/librte_ipsec/sa.h       |  29 +++++
 lib/librte_ipsec/ses.c      |   4 +-
 5 files changed, 539 insertions(+), 11 deletions(-)

diff --git a/lib/librte_ipsec/esp_inb.c b/lib/librte_ipsec/esp_inb.c
index 8e3ecbc64..6077dcb1e 100644
--- a/lib/librte_ipsec/esp_inb.c
+++ b/lib/librte_ipsec/esp_inb.c
@@ -105,6 +105,73 @@ inb_cop_prepare(struct rte_crypto_op *cop,
 	}
 }
 
+static inline int
+inb_sync_crypto_proc_prepare(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb,
+	const union sym_op_data *icv, uint32_t pofs, uint32_t plen,
+	struct rte_security_vec *buf, struct iovec *cur_vec,
+	void *iv, void **aad, void **digest)
+{
+	struct rte_mbuf *ms;
+	struct iovec *vec = cur_vec;
+	struct aead_gcm_iv *gcm;
+	struct aesctr_cnt_blk *ctr;
+	uint64_t *ivp;
+	uint32_t algo, left, off = 0, n_seg = 0;
+
+	ivp = rte_pktmbuf_mtod_offset(mb, uint64_t *,
+		pofs + sizeof(struct rte_esp_hdr));
+	algo = sa->algo_type;
+
+	switch (algo) {
+	case ALGO_TYPE_AES_GCM:
+		gcm = (struct aead_gcm_iv *)iv;
+		aead_gcm_iv_fill(gcm, ivp[0], sa->salt);
+		*aad = icv->va + sa->icv_len;
+		off = sa->ctp.cipher.offset + pofs;
+		break;
+	case ALGO_TYPE_AES_CBC:
+	case ALGO_TYPE_3DES_CBC:
+		off = sa->ctp.auth.offset + pofs;
+		break;
+	case ALGO_TYPE_AES_CTR:
+		off = sa->ctp.auth.offset + pofs;
+		ctr = (struct aesctr_cnt_blk *)iv;
+		aes_ctr_cnt_blk_fill(ctr, ivp[0], sa->salt);
+		break;
+	case ALGO_TYPE_NULL:
+		break;
+	}
+
+	*digest = icv->va;
+
+	left = plen - sa->ctp.cipher.length;
+
+	ms = mbuf_get_seg_ofs(mb, &off);
+	if (!ms)
+		return -1;
+
+	while (n_seg < RTE_LIBRTE_IP_FRAG_MAX_FRAG && left && ms) {
+		uint32_t len = RTE_MIN(left, ms->data_len - off);
+
+		vec->iov_base = rte_pktmbuf_mtod_offset(ms, void *, off);
+		vec->iov_len = len;
+
+		left -= len;
+		vec++;
+		n_seg++;
+		ms = ms->next;
+		off = 0;
+	}
+
+	if (left)
+		return -1;
+
+	buf->vec = cur_vec;
+	buf->num = n_seg;
+
+	return n_seg;
+}
+
 /*
  * Helper function for prepare() to deal with situation when
  * ICV is spread by two segments. Tries to move ICV completely into the
@@ -512,7 +579,6 @@ tun_process(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb[],
 	return k;
 }
 
-
 /*
  * *process* function for tunnel packets
  */
@@ -625,6 +691,112 @@ esp_inb_pkt_process(struct rte_ipsec_sa *sa, struct rte_mbuf *mb[],
 	return n;
 }
 
+/*
+ * process packets using sync crypto engine
+ */
+static uint16_t
+esp_inb_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num, uint8_t sqh_len,
+		esp_inb_process_t process)
+{
+	int32_t rc;
+	uint32_t i, k, hl, n, p;
+	struct rte_ipsec_sa *sa;
+	struct replay_sqn *rsn;
+	union sym_op_data icv;
+	uint32_t sqn[num];
+	uint32_t dr[num];
+	struct rte_security_vec buf[num];
+	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
+	uint32_t vec_idx = 0;
+	uint8_t ivs[num][IPSEC_MAX_IV_SIZE];
+	void *iv[num];
+	void *aad[num];
+	void *digest[num];
+	int status[num];
+
+	sa = ss->sa;
+	rsn = rsn_acquire(sa);
+
+	k = 0;
+	for (i = 0; i != num; i++) {
+		hl = mb[i]->l2_len + mb[i]->l3_len;
+		rc = inb_pkt_prepare(sa, rsn, mb[i], hl, &icv);
+		if (rc >= 0) {
+			iv[k] = (void *)ivs[k];
+			rc = inb_sync_crypto_proc_prepare(sa, mb[i], &icv, hl,
+					rc, &buf[k], &vec[vec_idx], iv[k],
+					&aad[k], &digest[k]);
+			if (rc < 0) {
+				dr[i - k] = i;
+				continue;
+			}
+
+			vec_idx += rc;
+			k++;
+		} else
+			dr[i - k] = i;
+	}
+
+	/* move unprepared mbufs beyond the good ones */
+	if (k != num) {
+		rte_errno = EBADMSG;
+
+		if (unlikely(k == 0))
+			return 0;
+
+		move_bad_mbufs(mb, dr, num, num - k);
+	}
+
+	/* process the packets */
+	n = 0;
+	rte_security_process_cpu_crypto_bulk(ss->security.ctx,
+			ss->security.ses, buf, iv, aad, digest, status,
+			k);
+	/* move failed process packets to dr */
+	for (i = 0; i < k; i++) {
+		if (status[i]) {
+			dr[n++] = i;
+			rte_errno = EBADMSG;
+		}
+	}
+
+	/* move bad packets to the back */
+	if (n)
+		move_bad_mbufs(mb, dr, k, n);
+
+	/* process packets */
+	p = process(sa, mb, sqn, dr, k - n, sqh_len);
+
+	if (p != k - n && p != 0)
+		move_bad_mbufs(mb, dr, k - n, k - n - p);
+
+	if (p != num)
+		rte_errno = EBADMSG;
+
+	return p;
+}
+
+uint16_t
+esp_inb_tun_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	struct rte_ipsec_sa *sa = ss->sa;
+
+	return esp_inb_sync_crypto_pkt_process(ss, mb, num, sa->sqh_len,
+			tun_process);
+}
+
+uint16_t
+esp_inb_trs_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	struct rte_ipsec_sa *sa = ss->sa;
+
+	return esp_inb_sync_crypto_pkt_process(ss, mb, num, sa->sqh_len,
+			trs_process);
+}
+
 /*
  * process group of ESP inbound tunnel packets.
  */
diff --git a/lib/librte_ipsec/esp_outb.c b/lib/librte_ipsec/esp_outb.c
index 55799a867..097cb663f 100644
--- a/lib/librte_ipsec/esp_outb.c
+++ b/lib/librte_ipsec/esp_outb.c
@@ -403,6 +403,292 @@ esp_outb_trs_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 	return k;
 }
 
+
+static inline int
+outb_sync_crypto_proc_prepare(struct rte_mbuf *m, const struct rte_ipsec_sa *sa,
+		const uint64_t ivp[IPSEC_MAX_IV_QWORD],
+		const union sym_op_data *icv, uint32_t hlen, uint32_t plen,
+		struct rte_security_vec *buf, struct iovec *cur_vec, void *iv,
+		void **aad, void **digest)
+{
+	struct rte_mbuf *ms;
+	struct aead_gcm_iv *gcm;
+	struct aesctr_cnt_blk *ctr;
+	struct iovec *vec = cur_vec;
+	uint32_t left, off = 0, n_seg = 0;
+	uint32_t algo;
+
+	algo = sa->algo_type;
+
+	switch (algo) {
+	case ALGO_TYPE_AES_GCM:
+		gcm = iv;
+		aead_gcm_iv_fill(gcm, ivp[0], sa->salt);
+		*aad = (void *)(icv->va + sa->icv_len);
+		off = sa->ctp.cipher.offset + hlen;
+		break;
+	case ALGO_TYPE_AES_CBC:
+	case ALGO_TYPE_3DES_CBC:
+		off = sa->ctp.auth.offset + hlen;
+		break;
+	case ALGO_TYPE_AES_CTR:
+		ctr = iv;
+		aes_ctr_cnt_blk_fill(ctr, ivp[0], sa->salt);
+		break;
+	case ALGO_TYPE_NULL:
+		break;
+	}
+
+	*digest = (void *)icv->va;
+
+	left = sa->ctp.cipher.length + plen;
+
+	ms = mbuf_get_seg_ofs(m, &off);
+	if (!ms)
+		return -1;
+
+	while (n_seg < RTE_LIBRTE_IP_FRAG_MAX_FRAG && left && ms) {
+		uint32_t len = RTE_MIN(left, ms->data_len - off);
+
+		vec->iov_base = rte_pktmbuf_mtod_offset(ms, void *, off);
+		vec->iov_len = len;
+
+		left -= len;
+		vec++;
+		n_seg++;
+		ms = ms->next;
+		off = 0;
+	}
+
+	if (left)
+		return -1;
+
+	buf->vec = cur_vec;
+	buf->num = n_seg;
+
+	return n_seg;
+}
+
+/**
+ * Local post-process function prototype; identical to the process()
+ * prototype of rte_ipsec_sa_pkt_func.
+ */
+typedef uint16_t (*sync_crypto_post_process)(const struct rte_ipsec_session *ss,
+				struct rte_mbuf *mb[],
+				uint16_t num);
+static uint16_t
+esp_outb_tun_sync_crypto_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num,
+		sync_crypto_post_process post_process)
+{
+	uint64_t sqn;
+	rte_be64_t sqc;
+	struct rte_ipsec_sa *sa;
+	struct rte_security_ctx *ctx;
+	struct rte_security_session *rss;
+	union sym_op_data icv;
+	struct rte_security_vec buf[num];
+	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
+	uint32_t vec_idx = 0;
+	void *aad[num];
+	void *digest[num];
+	void *iv[num];
+	uint8_t ivs[num][IPSEC_MAX_IV_SIZE];
+	uint64_t ivp[IPSEC_MAX_IV_QWORD];
+	int status[num];
+	uint32_t dr[num];
+	uint32_t i, n, k;
+	int32_t rc;
+
+	sa = ss->sa;
+	ctx = ss->security.ctx;
+	rss = ss->security.ses;
+
+	k = 0;
+	n = num;
+	sqn = esn_outb_update_sqn(sa, &n);
+	if (n != num)
+		rte_errno = EOVERFLOW;
+
+	for (i = 0; i != n; i++) {
+		sqc = rte_cpu_to_be_64(sqn + i);
+		gen_iv(ivp, sqc);
+
+		/* try to update the packet itself */
+		rc = outb_tun_pkt_prepare(sa, sqc, ivp, mb[i], &icv,
+				sa->sqh_len);
+
+		/* success, setup crypto op */
+		if (rc >= 0) {
+			outb_pkt_xprepare(sa, sqc, &icv);
+
+			iv[k] = (void *)ivs[k];
+			rc = outb_sync_crypto_proc_prepare(mb[i], sa, ivp, &icv,
+					0, rc, &buf[k], &vec[vec_idx], iv[k],
+					&aad[k], &digest[k]);
+			if (rc < 0) {
+				dr[i - k] = i;
+				rte_errno = -rc;
+				continue;
+			}
+
+			vec_idx += rc;
+			k++;
+		/* failure, put packet into the death-row */
+		} else {
+			dr[i - k] = i;
+			rte_errno = -rc;
+		}
+	}
+
+	/* move unprepared mbufs beyond the good ones */
+	if (k != n && k != 0)
+		move_bad_mbufs(mb, dr, n, n - k);
+
+	if (unlikely(k == 0)) {
+		rte_errno = EBADMSG;
+		return 0;
+	}
+
+	/* process the packets */
+	n = 0;
+	rte_security_process_cpu_crypto_bulk(ctx, rss, buf, iv, aad, digest,
+			status, k);
+	/* move failed packets to dr */
+	for (i = 0; i < k; i++) {
+		if (status[i])
+			dr[n++] = i;
+	}
+
+	if (n)
+		move_bad_mbufs(mb, dr, k, n);
+
+	return post_process(ss, mb, k - n);
+}
+
+static uint16_t
+esp_outb_trs_sync_crypto_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num,
+		sync_crypto_post_process post_process)
+
+{
+	uint64_t sqn;
+	rte_be64_t sqc;
+	struct rte_ipsec_sa *sa;
+	struct rte_security_ctx *ctx;
+	struct rte_security_session *rss;
+	union sym_op_data icv;
+	struct rte_security_vec buf[num];
+	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
+	uint32_t vec_idx = 0;
+	void *aad[num];
+	void *digest[num];
+	uint8_t ivs[num][IPSEC_MAX_IV_SIZE];
+	void *iv[num];
+	int status[num];
+	uint64_t ivp[IPSEC_MAX_IV_QWORD];
+	uint32_t dr[num];
+	uint32_t i, n, k;
+	uint32_t l2, l3;
+	int32_t rc;
+
+	sa = ss->sa;
+	ctx = ss->security.ctx;
+	rss = ss->security.ses;
+
+	k = 0;
+	n = num;
+	sqn = esn_outb_update_sqn(sa, &n);
+	if (n != num)
+		rte_errno = EOVERFLOW;
+
+	for (i = 0; i != n; i++) {
+		l2 = mb[i]->l2_len;
+		l3 = mb[i]->l3_len;
+
+		sqc = rte_cpu_to_be_64(sqn + i);
+		gen_iv(ivp, sqc);
+
+		/* try to update the packet itself */
+		rc = outb_trs_pkt_prepare(sa, sqc, ivp, mb[i], l2, l3, &icv,
+				sa->sqh_len);
+
+		/* success, setup crypto op */
+		if (rc >= 0) {
+			outb_pkt_xprepare(sa, sqc, &icv);
+
+			iv[k] = (void *)ivs[k];
+
+			rc = outb_sync_crypto_proc_prepare(mb[i], sa, ivp, &icv,
+					l2 + l3, rc, &buf[k], &vec[vec_idx],
+					iv[k], &aad[k], &digest[k]);
+			if (rc < 0) {
+				dr[i - k] = i;
+				rte_errno = -rc;
+				continue;
+			}
+
+			vec_idx += rc;
+			k++;
+		/* failure, put packet into the death-row */
+		} else {
+			dr[i - k] = i;
+			rte_errno = -rc;
+		}
+	}
+
+	 /* copy not prepared mbufs beyond good ones */
+	if (k != n && k != 0)
+		move_bad_mbufs(mb, dr, n, n - k);
+
+	/* process the packets */
+	n = 0;
+	rte_security_process_cpu_crypto_bulk(ctx, rss, buf, iv, aad, digest,
+			status, k);
+	/* move failed packets to dr */
+	for (i = 0; i < k; i++) {
+		if (status[i])
+			dr[n++] = i;
+	}
+
+	if (n)
+		move_bad_mbufs(mb, dr, k, n);
+
+	return post_process(ss, mb, k - n);
+}
+
+uint16_t
+esp_outb_tun_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_tun_sync_crypto_process(ss, mb, num,
+			esp_outb_sqh_process);
+}
+
+uint16_t
+esp_outb_tun_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_tun_sync_crypto_process(ss, mb, num,
+			esp_outb_pkt_flag_process);
+}
+
+uint16_t
+esp_outb_trs_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_trs_sync_crypto_process(ss, mb, num,
+			esp_outb_sqh_process);
+}
+
+uint16_t
+esp_outb_trs_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_trs_sync_crypto_process(ss, mb, num,
+			esp_outb_pkt_flag_process);
+}
+
 /*
  * process outbound packets for SA with ESN support,
  * for algorithms that require SQN.hibits to be implictly included
@@ -410,8 +696,8 @@ esp_outb_trs_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
  * In that case we have to move ICV bytes back to their proper place.
  */
 uint16_t
-esp_outb_sqh_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
-	uint16_t num)
+esp_outb_sqh_process(const struct rte_ipsec_session *ss,
+	struct rte_mbuf *mb[], uint16_t num)
 {
 	uint32_t i, k, icv_len, *icv;
 	struct rte_mbuf *ml;
diff --git a/lib/librte_ipsec/sa.c b/lib/librte_ipsec/sa.c
index 23d394b46..31ffbce2c 100644
--- a/lib/librte_ipsec/sa.c
+++ b/lib/librte_ipsec/sa.c
@@ -544,9 +544,9 @@ lksd_proto_prepare(const struct rte_ipsec_session *ss,
  * - inbound/outbound for RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
  * - outbound for RTE_SECURITY_ACTION_TYPE_NONE when ESN is disabled
  */
-static uint16_t
-pkt_flag_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
-	uint16_t num)
+uint16_t
+esp_outb_pkt_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
 {
 	uint32_t i, k;
 	uint32_t dr[num];
@@ -599,12 +599,48 @@ lksd_none_pkt_func_select(const struct rte_ipsec_sa *sa,
 	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV6):
 		pf->prepare = esp_outb_tun_prepare;
 		pf->process = (sa->sqh_len != 0) ?
-			esp_outb_sqh_process : pkt_flag_process;
+			esp_outb_sqh_process : esp_outb_pkt_flag_process;
 		break;
 	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TRANS):
 		pf->prepare = esp_outb_trs_prepare;
 		pf->process = (sa->sqh_len != 0) ?
-			esp_outb_sqh_process : pkt_flag_process;
+			esp_outb_sqh_process : esp_outb_pkt_flag_process;
+		break;
+	default:
+		rc = -ENOTSUP;
+	}
+
+	return rc;
+}
+
+static int
+lksd_sync_crypto_pkt_func_select(const struct rte_ipsec_sa *sa,
+		struct rte_ipsec_sa_pkt_func *pf)
+{
+	int32_t rc;
+
+	static const uint64_t msk = RTE_IPSEC_SATP_DIR_MASK |
+			RTE_IPSEC_SATP_MODE_MASK;
+
+	rc = 0;
+	switch (sa->type & msk) {
+	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TUNLV4):
+	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TUNLV6):
+		pf->process = esp_inb_tun_sync_crypto_pkt_process;
+		break;
+	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TRANS):
+		pf->process = esp_inb_trs_sync_crypto_pkt_process;
+		break;
+	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV4):
+	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV6):
+		pf->process = (sa->sqh_len != 0) ?
+			esp_outb_tun_sync_crpyto_sqh_process :
+			esp_outb_tun_sync_crpyto_flag_process;
+		break;
+	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TRANS):
+		pf->process = (sa->sqh_len != 0) ?
+			esp_outb_trs_sync_crpyto_sqh_process :
+			esp_outb_trs_sync_crpyto_flag_process;
 		break;
 	default:
 		rc = -ENOTSUP;
@@ -672,13 +708,16 @@ ipsec_sa_pkt_func_select(const struct rte_ipsec_session *ss,
 	case RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL:
 		if ((sa->type & RTE_IPSEC_SATP_DIR_MASK) ==
 				RTE_IPSEC_SATP_DIR_IB)
-			pf->process = pkt_flag_process;
+			pf->process = esp_outb_pkt_flag_process;
 		else
 			pf->process = inline_proto_outb_pkt_process;
 		break;
 	case RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL:
 		pf->prepare = lksd_proto_prepare;
-		pf->process = pkt_flag_process;
+		pf->process = esp_outb_pkt_flag_process;
+		break;
+	case RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
+		rc = lksd_sync_crypto_pkt_func_select(sa, pf);
 		break;
 	default:
 		rc = -ENOTSUP;
diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
index 51e69ad05..02c7abc60 100644
--- a/lib/librte_ipsec/sa.h
+++ b/lib/librte_ipsec/sa.h
@@ -156,6 +156,14 @@ uint16_t
 inline_inb_trs_pkt_process(const struct rte_ipsec_session *ss,
 	struct rte_mbuf *mb[], uint16_t num);
 
+uint16_t
+esp_inb_tun_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_inb_trs_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
 /* outbound processing */
 
 uint16_t
@@ -170,6 +178,10 @@ uint16_t
 esp_outb_sqh_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 	uint16_t num);
 
+uint16_t
+esp_outb_pkt_flag_process(const struct rte_ipsec_session *ss,
+	struct rte_mbuf *mb[], uint16_t num);
+
 uint16_t
 inline_outb_tun_pkt_process(const struct rte_ipsec_session *ss,
 	struct rte_mbuf *mb[], uint16_t num);
@@ -182,4 +194,21 @@ uint16_t
 inline_proto_outb_pkt_process(const struct rte_ipsec_session *ss,
 	struct rte_mbuf *mb[], uint16_t num);
 
+uint16_t
+esp_outb_tun_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_outb_tun_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_outb_trs_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_outb_trs_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+
 #endif /* _SA_H_ */
diff --git a/lib/librte_ipsec/ses.c b/lib/librte_ipsec/ses.c
index 82c765a33..eaa8c17b7 100644
--- a/lib/librte_ipsec/ses.c
+++ b/lib/librte_ipsec/ses.c
@@ -19,7 +19,9 @@ session_check(struct rte_ipsec_session *ss)
 			return -EINVAL;
 		if ((ss->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO ||
 				ss->type ==
-				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) &&
+				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL ||
+				ss->type ==
+				RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) &&
 				ss->security.ctx == NULL)
 			return -EINVAL;
 	}
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 09/10] examples/ipsec-secgw: add security cpu_crypto action support
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
                     ` (7 preceding siblings ...)
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 08/10] ipsec: add rte_security cpu_crypto action support Fan Zhang
@ 2019-09-06 13:13   ` " Fan Zhang
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 10/10] doc: update security cpu process description Fan Zhang
                     ` (2 subsequent siblings)
  11 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

Since the ipsec library now supports the cpu_crypto security action type,
this patch updates the ipsec-secgw sample application with the new action
type "cpu-crypto". The patch also adds a number of test scripts to verify
the correctness of the implementation.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 examples/ipsec-secgw/ipsec.c                       | 22 ++++++++++++++++++++++
 examples/ipsec-secgw/ipsec_process.c               |  7 ++++---
 examples/ipsec-secgw/sa.c                          | 13 +++++++++++--
 examples/ipsec-secgw/test/run_test.sh              | 10 ++++++++++
 .../test/trs_3descbc_sha1_cpu_crypto_defs.sh       |  5 +++++
 .../test/trs_aescbc_sha1_cpu_crypto_defs.sh        |  5 +++++
 .../test/trs_aesctr_sha1_cpu_crypto_defs.sh        |  5 +++++
 .../ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh |  5 +++++
 .../test/trs_aesgcm_mb_cpu_crypto_defs.sh          |  7 +++++++
 .../test/tun_3descbc_sha1_cpu_crypto_defs.sh       |  5 +++++
 .../test/tun_aescbc_sha1_cpu_crypto_defs.sh        |  5 +++++
 .../test/tun_aesctr_sha1_cpu_crypto_defs.sh        |  5 +++++
 .../ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh |  5 +++++
 .../test/tun_aesgcm_mb_cpu_crypto_defs.sh          |  7 +++++++
 14 files changed, 101 insertions(+), 5 deletions(-)
 create mode 100644 examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh

diff --git a/examples/ipsec-secgw/ipsec.c b/examples/ipsec-secgw/ipsec.c
index dc85adfe5..4c39a7de6 100644
--- a/examples/ipsec-secgw/ipsec.c
+++ b/examples/ipsec-secgw/ipsec.c
@@ -10,6 +10,7 @@
 #include <rte_crypto.h>
 #include <rte_security.h>
 #include <rte_cryptodev.h>
+#include <rte_ipsec.h>
 #include <rte_ethdev.h>
 #include <rte_mbuf.h>
 #include <rte_hash.h>
@@ -105,6 +106,26 @@ create_lookaside_session(struct ipsec_ctx *ipsec_ctx, struct ipsec_sa *sa)
 				"SEC Session init failed: err: %d\n", ret);
 				return -1;
 			}
+		} else if (sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
+			struct rte_security_ctx *ctx =
+				(struct rte_security_ctx *)
+				rte_cryptodev_get_sec_ctx(
+					ipsec_ctx->tbl[cdev_id_qp].id);
+			int32_t offset = sizeof(struct rte_esp_hdr) +
+					sa->iv_len;
+
+			/* Set IPsec parameters in conf */
+			sess_conf.cpucrypto.cipher_offset = offset;
+
+			set_ipsec_conf(sa, &(sess_conf.ipsec));
+			sa->security_ctx = ctx;
+			sa->sec_session = rte_security_session_create(ctx,
+				&sess_conf, ipsec_ctx->session_priv_pool);
+			if (sa->sec_session == NULL) {
+				RTE_LOG(ERR, IPSEC,
+				"SEC Session init failed: err: %d\n", ret);
+				return -1;
+			}
 		} else {
 			RTE_LOG(ERR, IPSEC, "Inline not supported\n");
 			return -1;
@@ -473,6 +494,7 @@ ipsec_enqueue(ipsec_xform_fn xform_func, struct ipsec_ctx *ipsec_ctx,
 						sa->sec_session, pkts[i], NULL);
 			continue;
 		case RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO:
+		case RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
 			RTE_ASSERT(sa->sec_session != NULL);
 			priv->cop.type = RTE_CRYPTO_OP_TYPE_SYMMETRIC;
 			priv->cop.status = RTE_CRYPTO_OP_STATUS_NOT_PROCESSED;
diff --git a/examples/ipsec-secgw/ipsec_process.c b/examples/ipsec-secgw/ipsec_process.c
index 868f1a28d..1932b631f 100644
--- a/examples/ipsec-secgw/ipsec_process.c
+++ b/examples/ipsec-secgw/ipsec_process.c
@@ -101,7 +101,8 @@ fill_ipsec_session(struct rte_ipsec_session *ss, struct ipsec_ctx *ctx,
 		}
 		ss->crypto.ses = sa->crypto_session;
 	/* setup session action type */
-	} else if (sa->type == RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL) {
+	} else if (sa->type == RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL ||
+			sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
 		if (sa->sec_session == NULL) {
 			rc = create_lookaside_session(ctx, sa);
 			if (rc != 0)
@@ -227,8 +228,8 @@ ipsec_process(struct ipsec_ctx *ctx, struct ipsec_traffic *trf)
 
 		/* process packets inline */
 		else if (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO ||
-				sa->type ==
-				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) {
+			sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL ||
+			sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
 
 			satp = rte_ipsec_sa_type(ips->sa);
 
diff --git a/examples/ipsec-secgw/sa.c b/examples/ipsec-secgw/sa.c
index c3cf3bd1f..ba773346f 100644
--- a/examples/ipsec-secgw/sa.c
+++ b/examples/ipsec-secgw/sa.c
@@ -570,6 +570,9 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
 				RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL;
 			else if (strcmp(tokens[ti], "no-offload") == 0)
 				rule->type = RTE_SECURITY_ACTION_TYPE_NONE;
+			else if (strcmp(tokens[ti], "cpu-crypto") == 0)
+				rule->type =
+					RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO;
 			else {
 				APP_CHECK(0, status, "Invalid input \"%s\"",
 						tokens[ti]);
@@ -624,10 +627,13 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
 	if (status->status < 0)
 		return;
 
-	if ((rule->type != RTE_SECURITY_ACTION_TYPE_NONE) && (portid_p == 0))
+	if ((rule->type != RTE_SECURITY_ACTION_TYPE_NONE && rule->type !=
+			RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) &&
+			(portid_p == 0))
 		printf("Missing portid option, falling back to non-offload\n");
 
-	if (!type_p || !portid_p) {
+	if (!type_p || (!portid_p && rule->type !=
+			RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO)) {
 		rule->type = RTE_SECURITY_ACTION_TYPE_NONE;
 		rule->portid = -1;
 	}
@@ -709,6 +715,9 @@ print_one_sa_rule(const struct ipsec_sa *sa, int inbound)
 	case RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL:
 		printf("lookaside-protocol-offload ");
 		break;
+	case RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
+		printf("cpu-crypto-accelerated");
+		break;
 	}
 	printf("\n");
 }
diff --git a/examples/ipsec-secgw/test/run_test.sh b/examples/ipsec-secgw/test/run_test.sh
index 8055a4c04..f322aa785 100755
--- a/examples/ipsec-secgw/test/run_test.sh
+++ b/examples/ipsec-secgw/test/run_test.sh
@@ -32,15 +32,21 @@ usage()
 }
 
 LINUX_TEST="tun_aescbc_sha1 \
+tun_aescbc_sha1_cpu_crypto \
 tun_aescbc_sha1_esn \
 tun_aescbc_sha1_esn_atom \
 tun_aesgcm \
+tun_aesgcm_cpu_crypto \
+tun_aesgcm_mb_cpu_crypto \
 tun_aesgcm_esn \
 tun_aesgcm_esn_atom \
 trs_aescbc_sha1 \
+trs_aescbc_sha1_cpu_crypto \
 trs_aescbc_sha1_esn \
 trs_aescbc_sha1_esn_atom \
 trs_aesgcm \
+trs_aesgcm_cpu_crypto \
+trs_aesgcm_mb_cpu_crypto \
 trs_aesgcm_esn \
 trs_aesgcm_esn_atom \
 tun_aescbc_sha1_old \
@@ -49,17 +55,21 @@ trs_aescbc_sha1_old \
 trs_aesgcm_old \
 tun_aesctr_sha1 \
 tun_aesctr_sha1_old \
tun_aesctr_sha1_cpu_crypto \
 tun_aesctr_sha1_esn \
 tun_aesctr_sha1_esn_atom \
 trs_aesctr_sha1 \
+trs_aesctr_sha1_cpu_crypto \
 trs_aesctr_sha1_old \
 trs_aesctr_sha1_esn \
 trs_aesctr_sha1_esn_atom \
 tun_3descbc_sha1 \
+tun_3descbc_sha1_cpu_crypto \
 tun_3descbc_sha1_old \
 tun_3descbc_sha1_esn \
 tun_3descbc_sha1_esn_atom \
 trs_3descbc_sha1 \
+trs_3descbc_sha1_cpu_crypto \
 trs_3descbc_sha1_old \
 trs_3descbc_sha1_esn \
 trs_3descbc_sha1_esn_atom"
diff --git a/examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..a864a8886
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_3descbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..b515cd9f8
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_aescbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..745a2a02b
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_aesctr_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
new file mode 100644
index 000000000..8917122da
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_aesgcm_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
new file mode 100644
index 000000000..26943321f
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
@@ -0,0 +1,7 @@
+#! /bin/bash
+
+. ${DIR}/trs_aesgcm_defs.sh
+
+CRYPTO_DEV=${CRYPTO_DEV:-'--vdev="crypto_aesni_mb0"'}
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..747141f62
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_3descbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..56076fa50
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_aescbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..3af680533
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_aesctr_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
new file mode 100644
index 000000000..5bf1c0ae5
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_aesgcm_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh
new file mode 100644
index 000000000..039b8095e
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh
@@ -0,0 +1,7 @@
+#! /bin/bash
+
+. ${DIR}/tun_aesgcm_defs.sh
+
+CRYPTO_DEV=${CRYPTO_DEV:-'--vdev="crypto_aesni_mb0"'}
+
+SGW_CFG_XPRM='type cpu-crypto'
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH 10/10] doc: update security cpu process description
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
                     ` (8 preceding siblings ...)
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 09/10] examples/ipsec-secgw: add security " Fan Zhang
@ 2019-09-06 13:13   ` Fan Zhang
  2019-09-09 12:43   ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Aaron Conole
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
  11 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-09-06 13:13 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch updates programmer's guide and release note for
newly added security cpu process description.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 doc/guides/cryptodevs/aesni_gcm.rst    |   6 ++
 doc/guides/cryptodevs/aesni_mb.rst     |   7 +++
 doc/guides/prog_guide/rte_security.rst | 112 ++++++++++++++++++++++++++++++++-
 doc/guides/rel_notes/release_19_11.rst |   7 +++
 4 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/doc/guides/cryptodevs/aesni_gcm.rst b/doc/guides/cryptodevs/aesni_gcm.rst
index 9a8bc9323..31297fabd 100644
--- a/doc/guides/cryptodevs/aesni_gcm.rst
+++ b/doc/guides/cryptodevs/aesni_gcm.rst
@@ -9,6 +9,12 @@ The AES-NI GCM PMD (**librte_pmd_aesni_gcm**) provides poll mode crypto driver
 support for utilizing Intel multi buffer library (see AES-NI Multi-buffer PMD documentation
 to learn more about it, including installation).
 
+The AES-NI GCM PMD also supports rte_security: a security session can be
+created and used with ``rte_security_process_cpu_crypto_bulk`` to process
+symmetric crypto synchronously with all algorithms specified below. In this
+mode it supports scatter-gather buffers (``rte_security_vec.num`` can be
+greater than ``1``). Please refer to the ``rte_security`` programmer's guide.
+
 Features
 --------
 
diff --git a/doc/guides/cryptodevs/aesni_mb.rst b/doc/guides/cryptodevs/aesni_mb.rst
index 1eff2b073..1a3ddd850 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -12,6 +12,13 @@ support for utilizing Intel multi buffer library, see the white paper
 
 The AES-NI MB PMD has current only been tested on Fedora 21 64-bit with gcc.
 
+The AES-NI MB PMD also supports rte_security: a security session can be
+created and used with ``rte_security_process_cpu_crypto_bulk`` to process
+symmetric crypto synchronously with all algorithms specified below. However,
+it does not support scatter-gather buffers, so the ``num`` value in
+``rte_security_vec`` can only be ``1``. Please refer to the ``rte_security``
+programmer's guide for more details.
+
 Features
 --------
 
diff --git a/doc/guides/prog_guide/rte_security.rst b/doc/guides/prog_guide/rte_security.rst
index 7d0734a37..861619202 100644
--- a/doc/guides/prog_guide/rte_security.rst
+++ b/doc/guides/prog_guide/rte_security.rst
@@ -296,6 +296,56 @@ Just like IPsec, in case of PDCP also header addition/deletion, cipher/
 de-cipher, integrity protection/verification is done based on the action
 type chosen.
 
+
+Synchronous CPU Crypto
+~~~~~~~~~~~~~~~~~~~~~~
+
+RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
+This action type allows a burst of symmetric crypto workload that uses the
+same algorithm, key, and direction to be processed synchronously by CPU cycles.
+
+The packets are passed to the crypto device for symmetric crypto
+processing. The device will encrypt or decrypt the buffers based on the key(s)
+and algorithm(s) specified and preprocessed in the security session. Unlike
+the inline or lookaside modes, when the function exits, the user can
+expect the buffers to be either processed successfully, or to have the error
+number assigned to the appropriate index of the status array.
+
+E.g. in case of IPsec, the application will use CPU cycles to process both
+stack and crypto workload synchronously.
+
+.. code-block:: console
+
+         Egress Data Path
+                 |
+        +--------|--------+
+        |  egress IPsec   |
+        |        |        |
+        | +------V------+ |
+        | | SADB lookup | |
+        | +------|------+ |
+        | +------V------+ |
+        | |   Desc      | |
+        | +------|------+ |
+        +--------V--------+
+                 |
+        +--------V--------+
+        |    L2 Stack     |
+        +-----------------+
+        |                 |
+        |   Synchronous   |   <------ Using CPU instructions
+        |  Crypto Process |
+        |                 |
+        +--------V--------+
+        |  L2 Stack Post  |   <------ Add tunnel, ESP header etc.
+        +--------|--------+
+                 |
+        +--------|--------+
+        |       NIC       |
+        +--------|--------+
+                 V
+
+
 Device Features and Capabilities
 ---------------------------------
 
@@ -491,6 +541,7 @@ Security Session configuration structure is defined as ``rte_security_session_co
                 struct rte_security_ipsec_xform ipsec;
                 struct rte_security_macsec_xform macsec;
                 struct rte_security_pdcp_xform pdcp;
+                struct rte_security_cpu_crypto_xform cpu_crypto;
         };
         /**< Configuration parameters for security session */
         struct rte_crypto_sym_xform *crypto_xform;
@@ -515,9 +566,12 @@ Offload.
         RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL,
         /**< All security protocol processing is performed inline during
          * transmission */
-        RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
+        RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
         /**< All security protocol processing including crypto is performed
          * on a lookaside accelerator */
+        RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
+        /**< Crypto processing for security protocol is performed by CPU
+         * synchronously */
     };
 
 The ``rte_security_session_protocol`` is defined as
@@ -587,6 +641,10 @@ PDCP related configuration parameters are defined in ``rte_security_pdcp_xform``
         uint32_t hfn_threshold;
     };
 
+For the CPU Crypto processing action, the application should attach an
+initialized ``crypto_xform`` to the security session configuration to specify
+the algorithm, key, direction, and other fields required for the crypto operation.
+
 
 Security API
 ~~~~~~~~~~~~
@@ -650,3 +708,55 @@ it is only valid to have a single flow to map to that security session.
         +-------+            +--------+    +-----+
         |  Eth  | ->  ... -> |   ESP  | -> | END |
         +-------+            +--------+    +-----+
+
+
+Process bulk crypto workload using CPU instructions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The inline and lookaside modes depend on external HW to complete the
+workload; as another option the user can use rte_security to process
+symmetric crypto synchronously with CPU instructions.
+
+When creating the security session the user needs to fill the
+``rte_security_session_conf`` parameter with the ``action_type`` field set to
+``RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO``, and point ``crypto_xform`` to a
+properly initialized cryptodev xform. The user then passes the
+``rte_security_session_conf`` instance to ``rte_security_session_create()``
+along with the security context pointer that belongs to a certain SW crypto
+device. The crypto device may or may not support this action type or the
+algorithm / key sizes specified in ``crypto_xform``; when everything is
+supported the function will return the created security session.
+
+The user then can use this session to process the crypto workload synchronously.
+Instead of using mbuf ``next`` pointers, synchronous CPU crypto processing uses
+a special structure ``rte_security_vec`` to describe scatter-gather buffers.
+
+.. code-block:: c
+
+    struct rte_security_vec {
+        struct iovec *vec;
+        uint32_t num;
+    };
+
+The structure ``rte_security_vec`` stores scatter-gather buffer
+pointers, where ``vec`` points to an array of buffer segments and ``num``
+indicates the number of segments.
+
+Please note that not all crypto devices support scatter-gather buffer
+processing; please check the ``cryptodev`` guide for more details.
+
+The API of the synchronous CPU crypto process is
+
+.. code-block:: c
+
+    void
+    rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
+            struct rte_security_session *sess,
+            struct rte_security_vec buf[], void *iv[], void *aad[],
+            void *digest[], int status[], uint32_t num);
+
+This function will process ``num`` ``rte_security_vec`` buffers using
+the content stored in the ``iv`` and ``aad`` arrays. The API only supports
+in-place operation, so each ``buf`` will be overwritten with the encrypted or
+decrypted data when processed successfully. Otherwise the error number is
+written to the corresponding index of the ``status`` array.
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index 8490d897c..6cd21704f 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -56,6 +56,13 @@ New Features
      Also, make sure to start the actual text at the margin.
      =========================================================
 
+* **Added synchronous CPU crypto burst API to rte_security.**
+
+  A new API ``rte_security_process_cpu_crypto_bulk`` is introduced in the
+  security library to process crypto workloads in bulk using CPU instructions.
+  The AESNI_MB and AESNI_GCM PMDs, as well as the unit tests and the
+  ipsec-secgw sample application, are updated to support this feature.
+
 
 Removed Items
 -------------
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-06  9:01       ` Akhil Goyal
  2019-09-06 13:12         ` Zhang, Roy Fan
@ 2019-09-06 13:27         ` Ananyev, Konstantin
  2019-09-10 10:44           ` Akhil Goyal
  1 sibling, 1 reply; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-06 13:27 UTC (permalink / raw)
  To: Akhil Goyal, dev; +Cc: Zhang, Roy Fan, Doherty, Declan, De Lara Guarch, Pablo

Hi Akhil,

> > This action type allows the burst of symmetric crypto workload using the same
> > algorithm, key, and direction being processed by CPU cycles synchronously.
> > This flexible action type does not require external hardware involvement,
> > having the crypto workload processed synchronously, and is more performant
> > than Cryptodev SW PMD due to the saved cycles on removed "async mode
> > simulation" as well as 3 cacheline access of the crypto ops.
> 
> Does that mean application will not call the cryptodev_enqueue_burst and corresponding dequeue burst.

Yes, instead it just calls rte_security_process_cpu_crypto_bulk(...)

> It would be a new API something like process_packets and it will have the crypto processed packets while returning from the API?

Yes, though the plan is that API will operate on raw data buffers, not mbufs.

> 
> I still do not understand why we cannot do with the conventional crypto lib only.
> As far as I can understand, you are not doing any protocol processing or any value add
> To the crypto processing. IMO, you just need a synchronous crypto processing API which
> Can be defined in cryptodev, you don't need to re-create a crypto session in the name of
> Security session in the driver just to do a synchronous processing.

I suppose your question is why not to have rte_crypto_process_cpu_crypto_bulk(...) instead?
The main reason is that it would require disruptive changes in the existing cryptodev API
(it would cause ABI/API breakage).
A session for RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO needs some extra information
that the normal crypto_sym_xform doesn't contain
(the cipher offset from the start of the buffer; there might be something extra in future).
Also, right now there is no way to add a new type of crypto_sym_session without
either breaking the existing crypto-dev ABI/API or introducing a new structure
(rte_crypto_sym_cpu_session or so) for that.
rte_security, on the other hand, is designed in a way that lets us add new session types and
related parameters without causing API/ABI breakage.

BTW, what is your concern with the proposed approach (via rte_security)?
From my perspective it is a lightweight change, and it is totally optional
for the crypto PMDs to support it or not.
Konstantin 
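
The intended semantics - one synchronous call that fills a per-index status array, with no
enqueue/dequeue pair and no crypto-op allocation - can be mocked in a few lines. Note that
process_cpu_crypto_bulk() below is an illustrative stand-in, not the real
rte_security_process_cpu_crypto_bulk() signature:

```c
#include <stddef.h>
#include <stdint.h>

/* Mock of the proposed synchronous bulk-processing semantics: a single
 * call processes every buffer in place and records a per-index result
 * in the status array. The XOR stands in for the actual cipher. */
static void
process_cpu_crypto_bulk(uint8_t *bufs[], size_t lens[], int status[],
			uint32_t num)
{
	for (uint32_t i = 0; i < num; i++) {
		if (bufs[i] == NULL || lens[i] == 0) {
			status[i] = -1;	/* error recorded at this index */
			continue;
		}
		for (size_t j = 0; j < lens[i]; j++)
			bufs[i][j] ^= 0xff; /* placeholder "cipher" */
		status[i] = 0;		/* processed successfully */
	}
}
```

When the function returns, every buffer is either processed or flagged through its status slot,
which is the behavior described for the new API.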

> >
> > AESNI-GCM and AESNI-MB PMDs are updated with this support. There is a small
> > performance test app under app/test/security_aesni_gcm(mb)_perftest to
> > prove.
> >
> > For the new API
> > The packet is sent to the crypto device for symmetric crypto
> > processing. The device will encrypt or decrypt the buffer based on the session
> > data specified and preprocessed in the security session. Different
> > than the inline or lookaside modes, when the function exits, the user will
> > expect the buffers are either processed successfully, or having the error number
> > assigned to the appropriate index of the status array.
> >
> > Will update the program's guide in the v1 patch.
> >
> > Regards,
> > Fan
> >
> > > -----Original Message-----
> > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty, Declan
> > > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > <pablo.de.lara.guarch@intel.com>
> > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action type and
> > > API
> > >
> > > Hi Fan,
> > >
> > > >
> > > > This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > action
> > > > type to security library. The type represents performing crypto
> > > > operation with CPU cycles. The patch also includes a new API to
> > > > process crypto operations in bulk and the function pointers for PMDs.
> > > >
> > > I am not able to get the flow of execution for this action type. Could you
> > > please elaborate the flow in the documentation. If not in documentation
> > > right now, then please elaborate the flow in cover letter.
> > > Also I see that there are new APIs for processing crypto operations in bulk.
> > > What does that mean. How are they different from the existing APIs which
> > > are also handling bulk crypto ops depending on the budget.
> > >
> > >
> > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
                     ` (9 preceding siblings ...)
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 10/10] doc: update security cpu process description Fan Zhang
@ 2019-09-09 12:43   ` Aaron Conole
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
  11 siblings, 0 replies; 84+ messages in thread
From: Aaron Conole @ 2019-09-09 12:43 UTC (permalink / raw)
  To: Fan Zhang; +Cc: dev, konstantin.ananyev, declan.doherty, akhil.goyal

Fan Zhang <roy.fan.zhang@intel.com> writes:

> This RFC patch adds a way to rte_security to process symmetric crypto
> workload in bulk synchronously for SW crypto devices.
>
> Originally both SW and HW crypto PMDs works under rte_cryptodev to
> process the crypto workload asynchronously. This way provides uniformity
> to both PMD types but also introduce unnecessary performance penalty to
> SW PMDs such as extra SW ring enqueue/dequeue steps to "simulate"
> asynchronous working manner and unnecessary HW addresses computation.
>
> We introduce a new way for SW crypto devices that perform crypto operation
> synchronously with only fields required for the computation as input.
>
> In rte_security, a new action type "RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO"
> is introduced. This action type allows the burst of symmetric crypto
> workload using the same algorithm, key, and direction being processed by
> CPU cycles synchronously. This flexible action type does not require
> external hardware involvement.
>
> This patch also includes the announcement of a new API
> "rte_security_process_cpu_crypto_bulk". With this API the packet is sent to
> the crypto device for symmetric crypto processing. The device will encrypt
> or decrypt the buffer based on the session data specified and preprocessed
> in the security session. Different than the inline or lookaside modes, when
> the function exits, the user will expect the buffers are either processed
> successfully, or having the error number assigned to the appropriate index
> of the status array.
>
> The proof-of-concept AESNI-GCM and AESNI-MB SW PMDs are updated with the
> support of this new method. To demonstrate the performance gain with
> this method 2 simple performance evaluation apps under unit-test are added
> "app/test: security_aesni_gcm_perftest/security_aesni_mb_perftest". The
> users can freely compare their results against crypto perf application
> results.
>
> In the end, the ipsec library and ipsec-secgw sample application are also
> updated to support this feature. Several test scripts are added to the
> ipsec-secgw test-suite to prove the correctness of the implementation.
>
> Fan Zhang (10):
>   security: introduce CPU Crypto action type and API
>   crypto/aesni_gcm: add rte_security handler
>   app/test: add security cpu crypto autotest
>   app/test: add security cpu crypto perftest
>   crypto/aesni_mb: add rte_security handler
>   app/test: add aesni_mb security cpu crypto autotest
>   app/test: add aesni_mb security cpu crypto perftest
>   ipsec: add rte_security cpu_crypto action support
>   examples/ipsec-secgw: add security cpu_crypto action support
>   doc: update security cpu process description
>

Hi Fan,

This series has a problem on aarch64:

   ../app/test/test_security_cpu_crypto.c:626:16: error: implicit declaration of function ‘rte_get_tsc_hz’ [-Werror=implicit-function-declaration]
     uint64_t hz = rte_get_tsc_hz(), time_start, time_now;
                   ^
   ../app/test/test_security_cpu_crypto.c:679:16: error: implicit declaration of function ‘rte_rdtsc’ [-Werror=implicit-function-declaration]
      time_start = rte_rdtsc();
                   ^
   ../app/test/test_security_cpu_crypto.c:711:16: error: implicit declaration of function ‘rte_get_timer_cycles’ [-Werror=implicit-function-declaration]
      time_start = rte_get_timer_cycles();
                   ^

I'm not sure of the best way to address this in the test - maybe there's a
better API to use for getting the cycles?

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-06 13:27         ` Ananyev, Konstantin
@ 2019-09-10 10:44           ` Akhil Goyal
  2019-09-11 12:29             ` Ananyev, Konstantin
  0 siblings, 1 reply; 84+ messages in thread
From: Akhil Goyal @ 2019-09-10 10:44 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev
  Cc: Zhang, Roy Fan, Doherty, Declan, De Lara Guarch, Pablo


Hi Konstantin,
> 
> Hi Akhil,
> 
> > > This action type allows the burst of symmetric crypto workload using the
> same
> > > algorithm, key, and direction being processed by CPU cycles synchronously.
> > > This flexible action type does not require external hardware involvement,
> > > having the crypto workload processed synchronously, and is more
> performant
> > > than Cryptodev SW PMD due to the saved cycles on removed "async mode
> > > simulation" as well as 3 cacheline access of the crypto ops.
> >
> > Does that mean application will not call the cryptodev_enqueue_burst and
> corresponding dequeue burst.
> 
> Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> 
> > It would be a new API something like process_packets and it will have the
> crypto processed packets while returning from the API?
> 
> Yes, though the plan is that API will operate on raw data buffers, not mbufs.
> 
> >
> > I still do not understand why we cannot do with the conventional crypto lib
> only.
> > As far as I can understand, you are not doing any protocol processing or any
> value add
> > To the crypto processing. IMO, you just need a synchronous crypto processing
> API which
> > Can be defined in cryptodev, you don't need to re-create a crypto session in
> the name of
> > Security session in the driver just to do a synchronous processing.
> 
> I suppose your question is why not to have
> rte_crypot_process_cpu_crypto_bulk(...) instead?
> The main reason is that would require disruptive changes in existing cryptodev
> API
> (would cause ABI/API breakage).
> Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need some extra
> information
> that normal crypto_sym_xform doesn't contain
> (cipher offset from the start of the buffer, might be something extra in future).

The cipher offset will be part of rte_crypto_op. If you intend not to use rte_crypto_op,
you can pass it as an argument in the new cryptodev API.
Anything extra will cause ABI breakage in security as well,
so it will be the same.

> Also right now there is no way to add new type of crypto_sym_session without
> either breaking existing crypto-dev ABI/API or introducing new structure
> (rte_crypto_sym_cpu_session or so) for that.

What extra info is required in rte_cryptodev_sym_session to get the rte_crypto_sym_cpu_session?
I don't think there is any.
I believe the same crypto session will be able to work synchronously as well. We would only need
a new API to perform synchronous actions. That will significantly reduce the duplicated code
in the driver needed to support two different kinds of APIs with similar code inside.
Please correct me in case I am missing something.


> While rte_security is designed in a way that we can add new session types and
> related parameters without causing API/ABI breakage.

Yes, the intent is to add new sessions based on the various protocols that can be supported by the driver.
It is not that we should treat it as an alternative to cryptodev and use it just because it will not cause
ABI/API breakage. IMO the code should be placed where its intent is.

> 
> BTW, what is your concern with proposed approach (via rte_security)?
> From my perspective it is a lightweight change and it is totally optional
> for the crypto PMDs to support it or not.
> Konstantin
> 
> > >
> > > AESNI-GCM and AESNI-MB PMDs are updated with this support. There is a
> small
> > > performance test app under app/test/security_aesni_gcm(mb)_perftest to
> > > prove.
> > >
> > > For the new API
> > > The packet is sent to the crypto device for symmetric crypto
> > > processing. The device will encrypt or decrypt the buffer based on the
> session
> > > data specified and preprocessed in the security session. Different
> > > than the inline or lookaside modes, when the function exits, the user will
> > > expect the buffers are either processed successfully, or having the error
> number
> > > assigned to the appropriate index of the status array.
> > >
> > > Will update the program's guide in the v1 patch.
> > >
> > > Regards,
> > > Fan
> > >
> > > > -----Original Message-----
> > > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty,
> Declan
> > > > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > > <pablo.de.lara.guarch@intel.com>
> > > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action type
> and
> > > > API
> > > >
> > > > Hi Fan,
> > > >
> > > > >
> > > > > This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > > action
> > > > > type to security library. The type represents performing crypto
> > > > > operation with CPU cycles. The patch also includes a new API to
> > > > > process crypto operations in bulk and the function pointers for PMDs.
> > > > >
> > > > I am not able to get the flow of execution for this action type. Could you
> > > > please elaborate the flow in the documentation. If not in documentation
> > > > right now, then please elaborate the flow in cover letter.
> > > > Also I see that there are new APIs for processing crypto operations in bulk.
> > > > What does that mean. How are they different from the existing APIs which
> > > > are also handling bulk crypto ops depending on the budget.
> > > >
> > > >
> > > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-06 13:12         ` Zhang, Roy Fan
@ 2019-09-10 11:25           ` Akhil Goyal
  2019-09-11 13:01             ` Ananyev, Konstantin
  0 siblings, 1 reply; 84+ messages in thread
From: Akhil Goyal @ 2019-09-10 11:25 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev
  Cc: Ananyev, Konstantin, Doherty, Declan, De Lara Guarch, Pablo

Hi Fan,
> 
> Hi Akhil,
> 
> You are right, the new API will process the crypto workload, no heavy enqueue
> Dequeue operations required.
> 
> Cryptodev tends to support multiple crypto devices, including HW and SW.
> The 3-cache line access, iova address computation and assignment, simulation
> of async enqueue/dequeue operations, allocate and free crypto ops, even the
> mbuf linked-list for scatter-gather buffers are too heavy for SW crypto PMDs.

Why can't we have a cryptodev synchronous API which works on plain buffers, like your suggested
API, and uses the same crypto sym_session creation logic as before? It would perform the
same as it does in this series.

> 
> To create this new synchronous API in cryptodev cannot avoid the problem
> listed above:  first the API shall not serve only to part of the crypto (SW) PMDs -
> as you know, it is Cryptodev. The users can expect some PMD only support part
> of the overall algorithms, but not the workload processing API.

Why can't we have an optional data path in cryptodev for synchronous behavior if the
underlying PMD supports it? It is up to the PMD to decide whether it can support it or not;
only a feature flag would be needed to advertise that.
One more option could be a PMD-specific API which the application can call directly if the
mode is only supported by very few PMDs. This could be a backup if there is a
requirement of a deprecation notice, etc.
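
The feature-flag pattern suggested here can be sketched as below. The flag name and structures
are hypothetical stand-ins; no such cryptodev capability bit existed at the time of this thread:

```c
#include <stdint.h>

#define FF_SYM_CPU_CRYPTO (1ULL << 20)	/* illustrative capability bit */

/* Simplified stand-in for a device-info structure. */
struct dev_info {
	uint64_t feature_flags;
};

/* The application checks the capability bit before choosing the
 * synchronous data path; PMDs without the bit keep the async path. */
static int
supports_cpu_crypto(const struct dev_info *info)
{
	return (info->feature_flags & FF_SYM_CPU_CRYPTO) != 0;
}
```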

> 
> Another reason is, there is assumption made, first when creating a crypto op
> we have to allocate the memory to hold crypto op + sym op + iv, - we cannot
> simply declare an array of crypto ops in the run-time and discard it when
> processing
> is done. Also we need to fill aad and digest HW address, which is not required for
> SW at all.

We are defining a new API which may have its own parameters and requirements that
need to be fulfilled. If it were an rte_security API, you would likewise be defining a new way
of packet execution and new API params, so it would be the same.
You can reduce the cache-line accesses as you need in the new API.
The session logic need not be changed from crypto session to security session;
only the data path needs to be altered as per the new API.

> 
> Bottom line: using crypto op will still have 3 cache-line access performance
> problem.
> 
> So if we to create the new API in Cryptodev instead of rte_security, we need to
> create new crypto op structure only for the SW PMDs, carefully document them
> to not confuse with existing cryptodev APIs, make new device feature flags to
> indicate the API is not supported by some PMDs, and again carefully document
> them of these device feature flags.

The new API would have to be explained in any case, even if it were a security API. With
rte_security you additionally need to explain the session, which is already there in cryptodev.

> 
> So, to push these changes to rte_security instead the above problem can be
> resolved,
> and the performance improvement because of this change is big for smaller
> packets
> - I attached a performance test app in the patchset.

I believe there won't be any perf gap if an optimized new cryptodev API is used.

> 
> For rte_security, we already have inline-crypto type that works quite close to the
> this
> new API, the only difference is that it is processed by the CPU cycles. As you may
> have already seen the ipsec-library has wrapped these changes, and ipsec-secgw
> has only minimum updates to adopt this change too. So to the end user, if they
> use IPSec this patchset can seamlessly enabled with just commandline update
> when
> creating an SA.

In the IPsec application I do not see the changes w.r.t. the new execution API,
so the data path is not getting handled there. It looks incomplete. The user experience
of the new API will definitely be changed.
So I believe this patchset is not required in rte_security; we can have it in cryptodev, unless
I have missed something.

> 
> Regards,
> Fan
> 
> 
> > -----Original Message-----
> > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > Sent: Friday, September 6, 2019 10:01 AM
> > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty, Declan
> > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > <pablo.de.lara.guarch@intel.com>
> > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action type and
> > API
> >
> >
> > Hi Fan,
> > >
> > > Hi Akhil,
> > >
> > > This action type allows the burst of symmetric crypto workload using
> > > the same algorithm, key, and direction being processed by CPU cycles
> > synchronously.
> > > This flexible action type does not require external hardware
> > > involvement, having the crypto workload processed synchronously, and
> > > is more performant than Cryptodev SW PMD due to the saved cycles on
> > > removed "async mode simulation" as well as 3 cacheline access of the
> > crypto ops.
> >
> > Does that mean application will not call the cryptodev_enqueue_burst and
> > corresponding dequeue burst.
> > It would be a new API something like process_packets and it will have the
> > crypto processed packets while returning from the API?
> >
> > I still do not understand why we cannot do with the conventional crypto lib
> > only.
> > As far as I can understand, you are not doing any protocol processing or any
> > value add To the crypto processing. IMO, you just need a synchronous crypto
> > processing API which Can be defined in cryptodev, you don't need to re-
> > create a crypto session in the name of Security session in the driver just to do
> > a synchronous processing.
> >
> > >
> > > AESNI-GCM and AESNI-MB PMDs are updated with this support. There is a
> > > small performance test app under
> > > app/test/security_aesni_gcm(mb)_perftest to prove.
> > >
> > > For the new API
> > > The packet is sent to the crypto device for symmetric crypto
> > > processing. The device will encrypt or decrypt the buffer based on the
> > > session data specified and preprocessed in the security session.
> > > Different than the inline or lookaside modes, when the function exits,
> > > the user will expect the buffers are either processed successfully, or
> > > having the error number assigned to the appropriate index of the status
> > array.
> > >
> > > Will update the program's guide in the v1 patch.
> > >
> > > Regards,
> > > Fan
> > >
> > > > -----Original Message-----
> > > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty,
> > > > Declan <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > > <pablo.de.lara.guarch@intel.com>
> > > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action
> > > > type and API
> > > >
> > > > Hi Fan,
> > > >
> > > > >
> > > > > This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > > action
> > > > > type to security library. The type represents performing crypto
> > > > > operation with CPU cycles. The patch also includes a new API to
> > > > > process crypto operations in bulk and the function pointers for PMDs.
> > > > >
> > > > I am not able to get the flow of execution for this action type.
> > > > Could you please elaborate the flow in the documentation. If not in
> > > > documentation right now, then please elaborate the flow in cover letter.
> > > > Also I see that there are new APIs for processing crypto operations in
> > bulk.
> > > > What does that mean. How are they different from the existing APIs
> > > > which are also handling bulk crypto ops depending on the budget.
> > > >
> > > >
> > > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-10 10:44           ` Akhil Goyal
@ 2019-09-11 12:29             ` Ananyev, Konstantin
  2019-09-12 14:12               ` Akhil Goyal
  0 siblings, 1 reply; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-11 12:29 UTC (permalink / raw)
  To: Akhil Goyal, dev; +Cc: Zhang, Roy Fan, Doherty, Declan, De Lara Guarch, Pablo



Hi Akhil,
> >
> > > > This action type allows the burst of symmetric crypto workload using the
> > same
> > > > algorithm, key, and direction being processed by CPU cycles synchronously.
> > > > This flexible action type does not require external hardware involvement,
> > > > having the crypto workload processed synchronously, and is more
> > performant
> > > > than Cryptodev SW PMD due to the saved cycles on removed "async mode
> > > > simulation" as well as 3 cacheline access of the crypto ops.
> > >
> > > Does that mean application will not call the cryptodev_enqueue_burst and
> > corresponding dequeue burst.
> >
> > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> >
> > > It would be a new API something like process_packets and it will have the
> > crypto processed packets while returning from the API?
> >
> > Yes, though the plan is that API will operate on raw data buffers, not mbufs.
> >
> > >
> > > I still do not understand why we cannot do with the conventional crypto lib
> > only.
> > > As far as I can understand, you are not doing any protocol processing or any
> > value add
> > > To the crypto processing. IMO, you just need a synchronous crypto processing
> > API which
> > > Can be defined in cryptodev, you don't need to re-create a crypto session in
> > the name of
> > > Security session in the driver just to do a synchronous processing.
> >
> > I suppose your question is why not to have
> > rte_crypot_process_cpu_crypto_bulk(...) instead?
> > The main reason is that would require disruptive changes in existing cryptodev
> > API
> > (would cause ABI/API breakage).
> > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need some extra
> > information
> > that normal crypto_sym_xform doesn't contain
> > (cipher offset from the start of the buffer, might be something extra in future).
> 
> Cipher offset will be part of rte_crypto_op.

Filling/reading crypto ops (plus alloc/free) is one of the main things that slow down the current
crypto-op approach. That's why the general idea is to have all the data that doesn't change from
packet to packet included in the session, and set it up once at session_init().
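
A minimal sketch of that idea, with hypothetical names rather than the real DPDK structures:
the per-session constant is stored once at init time and only read on the per-packet path.

```c
#include <stdint.h>

/* Illustrative session holding the data that is constant per session. */
struct cpu_crypto_session {
	uint32_t cipher_offset;
};

/* Set once, at session creation - never passed per packet. */
static void
session_init(struct cpu_crypto_session *s, uint32_t cipher_offset)
{
	s->cipher_offset = cipher_offset;
}

/* The per-packet path just reads the constant from the session. */
static const uint8_t *
payload(const struct cpu_crypto_session *s, const uint8_t *buf)
{
	return buf + s->cipher_offset;
}
```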

> If you intend not to use rte_crypto_op
> You can pass this as an argument in the new cryptodev API.

You mean an extra parameter in rte_security_process_cpu_crypto_bulk()?
It could be, in theory, but that solution looks a bit ugly:
	why pass on each call something that is constant per session?
	Also, having that value constant per session might allow some extra optimisations
	that would be hard to achieve in the dynamic case.
It is also not extendable:
suppose tomorrow we need to add something extra (some new algorithm support or so).
With what you are proposing, we would need to add a new parameter to the function,
which means API breakage.

> Something extra will also cause ABI breakage in security as well.
> So it will be same.

I don't think it would.
AFAIK, right now this patch doesn't introduce any API/ABI breakage.
Inside struct rte_security_session_conf we have a union of xforms,
depending on the session type.
So as long as cpu_crypto_xform doesn't exceed the size of the other xforms,
I believe no ABI breakage will appear.
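
The union argument can be illustrated with a compilable mock (simplified stand-in names, not the
real DPDK definitions): adding a small cpu_crypto member to the union leaves the struct size,
and hence the layout seen by existing users, unchanged.

```c
#include <stdint.h>

/* Stand-ins for the existing per-action-type xforms. */
struct ipsec_xform  { uint8_t opaque[64]; };
struct macsec_xform { uint8_t opaque[32]; };
/* New, small xform for the CPU crypto action type. */
struct cpu_crypto_xform { uint32_t cipher_offset; };

/* Layout before the new action type was added. */
struct session_conf_old {
	int action_type;
	union {
		struct ipsec_xform ipsec;
		struct macsec_xform macsec;
	};
};

/* Layout after: one more union member, no bigger than the largest
 * existing one, so sizeof() - and the ABI - stays the same. */
struct session_conf_new {
	int action_type;
	union {
		struct ipsec_xform ipsec;
		struct macsec_xform macsec;
		struct cpu_crypto_xform cpu_crypto; /* added member */
	};
};
```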


> 
> > Also right now there is no way to add new type of crypto_sym_session without
> > either breaking existing crypto-dev ABI/API or introducing new structure
> > (rte_crypto_sym_cpu_session or so) for that.
> 
> What extra info is required in rte_cryptodev_sym_session to get the rte_crypto_sym_cpu_session.

Right now - just cipher_offset (see above).
What else in future (if any) - don't know.

> I don't think there is any.
> I believe the same crypto session will be able to work synchronously as well.

Using exactly the same session is problematic - see above.

> We would only need  a new API to perform synchronous actions.
> That will reduce the duplication code significantly
> in the driver to support 2 different kind of APIs with similar code inside.
> Please correct me in case I am missing something.

Adding a new API into cryptodev would also require changes in the PMDs;
it wouldn't come totally free, and I believe it would require roughly the same amount of changes.

> 
> 
> > While rte_security is designed in a way that we can add new session types and
> > related parameters without causing API/ABI breakage.
> 
> Yes the intent is to add new sessions based on various protocols that can be supported by the driver.

Various protocols and different types of sessions (and the devices they belong to).
Let's say right now we have INLINE_CRYPTO, INLINE_PROTO, LOOKASIDE_PROTO, etc.
Here we introduce a new type of session.

> It is not that we should find it as an alternative to cryptodev and using it just because it will not cause
> ABI/API breakage.

I am considering this new API not as an alternative to the existing ones, but as an extension.
The existing crypto-op API has its own advantages (it is generic), and I think we should keep it
supported by all crypto-devs.
On the other side, rte_security is an extendable framework that suits the purpose:
it allows us to easily (and yes, without ABI breakage) introduce a new API for a special type of
crypto-dev (SW based).

> IMO the code should be placed where its intent is.
> 
> >
> > BTW, what is your concern with proposed approach (via rte_security)?
> > From my perspective it is a lightweight change and it is totally optional
> > for the crypto PMDs to support it or not.
> > Konstantin
> >
> > > >
> > > > AESNI-GCM and AESNI-MB PMDs are updated with this support. There is a
> > small
> > > > performance test app under app/test/security_aesni_gcm(mb)_perftest to
> > > > prove.
> > > >
> > > > For the new API
> > > > The packet is sent to the crypto device for symmetric crypto
> > > > processing. The device will encrypt or decrypt the buffer based on the
> > session
> > > > data specified and preprocessed in the security session. Different
> > > > than the inline or lookaside modes, when the function exits, the user will
> > > > expect the buffers are either processed successfully, or having the error
> > number
> > > > assigned to the appropriate index of the status array.
> > > >
> > > > Will update the program's guide in the v1 patch.
> > > >
> > > > Regards,
> > > > Fan
> > > >
> > > > > -----Original Message-----
> > > > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty,
> > Declan
> > > > > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > > > <pablo.de.lara.guarch@intel.com>
> > > > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action type
> > and
> > > > > API
> > > > >
> > > > > Hi Fan,
> > > > >
> > > > > >
> > > > > > This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > > > action
> > > > > > type to security library. The type represents performing crypto
> > > > > > operation with CPU cycles. The patch also includes a new API to
> > > > > > process crypto operations in bulk and the function pointers for PMDs.
> > > > > >
> > > > > I am not able to get the flow of execution for this action type. Could you
> > > > > please elaborate the flow in the documentation. If not in documentation
> > > > > right now, then please elaborate the flow in cover letter.
> > > > > Also I see that there are new APIs for processing crypto operations in bulk.
> > > > > What does that mean. How are they different from the existing APIs which
> > > > > are also handling bulk crypto ops depending on the budget.
> > > > >
> > > > >
> > > > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-10 11:25           ` Akhil Goyal
@ 2019-09-11 13:01             ` Ananyev, Konstantin
  0 siblings, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-11 13:01 UTC (permalink / raw)
  To: Akhil Goyal, Zhang, Roy Fan, dev; +Cc: Doherty, Declan, De Lara Guarch, Pablo


Hi lads,
> >
> > You are right, the new API will process the crypto workload, no heavy enqueue
> > Dequeue operations required.
> >
> > Cryptodev tends to support multiple crypto devices, including HW and SW.
> > The 3-cache line access, iova address computation and assignment, simulation
> > of async enqueue/dequeue operations, allocate and free crypto ops, even the
> > mbuf linked-list for scatter-gather buffers are too heavy for SW crypto PMDs.
> 
> Why cant we have a cryptodev synchronous API which work on plain bufs as your suggested
> API and use the same crypto sym_session creation logic as it was before? It will perform
> same as it is doing in this series.

I tried to summarize our reasons in another mail in that thread.

> 
> >
> > To create this new synchronous API in cryptodev cannot avoid the problem
> > listed above:  first the API shall not serve only to part of the crypto (SW) PMDs -
> > as you know, it is Cryptodev. The users can expect some PMD only support part
> > of the overall algorithms, but not the workload processing API.
> 
> Why cant we have an optional data path in cryptodev for synchronous behavior if the
> underlying PMD support it. It depends on the PMD to decide whether it can have it supported or not.
> Only a feature flag will be needed to decide that.
> One more option could be a PMD API which the application can directly call if the
> mode is only supported in very few PMDs. This could be a backup if there is a
> requirement of deprecation notice etc.
> 
> >
> > Another reason is, there is assumption made, first when creating a crypto op
> > we have to allocate the memory to hold crypto op + sym op + iv, - we cannot
> > simply declare an array of crypto ops in the run-time and discard it when
> > processing
> > is done. Also we need to fill aad and digest HW address, which is not required for
> > SW at all.
> 
> We are defining a new API which may have its own parameters and requirements which
> Need to be fulfilled. In case it was a rte_security API, then also you are defining a new way
> Of packet execution and API params. So it would be same.
> You can reduce the cache line accesses as you need in the new API.
> The session logic need not be changed from crypto session to security session.
> Only the data patch need to be altered as per the new API.
> 
> >
> > Bottom line: using crypto op will still have 3 cache-line access performance
> > problem.
> >
> > So if we to create the new API in Cryptodev instead of rte_security, we need to
> > create new crypto op structure only for the SW PMDs, carefully document them
> > to not confuse with existing cryptodev APIs, make new device feature flags to
> > indicate the API is not supported by some PMDs, and again carefully document
> > them of these device feature flags.
> 
> The explanation of the new API will also happen in case it is a security API. Instead you need
> to add more explanation for session also which is already there in cryptodev.
> 
> >
> > So, to push these changes to rte_security instead the above problem can be
> > resolved,
> > and the performance improvement because of this change is big for smaller
> > packets
> > - I attached a performance test app in the patchset.
> 
> I believe there wont be any perf gap in case the optimized new cryptodev API is used.
> 
> >
> > For rte_security, we already have inline-crypto type that works quite close to the
> > this
> > new API, the only difference is that it is processed by the CPU cycles. As you may
> > have already seen the ipsec-library has wrapped these changes, and ipsec-secgw
> > has only minimum updates to adopt this change too. So to the end user, if they
> > use IPSec this patchset can seamlessly enabled with just commandline update
> > when
> > creating an SA.
> 
> In the IPSec application I do not see the changes wrt the new execution API.
> So the data path is not getting handled there. It looks incomplete. The user experience
> to use the new API will definitely be changed.

I believe we do support it for the librte_ipsec mode.
librte_ipsec hides all the processing complexity inside and
calls rte_security_process_cpu_crypto_bulk() internally.
That's why for librte_ipsec it is literally a 2-line change:
--- a/examples/ipsec-secgw/ipsec_process.c
+++ b/examples/ipsec-secgw/ipsec_process.c
@@ -101,7 +101,8 @@  fill_ipsec_session(struct rte_ipsec_session *ss, struct ipsec_ctx *ctx,
 		}
 		ss->crypto.ses = sa->crypto_session;
 	/* setup session action type */
-	} else if (sa->type == RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL) {
+	} else if (sa->type == RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL ||
+			sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
 		if (sa->sec_session == NULL) {
 			rc = create_lookaside_session(ctx, sa);
 			if (rc != 0)
@@ -227,8 +228,8 @@  ipsec_process(struct ipsec_ctx *ctx, struct ipsec_traffic *trf)
 
 		/* process packets inline */
 		else if (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO ||
-				sa->type ==
-				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) {
+			sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL ||
+			sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
 
 			satp = rte_ipsec_sa_type(ips->sa);





^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-11 12:29             ` Ananyev, Konstantin
@ 2019-09-12 14:12               ` Akhil Goyal
  2019-09-16 14:53                 ` Ananyev, Konstantin
  0 siblings, 1 reply; 84+ messages in thread
From: Akhil Goyal @ 2019-09-12 14:12 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph

Hi Konstantin,

> Hi Akhil,
> > >
> > > > > This action type allows the burst of symmetric crypto workload using the
> > > same
> > > > > algorithm, key, and direction being processed by CPU cycles
> synchronously.
> > > > > This flexible action type does not require external hardware involvement,
> > > > > having the crypto workload processed synchronously, and is more
> > > performant
> > > > > than Cryptodev SW PMD due to the saved cycles on removed "async
> mode
> > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > >
> > > > Does that mean application will not call the cryptodev_enqueue_burst and
> > > corresponding dequeue burst.
> > >
> > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > >
> > > > It would be a new API something like process_packets and it will have the
> > > crypto processed packets while returning from the API?
> > >
> > > Yes, though the plan is that API will operate on raw data buffers, not mbufs.
> > >
> > > >
> > > > I still do not understand why we cannot do with the conventional crypto lib
> > > only.
> > > > As far as I can understand, you are not doing any protocol processing or
> any
> > > value add
> > > > To the crypto processing. IMO, you just need a synchronous crypto
> processing
> > > API which
> > > > Can be defined in cryptodev, you don't need to re-create a crypto session
> in
> > > the name of
> > > > Security session in the driver just to do a synchronous processing.
> > >
> > > I suppose your question is why not to have
> > > rte_crypto_process_cpu_crypto_bulk(...) instead?
> > > The main reason is that would require disruptive changes in existing
> cryptodev
> > > API
> > > (would cause ABI/API breakage).
> > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need some extra
> > > information
> > > that normal crypto_sym_xform doesn't contain
> > > (cipher offset from the start of the buffer, might be something extra in
> future).
> >
> > Cipher offset will be part of rte_crypto_op.
> 
> fill/read (+ alloc/free) is one of the main things that slowdown current crypto-op
> approach.
> That's why the general idea - have all data that wouldn't change from packet to
> packet
> included into the session and setup it once at session_init().

I agree that you cannot use crypto-op.
You can have the new API in crypto.
As per the current patch, you only need cipher_offset, which you can pass as a parameter until
you get it approved in the crypto xform. I believe it will be beneficial for other crypto cases as well.
We can have the cipher offset in both places (crypto-op and cipher_xform). That will give the user the flexibility to
override it.
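
The "constant in the session, per-call override" pattern being discussed could be sketched as below; the structure and function names are purely illustrative, not part of any existing or proposed DPDK API:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model: cipher_offset lives in the session as a
 * per-session constant (set once at session init), while a per-call
 * value may override it. A negative per-call offset means "use the
 * session default".
 */
struct sync_crypto_session {
	int32_t cipher_offset;   /* constant, set once at session init */
};

static inline int32_t
effective_cipher_offset(const struct sync_crypto_session *s, int32_t per_call)
{
	/* a non-negative per-call value overrides the session default */
	return per_call >= 0 ? per_call : s->cipher_offset;
}
```

The trade-off debated in this thread is visible here: keeping the offset in the session allows setup-once optimisation, while the override parameter preserves per-call flexibility.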


> 
> > If you intend not to use rte_crypto_op
> > You can pass this as an argument in the new cryptodev API.
> 
> You mean extra parameter in rte_security_process_cpu_crypto_bulk()?
> It can be in theory, but that solution looks a bit ugly:
> 	why to pass for each call something that would be constant per session?
> 	Again having that value constant per session might allow some extra
> optimisations
> 	That would be hard to achieve for dynamic case.
> and not extendable:
> Suppose tomorrow will need to add something extra (some new algorithm
> support or so).
> With what you proposing will need to new parameter to the function,
> which means API breakage.
> 
> > Something extra will also cause ABI breakage in security as well.
> > So it will be same.
> 
> I don't think it would.
> AFAIK, right now this patch doesn't introduce any API/ABI breakage.
> Inside struct rte_security_session_conf we have a union of xforms
> depending on session type.
> So as long as cpu_crypto_xform wouldn't exceed sizes of other xform -
> I believe no ABI breakage will appear.
Agreed, it will not break the ABI in the security case as long as we do not exceed the current size.

Which is more important: avoiding an ABI/API breakage, or placing the code in the correct place?
We need to find a tradeoff. Others can comment on this.
@Thomas Monjalon, @De Lara Guarch, Pablo: any comments?
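
For reference, the union-size argument made above can be demonstrated with a small model; the types below are stand-ins with made-up sizes, not the real rte_security definitions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the ABI argument: a session conf holds a union of
 * per-action-type configs. Adding a new union member does not change
 * the struct size (hence no ABI break) as long as the new member is no
 * larger than the largest existing one.
 */
struct ipsec_xform_model { uint8_t data[64]; };
struct pdcp_xform_model  { uint8_t data[32]; };
struct cpu_crypto_model  { uint8_t data[48]; }; /* new member, fits */

struct session_conf_model_old {
	int action_type;
	union {
		struct ipsec_xform_model ipsec;
		struct pdcp_xform_model pdcp;
	};
};

struct session_conf_model {
	int action_type;
	union {
		struct ipsec_xform_model ipsec;
		struct pdcp_xform_model pdcp;
		struct cpu_crypto_model cpu_crypto; /* newly added */
	};
};
```

Since the union size equals its largest member, the new 48-byte member leaves the 64-byte union, and therefore the struct layout, untouched.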

> 
> 
> >
> > > Also right now there is no way to add new type of crypto_sym_session
> without
> > > either breaking existing crypto-dev ABI/API or introducing new structure
> > > (rte_crypto_sym_cpu_session or so) for that.
> >
> > What extra info is required in rte_cryptodev_sym_session to get the
> rte_crypto_sym_cpu_session.
> 
> Right now - just cipher_offset (see above).
> What else in future (if any) - don't know.
> 
> > I don't think there is any.
> > I believe the same crypto session will be able to work synchronously as well.
> 
> Exactly the same - problematically, see above.
> 
> > We would only need  a new API to perform synchronous actions.
> > That will reduce the duplication code significantly
> > in the driver to support 2 different kind of APIs with similar code inside.
> > Please correct me in case I am missing something.
> 
> To add new API into crypto-dev would also require changes in the PMD,
> it wouldn't come totally free and I believe would require roughly the same
> amount of changes.

It will be required only in the PMDs which support it, and the changes would be minimal.
You would need a feature flag and support for that synchronous API. The session information will
already be there in the session. The changes wrt cipher_offset need to be added,
but with some default value to identify whether an override will be done or not.
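
The feature-flag idea could look roughly like this; the flag and structure names are hypothetical stand-ins for their cryptodev equivalents (cryptodev uses a similar RTE_CRYPTODEV_FF_* scheme, but this exact bit is part of the proposal, not the existing API):

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: a device advertises optional synchronous-processing support
 * via a capability bit, and the application checks it before taking
 * the sync path.
 */
#define FF_SYM_SESSIONLESS (1ULL << 0)
#define FF_SYM_CPU_CRYPTO  (1ULL << 1)  /* hypothetical new flag */

struct dev_info_model {
	uint64_t feature_flags;
};

static inline int
supports_cpu_crypto(const struct dev_info_model *info)
{
	return (info->feature_flags & FF_SYM_CPU_CRYPTO) != 0;
}
```

A PMD that does not support the sync path simply never sets the bit, so the feature stays fully optional, as argued above.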

> 
> >
> >
> > > While rte_security is designed in a way that we can add new session types
> and
> > > related parameters without causing API/ABI breakage.
> >
> > Yes the intent is to add new sessions based on various protocols that can be
> supported by the driver.
> 
> Various protocols and different types of sessions (and devices they belong to).
> Let say right now we have INLINE_CRYPTO, INLINE_PROTO, LOOKASIDE_PROTO,
> etc.
> Here we introduce new type of session.

What is the new value-add over the existing sessions? The changes that we are doing
here are just to avoid an API/ABI breakage. The synchronous processing can happen on both
crypto and security sessions. This would mean only the processing API should be defined;
everything else should already be there in the sessions.
In all the other cases there was a real gap: for INLINE, the eth device had no format to perform a crypto op;
for LOOKASIDE-PROTO, it adds protocol-specific sessions which are not available in cryptodev.

> 
> > It is not that we should find it as an alternative to cryptodev and using it just
> because it will not cause
> > ABI/API breakage.
> 
> I am considering this new API as an alternative to existing ones, but as an
> extension.
> Existing crypto-op API has its own advantages (generic), and I think we should
> keep it supported by all crypto-devs.
> From other side rte_security is an extendable framework that suits the purpose:
> allows easily (and yes without ABI breakage) introduce new API for special type
> of crypto-dev (SW based).
> 
> 

Adding a synchronous processing API is understandable and can be added in both
crypto as well as security, but a new action type for it is not required.
Whether supporting that causes an ABI/API breakage is a different issue,
and we may have to deal with it if there is no other option.

> 
> 
> 
> > IMO the code should be placed where its intent is.
> >
> > >
> > > BTW, what is your concern with proposed approach (via rte_security)?
> > > From my perspective it is a lightweight change and it is totally optional
> > > for the crypto PMDs to support it or not.
> > > Konstantin
> > >
> > > > >
> > > > > AESNI-GCM and AESNI-MB PMDs are updated with this support. There is
> a
> > > small
> > > > > performance test app under app/test/security_aesni_gcm(mb)_perftest
> to
> > > > > prove.
> > > > >
> > > > > For the new API
> > > > > The packet is sent to the crypto device for symmetric crypto
> > > > > processing. The device will encrypt or decrypt the buffer based on the
> > > session
> > > > > data specified and preprocessed in the security session. Different
> > > > > than the inline or lookaside modes, when the function exits, the user will
> > > > > expect the buffers are either processed successfully, or having the error
> > > number
> > > > > assigned to the appropriate index of the status array.
> > > > >
> > > > > Will update the program's guide in the v1 patch.
> > > > >
> > > > > Regards,
> > > > > Fan
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > > > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > > > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty,
> > > Declan
> > > > > > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > > > > <pablo.de.lara.guarch@intel.com>
> > > > > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action
> type
> > > and
> > > > > > API
> > > > > >
> > > > > > Hi Fan,
> > > > > >
> > > > > > >
> > > > > > > This patch introduce new
> RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > > > > action
> > > > > > > type to security library. The type represents performing crypto
> > > > > > > operation with CPU cycles. The patch also includes a new API to
> > > > > > > process crypto operations in bulk and the function pointers for PMDs.
> > > > > > >
> > > > > > I am not able to get the flow of execution for this action type. Could
> you
> > > > > > please elaborate the flow in the documentation. If not in
> documentation
> > > > > > right now, then please elaborate the flow in cover letter.
> > > > > > Also I see that there are new APIs for processing crypto operations in
> bulk.
> > > > > > What does that mean. How are they different from the existing APIs
> which
> > > > > > are also handling bulk crypto ops depending on the budget.
> > > > > >
> > > > > >
> > > > > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-12 14:12               ` Akhil Goyal
@ 2019-09-16 14:53                 ` Ananyev, Konstantin
  2019-09-16 15:08                   ` Ananyev, Konstantin
  2019-09-17  6:02                   ` Akhil Goyal
  0 siblings, 2 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-16 14:53 UTC (permalink / raw)
  To: Akhil Goyal, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph

Hi Akhil,

> > > > > > This action type allows the burst of symmetric crypto workload using the
> > > > same
> > > > > > algorithm, key, and direction being processed by CPU cycles
> > synchronously.
> > > > > > This flexible action type does not require external hardware involvement,
> > > > > > having the crypto workload processed synchronously, and is more
> > > > performant
> > > > > > than Cryptodev SW PMD due to the saved cycles on removed "async
> > mode
> > > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > > >
> > > > > Does that mean application will not call the cryptodev_enqueue_burst and
> > > > corresponding dequeue burst.
> > > >
> > > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > > >
> > > > > It would be a new API something like process_packets and it will have the
> > > > crypto processed packets while returning from the API?
> > > >
> > > > Yes, though the plan is that API will operate on raw data buffers, not mbufs.
> > > >
> > > > >
> > > > > I still do not understand why we cannot do with the conventional crypto lib
> > > > only.
> > > > > As far as I can understand, you are not doing any protocol processing or
> > any
> > > > value add
> > > > > To the crypto processing. IMO, you just need a synchronous crypto
> > processing
> > > > API which
> > > > > Can be defined in cryptodev, you don't need to re-create a crypto session
> > in
> > > > the name of
> > > > > Security session in the driver just to do a synchronous processing.
> > > >
> > > > I suppose your question is why not to have
> > > > rte_crypto_process_cpu_crypto_bulk(...) instead?
> > > > The main reason is that would require disruptive changes in existing
> > cryptodev
> > > > API
> > > > (would cause ABI/API breakage).
> > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need some extra
> > > > information
> > > > that normal crypto_sym_xform doesn't contain
> > > > (cipher offset from the start of the buffer, might be something extra in
> > future).
> > >
> > > Cipher offset will be part of rte_crypto_op.
> >
> > fill/read (+ alloc/free) is one of the main things that slowdown current crypto-op
> > approach.
> > That's why the general idea - have all data that wouldn't change from packet to
> > packet
> > included into the session and setup it once at session_init().
> 
> I agree that you cannot use crypto-op.
> You can have the new API in crypto.
> As per the current patch, you only need cipher_offset which you can have it as a parameter until
> You get it approved in the crypto xform. I believe it will be beneficial in case of other crypto cases as well.
> We can have cipher offset at both places(crypto-op and cipher_xform). It will give flexibility to the user to
> override it.

After having another thought on your proposal:
Perhaps we can introduce new rte_crypto_sym_xform_type values for the CPU-related stuff here?
Let's say we can have:
enum rte_crypto_sym_xform_type {
        RTE_CRYPTO_SYM_XFORM_NOT_SPECIFIED = 0, /**< No xform specified */
        RTE_CRYPTO_SYM_XFORM_AUTH,              /**< Authentication xform */
        RTE_CRYPTO_SYM_XFORM_CIPHER,            /**< Cipher xform  */
        RTE_CRYPTO_SYM_XFORM_AEAD               /**< AEAD xform  */
+     RTE_CRYPTO_SYM_XFORM_CPU = INT32_MIN,
+    RTE_CRYPTO_SYM_XFORM_CPU_AEAD = (RTE_CRYPTO_SYM_XFORM_CPU | RTE_CRYPTO_SYM_XFORM_CPU),
      /* same for auth and crypto xforms */
};

Then we can either re-define some values in struct rte_crypto_aead_xform (via unions),
or even have a new struct rte_crypto_cpu_aead_xform (same for cipher and auth xforms).
Then if a PMD wants to support the new sync API, it would need to recognize the new xform types,
and internally it might end up with a different session structure (one for sync, another for async mode).
That, I think, should allow us to introduce cpu_crypto as part of the crypto-dev API without ABI breakage.
What do you think?
Konstantin
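
As a quick sanity check of the sign-bit flag scheme proposed above (using the corrected CPU|AEAD definition from the follow-up mail), here is a standalone model with illustrative names, not the real DPDK enum:

```c
#include <assert.h>
#include <stdint.h>

/*
 * Sketch: reserve the sign bit of the 32-bit enum as a "CPU" flag, so
 * new synchronous xform types can be added without renumbering the
 * existing ones and without growing the enum's underlying type.
 */
enum xform_type_model {
	XFORM_NOT_SPECIFIED = 0,
	XFORM_AUTH,
	XFORM_CIPHER,
	XFORM_AEAD,
	XFORM_CPU = INT32_MIN,
	XFORM_CPU_AEAD = (XFORM_CPU | XFORM_AEAD),
};

static inline int
is_cpu_xform(enum xform_type_model t)
{
	/* any CPU-flavoured type has the sign bit set */
	return (t & XFORM_CPU) != 0;
}

static inline enum xform_type_model
base_xform(enum xform_type_model t)
{
	/* strip the CPU flag to recover the underlying xform type */
	return (enum xform_type_model)(t & ~XFORM_CPU);
}
```

Existing code that only ever sees the legacy values keeps working unchanged, which is the no-ABI-breakage property claimed above; only PMDs opting into the sync path need to test the flag bit.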
 
> 
> >
> > > If you intend not to use rte_crypto_op
> > > You can pass this as an argument in the new cryptodev API.
> >
> > You mean extra parameter in rte_security_process_cpu_crypto_bulk()?
> > It can be in theory, but that solution looks a bit ugly:
> > 	why to pass for each call something that would be constant per session?
> > 	Again having that value constant per session might allow some extra
> > optimisations
> > 	That would be hard to achieve for dynamic case.
> > and not extendable:
> > Suppose tomorrow will need to add something extra (some new algorithm
> > support or so).
> > With what you proposing will need to new parameter to the function,
> > which means API breakage.
> >
> > > Something extra will also cause ABI breakage in security as well.
> > > So it will be same.
> >
> > I don't think it would.
> > AFAIK, right now this patch doesn't introduce any API/ABI breakage.
> > Inside struct rte_security_session_conf we have a union of xforms
> > depending on session type.
> > So as long as cpu_crypto_xform wouldn't exceed sizes of other xform -
> > I believe no ABI breakage will appear.
> Agreed, it will not break ABI in case of security till we do not exceed current size.
> 
> Saving an ABI/API breakage is more important or placing the code at the correct place.
> We need to find a tradeoff. Others can comment on this.
> @Thomas Monjalon, @De Lara Guarch, Pablo Any comments?
> 
> >
> >
> > >
> > > > Also right now there is no way to add new type of crypto_sym_session
> > without
> > > > either breaking existing crypto-dev ABI/API or introducing new structure
> > > > (rte_crypto_sym_cpu_session or so) for that.
> > >
> > > What extra info is required in rte_cryptodev_sym_session to get the
> > rte_crypto_sym_cpu_session.
> >
> > Right now - just cipher_offset (see above).
> > What else in future (if any) - don't know.
> >
> > > I don't think there is any.
> > > I believe the same crypto session will be able to work synchronously as well.
> >
> > Exactly the same - problematically, see above.
> >
> > > We would only need  a new API to perform synchronous actions.
> > > That will reduce the duplication code significantly
> > > in the driver to support 2 different kind of APIs with similar code inside.
> > > Please correct me in case I am missing something.
> >
> > To add new API into crypto-dev would also require changes in the PMD,
> > it wouldn't come totally free and I believe would require roughly the same
> > amount of changes.
> 
> It will be required only in the PMDs which support it and would be minimal.
> You would need a feature flag, support  for that synchronous API. Session information will
> already be there in the session. The changes wrt cipher_offset need to be added
> but with some default value to identify override will be done or not.
> 
> >
> > >
> > >
> > > > While rte_security is designed in a way that we can add new session types
> > and
> > > > related parameters without causing API/ABI breakage.
> > >
> > > Yes the intent is to add new sessions based on various protocols that can be
> > supported by the driver.
> >
> > Various protocols and different types of sessions (and devices they belong to).
> > Let say right now we have INLINE_CRYPTO, INLINE_PROTO, LOOKASIDE_PROTO,
> > etc.
> > Here we introduce new type of session.
> 
> What is the new value add to the existing sessions. The changes that we are doing
> here is just to avoid an API/ABI breakage. The synchronous processing can happen on both
> crypto and security session. This would mean, only the processing API should be defined,
> rest all should be already there in the sessions.
> In All other cases, INLINE - eth device was not having any format to perform crypto op
> LOOKASIDE - PROTO - add protocol specific sessions which is not available in crypto.
> 
> >
> > > It is not that we should find it as an alternative to cryptodev and using it just
> > because it will not cause
> > > ABI/API breakage.
> >
> > I am considering this new API as an alternative to existing ones, but as an
> > extension.
> > Existing crypto-op API has its own advantages (generic), and I think we should
> > keep it supported by all crypto-devs.
> > From other side rte_security is an extendable framework that suits the purpose:
> > allows easily (and yes without ABI breakage) introduce new API for special type
> > of crypto-dev (SW based).
> >
> >
> 
> Adding a synchronous processing API is understandable and can be added in both
> Crypto as well as Security, but a new action type for it is not required.
> Now whether to support that, we have ABI/API breakage, that is a different issue.
> And we may have to deal with it if no other option is there.
> 
> >
> >
> >
> > > IMO the code should be placed where its intent is.
> > >
> > > >
> > > > BTW, what is your concern with proposed approach (via rte_security)?
> > > > From my perspective it is a lightweight change and it is totally optional
> > > > for the crypto PMDs to support it or not.
> > > > Konstantin
> > > >
> > > > > >
> > > > > > AESNI-GCM and AESNI-MB PMDs are updated with this support. There is
> > a
> > > > small
> > > > > > performance test app under app/test/security_aesni_gcm(mb)_perftest
> > to
> > > > > > prove.
> > > > > >
> > > > > > For the new API
> > > > > > The packet is sent to the crypto device for symmetric crypto
> > > > > > processing. The device will encrypt or decrypt the buffer based on the
> > > > session
> > > > > > data specified and preprocessed in the security session. Different
> > > > > > than the inline or lookaside modes, when the function exits, the user will
> > > > > > expect the buffers are either processed successfully, or having the error
> > > > number
> > > > > > assigned to the appropriate index of the status array.
> > > > > >
> > > > > > Will update the program's guide in the v1 patch.
> > > > > >
> > > > > > Regards,
> > > > > > Fan
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > > > > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > > > > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty,
> > > > Declan
> > > > > > > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > > > > > <pablo.de.lara.guarch@intel.com>
> > > > > > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action
> > type
> > > > and
> > > > > > > API
> > > > > > >
> > > > > > > Hi Fan,
> > > > > > >
> > > > > > > >
> > > > > > > > This patch introduce new
> > RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > > > > > action
> > > > > > > > type to security library. The type represents performing crypto
> > > > > > > > operation with CPU cycles. The patch also includes a new API to
> > > > > > > > process crypto operations in bulk and the function pointers for PMDs.
> > > > > > > >
> > > > > > > I am not able to get the flow of execution for this action type. Could
> > you
> > > > > > > please elaborate the flow in the documentation. If not in
> > documentation
> > > > > > > right now, then please elaborate the flow in cover letter.
> > > > > > > Also I see that there are new APIs for processing crypto operations in
> > bulk.
> > > > > > > What does that mean. How are they different from the existing APIs
> > which
> > > > > > > are also handling bulk crypto ops depending on the budget.
> > > > > > >
> > > > > > >
> > > > > > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-16 14:53                 ` Ananyev, Konstantin
@ 2019-09-16 15:08                   ` Ananyev, Konstantin
  2019-09-17  6:02                   ` Akhil Goyal
  1 sibling, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-16 15:08 UTC (permalink / raw)
  To: Ananyev, Konstantin, Akhil Goyal, dev, De Lara Guarch, Pablo,
	Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph


> Hi Akhil,
> 
> > > > > > > This action type allows the burst of symmetric crypto workload using the
> > > > > same
> > > > > > > algorithm, key, and direction being processed by CPU cycles
> > > synchronously.
> > > > > > > This flexible action type does not require external hardware involvement,
> > > > > > > having the crypto workload processed synchronously, and is more
> > > > > performant
> > > > > > > than Cryptodev SW PMD due to the saved cycles on removed "async
> > > mode
> > > > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > > > >
> > > > > > Does that mean application will not call the cryptodev_enqueue_burst and
> > > > > corresponding dequeue burst.
> > > > >
> > > > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > > > >
> > > > > > It would be a new API something like process_packets and it will have the
> > > > > crypto processed packets while returning from the API?
> > > > >
> > > > > Yes, though the plan is that API will operate on raw data buffers, not mbufs.
> > > > >
> > > > > >
> > > > > > I still do not understand why we cannot do with the conventional crypto lib
> > > > > only.
> > > > > > As far as I can understand, you are not doing any protocol processing or
> > > any
> > > > > value add
> > > > > > To the crypto processing. IMO, you just need a synchronous crypto
> > > processing
> > > > > API which
> > > > > > Can be defined in cryptodev, you don't need to re-create a crypto session
> > > in
> > > > > the name of
> > > > > > Security session in the driver just to do a synchronous processing.
> > > > >
> > > > > I suppose your question is why not to have
> > > > > rte_crypto_process_cpu_crypto_bulk(...) instead?
> > > > > The main reason is that would require disruptive changes in existing
> > > cryptodev
> > > > > API
> > > > > (would cause ABI/API breakage).
> > > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need some extra
> > > > > information
> > > > > that normal crypto_sym_xform doesn't contain
> > > > > (cipher offset from the start of the buffer, might be something extra in
> > > future).
> > > >
> > > > Cipher offset will be part of rte_crypto_op.
> > >
> > > fill/read (+ alloc/free) is one of the main things that slowdown current crypto-op
> > > approach.
> > > That's why the general idea - have all data that wouldn't change from packet to
> > > packet
> > > included into the session and setup it once at session_init().
> >
> > I agree that you cannot use crypto-op.
> > You can have the new API in crypto.
> > As per the current patch, you only need cipher_offset which you can have it as a parameter until
> > You get it approved in the crypto xform. I believe it will be beneficial in case of other crypto cases as well.
> > We can have cipher offset at both places(crypto-op and cipher_xform). It will give flexibility to the user to
> > override it.
> 
> After having another thought on your proposal:
> Probably we can introduce new rte_crypto_sym_xform_types for CPU related stuff here?
> Let's say we can have:
> enum rte_crypto_sym_xform_type {
>         RTE_CRYPTO_SYM_XFORM_NOT_SPECIFIED = 0, /**< No xform specified */
>         RTE_CRYPTO_SYM_XFORM_AUTH,              /**< Authentication xform */
>         RTE_CRYPTO_SYM_XFORM_CIPHER,            /**< Cipher xform  */
>         RTE_CRYPTO_SYM_XFORM_AEAD               /**< AEAD xform  */
> +     RTE_CRYPTO_SYM_XFORM_CPU = INT32_MIN,
> +    RTE_CRYPTO_SYM_XFORM_CPU_AEAD = (RTE_CRYPTO_SYM_XFORM_CPU | RTE_CRYPTO_SYM_XFORM_CPU),
Meant
RTE_CRYPTO_SYM_XFORM_CPU_AEAD = (RTE_CRYPTO_SYM_XFORM_CPU | RTE_CRYPTO_SYM_XFORM_AEAD),
of course.

>       /* same for auth and crypto xforms */
> };
> 
> Then we either can re-define some values in struct rte_crypto_aead_xform (via unions),
> or even have new  struct rte_crypto_cpu_aead_xform (same for crypto and auth xforms).
> Then if PMD wants to support new sync API it would need to recognize new xform types
> and internally  it might end up with different session structure (one for sync, another for async mode).
> That I think should allow us to introduce cpu_crypto as part of crypto-dev API without ABI breakage.
> What do you think?
> Konstantin
> 
> >
> > >
> > > > If you intend not to use rte_crypto_op
> > > > You can pass this as an argument in the new cryptodev API.
> > >
> > > You mean extra parameter in rte_security_process_cpu_crypto_bulk()?
> > > It can be in theory, but that solution looks a bit ugly:
> > > 	why to pass for each call something that would be constant per session?
> > > 	Again having that value constant per session might allow some extra
> > > optimisations
> > > 	That would be hard to achieve for dynamic case.
> > > and not extendable:
> > > Suppose tomorrow will need to add something extra (some new algorithm
> > > support or so).
> > > With what you proposing will need to new parameter to the function,
> > > which means API breakage.
> > >
> > > > Something extra will also cause ABI breakage in security as well.
> > > > So it will be same.
> > >
> > > I don't think it would.
> > > AFAIK, right now this patch doesn't introduce any API/ABI breakage.
> > > Inside struct rte_security_session_conf we have a union of xforms
> > > depending on session type.
> > > So as long as cpu_crypto_xform wouldn't exceed sizes of other xform -
> > > I believe no ABI breakage will appear.
> > Agreed, it will not break ABI in case of security till we do not exceed current size.
> >
> > Saving an ABI/API breakage is more important or placing the code at the correct place.
> > We need to find a tradeoff. Others can comment on this.
> > @Thomas Monjalon, @De Lara Guarch, Pablo Any comments?
> >
> > >
> > >
> > > >
> > > > > Also right now there is no way to add new type of crypto_sym_session
> > > without
> > > > > either breaking existing crypto-dev ABI/API or introducing new structure
> > > > > (rte_crypto_sym_cpu_session or so) for that.
> > > >
> > > > What extra info is required in rte_cryptodev_sym_session to get the
> > > rte_crypto_sym_cpu_session.
> > >
> > > Right now - just cipher_offset (see above).
> > > What else in future (if any) - don't know.
> > >
> > > > I don't think there is any.
> > > > I believe the same crypto session will be able to work synchronously as well.
> > >
> > > Exactly the same - problematically, see above.
> > >
> > > > We would only need  a new API to perform synchronous actions.
> > > > That will reduce the duplication code significantly
> > > > in the driver to support 2 different kind of APIs with similar code inside.
> > > > Please correct me in case I am missing something.
> > >
> > > To add new API into crypto-dev would also require changes in the PMD,
> > > it wouldn't come totally free and I believe would require roughly the same
> > > amount of changes.
> >
> > It will be required only in the PMDs which support it and would be minimal.
> > You would need a feature flag and support for that synchronous API. Session information will
> > already be there in the session. The changes wrt cipher_offset need to be added,
> > but with some default value to identify whether an override will be done or not.
> >
> > >
> > > >
> > > >
> > > > > While rte_security is designed in a way that we can add new session types
> > > and
> > > > > related parameters without causing API/ABI breakage.
> > > >
> > > > Yes the intent is to add new sessions based on various protocols that can be
> > > supported by the driver.
> > >
> > > Various protocols and different types of sessions (and devices they belong to).
> > > Let say right now we have INLINE_CRYPTO, INLINE_PROTO, LOOKASIDE_PROTO,
> > > etc.
> > > Here we introduce new type of session.
> >
> > What is the new value add to the existing sessions. The changes that we are doing
> > here is just to avoid an API/ABI breakage. The synchronous processing can happen on both
> > crypto and security session. This would mean, only the processing API should be defined,
> > rest all should be already there in the sessions.
> > In all other cases: INLINE - the eth device had no interface to perform crypto ops;
> > LOOKASIDE-PROTO - adds protocol-specific sessions which are not available in cryptodev.
> >
> > >
> > > > It is not that we should find it as an alternative to cryptodev and using it just
> > > because it will not cause
> > > > ABI/API breakage.
> > >
> > > I am considering this new API not as an alternative to existing ones, but as an
> > > extension.
> > > Existing crypto-op API has its own advantages (generic), and I think we should
> > > keep it supported by all crypto-devs.
> > > From other side rte_security is an extendable framework that suits the purpose:
> > > allows easily (and yes without ABI breakage) introduce new API for special type
> > > of crypto-dev (SW based).
> > >
> > >
> >
> > Adding a synchronous processing API is understandable and can be added in both
> > Crypto as well as Security, but a new action type for it is not required.
> > Now whether to support that, we have ABI/API breakage, that is a different issue.
> > And we may have to deal with it if no other option is there.
> >
> > >
> > >
> > >
> > > > IMO the code should be placed where its intent is.
> > > >
> > > > >
> > > > > BTW, what is your concern with proposed approach (via rte_security)?
> > > > > From my perspective it is a lightweight change and it is totally optional
> > > > > for the crypto PMDs to support it or not.
> > > > > Konstantin
> > > > >
> > > > > > >
> > > > > > > AESNI-GCM and AESNI-MB PMDs are updated with this support. There is
> > > a
> > > > > small
> > > > > > > performance test app under app/test/security_aesni_gcm(mb)_perftest
> > > to
> > > > > > > prove.
> > > > > > >
> > > > > > > For the new API
> > > > > > > The packet is sent to the crypto device for symmetric crypto
> > > > > > > processing. The device will encrypt or decrypt the buffer based on the
> > > > > session
> > > > > > > data specified and preprocessed in the security session. Different
> > > > > > > than the inline or lookaside modes, when the function exits, the user will
> > > > > > > expect the buffers are either processed successfully, or having the error
> > > > > number
> > > > > > > assigned to the appropriate index of the status array.
> > > > > > >
> > > > > > > Will update the program's guide in the v1 patch.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Fan
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > > > > > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > > > > > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty,
> > > > > Declan
> > > > > > > > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > > > > > > <pablo.de.lara.guarch@intel.com>
> > > > > > > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action
> > > type
> > > > > and
> > > > > > > > API
> > > > > > > >
> > > > > > > > Hi Fan,
> > > > > > > >
> > > > > > > > >
> > > > > > > > > This patch introduce new
> > > RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > > > > > > action
> > > > > > > > > type to security library. The type represents performing crypto
> > > > > > > > > operation with CPU cycles. The patch also includes a new API to
> > > > > > > > > process crypto operations in bulk and the function pointers for PMDs.
> > > > > > > > >
> > > > > > > > I am not able to get the flow of execution for this action type. Could
> > > you
> > > > > > > > please elaborate the flow in the documentation. If not in
> > > documentation
> > > > > > > > right now, then please elaborate the flow in cover letter.
> > > > > > > > Also I see that there are new APIs for processing crypto operations in
> > > bulk.
> > > > > > > > What does that mean. How are they different from the existing APIs
> > > which
> > > > > > > > are also handling bulk crypto ops depending on the budget.
> > > > > > > >
> > > > > > > >
> > > > > > > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-16 14:53                 ` Ananyev, Konstantin
  2019-09-16 15:08                   ` Ananyev, Konstantin
@ 2019-09-17  6:02                   ` Akhil Goyal
  2019-09-18  7:44                     ` Ananyev, Konstantin
  1 sibling, 1 reply; 84+ messages in thread
From: Akhil Goyal @ 2019-09-17  6:02 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph


Hi Konstantin,
> 
> Hi Akhil,
> 
> > > > > > > This action type allows the burst of symmetric crypto workload using
> the
> > > > > same
> > > > > > > algorithm, key, and direction being processed by CPU cycles
> > > synchronously.
> > > > > > > This flexible action type does not require external hardware
> involvement,
> > > > > > > having the crypto workload processed synchronously, and is more
> > > > > performant
> > > > > > > than Cryptodev SW PMD due to the saved cycles on removed "async
> > > mode
> > > > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > > > >
> > > > > > Does that mean application will not call the cryptodev_enqueue_burst
> and
> > > > > corresponding dequeue burst.
> > > > >
> > > > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > > > >
> > > > > > It would be a new API something like process_packets and it will have
> the
> > > > > crypto processed packets while returning from the API?
> > > > >
> > > > > Yes, though the plan is that API will operate on raw data buffers, not
> mbufs.
> > > > >
> > > > > >
> > > > > > I still do not understand why we cannot do with the conventional
> crypto lib
> > > > > only.
> > > > > > As far as I can understand, you are not doing any protocol processing
> or
> > > any
> > > > > value add
> > > > > > To the crypto processing. IMO, you just need a synchronous crypto
> > > processing
> > > > > API which
> > > > > > Can be defined in cryptodev, you don't need to re-create a crypto
> session
> > > in
> > > > > the name of
> > > > > > Security session in the driver just to do a synchronous processing.
> > > > >
> > > > > I suppose your question is why not to have
> > > > > rte_crypto_process_cpu_crypto_bulk(...) instead?
> > > > > The main reason is that would require disruptive changes in existing
> > > cryptodev
> > > > > API
> > > > > (would cause ABI/API breakage).
> > > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need some
> extra
> > > > > information
> > > > > that normal crypto_sym_xform doesn't contain
> > > > > (cipher offset from the start of the buffer, might be something extra in
> > > future).
> > > >
> > > > Cipher offset will be part of rte_crypto_op.
> > >
> > > fill/read (+ alloc/free) is one of the main things that slowdown current
> crypto-op
> > > approach.
> > > That's why the general idea - have all data that wouldn't change from packet
> to
> > > packet
> > > included into the session and setup it once at session_init().
> >
> > I agree that you cannot use crypto-op.
> > You can have the new API in crypto.
> > As per the current patch, you only need cipher_offset which you can have it as
> a parameter until
> > You get it approved in the crypto xform. I believe it will be beneficial in case of
> other crypto cases as well.
> > We can have cipher offset at both places(crypto-op and cipher_xform). It will
> give flexibility to the user to
> > override it.
> 
> After having another thought on your proposal:
> Probably we can introduce new rte_crypto_sym_xform_types for CPU related
> stuff here?

I also thought of adding new xforms, but that won't serve the purpose in all the cases.
You would still need all the information currently available in the existing xforms.
So if you add new fields in the new xform, its size will exceed that of the union of xforms.
ABI breakage would still be there.

If you think a valid compression of the AEAD xform can be done, then the same can be done for each of the
xforms and we would have a solution to this issue.

> Let's say we can have:
> enum rte_crypto_sym_xform_type {
>         RTE_CRYPTO_SYM_XFORM_NOT_SPECIFIED = 0, /**< No xform specified
> */
>         RTE_CRYPTO_SYM_XFORM_AUTH,              /**< Authentication xform */
>         RTE_CRYPTO_SYM_XFORM_CIPHER,            /**< Cipher xform  */
>         RTE_CRYPTO_SYM_XFORM_AEAD,              /**< AEAD xform  */
> +     RTE_CRYPTO_SYM_XFORM_CPU = INT32_MIN,
> +    RTE_CRYPTO_SYM_XFORM_CPU_AEAD = (RTE_CRYPTO_SYM_XFORM_CPU |
> RTE_CRYPTO_SYM_XFORM_AEAD),

Instead of CPU, I believe SYNC would be better.

>       /* same for auth and crypto xforms */
> };
> 
> Then we either can re-define some values in struct rte_crypto_aead_xform (via
> unions),
> or even have new  struct rte_crypto_cpu_aead_xform (same for crypto and auth
> xforms).
> Then if PMD wants to support new sync API it would need to recognize new
> xform types
> and internally  it might end up with different session structure (one for sync,
> another for async mode).
> That I think should allow us to introduce cpu_crypto as part of crypto-dev API
> without ABI breakage.
> What do you think?
> Konstantin
> 
> >
> > >
> > > > If you intend not to use rte_crypto_op
> > > > You can pass this as an argument in the new cryptodev API.
> > >
> > > You mean extra parameter in rte_security_process_cpu_crypto_bulk()?
> > > It can be in theory, but that solution looks a bit ugly:
> > > 	why to pass for each call something that would be constant per session?
> > > 	Again having that value constant per session might allow some extra
> > > optimisations
> > > 	That would be hard to achieve for dynamic case.
> > > and not extendable:
> > > Suppose tomorrow will need to add something extra (some new algorithm
> > > support or so).
> > > With what you are proposing, we would need to add a new parameter to the function,
> > > which means API breakage.
> > >
> > > > Something extra will also cause ABI breakage in security as well.
> > > > So it will be same.
> > >
> > > I don't think it would.
> > > AFAIK, right now this patch doesn't introduce any API/ABI breakage.
> > > Inside struct rte_security_session_conf we have a union of xforms
> > > depending on session type.
> > > So as long as cpu_crypto_xform wouldn't exceed sizes of other xform -
> > > I believe no ABI breakage will appear.
> > Agreed, it will not break ABI in case of security till we do not exceed current
> size.
> >
> > Which is more important - avoiding an ABI/API breakage, or placing the code at
> the correct place?
> > We need to find a tradeoff. Others can comment on this.
> > @Thomas Monjalon, @De Lara Guarch, Pablo Any comments?
> >
> > >
> > >
> > > >
> > > > > Also right now there is no way to add new type of crypto_sym_session
> > > without
> > > > > either breaking existing crypto-dev ABI/API or introducing new structure
> > > > > (rte_crypto_sym_cpu_session or so) for that.
> > > >
> > > > What extra info is required in rte_cryptodev_sym_session to get the
> > > rte_crypto_sym_cpu_session.
> > >
> > > Right now - just cipher_offset (see above).
> > > What else in future (if any) - don't know.
> > >
> > > > I don't think there is any.
> > > > I believe the same crypto session will be able to work synchronously as well.
> > >
> > > Exactly the same - problematically, see above.
> > >
> > > > We would only need  a new API to perform synchronous actions.
> > > > That will reduce the duplication code significantly
> > > > in the driver to support 2 different kind of APIs with similar code inside.
> > > > Please correct me in case I am missing something.
> > >
> > > To add new API into crypto-dev would also require changes in the PMD,
> > > it wouldn't come totally free and I believe would require roughly the same
> > > amount of changes.
> >
> > It will be required only in the PMDs which support it and would be minimal.
> > You would need a feature flag and support for that synchronous API. Session
> information will
> > already be there in the session. The changes wrt cipher_offset need to be
> added,
> > but with some default value to identify whether an override will be done or not.
> >
> > >
> > > >
> > > >
> > > > > While rte_security is designed in a way that we can add new session
> types
> > > and
> > > > > related parameters without causing API/ABI breakage.
> > > >
> > > > Yes the intent is to add new sessions based on various protocols that can
> be
> > > supported by the driver.
> > >
> > > Various protocols and different types of sessions (and devices they belong
> to).
> > > Let say right now we have INLINE_CRYPTO, INLINE_PROTO,
> LOOKASIDE_PROTO,
> > > etc.
> > > Here we introduce new type of session.
> >
> > What is the new value add to the existing sessions. The changes that we are
> doing
> > here is just to avoid an API/ABI breakage. The synchronous processing can
> happen on both
> > crypto and security session. This would mean, only the processing API should
> be defined,
> > rest all should be already there in the sessions.
> > > In all other cases: INLINE - the eth device had no interface to perform
> crypto ops;
> > > LOOKASIDE-PROTO - adds protocol-specific sessions which are not available in
> cryptodev.
> >
> > >
> > > > It is not that we should find it as an alternative to cryptodev and using it
> just
> > > because it will not cause
> > > > ABI/API breakage.
> > >
> > > I am considering this new API not as an alternative to existing ones, but as an
> > > extension.
> > > Existing crypto-op API has its own advantages (generic), and I think we
> should
> > > keep it supported by all crypto-devs.
> > > From other side rte_security is an extendable framework that suits the
> purpose:
> > > allows easily (and yes without ABI breakage) introduce new API for special
> type
> > > of crypto-dev (SW based).
> > >
> > >
> >
> > Adding a synchronous processing API is understandable and can be added in
> both
> > Crypto as well as Security, but a new action type for it is not required.
> > Now whether to support that, we have ABI/API breakage, that is a different
> issue.
> > And we may have to deal with it if no other option is there.
> >
> > >
> > >
> > >
> > > > IMO the code should be placed where its intent is.
> > > >
> > > > >
> > > > > BTW, what is your concern with proposed approach (via rte_security)?
> > > > > From my perspective it is a lightweight change and it is totally optional
> > > > > for the crypto PMDs to support it or not.
> > > > > Konstantin
> > > > >
> > > > > > >
> > > > > > > AESNI-GCM and AESNI-MB PMDs are updated with this support.
> There is
> > > a
> > > > > small
> > > > > > > performance test app under
> app/test/security_aesni_gcm(mb)_perftest
> > > to
> > > > > > > prove.
> > > > > > >
> > > > > > > For the new API
> > > > > > > The packet is sent to the crypto device for symmetric crypto
> > > > > > > processing. The device will encrypt or decrypt the buffer based on the
> > > > > session
> > > > > > > data specified and preprocessed in the security session. Different
> > > > > > > than the inline or lookaside modes, when the function exits, the user
> will
> > > > > > > expect the buffers are either processed successfully, or having the
> error
> > > > > number
> > > > > > > assigned to the appropriate index of the status array.
> > > > > > >
> > > > > > > Will update the program's guide in the v1 patch.
> > > > > > >
> > > > > > > Regards,
> > > > > > > Fan
> > > > > > >
> > > > > > > > -----Original Message-----
> > > > > > > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > > > > > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > > > > > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty,
> > > > > Declan
> > > > > > > > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > > > > > > <pablo.de.lara.guarch@intel.com>
> > > > > > > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action
> > > type
> > > > > and
> > > > > > > > API
> > > > > > > >
> > > > > > > > Hi Fan,
> > > > > > > >
> > > > > > > > >
> > > > > > > > > This patch introduce new
> > > RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > > > > > > action
> > > > > > > > > type to security library. The type represents performing crypto
> > > > > > > > > operation with CPU cycles. The patch also includes a new API to
> > > > > > > > > process crypto operations in bulk and the function pointers for
> PMDs.
> > > > > > > > >
> > > > > > > > I am not able to get the flow of execution for this action type.
> Could
> > > you
> > > > > > > > please elaborate the flow in the documentation. If not in
> > > documentation
> > > > > > > > right now, then please elaborate the flow in cover letter.
> > > > > > > > Also I see that there are new APIs for processing crypto operations
> in
> > > bulk.
> > > > > > > > What does that mean. How are they different from the existing APIs
> > > which
> > > > > > > > are also handling bulk crypto ops depending on the budget.
> > > > > > > >
> > > > > > > >
> > > > > > > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-17  6:02                   ` Akhil Goyal
@ 2019-09-18  7:44                     ` Ananyev, Konstantin
  2019-09-25 18:24                       ` Ananyev, Konstantin
  0 siblings, 1 reply; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-18  7:44 UTC (permalink / raw)
  To: Akhil Goyal, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph


Hi Akhil,

> > > > > > > > This action type allows the burst of symmetric crypto workload using
> > the
> > > > > > same
> > > > > > > > algorithm, key, and direction being processed by CPU cycles
> > > > synchronously.
> > > > > > > > This flexible action type does not require external hardware
> > involvement,
> > > > > > > > having the crypto workload processed synchronously, and is more
> > > > > > performant
> > > > > > > > than Cryptodev SW PMD due to the saved cycles on removed "async
> > > > mode
> > > > > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > > > > >
> > > > > > > Does that mean application will not call the cryptodev_enqueue_burst
> > and
> > > > > > corresponding dequeue burst.
> > > > > >
> > > > > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > > > > >
> > > > > > > It would be a new API something like process_packets and it will have
> > the
> > > > > > crypto processed packets while returning from the API?
> > > > > >
> > > > > > Yes, though the plan is that API will operate on raw data buffers, not
> > mbufs.
> > > > > >
> > > > > > >
> > > > > > > I still do not understand why we cannot do with the conventional
> > crypto lib
> > > > > > only.
> > > > > > > As far as I can understand, you are not doing any protocol processing
> > or
> > > > any
> > > > > > value add
> > > > > > > To the crypto processing. IMO, you just need a synchronous crypto
> > > > processing
> > > > > > API which
> > > > > > > Can be defined in cryptodev, you don't need to re-create a crypto
> > session
> > > > in
> > > > > > the name of
> > > > > > > Security session in the driver just to do a synchronous processing.
> > > > > >
> > > > > > I suppose your question is why not to have
> > > > > > rte_crypto_process_cpu_crypto_bulk(...) instead?
> > > > > > The main reason is that would require disruptive changes in existing
> > > > cryptodev
> > > > > > API
> > > > > > (would cause ABI/API breakage).
> > > > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need some
> > extra
> > > > > > information
> > > > > > that normal crypto_sym_xform doesn't contain
> > > > > > (cipher offset from the start of the buffer, might be something extra in
> > > > future).
> > > > >
> > > > > Cipher offset will be part of rte_crypto_op.
> > > >
> > > > fill/read (+ alloc/free) is one of the main things that slowdown current
> > crypto-op
> > > > approach.
> > > > That's why the general idea - have all data that wouldn't change from packet
> > to
> > > > packet
> > > > included into the session and setup it once at session_init().
> > >
> > > I agree that you cannot use crypto-op.
> > > You can have the new API in crypto.
> > > As per the current patch, you only need cipher_offset which you can have it as
> > a parameter until
> > > You get it approved in the crypto xform. I believe it will be beneficial in case of
> > other crypto cases as well.
> > > We can have cipher offset at both places(crypto-op and cipher_xform). It will
> > give flexibility to the user to
> > > override it.
> >
> > After having another thought on your proposal:
> > Probably we can introduce new rte_crypto_sym_xform_types for CPU related
> > stuff here?
> 
> I also thought of adding new xforms, but that wont serve the purpose for may be all the cases.
> You would be needing all information currently available in the current xforms.
> So if you are adding new fields in the new xform, the size will be more than that of the union of xforms.
> ABI breakage would still be there.
> 
> If you think a valid compression of the AEAD xform can be done, then that can be done for each of the
> Xforms and we can have a solution to this issue.

I think that we can re-use iv.offset for our purposes (for the crypto offset).
So for now we can make that path work without any ABI breakage.
Fan, please feel free to correct me here if I missed something.
If in the future we need to add some extra information, it might
require ABI breakage, though for now I don't envision anything in particular to add.
Anyway, if there is no objection to going that way, we can try to make
these changes for v2.

> 
> > Let's say we can have:
> > enum rte_crypto_sym_xform_type {
> >         RTE_CRYPTO_SYM_XFORM_NOT_SPECIFIED = 0, /**< No xform specified
> > */
> >         RTE_CRYPTO_SYM_XFORM_AUTH,              /**< Authentication xform */
> >         RTE_CRYPTO_SYM_XFORM_CIPHER,            /**< Cipher xform  */
> >         RTE_CRYPTO_SYM_XFORM_AEAD,              /**< AEAD xform  */
> > +     RTE_CRYPTO_SYM_XFORM_CPU = INT32_MIN,
> > +    RTE_CRYPTO_SYM_XFORM_CPU_AEAD = (RTE_CRYPTO_SYM_XFORM_CPU |
> > RTE_CRYPTO_SYM_XFORM_AEAD),
> 
> Instead of CPU I believe SYNC would be better.

I don't mind renaming it to SYNC, but I'd like to point out
that it's really more CPU than a generic SYNC API
(it doesn't pass IOVA for data buffers, etc., only VA).

> 
> >       /* same for auth and crypto xforms */
> > };
> >
> > Then we either can re-define some values in struct rte_crypto_aead_xform (via
> > unions),
> > or even have new  struct rte_crypto_cpu_aead_xform (same for crypto and auth
> > xforms).
> > Then if PMD wants to support new sync API it would need to recognize new
> > xform types
> > and internally  it might end up with different session structure (one for sync,
> > another for async mode).
> > That I think should allow us to introduce cpu_crypto as part of crypto-dev API
> > without ABI breakage.
> > What do you think?
> > Konstantin
> >
> > >
> > > >
> > > > > If you intend not to use rte_crypto_op
> > > > > You can pass this as an argument in the new cryptodev API.
> > > >
> > > > You mean extra parameter in rte_security_process_cpu_crypto_bulk()?
> > > > It can be in theory, but that solution looks a bit ugly:
> > > > 	why to pass for each call something that would be constant per session?
> > > > 	Again having that value constant per session might allow some extra
> > > > optimisations
> > > > 	That would be hard to achieve for dynamic case.
> > > > and not extendable:
> > > > Suppose tomorrow will need to add something extra (some new algorithm
> > > > support or so).
> > > > With what you are proposing, we would need to add a new parameter to the function,
> > > > which means API breakage.
> > > >
> > > > > Something extra will also cause ABI breakage in security as well.
> > > > > So it will be same.
> > > >
> > > > I don't think it would.
> > > > AFAIK, right now this patch doesn't introduce any API/ABI breakage.
> > > > Inside struct rte_security_session_conf we have a union of xforms
> > > > depending on session type.
> > > > So as long as cpu_crypto_xform wouldn't exceed sizes of other xform -
> > > > I believe no ABI breakage will appear.
> > > Agreed, it will not break ABI in case of security till we do not exceed current
> > size.
> > >
> > > Which is more important - avoiding an ABI/API breakage, or placing the code at
> > the correct place?
> > > We need to find a tradeoff. Others can comment on this.
> > > @Thomas Monjalon, @De Lara Guarch, Pablo Any comments?
> > >
> > > >
> > > >
> > > > >
> > > > > > Also right now there is no way to add new type of crypto_sym_session
> > > > without
> > > > > > either breaking existing crypto-dev ABI/API or introducing new structure
> > > > > > (rte_crypto_sym_cpu_session or so) for that.
> > > > >
> > > > > What extra info is required in rte_cryptodev_sym_session to get the
> > > > rte_crypto_sym_cpu_session.
> > > >
> > > > Right now - just cipher_offset (see above).
> > > > What else in future (if any) - don't know.
> > > >
> > > > > I don't think there is any.
> > > > > I believe the same crypto session will be able to work synchronously as well.
> > > >
> > > > Exactly the same - problematically, see above.
> > > >
> > > > > We would only need  a new API to perform synchronous actions.
> > > > > That will reduce the duplication code significantly
> > > > > in the driver to support 2 different kind of APIs with similar code inside.
> > > > > Please correct me in case I am missing something.
> > > >
> > > > To add new API into crypto-dev would also require changes in the PMD,
> > > > it wouldn't come totally free and I believe would require roughly the same
> > > > amount of changes.
> > >
> > > It will be required only in the PMDs which support it and would be minimal.
> > > You would need a feature flag and support for that synchronous API. Session
> > information will
> > > already be there in the session. The changes wrt cipher_offset need to be
> > added,
> > > but with some default value to identify whether an override will be done or not.
> > >
> > > >
> > > > >
> > > > >
> > > > > > While rte_security is designed in a way that we can add new session
> > types
> > > > and
> > > > > > related parameters without causing API/ABI breakage.
> > > > >
> > > > > Yes the intent is to add new sessions based on various protocols that can
> > be
> > > > supported by the driver.
> > > >
> > > > Various protocols and different types of sessions (and devices they belong
> > to).
> > > > Let say right now we have INLINE_CRYPTO, INLINE_PROTO,
> > LOOKASIDE_PROTO,
> > > > etc.
> > > > Here we introduce new type of session.
> > >
> > > What is the new value add to the existing sessions. The changes that we are
> > doing
> > > here is just to avoid an API/ABI breakage. The synchronous processing can
> > happen on both
> > > crypto and security session. This would mean, only the processing API should
> > be defined,
> > > rest all should be already there in the sessions.
> > > In all other cases: INLINE - the eth device had no interface to perform
> > crypto ops;
> > > LOOKASIDE-PROTO - adds protocol-specific sessions which are not available in
> > cryptodev.
> > >
> > > >
> > > > > It is not that we should find it as an alternative to cryptodev and using it
> > just
> > > > because it will not cause
> > > > > ABI/API breakage.
> > > >
> > > > I am considering this new API not as an alternative to existing ones, but as an
> > > > extension.
> > > > Existing crypto-op API has its own advantages (generic), and I think we
> > should
> > > > keep it supported by all crypto-devs.
> > > > From other side rte_security is an extendable framework that suits the
> > purpose:
> > > > allows easily (and yes without ABI breakage) introduce new API for special
> > type
> > > > of crypto-dev (SW based).
> > > >
> > > >
> > >
> > > Adding a synchronous processing API is understandable and can be added in
> > both
> > > Crypto as well as Security, but a new action type for it is not required.
> > > Now whether to support that, we have ABI/API breakage, that is a different
> > issue.
> > > And we may have to deal with it if no other option is there.
> > >
> > > >
> > > >
> > > >
> > > > > IMO the code should be placed where its intent is.
> > > > >
> > > > > >
> > > > > > BTW, what is your concern with proposed approach (via rte_security)?
> > > > > > From my perspective it is a lightweight change and it is totally optional
> > > > > > for the crypto PMDs to support it or not.
> > > > > > Konstantin
> > > > > >
> > > > > > > >
> > > > > > > > AESNI-GCM and AESNI-MB PMDs are updated with this support.
> > There is
> > > > a
> > > > > > small
> > > > > > > > performance test app under
> > app/test/security_aesni_gcm(mb)_perftest
> > > > to
> > > > > > > > prove.
> > > > > > > >
> > > > > > > > For the new API
> > > > > > > > The packet is sent to the crypto device for symmetric crypto
> > > > > > > > processing. The device will encrypt or decrypt the buffer based on the
> > > > > > session
> > > > > > > > data specified and preprocessed in the security session. Different
> > > > > > > > than the inline or lookaside modes, when the function exits, the user
> > will
> > > > > > > > expect the buffers are either processed successfully, or having the
> > error
> > > > > > number
> > > > > > > > assigned to the appropriate index of the status array.
> > > > > > > >
> > > > > > > > Will update the program's guide in the v1 patch.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > > Fan
> > > > > > > >
> > > > > > > > > -----Original Message-----
> > > > > > > > > From: Akhil Goyal [mailto:akhil.goyal@nxp.com]
> > > > > > > > > Sent: Wednesday, September 4, 2019 11:33 AM
> > > > > > > > > To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> > > > > > > > > Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; Doherty,
> > > > > > Declan
> > > > > > > > > <declan.doherty@intel.com>; De Lara Guarch, Pablo
> > > > > > > > > <pablo.de.lara.guarch@intel.com>
> > > > > > > > > Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action
> > > > type
> > > > > > and
> > > > > > > > > API
> > > > > > > > >
> > > > > > > > > Hi Fan,
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > This patch introduce new
> > > > RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > > > > > > > > action
> > > > > > > > > > type to security library. The type represents performing crypto
> > > > > > > > > > operation with CPU cycles. The patch also includes a new API to
> > > > > > > > > > process crypto operations in bulk and the function pointers for
> > PMDs.
> > > > > > > > > >
> > > > > > > > > I am not able to get the flow of execution for this action type.
> > Could
> > > > you
> > > > > > > > > please elaborate the flow in the documentation. If not in
> > > > documentation
> > > > > > > > > right now, then please elaborate the flow in cover letter.
> > > > > > > > > Also I see that there are new APIs for processing crypto operations
> > in
> > > > bulk.
> > > > > > > > > What does that mean? How are they different from the existing APIs
> > > > which
> > > > > > > > > are also handling bulk crypto ops depending on the budget.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > -Akhil


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH 02/10] crypto/aesni_gcm: add rte_security handler
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 02/10] crypto/aesni_gcm: add rte_security handler Fan Zhang
@ 2019-09-18 10:24     ` Ananyev, Konstantin
  0 siblings, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-18 10:24 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: Doherty, Declan, akhil.goyal

Hi Fan,

> 
> This patch adds rte_security support to the AESNI-GCM PMD. The PMD now
> initializes a security context instance, creates/deletes PMD-specific
> security sessions, and processes crypto workloads in synchronous mode,
> with scatter-gather list buffers supported.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd.c         | 91 ++++++++++++++++++++++-
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c     | 95 ++++++++++++++++++++++++
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h | 23 ++++++
>  3 files changed, 208 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> index 1006a5c4d..0a346eddd 100644
> --- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> +++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> @@ -6,6 +6,7 @@
>  #include <rte_hexdump.h>
>  #include <rte_cryptodev.h>
>  #include <rte_cryptodev_pmd.h>
> +#include <rte_security_driver.h>
>  #include <rte_bus_vdev.h>
>  #include <rte_malloc.h>
>  #include <rte_cpuflags.h>
> @@ -174,6 +175,56 @@ aesni_gcm_get_session(struct aesni_gcm_qp *qp, struct rte_crypto_op *op)
>  	return sess;
>  }
> 
> +static __rte_always_inline int
> +process_gcm_security_sgl_buf(struct aesni_gcm_security_session *sess,
> +		struct rte_security_vec *buf, uint8_t *iv,
> +		uint8_t *aad, uint8_t *digest)
> +{
> +	struct aesni_gcm_session *session = &sess->sess;
> +	uint8_t *tag;
> +	uint32_t i;
> +
> +	sess->init(&session->gdata_key, &sess->gdata_ctx, iv, aad,
> +			(uint64_t)session->aad_length);
> +
> +	for (i = 0; i < buf->num; i++) {
> +		struct iovec *vec = &buf->vec[i];
> +
> +		sess->update(&session->gdata_key, &sess->gdata_ctx,
> +				vec->iov_base, vec->iov_base, vec->iov_len);
> +	}
> +
> +	switch (session->op) {
> +	case AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION:
> +		if (session->req_digest_length != session->gen_digest_length)
> +			tag = sess->temp_digest;
> +		else
> +			tag = digest;
> +
> +		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
> +				session->gen_digest_length);
> +
> +		if (session->req_digest_length != session->gen_digest_length)
> +			memcpy(digest, sess->temp_digest,
> +					session->req_digest_length);
> +		break;


I wonder whether we can move all these cases and ifs to session_create() time -
so instead of one process() function with a lot of branches,
we'd have several process functions with minimal or no branches.
I think it should help us save extra cycles.

> +
> +	case AESNI_GCM_OP_AUTHENTICATED_DECRYPTION:
> +		tag = sess->temp_digest;
> +
> +		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
> +				session->gen_digest_length);
> +
> +		if (memcmp(tag, digest,	session->req_digest_length) != 0)
> +			return -1;
> +		break;
> +	default:
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * Process a crypto operation, calling
>   * the GCM API from the multi buffer library.
> @@ -488,8 +539,10 @@ aesni_gcm_create(const char *name,
>  {
>  	struct rte_cryptodev *dev;
>  	struct aesni_gcm_private *internals;
> +	struct rte_security_ctx *sec_ctx;
>  	enum aesni_gcm_vector_mode vector_mode;
>  	MB_MGR *mb_mgr;
> +	char sec_name[RTE_DEV_NAME_MAX_LEN];
> 
>  	/* Check CPU for support for AES instruction set */
>  	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
> @@ -524,7 +577,8 @@ aesni_gcm_create(const char *name,
>  			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
>  			RTE_CRYPTODEV_FF_CPU_AESNI |
>  			RTE_CRYPTODEV_FF_OOP_SGL_IN_LB_OUT |
> -			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
> +			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
> +			RTE_CRYPTODEV_FF_SECURITY;
> 
>  	mb_mgr = alloc_mb_mgr(0);
>  	if (mb_mgr == NULL)
> @@ -587,6 +641,21 @@ aesni_gcm_create(const char *name,
> 
>  	internals->max_nb_queue_pairs = init_params->max_nb_queue_pairs;
> 
> +	/* setup security operations */
> +	snprintf(sec_name, sizeof(sec_name) - 1, "aes_gcm_sec_%u",
> +			dev->driver_id);
> +	sec_ctx = rte_zmalloc_socket(sec_name,
> +			sizeof(struct rte_security_ctx),
> +			RTE_CACHE_LINE_SIZE, init_params->socket_id);
> +	if (sec_ctx == NULL) {
> +		AESNI_GCM_LOG(ERR, "memory allocation failed\n");
> +		goto error_exit;
> +	}
> +
> +	sec_ctx->device = (void *)dev;
> +	sec_ctx->ops = rte_aesni_gcm_pmd_security_ops;
> +	dev->security_ctx = sec_ctx;
> +
>  #if IMB_VERSION_NUM >= IMB_VERSION(0, 50, 0)
>  	AESNI_GCM_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
>  			imb_get_version_str());
> @@ -641,6 +710,8 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
>  	if (cryptodev == NULL)
>  		return -ENODEV;
> 
> +	rte_free(cryptodev->security_ctx);
> +
>  	internals = cryptodev->data->dev_private;
> 
>  	free_mb_mgr(internals->mb_mgr);
> @@ -648,6 +719,24 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
>  	return rte_cryptodev_pmd_destroy(cryptodev);
>  }
> 
> +void
> +aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num)
> +{
> +	struct aesni_gcm_security_session *session =
> +			get_sec_session_private_data(sess);
> +	uint32_t i;
> +
> +	if (unlikely(!session))
> +		return;

I think you can't just return here, you need to
set all status[] entries to some -errno value.

> +
> +	for (i = 0; i < num; i++)
> +		status[i] = process_gcm_security_sgl_buf(session, &buf[i],
> +				(uint8_t *)iv[i], (uint8_t *)aad[i],
> +				(uint8_t *)digest[i]);
> +}
> +
>  static struct rte_vdev_driver aesni_gcm_pmd_drv = {
>  	.probe = aesni_gcm_probe,
>  	.remove = aesni_gcm_remove
> diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
> index 2f66c7c58..cc71dbd60 100644
> --- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
> +++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
> @@ -7,6 +7,7 @@
>  #include <rte_common.h>
>  #include <rte_malloc.h>
>  #include <rte_cryptodev_pmd.h>
> +#include <rte_security_driver.h>
> 
>  #include "aesni_gcm_pmd_private.h"
> 
> @@ -316,6 +317,85 @@ aesni_gcm_pmd_sym_session_clear(struct rte_cryptodev *dev,
>  	}
>  }
> 
> +static int
> +aesni_gcm_security_session_create(void *dev,
> +		struct rte_security_session_conf *conf,
> +		struct rte_security_session *sess,
> +		struct rte_mempool *mempool)
> +{
> +	struct rte_cryptodev *cdev = dev;
> +	struct aesni_gcm_private *internals = cdev->data->dev_private;
> +	struct aesni_gcm_security_session *sess_priv;
> +	int ret;
> +
> +	if (!conf->crypto_xform) {
> +		AESNI_GCM_LOG(ERR, "Invalid security session conf");
> +		return -EINVAL;
> +	}
> +
> +	if (conf->crypto_xform->type == RTE_CRYPTO_SYM_XFORM_AUTH) {
> +		AESNI_GCM_LOG(ERR, "GMAC is not supported in security session");
> +		return -EINVAL;
> +	}
> +
> +
> +	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
> +		AESNI_GCM_LOG(ERR,
> +				"Couldn't get object from session mempool");
> +		return -ENOMEM;
> +	}
> +
> +	ret = aesni_gcm_set_session_parameters(internals->ops,
> +				&sess_priv->sess, conf->crypto_xform);
> +	if (ret != 0) {
> +		AESNI_GCM_LOG(ERR, "Failed configure session parameters");
> +
> +		/* Return session to mempool */
> +		rte_mempool_put(mempool, (void *)sess_priv);
> +		return ret;
> +	}
> +
> +	sess_priv->pre = internals->ops[sess_priv->sess.key].pre;
> +	sess_priv->init = internals->ops[sess_priv->sess.key].init;
> +	if (sess_priv->sess.op == AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION) {
> +		sess_priv->update =
> +			internals->ops[sess_priv->sess.key].update_enc;
> +		sess_priv->finalize =
> +			internals->ops[sess_priv->sess.key].finalize_enc;
> +	} else {
> +		sess_priv->update =
> +			internals->ops[sess_priv->sess.key].update_dec;
> +		sess_priv->finalize =
> +			internals->ops[sess_priv->sess.key].finalize_dec;
> +	}
> +
> +	sess->sess_private_data = sess_priv;
> +
> +	return 0;
> +}
> +
> +static int
> +aesni_gcm_security_session_destroy(void *dev __rte_unused,
> +		struct rte_security_session *sess)
> +{
> +	void *sess_priv = get_sec_session_private_data(sess);
> +
> +	if (sess_priv) {
> +		struct rte_mempool *sess_mp = rte_mempool_from_obj(sess_priv);
> +
> +		memset(sess, 0, sizeof(struct aesni_gcm_security_session));
> +		set_sec_session_private_data(sess, NULL);
> +		rte_mempool_put(sess_mp, sess_priv);
> +	}
> +	return 0;
> +}
> +
> +static unsigned int
> +aesni_gcm_sec_session_get_size(__rte_unused void *device)
> +{
> +	return sizeof(struct aesni_gcm_security_session);
> +}
> +
>  struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
>  		.dev_configure		= aesni_gcm_pmd_config,
>  		.dev_start		= aesni_gcm_pmd_start,
> @@ -336,4 +416,19 @@ struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
>  		.sym_session_clear	= aesni_gcm_pmd_sym_session_clear
>  };
> 
> +static struct rte_security_ops aesni_gcm_security_ops = {
> +		.session_create = aesni_gcm_security_session_create,
> +		.session_get_size = aesni_gcm_sec_session_get_size,
> +		.session_update = NULL,
> +		.session_stats_get = NULL,
> +		.session_destroy = aesni_gcm_security_session_destroy,
> +		.set_pkt_metadata = NULL,
> +		.capabilities_get = NULL,
> +		.process_cpu_crypto_bulk =
> +				aesni_gcm_sec_crypto_process_bulk,
> +};
> +
>  struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops = &aesni_gcm_pmd_ops;
> +
> +struct rte_security_ops *rte_aesni_gcm_pmd_security_ops =
> +		&aesni_gcm_security_ops;
> diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
> index 56b29e013..8e490b6ce 100644
> --- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
> +++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
> @@ -114,5 +114,28 @@ aesni_gcm_set_session_parameters(const struct aesni_gcm_ops *ops,
>   * Device specific operations function pointer structure */
>  extern struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops;
> 
> +/**
> + * Security session structure.
> + */
> +struct aesni_gcm_security_session {
> +	/** Temp digest for decryption */
> +	uint8_t temp_digest[DIGEST_LENGTH_MAX];
> +	/** GCM operations */
> +	aesni_gcm_pre_t pre;
> +	aesni_gcm_init_t init;
> +	aesni_gcm_update_t update;
> +	aesni_gcm_finalize_t finalize;
> +	/** AESNI-GCM session */
> +	struct aesni_gcm_session sess;
> +	/** AESNI-GCM context */
> +	struct gcm_context_data gdata_ctx;
> +};
> +
> +extern void
> +aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num);
> +
> +extern struct rte_security_ops *rte_aesni_gcm_pmd_security_ops;
> 
>  #endif /* _RTE_AESNI_GCM_PMD_PRIVATE_H_ */
> --
> 2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API Fan Zhang
@ 2019-09-18 12:45     ` Ananyev, Konstantin
  2019-09-29  6:00     ` Hemant Agrawal
  1 sibling, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-18 12:45 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: Doherty, Declan, akhil.goyal

> +/**
> + * Security vector structure, contains pointer to vector array and the length
> + * of the array
> + */
> +struct rte_security_vec {
> +	struct iovec *vec;
> +	uint32_t num;
> +};
> +
> +/**
> + * Processing bulk crypto workload with CPU
> + *
> + * @param	instance	security instance.
> + * @param	sess		security session
> + * @param	buf		array of buffer SGL vectors
> + * @param	iv		array of IV pointers
> + * @param	aad		array of AAD pointers
> + * @param	digest		array of digest pointers
> + * @param	status		array of status for the function to return


Need to specify what the expected status values are.
I suppose zero for success and a negative errno when an error happens?

> + * @param	num		number of elements in each array
> + *
> + */
> +__rte_experimental
> +void
> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> +		struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num);
> +
>  #ifdef __cplusplus
>  }
>  #endif

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH 05/10] crypto/aesni_mb: add rte_security handler
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 05/10] crypto/aesni_mb: add rte_security handler Fan Zhang
@ 2019-09-18 15:20     ` Ananyev, Konstantin
  0 siblings, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-18 15:20 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: Doherty, Declan, akhil.goyal


> 
> This patch adds rte_security support to the AESNI-MB PMD. The PMD now
> initializes a security context instance, creates/deletes PMD-specific
> security sessions, and processes crypto workloads in synchronous mode.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c         | 291 ++++++++++++++++++++-
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c     |  91 ++++++-
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |  21 +-
>  3 files changed, 398 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
> index b495a9679..68767c04e 100644
> --- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
> +++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
> @@ -8,6 +8,8 @@
>  #include <rte_hexdump.h>
>  #include <rte_cryptodev.h>
>  #include <rte_cryptodev_pmd.h>
> +#include <rte_security.h>
> +#include <rte_security_driver.h>
>  #include <rte_bus_vdev.h>
>  #include <rte_malloc.h>
>  #include <rte_cpuflags.h>
> @@ -789,6 +791,167 @@ auth_start_offset(struct rte_crypto_op *op, struct aesni_mb_session *session,
>  			(UINT64_MAX - u_src + u_dst + 1);
>  }
> 
> +union sec_userdata_field {
> +	int status;
> +	struct {
> +		uint16_t is_gen_digest;
> +		uint16_t digest_len;
> +	};
> +};
> +
> +struct sec_udata_digest_field {
> +	uint32_t is_digest_gen;
> +	uint32_t digest_len;
> +};
> +
> +static inline int
> +set_mb_job_params_sec(JOB_AES_HMAC *job, struct aesni_mb_sec_session *sec_sess,
> +		void *buf, uint32_t buf_len, void *iv, void *aad, void *digest,
> +		int *status, uint8_t *digest_idx)
> +{
> +	struct aesni_mb_session *session = &sec_sess->sess;
> +	uint32_t cipher_offset = sec_sess->cipher_offset;
> +	void *user_digest = NULL;
> +	union sec_userdata_field udata;
> +
> +	if (unlikely(cipher_offset > buf_len))
> +		return -EINVAL;
> +
> +	/* Set crypto operation */
> +	job->chain_order = session->chain_order;
> +
> +	/* Set cipher parameters */
> +	job->cipher_direction = session->cipher.direction;
> +	job->cipher_mode = session->cipher.mode;
> +
> +	job->aes_key_len_in_bytes = session->cipher.key_length_in_bytes;
> +
> +	/* Set authentication parameters */
> +	job->hash_alg = session->auth.algo;
> +	job->iv = iv;
> +
> +	switch (job->hash_alg) {
> +	case AES_XCBC:
> +		job->u.XCBC._k1_expanded = session->auth.xcbc.k1_expanded;
> +		job->u.XCBC._k2 = session->auth.xcbc.k2;
> +		job->u.XCBC._k3 = session->auth.xcbc.k3;
> +
> +		job->aes_enc_key_expanded =
> +				session->cipher.expanded_aes_keys.encode;
> +		job->aes_dec_key_expanded =
> +				session->cipher.expanded_aes_keys.decode;
> +		break;
> +
> +	case AES_CCM:
> +		job->u.CCM.aad = (uint8_t *)aad + 18;
> +		job->u.CCM.aad_len_in_bytes = session->aead.aad_len;
> +		job->aes_enc_key_expanded =
> +				session->cipher.expanded_aes_keys.encode;
> +		job->aes_dec_key_expanded =
> +				session->cipher.expanded_aes_keys.decode;
> +		job->iv++;
> +		break;
> +
> +	case AES_CMAC:
> +		job->u.CMAC._key_expanded = session->auth.cmac.expkey;
> +		job->u.CMAC._skey1 = session->auth.cmac.skey1;
> +		job->u.CMAC._skey2 = session->auth.cmac.skey2;
> +		job->aes_enc_key_expanded =
> +				session->cipher.expanded_aes_keys.encode;
> +		job->aes_dec_key_expanded =
> +				session->cipher.expanded_aes_keys.decode;
> +		break;
> +
> +	case AES_GMAC:
> +		if (session->cipher.mode == GCM) {
> +			job->u.GCM.aad = aad;
> +			job->u.GCM.aad_len_in_bytes = session->aead.aad_len;
> +		} else {
> +			/* For GMAC */
> +			job->u.GCM.aad = aad;
> +			job->u.GCM.aad_len_in_bytes = buf_len;
> +			job->cipher_mode = GCM;
> +		}
> +		job->aes_enc_key_expanded = &session->cipher.gcm_key;
> +		job->aes_dec_key_expanded = &session->cipher.gcm_key;
> +		break;
> +
> +	default:
> +		job->u.HMAC._hashed_auth_key_xor_ipad =
> +				session->auth.pads.inner;
> +		job->u.HMAC._hashed_auth_key_xor_opad =
> +				session->auth.pads.outer;
> +
> +		if (job->cipher_mode == DES3) {
> +			job->aes_enc_key_expanded =
> +				session->cipher.exp_3des_keys.ks_ptr;
> +			job->aes_dec_key_expanded =
> +				session->cipher.exp_3des_keys.ks_ptr;
> +		} else {
> +			job->aes_enc_key_expanded =
> +				session->cipher.expanded_aes_keys.encode;
> +			job->aes_dec_key_expanded =
> +				session->cipher.expanded_aes_keys.decode;
> +		}
> +	}

Seems like too many branches on the data-path.
We'll have only one job type (alg) per session,
so we can have a prefilled job-struct template with all common fields already set up,
and then at process() just copy it over and update the few fields that have to be different
(like msg_len_to_cipher_in_bytes).
 

> +
> +	/* Set digest output location */
> +	if (job->hash_alg != NULL_HASH &&
> +			session->auth.operation == RTE_CRYPTO_AUTH_OP_VERIFY) {
> +		job->auth_tag_output = sec_sess->temp_digests[*digest_idx];
> +		*digest_idx = (*digest_idx + 1) % MAX_JOBS;
> +
> +		udata.is_gen_digest = 0;
> +		udata.digest_len = session->auth.req_digest_len;
> +		user_digest = (void *)digest;
> +	} else {
> +		udata.is_gen_digest = 1;
> +		udata.digest_len = session->auth.req_digest_len;
> +
> +		if (session->auth.req_digest_len !=
> +				session->auth.gen_digest_len) {
> +			job->auth_tag_output =
> +					sec_sess->temp_digests[*digest_idx];
> +			*digest_idx = (*digest_idx + 1) % MAX_JOBS;
> +
> +			user_digest = (void *)digest;
> +		} else
> +			job->auth_tag_output = digest;
> +
> +		/* A bit of hack here, since job structure only supports
> +		 * 2 user data fields and we need 4 params to be passed
> +		 * (status, direction, digest for verify, and length of
> +		 * digest), we set the status value as digest length +
> +		 * direction here temporarily to avoid creating longer
> +		 * buffer to store all 4 params.
> +		 */
> +		*status = udata.status;
> +	}
> +	/*
> +	 * Multi-buffer library current only support returning a truncated
> +	 * digest length as specified in the relevant IPsec RFCs
> +	 */
> +
> +	/* Set digest length */
> +	job->auth_tag_output_len_in_bytes = session->auth.gen_digest_len;
> +
> +	/* Set IV parameters */
> +	job->iv_len_in_bytes = session->iv.length;
> +
> +	/* Data Parameters */
> +	job->src = buf;
> +	job->dst = buf;
> +	job->cipher_start_src_offset_in_bytes = cipher_offset;
> +	job->msg_len_to_cipher_in_bytes = buf_len - cipher_offset;
> +	job->hash_start_src_offset_in_bytes = 0;
> +	job->msg_len_to_hash_in_bytes = buf_len;
> +
> +	job->user_data = (void *)status;
> +	job->user_data2 = user_digest;
> +
> +	return 0;
> +}
> +
>  /**
>   * Process a crypto operation and complete a JOB_AES_HMAC job structure for
>   * submission to the multi buffer library for processing.
> @@ -1081,6 +1244,37 @@ post_process_mb_job(struct aesni_mb_qp *qp, JOB_AES_HMAC *job)
>  	return op;
>  }
> 
> +static inline void
> +post_process_mb_sec_job(JOB_AES_HMAC *job)
> +{
> +	void *user_digest = job->user_data2;
> +	int *status = job->user_data;
> +	union sec_userdata_field udata;
> +
> +	switch (job->status) {
> +	case STS_COMPLETED:
> +		if (user_digest) {
> +			udata.status = *status;
> +
> +			if (udata.is_gen_digest) {
> +				*status = RTE_CRYPTO_OP_STATUS_SUCCESS;
> +				memcpy(user_digest, job->auth_tag_output,
> +						udata.digest_len);
> +			} else {
> +				verify_digest(job, user_digest,
> +					udata.digest_len, (uint8_t *)status);
> +
> +				if (*status == RTE_CRYPTO_OP_STATUS_AUTH_FAILED)
> +					*status = -1;
> +			}

Again - multiple process() functions instead of branches at data-path?

> +		} else
> +			*status = RTE_CRYPTO_OP_STATUS_SUCCESS;
> +		break;
> +	default:
> +		*status = RTE_CRYPTO_OP_STATUS_ERROR;
> +	}
> +}
> +
>  /**
>   * Process a completed JOB_AES_HMAC job and keep processing jobs until
>   * get_completed_job return NULL
> @@ -1117,6 +1311,32 @@ handle_completed_jobs(struct aesni_mb_qp *qp, JOB_AES_HMAC *job,
>  	return processed_jobs;
>  }
> 
> +static inline uint32_t
> +handle_completed_sec_jobs(JOB_AES_HMAC *job, MB_MGR *mb_mgr)
> +{
> +	uint32_t processed = 0;
> +
> +	while (job != NULL) {
> +		post_process_mb_sec_job(job);
> +		job = IMB_GET_COMPLETED_JOB(mb_mgr);
> +		processed++;
> +	}
> +
> +	return processed;
> +}
> +
> +static inline uint32_t
> +flush_mb_sec_mgr(MB_MGR *mb_mgr)
> +{
> +	JOB_AES_HMAC *job = IMB_FLUSH_JOB(mb_mgr);
> +	uint32_t processed = 0;
> +
> +	if (job)
> +		processed = handle_completed_sec_jobs(job, mb_mgr);
> +
> +	return processed;
> +}
> +
>  static inline uint16_t
>  flush_mb_mgr(struct aesni_mb_qp *qp, struct rte_crypto_op **ops,
>  		uint16_t nb_ops)
> @@ -1220,6 +1440,55 @@ aesni_mb_pmd_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
>  	return processed_jobs;
>  }
> 
> +void
> +aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num)
> +{
> +	struct aesni_mb_sec_session *sec_sess = sess->sess_private_data;
> +	JOB_AES_HMAC *job;
> +	uint8_t digest_idx = sec_sess->digest_idx;
> +	uint32_t i, processed = 0;
> +	int ret;
> +
> +	for (i = 0; i < num; i++) {
> +		void *seg_buf = buf[i].vec[0].iov_base;
> +		uint32_t buf_len = buf[i].vec[0].iov_len;
> +
> +		job = IMB_GET_NEXT_JOB(sec_sess->mb_mgr);
> +		if (unlikely(job == NULL)) {
> +			processed += flush_mb_sec_mgr(sec_sess->mb_mgr);
> +
> +			job = IMB_GET_NEXT_JOB(sec_sess->mb_mgr);
> +			if (!job)
> +				return;

You can't just return here.
You need to fill the remaining status[] entries with some meaningful error value.
Alternatively, make process_cpu_crypto_bulk() return the number of processed buffers instead of void.


> +		}
> +
> +		ret = set_mb_job_params_sec(job, sec_sess, seg_buf, buf_len,
> +				iv[i], aad[i], digest[i], &status[i],
> +				&digest_idx);

That doesn't look right:
digest_idx is a temporary variable; you pass its address to set_mb_job_params_sec(),
where it will be updated, but then you never write it back.
So do we really need digest_idx inside the session?
Overall, the whole construction of storing the status and idx inside the job struct
seems overcomplicated and probably error-prone.
AFAIK, the aesni-mb job manager guarantees FIFO order for submitted jobs,
so just having an idx counter inside that function seems enough, no?


> +				/* Submit job to multi-buffer for processing */
> +		if (ret) {
> +			processed++;
> +			status[i] = ret;
> +			continue;
> +		}
> +
> +#ifdef RTE_LIBRTE_PMD_AESNI_MB_DEBUG
> +		job = IMB_SUBMIT_JOB(sec_sess->mb_mgr);
> +#else
> +		job = IMB_SUBMIT_JOB_NOCHECK(sec_sess->mb_mgr);
> +#endif
> +
> +		if (job)
> +			processed += handle_completed_sec_jobs(job,
> +					sec_sess->mb_mgr);
> +	}
> +
> +	while (processed < num)
> +		processed += flush_mb_sec_mgr(sec_sess->mb_mgr);
> +}
> +
>  static int cryptodev_aesni_mb_remove(struct rte_vdev_device *vdev);
> 
>  static int
> @@ -1229,8 +1498,10 @@ cryptodev_aesni_mb_create(const char *name,
>  {
>  	struct rte_cryptodev *dev;
>  	struct aesni_mb_private *internals;
> +	struct rte_security_ctx *sec_ctx;
>  	enum aesni_mb_vector_mode vector_mode;
>  	MB_MGR *mb_mgr;
> +	char sec_name[RTE_DEV_NAME_MAX_LEN];
> 
>  	/* Check CPU for support for AES instruction set */
>  	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
> @@ -1264,7 +1535,8 @@ cryptodev_aesni_mb_create(const char *name,
>  	dev->feature_flags = RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO |
>  			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
>  			RTE_CRYPTODEV_FF_CPU_AESNI |
> -			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
> +			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
> +			RTE_CRYPTODEV_FF_SECURITY;
> 
> 
>  	mb_mgr = alloc_mb_mgr(0);
> @@ -1303,11 +1575,28 @@ cryptodev_aesni_mb_create(const char *name,
>  	AESNI_MB_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
>  			imb_get_version_str());
> 
> +	/* setup security operations */
> +	snprintf(sec_name, sizeof(sec_name) - 1, "aes_mb_sec_%u",
> +			dev->driver_id);
> +	sec_ctx = rte_zmalloc_socket(sec_name,
> +			sizeof(struct rte_security_ctx),
> +			RTE_CACHE_LINE_SIZE, init_params->socket_id);
> +	if (sec_ctx == NULL) {
> +		AESNI_MB_LOG(ERR, "memory allocation failed\n");
> +		goto error_exit;
> +	}
> +
> +	sec_ctx->device = (void *)dev;
> +	sec_ctx->ops = rte_aesni_mb_pmd_security_ops;
> +	dev->security_ctx = sec_ctx;
> +
>  	return 0;
> 
>  error_exit:
>  	if (mb_mgr)
>  		free_mb_mgr(mb_mgr);
> +	if (sec_ctx)
> +		rte_free(sec_ctx);
> 
>  	rte_cryptodev_pmd_destroy(dev);
> 
> diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
> index 8d15b99d4..ca6cea775 100644
> --- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
> +++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
> @@ -8,6 +8,7 @@
>  #include <rte_common.h>
>  #include <rte_malloc.h>
>  #include <rte_cryptodev_pmd.h>
> +#include <rte_security_driver.h>
> 
>  #include "rte_aesni_mb_pmd_private.h"
> 
> @@ -732,7 +733,8 @@ aesni_mb_pmd_qp_count(struct rte_cryptodev *dev)
>  static unsigned
>  aesni_mb_pmd_sym_session_get_size(struct rte_cryptodev *dev __rte_unused)
>  {
> -	return sizeof(struct aesni_mb_session);
> +	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_session),
> +			RTE_CACHE_LINE_SIZE);
>  }
> 
>  /** Configure a aesni multi-buffer session from a crypto xform chain */
> @@ -810,4 +812,91 @@ struct rte_cryptodev_ops aesni_mb_pmd_ops = {
>  		.sym_session_clear	= aesni_mb_pmd_sym_session_clear
>  };
> 
> +/** Set session authentication parameters */
> +
> +static int
> +aesni_mb_security_session_create(void *dev,
> +		struct rte_security_session_conf *conf,
> +		struct rte_security_session *sess,
> +		struct rte_mempool *mempool)
> +{
> +	struct rte_cryptodev *cdev = dev;
> +	struct aesni_mb_private *internals = cdev->data->dev_private;
> +	struct aesni_mb_sec_session *sess_priv;
> +	int ret;
> +
> +	if (!conf->crypto_xform) {
> +		AESNI_MB_LOG(ERR, "Invalid security session conf");
> +		return -EINVAL;
> +	}
> +
> +	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
> +		AESNI_MB_LOG(ERR,
> +				"Couldn't get object from session mempool");
> +		return -ENOMEM;
> +	}
> +
> +	sess_priv->mb_mgr = internals->mb_mgr;

On further thought - I don't think it is OK to use the same job manager
across all sessions. Different sessions can be used by different threads, etc.
I think we need a separate instance of the job manager for every session.


> +	if (sess_priv->mb_mgr == NULL)
> +		return -ENOMEM;
> +
> +	sess_priv->cipher_offset = conf->cpucrypto.cipher_offset;
> +
> +	ret = aesni_mb_set_session_parameters(sess_priv->mb_mgr,
> +			&sess_priv->sess, conf->crypto_xform);
> +	if (ret != 0) {
> +		AESNI_MB_LOG(ERR, "failed configure session parameters");
> +
> +		rte_mempool_put(mempool, sess_priv);
> +	}
> +
> +	sess->sess_private_data = (void *)sess_priv;
> +
> +	return ret;
> +}
> +
> +static int
> +aesni_mb_security_session_destroy(void *dev __rte_unused,
> +		struct rte_security_session *sess)
> +{
> +	struct aesni_mb_sec_session *sess_priv =
> +			get_sec_session_private_data(sess);
> +
> +	if (sess_priv) {
> +		struct rte_mempool *sess_mp = rte_mempool_from_obj(
> +				(void *)sess_priv);
> +
> +		memset(sess, 0, sizeof(struct aesni_mb_sec_session));
> +		set_sec_session_private_data(sess, NULL);
> +
> +		if (sess_mp == NULL) {
> +			AESNI_MB_LOG(ERR, "failed fetch session mempool");
> +			return -EINVAL;
> +		}
> +
> +		rte_mempool_put(sess_mp, sess_priv);
> +	}
> +
> +	return 0;
> +}
> +
> +static unsigned int
> +aesni_mb_sec_session_get_size(__rte_unused void *device)
> +{
> +	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_sec_session),
> +			RTE_CACHE_LINE_SIZE);
> +}
> +
> +static struct rte_security_ops aesni_mb_security_ops = {
> +		.session_create = aesni_mb_security_session_create,
> +		.session_get_size = aesni_mb_sec_session_get_size,
> +		.session_update = NULL,
> +		.session_stats_get = NULL,
> +		.session_destroy = aesni_mb_security_session_destroy,
> +		.set_pkt_metadata = NULL,
> +		.capabilities_get = NULL,
> +		.process_cpu_crypto_bulk = aesni_mb_sec_crypto_process_bulk,
> +};
> +
>  struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops = &aesni_mb_pmd_ops;
> +struct rte_security_ops *rte_aesni_mb_pmd_security_ops = &aesni_mb_security_ops;
> diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
> index b794d4bc1..d1cf416ab 100644
> --- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
> +++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
> @@ -176,7 +176,6 @@ struct aesni_mb_qp {
>  	 */
>  } __rte_cache_aligned;
> 
> -/** AES-NI multi-buffer private session structure */
>  struct aesni_mb_session {
>  	JOB_CHAIN_ORDER chain_order;
>  	struct {
> @@ -265,16 +264,32 @@ struct aesni_mb_session {
>  		/** AAD data length */
>  		uint16_t aad_len;
>  	} aead;
> -} __rte_cache_aligned;

Didn't look through all the code 

> +};
> +
> +/** AES-NI multi-buffer private security session structure */
> +struct aesni_mb_sec_session {
> +	/**< Unique Queue Pair Name */
> +	struct aesni_mb_session sess;
> +	uint8_t temp_digests[MAX_JOBS][DIGEST_LENGTH_MAX];

Probably better to move these temp_digests[][] to the very end,
to have all the read-only data grouped together?

> +	uint16_t digest_idx;
> +	uint32_t cipher_offset;
> +	MB_MGR *mb_mgr;
> +};
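On the layout question, offsetof() checks make such grouping decisions concrete. A toy, self-contained sketch (hypothetical field names and sizes, 64-byte cache line assumed, not the actual aesni_mb_sec_session layout): read-only configuration sits in the leading cache lines, per-operation scratch goes last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_CACHE_LINE 64	/* assumed cache-line size */

/* Hypothetical layout following the suggestion above: fields that are
 * read-only after init occupy the leading cache lines (they stay clean
 * and shareable across cores), per-operation scratch goes at the end. */
struct toy_sec_session {
	uint64_t keys[8];		/* read-only after init */
	uint32_t cipher_offset;		/* read-only after init */
	uint32_t digest_idx;		/* read-write, hot */
	uint8_t temp_digests[16][64];	/* per-op scratch, placed last */
} __attribute__((aligned(TOY_CACHE_LINE)));
```

With this ordering the writes to temp_digests never dirty the cache lines holding the keys.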
> 
>  extern int
>  aesni_mb_set_session_parameters(const MB_MGR *mb_mgr,
>  		struct aesni_mb_session *sess,
>  		const struct rte_crypto_sym_xform *xform);
> 
> +extern void
> +aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num);
> +
>  /** device specific operations function pointer structure */
>  extern struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops;
> 
> -
> +/** device specific operations function pointer structure for rte_security */
> +extern struct rte_security_ops *rte_aesni_mb_pmd_security_ops;
> 
>  #endif /* _RTE_AESNI_MB_PMD_PRIVATE_H_ */
> --
> 2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-18  7:44                     ` Ananyev, Konstantin
@ 2019-09-25 18:24                       ` Ananyev, Konstantin
  2019-09-27  9:26                         ` Akhil Goyal
  0 siblings, 1 reply; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-25 18:24 UTC (permalink / raw)
  To: Akhil Goyal, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph


> > > > > > > > > This action type allows the burst of symmetric crypto workload using
> > > the
> > > > > > > same
> > > > > > > > > algorithm, key, and direction being processed by CPU cycles
> > > > > synchronously.
> > > > > > > > > This flexible action type does not require external hardware
> > > involvement,
> > > > > > > > > having the crypto workload processed synchronously, and is more
> > > > > > > performant
> > > > > > > > > than Cryptodev SW PMD due to the saved cycles on removed "async
> > > > > mode
> > > > > > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > > > > > >
> > > > > > > > Does that mean application will not call the cryptodev_enqueue_burst
> > > and
> > > > > > > corresponding dequeue burst.
> > > > > > >
> > > > > > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > > > > > >
> > > > > > > > It would be a new API something like process_packets and it will have
> > > the
> > > > > > > crypto processed packets while returning from the API?
> > > > > > >
> > > > > > > Yes, though the plan is that API will operate on raw data buffers, not
> > > mbufs.
> > > > > > >
> > > > > > > >
> > > > > > > > I still do not understand why we cannot do with the conventional
> > > crypto lib
> > > > > > > only.
> > > > > > > > As far as I can understand, you are not doing any protocol processing
> > > or
> > > > > any
> > > > > > > value add
> > > > > > > > To the crypto processing. IMO, you just need a synchronous crypto
> > > > > processing
> > > > > > > API which
> > > > > > > > Can be defined in cryptodev, you don't need to re-create a crypto
> > > session
> > > > > in
> > > > > > > the name of
> > > > > > > > Security session in the driver just to do a synchronous processing.
> > > > > > >
> > > > > > > I suppose your question is why not to have
> > > > > > > rte_crypot_process_cpu_crypto_bulk(...) instead?
> > > > > > > The main reason is that would require disruptive changes in existing
> > > > > cryptodev
> > > > > > > API
> > > > > > > (would cause ABI/API breakage).
> > > > > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need some
> > > extra
> > > > > > > information
> > > > > > > that normal crypto_sym_xform doesn't contain
> > > > > > > (cipher offset from the start of the buffer, might be something extra in
> > > > > future).
> > > > > >
> > > > > > Cipher offset will be part of rte_crypto_op.
> > > > >
> > > > > fill/read (+ alloc/free) is one of the main things that slowdown current
> > > crypto-op
> > > > > approach.
> > > > > That's why the general idea - have all data that wouldn't change from packet
> > > to
> > > > > packet
> > > > > included into the session and setup it once at session_init().
> > > >
> > > > I agree that you cannot use crypto-op.
> > > > You can have the new API in crypto.
> > > > As per the current patch, you only need cipher_offset which you can have it as
> > > a parameter until
> > > > You get it approved in the crypto xform. I believe it will be beneficial in case of
> > > other crypto cases as well.
> > > > We can have cipher offset at both places(crypto-op and cipher_xform). It will
> > > give flexibility to the user to
> > > > override it.
> > >
> > > After having another thought on your proposal:
> > > Probably we can introduce new rte_crypto_sym_xform_types for CPU related
> > > stuff here?
> >
> > I also thought of adding new xforms, but that wont serve the purpose for may be all the cases.
> > You would be needing all information currently available in the current xforms.
> > So if you are adding new fields in the new xform, the size will be more than that of the union of xforms.
> > ABI breakage would still be there.
> >
> > If you think a valid compression of the AEAD xform can be done, then that can be done for each of the
> > Xforms and we can have a solution to this issue.
> 
> I think that we can re-use iv.offset for our purposes (for crypto offset).
> So for now we can make that path work without any ABI breakage.
> Fan, please feel free to correct me here, if I missed something.
> If in future we would need to add some extra information it might
> require ABI breakage, though by now I don't envision anything particular to add.
> Anyway, if there is no objection to go that way, we can try to make
> these changes for v2.
> 

Actually, after looking at it more deeply it appears not as easy as I thought it would be :)
Below is a very draft version of proposed API additions.
I think it avoids ABI breakages right now and provides enough flexibility for future extensions (if any). 
For now, it doesn't address your comments about naming conventions (_CPU_ vs _SYNC_), etc.,
but I suppose it is comprehensive enough to convey the main idea behind it.
Akhil and other interested parties, please try to review and provide feedback ASAP,
as the related changes would take some time and we would still like to hit the 19.11 deadline.
Konstantin
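Two of the space-saving tricks the draft below relies on (packing the key struct to free its tail padding, and Fan's unnamed-union reuse of the iv field) can be sanity-checked in isolation. A self-contained sketch — the struct names are illustrative stand-ins, not the real xform layout, and the exact padding regained depends on the ABI (6B on LP64):

```c
#include <assert.h>
#include <stdint.h>

/* Trick 1: packing the key struct drops the tail padding that follows
 * 'length' (6B on a 64-bit ABI), leaving room for new fields without
 * growing the enclosing xform. Illustrative structs only. */
struct key_plain {
	const uint8_t *data;
	uint16_t length;
};

struct key_packed {
	const uint8_t *data;
	uint16_t length;
} __attribute__((__packed__));

/* Trick 2 (Fan's alternative): an unnamed union overlaying cpu-crypto
 * parameters on the 4 bytes already occupied by iv, so no new space
 * is needed at all. */
struct xform_tail {
	union {
		struct { uint16_t offset, length; } iv;
		struct { uint16_t iv_len, crypto_offset; } cpu_crypto_param;
	};
};
```

Either way the struct size, and hence the ABI, stays unchanged; the union variant additionally keeps the field offsets identical.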

 diff --git a/lib/librte_cryptodev/rte_crypto_sym.h b/lib/librte_cryptodev/rte_crypto_sym.h
index bc8da2466..c03069e23 100644
--- a/lib/librte_cryptodev/rte_crypto_sym.h
+++ b/lib/librte_cryptodev/rte_crypto_sym.h
@@ -103,6 +103,9 @@ rte_crypto_cipher_operation_strings[];
  *
  * This structure contains data relating to Cipher (Encryption and Decryption)
  *  use to create a session.
+ * Actually I was wrong saying that we don't have free space inside xforms.
+ * Making the key struct packed (see below) allows us to regain 6B that could be
+ * used for future extensions.
  */
 struct rte_crypto_cipher_xform {
        enum rte_crypto_cipher_operation op;
@@ -116,7 +119,25 @@ struct rte_crypto_cipher_xform {
        struct {
                const uint8_t *data;    /**< pointer to key data */
                uint16_t length;        /**< key length in bytes */
-       } key;
+       } __attribute__((__packed__)) key;
+
+       /**
+        * offset for cipher to start within user provided data buffer.
+        * Fan suggested another (and less space-consuming) way -
+        * reuse iv.offset space below, by changing:
+        * struct {uint16_t offset, length;} iv;
+        * to an unnamed union:
+        * union {
+        *      struct {uint16_t offset, length;} iv;
+        *      struct {uint16_t iv_len, crypto_offset;} cpu_crypto_param;
+        * };
+        * Both approaches seem ok to me in general.
+        * Comments/suggestions are welcome.
+        */
+       uint16_t offset;
+
+       uint8_t reserved1[4];
+
        /**< Cipher key
         *
         * For the RTE_CRYPTO_CIPHER_AES_F8 mode of operation, key.data will
@@ -284,7 +305,7 @@ struct rte_crypto_auth_xform {
        struct {
                const uint8_t *data;    /**< pointer to key data */
                uint16_t length;        /**< key length in bytes */
-       } key;
+       } __attribute__((__packed__)) key;
        /**< Authentication key data.
         * The authentication key length MUST be less than or equal to the
         * block size of the algorithm. It is the callers responsibility to
@@ -292,6 +313,8 @@ struct rte_crypto_auth_xform {
         * (for example RFC 2104, FIPS 198a).
         */

+       uint8_t reserved1[6];
+
        struct {
                uint16_t offset;
                /**< Starting point for Initialisation Vector or Counter,
@@ -376,7 +399,12 @@ struct rte_crypto_aead_xform {
        struct {
                const uint8_t *data;    /**< pointer to key data */
                uint16_t length;        /**< key length in bytes */
-       } key;
+       } __attribute__((__packed__)) key;
+
+       /** offset for cipher to start within data buffer */
+       uint16_t cipher_offset;
+
+       uint8_t reserved1[4];

        struct {
                uint16_t offset;
diff --git a/lib/librte_cryptodev/rte_cryptodev.h b/lib/librte_cryptodev/rte_cryptodev.h
index e175b838c..c0c7bfed7 100644
--- a/lib/librte_cryptodev/rte_cryptodev.h
+++ b/lib/librte_cryptodev/rte_cryptodev.h
@@ -1272,6 +1272,101 @@ void *
 rte_cryptodev_sym_session_get_user_data(
                                        struct rte_cryptodev_sym_session *sess);

+/*
+ * After several thoughts decided not to try to squeeze CPU_CRYPTO
+ * into existing rte_crypto_sym_session structure/API, but instead
+ * introduce an extension to it via a new fully opaque
+ * struct rte_crypto_cpu_sym_session and additional related API.
+ * Main points:
+ * - Current crypto-dev API is reasonably mature and it is desirable
+ *   to keep it unchanged (API/ABI stability). On the other hand, this
+ *   sync API is brand new and will probably require extra changes,
+ *   so keeping it separate allows marking it as experimental without
+ *   affecting the existing one.
+ * - A fully opaque cpu_sym_session structure gives more flexibility
+ *   to the PMD writers and again allows avoiding ABI breakages in future.
+ * - A process() function per set of xforms
+ *   allows exposing different process() functions for different
+ *   xform combinations. The PMD writer can decide whether to
+ *   push all supported algorithms into one process() function,
+ *   or spread them across several ones.
+ *   I.e. more flexibility for the PMD writer.
+ * - Not storing the process() pointer inside the session
+ *   lets the user choose whether to store a process() pointer
+ *   per session, or per group of sessions for that device that share
+ *   the same input xforms. I.e. extra flexibility for the user,
+ *   plus it allows us to keep cpu_sym_session fully opaque, see above.
+ * Sketched usage model:
+ * ....
+ * /* control path, alloc/init session */
+ * int32_t sz = rte_crypto_cpu_sym_session_size(dev_id, &xform);
+ * struct rte_crypto_cpu_sym_session *ses = user_alloc(..., sz);
+ * rte_crypto_cpu_sym_process_t process =
+ *     rte_crypto_cpu_sym_session_func(dev_id, &xform);
+ * rte_crypto_cpu_sym_session_init(dev_id, ses, &xform);
+ * ...
+ * /* data-path*/
+ * process(ses, ....);
+ * ....
+ * /* control path, terminate/free session */
+ * rte_crypto_cpu_sym_session_fini(dev_id, ses);
+ */
+
+/**
+ * vector structure, contains pointer to vector array and the length
+ * of the array
+ */
+struct rte_crypto_vec {
+       struct iovec *vec;
+       uint32_t num;
+};
+
+/*
+ * Data-path bulk process crypto function.
+ */
+typedef void (*rte_crypto_cpu_sym_process_t)(
+               struct rte_crypto_cpu_sym_session *sess,
+               struct rte_crypto_vec buf[], void *iv[], void *aad[],
+               void *digest[], int status[], uint32_t num);
+/*
+ * For a given device, return a process function specific to the input xforms.
+ * On error - return NULL and set rte_errno value.
+ * Note that for the same input xforms on the same device it should
+ * return the same process function.
+ */
+__rte_experimental
+rte_crypto_cpu_sym_process_t
+rte_crypto_cpu_sym_session_func(uint8_t dev_id,
+                       const struct rte_crypto_sym_xform *xforms);
+
+/*
+ * Return required session size in bytes for a given set of xforms.
+ * If xforms == NULL, then return the max possible session size
+ * that would fit a session for any algorithm supported by the device.
+ * If CPU mode is not supported at all, or the algorithm requested in
+ * the xform is not supported, then return -ENOTSUP.
+ */
+__rte_experimental
+int
+rte_crypto_cpu_sym_session_size(uint8_t dev_id,
+                       const struct rte_crypto_sym_xform *xforms);
+
+/*
+ * Initialize session.
+ * It is the caller's responsibility to allocate enough space for it.
+ * See rte_crypto_cpu_sym_session_size above.
+ */
+__rte_experimental
+int rte_crypto_cpu_sym_session_init(uint8_t dev_id,
+                       struct rte_crypto_cpu_sym_session *sess,
+                       const struct rte_crypto_sym_xform *xforms);
+
+__rte_experimental
+void
+rte_crypto_cpu_sym_session_fini(uint8_t dev_id,
+                       struct rte_crypto_cpu_sym_session *sess);
+
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h b/lib/librte_cryptodev/rte_cryptodev_pmd.h
index defe05ea0..ed7e63fab 100644
--- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
+++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
@@ -310,6 +310,20 @@ typedef void (*cryptodev_sym_free_session_t)(struct rte_cryptodev *dev,
 typedef void (*cryptodev_asym_free_session_t)(struct rte_cryptodev *dev,
                struct rte_cryptodev_asym_session *sess);

+typedef int (*cryptodev_cpu_sym_session_size_t) (struct rte_cryptodev *dev,
+                       const struct rte_crypto_sym_xform *xforms);
+
+typedef int (*cryptodev_cpu_sym_session_init_t) (struct rte_cryptodev *dev,
+                       struct rte_crypto_cpu_sym_session *sess,
+                       const struct rte_crypto_sym_xform *xforms);
+
+typedef void (*cryptodev_cpu_sym_session_fini_t) (struct rte_cryptodev *dev,
+                       struct rte_crypto_cpu_sym_session *sess);
+
+typedef rte_crypto_cpu_sym_process_t (*cryptodev_cpu_sym_session_func_t) (
+                       struct rte_cryptodev *dev,
+                       const struct rte_crypto_sym_xform *xforms);
+
 /** Crypto device operations function pointer table */
 struct rte_cryptodev_ops {
        cryptodev_configure_t dev_configure;    /**< Configure device. */
@@ -343,6 +357,11 @@ struct rte_cryptodev_ops {
        /**< Clear a Crypto sessions private data. */
        cryptodev_asym_free_session_t asym_session_clear;
        /**< Clear a Crypto sessions private data. */
+
+       cryptodev_cpu_sym_session_size_t sym_cpu_session_get_size;
+       cryptodev_cpu_sym_session_func_t sym_cpu_session_get_func;
+       cryptodev_cpu_sym_session_init_t sym_cpu_session_init;
+       cryptodev_cpu_sym_session_fini_t sym_cpu_session_fini;
 };





^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH 08/10] ipsec: add rte_security cpu_crypto action support
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 08/10] ipsec: add rte_security cpu_crypto action support Fan Zhang
@ 2019-09-26 23:20     ` Ananyev, Konstantin
  2019-09-27 10:38     ` Ananyev, Konstantin
  1 sibling, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-26 23:20 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: Doherty, Declan, akhil.goyal

Hi Fan,

...
> diff --git a/lib/librte_ipsec/esp_outb.c b/lib/librte_ipsec/esp_outb.c
> index 55799a867..097cb663f 100644
> --- a/lib/librte_ipsec/esp_outb.c
> +++ b/lib/librte_ipsec/esp_outb.c
> @@ -403,6 +403,292 @@ esp_outb_trs_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
>  	return k;
>  }
> 
> +
> +static inline int
> +outb_sync_crypto_proc_prepare(struct rte_mbuf *m, const struct rte_ipsec_sa *sa,
> +		const uint64_t ivp[IPSEC_MAX_IV_QWORD],
> +		const union sym_op_data *icv, uint32_t hlen, uint32_t plen,
> +		struct rte_security_vec *buf, struct iovec *cur_vec, void *iv,
> +		void **aad, void **digest)
> +{
> +	struct rte_mbuf *ms;
> +	struct aead_gcm_iv *gcm;
> +	struct aesctr_cnt_blk *ctr;
> +	struct iovec *vec = cur_vec;
> +	uint32_t left, off = 0, n_seg = 0;

Please separate variable definition and value assignment.
It makes the code hard to read, plus we don't do that in the rest of the library,
so better to follow the rest of the code style.

> +	uint32_t algo;
> +
> +	algo = sa->algo_type;
> +
> +	switch (algo) {
> +	case ALGO_TYPE_AES_GCM:
> +		gcm = iv;
> +		aead_gcm_iv_fill(gcm, ivp[0], sa->salt);
> +		*aad = (void *)(icv->va + sa->icv_len);

Why do we want to allocate aad inside the packet at all?
Why not just to do that on the stack instead?
In that case you probably wouldn't need this icv stuff at all to be passed to that function.

> +		off = sa->ctp.cipher.offset + hlen;
> +		break;
> +	case ALGO_TYPE_AES_CBC:
> +	case ALGO_TYPE_3DES_CBC:
> +		off = sa->ctp.auth.offset + hlen;
> +		break;
> +	case ALGO_TYPE_AES_CTR:
> +		ctr = iv;
> +		aes_ctr_cnt_blk_fill(ctr, ivp[0], sa->salt);
> +		break;
> +	case ALGO_TYPE_NULL:
> +		break;

For the last two, why is off zero?
Shouldn't it at least be 'hlen'?
In fact, I think it needs to be: sa->ctp.auth.offset + hlen;

> +	}
> +
> +	*digest = (void *)icv->va;

Could be done in the upper layer function, together with aad assignment, I think.

Looking at this function, it seems to consist of 2 separate parts:
1. calculates offset and generates iv
2. setup iovec[].
Probably worth to split it into 2 separate functions like that.
Would be much easier to read/understand.

> +
> +	left = sa->ctp.cipher.length + plen;
> +
> +	ms = mbuf_get_seg_ofs(m, &off);
> +	if (!ms)
> +		return -1;

outb_tun_pkt_prepare() should already check that we have a valid packet.
I don't think there is a need to check for any failure here.
Another thing, our esp header will be in the first segment for sure,
so do we need get_seg_ofs() here at all? 

> +
> +	while (n_seg < RTE_LIBRTE_IP_FRAG_MAX_FRAG && left && ms) {

I don't think this is right - we shouldn't impose additional limitations on
the number of segments in the packet.

> +		uint32_t len = RTE_MIN(left, ms->data_len - off);
> +
> +		vec->iov_base = rte_pktmbuf_mtod_offset(ms, void *, off);
> +		vec->iov_len = len;
> +
> +		left -= len;
> +		vec++;
> +		n_seg++;
> +		ms = ms->next;
> +		off = 0;


The whole construction seems a bit over-complicated here...
Why not just have a separate function that would fill iovec[] from an mbuf
and return an error if there are not enough iovec[] entries?
Something like:

static inline int
mbuf_to_iovec(const struct rte_mbuf *mb, uint32_t ofs, uint32_t len, struct iovec vec[], uint32_t num)
{
     uint32_t i;
     const struct rte_mbuf *ms;

     if (mb->nb_segs > num)
         return -mb->nb_segs;

     vec[0].iov_base = rte_pktmbuf_mtod_offset(mb, void *, ofs);
     vec[0].iov_len = mb->data_len - ofs;

     for (i = 1, ms = mb->next; ms != NULL; ms = ms->next, i++) {
         vec[i].iov_base = rte_pktmbuf_mtod(ms, void *);
         vec[i].iov_len = ms->data_len;
     }

     /* trim the last entry, so that the total length equals len */
     vec[i - 1].iov_len -= mb->pkt_len - ofs - len;
     return i;
}

Then we can use that function to fill our iovec[] in a loop.

> +	}
> +
> +	if (left)
> +		return -1;
> +
> +	buf->vec = cur_vec;
> +	buf->num = n_seg;
> +
> +	return n_seg;
> +}
> +
> +/**
> + * Local post process function prototype that same as process function prototype
> + * as rte_ipsec_sa_pkt_func's process().
> + */
> +typedef uint16_t (*sync_crypto_post_process)(const struct rte_ipsec_session *ss,
> +				struct rte_mbuf *mb[],
> +				uint16_t num);

Style nit: typedef names should follow the usual newtype_t convention (e.g. sync_crypto_post_process_t).

> +static uint16_t
> +esp_outb_tun_sync_crypto_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num,
> +		sync_crypto_post_process post_process)
> +{
> +	uint64_t sqn;
> +	rte_be64_t sqc;
> +	struct rte_ipsec_sa *sa;
> +	struct rte_security_ctx *ctx;
> +	struct rte_security_session *rss;
> +	union sym_op_data icv;
> +	struct rte_security_vec buf[num];
> +	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
> +	uint32_t vec_idx = 0;
> +	void *aad[num];
> +	void *digest[num];
> +	void *iv[num];
> +	uint8_t ivs[num][IPSEC_MAX_IV_SIZE];
> +	uint64_t ivp[IPSEC_MAX_IV_QWORD];

Why do we need both ivs and ivp?

> +	int status[num];
> +	uint32_t dr[num];
> +	uint32_t i, n, k;
> +	int32_t rc;
> +
> +	sa = ss->sa;
> +	ctx = ss->security.ctx;
> +	rss = ss->security.ses;
> +
> +	k = 0;
> +	n = num;
> +	sqn = esn_outb_update_sqn(sa, &n);
> +	if (n != num)
> +		rte_errno = EOVERFLOW;
> +
> +	for (i = 0; i != n; i++) {
> +		sqc = rte_cpu_to_be_64(sqn + i);
> +		gen_iv(ivp, sqc);
> +
> +		/* try to update the packet itself */
> +		rc = outb_tun_pkt_prepare(sa, sqc, ivp, mb[i], &icv,
> +				sa->sqh_len);
> +
> +		/* success, setup crypto op */
> +		if (rc >= 0) {
> +			outb_pkt_xprepare(sa, sqc, &icv);

We probably need something like outb_pkt_sync_xprepare(sa, sqc, &aad[i]); here.
To avoid using space in the packet for aad.

> +
> +			iv[k] = (void *)ivs[k];

Do we really need type conversion here?

> +			rc = outb_sync_crypto_proc_prepare(mb[i], sa, ivp, &icv,
> +					0, rc, &buf[k], &vec[vec_idx], iv[k],
> +					&aad[k], &digest[k]);



> +			if (rc < 0) {
> +				dr[i - k] = i;
> +				rte_errno = -rc;
> +				continue;
> +			}
> +
> +			vec_idx += rc;
> +			k++;
> +		/* failure, put packet into the death-row */
> +		} else {
> +			dr[i - k] = i;
> +			rte_errno = -rc;
> +		}
> +	}
> +
> +	 /* copy not prepared mbufs beyond good ones */
> +	if (k != n && k != 0)
> +		move_bad_mbufs(mb, dr, n, n - k);
> +
> +	if (unlikely(k == 0)) {

I don't think 'unlikely' will make any difference here.

> +		rte_errno = EBADMSG;
> +		return 0;
> +	}
> +
> +	/* process the packets */
> +	n = 0;
> +	rte_security_process_cpu_crypto_bulk(ctx, rss, buf, iv, aad, digest,
> +			status, k);

Looking at the code below, I think it would be plausible to make
rte_security_process_cpu_crypto_bulk() return the number of failures
(or the number of successes).

> +	/* move failed process packets to dr */
> +	for (i = 0; i < n; i++) {

That loop will never be executed.
Should be i < k.

> +		if (status[i])
> +			dr[n++] = i;

Forgot to set rte_errno.

> +	}
> +
> +	if (n)

if (n != 0 && n != k)

> +		move_bad_mbufs(mb, dr, k, n);
> +
> +	return post_process(ss, mb, k - n);
> +}
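For reference, the death-row handling discussed above (compact the good entries in place, append the failed indices at the tail) is the same pattern as the library's move_bad_mbufs() helper. A toy, self-contained version operating on plain ints instead of mbuf pointers:

```c
#include <stdint.h>

/* Toy death-row compaction: entries whose indices appear in dr[]
 * (assumed sorted ascending) are moved, in order, behind the
 * surviving entries. Modelled on librte_ipsec's move_bad_mbufs(). */
static void
move_bad(int pkt[], const uint32_t dr[], uint32_t num, uint32_t nb_bad)
{
	uint32_t i, j, k;

	if (nb_bad == 0)
		return;

	int drb[nb_bad];

	/* copy the bad entries aside */
	for (i = 0; i != nb_bad; i++)
		drb[i] = pkt[dr[i]];

	/* compact the good entries in place, preserving order */
	for (i = 0, j = 0, k = 0; i != num; i++) {
		if (k != nb_bad && i == dr[k])
			k++;
		else
			pkt[j++] = pkt[i];
	}

	/* append the bad entries at the tail */
	for (i = 0; i != nb_bad; i++)
		pkt[j + i] = drb[i];
}
```

Keeping the failures contiguous at the tail is what lets the caller return "k - n successes" and still hand every packet back to the application.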
> +
> +static uint16_t
> +esp_outb_trs_sync_crypto_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num,
> +		sync_crypto_post_process post_process)
> +
> +{
> +	uint64_t sqn;
> +	rte_be64_t sqc;
> +	struct rte_ipsec_sa *sa;
> +	struct rte_security_ctx *ctx;
> +	struct rte_security_session *rss;
> +	union sym_op_data icv;
> +	struct rte_security_vec buf[num];
> +	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
> +	uint32_t vec_idx = 0;
> +	void *aad[num];
> +	void *digest[num];
> +	uint8_t ivs[num][IPSEC_MAX_IV_SIZE];
> +	void *iv[num];
> +	int status[num];
> +	uint64_t ivp[IPSEC_MAX_IV_QWORD];
> +	uint32_t dr[num];
> +	uint32_t i, n, k;
> +	uint32_t l2, l3;
> +	int32_t rc;
> +
> +	sa = ss->sa;
> +	ctx = ss->security.ctx;
> +	rss = ss->security.ses;
> +
> +	k = 0;
> +	n = num;
> +	sqn = esn_outb_update_sqn(sa, &n);
> +	if (n != num)
> +		rte_errno = EOVERFLOW;
> +
> +	for (i = 0; i != n; i++) {
> +		l2 = mb[i]->l2_len;
> +		l3 = mb[i]->l3_len;
> +
> +		sqc = rte_cpu_to_be_64(sqn + i);
> +		gen_iv(ivp, sqc);
> +
> +		/* try to update the packet itself */
> +		rc = outb_trs_pkt_prepare(sa, sqc, ivp, mb[i], l2, l3, &icv,
> +				sa->sqh_len);
> +
> +		/* success, setup crypto op */
> +		if (rc >= 0) {
> +			outb_pkt_xprepare(sa, sqc, &icv);
> +
> +			iv[k] = (void *)ivs[k];
> +
> +			rc = outb_sync_crypto_proc_prepare(mb[i], sa, ivp, &icv,
> +					l2 + l3, rc, &buf[k], &vec[vec_idx],
> +					iv[k], &aad[k], &digest[k]);
> +			if (rc < 0) {
> +				dr[i - k] = i;
> +				rte_errno = -rc;
> +				continue;
> +			}
> +
> +			vec_idx += rc;
> +			k++;
> +		/* failure, put packet into the death-row */
> +		} else {
> +			dr[i - k] = i;
> +			rte_errno = -rc;
> +		}
> +	}
> +
> +	 /* copy not prepared mbufs beyond good ones */
> +	if (k != n && k != 0)
> +		move_bad_mbufs(mb, dr, n, n - k);


You don't really need to do it here.
Just one such thing at the very end should be enough.

> +
> +	/* process the packets */
> +	n = 0;
> +	rte_security_process_cpu_crypto_bulk(ctx, rss, buf, iv, aad, digest,
> +			status, k);
> +	/* move failed process packets to dr */
> +	for (i = 0; i < k; i++) {
> +		if (status[i])
> +			dr[n++] = i;
> +	}
> +
> +	if (n)
> +		move_bad_mbufs(mb, dr, k, n);
> +
> +	return post_process(ss, mb, k - n);
> +}
> +
> +uint16_t
> +esp_outb_tun_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num)
> +{
> +	return esp_outb_tun_sync_crypto_process(ss, mb, num,
> +			esp_outb_sqh_process);

esp_outb_sqh_process() relies on PKT_RX_SEC_OFFLOAD_FAILED being set
in mb->ol_flags for failed packets.
First, for the _sync_ case no-one will set it for you.
Second, for _sync_ you don't really need it - it is just extra overhead here.
So I think you can't reuse this function without some modifications here.
Probably easier to make a new one (and extract some common code into
another helper function that esp_outb_sqh_process and the new one can call).

> +}
> +
> +uint16_t
> +esp_outb_tun_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num)
> +{
> +	return esp_outb_tun_sync_crypto_process(ss, mb, num,
> +			esp_outb_pkt_flag_process);

Same as above, plus the fact that you made esp_outb_pkt_flag_process()
non-static, so the compiler wouldn't be able to inline it.

> +}
> +
> +uint16_t
> +esp_outb_trs_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num)
> +{
> +	return esp_outb_trs_sync_crypto_process(ss, mb, num,
> +			esp_outb_sqh_process);
> +}
> +
> +uint16_t
> +esp_outb_trs_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num)
> +{
> +	return esp_outb_trs_sync_crypto_process(ss, mb, num,
> +			esp_outb_pkt_flag_process);
> +}
> +
>  /*
>   * process outbound packets for SA with ESN support,
>   * for algorithms that require SQN.hibits to be implictly included
> @@ -410,8 +696,8 @@ esp_outb_trs_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
>   * In that case we have to move ICV bytes back to their proper place.
>   */
>  uint16_t
> -esp_outb_sqh_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
> -	uint16_t num)
> +esp_outb_sqh_process(const struct rte_ipsec_session *ss,
> +	struct rte_mbuf *mb[], uint16_t num)

Any purpose for that change?

>  {
>  	uint32_t i, k, icv_len, *icv;
>  	struct rte_mbuf *ml;
> diff --git a/lib/librte_ipsec/sa.c b/lib/librte_ipsec/sa.c
> index 23d394b46..31ffbce2c 100644
> --- a/lib/librte_ipsec/sa.c
> +++ b/lib/librte_ipsec/sa.c
> @@ -544,9 +544,9 @@ lksd_proto_prepare(const struct rte_ipsec_session *ss,
>   * - inbound/outbound for RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
>   * - outbound for RTE_SECURITY_ACTION_TYPE_NONE when ESN is disabled
>   */
> -static uint16_t
> -pkt_flag_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
> -	uint16_t num)
> +uint16_t
> +esp_outb_pkt_flag_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num)


Why to rename this function?
As comment above it states, the function is used for both inbound and outbound
code path. 
Such renaming seems misleading to me.

>  {
>  	uint32_t i, k;
>  	uint32_t dr[num];
> @@ -599,12 +599,48 @@ lksd_none_pkt_func_select(const struct rte_ipsec_sa *sa,
>  	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV6):
>  		pf->prepare = esp_outb_tun_prepare;
>  		pf->process = (sa->sqh_len != 0) ?
> -			esp_outb_sqh_process : pkt_flag_process;
> +			esp_outb_sqh_process : esp_outb_pkt_flag_process;
>  		break;
>  	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TRANS):
>  		pf->prepare = esp_outb_trs_prepare;
>  		pf->process = (sa->sqh_len != 0) ?
> -			esp_outb_sqh_process : pkt_flag_process;
> +			esp_outb_sqh_process : esp_outb_pkt_flag_process;
> +		break;
> +	default:
> +		rc = -ENOTSUP;
> +	}
> +
> +	return rc;
> +}
> +
> +static int
> +lksd_sync_crypto_pkt_func_select(const struct rte_ipsec_sa *sa,
> +		struct rte_ipsec_sa_pkt_func *pf)

As a nit: probably no point to have the lksd_ prefix for _sync_ functions.

> +{
> +	int32_t rc;
> +
> +	static const uint64_t msk = RTE_IPSEC_SATP_DIR_MASK |
> +			RTE_IPSEC_SATP_MODE_MASK;
> +
> +	rc = 0;
> +	switch (sa->type & msk) {
> +	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TUNLV4):
> +	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TUNLV6):
> +		pf->process = esp_inb_tun_sync_crypto_pkt_process;
> +		break;
> +	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TRANS):
> +		pf->process = esp_inb_trs_sync_crypto_pkt_process;
> +		break;
> +	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV4):
> +	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV6):
> +		pf->process = (sa->sqh_len != 0) ?
> +			esp_outb_tun_sync_crpyto_sqh_process :
> +			esp_outb_tun_sync_crpyto_flag_process;
> +		break;
> +	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TRANS):
> +		pf->process = (sa->sqh_len != 0) ?
> +			esp_outb_trs_sync_crpyto_sqh_process :
> +			esp_outb_trs_sync_crpyto_flag_process;
>  		break;
>  	default:
>  		rc = -ENOTSUP;
> @@ -672,13 +708,16 @@ ipsec_sa_pkt_func_select(const struct rte_ipsec_session *ss,
>  	case RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL:
>  		if ((sa->type & RTE_IPSEC_SATP_DIR_MASK) ==
>  				RTE_IPSEC_SATP_DIR_IB)
> -			pf->process = pkt_flag_process;
> +			pf->process = esp_outb_pkt_flag_process;
>  		else
>  			pf->process = inline_proto_outb_pkt_process;
>  		break;
>  	case RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL:
>  		pf->prepare = lksd_proto_prepare;
> -		pf->process = pkt_flag_process;
> +		pf->process = esp_outb_pkt_flag_process;
> +		break;
> +	case RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
> +		rc = lksd_sync_crypto_pkt_func_select(sa, pf);
>  		break;
>  	default:
>  		rc = -ENOTSUP;
> diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
> index 51e69ad05..02c7abc60 100644
> --- a/lib/librte_ipsec/sa.h
> +++ b/lib/librte_ipsec/sa.h
> @@ -156,6 +156,14 @@ uint16_t
>  inline_inb_trs_pkt_process(const struct rte_ipsec_session *ss,
>  	struct rte_mbuf *mb[], uint16_t num);
> 
> +uint16_t
> +esp_inb_tun_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num);
> +
> +uint16_t
> +esp_inb_trs_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num);
> +
>  /* outbound processing */
> 
>  uint16_t
> @@ -170,6 +178,10 @@ uint16_t
>  esp_outb_sqh_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
>  	uint16_t num);
> 
> +uint16_t
> +esp_outb_pkt_flag_process(const struct rte_ipsec_session *ss,
> +	struct rte_mbuf *mb[], uint16_t num);
> +
>  uint16_t
>  inline_outb_tun_pkt_process(const struct rte_ipsec_session *ss,
>  	struct rte_mbuf *mb[], uint16_t num);
> @@ -182,4 +194,21 @@ uint16_t
>  inline_proto_outb_pkt_process(const struct rte_ipsec_session *ss,
>  	struct rte_mbuf *mb[], uint16_t num);
> 
> +uint16_t
> +esp_outb_tun_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num);
> +
> +uint16_t
> +esp_outb_tun_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num);
> +
> +uint16_t
> +esp_outb_trs_sync_crpyto_sqh_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num);
> +
> +uint16_t
> +esp_outb_trs_sync_crpyto_flag_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num);
> +
> +
>  #endif /* _SA_H_ */


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-25 18:24                       ` Ananyev, Konstantin
@ 2019-09-27  9:26                         ` Akhil Goyal
  2019-09-30 12:22                           ` Ananyev, Konstantin
  0 siblings, 1 reply; 84+ messages in thread
From: Akhil Goyal @ 2019-09-27  9:26 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph

Hi Konstantin,

> -----Original Message-----
> From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> Sent: Wednesday, September 25, 2019 11:54 PM
> To: Akhil Goyal <akhil.goyal@nxp.com>; 'dev@dpdk.org' <dev@dpdk.org>; De
> Lara Guarch, Pablo <pablo.de.lara.guarch@intel.com>; 'Thomas Monjalon'
> <thomas@monjalon.net>
> Cc: Zhang, Roy Fan <roy.fan.zhang@intel.com>; Doherty, Declan
> <declan.doherty@intel.com>; 'Anoob Joseph' <anoobj@marvell.com>
> Subject: RE: [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
> 
> 
> > > > > > > > > > This action type allows the burst of symmetric crypto workload
> using
> > > > the
> > > > > > > > same
> > > > > > > > > > algorithm, key, and direction being processed by CPU cycles
> > > > > > synchronously.
> > > > > > > > > > This flexible action type does not require external hardware
> > > > involvement,
> > > > > > > > > > having the crypto workload processed synchronously, and is
> more
> > > > > > > > performant
> > > > > > > > > > than Cryptodev SW PMD due to the saved cycles on removed
> "async
> > > > > > mode
> > > > > > > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > > > > > > >
> > > > > > > > > Does that mean application will not call the
> cryptodev_enqueue_burst
> > > > and
> > > > > > > > corresponding dequeue burst.
> > > > > > > >
> > > > > > > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > > > > > > >
> > > > > > > > > It would be a new API something like process_packets and it will
> have
> > > > the
> > > > > > > > crypto processed packets while returning from the API?
> > > > > > > >
> > > > > > > > Yes, though the plan is that API will operate on raw data buffers,
> not
> > > > mbufs.
> > > > > > > >
> > > > > > > > >
> > > > > > > > > I still do not understand why we cannot do with the conventional
> > > > crypto lib
> > > > > > > > only.
> > > > > > > > > As far as I can understand, you are not doing any protocol
> processing
> > > > or
> > > > > > any
> > > > > > > > value add
> > > > > > > > > To the crypto processing. IMO, you just need a synchronous
> crypto
> > > > > > processing
> > > > > > > > API which
> > > > > > > > > Can be defined in cryptodev, you don't need to re-create a crypto
> > > > session
> > > > > > in
> > > > > > > > the name of
> > > > > > > > > Security session in the driver just to do a synchronous processing.
> > > > > > > >
> > > > > > > > I suppose your question is why not to have
> > > > > > > > rte_crypot_process_cpu_crypto_bulk(...) instead?
> > > > > > > > The main reason is that would require disruptive changes in existing
> > > > > > cryptodev
> > > > > > > > API
> > > > > > > > (would cause ABI/API breakage).
> > > > > > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need
> some
> > > > extra
> > > > > > > > information
> > > > > > > > that normal crypto_sym_xform doesn't contain
> > > > > > > > (cipher offset from the start of the buffer, might be something extra
> in
> > > > > > future).
> > > > > > >
> > > > > > > Cipher offset will be part of rte_crypto_op.
> > > > > >
> > > > > > fill/read (+ alloc/free) is one of the main things that slowdown current
> > > > crypto-op
> > > > > > approach.
> > > > > > That's why the general idea - have all data that wouldn't change from
> packet
> > > > to
> > > > > > packet
> > > > > > included into the session and setup it once at session_init().
> > > > >
> > > > > I agree that you cannot use crypto-op.
> > > > > You can have the new API in crypto.
> > > > > As per the current patch, you only need cipher_offset which you can have
> it as
> > > > a parameter until
> > > > > You get it approved in the crypto xform. I believe it will be beneficial in
> case of
> > > > other crypto cases as well.
> > > > > We can have cipher offset at both places(crypto-op and cipher_xform). It
> will
> > > > give flexibility to the user to
> > > > > override it.
> > > >
> > > > After having another thought on your proposal:
> > > > Probably we can introduce new rte_crypto_sym_xform_types for CPU
> related
> > > > stuff here?
> > >
> > > I also thought of adding new xforms, but that won't serve the purpose for
> > > maybe all the cases.
> > > You would be needing all information currently available in the current
> xforms.
> > > So if you are adding new fields in the new xform, the size will be more than
> that of the union of xforms.
> > > ABI breakage would still be there.
> > >
> > > If you think a valid compression of the AEAD xform can be done, then that
> can be done for each of the
> > > Xforms and we can have a solution to this issue.
> >
> > I think that we can re-use iv.offset for our purposes (for crypto offset).
> > So for now we can make that path work without any ABI breakage.
> > Fan, please feel free to correct me here, if I missed something.
> > If in future we would need to add some extra information it might
> > require ABI breakage, though by now I don't envision anything particular to
> add.
> > Anyway, if there is no objection to go that way, we can try to make
> > these changes for v2.
> >
> 
> Actually, after looking at it more deeply it appears not as easy as I
> thought it would be :)
> Below is a very draft version of proposed API additions.
> I think it avoids ABI breakages right now and provides enough flexibility for
> future extensions (if any).
> For now, it doesn't address your comments about naming conventions (_CPU_
> vs _SYNC_), etc.,
> but I suppose it is comprehensive enough to convey the main idea behind it.
> Akhil and other interested parties, please try to review and provide feedback
> ASAP,
> as related changes would take some time and we would still like to hit the 19.11 deadline.
> Konstantin
> 
>  diff --git a/lib/librte_cryptodev/rte_crypto_sym.h
> b/lib/librte_cryptodev/rte_crypto_sym.h
> index bc8da2466..c03069e23 100644
> --- a/lib/librte_cryptodev/rte_crypto_sym.h
> +++ b/lib/librte_cryptodev/rte_crypto_sym.h
> @@ -103,6 +103,9 @@ rte_crypto_cipher_operation_strings[];
>   *
>   * This structure contains data relating to Cipher (Encryption and Decryption)
>   *  use to create a session.
> + * Actually I was wrong saying that we don't have free space inside xforms.
> + * Making key struct packed (see below) allows us to regain 6B that could be
> + * used for future extensions.
>   */
>  struct rte_crypto_cipher_xform {
>         enum rte_crypto_cipher_operation op;
> @@ -116,7 +119,25 @@ struct rte_crypto_cipher_xform {
>         struct {
>                 const uint8_t *data;    /**< pointer to key data */
>                 uint16_t length;        /**< key length in bytes */
> -       } key;
> +       } __attribute__((__packed__)) key;
> +
> +       /**
> +        * offset for cipher to start within the user-provided data buffer.
> +        * Fan suggested another (and less space-consuming) way -
> +        * reuse the iv.offset space below, by changing:
> +        * struct {uint16_t offset, length;} iv;
> +        * to an unnamed union:
> +        * union {
> +        *      struct {uint16_t offset, length;} iv;
> +        *      struct {uint16_t iv_len, crypto_offset} cpu_crypto_param;
> +        * };
> +        * Both approaches seem ok to me in general.

No strong opinions here. OK with this one.

> +        * Comments/suggestions are welcome.
> +         */
> +       uint16_t offset;
> +
> +       uint8_t reserved1[4];
> +
>         /**< Cipher key
>          *
>          * For the RTE_CRYPTO_CIPHER_AES_F8 mode of operation, key.data will
> @@ -284,7 +305,7 @@ struct rte_crypto_auth_xform {
>         struct {
>                 const uint8_t *data;    /**< pointer to key data */
>                 uint16_t length;        /**< key length in bytes */
> -       } key;
> +       } __attribute__((__packed__)) key;
>         /**< Authentication key data.
>          * The authentication key length MUST be less than or equal to the
>          * block size of the algorithm. It is the callers responsibility to
> @@ -292,6 +313,8 @@ struct rte_crypto_auth_xform {
>          * (for example RFC 2104, FIPS 198a).
>          */
> 
> +       uint8_t reserved1[6];
> +
>         struct {
>                 uint16_t offset;
>                 /**< Starting point for Initialisation Vector or Counter,
> @@ -376,7 +399,12 @@ struct rte_crypto_aead_xform {
>         struct {
>                 const uint8_t *data;    /**< pointer to key data */
>                 uint16_t length;        /**< key length in bytes */
> -       } key;
> +       } __attribute__((__packed__)) key;
> +
> +       /** offset for cipher to start within data buffer */
> +       uint16_t cipher_offset;
> +
> +       uint8_t reserved1[4];
> 
>         struct {
>                 uint16_t offset;
> diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> b/lib/librte_cryptodev/rte_cryptodev.h
> index e175b838c..c0c7bfed7 100644
> --- a/lib/librte_cryptodev/rte_cryptodev.h
> +++ b/lib/librte_cryptodev/rte_cryptodev.h
> @@ -1272,6 +1272,101 @@ void *
>  rte_cryptodev_sym_session_get_user_data(
>                                         struct rte_cryptodev_sym_session *sess);
> 
> +/*
> + * After several thoughts I decided not to try to squeeze CPU_CRYPTO
> + * into the existing rte_crypto_sym_session structure/API, but instead
> + * introduce an extension to it via a new fully opaque
> + * struct rte_crypto_cpu_sym_session and additional related API.


What exactly do we need to squeeze in?
In this proposal I do not see the new struct cpu_sym_session defined here.
I believe you will have the same lib API/struct for cpu_sym_session and sym_session.
I am not sure that would be needed.
It could be internal to the driver: if synchronous processing is supported (per the
feature flag) and the relevant xform fields (the newly added ones, packed as per your
suggestion) are set, it will create that type of session.


> + * Main points:
> + * - Current crypto-dev API is reasonably mature and it is desirable
> + *   to keep it unchanged (API/ABI stability). On the other hand, this
> + *   new sync API is a new one and probably would require extra changes.
> + *   Having it as a new one allows us to mark it as experimental, without
> + *   affecting the existing one.
> + * - Fully opaque cpu_sym_session structure gives more flexibility
> + *   to the PMD writers and again allows us to avoid ABI breakages in future.
> + * - A process() function per set of xforms
> + *   allows exposing different process() functions for different
> + *   xform combinations. The PMD writer can decide whether to
> + *   push all supported algorithms into one process() function,
> + *   or spread them across several ones.
> + *   I.e. more flexibility for the PMD writer.

Which process function should be chosen is internal to the PMD; how would that info
be visible to the application or the library? These will get stored in the session private
data. It would be up to the PMD writer to store the per-session process function in
the session private data.

The process function would be a dev op, just like the enqueue/dequeue operations, and
it should call the respective process API stored in the session private data.

I am not sure you would need a new session init API for this, as nothing would be
visible to the app or lib.

> + * - Not storing the process() pointer inside the session
> + *   allows the user to choose whether to store a process() pointer
> + *   per session, or per group of sessions for that device that share
> + *   the same input xforms. I.e. extra flexibility for the user,
> + *   plus it allows us to keep cpu_sym_session totally opaque, see above.

If multiple sessions need to be processed via the same process function,
the PMD would save the same process function in all the sessions; I don't think
there would be any perf overhead with that.

> + * Sketched usage model:
> + * ....
> + * /* control path, alloc/init session */
> + * int32_t sz = rte_crypto_cpu_sym_session_size(dev_id, &xform);
> + * struct rte_crypto_cpu_sym_session *ses = user_alloc(..., sz);
> + * rte_crypto_cpu_sym_process_t process =
> + *     rte_crypto_cpu_sym_session_func(dev_id, &xform);
> + * rte_crypto_cpu_sym_session_init(dev_id, ses, &xform);
> + * ...
> + * /* data-path*/
> + * process(ses, ....);
> + * ....
> + * /* control path, terminate/free session */
> + * rte_crypto_cpu_sym_session_fini(dev_id, ses);
> + */
> +
> +/**
> + * vector structure, contains pointer to vector array and the length
> + * of the array
> + */
> +struct rte_crypto_vec {
> +       struct iovec *vec;
> +       uint32_t num;
> +};
> +
> +/*
> + * Data-path bulk process crypto function.
> + */
> +typedef void (*rte_crypto_cpu_sym_process_t)(
> +               struct rte_crypto_cpu_sym_session *sess,
> +               struct rte_crypto_vec buf[], void *iv[], void *aad[],
> +               void *digest[], int status[], uint32_t num);
> +/*
> + * for a given device return the process function specific to the input xforms;
> + * on error - return NULL and set rte_errno value.
> + * Note that for the same input xforms the same device should return
> + * the same process function.
> + */
> +__rte_experimental
> +rte_crypto_cpu_sym_process_t
> +rte_crypto_cpu_sym_session_func(uint8_t dev_id,
> +                       const struct rte_crypto_sym_xform *xforms);
> +
> +/*
> + * Return required session size in bytes for given set of xforms.
> + * if xforms == NULL, then return the max possible session size,
> + * one that would fit a session for any algorithm supported by the device.
> + * if CPU mode is not supported at all, or the algorithm requested in the
> + * xform is not supported, then return -ENOTSUP.
> + */
> +__rte_experimental
> +int
> +rte_crypto_cpu_sym_session_size(uint8_t dev_id,
> +                       const struct rte_crypto_sym_xform *xforms);
> +
> +/*
> + * Initialize session.
> + * It is the caller's responsibility to allocate enough space for it.
> + * See rte_crypto_cpu_sym_session_size above.
> + */
> +__rte_experimental
> +int rte_crypto_cpu_sym_session_init(uint8_t dev_id,
> +                       struct rte_crypto_cpu_sym_session *sess,
> +                       const struct rte_crypto_sym_xform *xforms);
> +
> +__rte_experimental
> +void
> +rte_crypto_cpu_sym_session_fini(uint8_t dev_id,
> +                       struct rte_crypto_cpu_sym_session *sess);
> +
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> index defe05ea0..ed7e63fab 100644
> --- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> +++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> @@ -310,6 +310,20 @@ typedef void (*cryptodev_sym_free_session_t)(struct
> rte_cryptodev *dev,
>  typedef void (*cryptodev_asym_free_session_t)(struct rte_cryptodev *dev,
>                 struct rte_cryptodev_asym_session *sess);
> 
> +typedef int (*cryptodev_cpu_sym_session_size_t) (struct rte_cryptodev *dev,
> +                       const struct rte_crypto_sym_xform *xforms);
> +
> +typedef int (*cryptodev_cpu_sym_session_init_t) (struct rte_cryptodev *dev,
> +                       struct rte_crypto_cpu_sym_session *sess,
> +                       const struct rte_crypto_sym_xform *xforms);
> +
> +typedef void (*cryptodev_cpu_sym_session_fini_t) (struct rte_cryptodev *dev,
> +                       struct rte_crypto_cpu_sym_session *sess);
> +
> +typedef rte_crypto_cpu_sym_process_t (*cryptodev_cpu_sym_session_func_t)
> (
> +                       struct rte_cryptodev *dev,
> +                       const struct rte_crypto_sym_xform *xforms);
> +
>  /** Crypto device operations function pointer table */
>  struct rte_cryptodev_ops {
>         cryptodev_configure_t dev_configure;    /**< Configure device. */
> @@ -343,6 +357,11 @@ struct rte_cryptodev_ops {
>         /**< Clear a Crypto sessions private data. */
>         cryptodev_asym_free_session_t asym_session_clear;
>         /**< Clear a Crypto sessions private data. */
> +
> +       cryptodev_cpu_sym_session_size_t sym_cpu_session_get_size;
> +       cryptodev_cpu_sym_session_func_t sym_cpu_session_get_func;
> +       cryptodev_cpu_sym_session_init_t sym_cpu_session_init;
> +       cryptodev_cpu_sym_session_fini_t sym_cpu_session_fini;
>  };
> 
> 
> 



* Re: [dpdk-dev] [PATCH 08/10] ipsec: add rte_security cpu_crypto action support
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 08/10] ipsec: add rte_security cpu_crypto action support Fan Zhang
  2019-09-26 23:20     ` Ananyev, Konstantin
@ 2019-09-27 10:38     ` Ananyev, Konstantin
  1 sibling, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-27 10:38 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: Doherty, Declan, akhil.goyal

Hi Fan,

> 
> This patch updates the ipsec library to handle the newly introduced
> RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>  lib/librte_ipsec/esp_inb.c  | 174 +++++++++++++++++++++++++-
>  lib/librte_ipsec/esp_outb.c | 290 +++++++++++++++++++++++++++++++++++++++++++-
>  lib/librte_ipsec/sa.c       |  53 ++++++--
>  lib/librte_ipsec/sa.h       |  29 +++++
>  lib/librte_ipsec/ses.c      |   4 +-
>  5 files changed, 539 insertions(+), 11 deletions(-)
> 
> diff --git a/lib/librte_ipsec/esp_inb.c b/lib/librte_ipsec/esp_inb.c
> index 8e3ecbc64..6077dcb1e 100644
> --- a/lib/librte_ipsec/esp_inb.c
> +++ b/lib/librte_ipsec/esp_inb.c
> @@ -105,6 +105,73 @@ inb_cop_prepare(struct rte_crypto_op *cop,
>  	}
>  }
> 
> +static inline int
> +inb_sync_crypto_proc_prepare(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb,
> +	const union sym_op_data *icv, uint32_t pofs, uint32_t plen,
> +	struct rte_security_vec *buf, struct iovec *cur_vec,
> +	void *iv, void **aad, void **digest)
> +{
> +	struct rte_mbuf *ms;
> +	struct iovec *vec = cur_vec;
> +	struct aead_gcm_iv *gcm;
> +	struct aesctr_cnt_blk *ctr;
> +	uint64_t *ivp;
> +	uint32_t algo, left, off = 0, n_seg = 0;

Same thing as for outbound: please keep definitions and assignments separated.

> +
> +	ivp = rte_pktmbuf_mtod_offset(mb, uint64_t *,
> +		pofs + sizeof(struct rte_esp_hdr));
> +	algo = sa->algo_type;
> +
> +	switch (algo) {
> +	case ALGO_TYPE_AES_GCM:
> +		gcm = (struct aead_gcm_iv *)iv;
> +		aead_gcm_iv_fill(gcm, ivp[0], sa->salt);
> +		*aad = icv->va + sa->icv_len;
> +		off = sa->ctp.cipher.offset + pofs;
> +		break;
> +	case ALGO_TYPE_AES_CBC:
> +	case ALGO_TYPE_3DES_CBC:
> +		off = sa->ctp.auth.offset + pofs;
> +		break;
> +	case ALGO_TYPE_AES_CTR:
> +		off = sa->ctp.auth.offset + pofs;
> +		ctr = (struct aesctr_cnt_blk *)iv;
> +		aes_ctr_cnt_blk_fill(ctr, ivp[0], sa->salt);
> +		break;
> +	case ALGO_TYPE_NULL:
> +		break;
> +	}
> +
> +	*digest = icv->va;
> +
> +	left = plen - sa->ctp.cipher.length;
> +
> +	ms = mbuf_get_seg_ofs(mb, &off);
> +	if (!ms)
> +		return -1;

Same as for outbound: I think no need to check/return failure.
This function could be split into two.

> +
> +	while (n_seg < RTE_LIBRTE_IP_FRAG_MAX_FRAG && left && ms) {


Same thing - we shouldn't limit ourselves to 5 segs per packet.
Pretty much same comments about code restructuring as for outbound case.

> +		uint32_t len = RTE_MIN(left, ms->data_len - off);
> +
> +		vec->iov_base = rte_pktmbuf_mtod_offset(ms, void *, off);
> +		vec->iov_len = len;
> +
> +		left -= len;
> +		vec++;
> +		n_seg++;
> +		ms = ms->next;
> +		off = 0;
> +	}
> +
> +	if (left)
> +		return -1;
> +
> +	buf->vec = cur_vec;
> +	buf->num = n_seg;
> +
> +	return n_seg;
> +}
> +
>  /*
>   * Helper function for prepare() to deal with situation when
>   * ICV is spread by two segments. Tries to move ICV completely into the
> @@ -512,7 +579,6 @@ tun_process(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb[],
>  	return k;
>  }
> 
> -
>  /*
>   * *process* function for tunnel packets
>   */
> @@ -625,6 +691,112 @@ esp_inb_pkt_process(struct rte_ipsec_sa *sa, struct rte_mbuf *mb[],
>  	return n;
>  }
> 
> +/*
> + * process packets using sync crypto engine
> + */
> +static uint16_t
> +esp_inb_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num, uint8_t sqh_len,
> +		esp_inb_process_t process)
> +{
> +	int32_t rc;
> +	uint32_t i, k, hl, n, p;
> +	struct rte_ipsec_sa *sa;
> +	struct replay_sqn *rsn;
> +	union sym_op_data icv;
> +	uint32_t sqn[num];
> +	uint32_t dr[num];
> +	struct rte_security_vec buf[num];
> +	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
> +	uint32_t vec_idx = 0;
> +	uint8_t ivs[num][IPSEC_MAX_IV_SIZE];
> +	void *iv[num];
> +	void *aad[num];
> +	void *digest[num];
> +	int status[num];
> +
> +	sa = ss->sa;
> +	rsn = rsn_acquire(sa);
> +
> +	k = 0;
> +	for (i = 0; i != num; i++) {
> +		hl = mb[i]->l2_len + mb[i]->l3_len;
> +		rc = inb_pkt_prepare(sa, rsn, mb[i], hl, &icv);
> +		if (rc >= 0) {
> +			iv[k] = (void *)ivs[k];
> +			rc = inb_sync_crypto_proc_prepare(sa, mb[i], &icv, hl,
> +					rc, &buf[k], &vec[vec_idx], iv[k],
> +					&aad[k], &digest[k]);
> +			if (rc < 0) {
> +				dr[i - k] = i;
> +				continue;
> +			}
> +
> +			vec_idx += rc;
> +			k++;
> +		} else
> +			dr[i - k] = i;
> +	}
> +
> +	/* copy not prepared mbufs beyond good ones */
> +	if (k != num) {
> +		rte_errno = EBADMSG;
> +
> +		if (unlikely(k == 0))
> +			return 0;
> +
> +		move_bad_mbufs(mb, dr, num, num - k);
> +	}
> +
> +	/* process the packets */
> +	n = 0;
> +	rte_security_process_cpu_crypto_bulk(ss->security.ctx,
> +			ss->security.ses, buf, iv, aad, digest, status,
> +			k);
> +	/* move failed process packets to dr */
> +	for (i = 0; i < k; i++) {
> +		if (status[i]) {
> +			dr[n++] = i;
> +			rte_errno = EBADMSG;
> +		}
> +	}
> +
> +	/* move bad packets to the back */
> +	if (n)
> +		move_bad_mbufs(mb, dr, k, n);

I don't think you need to set dr[] here and call that function, see below.

> +
> +	/* process packets */
> +	p = process(sa, mb, sqn, dr, k - n, sqh_len);

tun_process(), etc. expects PKT_RX_SEC_OFFLOAD_FAILED to be set in mb->ol_flags
for failed packets.
So you either need to set this value in ol_flags based on status,
or tweak existing process functions, or introduce new ones.


> +
> +	if (p != k - n && p != 0)
> +		move_bad_mbufs(mb, dr, k - n, k - n - p);
> +
> +	if (p != num)
> +		rte_errno = EBADMSG;
> +
> +	return p;
> +}
> +
> +uint16_t
> +esp_inb_tun_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num)
> +{
> +	struct rte_ipsec_sa *sa = ss->sa;
> +
> +	return esp_inb_sync_crypto_pkt_process(ss, mb, num, sa->sqh_len,
> +			tun_process);
> +}
> +
> +uint16_t
> +esp_inb_trs_sync_crypto_pkt_process(const struct rte_ipsec_session *ss,
> +		struct rte_mbuf *mb[], uint16_t num)
> +{
> +	struct rte_ipsec_sa *sa = ss->sa;
> +
> +	return esp_inb_sync_crypto_pkt_process(ss, mb, num, sa->sqh_len,
> +			trs_process);
> +}
> +
>  /*
>   * process group of ESP inbound tunnel packets.
>   */


* Re: [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API
  2019-09-06 13:13   ` [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API Fan Zhang
  2019-09-18 12:45     ` Ananyev, Konstantin
@ 2019-09-29  6:00     ` Hemant Agrawal
  2019-09-29 16:59       ` Ananyev, Konstantin
  1 sibling, 1 reply; 84+ messages in thread
From: Hemant Agrawal @ 2019-09-29  6:00 UTC (permalink / raw)
  To: Fan Zhang, dev; +Cc: konstantin.ananyev, declan.doherty, Akhil Goyal

Some comments inline.

On 06-Sep-19 6:43 PM, Fan Zhang wrote:
> This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action type to
> security library. The type represents performing crypto operation with CPU
> cycles. The patch also includes a new API to process crypto operations in
> bulk and the function pointers for PMDs.
>
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>   lib/librte_security/rte_security.c           | 16 +++++++++
>   lib/librte_security/rte_security.h           | 51 +++++++++++++++++++++++++++-
>   lib/librte_security/rte_security_driver.h    | 19 +++++++++++
>   lib/librte_security/rte_security_version.map |  1 +
>   4 files changed, 86 insertions(+), 1 deletion(-)
>
> diff --git a/lib/librte_security/rte_security.c b/lib/librte_security/rte_security.c
> index bc81ce15d..0f85c1b59 100644
> --- a/lib/librte_security/rte_security.c
> +++ b/lib/librte_security/rte_security.c
> @@ -141,3 +141,19 @@ rte_security_capability_get(struct rte_security_ctx *instance,
>   
>   	return NULL;
>   }
> +
> +void
> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> +		struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num)
> +{
> +	uint32_t i;
> +
> +	for (i = 0; i < num; i++)
> +		status[i] = -1;
> +
> +	RTE_FUNC_PTR_OR_RET(*instance->ops->process_cpu_crypto_bulk);
> +	instance->ops->process_cpu_crypto_bulk(sess, buf, iv,
> +			aad, digest, status, num);
> +}
> diff --git a/lib/librte_security/rte_security.h b/lib/librte_security/rte_security.h
> index 96806e3a2..5a0f8901b 100644
> --- a/lib/librte_security/rte_security.h
> +++ b/lib/librte_security/rte_security.h
> @@ -18,6 +18,7 @@ extern "C" {
>   #endif
>   
>   #include <sys/types.h>
> +#include <sys/uio.h>
>   
>   #include <netinet/in.h>
>   #include <netinet/ip.h>
> @@ -272,6 +273,20 @@ struct rte_security_pdcp_xform {
>   	uint32_t hfn_threshold;
>   };
>   
> +struct rte_security_cpu_crypto_xform {
> +	/** For cipher/authentication crypto operation the authentication may
> +	 * cover more content than the cipher. E.g., for IPSec ESP encryption
> +	 * with AES-CBC and SHA1-HMAC, the encryption happens after the ESP
> +	 * header but the whole packet (apart from the MAC header) is authenticated.
> +	 * The cipher_offset field is used to derive the cipher data pointer
> +	 * from the buffer to be processed.
> +	 *
> +	 * NOTE this parameter shall be ignored by AEAD algorithms, since they
> +	 * use the same offset for cipher and authentication.
> +	 */
> +	int32_t cipher_offset;
> +};
> +
>   /**
>    * Security session action type.
>    */
> @@ -286,10 +301,14 @@ enum rte_security_session_action_type {
>   	/**< All security protocol processing is performed inline during
>   	 * transmission
>   	 */
> -	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
> +	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
>   	/**< All security protocol processing including crypto is performed
>   	 * on a lookaside accelerator
>   	 */
> +	RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> +	/**< Crypto processing for security protocol is processed by CPU
> +	 * synchronously
> +	 */
Though you are naming it cpu crypto, it is more like raw packet
crypto, where you want to skip mbuf/crypto ops and work directly
on raw buffers.
>   };
>   
>   /** Security session protocol definition */
> @@ -315,6 +334,7 @@ struct rte_security_session_conf {
>   		struct rte_security_ipsec_xform ipsec;
>   		struct rte_security_macsec_xform macsec;
>   		struct rte_security_pdcp_xform pdcp;
> +		struct rte_security_cpu_crypto_xform cpucrypto;
>   	};
>   	/**< Configuration parameters for security session */
>   	struct rte_crypto_sym_xform *crypto_xform;
> @@ -639,6 +659,35 @@ const struct rte_security_capability *
>   rte_security_capability_get(struct rte_security_ctx *instance,
>   			    struct rte_security_capability_idx *idx);
>   
> +/**
> + * Security vector structure, contains pointer to vector array and the length
> + * of the array
> + */
> +struct rte_security_vec {
> +	struct iovec *vec;
> +	uint32_t num;
> +};
> +

Just wondering if you want to change it to *in_vec and *out_vec; that
would be helpful in the future if out-of-place processing is required
for the CPU use case as well.

> +/**
> + * Processing bulk crypto workload with CPU
> + *
> + * @param	instance	security instance.
> + * @param	sess		security session
> + * @param	buf		array of buffer SGL vectors
> + * @param	iv		array of IV pointers
> + * @param	aad		array of AAD pointers
> + * @param	digest		array of digest pointers
> + * @param	status		array of status for the function to return
> + * @param	num		number of elements in each array
> + *
> + */
> +__rte_experimental
> +void
> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> +		struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num);
> +

Why not make the return type int, to indicate whether this API failed
completely, processed everything, or has some valid status to look into?


>   #ifdef __cplusplus
>   }
>   #endif
> diff --git a/lib/librte_security/rte_security_driver.h b/lib/librte_security/rte_security_driver.h
> index 1b561f852..70fcb0c26 100644
> --- a/lib/librte_security/rte_security_driver.h
> +++ b/lib/librte_security/rte_security_driver.h
> @@ -132,6 +132,23 @@ typedef int (*security_get_userdata_t)(void *device,
>   typedef const struct rte_security_capability *(*security_capabilities_get_t)(
>   		void *device);
>   
> +/**
> + * Process security operations in bulk using CPU accelerated method.
> + *
> + * @param	sess		Security session structure.
> + * @param	buf		Buffer to the vectors to be processed.
> + * @param	iv		IV pointers.
> + * @param	aad		AAD pointers.
> + * @param	digest		Digest pointers.
> + * @param	status		Array of status value.
> + * @param	num		Number of elements in each array.
> + */
> +
> +typedef void (*security_process_cpu_crypto_bulk_t)(
> +		struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num);
> +
>   /** Security operations function pointer table */
>   struct rte_security_ops {
>   	security_session_create_t session_create;
> @@ -150,6 +167,8 @@ struct rte_security_ops {
>   	/**< Get userdata associated with session which processed the packet. */
>   	security_capabilities_get_t capabilities_get;
>   	/**< Get security capabilities. */
> +	security_process_cpu_crypto_bulk_t process_cpu_crypto_bulk;
> +	/**< Process data in bulk. */
>   };
>   
>   #ifdef __cplusplus
> diff --git a/lib/librte_security/rte_security_version.map b/lib/librte_security/rte_security_version.map
> index 53267bf3c..2132e7a00 100644
> --- a/lib/librte_security/rte_security_version.map
> +++ b/lib/librte_security/rte_security_version.map
> @@ -18,4 +18,5 @@ EXPERIMENTAL {
>   	rte_security_get_userdata;
>   	rte_security_session_stats_get;
>   	rte_security_session_update;
> +	rte_security_process_cpu_crypto_bulk;
>   };


* Re: [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API
  2019-09-29  6:00     ` Hemant Agrawal
@ 2019-09-29 16:59       ` Ananyev, Konstantin
  2019-09-30  9:43         ` Hemant Agrawal
  0 siblings, 1 reply; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-29 16:59 UTC (permalink / raw)
  To: Hemant Agrawal, Zhang, Roy Fan, dev; +Cc: Doherty, Declan, Akhil Goyal

Hi Hemant,

> 
> On 06-Sep-19 6:43 PM, Fan Zhang wrote:
> > This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action type to
> > security library. The type represents performing crypto operation with CPU
> > cycles. The patch also includes a new API to process crypto operations in
> > bulk and the function pointers for PMDs.
> >
> > Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> > ---
> >   lib/librte_security/rte_security.c           | 16 +++++++++
> >   lib/librte_security/rte_security.h           | 51 +++++++++++++++++++++++++++-
> >   lib/librte_security/rte_security_driver.h    | 19 +++++++++++
> >   lib/librte_security/rte_security_version.map |  1 +
> >   4 files changed, 86 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/librte_security/rte_security.c b/lib/librte_security/rte_security.c
> > index bc81ce15d..0f85c1b59 100644
> > --- a/lib/librte_security/rte_security.c
> > +++ b/lib/librte_security/rte_security.c
> > @@ -141,3 +141,19 @@ rte_security_capability_get(struct rte_security_ctx *instance,
> >
> >   	return NULL;
> >   }
> > +
> > +void
> > +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> > +		struct rte_security_session *sess,
> > +		struct rte_security_vec buf[], void *iv[], void *aad[],
> > +		void *digest[], int status[], uint32_t num)
> > +{
> > +	uint32_t i;
> > +
> > +	for (i = 0; i < num; i++)
> > +		status[i] = -1;
> > +
> > +	RTE_FUNC_PTR_OR_RET(*instance->ops->process_cpu_crypto_bulk);
> > +	instance->ops->process_cpu_crypto_bulk(sess, buf, iv,
> > +			aad, digest, status, num);
> > +}
> > diff --git a/lib/librte_security/rte_security.h b/lib/librte_security/rte_security.h
> > index 96806e3a2..5a0f8901b 100644
> > --- a/lib/librte_security/rte_security.h
> > +++ b/lib/librte_security/rte_security.h
> > @@ -18,6 +18,7 @@ extern "C" {
> >   #endif
> >
> >   #include <sys/types.h>
> > +#include <sys/uio.h>
> >
> >   #include <netinet/in.h>
> >   #include <netinet/ip.h>
> > @@ -272,6 +273,20 @@ struct rte_security_pdcp_xform {
> >   	uint32_t hfn_threshold;
> >   };
> >
> > +struct rte_security_cpu_crypto_xform {
> > +	/** For a cipher/authentication crypto operation the authentication may
> > +	 * cover more content than the cipher. E.g., for IPsec ESP encryption
> > +	 * with AES-CBC and SHA1-HMAC, the encryption happens after the ESP
> > +	 * header but the whole packet (apart from the MAC header) is authenticated.
> > +	 * The cipher_offset field is used to derive the cipher data pointer
> > +	 * from the buffer to be processed.
> > +	 *
> > +	 * NOTE: this parameter shall be ignored by AEAD algorithms, since they
> > +	 * use the same offset for cipher and authentication.
> > +	 */
> > +	int32_t cipher_offset;
> > +};
> > +
> >   /**
> >    * Security session action type.
> >    */
> > @@ -286,10 +301,14 @@ enum rte_security_session_action_type {
> >   	/**< All security protocol processing is performed inline during
> >   	 * transmission
> >   	 */
> > -	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
> > +	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
> >   	/**< All security protocol processing including crypto is performed
> >   	 * on a lookaside accelerator
> >   	 */
> > +	RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > +	/**< Crypto processing for security protocol is processed by CPU
> > +	 * synchronously
> > +	 */
> Though you are naming it cpu crypto, it is more like raw packet
> crypto, where you want to skip mbuf/crypto ops and work directly
> on raw buffers.

Yes, but we do want to do that (skip mbuf/crypto ops and use raw buffers),
because this API is destined for SW-backed implementations.
For that case crypto-ops, mbufs and enqueue/dequeue are just unnecessary overhead.

> >   };
> >
> >   /** Security session protocol definition */
> > @@ -315,6 +334,7 @@ struct rte_security_session_conf {
> >   		struct rte_security_ipsec_xform ipsec;
> >   		struct rte_security_macsec_xform macsec;
> >   		struct rte_security_pdcp_xform pdcp;
> > +		struct rte_security_cpu_crypto_xform cpucrypto;
> >   	};
> >   	/**< Configuration parameters for security session */
> >   	struct rte_crypto_sym_xform *crypto_xform;
> > @@ -639,6 +659,35 @@ const struct rte_security_capability *
> >   rte_security_capability_get(struct rte_security_ctx *instance,
> >   			    struct rte_security_capability_idx *idx);
> >
> > +/**
> > + * Security vector structure, contains pointer to vector array and the length
> > + * of the array
> > + */
> > +struct rte_security_vec {
> > +	struct iovec *vec;
> > +	uint32_t num;
> > +};
> > +
> 
> Just wondering if you want to change it to *in_vec and *out_vec; that
> will be helpful in the future if out-of-place processing is required
> for the CPU use case as well?

I suppose this is doable, though right now we don't plan to support such a model.

> 
> > +/**
> > + * Processing bulk crypto workload with CPU
> > + *
> > + * @param	instance	security instance.
> > + * @param	sess		security session
> > + * @param	buf		array of buffer SGL vectors
> > + * @param	iv		array of IV pointers
> > + * @param	aad		array of AAD pointers
> > + * @param	digest		array of digest pointers
> > + * @param	status		array of status for the function to return
> > + * @param	num		number of elements in each array
> > + *
> > + */
> > +__rte_experimental
> > +void
> > +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> > +		struct rte_security_session *sess,
> > +		struct rte_security_vec buf[], void *iv[], void *aad[],
> > +		void *digest[], int status[], uint32_t num);
> > +
> 
> Why not make the return type int, to indicate whether this API failed
> completely, processed everything, or has some valid status to look into?

Good point, will change as suggested.

> 
> 
> >   #ifdef __cplusplus
> >   }
> >   #endif
> > diff --git a/lib/librte_security/rte_security_driver.h b/lib/librte_security/rte_security_driver.h
> > index 1b561f852..70fcb0c26 100644
> > --- a/lib/librte_security/rte_security_driver.h
> > +++ b/lib/librte_security/rte_security_driver.h
> > @@ -132,6 +132,23 @@ typedef int (*security_get_userdata_t)(void *device,
> >   typedef const struct rte_security_capability *(*security_capabilities_get_t)(
> >   		void *device);
> >
> > +/**
> > + * Process security operations in bulk using CPU accelerated method.
> > + *
> > + * @param	sess		Security session structure.
> > + * @param	buf		Buffer to the vectors to be processed.
> > + * @param	iv		IV pointers.
> > + * @param	aad		AAD pointers.
> > + * @param	digest		Digest pointers.
> > + * @param	status		Array of status value.
> > + * @param	num		Number of elements in each array.
> > + */
> > +
> > +typedef void (*security_process_cpu_crypto_bulk_t)(
> > +		struct rte_security_session *sess,
> > +		struct rte_security_vec buf[], void *iv[], void *aad[],
> > +		void *digest[], int status[], uint32_t num);
> > +
> >   /** Security operations function pointer table */
> >   struct rte_security_ops {
> >   	security_session_create_t session_create;
> > @@ -150,6 +167,8 @@ struct rte_security_ops {
> >   	/**< Get userdata associated with session which processed the packet. */
> >   	security_capabilities_get_t capabilities_get;
> >   	/**< Get security capabilities. */
> > +	security_process_cpu_crypto_bulk_t process_cpu_crypto_bulk;
> > +	/**< Process data in bulk. */
> >   };
> >
> >   #ifdef __cplusplus
> > diff --git a/lib/librte_security/rte_security_version.map b/lib/librte_security/rte_security_version.map
> > index 53267bf3c..2132e7a00 100644
> > --- a/lib/librte_security/rte_security_version.map
> > +++ b/lib/librte_security/rte_security_version.map
> > @@ -18,4 +18,5 @@ EXPERIMENTAL {
> >   	rte_security_get_userdata;
> >   	rte_security_session_stats_get;
> >   	rte_security_session_update;
> > +	rte_security_process_cpu_crypto_bulk;
> >   };

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API
  2019-09-29 16:59       ` Ananyev, Konstantin
@ 2019-09-30  9:43         ` Hemant Agrawal
  2019-10-01 15:27           ` Ananyev, Konstantin
  0 siblings, 1 reply; 84+ messages in thread
From: Hemant Agrawal @ 2019-09-30  9:43 UTC (permalink / raw)
  To: Ananyev, Konstantin, Zhang, Roy Fan, dev; +Cc: Doherty, Declan, Akhil Goyal

Hi Konstantin,

On 06-Sep-19 6:43 PM, Fan Zhang wrote:
>>> This patch introduces a new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action type
>>> to the security library. The type represents performing crypto operations with
>>> CPU cycles. The patch also includes a new API to process crypto operations in
>>> bulk and the function pointers for PMDs.
>>>
>>> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
>>> ---
>>>    lib/librte_security/rte_security.c           | 16 +++++++++
>>>    lib/librte_security/rte_security.h           | 51 +++++++++++++++++++++++++++-
>>>    lib/librte_security/rte_security_driver.h    | 19 +++++++++++
>>>    lib/librte_security/rte_security_version.map |  1 +
>>>    4 files changed, 86 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/lib/librte_security/rte_security.c b/lib/librte_security/rte_security.c
>>> index bc81ce15d..0f85c1b59 100644
>>> --- a/lib/librte_security/rte_security.c
>>> +++ b/lib/librte_security/rte_security.c
>>> @@ -141,3 +141,19 @@ rte_security_capability_get(struct rte_security_ctx *instance,
>>>
>>>    	return NULL;
>>>    }
>>> +
>>> +void
>>> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
>>> +		struct rte_security_session *sess,
>>> +		struct rte_security_vec buf[], void *iv[], void *aad[],
>>> +		void *digest[], int status[], uint32_t num)
>>> +{
>>> +	uint32_t i;
>>> +
>>> +	for (i = 0; i < num; i++)
>>> +		status[i] = -1;
>>> +
>>> +	RTE_FUNC_PTR_OR_RET(*instance->ops->process_cpu_crypto_bulk);
>>> +	instance->ops->process_cpu_crypto_bulk(sess, buf, iv,
>>> +			aad, digest, status, num);
>>> +}
>>> diff --git a/lib/librte_security/rte_security.h b/lib/librte_security/rte_security.h
>>> index 96806e3a2..5a0f8901b 100644
>>> --- a/lib/librte_security/rte_security.h
>>> +++ b/lib/librte_security/rte_security.h
>>> @@ -18,6 +18,7 @@ extern "C" {
>>>    #endif
>>>
>>>    #include <sys/types.h>
>>> +#include <sys/uio.h>
>>>
>>>    #include <netinet/in.h>
>>>    #include <netinet/ip.h>
>>> @@ -272,6 +273,20 @@ struct rte_security_pdcp_xform {
>>>    	uint32_t hfn_threshold;
>>>    };
>>>
>>> +struct rte_security_cpu_crypto_xform {
>>> +	/** For a cipher/authentication crypto operation the authentication may
>>> +	 * cover more content than the cipher. E.g., for IPsec ESP encryption
>>> +	 * with AES-CBC and SHA1-HMAC, the encryption happens after the ESP
>>> +	 * header but the whole packet (apart from the MAC header) is authenticated.
>>> +	 * The cipher_offset field is used to derive the cipher data pointer
>>> +	 * from the buffer to be processed.
>>> +	 *
>>> +	 * NOTE: this parameter shall be ignored by AEAD algorithms, since they
>>> +	 * use the same offset for cipher and authentication.
>>> +	 */
>>> +	int32_t cipher_offset;
>>> +};
>>> +
>>>    /**
>>>     * Security session action type.
>>>     */
>>> @@ -286,10 +301,14 @@ enum rte_security_session_action_type {
>>>    	/**< All security protocol processing is performed inline during
>>>    	 * transmission
>>>    	 */
>>> -	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
>>> +	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
>>>    	/**< All security protocol processing including crypto is performed
>>>    	 * on a lookaside accelerator
>>>    	 */
>>> +	RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
>>> +	/**< Crypto processing for security protocol is processed by CPU
>>> +	 * synchronously
>>> +	 */
>> Though you are naming it cpu crypto, it is more like raw packet
>> crypto, where you want to skip mbuf/crypto ops and work directly
>> on raw buffers.
> Yes, but we do want to do that (skip mbuf/crypto ops and use raw buffers),
> because this API is destined for SW-backed implementations.
> For that case crypto-ops, mbufs and enqueue/dequeue are just unnecessary overhead.
I agree, we are also planning to take advantage of it for some specific
use-cases in the future.
>>>    };
>>>
>>>    /** Security session protocol definition */
>>> @@ -315,6 +334,7 @@ struct rte_security_session_conf {
>>>    		struct rte_security_ipsec_xform ipsec;
>>>    		struct rte_security_macsec_xform macsec;
>>>    		struct rte_security_pdcp_xform pdcp;
>>> +		struct rte_security_cpu_crypto_xform cpucrypto;
>>>    	};
>>>    	/**< Configuration parameters for security session */
>>>    	struct rte_crypto_sym_xform *crypto_xform;
>>> @@ -639,6 +659,35 @@ const struct rte_security_capability *
>>>    rte_security_capability_get(struct rte_security_ctx *instance,
>>>    			    struct rte_security_capability_idx *idx);
>>>
>>> +/**
>>> + * Security vector structure, contains pointer to vector array and the length
>>> + * of the array
>>> + */
>>> +struct rte_security_vec {
>>> +	struct iovec *vec;
>>> +	uint32_t num;
>>> +};
>>> +
>> Just wondering if you want to change it to *in_vec and *out_vec; that
>> will be helpful in the future if out-of-place processing is required
>> for the CPU use case as well?
> I suppose this is doable, though right now we don't plan to support such a model.
They will come in handy in the future. I plan to use them, and we can
avoid API/ABI breakage if the placeholders are present.
>
>>> +/**
>>> + * Processing bulk crypto workload with CPU
>>> + *
>>> + * @param	instance	security instance.
>>> + * @param	sess		security session
>>> + * @param	buf		array of buffer SGL vectors
>>> + * @param	iv		array of IV pointers
>>> + * @param	aad		array of AAD pointers
>>> + * @param	digest		array of digest pointers
>>> + * @param	status		array of status for the function to return
>>> + * @param	num		number of elements in each array
>>> + *
>>> + */
>>> +__rte_experimental
>>> +void
>>> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
>>> +		struct rte_security_session *sess,
>>> +		struct rte_security_vec buf[], void *iv[], void *aad[],
>>> +		void *digest[], int status[], uint32_t num);
>>> +
>> Why not make the return type int, to indicate whether this API failed
>> completely, processed everything, or has some valid status to look into?
> Good point, will change as suggested.

I have another suggestion w.r.t. iv, aad, digest, etc. Why not put them
in a structure, so that you will be able to add/remove variables without
breaking the API prototype?

>
>>
>>>    #ifdef __cplusplus
>>>    }
>>>    #endif
>>> diff --git a/lib/librte_security/rte_security_driver.h b/lib/librte_security/rte_security_driver.h
>>> index 1b561f852..70fcb0c26 100644
>>> --- a/lib/librte_security/rte_security_driver.h
>>> +++ b/lib/librte_security/rte_security_driver.h
>>> @@ -132,6 +132,23 @@ typedef int (*security_get_userdata_t)(void *device,
>>>    typedef const struct rte_security_capability *(*security_capabilities_get_t)(
>>>    		void *device);
>>>
>>> +/**
>>> + * Process security operations in bulk using CPU accelerated method.
>>> + *
>>> + * @param	sess		Security session structure.
>>> + * @param	buf		Buffer to the vectors to be processed.
>>> + * @param	iv		IV pointers.
>>> + * @param	aad		AAD pointers.
>>> + * @param	digest		Digest pointers.
>>> + * @param	status		Array of status value.
>>> + * @param	num		Number of elements in each array.
>>> + */
>>> +
>>> +typedef void (*security_process_cpu_crypto_bulk_t)(
>>> +		struct rte_security_session *sess,
>>> +		struct rte_security_vec buf[], void *iv[], void *aad[],
>>> +		void *digest[], int status[], uint32_t num);
>>> +
>>>    /** Security operations function pointer table */
>>>    struct rte_security_ops {
>>>    	security_session_create_t session_create;
>>> @@ -150,6 +167,8 @@ struct rte_security_ops {
>>>    	/**< Get userdata associated with session which processed the packet. */
>>>    	security_capabilities_get_t capabilities_get;
>>>    	/**< Get security capabilities. */
>>> +	security_process_cpu_crypto_bulk_t process_cpu_crypto_bulk;
>>> +	/**< Process data in bulk. */
>>>    };
>>>
>>>    #ifdef __cplusplus
>>> diff --git a/lib/librte_security/rte_security_version.map b/lib/librte_security/rte_security_version.map
>>> index 53267bf3c..2132e7a00 100644
>>> --- a/lib/librte_security/rte_security_version.map
>>> +++ b/lib/librte_security/rte_security_version.map
>>> @@ -18,4 +18,5 @@ EXPERIMENTAL {
>>>    	rte_security_get_userdata;
>>>    	rte_security_session_stats_get;
>>>    	rte_security_session_update;
>>> +	rte_security_process_cpu_crypto_bulk;
>>>    };

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-27  9:26                         ` Akhil Goyal
@ 2019-09-30 12:22                           ` Ananyev, Konstantin
  2019-09-30 13:43                             ` Akhil Goyal
  0 siblings, 1 reply; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-09-30 12:22 UTC (permalink / raw)
  To: Akhil Goyal, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph

Hi Akhil,

> > > > > > > > > > > This action type allows the burst of symmetric crypto workload
> > using
> > > > > the
> > > > > > > > > same
> > > > > > > > > > > algorithm, key, and direction being processed by CPU cycles
> > > > > > > synchronously.
> > > > > > > > > > > This flexible action type does not require external hardware
> > > > > involvement,
> > > > > > > > > > > having the crypto workload processed synchronously, and is
> > more
> > > > > > > > > performant
> > > > > > > > > > > than Cryptodev SW PMD due to the saved cycles on removed
> > "async
> > > > > > > mode
> > > > > > > > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > > > > > > > >
> > > > > > > > > > Does that mean application will not call the
> > cryptodev_enqueue_burst
> > > > > and
> > > > > > > > > corresponding dequeue burst.
> > > > > > > > >
> > > > > > > > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > > > > > > > >
> > > > > > > > > > It would be a new API something like process_packets and it will
> > have
> > > > > the
> > > > > > > > > crypto processed packets while returning from the API?
> > > > > > > > >
> > > > > > > > > Yes, though the plan is that API will operate on raw data buffers,
> > not
> > > > > mbufs.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > I still do not understand why we cannot do with the conventional
> > > > > crypto lib
> > > > > > > > > only.
> > > > > > > > > > As far as I can understand, you are not doing any protocol
> > processing
> > > > > or
> > > > > > > any
> > > > > > > > > value add
> > > > > > > > > > To the crypto processing. IMO, you just need a synchronous
> > crypto
> > > > > > > processing
> > > > > > > > > API which
> > > > > > > > > > Can be defined in cryptodev, you don't need to re-create a crypto
> > > > > session
> > > > > > > in
> > > > > > > > > the name of
> > > > > > > > > > Security session in the driver just to do a synchronous processing.
> > > > > > > > >
> > > > > > > > > I suppose your question is why not to have
> > > > > > > > > rte_crypot_process_cpu_crypto_bulk(...) instead?
> > > > > > > > > The main reason is that would require disruptive changes in existing
> > > > > > > cryptodev
> > > > > > > > > API
> > > > > > > > > (would cause ABI/API breakage).
> > > > > > > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need
> > some
> > > > > extra
> > > > > > > > > information
> > > > > > > > > that normal crypto_sym_xform doesn't contain
> > > > > > > > > (cipher offset from the start of the buffer, might be something extra
> > in
> > > > > > > future).
> > > > > > > >
> > > > > > > > Cipher offset will be part of rte_crypto_op.
> > > > > > >
> > > > > > > fill/read (+ alloc/free) is one of the main things that slowdown current
> > > > > crypto-op
> > > > > > > approach.
> > > > > > > That's why the general idea - have all data that wouldn't change from
> > packet
> > > > > to
> > > > > > > packet
> > > > > > > included into the session and setup it once at session_init().
> > > > > >
> > > > > > I agree that you cannot use crypto-op.
> > > > > > You can have the new API in crypto.
> > > > > > As per the current patch, you only need cipher_offset which you can have
> > it as
> > > > > a parameter until
> > > > > > You get it approved in the crypto xform. I believe it will be beneficial in
> > case of
> > > > > other crypto cases as well.
> > > > > > We can have cipher offset at both places(crypto-op and cipher_xform). It
> > will
> > > > > give flexibility to the user to
> > > > > > override it.
> > > > >
> > > > > After having another thought on your proposal:
> > > > > Probably we can introduce new rte_crypto_sym_xform_types for CPU
> > related
> > > > > stuff here?
> > > >
> > > > I also thought of adding new xforms, but that wont serve the purpose for
> > may be all the cases.
> > > > You would be needing all information currently available in the current
> > xforms.
> > > > So if you are adding new fields in the new xform, the size will be more than
> > that of the union of xforms.
> > > > ABI breakage would still be there.
> > > >
> > > > If you think a valid compression of the AEAD xform can be done, then that
> > can be done for each of the
> > > > Xforms and we can have a solution to this issue.
> > >
> > > I think that we can re-use iv.offset for our purposes (for crypto offset).
> > > So for now we can make that path work without any ABI breakage.
> > > Fan, please feel free to correct me here, if I missed something.
> > > If in future we would need to add some extra information it might
> > > require ABI breakage, though by now I don't envision anything particular to
> > add.
> > > Anyway, if there is no objection to go that way, we can try to make
> > > these changes for v2.
> > >
> >
> > Actually, after looking at it more deeply, it appears not as easy as I thought it
> > would be :)
> > Below is a very draft version of proposed API additions.
> > I think it avoids ABI breakages right now and provides enough flexibility for
> > future extensions (if any).
> > For now, it doesn't address your comments about naming conventions (_CPU_
> > vs _SYNC_) , etc.
> > but I suppose it is comprehensive enough to convey the main idea behind it.
> > Akhil and other interested parties, please try to review and provide feedback
> > ASAP,
> > as related changes would take some time and we still like to hit 19.11 deadline.
> > Konstantin
> >
> >  diff --git a/lib/librte_cryptodev/rte_crypto_sym.h
> > b/lib/librte_cryptodev/rte_crypto_sym.h
> > index bc8da2466..c03069e23 100644
> > --- a/lib/librte_cryptodev/rte_crypto_sym.h
> > +++ b/lib/librte_cryptodev/rte_crypto_sym.h
> > @@ -103,6 +103,9 @@ rte_crypto_cipher_operation_strings[];
> >   *
> >   * This structure contains data relating to Cipher (Encryption and Decryption)
> >   *  use to create a session.
> > + * Actually I was wrong saying that we don't have free space inside xforms.
> > + * Making key struct packed (see below) allows us to regain 6B that could be
> > + * used for future extensions.
> >   */
> >  struct rte_crypto_cipher_xform {
> >         enum rte_crypto_cipher_operation op;
> > @@ -116,7 +119,25 @@ struct rte_crypto_cipher_xform {
> >         struct {
> >                 const uint8_t *data;    /**< pointer to key data */
> >                 uint16_t length;        /**< key length in bytes */
> > -       } key;
> > +       } __attribute__((__packed__)) key;
> > +
> > +       /**
> > +         * offset for cipher to start within the user-provided data buffer.
> > +        * Fan suggested another (and less space-consuming) way -
> > +         * reuse iv.offset space below, by changing:
> > +        * struct {uint16_t offset, length;} iv;
> > +        * to unnamed union:
> > +        * union {
> > +        *      struct {uint16_t offset, length;} iv;
> > +        *      struct {uint16_t iv_len, crypto_offset} cpu_crypto_param;
> > +        * };
> > +        * Both approaches seems ok to me in general.
> 
> No strong opinions here. OK with this one.
> 
> > +        * Comments/suggestions are welcome.
> > +         */
> > +       uint16_t offset;

After another thought - it is probably a bit better to have offset as a separate field.
In that case we can use the same xforms to create both types of sessions.

> > +
> > +       uint8_t reserved1[4];
> > +
> >         /**< Cipher key
> >          *
> >          * For the RTE_CRYPTO_CIPHER_AES_F8 mode of operation, key.data will
> > @@ -284,7 +305,7 @@ struct rte_crypto_auth_xform {
> >         struct {
> >                 const uint8_t *data;    /**< pointer to key data */
> >                 uint16_t length;        /**< key length in bytes */
> > -       } key;
> > +       } __attribute__((__packed__)) key;
> >         /**< Authentication key data.
> >          * The authentication key length MUST be less than or equal to the
> >          * block size of the algorithm. It is the callers responsibility to
> > @@ -292,6 +313,8 @@ struct rte_crypto_auth_xform {
> >          * (for example RFC 2104, FIPS 198a).
> >          */
> >
> > +       uint8_t reserved1[6];
> > +
> >         struct {
> >                 uint16_t offset;
> >                 /**< Starting point for Initialisation Vector or Counter,
> > @@ -376,7 +399,12 @@ struct rte_crypto_aead_xform {
> >         struct {
> >                 const uint8_t *data;    /**< pointer to key data */
> >                 uint16_t length;        /**< key length in bytes */
> > -       } key;
> > +       } __attribute__((__packed__)) key;
> > +
> > +       /** offset for cipher to start within data buffer */
> > +       uint16_t cipher_offset;
> > +
> > +       uint8_t reserved1[4];
> >
> >         struct {
> >                 uint16_t offset;
> > diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> > b/lib/librte_cryptodev/rte_cryptodev.h
> > index e175b838c..c0c7bfed7 100644
> > --- a/lib/librte_cryptodev/rte_cryptodev.h
> > +++ b/lib/librte_cryptodev/rte_cryptodev.h
> > @@ -1272,6 +1272,101 @@ void *
> >  rte_cryptodev_sym_session_get_user_data(
> >                                         struct rte_cryptodev_sym_session *sess);
> >
> > +/*
> > + * After several thoughts decided not to try to squeeze CPU_CRYPTO
> > + * into existing rte_crypto_sym_session structure/API, but instead
> > + * introduce an extension to it via new fully opaque
> > + * struct rte_crypto_cpu_sym_session and additional related API.
> 
> 
> What all things do we need to squeeze?
> In this proposal I do not see the new struct cpu_sym_session  defined here.

The plan is to have it totally opaque to the user, i.e. just:
struct rte_crypto_cpu_sym_session;
in public header files.

> I believe you will have same lib API/struct for cpu_sym_session  and sym_session.

I thought about that approach, but there are a few things that look clumsy to me:
1. Right now there is no 'type' (or similar) field inside rte_cryptodev_sym_session,
so it is not possible to easily distinguish which session you have: lksd_sym or cpu_sym.
In theory there is a 4B hole inside rte_cryptodev_sym_session, so we could add an extra field
there, but in that case we wouldn't be able to use the same xform for both lksd_sym and cpu_sym
(and using the same xform for both seems a really desirable thing to me).
2. The majority of rte_cryptodev_sym_session fields are, I think, unnecessary for rte_crypto_cpu_sym_session:
sess_data[], opaque_data, user_data, nb_drivers.
All of that consumes space that could be used elsewhere instead.
3. I am a bit reluctant to touch the existing rte_cryptodev API, to avoid any breakages I can't foresee right now.
On the other hand, if we add new functions/structs for cpu_sym_session, we can mark them
experimental and keep them that way for some time, so further changes (if needed) would still be possible.

> I am not sure if that would be needed.
> It would be internal to the driver that if synchronous processing is supported(from feature flag) and
> Have relevant fields in xform(the newly added ones which are packed as per your suggestions) set,
> It will create that type of session.
> 
> 
> > + * Main points:
> > + * - Current crypto-dev API is reasonably mature and it is desirable
> > + *   to keep it unchanged (API/ABI stability). On the other hand, this
> > + *   sync API is a new one and would probably require extra changes.
> > + *   Having it as a new one allows to mark it as experimental, without
> > + *   affecting existing one.
> > + * - Fully opaque cpu_sym_session structure gives more flexibility
> > + *   to the PMD writers and again allows to avoid ABI breakages in future.
> > + * - process() function per set of xforms
> > + *   allows exposing different process() functions for different
> > + *   xform combinations. The PMD writer can decide whether to
> > + *   push all supported algorithms into one process() function,
> > + *   or spread them across several ones.
> > + *   I.e. more flexibility for the PMD writer.
> 
> Which process function should be chosen is internal to PMD, how would that info
> be visible to the application or the library. These will get stored in the session private
> data. It would be upto the PMD writer, to store the per session process function in
> the session private data.
> 
> Process function would be a dev ops just like enc/deq operations and it should call
> The respective process API stored in the session private data.

That model (via devops) is possible, but it has several drawbacks from my perspective:

1. It means we'll need to pass dev_id as a parameter to the process() function.
Though in fact dev_id is not relevant information for us here
(all we need is a pointer to the session and a pointer to the function to call),
and I tried to avoid using it in data-path functions for that API.
2. As you pointed out, in that case there will be just one process() function per device.
So if a PMD would like to have several process() functions for different types of sessions
(let's say one per algorithm), the first thing it has to do inside its process() is read the
session data and, based on that, jump/call to a particular internal sub-routine.
Something like:
driver_id = get_pmd_driver_id();
priv_ses = ses->sess_data[driver_id];
Then either:
switch (priv_ses->alg) {case XXX: process_XXX(priv_ses, ...); break; ...}
OR
priv_ses->process(priv_ses, ...);

to select and call the proper function.
That looks like totally unnecessary overhead to me.
Though if we had the ability to query/extract some sort of session_ops based on the xform,
we could avoid this extra de-reference+jump/call step.

> 
> I am not sure if you would need a new session init API for this as nothing would be visible to
> the app or lib.
> 
> > + * - Not storing process() pointer inside the session -
> > + *   Allows the user to choose whether to store a process() pointer
> > + *   per session, or per group of sessions for that device that share
> > + *   the same input xforms. I.E. extra flexibility for the user,
> > + *   plus allows us to keep cpu_sym_session totally opaque, see above.
> 
> If multiple sessions need to be processed via the same process function,
> PMD would save the same process in all the sessions, I don't think there would
> be any perf overhead with that.

I think it would, see above.

> 
> > + * Sketched usage model:
> > + * ....
> > + * /* control path, alloc/init session */
> > + * int32_t sz = rte_crypto_cpu_sym_session_size(dev_id, &xform);
> > + * struct rte_crypto_cpu_sym_session *ses = user_alloc(..., sz);
> > + * rte_crypto_cpu_sym_process_t process =
> > + *     rte_crypto_cpu_sym_session_func(dev_id, &xform);
> > + * rte_crypto_cpu_sym_session_init(dev_id, ses, &xform);
> > + * ...
> > + * /* data-path*/
> > + * process(ses, ....);
> > + * ....
> > + * /* control path, terminate/free session */
> > + * rte_crypto_cpu_sym_session_fini(dev_id, ses);
> > + */
> > +
> > +/**
> > + * vector structure, contains pointer to vector array and the length
> > + * of the array
> > + */
> > +struct rte_crypto_vec {
> > +       struct iovec *vec;
> > +       uint32_t num;
> > +};
> > +
> > +/*
> > + * Data-path bulk process crypto function.
> > + */
> > +typedef void (*rte_crypto_cpu_sym_process_t)(
> > +               struct rte_crypto_cpu_sym_session *sess,
> > +               struct rte_crypto_vec buf[], void *iv[], void *aad[],
> > +               void *digest[], int status[], uint32_t num);
> > +/*
> > + * for given device return process function specific to input xforms
> > + * on error - return NULL and set rte_errno value.
> > + * Note that for the same input xforms the same device should return
> > + * the same process function.
> > + */
> > +__rte_experimental
> > +rte_crypto_cpu_sym_process_t
> > +rte_crypto_cpu_sym_session_func(uint8_t dev_id,
> > +                       const struct rte_crypto_sym_xform *xforms);
> > +
> > +/*
> > + * Return required session size in bytes for given set of xforms.
> > + * if xforms == NULL, then return the max possible session size,
> > + * one that would fit a session for any algorithm supported by the device.
> > + * if CPU mode is not supported at all, or the algorithm requested in the
> > + * xform is not supported, then return -ENOTSUP.
> > + */
> > +__rte_experimental
> > +int
> > +rte_crypto_cpu_sym_session_size(uint8_t dev_id,
> > +                       const struct rte_crypto_sym_xform *xforms);
> > +
> > +/*
> > + * Initialize session.
> > + * It is the caller's responsibility to allocate enough space for it.
> > + * See rte_crypto_cpu_sym_session_size above.
> > + */
> > +__rte_experimental
> > +int rte_crypto_cpu_sym_session_init(uint8_t dev_id,
> > +                       struct rte_crypto_cpu_sym_session *sess,
> > +                       const struct rte_crypto_sym_xform *xforms);
> > +
> > +__rte_experimental
> > +void
> > +rte_crypto_cpu_sym_session_fini(uint8_t dev_id,
> > +                       struct rte_crypto_cpu_sym_session *sess);
> > +
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > index defe05ea0..ed7e63fab 100644
> > --- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > +++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > @@ -310,6 +310,20 @@ typedef void (*cryptodev_sym_free_session_t)(struct
> > rte_cryptodev *dev,
> >  typedef void (*cryptodev_asym_free_session_t)(struct rte_cryptodev *dev,
> >                 struct rte_cryptodev_asym_session *sess);
> >
> > +typedef int (*cryptodev_cpu_sym_session_size_t) (struct rte_cryptodev *dev,
> > +                       const struct rte_crypto_sym_xform *xforms);
> > +
> > +typedef int (*cryptodev_cpu_sym_session_init_t) (struct rte_cryptodev *dev,
> > +                       struct rte_crypto_cpu_sym_session *sess,
> > +                       const struct rte_crypto_sym_xform *xforms);
> > +
> > +typedef void (*cryptodev_cpu_sym_session_fini_t) (struct rte_cryptodev *dev,
> > +                       struct rte_crypto_cpu_sym_session *sess);
> > +
> > +typedef rte_crypto_cpu_sym_process_t (*cryptodev_cpu_sym_session_func_t)
> > (
> > +                       struct rte_cryptodev *dev,
> > +                       const struct rte_crypto_sym_xform *xforms);
> > +
> >  /** Crypto device operations function pointer table */
> >  struct rte_cryptodev_ops {
> >         cryptodev_configure_t dev_configure;    /**< Configure device. */
> > @@ -343,6 +357,11 @@ struct rte_cryptodev_ops {
> >         /**< Clear a Crypto sessions private data. */
> >         cryptodev_asym_free_session_t asym_session_clear;
> >         /**< Clear a Crypto sessions private data. */
> > +
> > +       cryptodev_cpu_sym_session_size_t sym_cpu_session_get_size;
> > +       cryptodev_cpu_sym_session_func_t sym_cpu_session_get_func;
> > +       cryptodev_cpu_sym_session_init_t sym_cpu_session_init;
> > +       cryptodev_cpu_sym_session_fini_t sym_cpu_session_fini;
> >  };
> >
> >
> >


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-30 12:22                           ` Ananyev, Konstantin
@ 2019-09-30 13:43                             ` Akhil Goyal
  2019-10-01 14:49                               ` Ananyev, Konstantin
  0 siblings, 1 reply; 84+ messages in thread
From: Akhil Goyal @ 2019-09-30 13:43 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph


Hi Konstantin,
> 
> Hi Akhil,
> 
> > > > > > > > > > > > This action type allows the burst of symmetric crypto
> workload
> > > using
> > > > > > the
> > > > > > > > > > same
> > > > > > > > > > > > algorithm, key, and direction being processed by CPU cycles
> > > > > > > > synchronously.
> > > > > > > > > > > > This flexible action type does not require external hardware
> > > > > > involvement,
> > > > > > > > > > > > having the crypto workload processed synchronously, and is
> > > more
> > > > > > > > > > performant
> > > > > > > > > > > > than Cryptodev SW PMD due to the saved cycles on removed
> > > "async
> > > > > > > > mode
> > > > > > > > > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > > > > > > > > >
> > > > > > > > > > > Does that mean application will not call the
> > > cryptodev_enqueue_burst
> > > > > > and
> > > > > > > > > > corresponding dequeue burst.
> > > > > > > > > >
> > > > > > > > > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > > > > > > > > >
> > > > > > > > > > > It would be a new API something like process_packets and it
> will
> > > have
> > > > > > the
> > > > > > > > > > crypto processed packets while returning from the API?
> > > > > > > > > >
> > > > > > > > > > Yes, though the plan is that API will operate on raw data buffers,
> > > not
> > > > > > mbufs.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > I still do not understand why we cannot do with the
> conventional
> > > > > > crypto lib
> > > > > > > > > > only.
> > > > > > > > > > > As far as I can understand, you are not doing any protocol
> > > processing
> > > > > > or
> > > > > > > > any
> > > > > > > > > > value add
> > > > > > > > > > > To the crypto processing. IMO, you just need a synchronous
> > > crypto
> > > > > > > > processing
> > > > > > > > > > API which
> > > > > > > > > > > Can be defined in cryptodev, you don't need to re-create a
> crypto
> > > > > > session
> > > > > > > > in
> > > > > > > > > > the name of
> > > > > > > > > > > Security session in the driver just to do a synchronous
> processing.
> > > > > > > > > >
> > > > > > > > > > I suppose your question is why not to have
> > > > > > > > > > rte_crypot_process_cpu_crypto_bulk(...) instead?
> > > > > > > > > > The main reason is that would require disruptive changes in
> existing
> > > > > > > > cryptodev
> > > > > > > > > > API
> > > > > > > > > > (would cause ABI/API breakage).
> > > > > > > > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need
> > > some
> > > > > > extra
> > > > > > > > > > information
> > > > > > > > > > that normal crypto_sym_xform doesn't contain
> > > > > > > > > > (cipher offset from the start of the buffer, might be something
> extra
> > > in
> > > > > > > > future).
> > > > > > > > >
> > > > > > > > > Cipher offset will be part of rte_crypto_op.
> > > > > > > >
> > > > > > > > fill/read (+ alloc/free) is one of the main things that slowdown
> current
> > > > > > crypto-op
> > > > > > > > approach.
> > > > > > > > That's why the general idea - have all data that wouldn't change
> from
> > > packet
> > > > > > to
> > > > > > > > packet
> > > > > > > > included into the session and setup it once at session_init().
> > > > > > >
> > > > > > > I agree that you cannot use crypto-op.
> > > > > > > You can have the new API in crypto.
> > > > > > > As per the current patch, you only need cipher_offset which you can
> have
> > > it as
> > > > > > a parameter until
> > > > > > > You get it approved in the crypto xform. I believe it will be beneficial
> in
> > > case of
> > > > > > other crypto cases as well.
> > > > > > > We can have cipher offset at both places(crypto-op and
> cipher_xform). It
> > > will
> > > > > > give flexibility to the user to
> > > > > > > override it.
> > > > > >
> > > > > > After having another thought on your proposal:
> > > > > > Probably we can introduce new rte_crypto_sym_xform_types for CPU
> > > related
> > > > > > stuff here?
> > > > >
> > > > > I also thought of adding new xforms, but that wont serve the purpose for
> > > may be all the cases.
> > > > > You would be needing all information currently available in the current
> > > xforms.
> > > > > So if you are adding new fields in the new xform, the size will be more
> than
> > > that of the union of xforms.
> > > > > ABI breakage would still be there.
> > > > >
> > > > > If you think a valid compression of the AEAD xform can be done, then
> that
> > > can be done for each of the
> > > > > Xforms and we can have a solution to this issue.
> > > >
> > > > I think that we can re-use iv.offset for our purposes (for crypto offset).
> > > > So for now we can make that path work without any ABI breakage.
> > > > Fan, please feel free to correct me here, if I missed something.
> > > > If in future we would need to add some extra information it might
> > > > require ABI breakage, though by now I don't envision anything particular to
> > > add.
> > > > Anyway, if there is no objection to go that way, we can try to make
> > > > these changes for v2.
> > > >
> > >
> > > Actually, after looking at it more deeply, it appears not as easy as I thought
> > > it would be :)
> > > Below is a very draft version of proposed API additions.
> > > I think it avoids ABI breakages right now and provides enough flexibility for
> > > future extensions (if any).
> > > For now, it doesn't address your comments about naming conventions
> > > (_CPU_ vs _SYNC_), etc.,
> > > but I suppose it is comprehensive enough to provide the main idea behind it.
> > > Akhil and other interested parties, please try to review and provide feedback
> > > ASAP,
> > > as related changes would take some time and we still like to hit 19.11
> deadline.
> > > Konstantin
> > >
> > >  diff --git a/lib/librte_cryptodev/rte_crypto_sym.h
> > > b/lib/librte_cryptodev/rte_crypto_sym.h
> > > index bc8da2466..c03069e23 100644
> > > --- a/lib/librte_cryptodev/rte_crypto_sym.h
> > > +++ b/lib/librte_cryptodev/rte_crypto_sym.h
> > > @@ -103,6 +103,9 @@ rte_crypto_cipher_operation_strings[];
> > >   *
> > >   * This structure contains data relating to Cipher (Encryption and Decryption)
> > >   *  use to create a session.
> > > + * Actually I was wrong saying that we don't have free space inside xforms.
> > > + * Making the key struct packed (see below) allows us to regain 6B that could be
> > > + * used for future extensions.
> > >   */
> > >  struct rte_crypto_cipher_xform {
> > >         enum rte_crypto_cipher_operation op;
> > > @@ -116,7 +119,25 @@ struct rte_crypto_cipher_xform {
> > >         struct {
> > >                 const uint8_t *data;    /**< pointer to key data */
> > >                 uint16_t length;        /**< key length in bytes */
> > > -       } key;
> > > +       } __attribute__((__packed__)) key;
> > > +
> > > +       /**
> > > +        * offset for cipher to start within the user-provided data buffer.
> > > +        * Fan suggested another (and less space-consuming) way -
> > > +        * reuse the iv.offset space below, by changing:
> > > +        * struct {uint16_t offset, length;} iv;
> > > +        * to an unnamed union:
> > > +        * union {
> > > +        *      struct {uint16_t offset, length;} iv;
> > > +        *      struct {uint16_t iv_len, crypto_offset;} cpu_crypto_param;
> > > +        * };
> > > +        * Both approaches seem ok to me in general.
> >
> > No strong opinions here. OK with this one.
> >
> > > +        * Comments/suggestions are welcome.
> > > +         */
> > > +       uint16_t offset;
> 
> After another thought - it is probably a bit better to have offset as a separate
> field.
> In that case we can use the same xforms to create both types of sessions.
ok
> 
> > > +
> > > +       uint8_t reserved1[4];
> > > +
> > >         /**< Cipher key
> > >          *
> > >          * For the RTE_CRYPTO_CIPHER_AES_F8 mode of operation, key.data
> will
> > > @@ -284,7 +305,7 @@ struct rte_crypto_auth_xform {
> > >         struct {
> > >                 const uint8_t *data;    /**< pointer to key data */
> > >                 uint16_t length;        /**< key length in bytes */
> > > -       } key;
> > > +       } __attribute__((__packed__)) key;
> > >         /**< Authentication key data.
> > >          * The authentication key length MUST be less than or equal to the
> > >          * block size of the algorithm. It is the callers responsibility to
> > > @@ -292,6 +313,8 @@ struct rte_crypto_auth_xform {
> > >          * (for example RFC 2104, FIPS 198a).
> > >          */
> > >
> > > +       uint8_t reserved1[6];
> > > +
> > >         struct {
> > >                 uint16_t offset;
> > >                 /**< Starting point for Initialisation Vector or Counter,
> > > @@ -376,7 +399,12 @@ struct rte_crypto_aead_xform {
> > >         struct {
> > >                 const uint8_t *data;    /**< pointer to key data */
> > >                 uint16_t length;        /**< key length in bytes */
> > > -       } key;
> > > +       } __attribute__((__packed__)) key;
> > > +
> > > +       /** offset for cipher to start within data buffer */
> > > +       uint16_t cipher_offset;
> > > +
> > > +       uint8_t reserved1[4];
> > >
> > >         struct {
> > >                 uint16_t offset;
> > > diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> > > b/lib/librte_cryptodev/rte_cryptodev.h
> > > index e175b838c..c0c7bfed7 100644
> > > --- a/lib/librte_cryptodev/rte_cryptodev.h
> > > +++ b/lib/librte_cryptodev/rte_cryptodev.h
> > > @@ -1272,6 +1272,101 @@ void *
> > >  rte_cryptodev_sym_session_get_user_data(
> > >                                         struct rte_cryptodev_sym_session *sess);
> > >
> > > +/*
> > > + * After several thoughts, decided not to try to squeeze CPU_CRYPTO
> > > + * into the existing rte_crypto_sym_session structure/API, but instead
> > > + * introduce an extension to it via a new fully opaque
> > > + * struct rte_crypto_cpu_sym_session and additional related API.
> >
> >
> > What all things do we need to squeeze?
> > In this proposal I do not see the new struct cpu_sym_session  defined here.
> 
> The plan is to have it totally opaque to the user, i.e. just:
> struct rte_crypto_cpu_sym_session;
> in public header files.
> 
> > I believe you will have same lib API/struct for cpu_sym_session  and
> sym_session.
> 
> I thought about such a way, but there are a few things that look clumsy to me:
> 1. Right now there is no 'type' (or similar) field inside rte_cryptodev_sym_session,
> so it is not possible to easily distinguish what session you have: lksd_sym or
> cpu_sym.
> In theory, there is a hole of 4B inside rte_cryptodev_sym_session, so we can add
> some extra field here, but in that case we wouldn't be able to use the same xform
> for both lksd_sym and cpu_sym
> (which seems a really plausible thing to me).
> 2.  Majority of rte_cryptodev_sym_session fields I think are unnecessary for
> rte_crypto_cpu_sym_session:
> sess_data[], opaque_data, user_data, nb_drivers.
> All that consumes space, that could be used somewhere else instead.
> 3. I am a bit reluctant to touch existing rte_cryptodev API - to avoid any
> breakages I can't foresee right now.
> From other side - if we'll add new functions/structs for cpu_sym_session we can
> mark it
> and keep it for some time as experimental, so further changes (if needed) would
> still be possible.
> 

OK, let us assume that you have a separate structure. But I have a few queries:
1. How can multiple drivers use the same session?
2. Can somebody use the scheduler PMD for scheduling different types of payloads for the same session?

With your proposal the APIs would be very specific to your use case only.
When you add more functionality to this sync API/struct, it will end up being the same API/struct.

Let us see how close/far we are from the existing APIs once the actual implementation is done.

> > I am not sure if that would be needed.
> > It would be internal to the driver: if synchronous processing is
> > supported (from the feature flag) and
> > the relevant fields in the xform (the newly added ones, packed as per
> > your suggestions) are set,
> > it will create that type of session.
> >
> >
> > > + * Main points:
> > > + * - Current crypto-dev API is reasonably mature and it is desirable
> > > + *   to keep it unchanged (API/ABI stability). From other side, this
> > > + *   new sync API is new one and probably would require extra changes.
> > > + *   Having it as a new one allows to mark it as experimental, without
> > > + *   affecting existing one.
> > > + * - Fully opaque cpu_sym_session structure gives more flexibility
> > > + *   to the PMD writers and again allows to avoid ABI breakages in future.
> > > + * - process() function per set of xforms
> > > + *   allows to expose different process() functions for different
> > > + *   xform combinations. PMD writer can decide, does he wants to
> > > + *   push all supported algorithms into one process() function,
> > > + *   or spread it across several ones.
> > > + *   I.E. More flexibility for PMD writer.
> >
> > Which process function should be chosen is internal to the PMD; how would that
> > info be visible to the application or the library? These will get stored in the
> > session private data. It would be up to the PMD writer to store the per-session
> > process function in the session private data.
> >
> > The process function would be a dev op, just like the enq/deq operations, and it
> > should call the respective process API stored in the session private data.
> 
> That model (via devops) is possible, but has several drawbacks from my
> perspective:
> 
> 1. It means we'll need to pass dev_id as a parameter to the process() function,
> though in fact dev_id is not relevant information for us here
> (all we need is a pointer to the session and a pointer to the function to call),
> and I tried to avoid using it in data-path functions for that API.

You have a single vdev, but someone may have multiple vdevs, one for each thread, or
may have the same dev with multiple queues, one for each core.

> 2. As you pointed in that case it will be just one process() function per device.
> So if a PMD would like to have several process() functions for different types of
> sessions (let's say one per alg), the first thing it has to do inside its process() is
> read the session data and, based on that, jump/call to the particular internal
> sub-routine.
> Something like:
> driver_id = get_pmd_driver_id();
> priv_ses = ses->sess_data[driver_id];
> Then either:
> switch(priv_sess->alg) {case XXX: process_XXX(priv_sess, ...);break;...}
> OR
> priv_ses->process(priv_sess, ...);
> 
> to select and call the proper function.
> Looks like totally unnecessary overhead to me.
> Though if we'll have the ability to query/extract some sort of session_ops based
> on the xform -
> we can avoid this extra de-reference+jump/call thing.

What is the issue with the priv_ses->process() approach?
I don't understand what you are saving by not doing this.
In any case you would need to identify which session corresponds to which process();
you would have to do that somewhere in your data path anyway.

> 
> >
> > I am not sure if you would need a new session init API for this as nothing would
> be visible to
> > the app or lib.
> >
> > > + * - Not storing process() pointer inside the session -
> > > + *   Allows the user to choose whether to store a process() pointer
> > > + *   per session, or per group of sessions for that device that share
> > > + *   the same input xforms, i.e. extra flexibility for the user,
> > > + *   plus allows us to keep cpu_sym_session totally opaque, see above.
> >
> > If multiple sessions need to be processed via the same process function,
> > PMD would save the same process in all the sessions, I don't think there would
> > be any perf overhead with that.
> 
> I think it would, see above.
> 
> >
> > > + * Sketched usage model:
> > > + * ....
> > > + * /* control path, alloc/init session */
> > > + * int32_t sz = rte_crypto_cpu_sym_session_size(dev_id, &xform);
> > > + * struct rte_crypto_cpu_sym_session *ses = user_alloc(..., sz);
> > > + * rte_crypto_cpu_sym_process_t process =
> > > + *     rte_crypto_cpu_sym_session_func(dev_id, &xform);
> > > + * rte_crypto_cpu_sym_session_init(dev_id, ses, &xform);
> > > + * ...
> > > + * /* data-path*/
> > > + * process(ses, ....);
> > > + * ....
> > > + * /* control path, terminate/free session */
> > > + * rte_crypto_cpu_sym_session_fini(dev_id, ses);
> > > + */
> > > +
> > > +/**
> > > + * Vector structure; contains a pointer to the vector array and the
> > > + * length of the array.
> > > + */
> > > +struct rte_crypto_vec {
> > > +       struct iovec *vec;
> > > +       uint32_t num;
> > > +};
> > > +
> > > +/*
> > > + * Data-path bulk process crypto function.
> > > + */
> > > +typedef void (*rte_crypto_cpu_sym_process_t)(
> > > +               struct rte_crypto_cpu_sym_session *sess,
> > > +               struct rte_crypto_vec buf[], void *iv[], void *aad[],
> > > +               void *digest[], int status[], uint32_t num);
> > > +/*
> > > + * for given device return process function specific to input xforms
> > > + * on error - return NULL and set rte_errno value.
> > > + * Note that for the same input xforms the same device should return
> > > + * the same process function.
> > > + */
> > > +__rte_experimental
> > > +rte_crypto_cpu_sym_process_t
> > > +rte_crypto_cpu_sym_session_func(uint8_t dev_id,
> > > +                       const struct rte_crypto_sym_xform *xforms);
> > > +
> > > +/*
> > > + * Return required session size in bytes for given set of xforms.
> > > + * if xforms == NULL, then return the max possible session size,
> > > + * one that would fit a session for any algorithm supported by the device.
> > > + * if CPU mode is not supported at all, or the algorithm requested in the
> > > + * xform is not supported, then return -ENOTSUP.
> > > + */
> > > +__rte_experimental
> > > +int
> > > +rte_crypto_cpu_sym_session_size(uint8_t dev_id,
> > > +                       const struct rte_crypto_sym_xform *xforms);
> > > +
> > > +/*
> > > + * Initialize session.
> > > + * It is the caller's responsibility to allocate enough space for it.
> > > + * See rte_crypto_cpu_sym_session_size above.
> > > + */
> > > +__rte_experimental
> > > +int rte_crypto_cpu_sym_session_init(uint8_t dev_id,
> > > +                       struct rte_crypto_cpu_sym_session *sess,
> > > +                       const struct rte_crypto_sym_xform *xforms);
> > > +
> > > +__rte_experimental
> > > +void
> > > +rte_crypto_cpu_sym_session_fini(uint8_t dev_id,
> > > +                       struct rte_crypto_cpu_sym_session *sess);
> > > +
> > > +
> > >  #ifdef __cplusplus
> > >  }
> > >  #endif
> > > diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > index defe05ea0..ed7e63fab 100644
> > > --- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > +++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > @@ -310,6 +310,20 @@ typedef void
> (*cryptodev_sym_free_session_t)(struct
> > > rte_cryptodev *dev,
> > >  typedef void (*cryptodev_asym_free_session_t)(struct rte_cryptodev *dev,
> > >                 struct rte_cryptodev_asym_session *sess);
> > >
> > > +typedef int (*cryptodev_cpu_sym_session_size_t) (struct rte_cryptodev
> *dev,
> > > +                       const struct rte_crypto_sym_xform *xforms);
> > > +
> > > +typedef int (*cryptodev_cpu_sym_session_init_t) (struct rte_cryptodev
> *dev,
> > > +                       struct rte_crypto_cpu_sym_session *sess,
> > > +                       const struct rte_crypto_sym_xform *xforms);
> > > +
> > > +typedef void (*cryptodev_cpu_sym_session_fini_t) (struct rte_cryptodev
> *dev,
> > > +                       struct rte_crypto_cpu_sym_session *sess);
> > > +
> > > +typedef rte_crypto_cpu_sym_process_t
> (*cryptodev_cpu_sym_session_func_t)
> > > (
> > > +                       struct rte_cryptodev *dev,
> > > +                       const struct rte_crypto_sym_xform *xforms);
> > > +
> > >  /** Crypto device operations function pointer table */
> > >  struct rte_cryptodev_ops {
> > >         cryptodev_configure_t dev_configure;    /**< Configure device. */
> > > @@ -343,6 +357,11 @@ struct rte_cryptodev_ops {
> > >         /**< Clear a Crypto sessions private data. */
> > >         cryptodev_asym_free_session_t asym_session_clear;
> > >         /**< Clear a Crypto sessions private data. */
> > > +
> > > +       cryptodev_cpu_sym_session_size_t sym_cpu_session_get_size;
> > > +       cryptodev_cpu_sym_session_func_t sym_cpu_session_get_func;
> > > +       cryptodev_cpu_sym_session_init_t sym_cpu_session_init;
> > > +       cryptodev_cpu_sym_session_fini_t sym_cpu_session_fini;
> > >  };
> > >
> > >
> > >


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-09-30 13:43                             ` Akhil Goyal
@ 2019-10-01 14:49                               ` Ananyev, Konstantin
  2019-10-03 13:24                                 ` Akhil Goyal
  0 siblings, 1 reply; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-10-01 14:49 UTC (permalink / raw)
  To: Akhil Goyal, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph


Hi Akhil,

> > > > > > > > > > > > > This action type allows the burst of symmetric crypto
> > workload
> > > > using
> > > > > > > the
> > > > > > > > > > > same
> > > > > > > > > > > > > algorithm, key, and direction being processed by CPU cycles
> > > > > > > > > synchronously.
> > > > > > > > > > > > > This flexible action type does not require external hardware
> > > > > > > involvement,
> > > > > > > > > > > > > having the crypto workload processed synchronously, and is
> > > > more
> > > > > > > > > > > performant
> > > > > > > > > > > > > than Cryptodev SW PMD due to the saved cycles on removed
> > > > "async
> > > > > > > > > mode
> > > > > > > > > > > > > simulation" as well as 3 cacheline access of the crypto ops.
> > > > > > > > > > > >
> > > > > > > > > > > > Does that mean application will not call the
> > > > cryptodev_enqueue_burst
> > > > > > > and
> > > > > > > > > > > corresponding dequeue burst.
> > > > > > > > > > >
> > > > > > > > > > > Yes, instead it just call rte_security_process_cpu_crypto_bulk(...)
> > > > > > > > > > >
> > > > > > > > > > > > It would be a new API something like process_packets and it
> > will
> > > > have
> > > > > > > the
> > > > > > > > > > > crypto processed packets while returning from the API?
> > > > > > > > > > >
> > > > > > > > > > > Yes, though the plan is that API will operate on raw data buffers,
> > > > not
> > > > > > > mbufs.
> > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > I still do not understand why we cannot do with the
> > conventional
> > > > > > > crypto lib
> > > > > > > > > > > only.
> > > > > > > > > > > > As far as I can understand, you are not doing any protocol
> > > > processing
> > > > > > > or
> > > > > > > > > any
> > > > > > > > > > > value add
> > > > > > > > > > > > To the crypto processing. IMO, you just need a synchronous
> > > > crypto
> > > > > > > > > processing
> > > > > > > > > > > API which
> > > > > > > > > > > > Can be defined in cryptodev, you don't need to re-create a
> > crypto
> > > > > > > session
> > > > > > > > > in
> > > > > > > > > > > the name of
> > > > > > > > > > > > Security session in the driver just to do a synchronous
> > processing.
> > > > > > > > > > >
> > > > > > > > > > > I suppose your question is why not to have
> > > > > > > > > > > rte_crypot_process_cpu_crypto_bulk(...) instead?
> > > > > > > > > > > The main reason is that would require disruptive changes in
> > existing
> > > > > > > > > cryptodev
> > > > > > > > > > > API
> > > > > > > > > > > (would cause ABI/API breakage).
> > > > > > > > > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need
> > > > some
> > > > > > > extra
> > > > > > > > > > > information
> > > > > > > > > > > that normal crypto_sym_xform doesn't contain
> > > > > > > > > > > (cipher offset from the start of the buffer, might be something
> > extra
> > > > in
> > > > > > > > > future).
> > > > > > > > > >
> > > > > > > > > > Cipher offset will be part of rte_crypto_op.
> > > > > > > > >
> > > > > > > > > fill/read (+ alloc/free) is one of the main things that slowdown
> > current
> > > > > > > crypto-op
> > > > > > > > > approach.
> > > > > > > > > That's why the general idea - have all data that wouldn't change
> > from
> > > > packet
> > > > > > > to
> > > > > > > > > packet
> > > > > > > > > included into the session and setup it once at session_init().
> > > > > > > >
> > > > > > > > I agree that you cannot use crypto-op.
> > > > > > > > You can have the new API in crypto.
> > > > > > > > As per the current patch, you only need cipher_offset which you can
> > have
> > > > it as
> > > > > > > a parameter until
> > > > > > > > You get it approved in the crypto xform. I believe it will be beneficial
> > in
> > > > case of
> > > > > > > other crypto cases as well.
> > > > > > > > We can have cipher offset at both places(crypto-op and
> > cipher_xform). It
> > > > will
> > > > > > > give flexibility to the user to
> > > > > > > > override it.
> > > > > > >
> > > > > > > After having another thought on your proposal:
> > > > > > > Probably we can introduce new rte_crypto_sym_xform_types for CPU
> > > > related
> > > > > > > stuff here?
> > > > > >
> > > > > > I also thought of adding new xforms, but that wont serve the purpose for
> > > > may be all the cases.
> > > > > > You would be needing all information currently available in the current
> > > > xforms.
> > > > > > So if you are adding new fields in the new xform, the size will be more
> > than
> > > > that of the union of xforms.
> > > > > > ABI breakage would still be there.
> > > > > >
> > > > > > If you think a valid compression of the AEAD xform can be done, then
> > that
> > > > can be done for each of the
> > > > > > Xforms and we can have a solution to this issue.
> > > > >
> > > > > I think that we can re-use iv.offset for our purposes (for crypto offset).
> > > > > So for now we can make that path work without any ABI breakage.
> > > > > Fan, please feel free to correct me here, if I missed something.
> > > > > If in future we would need to add some extra information it might
> > > > > require ABI breakage, though by now I don't envision anything particular to
> > > > add.
> > > > > Anyway, if there is no objection to go that way, we can try to make
> > > > > these changes for v2.
> > > > >
> > > >
> > > > Actually, after looking at it more deeply, it appears not as easy as I thought
> > > > it would be :)
> > > > Below is a very draft version of proposed API additions.
> > > > I think it avoids ABI breakages right now and provides enough flexibility for
> > > > future extensions (if any).
> > > > For now, it doesn't address your comments about naming conventions
> > > > (_CPU_ vs _SYNC_), etc.,
> > > > but I suppose it is comprehensive enough to provide the main idea behind it.
> > > > Akhil and other interested parties, please try to review and provide feedback
> > > > ASAP,
> > > > as related changes would take some time and we still like to hit 19.11
> > deadline.
> > > > Konstantin
> > > >
> > > >  diff --git a/lib/librte_cryptodev/rte_crypto_sym.h
> > > > b/lib/librte_cryptodev/rte_crypto_sym.h
> > > > index bc8da2466..c03069e23 100644
> > > > --- a/lib/librte_cryptodev/rte_crypto_sym.h
> > > > +++ b/lib/librte_cryptodev/rte_crypto_sym.h
> > > > @@ -103,6 +103,9 @@ rte_crypto_cipher_operation_strings[];
> > > >   *
> > > >   * This structure contains data relating to Cipher (Encryption and Decryption)
> > > >   *  use to create a session.
> > > > + * Actually I was wrong saying that we don't have free space inside xforms.
> > > > + * Making the key struct packed (see below) allows us to regain 6B that could be
> > > > + * used for future extensions.
> > > >   */
> > > >  struct rte_crypto_cipher_xform {
> > > >         enum rte_crypto_cipher_operation op;
> > > > @@ -116,7 +119,25 @@ struct rte_crypto_cipher_xform {
> > > >         struct {
> > > >                 const uint8_t *data;    /**< pointer to key data */
> > > >                 uint16_t length;        /**< key length in bytes */
> > > > -       } key;
> > > > +       } __attribute__((__packed__)) key;
> > > > +
> > > > +       /**
> > > > +         * offset for cipher to start within user provided data buffer.
> > > > +        * Fan suggested another (and less space consuming way) -
> > > > +         * reuse iv.offset space below, by changing:
> > > > +        * struct {uint16_t offset, length;} iv;
> > > > +        * to unnamed union:
> > > > +        * union {
> > > > +        *      struct {uint16_t offset, length;} iv;
> > > > +        *      struct {uint16_t iv_len, crypto_offset} cpu_crypto_param;
> > > > +        * };
> > > > +        * Both approaches seem ok to me in general.
> > >
> > > No strong opinions here. OK with this one.
> > >
> > > > +        * Comments/suggestions are welcome.
> > > > +         */
> > > > +       uint16_t offset;
> >
> > After another thought - it is probably a bit better to have offset as a separate
> > field.
> > In that case we can use the same xforms to create both types of sessions.
> ok
> >
> > > > +
> > > > +       uint8_t reserved1[4];
> > > > +
> > > >         /**< Cipher key
> > > >          *
> > > >          * For the RTE_CRYPTO_CIPHER_AES_F8 mode of operation, key.data
> > will
> > > > @@ -284,7 +305,7 @@ struct rte_crypto_auth_xform {
> > > >         struct {
> > > >                 const uint8_t *data;    /**< pointer to key data */
> > > >                 uint16_t length;        /**< key length in bytes */
> > > > -       } key;
> > > > +       } __attribute__((__packed__)) key;
> > > >         /**< Authentication key data.
> > > >          * The authentication key length MUST be less than or equal to the
> > > >          * block size of the algorithm. It is the callers responsibility to
> > > > @@ -292,6 +313,8 @@ struct rte_crypto_auth_xform {
> > > >          * (for example RFC 2104, FIPS 198a).
> > > >          */
> > > >
> > > > +       uint8_t reserved1[6];
> > > > +
> > > >         struct {
> > > >                 uint16_t offset;
> > > >                 /**< Starting point for Initialisation Vector or Counter,
> > > > @@ -376,7 +399,12 @@ struct rte_crypto_aead_xform {
> > > >         struct {
> > > >                 const uint8_t *data;    /**< pointer to key data */
> > > >                 uint16_t length;        /**< key length in bytes */
> > > > -       } key;
> > > > +       } __attribute__((__packed__)) key;
> > > > +
> > > > +       /** offset for cipher to start within data buffer */
> > > > +       uint16_t cipher_offset;
> > > > +
> > > > +       uint8_t reserved1[4];
> > > >
> > > >         struct {
> > > >                 uint16_t offset;
> > > > diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> > > > b/lib/librte_cryptodev/rte_cryptodev.h
> > > > index e175b838c..c0c7bfed7 100644
> > > > --- a/lib/librte_cryptodev/rte_cryptodev.h
> > > > +++ b/lib/librte_cryptodev/rte_cryptodev.h
> > > > @@ -1272,6 +1272,101 @@ void *
> > > >  rte_cryptodev_sym_session_get_user_data(
> > > >                                         struct rte_cryptodev_sym_session *sess);
> > > >
> > > > +/*
> > > > + * After several thoughts decided not to try to squeeze CPU_CRYPTO
> > > > + * into existing rte_crypto_sym_session structure/API, but instead
> > > > + * introduce an extension to it via new fully opaque
> > > > + * struct rte_crypto_cpu_sym_session and additional related API.
> > >
> > >
> > > What exactly do we need to squeeze?
> > > In this proposal I do not see the new struct cpu_sym_session defined here.
> >
> > The plan is to have it totally opaque to the user, i.e. just:
> > struct rte_crypto_cpu_sym_session;
> > in public header files.
> >
> > > I believe you will have the same lib API/struct for cpu_sym_session and
> > > sym_session.
> >
> > I thought about such a way, but there are a few things that look clumsy to me:
> > 1. Right now there is no 'type' (or so) field inside rte_cryptodev_sym_session,
> > so it is not possible to easily distinguish what session you have: lksd_sym or
> > cpu_sym.
> > In theory, there is a hole of 4B inside rte_cryptodev_sym_session, so we can add
> > some extra field here, but in that case we wouldn't be able to use the same
> > xform for both lksd_sym and cpu_sym
> > (which seems a really plausible thing to me).
> > 2. The majority of rte_cryptodev_sym_session fields are, I think, unnecessary for
> > rte_crypto_cpu_sym_session:
> > sess_data[], opaque_data, user_data, nb_drivers.
> > All that consumes space that could be used for something else.
> > 3. I am a bit reluctant to touch the existing rte_cryptodev API - to avoid any
> > breakages I can't foresee right now.
> > From the other side - if we'll add new functions/structs for cpu_sym_session we
> > can mark them and keep them experimental for some time, so further changes (if
> > needed) would still be possible.
> >
> 
> OK let us assume that you have a separate structure. But I have a few queries:
> 1. how can multiple drivers use the same session?

As a short answer: they can't.
It is pretty much the same approach as with rte_security - each device needs to create/init its own session.
So the upper layer would need to maintain its own array (or so) for such a case.
Though the question is: why would you like to have the same session over multiple SW-backed devices?
As it would anyway just be a synchronous function call executed on the same CPU.

> 2. Can somebody use the scheduler pmd for scheduling the different type of payloads for the same session?

In theory, yes.
Though for that the scheduler pmd should have inside its rte_crypto_cpu_sym_session an array of pointers to
the underlying devices' sessions.

> 
> With your proposal the APIs would be very specific to your use case only.

Yes, in some way.
I consider that API specific to SW-backed crypto PMDs.
I can hardly see how any 'real HW' PMDs (lksd-none, lksd-proto) would benefit from it.
The current crypto-op API is very much HW oriented.
Which is ok, that's what it was intended for, but I think we also need one designed
with a SW-backed implementation in mind.

> When you would add more functionality to this sync API/struct, it will end up being the same API/struct.
> 
> Let us  see how close/ far we are from the existing APIs when the actual implementation is done.
> 
> > > I am not sure if that would be needed.
> > > It would be internal to the driver: if synchronous processing is
> > > supported (from the feature flag) and the relevant fields in the xform
> > > (the newly added ones, packed as per your suggestions) are set,
> > > it will create that type of session.
> > >
> > >
> > > > + * Main points:
> > > > + * - Current crypto-dev API is reasonably mature and it is desirable
> > > > + *   to keep it unchanged (API/ABI stability). From the other side, this
> > > > + *   new sync API is a new one and probably would require extra changes.
> > > > + *   Having it as a new one allows marking it as experimental, without
> > > > + *   affecting the existing one.
> > > > + * - Fully opaque cpu_sym_session structure gives more flexibility
> > > > + *   to the PMD writers and again allows avoiding ABI breakages in future.
> > > > + * - process() function per set of xforms
> > > > + *   allows exposing different process() functions for different
> > > > + *   xform combinations. The PMD writer can decide whether he wants to
> > > > + *   push all supported algorithms into one process() function,
> > > > + *   or spread them across several ones.
> > > > + *   I.E. more flexibility for the PMD writer.
> > >
> > > Which process function should be chosen is internal to the PMD; how would
> > > that info be visible to the application or the library? These will get stored
> > > in the session private data. It would be up to the PMD writer to store the
> > > per-session process function in the session private data.
> > >
> > > Process function would be a dev op just like the enq/deq operations, and it
> > > should call the respective process API stored in the session private data.
> >
> > That model (via devops) is possible, but has several drawbacks from my
> > perspective:
> >
> > 1. It means we'll need to pass dev_id as a parameter to the process() function.
> > Though in fact dev_id is not relevant information for us here
> > (all we need is a pointer to the session and a pointer to the function to call)
> > and I tried to avoid using it in data-path functions for that API.
> 
> You have a single vdev, but someone may have multiple vdevs for each thread, or may
> have the same dev with multiple queues for each core.

That's fine. As I said above it is a SW backed implementation.
Each session has to be a separate entity that contains all necessary information
(keys, alg/mode info,  etc.)  to process input buffers.
Plus we need the actual function pointer to call.
I just don't see what we need a dev_id for in that situation.
Again, here we don't need to care about queues and their pinning to cores.
If, let's say, someone would like to process buffers from the same IPsec SA on 2
different cores in parallel, he can just create 2 sessions for the same xform,
give one to thread #1 and the second to thread #2.
After that both threads are free to call process(this_thread_ses, ...) at will.
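To make that concrete, below is a minimal toy sketch of the "one session per thread" model. All names and types here (toy_cpu_sym_session, toy_process, etc.) are illustrative stand-ins, not the proposed rte_crypto_cpu_* API:

```c
#include <stdint.h>
#include <string.h>

/* illustrative stand-ins for the opaque session and process() types */
struct toy_cpu_sym_session { uint8_t key[16]; };

typedef void (*toy_cpu_sym_process_t)(struct toy_cpu_sym_session *ses,
		const uint8_t *in, uint8_t *out, uint32_t len);

/* toy "cipher": XOR every byte with the first key byte */
void
toy_process(struct toy_cpu_sym_session *ses, const uint8_t *in,
	uint8_t *out, uint32_t len)
{
	for (uint32_t i = 0; i < len; i++)
		out[i] = in[i] ^ ses->key[0];
}

/* control path: init a session from the same "xform" (here just the key);
 * each worker thread gets its own independent session */
void
toy_session_init(struct toy_cpu_sym_session *ses, const uint8_t key[16])
{
	memcpy(ses->key, key, 16);
}
```

Thread #1 then calls toy_process(&ses1, ...) while thread #2 calls toy_process(&ses2, ...); since the sessions share no state, there is no queue or core-pinning concern at all.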

> 
> > 2. As you pointed out, in that case it will be just one process() function per device.
> > So if a PMD would like to have several process() functions for different types of
> > sessions
> > (let's say one per alg), the first thing it has to do inside its process() is read the
> > session data and,
> > based on that, do a jump/call to a particular internal sub-routine.
> > Something like:
> > driver_id = get_pmd_driver_id();
> > priv_ses = ses->sess_data[driver_id];
> > Then either:
> > switch(priv_sess->alg) {case XXX: process_XXX(priv_sess, ...);break;...}
> > OR
> > priv_ses->process(priv_sess, ...);
> >
> > to select and call the proper function.
> > Looks like totally unnecessary overhead to me.
> > Though if we'll have the ability to query/extract some sort of session_ops based
> > on the xform -
> > we can avoid this extra de-reference+jump/call thing.
> 
> What is the issue with the priv_ses->process() approach?

Nothing at all.
What I am saying is that the schema with dev_ops
dev[dev_id]->dev_ops.process(ses->priv_ses[driver_id], ...)
   |
   |-> priv_ses->process(...)

has bigger overhead than just:
process(ses, ...);

So why introduce an extra level of indirection here?
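As a compilable toy illustration of the two call schemes (everything below is a mock with made-up names, not real cryptodev code):

```c
#include <stdint.h>
#include <stddef.h>

/* illustrative mocks of a per-driver private session holding its
 * resolved process() pointer */
struct toy_priv_ses;
typedef int (*toy_process_t)(struct toy_priv_ses *ses, int arg);

struct toy_priv_ses { toy_process_t process; };

/* the actual worker: what a per-alg sub-routine would do */
int
toy_do_work(struct toy_priv_ses *ses, int arg)
{
	(void)ses;
	return arg * 2;
}

/* scheme A: dev_ops dispatcher that de-references the private session
 * and then jumps to the stored function - two indirections per call */
int
toy_dispatch(struct toy_priv_ses *ses, int arg)
{
	return ses->process(ses, arg);
}

struct toy_dev_ops { toy_process_t process; };
struct toy_dev { struct toy_dev_ops ops; };

int
call_via_dev_ops(struct toy_dev *devs, uint8_t dev_id,
	struct toy_priv_ses *ses, int arg)
{
	return devs[dev_id].ops.process(ses, arg);
}

/* scheme B: direct call through a pointer the app resolved once */
int
call_direct(toy_process_t process, struct toy_priv_ses *ses, int arg)
{
	return process(ses, arg);
}
```

Both schemes compute the same result; scheme A simply pays for the extra dev lookup and dispatcher hop on every data-path call.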

> I don't understand what you are saving by not doing this.
> In any case you would need to identify which session corresponds to which process().

Yes, sure, but I think we can let the user store that relationship
in whatever way he likes: store a process() pointer for each session, or group
sessions that share the same process() somehow, or...

> For that you would be doing it somewhere in your data path.

Why at the data path?
Only once at session creation/initialization time.
Or maybe even once per group of sessions.
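For example, the per-group lookup could happen once on the control path and then be shared by every session created from the same xform. A toy sketch with hypothetical names (grp_lookup stands in for what rte_crypto_cpu_sym_session_func() would do):

```c
#include <stddef.h>

struct grp_ses { int alg; };
typedef int (*grp_process_t)(struct grp_ses *ses, int arg);

/* toy process() implementation for one algorithm */
int
grp_process_aes(struct grp_ses *s, int n)
{
	(void)s;
	return n + 1;
}

/* control path: resolve the process function from the "xform"
 * (here reduced to an alg id) exactly once */
grp_process_t
grp_lookup(int alg)
{
	return alg == 0 ? grp_process_aes : NULL;
}

/* app-level grouping: one process() pointer shared by all sessions
 * created from the same xform */
struct ses_group {
	grp_process_t process;
	struct grp_ses ses[4];
	size_t num;
};

int
grp_run(struct ses_group *g, int arg)
{
	int total = 0;
	/* data path: no per-call lookup, just direct calls */
	for (size_t i = 0; i < g->num; i++)
		total += g->process(&g->ses[i], arg);
	return total;
}
```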

> 
> >
> > >
> > > I am not sure if you would need a new session init API for this as nothing would
> > be visible to
> > > the app or lib.
> > >
> > > > + * - Not storing process() pointer inside the session -
> > > > + *   allows the user to choose whether to store a process() pointer
> > > > + *   per session, or per group of sessions for that device that share
> > > > + *   the same input xforms. I.E. extra flexibility for the user,
> > > > + *   plus allows us to keep cpu_sym_session totally opaque, see above.
> > >
> > > If multiple sessions need to be processed via the same process function,
> > > the PMD would save the same process() in all the sessions; I don't think there
> > > would be any perf overhead with that.
> >
> > I think it would, see above.
> >
> > >
> > > > + * Sketched usage model:
> > > > + * ....
> > > > + * /* control path, alloc/init session */
> > > > + * int32_t sz = rte_crypto_cpu_sym_session_size(dev_id, &xform);
> > > > + * struct rte_crypto_cpu_sym_session *ses = user_alloc(..., sz);
> > > > + * rte_crypto_cpu_sym_process_t process =
> > > > + *     rte_crypto_cpu_sym_session_func(dev_id, &xform);
> > > > + * rte_crypto_cpu_sym_session_init(dev_id, ses, &xform);
> > > > + * ...
> > > > + * /* data-path*/
> > > > + * process(ses, ....);
> > > > + * ....
> > > > + * /* control path, terminate/free session */
> > > > + * rte_crypto_cpu_sym_session_fini(dev_id, ses);
> > > > + */
> > > > +
> > > > +/**
> > > > + * vector structure, contains pointer to vector array and the length
> > > > + * of the array
> > > > + */
> > > > +struct rte_crypto_vec {
> > > > +       struct iovec *vec;
> > > > +       uint32_t num;
> > > > +};
> > > > +
> > > > +/*
> > > > + * Data-path bulk process crypto function.
> > > > + */
> > > > +typedef void (*rte_crypto_cpu_sym_process_t)(
> > > > +               struct rte_crypto_cpu_sym_session *sess,
> > > > +               struct rte_crypto_vec buf[], void *iv[], void *aad[],
> > > > +               void *digest[], int status[], uint32_t num);
> > > > +/*
> > > > + * for a given device return the process function specific to the input xforms.
> > > > + * on error - return NULL and set rte_errno value.
> > > > + * Note that for the same input xforms the same device should return
> > > > + * the same process function.
> > > > + */
> > > > +__rte_experimental
> > > > +rte_crypto_cpu_sym_process_t
> > > > +rte_crypto_cpu_sym_session_func(uint8_t dev_id,
> > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > +
> > > > +/*
> > > > + * Return the required session size in bytes for a given set of xforms.
> > > > + * if xforms == NULL, then return the max possible session size,
> > > > + * that would fit a session for any algorithm supported by the device.
> > > > + * if CPU mode is not supported at all, or the algorithm requested in the
> > > > + * xform is not supported, then return -ENOTSUP.
> > > > + */
> > > > +__rte_experimental
> > > > +int
> > > > +rte_crypto_cpu_sym_session_size(uint8_t dev_id,
> > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > +
> > > > +/*
> > > > + * Initialize session.
> > > > + * It is the caller's responsibility to allocate enough space for it.
> > > > + * See rte_crypto_cpu_sym_session_size above.
> > > > + */
> > > > +__rte_experimental
> > > > +int rte_crypto_cpu_sym_session_init(uint8_t dev_id,
> > > > +                       struct rte_crypto_cpu_sym_session *sess,
> > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > +
> > > > +__rte_experimental
> > > > +void
> > > > +rte_crypto_cpu_sym_session_fini(uint8_t dev_id,
> > > > +                       struct rte_crypto_cpu_sym_session *sess);
> > > > +
> > > > +
> > > >  #ifdef __cplusplus
> > > >  }
> > > >  #endif
> > > > diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > index defe05ea0..ed7e63fab 100644
> > > > --- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > +++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > @@ -310,6 +310,20 @@ typedef void
> > (*cryptodev_sym_free_session_t)(struct
> > > > rte_cryptodev *dev,
> > > >  typedef void (*cryptodev_asym_free_session_t)(struct rte_cryptodev *dev,
> > > >                 struct rte_cryptodev_asym_session *sess);
> > > >
> > > > +typedef int (*cryptodev_cpu_sym_session_size_t) (struct rte_cryptodev
> > *dev,
> > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > +
> > > > +typedef int (*cryptodev_cpu_sym_session_init_t) (struct rte_cryptodev
> > *dev,
> > > > +                       struct rte_crypto_cpu_sym_session *sess,
> > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > +
> > > > +typedef void (*cryptodev_cpu_sym_session_fini_t) (struct rte_cryptodev
> > *dev,
> > > > +                       struct rte_crypto_cpu_sym_session *sess);
> > > > +
> > > > +typedef rte_crypto_cpu_sym_process_t
> > (*cryptodev_cpu_sym_session_func_t)
> > > > (
> > > > +                       struct rte_cryptodev *dev,
> > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > +
> > > >  /** Crypto device operations function pointer table */
> > > >  struct rte_cryptodev_ops {
> > > >         cryptodev_configure_t dev_configure;    /**< Configure device. */
> > > > @@ -343,6 +357,11 @@ struct rte_cryptodev_ops {
> > > >         /**< Clear a Crypto sessions private data. */
> > > >         cryptodev_asym_free_session_t asym_session_clear;
> > > >         /**< Clear a Crypto sessions private data. */
> > > > +
> > > > +       cryptodev_cpu_sym_session_size_t sym_cpu_session_get_size;
> > > > +       cryptodev_cpu_sym_session_func_t sym_cpu_session_get_func;
> > > > +       cryptodev_cpu_sym_session_init_t sym_cpu_session_init;
> > > > +       cryptodev_cpu_sym_session_fini_t sym_cpu_session_fini;
> > > >  };
> > > >
> > > >
> > > >


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API
  2019-09-30  9:43         ` Hemant Agrawal
@ 2019-10-01 15:27           ` Ananyev, Konstantin
  2019-10-02  2:47             ` Hemant Agrawal
  0 siblings, 1 reply; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-10-01 15:27 UTC (permalink / raw)
  To: Hemant Agrawal, Zhang, Roy Fan, dev; +Cc: Doherty, Declan, Akhil Goyal


Hi Hemant,

> >>> This patch introduces a new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action type to
> >>> the security library. The type represents performing crypto operations with CPU
> >>> cycles. The patch also includes a new API to process crypto operations in
> >>> bulk and the function pointers for PMDs.
> >>>
> >>> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> >>> ---
> >>>    lib/librte_security/rte_security.c           | 16 +++++++++
> >>>    lib/librte_security/rte_security.h           | 51 +++++++++++++++++++++++++++-
> >>>    lib/librte_security/rte_security_driver.h    | 19 +++++++++++
> >>>    lib/librte_security/rte_security_version.map |  1 +
> >>>    4 files changed, 86 insertions(+), 1 deletion(-)
> >>>
> >>> diff --git a/lib/librte_security/rte_security.c b/lib/librte_security/rte_security.c
> >>> index bc81ce15d..0f85c1b59 100644
> >>> --- a/lib/librte_security/rte_security.c
> >>> +++ b/lib/librte_security/rte_security.c
> >>> @@ -141,3 +141,19 @@ rte_security_capability_get(struct rte_security_ctx *instance,
> >>>
> >>>    	return NULL;
> >>>    }
> >>> +
> >>> +void
> >>> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> >>> +		struct rte_security_session *sess,
> >>> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> >>> +		void *digest[], int status[], uint32_t num)
> >>> +{
> >>> +	uint32_t i;
> >>> +
> >>> +	for (i = 0; i < num; i++)
> >>> +		status[i] = -1;
> >>> +
> >>> +	RTE_FUNC_PTR_OR_RET(*instance->ops->process_cpu_crypto_bulk);
> >>> +	instance->ops->process_cpu_crypto_bulk(sess, buf, iv,
> >>> +			aad, digest, status, num);
> >>> +}
> >>> diff --git a/lib/librte_security/rte_security.h b/lib/librte_security/rte_security.h
> >>> index 96806e3a2..5a0f8901b 100644
> >>> --- a/lib/librte_security/rte_security.h
> >>> +++ b/lib/librte_security/rte_security.h
> >>> @@ -18,6 +18,7 @@ extern "C" {
> >>>    #endif
> >>>
> >>>    #include <sys/types.h>
> >>> +#include <sys/uio.h>
> >>>
> >>>    #include <netinet/in.h>
> >>>    #include <netinet/ip.h>
> >>> @@ -272,6 +273,20 @@ struct rte_security_pdcp_xform {
> >>>    	uint32_t hfn_threshold;
> >>>    };
> >>>
> >>> +struct rte_security_cpu_crypto_xform {
> >>> +	/** For a cipher/authentication crypto operation the authentication may
> >>> +	 * cover more content than the cipher. E.g., for IPsec ESP encryption
> >>> +	 * with AES-CBC and SHA1-HMAC, the encryption happens after the ESP
> >>> +	 * header but the whole packet (apart from the MAC header) is authenticated.
> >>> +	 * The cipher_offset field is used to derive the cipher data pointer
> >>> +	 * from the buffer to be processed.
> >>> +	 *
> >>> +	 * NOTE this parameter shall be ignored by AEAD algorithms, since they
> >>> +	 * use the same offset for cipher and authentication.
> >>> +	 */
> >>> +	int32_t cipher_offset;
> >>> +};
> >>> +
> >>>    /**
> >>>     * Security session action type.
> >>>     */
> >>> @@ -286,10 +301,14 @@ enum rte_security_session_action_type {
> >>>    	/**< All security protocol processing is performed inline during
> >>>    	 * transmission
> >>>    	 */
> >>> -	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
> >>> +	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
> >>>    	/**< All security protocol processing including crypto is performed
> >>>    	 * on a lookaside accelerator
> >>>    	 */
> >>> +	RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> >>> +	/**< Crypto processing for security protocol is processed by CPU
> >>> +	 * synchronously
> >>> +	 */
> >> though you are naming it cpu crypto, it is more like raw packet
> >> crypto, where you want to skip mbuf/crypto ops and work directly
> >> on raw buffers.
> > Yes, but we do want to do that (skip mbuf/crypto ops and use raw buffers),
> > because this API is destined for SW-backed implementations.
> > For that case crypto-ops, mbuf, enqueue/dequeue are just unnecessary overhead.
> I agree, we are also planning to take advantage of it for some specific
> use-cases in future.
> >>>    };
> >>>
> >>>    /** Security session protocol definition */
> >>> @@ -315,6 +334,7 @@ struct rte_security_session_conf {
> >>>    		struct rte_security_ipsec_xform ipsec;
> >>>    		struct rte_security_macsec_xform macsec;
> >>>    		struct rte_security_pdcp_xform pdcp;
> >>> +		struct rte_security_cpu_crypto_xform cpucrypto;
> >>>    	};
> >>>    	/**< Configuration parameters for security session */
> >>>    	struct rte_crypto_sym_xform *crypto_xform;
> >>> @@ -639,6 +659,35 @@ const struct rte_security_capability *
> >>>    rte_security_capability_get(struct rte_security_ctx *instance,
> >>>    			    struct rte_security_capability_idx *idx);
> >>>
> >>> +/**
> >>> + * Security vector structure, contains pointer to vector array and the length
> >>> + * of the array
> >>> + */
> >>> +struct rte_security_vec {
> >>> +	struct iovec *vec;
> >>> +	uint32_t num;
> >>> +};
> >>> +
> >> Just wondering if you want to change it to *in_vec and *out_vec, that
> >> will be helpful in future, if the out-of-place processing is required
> >> for CPU usecase as well?
> > > I suppose this is doable, though right now we don't plan to support such a model.
> > They will come in handy in the future. I plan to use it and we can
> > avoid API/ABI breakage if the placeholders are present.
> >
> >>> +/**
> >>> + * Processing bulk crypto workload with CPU
> >>> + *
> >>> + * @param	instance	security instance.
> >>> + * @param	sess		security session
> >>> + * @param	buf		array of buffer SGL vectors
> >>> + * @param	iv		array of IV pointers
> >>> + * @param	aad		array of AAD pointers
> >>> + * @param	digest		array of digest pointers
> >>> + * @param	status		array of status for the function to return
> >>> + * @param	num		number of elements in each array
> >>> + *
> >>> + */
> >>> +__rte_experimental
> >>> +void
> >>> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> >>> +		struct rte_security_session *sess,
> >>> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> >>> +		void *digest[], int status[], uint32_t num);
> >>> +
> > >> Why not make the return an int, to indicate whether this API completely
> > >> failed, or processed, or has some valid status to look into?
> > Good point, will change as suggested.
> 
> I have another suggestion w.r.t. iv, aad, digest etc. Why not put them
> in a structure, so that you will
> be able to add/remove variables without breaking the API prototype.


Just to confirm, you are talking about something like:

struct rte_security_vec {
   struct iovec *vec;
   uint32_t num;
};

struct rte_security_sym_vec {
      struct rte_security_vec buf;
      void *iv;
      void *aad;
      void *digest;
};

rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
	struct rte_security_session *sess, struct rte_security_sym_vec buf[],
               int status[], uint32_t num);

?
We thought about such a way, though for a PMD it would be
more plausible to have the same type of params grouped together,
i.e. void *in[], void *out[], void *digest[], ...
Another thing - the above grouping wouldn't help to avoid ABI breakage,
in case we'll need to add a new field into rte_security_sym_vec
(though it might help to avoid API breakage).

In theory other way is also possible:
struct rte_security_sym_vec {
      struct rte_security_vec *buf;
      void **iv;
      void **aad;
      void **digest;
};

rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
	struct rte_security_session *sess, struct rte_security_sym_vec *buf,
               int status[], uint32_t num);

And that might help for both ABI and API stability, 
but it looks really weird that way (at least to me).
Also this API is experimental and I suppose needs to stay experimental for
a few releases before we are sure nothing important is missing,
so probably API/ABI stability is not that high a concern for it right now.
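For what it's worth, a toy compilable version of that SoA variant (mock types with made-up names, not the actual rte_security definitions):

```c
#include <stdint.h>
#include <stddef.h>

/* mock of an SGL segment, standing in for struct iovec */
struct toy_vec { const uint8_t *base; uint32_t len; };

/* SoA layout: one parallel array per parameter type, so adding a new
 * parameter means adding one more array pointer to this struct */
struct toy_sym_vec {
	struct toy_vec *buf;
	void **iv;
	void **digest;
};

/* stub bulk process: succeed when both buffer and IV are present */
void
toy_process_cpu_crypto_bulk(struct toy_sym_vec *v, int status[], uint32_t num)
{
	for (uint32_t i = 0; i < num; i++)
		status[i] = (v->buf[i].len != 0 && v->iv[i] != NULL) ? 0 : -1;
}
```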

Konstantin

 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH 01/10] security: introduce CPU Crypto action type and API
  2019-10-01 15:27           ` Ananyev, Konstantin
@ 2019-10-02  2:47             ` Hemant Agrawal
  0 siblings, 0 replies; 84+ messages in thread
From: Hemant Agrawal @ 2019-10-02  2:47 UTC (permalink / raw)
  To: Ananyev, Konstantin, Zhang, Roy Fan, dev; +Cc: Doherty, Declan, Akhil Goyal

Hi Konstantin,

> > >>> This patch introduce new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > >>> action type to security library. The type represents performing
> > >>> crypto operation with CPU cycles. The patch also includes a new
> > >>> API to process crypto operations in bulk and the function pointers for
> PMDs.
> > >>>
> > >>> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> > >>> ---
> > >>>    lib/librte_security/rte_security.c           | 16 +++++++++
> > >>>    lib/librte_security/rte_security.h           | 51
> +++++++++++++++++++++++++++-
> > >>>    lib/librte_security/rte_security_driver.h    | 19 +++++++++++
> > >>>    lib/librte_security/rte_security_version.map |  1 +
> > >>>    4 files changed, 86 insertions(+), 1 deletion(-)
> > >>>
> > >>> diff --git a/lib/librte_security/rte_security.c
> > >>> b/lib/librte_security/rte_security.c
> > >>> index bc81ce15d..0f85c1b59 100644
> > >>> --- a/lib/librte_security/rte_security.c
> > >>> +++ b/lib/librte_security/rte_security.c
> > >>> @@ -141,3 +141,19 @@ rte_security_capability_get(struct
> > >>> rte_security_ctx *instance,
> > >>>
> > >>>    	return NULL;
> > >>>    }
> > >>> +
> > >>> +void
> > >>> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx
> *instance,
> > >>> +		struct rte_security_session *sess,
> > >>> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> > >>> +		void *digest[], int status[], uint32_t num) {
> > >>> +	uint32_t i;
> > >>> +
> > >>> +	for (i = 0; i < num; i++)
> > >>> +		status[i] = -1;
> > >>> +
> > >>> +	RTE_FUNC_PTR_OR_RET(*instance->ops->process_cpu_crypto_bulk);
> > >>> +	instance->ops->process_cpu_crypto_bulk(sess, buf, iv,
> > >>> +			aad, digest, status, num);
> > >>> +}
> > >>> diff --git a/lib/librte_security/rte_security.h
> > >>> b/lib/librte_security/rte_security.h
> > >>> index 96806e3a2..5a0f8901b 100644
> > >>> --- a/lib/librte_security/rte_security.h
> > >>> +++ b/lib/librte_security/rte_security.h
> > >>> @@ -18,6 +18,7 @@ extern "C" {
> > >>>    #endif
> > >>>
> > >>>    #include <sys/types.h>
> > >>> +#include <sys/uio.h>
> > >>>
> > >>>    #include <netinet/in.h>
> > >>>    #include <netinet/ip.h>
> > >>> @@ -272,6 +273,20 @@ struct rte_security_pdcp_xform {
> > >>>    	uint32_t hfn_threshold;
> > >>>    };
> > >>>
> > >>> +struct rte_security_cpu_crypto_xform {
> > >>> +	/** For cipher/authentication crypto operation the authentication
> may
> > >>> +	 * cover more content then the cipher. E.g., for IPSec ESP encryption
> > >>> +	 * with AES-CBC and SHA1-HMAC, the encryption happens after the
> ESP
> > >>> +	 * header but whole packet (apart from MAC header) is
> authenticated.
> > >>> +	 * The cipher_offset field is used to deduct the cipher data pointer
> > >>> +	 * from the buffer to be processed.
> > >>> +	 *
> > >>> +	 * NOTE this parameter shall be ignored by AEAD algorithms, since it
> > >>> +	 * uses the same offset for cipher and authentication.
> > >>> +	 */
> > >>> +	int32_t cipher_offset;
> > >>> +};
> > >>> +
> > >>>    /**
> > >>>     * Security session action type.
> > >>>     */
> > >>> @@ -286,10 +301,14 @@ enum rte_security_session_action_type {
> > >>>    	/**< All security protocol processing is performed inline during
> > >>>    	 * transmission
> > >>>    	 */
> > >>> -	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
> > >>> +	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
> > >>>    	/**< All security protocol processing including crypto is performed
> > >>>    	 * on a lookaside accelerator
> > >>>    	 */
> > >>> +	RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> > >>> +	/**< Crypto processing for security protocol is processed by CPU
> > >>> +	 * synchronously
> > >>> +	 */
> > >> though you are naming it cpu crypto, but it is more like raw packet
> > >> crypto, where you want to skip mbuf/crypto ops and directly wants
> > >> to work on raw buffer.
> > > Yes, but we do wat to do that (skip mbuf/crypto ops and use raw
> > > buffer), because this API is destined for SW backed implementation.
> > > For that case crypto-ops , mbuf, enqueue/dequeue are just unnecessary
> overhead.
> > I agree, we are also planning to take advantage of it for some
> > specific use-cases in future.
> > >>>    };
> > >>>
> > >>>    /** Security session protocol definition */ @@ -315,6 +334,7 @@
> > >>> struct rte_security_session_conf {
> > >>>    		struct rte_security_ipsec_xform ipsec;
> > >>>    		struct rte_security_macsec_xform macsec;
> > >>>    		struct rte_security_pdcp_xform pdcp;
> > >>> +		struct rte_security_cpu_crypto_xform cpucrypto;
> > >>>    	};
> > >>>    	/**< Configuration parameters for security session */
> > >>>    	struct rte_crypto_sym_xform *crypto_xform; @@ -639,6 +659,35
> > >>> @@ const struct rte_security_capability *
> > >>>    rte_security_capability_get(struct rte_security_ctx *instance,
> > >>>    			    struct rte_security_capability_idx *idx);
> > >>>
> > >>> +/**
> > >>> + * Security vector structure, contains pointer to vector array
> > >>> +and the length
> > >>> + * of the array
> > >>> + */
> > >>> +struct rte_security_vec {
> > >>> +	struct iovec *vec;
> > >>> +	uint32_t num;
> > >>> +};
> > >>> +
> > >> Just wondering if you want to change it to *in_vec and *out_vec,
> > >> that will be helpful in future, if the out-of-place processing is
> > >> required for CPU usecase as well?
> > > I suppose this is doable, though right now we don't plan to support such
> model.
> > They will come handy in future. I plan to use it in future and we can
> > skip the API/ABI breakage, if the placeholder are present
> > >
> > >>> +/**
> > >>> + * Processing bulk crypto workload with CPU
> > >>> + *
> > >>> + * @param	instance	security instance.
> > >>> + * @param	sess		security session
> > >>> + * @param	buf		array of buffer SGL vectors
> > >>> + * @param	iv		array of IV pointers
> > >>> + * @param	aad		array of AAD pointers
> > >>> + * @param	digest		array of digest pointers
> > >>> + * @param	status		array of status for the function to
> return
> > >>> + * @param	num		number of elements in each array
> > >>> + *
> > >>> + */
> > >>> +__rte_experimental
> > >>> +void
> > >>> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx
> *instance,
> > >>> +		struct rte_security_session *sess,
> > >>> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> > >>> +		void *digest[], int status[], uint32_t num);
> > >>> +
> > >> Why not make the return as int, to indicate whether this API
> > >> completely failed or processed or have some valid status to look into?
> > > Good point, will change as suggested.
> >
> > I have another suggestions w.r.t iv, aad, digest etc. Why not put them
> > in a structure, so that you will
> >
> > be able to add/remove the variable without breaking the API prototype.
> 
> 
> Just to confirm, you are talking about something like:
> 
> struct rte_security_vec {
>    struct iovec *vec;
>    uint32_t num;
> };

[Hemant] My idea is:
 struct rte_security_vec {
    struct iovec *vec;
    struct iovec *out_vec;
    uint32_t num_in;
    uint32_t num_out; 
};

> 
> struct rte_security_sym_vec {
>       struct rte_security_vec buf;
>       void *iv;
>       void *aad;
>       void *digest;
> };
> 
[Hemant]  or drop rte_security_vec altogether and fold its fields into rte_security_sym_vec itself.

> rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> 	struct rte_security_session *sess, struct rte_security_sym_vec buf[],
>                int status[], uint32_t num);
> 
> ?
> We thought about such way, though for PMD it would be more plausible to
> have same type of params grouped together, i.e. void *in[], void *out[], void
> *digest[], ...
> Another thing - above grouping wouldn't help to avoid ABI breakage, in case
> we'll need to add new field into rte_security_sym_vec (though it might help
> to avoid API breakage).
> 
> In theory other way is also possible:
> struct rte_security_sym_vec {
>       struct rte_security_vec *buf;
>       void **iv;
>       void **aad;
>       void **digest;
> };
> 
> rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> 	struct rte_security_session *sess, struct rte_security_sym_vec *buf,
>                int status[], uint32_t num);
> 
> And that might help for both ABI and API stability, but it looks really weird
> that way (at least to me).

[Hemant] I am fine either way. 

> Also this API is experimental and I suppose needs to stay experimental for
> a few releases before we are sure nothing important is missing, so probably
> API/ABI stability is not that high concern for it right now.
> 
> Konstantin
> 
> 

^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-10-01 14:49                               ` Ananyev, Konstantin
@ 2019-10-03 13:24                                 ` Akhil Goyal
  2019-10-07 12:53                                   ` Ananyev, Konstantin
  0 siblings, 1 reply; 84+ messages in thread
From: Akhil Goyal @ 2019-10-03 13:24 UTC (permalink / raw)
  To: Ananyev, Konstantin, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph


Hi Konstantin,
> 
> Hi Akhil,
> 
> > > > > > > > > > > > > > This action type allows the burst of symmetric crypto
> > > workload
> > > > > using
> > > > > > > > the
> > > > > > > > > > > > same
> > > > > > > > > > > > > > algorithm, key, and direction being processed by CPU
> cycles
> > > > > > > > > > synchronously.
> > > > > > > > > > > > > > This flexible action type does not require external
> hardware
> > > > > > > > involvement,
> > > > > > > > > > > > > > having the crypto workload processed synchronously,
> and is
> > > > > more
> > > > > > > > > > > > performant
> > > > > > > > > > > > > > than Cryptodev SW PMD due to the saved cycles on
> removed
> > > > > "async
> > > > > > > > > > mode
> > > > > > > > > > > > > > simulation" as well as 3 cacheline access of the crypto
> ops.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Does that mean application will not call the
> > > > > cryptodev_enqueue_burst
> > > > > > > > and
> > > > > > > > > > > > corresponding dequeue burst.
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, instead it just call
> rte_security_process_cpu_crypto_bulk(...)
> > > > > > > > > > > >
> > > > > > > > > > > > > It would be a new API something like process_packets and
> it
> > > will
> > > > > have
> > > > > > > > the
> > > > > > > > > > > > crypto processed packets while returning from the API?
> > > > > > > > > > > >
> > > > > > > > > > > > Yes, though the plan is that API will operate on raw data
> buffers,
> > > > > not
> > > > > > > > mbufs.
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I still do not understand why we cannot do with the
> > > conventional
> > > > > > > > crypto lib
> > > > > > > > > > > > only.
> > > > > > > > > > > > > As far as I can understand, you are not doing any protocol
> > > > > processing
> > > > > > > > or
> > > > > > > > > > any
> > > > > > > > > > > > value add
> > > > > > > > > > > > > To the crypto processing. IMO, you just need a
> synchronous
> > > > > crypto
> > > > > > > > > > processing
> > > > > > > > > > > > API which
> > > > > > > > > > > > > Can be defined in cryptodev, you don't need to re-create a
> > > crypto
> > > > > > > > session
> > > > > > > > > > in
> > > > > > > > > > > > the name of
> > > > > > > > > > > > > Security session in the driver just to do a synchronous
> > > processing.
> > > > > > > > > > > >
> > > > > > > > > > > > I suppose your question is why not to have
> > > > > > > > > > > > rte_crypto_process_cpu_crypto_bulk(...) instead?
> > > > > > > > > > > > The main reason is that would require disruptive changes in
> > > existing
> > > > > > > > > > cryptodev
> > > > > > > > > > > > API
> > > > > > > > > > > > (would cause ABI/API breakage).
> > > > > > > > > > > > Session for  RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> need
> > > > > some
> > > > > > > > extra
> > > > > > > > > > > > information
> > > > > > > > > > > > that normal crypto_sym_xform doesn't contain
> > > > > > > > > > > > (cipher offset from the start of the buffer, might be
> something
> > > extra
> > > > > in
> > > > > > > > > > future).
> > > > > > > > > > >
> > > > > > > > > > > Cipher offset will be part of rte_crypto_op.
> > > > > > > > > >
> > > > > > > > > > fill/read (+ alloc/free) is one of the main things that slowdown
> > > current
> > > > > > > > crypto-op
> > > > > > > > > > approach.
> > > > > > > > > > That's why the general idea - have all data that wouldn't change
> > > from
> > > > > packet
> > > > > > > > to
> > > > > > > > > > packet
> > > > > > > > > > included into the session and setup it once at session_init().
> > > > > > > > >
> > > > > > > > > I agree that you cannot use crypto-op.
> > > > > > > > > You can have the new API in crypto.
> > > > > > > > > As per the current patch, you only need cipher_offset which you
> can
> > > have
> > > > > it as
> > > > > > > > a parameter until
> > > > > > > > > You get it approved in the crypto xform. I believe it will be
> beneficial
> > > in
> > > > > case of
> > > > > > > > other crypto cases as well.
> > > > > > > > > We can have cipher offset at both places(crypto-op and
> > > cipher_xform). It
> > > > > will
> > > > > > > > give flexibility to the user to
> > > > > > > > > override it.
> > > > > > > >
> > > > > > > > After having another thought on your proposal:
> > > > > > > > Probably we can introduce new rte_crypto_sym_xform_types for
> CPU
> > > > > related
> > > > > > > > stuff here?
> > > > > > >
> > > > > > > I also thought of adding new xforms, but that wont serve the purpose
> for
> > > > > may be all the cases.
> > > > > > > You would be needing all information currently available in the
> current
> > > > > xforms.
> > > > > > > So if you are adding new fields in the new xform, the size will be more
> > > than
> > > > > that of the union of xforms.
> > > > > > > ABI breakage would still be there.
> > > > > > >
> > > > > > > If you think a valid compression of the AEAD xform can be done, then
> > > that
> > > > > can be done for each of the
> > > > > > > Xforms and we can have a solution to this issue.
> > > > > >
> > > > > > I think that we can re-use iv.offset for our purposes (for crypto offset).
> > > > > > So for now we can make that path work without any ABI breakage.
> > > > > > Fan, please feel free to correct me here, if I missed something.
> > > > > > If in future we would need to add some extra information it might
> > > > > > require ABI breakage, though by now I don't envision anything
> particular to
> > > > > add.
> > > > > > Anyway, if there is no objection to go that way, we can try to make
> > > > > > these changes for v2.
> > > > > >
> > > > >
> > > > > Actually, after looking at it more deeply it appears not that easy as I
> thought
> > > it
> > > > > would be :)
> > > > > Below is a very draft version of proposed API additions.
> > > > > I think it avoids ABI breakages right now and provides enough flexibility
> for
> > > > > future extensions (if any).
> > > > > For now, it doesn't address your comments about naming conventions
> > > (_CPU_
> > > > > vs _SYNC_) , etc.
> > > > > but I suppose is comprehensive enough to provide a main idea beyond it.
> > > > > Akhil and other interested parties, please try to review and provide
> feedback
> > > > > ASAP,
> > > > > as related changes would take some time and we still like to hit 19.11
> > > deadline.
> > > > > Konstantin
> > > > >
> > > > >  diff --git a/lib/librte_cryptodev/rte_crypto_sym.h
> > > > > b/lib/librte_cryptodev/rte_crypto_sym.h
> > > > > index bc8da2466..c03069e23 100644
> > > > > --- a/lib/librte_cryptodev/rte_crypto_sym.h
> > > > > +++ b/lib/librte_cryptodev/rte_crypto_sym.h
> > > > > @@ -103,6 +103,9 @@ rte_crypto_cipher_operation_strings[];
> > > > >   *
> > > > >   * This structure contains data relating to Cipher (Encryption and
> Decryption)
> > > > >   *  use to create a session.
> > > > > + * Actually I was wrong saying that we don't have free space inside
> xforms.
> > > > > + * Making key struct packed (see below) allows us to regain 6B that could
> be
> > > > > + * used for future extensions.
> > > > >   */
> > > > >  struct rte_crypto_cipher_xform {
> > > > >         enum rte_crypto_cipher_operation op;
> > > > > @@ -116,7 +119,25 @@ struct rte_crypto_cipher_xform {
> > > > >         struct {
> > > > >                 const uint8_t *data;    /**< pointer to key data */
> > > > >                 uint16_t length;        /**< key length in bytes */
> > > > > -       } key;
> > > > > +       } __attribute__((__packed__)) key;
> > > > > +
> > > > > +       /**
> > > > > +         * offset for cipher to start within user provided data buffer.
> > > > > +        * Fan suggested another (and less space consuming way) -
> > > > > +         * reuse iv.offset space below, by changing:
> > > > > +        * struct {uint16_t offset, length;} iv;
> > > > > +        * to an unnamed union:
> > > > > +        * union {
> > > > > +        *      struct {uint16_t offset, length;} iv;
> > > > > +        *      struct {uint16_t iv_len, crypto_offset} cpu_crypto_param;
> > > > > +        * };
> > > > > +        * Both approaches seems ok to me in general.
> > > >
> > > > No strong opinions here. OK with this one.
> > > >
> > > > > +        * Comments/suggestions are welcome.
> > > > > +         */
> > > > > +       uint16_t offset;
> > >
> > > After another thought - it is probably a bit better to have offset as a separate
> > > field.
> > > In that case we can use the same xforms to create both type of sessions.
> > ok
> > >
> > > > > +
> > > > > +       uint8_t reserved1[4];
> > > > > +
> > > > >         /**< Cipher key
> > > > >          *
> > > > >          * For the RTE_CRYPTO_CIPHER_AES_F8 mode of operation,
> key.data
> > > will
> > > > > @@ -284,7 +305,7 @@ struct rte_crypto_auth_xform {
> > > > >         struct {
> > > > >                 const uint8_t *data;    /**< pointer to key data */
> > > > >                 uint16_t length;        /**< key length in bytes */
> > > > > -       } key;
> > > > > +       } __attribute__((__packed__)) key;
> > > > >         /**< Authentication key data.
> > > > >          * The authentication key length MUST be less than or equal to the
> > > > >          * block size of the algorithm. It is the callers responsibility to
> > > > > @@ -292,6 +313,8 @@ struct rte_crypto_auth_xform {
> > > > >          * (for example RFC 2104, FIPS 198a).
> > > > >          */
> > > > >
> > > > > +       uint8_t reserved1[6];
> > > > > +
> > > > >         struct {
> > > > >                 uint16_t offset;
> > > > >                 /**< Starting point for Initialisation Vector or Counter,
> > > > > @@ -376,7 +399,12 @@ struct rte_crypto_aead_xform {
> > > > >         struct {
> > > > >                 const uint8_t *data;    /**< pointer to key data */
> > > > >                 uint16_t length;        /**< key length in bytes */
> > > > > -       } key;
> > > > > +       } __attribute__((__packed__)) key;
> > > > > +
> > > > > +       /** offset for cipher to start within data buffer */
> > > > > +       uint16_t cipher_offset;
> > > > > +
> > > > > +       uint8_t reserved1[4];
> > > > >
> > > > >         struct {
> > > > >                 uint16_t offset;
> > > > > diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> > > > > b/lib/librte_cryptodev/rte_cryptodev.h
> > > > > index e175b838c..c0c7bfed7 100644
> > > > > --- a/lib/librte_cryptodev/rte_cryptodev.h
> > > > > +++ b/lib/librte_cryptodev/rte_cryptodev.h
> > > > > @@ -1272,6 +1272,101 @@ void *
> > > > >  rte_cryptodev_sym_session_get_user_data(
> > > > >                                         struct rte_cryptodev_sym_session *sess);
> > > > >
> > > > > +/*
> > > > > + * After several thoughts decided not to try to squeeze CPU_CRYPTO
> > > > > + * into existing rte_crypto_sym_session structure/API, but instead
> > > > > + * introduce an extension to it via a new fully opaque
> > > > > + * struct rte_crypto_cpu_sym_session and additional related API.
> > > >
> > > >
> > > > What all things do we need to squeeze?
> > > > In this proposal I do not see the new struct cpu_sym_session  defined here.
> > >
> > > The plan is to have it totally opaque to the user, i.e. just:
> > > struct rte_crypto_cpu_sym_session;
> > > in public header files.
> > >
> > > > I believe you will have same lib API/struct for cpu_sym_session  and
> > > sym_session.
> > >
> > > I thought about such way, but there are few things that looks clumsy to me:
> > > 1. Right now there is no 'type' (or so) field inside rte_cryptodev_sym_session,
> > > so it is not possible to easy distinguish what session do you have: lksd_sym or
> > > cpu_sym.
> > > In theory, there is a hole of 4B inside rte_cryptodev_sym_session, so we can
> add
> > > some extra field
> > > here, but in that case  we wouldn't be able to use the same xform for both
> > > lksd_sym or cpu_sym
> > > (which seems really plausible thing for me).
> > > 2.  Majority of rte_cryptodev_sym_session fields I think are unnecessary for
> > > rte_crypto_cpu_sym_session:
> > > sess_data[], opaque_data, user_data, nb_drivers.
> > > All that consumes space, that could be used somewhere else instead.
> > > 3. I am a bit reluctant to touch existing rte_cryptodev API - to avoid any
> > > breakages I can't foresee right now.
> > > From other side - if we'll add new functions/structs for cpu_sym_session we
> can
> > > mark it
> > > and keep it for some time as experimental, so further changes (if needed)
> would
> > > still be possible.
> > >
> >
> > OK let us assume that you have a separate structure. But I have a few queries:
> > 1. how can multiple drivers use a same session
> 
> As a short answer: they can't.
> It is pretty much the same approach as with rte_security - each device needs to
> create/init its own session.
> So upper layer would need to maintain its own array (or so) for such case.
> Though the question is why would you like to have same session over multiple
> SW backed devices?
> As it would be anyway just a synchronous function call that will be executed on
> the same cpu.

I may have a single fat tunnel which may be distributed over multiple
cores, with each core affined to a different SW device.
So a single session may be accessed by multiple devices.

One more example: depending on packet size, I may switch between
HW/SW PMDs with the same session.

> 
> > 2. Can somebody use the scheduler pmd for scheduling the different type of
> payloads for the same session?
> 
> In theory yes.
> Though for that scheduler pmd should have inside it's
> rte_crypto_cpu_sym_session an array of pointers to
> the underlying devices sessions.
> 
> >
> > With your proposal the APIs would be very specific to your use case only.
> 
> Yes in some way.
> I consider that API specific for SW backed crypto PMDs.
> I can hardly see how any 'real HW' PMDs (lksd-none, lksd-proto) will benefit
> from it.
> Current crypto-op API is very much HW oriented.
> Which is ok, that's for it was intended for, but I think we also need one that
> would be designed
> for SW backed implementation in mind.

We may re-use your API for HW PMDs as well, which do not have the
crypto-op/mbuf requirement.
The return status of your new process API may say 'processed'
or 'enqueued'. If it is 'enqueued', we may have a new API for raw
buffer dequeue as well.

This requirement can apply to any hardware PMD, QAT for example.
That is why a dev-ops based approach would be a better option.

> 
> > When you would add more functionality to this sync API/struct, it will end up
> being the same API/struct.
> >
> > Let us  see how close/ far we are from the existing APIs when the actual
> implementation is done.
> >
> > > > I am not sure if that would be needed.
> > > > It would be internal to the driver that if synchronous processing is
> > > supported(from feature flag) and
> > > > Have relevant fields in xform(the newly added ones which are packed as
> per
> > > your suggestions) set,
> > > > It will create that type of session.
> > > >
> > > >
> > > > > + * Main points:
> > > > > + * - Current crypto-dev API is reasonably mature and it is desirable
> > > > > + *   to keep it unchanged (API/ABI stability). From other side, this
> > > > > + *   new sync API is new one and probably would require extra changes.
> > > > > + *   Having it as a new one allows to mark it as experimental, without
> > > > > + *   affecting existing one.
> > > > > + * - Fully opaque cpu_sym_session structure gives more flexibility
> > > > > + *   to the PMD writers and again allows to avoid ABI breakages in future.
> > > > > + * - process() function per set of xforms
> > > > > + *   allows to expose different process() functions for different
> > > > > + *   xform combinations. PMD writer can decide, does he wants to
> > > > > + *   push all supported algorithms into one process() function,
> > > > > + *   or spread it across several ones.
> > > > > + *   I.E. More flexibility for PMD writer.
> > > >
> > > > Which process function should be chosen is internal to PMD, how would
> that
> > > info
> > > > be visible to the application or the library. These will get stored in the
> session
> > > private
> > > > data. It would be upto the PMD writer, to store the per session process
> > > function in
> > > > the session private data.
> > > >
> > > > Process function would be a dev ops just like enc/deq operations and it
> should
> > > call
> > > > The respective process API stored in the session private data.
> > >
> > > That model (via devops) is possible, but has several drawbacks from my
> > > perspective:
> > >
> > > 1. It means we'll need to pass dev_id as a parameter to process() function.
> > > Though in fact dev_id is not a relevant information for us here
> > > (all we need is a pointer to the session and a pointer to the function to call)
> > > and I tried to avoid using it in data-path functions for that API.
> >
> > You have a single vdev, but someone may have multiple vdevs for each thread,
> or may
> > Have same dev with multiple queues for each core.
> 
> That's fine. As I said above it is a SW backed implementation.
> Each session has to be a separate entity that contains all necessary information
> (keys, alg/mode info,  etc.)  to process input buffers.
> Plus we need the actual function pointer to call.
> I just don't see what for we need a dev_id in that situation.

To locate the driver's private data stored inside the session.

> Again, here we don't need care about queues and their pinning to cores.
> If let say someone would like to process buffers from the same IPsec SA on 2
> different cores in parallel, he can just create 2 sessions for the same xform,
> give one to thread #1  and second to thread #2.
> After that both threads are free to call process(this_thread_ses, ...) at will.

Say you have a 16-core device to handle 100G of traffic on a single tunnel.
Will we make 16 sessions with the same parameters?

> 
> >
> > > 2. As you pointed in that case it will be just one process() function per device.
> > > So if PMD would like to have several process() functions for different type of
> > > sessions
> > > (let say one per alg) first thing it has to do inside it's process() - read session
> data
> > > and
> > > based on that, do a jump/call to particular internal sub-routine.
> > > Something like:
> > > driver_id = get_pmd_driver_id();
> > > priv_ses = ses->sess_data[driver_id];
> > > Then either:
> > > switch(priv_sess->alg) {case XXX: process_XXX(priv_sess, ...);break;...}
> > > OR
> > > priv_ses->process(priv_sess, ...);
> > >
> > > to select and call the proper function.
> > > Looks like totally unnecessary overhead to me.
> > > Though if we'll have ability to query/extract some sort session_ops based on
> the
> > > xform -
> > > we can avoid this extra de-reference + jump/call step.
> >
> > What is the issue in the priv_ses->process(); approach?
> 
> Nothing at all.
> What I am saying that schema with dev_ops
> dev[dev_id]->dev_ops.process(ses->priv_ses[driver_id], ...)
>    |
>    |-> priv_ses->process(...)
> 
> Has bigger overhead then just:
> process(ses,...);
> 
> So what for to introduce extra-level of indirection here?

Explained above.

> 
> > I don't understand what are you saving by not doing this.
> > In any case you would need to identify which session correspond to which
> process().
> 
> Yes, sure, but I think we can make user to store information that relationship,
> in a way he likes: store process() pointer for each session, or group sessions
> that share the same process() somehow, or...

Whatever relationship the user builds and stores will complicate its life.
If we can hide that information in the driver, then what is the issue, and the user
will not need to worry. He would just call process() and the driver will choose which
process function needs to be called.

I think we should have a POC around this and see the difference in the cycle count.
IMO it would be negligible and we would end up making a generic API set which
can be used by others as well.

> 
> > For that you would be doing it somewhere in your data path.
> 
> Why at data-path?
> Only once at session creation/initialization time.
> Or might be even once per group of sessions.
> 
> >
> > >
> > > >
> > > > I am not sure if you would need a new session init API for this as nothing
> would
> > > be visible to
> > > > the app or lib.
> > > >
> > > > > + * - Not storing process() pointer inside the session -
> > > > > + *   Allows user to choose does he want to store a process() pointer
> > > > > + *   per session, or per group of sessions for that device that share
> > > > > + *   the same input xforms. I.E. extra flexibility for the user,
> > > > > + *   plus allows us to keep cpu_sym_session totally opaque, see above.
> > > >
> > > > If multiple sessions need to be processed via the same process function,
> > > > PMD would save the same process in all the sessions, I don't think there
> would
> > > > be any perf overhead with that.
> > >
> > > I think it would, see above.
> > >
> > > >
> > > > > + * Sketched usage model:
> > > > > + * ....
> > > > > + * /* control path, alloc/init session */
> > > > > + * int32_t sz = rte_crypto_cpu_sym_session_size(dev_id, &xform);
> > > > > + * struct rte_crypto_cpu_sym_session *ses = user_alloc(..., sz);
> > > > > + * rte_crypto_cpu_sym_process_t process =
> > > > > + *     rte_crypto_cpu_sym_session_func(dev_id, &xform);
> > > > > + * rte_crypto_cpu_sym_session_init(dev_id, ses, &xform);
> > > > > + * ...
> > > > > + * /* data-path*/
> > > > > + * process(ses, ....);
> > > > > + * ....
> > > > > + * /* control path, terminate/free session */
> > > > > + * rte_crypto_cpu_sym_session_fini(dev_id, ses);
> > > > > + */
> > > > > +
> > > > > +/**
> > > > > + * vector structure, contains pointer to vector array and the length
> > > > > + * of the array
> > > > > + */
> > > > > +struct rte_crypto_vec {
> > > > > +       struct iovec *vec;
> > > > > +       uint32_t num;
> > > > > +};
> > > > > +
> > > > > +/*
> > > > > + * Data-path bulk process crypto function.
> > > > > + */
> > > > > +typedef void (*rte_crypto_cpu_sym_process_t)(
> > > > > +               struct rte_crypto_cpu_sym_session *sess,
> > > > > +               struct rte_crypto_vec buf[], void *iv[], void *aad[],
> > > > > +               void *digest[], int status[], uint32_t num);
> > > > > +/*
> > > > > + * for given device return process function specific to input xforms
> > > > > + * on error - return NULL and set rte_errno value.
> > > > > + * Note that for the same input xforms the same device should return
> > > > > + * the same process function.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +rte_crypto_cpu_sym_process_t
> > > > > +rte_crypto_cpu_sym_session_func(uint8_t dev_id,
> > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > +
> > > > > +/*
> > > > > + * Return required session size in bytes for given set of xforms.
> > > > > + * if xforms == NULL, then return the max possible session size,
> > > > > + * that would fit session for any supported by the device algorithm.
> > > > > + * if CPU mode is not supported at all, or the algorithm requested
> > > > > + * in xform is not supported, then return -ENOTSUP.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +int
> > > > > +rte_crypto_cpu_sym_session_size(uint8_t dev_id,
> > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > +
> > > > > +/*
> > > > > + * Initialize session.
> > > > > + * It is caller responsibility to allocate enough space for it.
> > > > > + * See rte_crypto_cpu_sym_session_size above.
> > > > > + */
> > > > > +__rte_experimental
> > > > > +int rte_crypto_cpu_sym_session_init(uint8_t dev_id,
> > > > > +                       struct rte_crypto_cpu_sym_session *sess,
> > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > +
> > > > > +__rte_experimental
> > > > > +void
> > > > > +rte_crypto_cpu_sym_session_fini(uint8_t dev_id,
> > > > > +                       struct rte_crypto_cpu_sym_session *sess);
> > > > > +
> > > > > +
> > > > >  #ifdef __cplusplus
> > > > >  }
> > > > >  #endif
> > > > > diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > > b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > > index defe05ea0..ed7e63fab 100644
> > > > > --- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > > +++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > > @@ -310,6 +310,20 @@ typedef void
> > > (*cryptodev_sym_free_session_t)(struct
> > > > > rte_cryptodev *dev,
> > > > >  typedef void (*cryptodev_asym_free_session_t)(struct rte_cryptodev
> *dev,
> > > > >                 struct rte_cryptodev_asym_session *sess);
> > > > >
> > > > > +typedef int (*cryptodev_cpu_sym_session_size_t) (struct rte_cryptodev
> > > *dev,
> > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > +
> > > > > +typedef int (*cryptodev_cpu_sym_session_init_t) (struct rte_cryptodev
> > > *dev,
> > > > > +                       struct rte_crypto_cpu_sym_session *sess,
> > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > +
> > > > > +typedef void (*cryptodev_cpu_sym_session_fini_t) (struct rte_cryptodev
> > > *dev,
> > > > > +                       struct rte_crypto_cpu_sym_session *sess);
> > > > > +
> > > > > +typedef rte_crypto_cpu_sym_process_t
> > > (*cryptodev_cpu_sym_session_func_t)
> > > > > (
> > > > > +                       struct rte_cryptodev *dev,
> > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > +
> > > > >  /** Crypto device operations function pointer table */
> > > > >  struct rte_cryptodev_ops {
> > > > >         cryptodev_configure_t dev_configure;    /**< Configure device. */
> > > > > @@ -343,6 +357,11 @@ struct rte_cryptodev_ops {
> > > > >         /**< Clear a Crypto sessions private data. */
> > > > >         cryptodev_asym_free_session_t asym_session_clear;
> > > > >         /**< Clear a Crypto sessions private data. */
> > > > > +
> > > > > +       cryptodev_cpu_sym_session_size_t sym_cpu_session_get_size;
> > > > > +       cryptodev_cpu_sym_session_func_t sym_cpu_session_get_func;
> > > > > +       cryptodev_cpu_sym_session_init_t sym_cpu_session_init;
> > > > > +       cryptodev_cpu_sym_session_fini_t sym_cpu_session_fini;
> > > > >  };
> > > > >
> > > > >
> > > > >


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API
  2019-10-03 13:24                                 ` Akhil Goyal
@ 2019-10-07 12:53                                   ` Ananyev, Konstantin
  2019-10-09  7:20                                     ` Akhil Goyal
  0 siblings, 1 reply; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-10-07 12:53 UTC (permalink / raw)
  To: Akhil Goyal, dev, De Lara Guarch, Pablo, Thomas Monjalon
  Cc: Zhang, Roy Fan, Doherty, Declan, Anoob Joseph


Hi Akhil,

> > > > > > > > > > > > > > > This action type allows the burst of symmetric crypto
> > > > workload
> > > > > > using
> > > > > > > > > the
> > > > > > > > > > > > > same
> > > > > > > > > > > > > > > algorithm, key, and direction being processed by CPU
> > cycles
> > > > > > > > > > > synchronously.
> > > > > > > > > > > > > > > This flexible action type does not require external
> > hardware
> > > > > > > > > involvement,
> > > > > > > > > > > > > > > having the crypto workload processed synchronously,
> > and is
> > > > > > more
> > > > > > > > > > > > > performant
> > > > > > > > > > > > > > > than Cryptodev SW PMD due to the saved cycles on
> > removed
> > > > > > "async
> > > > > > > > > > > mode
> > > > > > > > > > > > > > > simulation" as well as 3 cacheline access of the crypto
> > ops.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Does that mean application will not call the
> > > > > > cryptodev_enqueue_burst
> > > > > > > > > and
> > > > > > > > > > > > > corresponding dequeue burst.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, instead it just call
> > rte_security_process_cpu_crypto_bulk(...)
> > > > > > > > > > > > >
> > > > > > > > > > > > > > It would be a new API something like process_packets and
> > it
> > > > will
> > > > > > have
> > > > > > > > > the
> > > > > > > > > > > > > crypto processed packets while returning from the API?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, though the plan is that API will operate on raw data
> > buffers,
> > > > > > not
> > > > > > > > > mbufs.
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I still do not understand why we cannot do this with the
> > > > > > > > > > > > > > conventional crypto lib only.
> > > > > > > > > > > > > > As far as I can understand, you are not doing any protocol
> > > > > > > > > > > > > > processing or adding any value to the crypto processing.
> > > > > > > > > > > > > > IMO, you just need a synchronous crypto processing API,
> > > > > > > > > > > > > > which can be defined in cryptodev; you don't need to
> > > > > > > > > > > > > > re-create a crypto session in the name of a security
> > > > > > > > > > > > > > session in the driver just to do synchronous processing.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I suppose your question is why not have
> > > > > > > > > > > > > rte_crypto_process_cpu_crypto_bulk(...) instead?
> > > > > > > > > > > > > The main reason is that it would require disruptive changes
> > > > > > > > > > > > > in the existing cryptodev API (would cause ABI/API breakage).
> > > > > > > > > > > > > A session for RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO needs
> > > > > > > > > > > > > some extra information that the normal crypto_sym_xform
> > > > > > > > > > > > > doesn't contain (cipher offset from the start of the buffer;
> > > > > > > > > > > > > might be something extra in future).
> > > > > > > > > > > >
> > > > > > > > > > > > Cipher offset will be part of rte_crypto_op.
> > > > > > > > > > >
> > > > > > > > > > > fill/read (+ alloc/free) is one of the main things that slow
> > > > > > > > > > > down the current crypto-op approach.
> > > > > > > > > > > That's why the general idea is to have all data that wouldn't
> > > > > > > > > > > change from packet to packet included in the session, and set
> > > > > > > > > > > it up once at session_init().
> > > > > > > > > >
> > > > > > > > > > I agree that you cannot use crypto-op.
> > > > > > > > > > You can have the new API in crypto.
> > > > > > > > > > As per the current patch, you only need cipher_offset, which you
> > > > > > > > > > can have as a parameter until you get it approved in the crypto
> > > > > > > > > > xform. I believe it will be beneficial for other crypto cases
> > > > > > > > > > as well.
> > > > > > > > > > We can have the cipher offset at both places (crypto-op and
> > > > > > > > > > cipher_xform). It will give the user the flexibility to
> > > > > > > > > > override it.
> > > > > > > > >
> > > > > > > > > After having another thought on your proposal:
> > > > > > > > > probably we can introduce new rte_crypto_sym_xform_types for
> > > > > > > > > CPU related stuff here?
> > > > > > > >
> > > > > > > > I also thought of adding new xforms, but that won't serve the
> > > > > > > > purpose for maybe all the cases.
> > > > > > > > You would need all the information currently available in the
> > > > > > > > existing xforms. So if you are adding new fields in the new xform,
> > > > > > > > the size will be more than that of the union of xforms.
> > > > > > > > ABI breakage would still be there.
> > > > > > > >
> > > > > > > > If you think a valid compression of the AEAD xform can be done,
> > > > > > > > then that can be done for each of the xforms and we can have a
> > > > > > > > solution to this issue.
> > > > > > >
> > > > > > > I think that we can re-use iv.offset for our purposes (for the
> > > > > > > crypto offset).
> > > > > > > So for now we can make that path work without any ABI breakage.
> > > > > > > Fan, please feel free to correct me here if I missed something.
> > > > > > > If in future we would need to add some extra information, it might
> > > > > > > require ABI breakage, though by now I don't envision anything
> > > > > > > particular to add.
> > > > > > > Anyway, if there is no objection to going that way, we can try to
> > > > > > > make these changes for v2.
> > > > > > >
> > > > > >
> > > > > > Actually, after looking at it more deeply, it appears not as easy as
> > > > > > I thought it would be :)
> > > > > > Below is a very draft version of the proposed API additions.
> > > > > > I think it avoids ABI breakages right now and provides enough
> > > > > > flexibility for future extensions (if any).
> > > > > > For now, it doesn't address your comments about naming conventions
> > > > > > (_CPU_ vs _SYNC_), etc., but I suppose it is comprehensive enough to
> > > > > > convey the main idea behind it.
> > > > > > Akhil and other interested parties, please try to review and provide
> > > > > > feedback ASAP, as the related changes would take some time and we
> > > > > > would still like to hit the 19.11 deadline.
> > > > > > Konstantin
> > > > > >
> > > > > >  diff --git a/lib/librte_cryptodev/rte_crypto_sym.h b/lib/librte_cryptodev/rte_crypto_sym.h
> > > > > > index bc8da2466..c03069e23 100644
> > > > > > --- a/lib/librte_cryptodev/rte_crypto_sym.h
> > > > > > +++ b/lib/librte_cryptodev/rte_crypto_sym.h
> > > > > > @@ -103,6 +103,9 @@ rte_crypto_cipher_operation_strings[];
> > > > > >   *
> > > > > >   * This structure contains data relating to Cipher (Encryption and Decryption)
> > > > > >   *  use to create a session.
> > > > > > + * Actually I was wrong saying that we don't have free space inside xforms.
> > > > > > + * Making the key struct packed (see below) allows us to regain 6B that could
> > > > > > + * be used for future extensions.
> > > > > >   */
> > > > > >  struct rte_crypto_cipher_xform {
> > > > > >         enum rte_crypto_cipher_operation op;
> > > > > > @@ -116,7 +119,25 @@ struct rte_crypto_cipher_xform {
> > > > > >         struct {
> > > > > >                 const uint8_t *data;    /**< pointer to key data */
> > > > > >                 uint16_t length;        /**< key length in bytes */
> > > > > > -       } key;
> > > > > > +       } __attribute__((__packed__)) key;
> > > > > > +
> > > > > > +       /**
> > > > > > +        * Offset for cipher to start within the user provided data buffer.
> > > > > > +        * Fan suggested another (and less space consuming) way -
> > > > > > +        * reuse the iv.offset space below, by changing:
> > > > > > +        * struct {uint16_t offset, length;} iv;
> > > > > > +        * to an unnamed union:
> > > > > > +        * union {
> > > > > > +        *      struct {uint16_t offset, length;} iv;
> > > > > > +        *      struct {uint16_t iv_len, crypto_offset;} cpu_crypto_param;
> > > > > > +        * };
> > > > > > +        * Both approaches seem ok to me in general.
> > > > >
> > > > > No strong opinions here. OK with this one.
> > > > >
> > > > > > +        * Comments/suggestions are welcome.
> > > > > > +        */
> > > > > > +       uint16_t offset;
> > > >
> > > > After another thought - it is probably a bit better to have offset as a
> > > > separate field.
> > > > In that case we can use the same xforms to create both types of sessions.
> > > ok
> > > >
> > > > > > +
> > > > > > +       uint8_t reserved1[4];
> > > > > > +
> > > > > >         /**< Cipher key
> > > > > >          *
> > > > > >          * For the RTE_CRYPTO_CIPHER_AES_F8 mode of operation, key.data will
> > > > > > @@ -284,7 +305,7 @@ struct rte_crypto_auth_xform {
> > > > > >         struct {
> > > > > >                 const uint8_t *data;    /**< pointer to key data */
> > > > > >                 uint16_t length;        /**< key length in bytes */
> > > > > > -       } key;
> > > > > > +       } __attribute__((__packed__)) key;
> > > > > >         /**< Authentication key data.
> > > > > >          * The authentication key length MUST be less than or equal to the
> > > > > >          * block size of the algorithm. It is the callers responsibility to
> > > > > > @@ -292,6 +313,8 @@ struct rte_crypto_auth_xform {
> > > > > >          * (for example RFC 2104, FIPS 198a).
> > > > > >          */
> > > > > >
> > > > > > +       uint8_t reserved1[6];
> > > > > > +
> > > > > >         struct {
> > > > > >                 uint16_t offset;
> > > > > >                 /**< Starting point for Initialisation Vector or Counter,
> > > > > > @@ -376,7 +399,12 @@ struct rte_crypto_aead_xform {
> > > > > >         struct {
> > > > > >                 const uint8_t *data;    /**< pointer to key data */
> > > > > >                 uint16_t length;        /**< key length in bytes */
> > > > > > -       } key;
> > > > > > +       } __attribute__((__packed__)) key;
> > > > > > +
> > > > > > +       /** offset for cipher to start within data buffer */
> > > > > > +       uint16_t cipher_offset;
> > > > > > +
> > > > > > +       uint8_t reserved1[4];
> > > > > >
> > > > > >         struct {
> > > > > >                 uint16_t offset;
> > > > > > diff --git a/lib/librte_cryptodev/rte_cryptodev.h
> > > > > > b/lib/librte_cryptodev/rte_cryptodev.h
> > > > > > index e175b838c..c0c7bfed7 100644
> > > > > > --- a/lib/librte_cryptodev/rte_cryptodev.h
> > > > > > +++ b/lib/librte_cryptodev/rte_cryptodev.h
> > > > > > @@ -1272,6 +1272,101 @@ void *
> > > > > >  rte_cryptodev_sym_session_get_user_data(
> > > > > >                                         struct rte_cryptodev_sym_session *sess);
> > > > > >
> > > > > > +/*
> > > > > > + * After several thoughts decided not to try to squeeze CPU_CRYPTO
> > > > > > + * into the existing rte_crypto_sym_session structure/API, but instead
> > > > > > + * introduce an extension to it via a new fully opaque
> > > > > > + * struct rte_crypto_cpu_sym_session and additional related API.
> > > > >
> > > > >
> > > > > What exactly do we need to squeeze?
> > > > > In this proposal I do not see the new struct cpu_sym_session defined here.
> > > >
> > > > The plan is to have it totally opaque to the user, i.e. just:
> > > > struct rte_crypto_cpu_sym_session;
> > > > in public header files.
> > > >
> > > > > I believe you will have the same lib API/struct for cpu_sym_session
> > > > > and sym_session.
> > > >
> > > > I thought about such a way, but there are a few things that look clumsy
> > > > to me:
> > > > 1. Right now there is no 'type' (or so) field inside
> > > > rte_cryptodev_sym_session, so it is not possible to easily distinguish
> > > > what session you have: lksd_sym or cpu_sym.
> > > > In theory, there is a hole of 4B inside rte_cryptodev_sym_session, so we
> > > > can add some extra field here, but in that case we wouldn't be able to
> > > > use the same xform for both lksd_sym and cpu_sym (which seems a really
> > > > plausible thing to me).
> > > > 2. The majority of rte_cryptodev_sym_session fields are, I think,
> > > > unnecessary for rte_crypto_cpu_sym_session:
> > > > sess_data[], opaque_data, user_data, nb_drivers.
> > > > All that consumes space that could be used somewhere else instead.
> > > > 3. I am a bit reluctant to touch the existing rte_cryptodev API - to
> > > > avoid any breakages I can't foresee right now.
> > > > From the other side - if we add new functions/structs for
> > > > cpu_sym_session we can mark them and keep them for some time as
> > > > experimental, so further changes (if needed) would still be possible.
> > > >
> > >
> > > OK, let us assume that you have a separate structure. But I have a few
> > > queries:
> > > 1. How can multiple drivers use the same session?
> >
> > As a short answer: they can't.
> > It is pretty much the same approach as with rte_security - each device
> > needs to create/init its own session.
> > So the upper layer would need to maintain its own array (or so) for such
> > a case.
> > Though the question is: why would you like to have the same session over
> > multiple SW backed devices?
> > As it would anyway be just a synchronous function call executed on the
> > same cpu.
> 
> I may have a single FAT tunnel which may be distributed over multiple
> cores, and each core is affined to a different SW device.

If it is pure SW, then we don't need multiple devices for such scenario.
Device in that case is pure abstraction that we can skip.

> So a single session may be accessed by multiple devices.
> 
> One more example would be depending on packet sizes, I may switch between
> HW/SW PMDs with the same session.

Sure, but then we'll have multiple sessions.
BTW, we have same thing now - these private session pointers are just stored
inside the same rte_crypto_sym_session.
And if user wants to support this model, he would also need to store <dev_id, queue_id>
pair for each HW device anyway.

> 
> >
> > > 2. Can somebody use the scheduler pmd for scheduling the different type of
> > payloads for the same session?
> >
> > In theory yes.
> > Though for that the scheduler pmd should have inside its
> > rte_crypto_cpu_sym_session an array of pointers to
> > the underlying devices' sessions.
> >
> > >
> > > With your proposal the APIs would be very specific to your use case only.
> >
> > Yes, in some way.
> > I consider that API specific to SW backed crypto PMDs.
> > I can hardly see how any 'real HW' PMDs (lksd-none, lksd-proto) will
> > benefit from it.
> > The current crypto-op API is very much HW oriented.
> > Which is ok - that's what it was intended for - but I think we also need
> > one that would be designed with a SW backed implementation in mind.
> 
> We may re-use your API for HW PMDs as well, which do not have the
> requirement of crypto-op/mbuf etc.
> The return type of your new process API may have a status which says
> 'processed' or 'enqueued'. So if it is 'enqueued', we may have a new API
> for raw bufs dequeue as well.
> 
> This requirement can be for any hardware PMDs like QAT as well.

I don't think it is a good idea to extend this API for async (lookaside) devices.
You'll need to:
 - provide dev_id and queue_id for each process (enqueue) and dequeue operation.
 - provide IOVA for all buffers passed to that function (data buffers, digest, IV, AAD).
 - on dequeue, provide some way to associate dequeued data and digest buffers
   with the crypto-session that was used (and probably with the mbuf).
So most likely we'll end up with just another version of our current crypto-op structure.
If you'd like to get rid of the mbuf dependency within the current crypto-op API,
that is understandable, but I don't think we should have the same API for both
sync (CPU) and async (lookaside) cases.
It doesn't seem feasible at all and voids the whole purpose of this patch.

> That is why a dev-ops would be a better option.
> 
> >
> > > When you add more functionality to this sync API/struct, it will end up
> > > being the same API/struct.
> > >
> > > Let us see how close/far we are from the existing APIs when the actual
> > > implementation is done.
> > >
> > > > > I am not sure if that would be needed.
> > > > > It would be internal to the driver: if synchronous processing is
> > > > > supported (from the feature flag) and the relevant fields in the xform
> > > > > (the newly added ones which are packed as per your suggestions) are
> > > > > set, it will create that type of session.
> > > > >
> > > > >
> > > > > > + * Main points:
> > > > > > + * - Current crypto-dev API is reasonably mature and it is desirable
> > > > > > + *   to keep it unchanged (API/ABI stability). From the other side,
> > > > > > + *   this sync API is a new one and probably would require extra
> > > > > > + *   changes. Having it as a new one allows us to mark it as
> > > > > > + *   experimental, without affecting the existing one.
> > > > > > + * - Fully opaque cpu_sym_session structure gives more flexibility
> > > > > > + *   to the PMD writers and again allows us to avoid ABI breakages
> > > > > > + *   in future.
> > > > > > + * - process() function per set of xforms
> > > > > > + *   allows exposing different process() functions for different
> > > > > > + *   xform combinations. The PMD writer can decide whether to
> > > > > > + *   push all supported algorithms into one process() function,
> > > > > > + *   or spread them across several ones.
> > > > > > + *   I.e. more flexibility for the PMD writer.
> > > > >
> > > > > Which process function should be chosen is internal to the PMD; how
> > > > > would that info be visible to the application or the library? These
> > > > > will get stored in the session private data. It would be up to the PMD
> > > > > writer to store the per-session process function in the session
> > > > > private data.
> > > > >
> > > > > The process function would be a dev op just like the enq/deq
> > > > > operations, and it should call the respective process API stored in
> > > > > the session private data.
> > > >
> > > > That model (via dev_ops) is possible, but has several drawbacks from my
> > > > perspective:
> > > >
> > > > 1. It means we'll need to pass dev_id as a parameter to the process()
> > > > function. Though in fact dev_id is not relevant information for us here
> > > > (all we need is a pointer to the session and a pointer to the function
> > > > to call), and I tried to avoid using it in data-path functions for
> > > > that API.
> > >
> > > You have a single vdev, but someone may have multiple vdevs for each
> > > thread, or may have the same dev with multiple queues for each core.
> >
> > That's fine. As I said above, it is a SW backed implementation.
> > Each session has to be a separate entity that contains all necessary
> > information (keys, alg/mode info, etc.) to process input buffers.
> > Plus we need the actual function pointer to call.
> > I just don't see what we need a dev_id for in that situation.
> 
> To iterate the session private data in the session.
> 
> > Again, here we don't need to care about queues and their pinning to cores.
> > If let say someone would like to process buffers from the same IPsec SA on 2
> > different cores in parallel, he can just create 2 sessions for the same xform,
> > give one to thread #1  and second to thread #2.
> > After that both threads are free to call process(this_thread_ses, ...) at will.
> 
> Say you have a 16core device to handle 100G of traffic on a single tunnel.
> Will we make 16 sessions with same parameters?

Absolutely the same question we can ask for the current crypto-op API.
You have a lookaside crypto-dev with 16 HW queues, each queue serviced by a different CPU.
For the same SA, do you need a separate session per queue, or is it ok to reuse the current one?
AFAIK, right now this is a grey area not clearly defined.
For the crypto-devs I am aware of, the user can reuse the same session (as the PMD uses it read-only).
But again, right now I think it is not clearly defined and is implementation specific.

> 
> >
> > >
> > > > 2. As you pointed out, in that case it will be just one process()
> > > > function per device.
> > > > So if a PMD would like to have several process() functions for
> > > > different types of sessions (let's say one per alg), the first thing it
> > > > has to do inside its process() is read the session data and, based on
> > > > that, do a jump/call to a particular internal sub-routine.
> > > > Something like:
> > > > driver_id = get_pmd_driver_id();
> > > > priv_ses = ses->sess_data[driver_id];
> > > > Then either:
> > > > switch (priv_ses->alg) {case XXX: process_XXX(priv_ses, ...); break; ...}
> > > > or:
> > > > priv_ses->process(priv_ses, ...);
> > > >
> > > > to select and call the proper function.
> > > > Looks like totally unnecessary overhead to me.
> > > > Though if we'll have the ability to query/extract some sort of
> > > > session_ops based on the xform - we can avoid this extra
> > > > dereference+jump/call thing.
> > >
> > > What is the issue in the priv_ses->process(); approach?
> >
> > Nothing at all.
> > What I am saying is that the schema with dev_ops:
> >
> > dev[dev_id]->dev_ops.process(ses->priv_ses[driver_id], ...)
> >    |
> >    |-> priv_ses->process(...)
> >
> > has bigger overhead than just:
> >
> > process(ses, ...);
> >
> > So what is the point of introducing an extra level of indirection here?
> 
> Explained above.
> 
> >
> > > I don't understand what you are saving by not doing this.
> > > In any case you would need to identify which session corresponds to
> > > which process().
> >
> > Yes, sure, but I think we can let the user store that relationship
> > information in a way he likes: store a process() pointer for each session,
> > or group sessions that share the same process() somehow, or...
> 
> So whatever relationship the user makes and stores will make his life
> complicated.
> If we can hide that information in the driver, then what is the issue in
> that? The user will not need to worry. He would just call process() and the
> driver will choose which process needs to be called.

The driver can do that at config/init time.
Then at run-time we can avoid that choice altogether and call the already chosen function.
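As a minimal sketch of that config-time choice (all names below are illustrative stand-ins, not the actual DPDK API): the process function is resolved once from the "xform" when the session is set up, so the data path calls it directly with no dev_id and no dev_ops lookup per burst.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for the proposed opaque session and process types. */
struct cpu_sym_session { int alg; };

typedef int (*cpu_sym_process_t)(struct cpu_sym_session *ses,
				 const uint8_t *buf, size_t len);

/* Two per-algorithm routines a PMD might expose internally. */
static int process_gcm(struct cpu_sym_session *ses, const uint8_t *buf, size_t len)
{
	(void)ses; (void)buf;
	return (int)len;		/* pretend: bytes processed */
}

static int process_cbc_hmac(struct cpu_sym_session *ses, const uint8_t *buf, size_t len)
{
	(void)ses; (void)buf;
	return (int)len;
}

/* Config/init time: resolve the process function once from the "xform"
 * (reduced here to an algorithm tag). The data path then calls the returned
 * pointer directly - no per-call session lookup or dev_ops indirection. */
static cpu_sym_process_t cpu_sym_session_func(int alg)
{
	return alg == 0 ? process_gcm : process_cbc_hmac;
}
```

The application (or ipsec library) would cache the returned pointer next to the session it created for that xform.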

> 
> I think we should have a POC around this and see the difference in the cycle count.
> IMO it would be negligible and we would end up making a generic API set which
> can be used by others as well.
> 
> >
> > > For that you would be doing it somewhere in your data path.
> >
> > Why at data-path?
> > Only once at session creation/initialization time.
> > Or might be even once per group of sessions.
> >
> > >
> > > >
> > > > >
> > > > > I am not sure if you would need a new session init API for this, as
> > > > > nothing would be visible to the app or lib.
> > > > >
> > > > > > + * - Not storing the process() pointer inside the session
> > > > > > + *   allows the user to choose whether to store a process() pointer
> > > > > > + *   per session, or per group of sessions for that device that
> > > > > > + *   share the same input xforms. I.e. extra flexibility for the
> > > > > > + *   user, plus it allows us to keep cpu_sym_session totally
> > > > > > + *   opaque, see above.
> > > > >
> > > > > If multiple sessions need to be processed via the same process
> > > > > function, the PMD would save the same process in all the sessions; I
> > > > > don't think there would be any perf overhead with that.
> > > >
> > > > I think it would, see above.
> > > >
> > > > >
> > > > > > + * Sketched usage model:
> > > > > > + * ....
> > > > > > + * /* control path, alloc/init session */
> > > > > > + * int32_t sz = rte_crypto_cpu_sym_session_size(dev_id, &xform);
> > > > > > + * struct rte_crypto_cpu_sym_session *ses = user_alloc(..., sz);
> > > > > > + * rte_crypto_cpu_sym_process_t process =
> > > > > > + *     rte_crypto_cpu_sym_session_func(dev_id, &xform);
> > > > > > + * rte_crypto_cpu_sym_session_init(dev_id, ses, &xform);
> > > > > > + * ...
> > > > > > + * /* data-path*/
> > > > > > + * process(ses, ....);
> > > > > > + * ....
> > > > > > + * /* control path, terminate/free session */
> > > > > > + * rte_crypto_cpu_sym_session_fini(dev_id, ses);
> > > > > > + */
> > > > > > +
> > > > > > +/**
> > > > > > + * vector structure, contains pointer to vector array and the length
> > > > > > + * of the array
> > > > > > + */
> > > > > > +struct rte_crypto_vec {
> > > > > > +       struct iovec *vec;
> > > > > > +       uint32_t num;
> > > > > > +};
> > > > > > +
> > > > > > +/*
> > > > > > + * Data-path bulk process crypto function.
> > > > > > + */
> > > > > > +typedef void (*rte_crypto_cpu_sym_process_t)(
> > > > > > +               struct rte_crypto_cpu_sym_session *sess,
> > > > > > +               struct rte_crypto_vec buf[], void *iv[], void *aad[],
> > > > > > +               void *digest[], int status[], uint32_t num);
> > > > > > +/*
> > > > > > + * for given device return process function specific to input xforms
> > > > > > + * on error - return NULL and set rte_errno value.
> > > > > > + * Note that the same input xforms for the same device should return
> > > > > > + * the same process function.
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +rte_crypto_cpu_sym_process_t
> > > > > > +rte_crypto_cpu_sym_session_func(uint8_t dev_id,
> > > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > > +
> > > > > > +/*
> > > > > > + * Return required session size in bytes for given set of xforms.
> > > > > > + * if xforms == NULL, then return the max possible session size,
> > > > > > + * that would fit session for any supported by the device algorithm.
> > > > > > + * If CPU mode is not supported at all, or the requested xform
> > > > > > + * algorithm is not supported, then return -ENOTSUP.
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +int
> > > > > > +rte_crypto_cpu_sym_session_size(uint8_t dev_id,
> > > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > > +
> > > > > > +/*
> > > > > > + * Initialize session.
> > > > > > + * It is the caller's responsibility to allocate enough space for it.
> > > > > > + * See rte_crypto_cpu_sym_session_size above.
> > > > > > + */
> > > > > > +__rte_experimental
> > > > > > +int rte_crypto_cpu_sym_session_init(uint8_t dev_id,
> > > > > > +                       struct rte_crypto_cpu_sym_session *sess,
> > > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > > +
> > > > > > +__rte_experimental
> > > > > > +void
> > > > > > +rte_crypto_cpu_sym_session_fini(uint8_t dev_id,
> > > > > > +                       struct rte_crypto_cpu_sym_session *sess);
> > > > > > +
> > > > > > +
> > > > > >  #ifdef __cplusplus
> > > > > >  }
> > > > > >  #endif
> > > > > > diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > > > b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > > > index defe05ea0..ed7e63fab 100644
> > > > > > --- a/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > > > +++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h
> > > > > > @@ -310,6 +310,20 @@ typedef void (*cryptodev_sym_free_session_t)(struct rte_cryptodev *dev,
> > > > > >  typedef void (*cryptodev_asym_free_session_t)(struct rte_cryptodev *dev,
> > > > > >                 struct rte_cryptodev_asym_session *sess);
> > > > > >
> > > > > > +typedef int (*cryptodev_cpu_sym_session_size_t) (struct rte_cryptodev *dev,
> > > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > > +
> > > > > > +typedef int (*cryptodev_cpu_sym_session_init_t) (struct rte_cryptodev *dev,
> > > > > > +                       struct rte_crypto_cpu_sym_session *sess,
> > > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > > +
> > > > > > +typedef void (*cryptodev_cpu_sym_session_fini_t) (struct rte_cryptodev *dev,
> > > > > > +                       struct rte_crypto_cpu_sym_session *sess);
> > > > > > +
> > > > > > +typedef rte_crypto_cpu_sym_process_t (*cryptodev_cpu_sym_session_func_t)(
> > > > > > +                       struct rte_cryptodev *dev,
> > > > > > +                       const struct rte_crypto_sym_xform *xforms);
> > > > > > +
> > > > > >  /** Crypto device operations function pointer table */
> > > > > >  struct rte_cryptodev_ops {
> > > > > >         cryptodev_configure_t dev_configure;    /**< Configure device. */
> > > > > > @@ -343,6 +357,11 @@ struct rte_cryptodev_ops {
> > > > > >         /**< Clear a Crypto sessions private data. */
> > > > > >         cryptodev_asym_free_session_t asym_session_clear;
> > > > > >         /**< Clear a Crypto sessions private data. */
> > > > > > +
> > > > > > +       cryptodev_cpu_sym_session_size_t sym_cpu_session_get_size;
> > > > > > +       cryptodev_cpu_sym_session_func_t sym_cpu_session_get_func;
> > > > > > +       cryptodev_cpu_sym_session_init_t sym_cpu_session_init;
> > > > > > +       cryptodev_cpu_sym_session_fini_t sym_cpu_session_fini;
> > > > > >  };
> > > > > >
> > > > > >
> > > > > >
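For what it's worth, the sketched usage model in the draft comment above can be exercised end to end with trivial stand-ins. Everything below (the names, the fixed 16B key, and the XOR "cipher") is purely illustrative of the control-path/data-path split, not the proposed PMD code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for the fully opaque session; the real layout would be PMD-private. */
struct cpu_sym_session { uint8_t key[16]; };

typedef void (*cpu_sym_process_t)(struct cpu_sym_session *ses,
				  uint8_t *buf[], int status[], uint32_t num);

/* control path: report the required session size to the caller */
static int cpu_sym_session_size(void)
{
	return (int)sizeof(struct cpu_sym_session);
}

/* data path: placeholder "cipher" - XOR each 16B buffer with the key */
static void process_xor(struct cpu_sym_session *ses,
			uint8_t *buf[], int status[], uint32_t num)
{
	for (uint32_t i = 0; i < num; i++) {
		for (int j = 0; j < 16; j++)
			buf[i][j] ^= ses->key[j];
		status[i] = 0;	/* 0 == processed successfully */
	}
}

/* control path: resolve the process function for the given "xform" */
static cpu_sym_process_t cpu_sym_session_func(void)
{
	return process_xor;
}

/* control path: initialize caller-allocated session memory */
static int cpu_sym_session_init(struct cpu_sym_session *ses, const uint8_t key[16])
{
	memcpy(ses->key, key, 16);
	return 0;
}
```

The point of the model: the application allocates the session memory itself (using the size query), initializes it once, then invokes the resolved process pointer directly on the data path - no enqueue/dequeue pair and no crypto-op fill/read.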


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 00/10] security: add software synchronous crypto process
  2019-09-06 13:13 ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Fan Zhang
                     ` (10 preceding siblings ...)
  2019-09-09 12:43   ` [dpdk-dev] [PATCH 00/10] security: add software synchronous crypto process Aaron Conole
@ 2019-10-07 16:28   ` " Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 01/10] security: introduce CPU Crypto action type and API Fan Zhang
                       ` (9 more replies)
  11 siblings, 10 replies; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch adds a way for rte_security to process symmetric crypto
workloads in bulk, synchronously, for SW crypto devices.

Originally both SW and HW crypto PMDs work under rte_cryptodev to
process the crypto workload asynchronously. This provides uniformity
to both PMD types, but also introduces an unnecessary performance penalty
to SW PMDs, such as the extra SW ring enqueue/dequeue steps to "simulate"
an asynchronous working manner, and unnecessary HW address computation.

We introduce a new way for SW crypto devices to perform crypto operations
synchronously, with only the fields required for the computation as input.

In rte_security, a new action type "RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO"
is introduced. This action type allows a burst of symmetric crypto
workload using the same algorithm, key, and direction to be processed by
CPU cycles synchronously. This flexible action type does not require
external hardware involvement.

This patch also includes the announcement of a new API,
"rte_security_process_cpu_crypto_bulk". With this API the packets are sent
to the crypto device for symmetric crypto processing. The device will
encrypt or decrypt the buffers based on the session data specified and
preprocessed in the security session. Different from the inline or
lookaside modes, when the function exits, the user can expect that the
buffers are either processed successfully, or have an error number
assigned to the appropriate index of the status array.
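That per-buffer status contract can be sketched as follows (a hypothetical stand-in helper, not the actual library code): each index of the status array reports success (0) or a negative errno, and the call returns the count of successfully processed buffers, in line with the v2 change of the return type from void to int.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-in for the bulk processing call: on return, every
 * buffer has either been processed (status 0) or carries an error number
 * at its index in the status array. */
static int process_cpu_crypto_bulk(uint8_t *buf[], uint32_t len[],
				   int status[], uint32_t num)
{
	int ok = 0;

	for (uint32_t i = 0; i < num; i++) {
		if (buf[i] == NULL || len[i] == 0) {
			status[i] = -EINVAL;	/* per-index error reporting */
		} else {
			/* real code would encrypt/decrypt buf[i] here */
			status[i] = 0;
			ok++;
		}
	}
	return ok;	/* number of successfully processed buffers */
}
```

The caller can therefore retry or drop individual failed buffers without re-submitting the whole burst.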

The proof-of-concept AESNI-GCM and AESNI-MB SW PMDs are updated with
support for this new method. To demonstrate the performance gain of this
method, 2 simple performance evaluation apps are added under unit-test:
"app/test: security_aesni_gcm_perftest/security_aesni_mb_perftest". The
users can freely compare their results against the crypto perf
application results.

In the end, the ipsec library and ipsec-secgw sample application are also
updated to support this feature. Several test scripts are added to the
ipsec-secgw test-suite to prove the correctness of the implementation.

v2:
- changed API return from "void" to "int"
- rework on ipsec library implementation.
- fixed bugs in aesni-mb PMD.
- fixed bugs in ipsec-secgw application.

Fan Zhang (10):
  security: introduce CPU Crypto action type and API
  crypto/aesni_gcm: add rte_security handler
  app/test: add security cpu crypto autotest
  app/test: add security cpu crypto perftest
  crypto/aesni_mb: add rte_security handler
  app/test: add aesni_mb security cpu crypto autotest
  app/test: add aesni_mb security cpu crypto perftest
  ipsec: add rte_security cpu_crypto action support
  examples/ipsec-secgw: add security cpu_crypto action support
  doc: update security cpu process description

 app/test/Makefile                                  |    1 +
 app/test/meson.build                               |    1 +
 app/test/test_security_cpu_crypto.c                | 1326 ++++++++++++++++++++
 doc/guides/cryptodevs/aesni_gcm.rst                |    6 +
 doc/guides/cryptodevs/aesni_mb.rst                 |    7 +
 doc/guides/prog_guide/rte_security.rst             |  112 +-
 doc/guides/rel_notes/release_19_11.rst             |    7 +
 drivers/crypto/aesni_gcm/aesni_gcm_pmd.c           |   97 +-
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c       |   95 ++
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h   |   23 +
 drivers/crypto/aesni_gcm/meson.build               |    2 +-
 drivers/crypto/aesni_mb/meson.build                |    2 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c         |  368 +++++-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c     |   92 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |   21 +-
 examples/ipsec-secgw/ipsec.c                       |   35 +
 examples/ipsec-secgw/ipsec_process.c               |    7 +-
 examples/ipsec-secgw/sa.c                          |   13 +-
 examples/ipsec-secgw/test/run_test.sh              |   10 +
 .../test/trs_3descbc_sha1_common_defs.sh           |    8 +-
 .../test/trs_3descbc_sha1_cpu_crypto_defs.sh       |    5 +
 .../test/trs_aescbc_sha1_common_defs.sh            |    8 +-
 .../test/trs_aescbc_sha1_cpu_crypto_defs.sh        |    5 +
 .../test/trs_aesctr_sha1_common_defs.sh            |    8 +-
 .../test/trs_aesctr_sha1_cpu_crypto_defs.sh        |    5 +
 .../ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh |    5 +
 .../test/trs_aesgcm_mb_cpu_crypto_defs.sh          |    7 +
 .../test/tun_3descbc_sha1_common_defs.sh           |    8 +-
 .../test/tun_3descbc_sha1_cpu_crypto_defs.sh       |    5 +
 .../test/tun_aescbc_sha1_common_defs.sh            |    8 +-
 .../test/tun_aescbc_sha1_cpu_crypto_defs.sh        |    5 +
 .../test/tun_aesctr_sha1_common_defs.sh            |    8 +-
 .../test/tun_aesctr_sha1_cpu_crypto_defs.sh        |    5 +
 .../ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh |    5 +
 .../test/tun_aesgcm_mb_cpu_crypto_defs.sh          |    7 +
 lib/librte_ipsec/crypto.h                          |   24 +
 lib/librte_ipsec/esp_inb.c                         |  200 ++-
 lib/librte_ipsec/esp_outb.c                        |  369 +++++-
 lib/librte_ipsec/sa.c                              |   53 +-
 lib/librte_ipsec/sa.h                              |   29 +
 lib/librte_ipsec/ses.c                             |    4 +-
 lib/librte_security/rte_security.c                 |   11 +
 lib/librte_security/rte_security.h                 |   53 +-
 lib/librte_security/rte_security_driver.h          |   22 +
 lib/librte_security/rte_security_version.map       |    1 +
 45 files changed, 2994 insertions(+), 99 deletions(-)
 create mode 100644 app/test/test_security_cpu_crypto.c
 create mode 100644 examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh

-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 01/10] security: introduce CPU Crypto action type and API
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
@ 2019-10-07 16:28     ` Fan Zhang
  2019-10-08 13:42       ` Ananyev, Konstantin
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 02/10] crypto/aesni_gcm: add rte_security handler Fan Zhang
                       ` (8 subsequent siblings)
  9 siblings, 1 reply; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch introduces a new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action type
to the security library. The type represents crypto processing performed
synchronously with CPU cycles. The patch also adds a new API to process
crypto operations in bulk, plus the corresponding function pointers for PMDs.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 lib/librte_security/rte_security.c           | 11 ++++++
 lib/librte_security/rte_security.h           | 53 +++++++++++++++++++++++++++-
 lib/librte_security/rte_security_driver.h    | 22 ++++++++++++
 lib/librte_security/rte_security_version.map |  1 +
 4 files changed, 86 insertions(+), 1 deletion(-)

diff --git a/lib/librte_security/rte_security.c b/lib/librte_security/rte_security.c
index bc81ce15d..cdd1ee6af 100644
--- a/lib/librte_security/rte_security.c
+++ b/lib/librte_security/rte_security.c
@@ -141,3 +141,14 @@ rte_security_capability_get(struct rte_security_ctx *instance,
 
 	return NULL;
 }
+
+int
+rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
+		struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num)
+{
+	RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->process_cpu_crypto_bulk, -1);
+	return instance->ops->process_cpu_crypto_bulk(sess, buf, iv,
+			aad, digest, status, num);
+}
diff --git a/lib/librte_security/rte_security.h b/lib/librte_security/rte_security.h
index aaafdfcd7..0caf5d697 100644
--- a/lib/librte_security/rte_security.h
+++ b/lib/librte_security/rte_security.h
@@ -18,6 +18,7 @@ extern "C" {
 #endif
 
 #include <sys/types.h>
+#include <sys/uio.h>
 
 #include <netinet/in.h>
 #include <netinet/ip.h>
@@ -289,6 +290,20 @@ struct rte_security_pdcp_xform {
 	uint32_t hfn_ovrd;
 };
 
+struct rte_security_cpu_crypto_xform {
+	/** For cipher/authentication crypto operations the authentication may
+	 * cover more content than the cipher. E.g., for IPsec ESP encryption
+	 * with AES-CBC and SHA1-HMAC, encryption starts after the ESP header,
+	 * but the whole packet (apart from the MAC header) is authenticated.
+	 * The cipher_offset field is used to derive the cipher data pointer
+	 * from the start of the buffer to be processed.
+	 *
+	 * NOTE: this parameter shall be ignored by AEAD algorithms, since they
+	 * use the same offset for cipher and authentication.
+	 */
+	int32_t cipher_offset;
+};
+
 /**
  * Security session action type.
  */
@@ -303,10 +318,14 @@ enum rte_security_session_action_type {
 	/**< All security protocol processing is performed inline during
 	 * transmission
 	 */
-	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
+	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
 	/**< All security protocol processing including crypto is performed
 	 * on a lookaside accelerator
 	 */
+	RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
+	/**< Crypto processing for the security protocol is performed
+	 * synchronously by the CPU
+	 */
 };
 
 /** Security session protocol definition */
@@ -332,6 +351,7 @@ struct rte_security_session_conf {
 		struct rte_security_ipsec_xform ipsec;
 		struct rte_security_macsec_xform macsec;
 		struct rte_security_pdcp_xform pdcp;
+		struct rte_security_cpu_crypto_xform cpucrypto;
 	};
 	/**< Configuration parameters for security session */
 	struct rte_crypto_sym_xform *crypto_xform;
@@ -665,6 +685,37 @@ const struct rte_security_capability *
 rte_security_capability_get(struct rte_security_ctx *instance,
 			    struct rte_security_capability_idx *idx);
 
+/**
+ * Security vector structure, contains pointer to vector array and the length
+ * of the array
+ */
+struct rte_security_vec {
+	struct iovec *vec;
+	uint32_t num;
+};
+
+/**
+ * Process a bulk crypto workload with the CPU
+ *
+ * @param	instance	security instance.
+ * @param	sess		security session
+ * @param	buf		array of buffer SGL vectors
+ * @param	iv		array of IV pointers
+ * @param	aad		array of AAD pointers
+ * @param	digest		array of digest pointers
+ * @param	status		array of status for the function to return
+ * @param	num		number of elements in each array
+ * @return
+ *  - On success, 0
+ *  - On failure, negative number of failed operations
+ */
+__rte_experimental
+int
+rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
+		struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
 #ifdef __cplusplus
 }
 #endif
diff --git a/lib/librte_security/rte_security_driver.h b/lib/librte_security/rte_security_driver.h
index 1b561f852..fe940fffa 100644
--- a/lib/librte_security/rte_security_driver.h
+++ b/lib/librte_security/rte_security_driver.h
@@ -132,6 +132,26 @@ typedef int (*security_get_userdata_t)(void *device,
 typedef const struct rte_security_capability *(*security_capabilities_get_t)(
 		void *device);
 
+/**
+ * Process security operations in bulk using CPU accelerated method.
+ *
+ * @param	sess		Security session structure.
+ * @param	buf		Buffer to the vectors to be processed.
+ * @param	iv		IV pointers.
+ * @param	aad		AAD pointers.
+ * @param	digest		Digest pointers.
+ * @param	status		Array of status value.
+ * @param	num		Number of elements in each array.
+ * @return
+ *  - On success, 0
+ *  - On failure, negative number of failed operations
+ */
+
+typedef int (*security_process_cpu_crypto_bulk_t)(
+		struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
 /** Security operations function pointer table */
 struct rte_security_ops {
 	security_session_create_t session_create;
@@ -150,6 +170,8 @@ struct rte_security_ops {
 	/**< Get userdata associated with session which processed the packet. */
 	security_capabilities_get_t capabilities_get;
 	/**< Get security capabilities. */
+	security_process_cpu_crypto_bulk_t process_cpu_crypto_bulk;
+	/**< Process data in bulk. */
 };
 
 #ifdef __cplusplus
diff --git a/lib/librte_security/rte_security_version.map b/lib/librte_security/rte_security_version.map
index 53267bf3c..2132e7a00 100644
--- a/lib/librte_security/rte_security_version.map
+++ b/lib/librte_security/rte_security_version.map
@@ -18,4 +18,5 @@ EXPERIMENTAL {
 	rte_security_get_userdata;
 	rte_security_session_stats_get;
 	rte_security_session_update;
+	rte_security_process_cpu_crypto_bulk;
 };
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 02/10] crypto/aesni_gcm: add rte_security handler
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 01/10] security: introduce CPU Crypto action type and API Fan Zhang
@ 2019-10-07 16:28     ` Fan Zhang
  2019-10-08 13:44       ` Ananyev, Konstantin
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 03/10] app/test: add security cpu crypto autotest Fan Zhang
                       ` (7 subsequent siblings)
  9 siblings, 1 reply; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch adds rte_security support to the AESNI-GCM PMD. The PMD now
initializes a security context instance, creates/deletes PMD-specific
security sessions, and processes crypto workloads in synchronous mode,
with scatter-gather list buffers supported.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 drivers/crypto/aesni_gcm/aesni_gcm_pmd.c         | 97 +++++++++++++++++++++++-
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c     | 95 +++++++++++++++++++++++
 drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h | 23 ++++++
 drivers/crypto/aesni_gcm/meson.build             |  2 +-
 4 files changed, 215 insertions(+), 2 deletions(-)

diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
index 1006a5c4d..2e91bf149 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
@@ -6,6 +6,7 @@
 #include <rte_hexdump.h>
 #include <rte_cryptodev.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security_driver.h>
 #include <rte_bus_vdev.h>
 #include <rte_malloc.h>
 #include <rte_cpuflags.h>
@@ -174,6 +175,56 @@ aesni_gcm_get_session(struct aesni_gcm_qp *qp, struct rte_crypto_op *op)
 	return sess;
 }
 
+static __rte_always_inline int
+process_gcm_security_sgl_buf(struct aesni_gcm_security_session *sess,
+		struct rte_security_vec *buf, uint8_t *iv,
+		uint8_t *aad, uint8_t *digest)
+{
+	struct aesni_gcm_session *session = &sess->sess;
+	uint8_t *tag;
+	uint32_t i;
+
+	sess->init(&session->gdata_key, &sess->gdata_ctx, iv, aad,
+			(uint64_t)session->aad_length);
+
+	for (i = 0; i < buf->num; i++) {
+		struct iovec *vec = &buf->vec[i];
+
+		sess->update(&session->gdata_key, &sess->gdata_ctx,
+				vec->iov_base, vec->iov_base, vec->iov_len);
+	}
+
+	switch (session->op) {
+	case AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION:
+		if (session->req_digest_length != session->gen_digest_length)
+			tag = sess->temp_digest;
+		else
+			tag = digest;
+
+		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
+				session->gen_digest_length);
+
+		if (session->req_digest_length != session->gen_digest_length)
+			memcpy(digest, sess->temp_digest,
+					session->req_digest_length);
+		break;
+
+	case AESNI_GCM_OP_AUTHENTICATED_DECRYPTION:
+		tag = sess->temp_digest;
+
+		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
+				session->gen_digest_length);
+
+		if (memcmp(tag, digest,	session->req_digest_length) != 0)
+			return -1;
+		break;
+	default:
+		return -1;
+	}
+
+	return 0;
+}
+
 /**
  * Process a crypto operation, calling
  * the GCM API from the multi buffer library.
@@ -488,8 +539,10 @@ aesni_gcm_create(const char *name,
 {
 	struct rte_cryptodev *dev;
 	struct aesni_gcm_private *internals;
+	struct rte_security_ctx *sec_ctx;
 	enum aesni_gcm_vector_mode vector_mode;
 	MB_MGR *mb_mgr;
+	char sec_name[RTE_DEV_NAME_MAX_LEN];
 
 	/* Check CPU for support for AES instruction set */
 	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
@@ -524,7 +577,8 @@ aesni_gcm_create(const char *name,
 			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
 			RTE_CRYPTODEV_FF_CPU_AESNI |
 			RTE_CRYPTODEV_FF_OOP_SGL_IN_LB_OUT |
-			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
+			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
+			RTE_CRYPTODEV_FF_SECURITY;
 
 	mb_mgr = alloc_mb_mgr(0);
 	if (mb_mgr == NULL)
@@ -587,6 +641,21 @@ aesni_gcm_create(const char *name,
 
 	internals->max_nb_queue_pairs = init_params->max_nb_queue_pairs;
 
+	/* setup security operations */
+	snprintf(sec_name, sizeof(sec_name) - 1, "aes_gcm_sec_%u",
+			dev->driver_id);
+	sec_ctx = rte_zmalloc_socket(sec_name,
+			sizeof(struct rte_security_ctx),
+			RTE_CACHE_LINE_SIZE, init_params->socket_id);
+	if (sec_ctx == NULL) {
+		AESNI_GCM_LOG(ERR, "memory allocation failed\n");
+		goto error_exit;
+	}
+
+	sec_ctx->device = (void *)dev;
+	sec_ctx->ops = rte_aesni_gcm_pmd_security_ops;
+	dev->security_ctx = sec_ctx;
+
 #if IMB_VERSION_NUM >= IMB_VERSION(0, 50, 0)
 	AESNI_GCM_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
 			imb_get_version_str());
@@ -641,6 +710,8 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
 	if (cryptodev == NULL)
 		return -ENODEV;
 
+	rte_free(cryptodev->security_ctx);
+
 	internals = cryptodev->data->dev_private;
 
 	free_mb_mgr(internals->mb_mgr);
@@ -648,6 +719,30 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
 	return rte_cryptodev_pmd_destroy(cryptodev);
 }
 
+int
+aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num)
+{
+	struct aesni_gcm_security_session *session =
+			get_sec_session_private_data(sess);
+	uint32_t i;
+	int errcnt = 0;
+
+	if (unlikely(!session))
+		return -num;
+
+	for (i = 0; i < num; i++) {
+		status[i] = process_gcm_security_sgl_buf(session, &buf[i],
+				(uint8_t *)iv[i], (uint8_t *)aad[i],
+				(uint8_t *)digest[i]);
+		if (unlikely(status[i]))
+			errcnt -= 1;
+	}
+
+	return errcnt;
+}
+
 static struct rte_vdev_driver aesni_gcm_pmd_drv = {
 	.probe = aesni_gcm_probe,
 	.remove = aesni_gcm_remove
diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
index 2f66c7c58..cc71dbd60 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
@@ -7,6 +7,7 @@
 #include <rte_common.h>
 #include <rte_malloc.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security_driver.h>
 
 #include "aesni_gcm_pmd_private.h"
 
@@ -316,6 +317,85 @@ aesni_gcm_pmd_sym_session_clear(struct rte_cryptodev *dev,
 	}
 }
 
+static int
+aesni_gcm_security_session_create(void *dev,
+		struct rte_security_session_conf *conf,
+		struct rte_security_session *sess,
+		struct rte_mempool *mempool)
+{
+	struct rte_cryptodev *cdev = dev;
+	struct aesni_gcm_private *internals = cdev->data->dev_private;
+	struct aesni_gcm_security_session *sess_priv;
+	int ret;
+
+	if (!conf->crypto_xform) {
+		AESNI_GCM_LOG(ERR, "Invalid security session conf");
+		return -EINVAL;
+	}
+
+	if (conf->crypto_xform->type == RTE_CRYPTO_SYM_XFORM_AUTH) {
+		AESNI_GCM_LOG(ERR, "GMAC is not supported in security session");
+		return -EINVAL;
+	}
+
+
+	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
+		AESNI_GCM_LOG(ERR,
+				"Couldn't get object from session mempool");
+		return -ENOMEM;
+	}
+
+	ret = aesni_gcm_set_session_parameters(internals->ops,
+				&sess_priv->sess, conf->crypto_xform);
+	if (ret != 0) {
+		AESNI_GCM_LOG(ERR, "Failed to configure session parameters");
+
+		/* Return session to mempool */
+		rte_mempool_put(mempool, (void *)sess_priv);
+		return ret;
+	}
+
+	sess_priv->pre = internals->ops[sess_priv->sess.key].pre;
+	sess_priv->init = internals->ops[sess_priv->sess.key].init;
+	if (sess_priv->sess.op == AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION) {
+		sess_priv->update =
+			internals->ops[sess_priv->sess.key].update_enc;
+		sess_priv->finalize =
+			internals->ops[sess_priv->sess.key].finalize_enc;
+	} else {
+		sess_priv->update =
+			internals->ops[sess_priv->sess.key].update_dec;
+		sess_priv->finalize =
+			internals->ops[sess_priv->sess.key].finalize_dec;
+	}
+
+	sess->sess_private_data = sess_priv;
+
+	return 0;
+}
+
+static int
+aesni_gcm_security_session_destroy(void *dev __rte_unused,
+		struct rte_security_session *sess)
+{
+	void *sess_priv = get_sec_session_private_data(sess);
+
+	if (sess_priv) {
+		struct rte_mempool *sess_mp = rte_mempool_from_obj(sess_priv);
+
+		memset(sess_priv, 0, sizeof(struct aesni_gcm_security_session));
+		set_sec_session_private_data(sess, NULL);
+		rte_mempool_put(sess_mp, sess_priv);
+	}
+	return 0;
+}
+
+static unsigned int
+aesni_gcm_sec_session_get_size(__rte_unused void *device)
+{
+	return sizeof(struct aesni_gcm_security_session);
+}
+
 struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
 		.dev_configure		= aesni_gcm_pmd_config,
 		.dev_start		= aesni_gcm_pmd_start,
@@ -336,4 +416,19 @@ struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
 		.sym_session_clear	= aesni_gcm_pmd_sym_session_clear
 };
 
+static struct rte_security_ops aesni_gcm_security_ops = {
+		.session_create = aesni_gcm_security_session_create,
+		.session_get_size = aesni_gcm_sec_session_get_size,
+		.session_update = NULL,
+		.session_stats_get = NULL,
+		.session_destroy = aesni_gcm_security_session_destroy,
+		.set_pkt_metadata = NULL,
+		.capabilities_get = NULL,
+		.process_cpu_crypto_bulk =
+				aesni_gcm_sec_crypto_process_bulk,
+};
+
 struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops = &aesni_gcm_pmd_ops;
+
+struct rte_security_ops *rte_aesni_gcm_pmd_security_ops =
+		&aesni_gcm_security_ops;
diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
index 56b29e013..ed3f6eb2e 100644
--- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
+++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
@@ -114,5 +114,28 @@ aesni_gcm_set_session_parameters(const struct aesni_gcm_ops *ops,
  * Device specific operations function pointer structure */
 extern struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops;
 
+/**
+ * Security session structure.
+ */
+struct aesni_gcm_security_session {
+	/** Temp digest for decryption */
+	uint8_t temp_digest[DIGEST_LENGTH_MAX];
+	/** GCM operations */
+	aesni_gcm_pre_t pre;
+	aesni_gcm_init_t init;
+	aesni_gcm_update_t update;
+	aesni_gcm_finalize_t finalize;
+	/** AESNI-GCM session */
+	struct aesni_gcm_session sess;
+	/** AESNI-GCM context */
+	struct gcm_context_data gdata_ctx;
+};
+
+extern int
+aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
+extern struct rte_security_ops *rte_aesni_gcm_pmd_security_ops;
 
 #endif /* _RTE_AESNI_GCM_PMD_PRIVATE_H_ */
diff --git a/drivers/crypto/aesni_gcm/meson.build b/drivers/crypto/aesni_gcm/meson.build
index 3a6e332dc..f6e160bb3 100644
--- a/drivers/crypto/aesni_gcm/meson.build
+++ b/drivers/crypto/aesni_gcm/meson.build
@@ -22,4 +22,4 @@ endif
 
 allow_experimental_apis = true
 sources = files('aesni_gcm_pmd.c', 'aesni_gcm_pmd_ops.c')
-deps += ['bus_vdev']
+deps += ['bus_vdev', 'security']
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 03/10] app/test: add security cpu crypto autotest
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 01/10] security: introduce CPU Crypto action type and API Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 02/10] crypto/aesni_gcm: add rte_security handler Fan Zhang
@ 2019-10-07 16:28     ` Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 04/10] app/test: add security cpu crypto perftest Fan Zhang
                       ` (6 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch adds a CPU crypto unit test for the AESNI-GCM PMD.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/Makefile                   |   1 +
 app/test/meson.build                |   1 +
 app/test/test_security_cpu_crypto.c | 564 ++++++++++++++++++++++++++++++++++++
 3 files changed, 566 insertions(+)
 create mode 100644 app/test/test_security_cpu_crypto.c

diff --git a/app/test/Makefile b/app/test/Makefile
index df7f77f44..0caff561c 100644
--- a/app/test/Makefile
+++ b/app/test/Makefile
@@ -197,6 +197,7 @@ SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_blockcipher.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev.c
 SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_cryptodev_asym.c
 SRCS-$(CONFIG_RTE_LIBRTE_SECURITY) += test_cryptodev_security_pdcp.c
+SRCS-$(CONFIG_RTE_LIBRTE_CRYPTODEV) += test_security_cpu_crypto.c
 
 SRCS-$(CONFIG_RTE_LIBRTE_METRICS) += test_metrics.c
 
diff --git a/app/test/meson.build b/app/test/meson.build
index 2c23c6347..0d096c564 100644
--- a/app/test/meson.build
+++ b/app/test/meson.build
@@ -104,6 +104,7 @@ test_sources = files('commands.c',
 	'test_ring_perf.c',
 	'test_rwlock.c',
 	'test_sched.c',
+	'test_security_cpu_crypto.c',
 	'test_service_cores.c',
 	'test_spinlock.c',
 	'test_stack.c',
diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
new file mode 100644
index 000000000..d345922b2
--- /dev/null
+++ b/app/test/test_security_cpu_crypto.c
@@ -0,0 +1,564 @@
+/* SPDX-License-Identifier: BSD-3-Clause
+ * Copyright(c) 2019 Intel Corporation
+ */
+
+#include <rte_common.h>
+#include <rte_hexdump.h>
+#include <rte_mbuf.h>
+#include <rte_malloc.h>
+#include <rte_memcpy.h>
+#include <rte_pause.h>
+#include <rte_bus_vdev.h>
+#include <rte_random.h>
+
+#include <rte_security.h>
+
+#include <rte_crypto.h>
+#include <rte_cryptodev.h>
+#include <rte_cryptodev_pmd.h>
+
+#include "test.h"
+#include "test_cryptodev.h"
+#include "test_cryptodev_aead_test_vectors.h"
+
+#define CPU_CRYPTO_TEST_MAX_AAD_LENGTH	16
+#define MAX_NB_SIGMENTS			4
+
+enum buffer_assemble_option {
+	SGL_MAX_SEG,
+	SGL_ONE_SEG,
+};
+
+struct cpu_crypto_test_case {
+	struct {
+		uint8_t seg[MBUF_DATAPAYLOAD_SIZE];
+		uint32_t seg_len;
+	} seg_buf[MAX_NB_SIGMENTS];
+	uint8_t iv[MAXIMUM_IV_LENGTH];
+	uint8_t aad[CPU_CRYPTO_TEST_MAX_AAD_LENGTH];
+	uint8_t digest[DIGEST_BYTE_LENGTH_SHA512];
+} __rte_cache_aligned;
+
+struct cpu_crypto_test_obj {
+	struct iovec vec[MAX_NUM_OPS_INFLIGHT][MAX_NB_SIGMENTS];
+	struct rte_security_vec sec_buf[MAX_NUM_OPS_INFLIGHT];
+	void *iv[MAX_NUM_OPS_INFLIGHT];
+	void *digest[MAX_NUM_OPS_INFLIGHT];
+	void *aad[MAX_NUM_OPS_INFLIGHT];
+	int status[MAX_NUM_OPS_INFLIGHT];
+};
+
+struct cpu_crypto_testsuite_params {
+	struct rte_mempool *buf_pool;
+	struct rte_mempool *session_priv_mpool;
+	struct rte_security_ctx *ctx;
+};
+
+struct cpu_crypto_unittest_params {
+	struct rte_security_session *sess;
+	void *test_datas[MAX_NUM_OPS_INFLIGHT];
+	struct cpu_crypto_test_obj test_obj;
+	uint32_t nb_bufs;
+};
+
+static struct cpu_crypto_testsuite_params testsuite_params = { NULL };
+static struct cpu_crypto_unittest_params unittest_params;
+
+static int gbl_driver_id;
+
+static int
+testsuite_setup(void)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct rte_cryptodev_info info;
+	uint32_t i;
+	uint32_t nb_devs;
+	uint32_t sess_sz;
+	int ret;
+
+	memset(ts_params, 0, sizeof(*ts_params));
+
+	ts_params->buf_pool = rte_mempool_lookup("CPU_CRYPTO_MBUFPOOL");
+	if (ts_params->buf_pool == NULL) {
+		/* Not already created so create */
+		ts_params->buf_pool = rte_pktmbuf_pool_create(
+				"CPU_CRYPTO_MBUFPOOL",
+				NUM_MBUFS, MBUF_CACHE_SIZE, 0,
+				sizeof(struct cpu_crypto_test_case),
+				rte_socket_id());
+		if (ts_params->buf_pool == NULL) {
+			RTE_LOG(ERR, USER1, "Can't create mbuf pool\n");
+			return TEST_FAILED;
+		}
+	}
+
+	/* Create an AESNI MB device if required */
+	if (gbl_driver_id == rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD))) {
+		nb_devs = rte_cryptodev_device_count_by_driver(
+				rte_cryptodev_driver_id_get(
+				RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD)));
+		if (nb_devs < 1) {
+			ret = rte_vdev_init(
+				RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD), NULL);
+
+			TEST_ASSERT(ret == 0,
+				"Failed to create instance of"
+				" pmd : %s",
+				RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD));
+		}
+	}
+
+	/* Create an AESNI GCM device if required */
+	if (gbl_driver_id == rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD))) {
+		nb_devs = rte_cryptodev_device_count_by_driver(
+				rte_cryptodev_driver_id_get(
+				RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD)));
+		if (nb_devs < 1) {
+			TEST_ASSERT_SUCCESS(rte_vdev_init(
+				RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD), NULL),
+				"Failed to create instance of"
+				" pmd : %s",
+				RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD));
+		}
+	}
+
+	nb_devs = rte_cryptodev_count();
+	if (nb_devs < 1) {
+		RTE_LOG(ERR, USER1, "No crypto devices found?\n");
+		return TEST_FAILED;
+	}
+
+	/* Get security context */
+	for (i = 0; i < nb_devs; i++) {
+		rte_cryptodev_info_get(i, &info);
+		if (info.driver_id != gbl_driver_id)
+			continue;
+
+		ts_params->ctx = rte_cryptodev_get_sec_ctx(i);
+		if (!ts_params->ctx) {
+			RTE_LOG(ERR, USER1, "Rte_security is not supported\n");
+			return TEST_FAILED;
+		}
+	}
+
+	sess_sz = rte_security_session_get_size(ts_params->ctx);
+	ts_params->session_priv_mpool = rte_mempool_create(
+			"cpu_crypto_test_sess_mp", 2, sess_sz, 0, 0,
+			NULL, NULL, NULL, NULL,
+			SOCKET_ID_ANY, 0);
+	if (!ts_params->session_priv_mpool) {
+		RTE_LOG(ERR, USER1, "Not enough memory\n");
+		return TEST_FAILED;
+	}
+
+	return TEST_SUCCESS;
+}
+
+static void
+testsuite_teardown(void)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+
+	if (ts_params->buf_pool)
+		rte_mempool_free(ts_params->buf_pool);
+
+	if (ts_params->session_priv_mpool)
+		rte_mempool_free(ts_params->session_priv_mpool);
+}
+
+static int
+ut_setup(void)
+{
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+
+	memset(ut_params, 0, sizeof(*ut_params));
+	return TEST_SUCCESS;
+}
+
+static void
+ut_teardown(void)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+
+	if (ut_params->sess)
+		rte_security_session_destroy(ts_params->ctx, ut_params->sess);
+
+	if (ut_params->nb_bufs) {
+		uint32_t i;
+
+		for (i = 0; i < ut_params->nb_bufs; i++)
+			memset(ut_params->test_datas[i], 0,
+				sizeof(struct cpu_crypto_test_case));
+
+		rte_mempool_put_bulk(ts_params->buf_pool, ut_params->test_datas,
+				ut_params->nb_bufs);
+	}
+}
+
+static int
+allocate_buf(uint32_t n)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	int ret;
+
+	ret = rte_mempool_get_bulk(ts_params->buf_pool, ut_params->test_datas,
+			n);
+
+	if (ret == 0)
+		ut_params->nb_bufs = n;
+
+	return ret;
+}
+
+static int
+check_status(struct cpu_crypto_test_obj *obj, uint32_t n)
+{
+	uint32_t i;
+
+	for (i = 0; i < n; i++)
+		if (obj->status[i] < 0)
+			return -1;
+
+	return 0;
+}
+
+static struct rte_security_session *
+create_aead_session(struct rte_security_ctx *ctx,
+		struct rte_mempool *sess_mp,
+		enum rte_crypto_aead_operation op,
+		const struct aead_test_data *test_data,
+		uint32_t is_unit_test)
+{
+	struct rte_security_session_conf sess_conf = {0};
+	struct rte_crypto_sym_xform xform = {0};
+
+	if (is_unit_test)
+		debug_hexdump(stdout, "key:", test_data->key.data,
+				test_data->key.len);
+
+	/* Setup AEAD Parameters */
+	xform.type = RTE_CRYPTO_SYM_XFORM_AEAD;
+	xform.next = NULL;
+	xform.aead.algo = test_data->algo;
+	xform.aead.op = op;
+	xform.aead.key.data = test_data->key.data;
+	xform.aead.key.length = test_data->key.len;
+	xform.aead.iv.offset = 0;
+	xform.aead.iv.length = test_data->iv.len;
+	xform.aead.digest_length = test_data->auth_tag.len;
+	xform.aead.aad_length = test_data->aad.len;
+
+	sess_conf.action_type = RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO;
+	sess_conf.crypto_xform = &xform;
+
+	return rte_security_session_create(ctx, &sess_conf, sess_mp);
+}
+
+static inline int
+assemble_aead_buf(struct cpu_crypto_test_case *data,
+		struct cpu_crypto_test_obj *obj,
+		uint32_t obj_idx,
+		enum rte_crypto_aead_operation op,
+		const struct aead_test_data *test_data,
+		enum buffer_assemble_option sgl_option,
+		uint32_t is_unit_test)
+{
+	const uint8_t *src;
+	uint32_t src_len;
+	uint32_t seg_idx;
+	uint32_t bytes_per_seg;
+	uint32_t left;
+
+	if (op == RTE_CRYPTO_AEAD_OP_ENCRYPT) {
+		src = test_data->plaintext.data;
+		src_len = test_data->plaintext.len;
+		if (is_unit_test)
+			debug_hexdump(stdout, "plaintext:", src, src_len);
+	} else {
+		src = test_data->ciphertext.data;
+		src_len = test_data->ciphertext.len;
+		memcpy(data->digest, test_data->auth_tag.data,
+				test_data->auth_tag.len);
+		if (is_unit_test) {
+			debug_hexdump(stdout, "ciphertext:", src, src_len);
+			debug_hexdump(stdout, "digest:",
+					test_data->auth_tag.data,
+					test_data->auth_tag.len);
+		}
+	}
+
+	if (src_len > MBUF_DATAPAYLOAD_SIZE)
+		return -ENOMEM;
+
+	switch (sgl_option) {
+	case SGL_MAX_SEG:
+		seg_idx = 0;
+		bytes_per_seg = src_len / MAX_NB_SIGMENTS + 1;
+		left = src_len;
+
+		if (bytes_per_seg > (MBUF_DATAPAYLOAD_SIZE / MAX_NB_SIGMENTS))
+			return -ENOMEM;
+
+		while (left) {
+			uint32_t cp_len = RTE_MIN(left, bytes_per_seg);
+			memcpy(data->seg_buf[seg_idx].seg, src, cp_len);
+			data->seg_buf[seg_idx].seg_len = cp_len;
+			obj->vec[obj_idx][seg_idx].iov_base =
+					(void *)data->seg_buf[seg_idx].seg;
+			obj->vec[obj_idx][seg_idx].iov_len = cp_len;
+			src += cp_len;
+			left -= cp_len;
+			seg_idx++;
+		}
+
+		if (left)
+			return -ENOMEM;
+
+		obj->sec_buf[obj_idx].vec = obj->vec[obj_idx];
+		obj->sec_buf[obj_idx].num = seg_idx;
+
+		break;
+	case SGL_ONE_SEG:
+		memcpy(data->seg_buf[0].seg, src, src_len);
+		data->seg_buf[0].seg_len = src_len;
+		obj->vec[obj_idx][0].iov_base =
+				(void *)data->seg_buf[0].seg;
+		obj->vec[obj_idx][0].iov_len = src_len;
+
+		obj->sec_buf[obj_idx].vec = obj->vec[obj_idx];
+		obj->sec_buf[obj_idx].num = 1;
+		break;
+	default:
+		return -1;
+	}
+
+	if (test_data->algo == RTE_CRYPTO_AEAD_AES_CCM) {
+		memcpy(data->iv + 1, test_data->iv.data, test_data->iv.len);
+		memcpy(data->aad + 18, test_data->aad.data, test_data->aad.len);
+	} else {
+		memcpy(data->iv, test_data->iv.data, test_data->iv.len);
+		memcpy(data->aad, test_data->aad.data, test_data->aad.len);
+	}
+
+	if (is_unit_test) {
+		debug_hexdump(stdout, "iv:", test_data->iv.data,
+				test_data->iv.len);
+		debug_hexdump(stdout, "aad:", test_data->aad.data,
+				test_data->aad.len);
+	}
+
+	obj->iv[obj_idx] = (void *)data->iv;
+	obj->digest[obj_idx] = (void *)data->digest;
+	obj->aad[obj_idx] = (void *)data->aad;
+
+	return 0;
+}
+
+#define CPU_CRYPTO_ERR_EXP_CT	"expect ciphertext:"
+#define CPU_CRYPTO_ERR_GEN_CT	"gen ciphertext:"
+#define CPU_CRYPTO_ERR_EXP_PT	"expect plaintext:"
+#define CPU_CRYPTO_ERR_GEN_PT	"gen plaintext:"
+
+static int
+check_aead_result(struct cpu_crypto_test_case *tcase,
+		enum rte_crypto_aead_operation op,
+		const struct aead_test_data *tdata)
+{
+	const char *err_msg1, *err_msg2;
+	const uint8_t *src_pt_ct;
+	const uint8_t *tmp_src;
+	uint32_t src_len;
+	uint32_t left;
+	uint32_t i = 0;
+	int ret;
+
+	if (op == RTE_CRYPTO_AEAD_OP_ENCRYPT) {
+		err_msg1 = CPU_CRYPTO_ERR_EXP_CT;
+		err_msg2 = CPU_CRYPTO_ERR_GEN_CT;
+
+		src_pt_ct = tdata->ciphertext.data;
+		src_len = tdata->ciphertext.len;
+
+		ret = memcmp(tcase->digest, tdata->auth_tag.data,
+				tdata->auth_tag.len);
+		if (ret != 0) {
+			debug_hexdump(stdout, "expect digest:",
+					tdata->auth_tag.data,
+					tdata->auth_tag.len);
+			debug_hexdump(stdout, "gen digest:",
+					tcase->digest,
+					tdata->auth_tag.len);
+			return -1;
+		}
+	} else {
+		src_pt_ct = tdata->plaintext.data;
+		src_len = tdata->plaintext.len;
+		err_msg1 = CPU_CRYPTO_ERR_EXP_PT;
+		err_msg2 = CPU_CRYPTO_ERR_GEN_PT;
+	}
+
+	tmp_src = src_pt_ct;
+	left = src_len;
+
+	while (left && i < MAX_NB_SIGMENTS) {
+		ret = memcmp(tcase->seg_buf[i].seg, tmp_src,
+				tcase->seg_buf[i].seg_len);
+		if (ret != 0)
+			goto sgl_err_dump;
+		tmp_src += tcase->seg_buf[i].seg_len;
+		left -= tcase->seg_buf[i].seg_len;
+		i++;
+	}
+
+	if (left) {
+		ret = -ENOMEM;
+		goto sgl_err_dump;
+	}
+
+	return 0;
+
+sgl_err_dump:
+	left = src_len;
+	i = 0;
+
+	debug_hexdump(stdout, err_msg1,
+			src_pt_ct,
+			src_len);
+
+	while (left && i < MAX_NB_SIGMENTS) {
+		debug_hexdump(stdout, err_msg2,
+				tcase->seg_buf[i].seg,
+				tcase->seg_buf[i].seg_len);
+		left -= tcase->seg_buf[i].seg_len;
+		i++;
+	}
+	return ret;
+}
+
+static inline void
+run_test(struct rte_security_ctx *ctx, struct rte_security_session *sess,
+		struct cpu_crypto_test_obj *obj, uint32_t n)
+{
+	rte_security_process_cpu_crypto_bulk(ctx, sess, obj->sec_buf,
+			obj->iv, obj->aad, obj->digest, obj->status, n);
+}
+
+static int
+cpu_crypto_test_aead(const struct aead_test_data *tdata,
+		enum rte_crypto_aead_operation dir,
+		enum buffer_assemble_option sgl_option)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	int ret;
+
+	ut_params->sess = create_aead_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			dir,
+			tdata,
+			1);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(1);
+	if (ret)
+		return ret;
+
+	tcase = ut_params->test_datas[0];
+	ret = assemble_aead_buf(tcase, obj, 0, dir, tdata, sgl_option, 1);
+	if (ret < 0) {
+		printf("Test is not supported by the driver\n");
+		return ret;
+	}
+
+	run_test(ts_params->ctx, ut_params->sess, obj, 1);
+
+	ret = check_status(obj, 1);
+	if (ret < 0)
+		return ret;
+
+	ret = check_aead_result(tcase, dir, tdata);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+/* test-vector/sgl-option */
+#define all_gcm_unit_test_cases(type)		\
+	TEST_EXPAND(gcm_test_case_1, type)	\
+	TEST_EXPAND(gcm_test_case_2, type)	\
+	TEST_EXPAND(gcm_test_case_3, type)	\
+	TEST_EXPAND(gcm_test_case_4, type)	\
+	TEST_EXPAND(gcm_test_case_5, type)	\
+	TEST_EXPAND(gcm_test_case_6, type)	\
+	TEST_EXPAND(gcm_test_case_7, type)	\
+	TEST_EXPAND(gcm_test_case_8, type)	\
+	TEST_EXPAND(gcm_test_case_192_1, type)	\
+	TEST_EXPAND(gcm_test_case_192_2, type)	\
+	TEST_EXPAND(gcm_test_case_192_3, type)	\
+	TEST_EXPAND(gcm_test_case_192_4, type)	\
+	TEST_EXPAND(gcm_test_case_192_5, type)	\
+	TEST_EXPAND(gcm_test_case_192_6, type)	\
+	TEST_EXPAND(gcm_test_case_192_7, type)	\
+	TEST_EXPAND(gcm_test_case_256_1, type)	\
+	TEST_EXPAND(gcm_test_case_256_2, type)	\
+	TEST_EXPAND(gcm_test_case_256_3, type)	\
+	TEST_EXPAND(gcm_test_case_256_4, type)	\
+	TEST_EXPAND(gcm_test_case_256_5, type)	\
+	TEST_EXPAND(gcm_test_case_256_6, type)	\
+	TEST_EXPAND(gcm_test_case_256_7, type)
+
+
+#define TEST_EXPAND(t, o)						\
+static int								\
+cpu_crypto_aead_enc_test_##t##_##o(void)				\
+{									\
+	return cpu_crypto_test_aead(&t, RTE_CRYPTO_AEAD_OP_ENCRYPT, o);	\
+}									\
+static int								\
+cpu_crypto_aead_dec_test_##t##_##o(void)				\
+{									\
+	return cpu_crypto_test_aead(&t, RTE_CRYPTO_AEAD_OP_DECRYPT, o);	\
+}									\
+
+all_gcm_unit_test_cases(SGL_ONE_SEG)
+all_gcm_unit_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesgcm_testsuite  = {
+	.suite_name = "Security CPU Crypto AESNI-GCM Unit Test Suite",
+	.setup = testsuite_setup,
+	.teardown = testsuite_teardown,
+	.unit_test_cases = {
+#define TEST_EXPAND(t, o)						\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_enc_test_##t##_##o),		\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_dec_test_##t##_##o),		\
+
+	all_gcm_unit_test_cases(SGL_ONE_SEG)
+	all_gcm_unit_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+	TEST_CASES_END() /**< NULL terminate unit test array */
+	},
+};
+
+static int
+test_security_cpu_crypto_aesni_gcm(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD));
+
+	return unit_test_suite_runner(&security_cpu_crypto_aesgcm_testsuite);
+}
+
+REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
+		test_security_cpu_crypto_aesni_gcm);
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 04/10] app/test: add security cpu crypto perftest
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
                       ` (2 preceding siblings ...)
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 03/10] app/test: add security cpu crypto autotest Fan Zhang
@ 2019-10-07 16:28     ` Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 05/10] crypto/aesni_mb: add rte_security handler Fan Zhang
                       ` (5 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

Since the crypto perf application does not support rte_security, this
patch adds a simple GCM CPU crypto performance test to the crypto
unit-test application. The test covers several key and data sizes, with
both single-buffer and SGL-buffer test items, and reports throughput as
well as cycle-count performance figures.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/test_security_cpu_crypto.c | 201 ++++++++++++++++++++++++++++++++++++
 1 file changed, 201 insertions(+)

diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
index d345922b2..ca9a8dae6 100644
--- a/app/test/test_security_cpu_crypto.c
+++ b/app/test/test_security_cpu_crypto.c
@@ -23,6 +23,7 @@
 
 #define CPU_CRYPTO_TEST_MAX_AAD_LENGTH	16
 #define MAX_NB_SIGMENTS			4
+#define CACHE_WARM_ITER			2048
 
 enum buffer_assemble_option {
 	SGL_MAX_SEG,
@@ -560,5 +561,205 @@ test_security_cpu_crypto_aesni_gcm(void)
 	return unit_test_suite_runner(&security_cpu_crypto_aesgcm_testsuite);
 }
 
+
+static inline void
+gen_rand(uint8_t *data, uint32_t len)
+{
+	uint32_t i;
+
+	for (i = 0; i < len; i++)
+		data[i] = (uint8_t)rte_rand();
+}
+
+static inline void
+switch_aead_enc_to_dec(struct aead_test_data *tdata,
+		struct cpu_crypto_test_case *tcase,
+		enum buffer_assemble_option sgl_option)
+{
+	uint32_t i;
+	uint8_t *dst = tdata->ciphertext.data;
+
+	switch (sgl_option) {
+	case SGL_ONE_SEG:
+		memcpy(dst, tcase->seg_buf[0].seg, tcase->seg_buf[0].seg_len);
+		tdata->ciphertext.len = tcase->seg_buf[0].seg_len;
+		break;
+	case SGL_MAX_SEG:
+		tdata->ciphertext.len = 0;
+		for (i = 0; i < MAX_NB_SIGMENTS; i++) {
+			memcpy(dst, tcase->seg_buf[i].seg,
+					tcase->seg_buf[i].seg_len);
+			tdata->ciphertext.len += tcase->seg_buf[i].seg_len;
+		}
+		break;
+	}
+
+	memcpy(tdata->auth_tag.data, tcase->digest, tdata->auth_tag.len);
+}
+
+static int
+cpu_crypto_test_aead_perf(enum buffer_assemble_option sgl_option,
+		uint32_t key_sz)
+{
+	struct aead_test_data tdata = {0};
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	uint64_t hz = rte_get_tsc_hz(), time_start, time_now;
+	double rate, cycles_per_buf;
+	uint32_t test_data_szs[] = {64, 128, 256, 512, 1024, 2048};
+	uint32_t i, j;
+	uint8_t aad[16];
+	int ret;
+
+	tdata.key.len = key_sz;
+	gen_rand(tdata.key.data, tdata.key.len);
+	tdata.algo = RTE_CRYPTO_AEAD_AES_GCM;
+	tdata.aad.data = aad;
+
+	ut_params->sess = create_aead_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			RTE_CRYPTO_AEAD_OP_DECRYPT,
+			&tdata,
+			0);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(MAX_NUM_OPS_INFLIGHT);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < RTE_DIM(test_data_szs); i++) {
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tdata.plaintext.len = test_data_szs[i];
+			gen_rand(tdata.plaintext.data,
+					tdata.plaintext.len);
+
+			tdata.aad.len = 12;
+			gen_rand(tdata.aad.data, tdata.aad.len);
+
+			tdata.auth_tag.len = 16;
+
+			tdata.iv.len = 16;
+			gen_rand(tdata.iv.data, tdata.iv.len);
+
+			tcase = ut_params->test_datas[j];
+			ret = assemble_aead_buf(tcase, obj, j,
+					RTE_CRYPTO_AEAD_OP_ENCRYPT,
+					&tdata, sgl_option, 0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		/* warm up cache */
+		for (j = 0; j < CACHE_WARM_ITER; j++)
+			run_test(ts_params->ctx, ut_params->sess, obj,
+					MAX_NUM_OPS_INFLIGHT);
+
+		time_start = rte_rdtsc();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_rdtsc();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("AES-GCM-%u(%4uB) Enc %03.3fMpps (%03.3fGbps) ",
+				key_sz * 8, test_data_szs[i], rate,
+				rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+				cycles_per_buf,
+				cycles_per_buf / test_data_szs[i]);
+
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tcase = ut_params->test_datas[j];
+
+			switch_aead_enc_to_dec(&tdata, tcase, sgl_option);
+			ret = assemble_aead_buf(tcase, obj, j,
+					RTE_CRYPTO_AEAD_OP_DECRYPT,
+					&tdata, sgl_option, 0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		time_start = rte_rdtsc();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_rdtsc();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("AES-GCM-%u(%4uB) Dec %03.3fMpps (%03.3fGbps) ",
+				key_sz * 8, test_data_szs[i], rate,
+				rate  * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+				cycles_per_buf,
+				cycles_per_buf / test_data_szs[i]);
+	}
+
+	return 0;
+}
+
+/* test-prefix/key-size/sgl-type */
+#define all_gcm_perf_test_cases(type)					\
+	TEST_EXPAND(_128, 16, type)					\
+	TEST_EXPAND(_192, 24, type)					\
+	TEST_EXPAND(_256, 32, type)
+
+#define TEST_EXPAND(a, b, c)						\
+static int								\
+cpu_crypto_gcm_perf##a##_##c(void)					\
+{									\
+	return cpu_crypto_test_aead_perf(c, b);				\
+}									\
+
+all_gcm_perf_test_cases(SGL_ONE_SEG)
+all_gcm_perf_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesgcm_perf_testsuite  = {
+		.suite_name = "Security CPU Crypto AESNI-GCM Perf Test Suite",
+		.setup = testsuite_setup,
+		.teardown = testsuite_teardown,
+		.unit_test_cases = {
+#define TEST_EXPAND(a, b, c)						\
+		TEST_CASE_ST(ut_setup, ut_teardown,			\
+				cpu_crypto_gcm_perf##a##_##c),		\
+
+		all_gcm_perf_test_cases(SGL_ONE_SEG)
+		all_gcm_perf_test_cases(SGL_MAX_SEG)
+#undef TEST_EXPAND
+
+		TEST_CASES_END() /**< NULL terminate unit test array */
+		},
+};
+
+static int
+test_security_cpu_crypto_aesni_gcm_perf(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_GCM_PMD));
+
+	return unit_test_suite_runner(
+			&security_cpu_crypto_aesgcm_perf_testsuite);
+}
+
 REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
 		test_security_cpu_crypto_aesni_gcm);
+
+REGISTER_TEST_COMMAND(security_aesni_gcm_perftest,
+		test_security_cpu_crypto_aesni_gcm_perf);
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 05/10] crypto/aesni_mb: add rte_security handler
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
                       ` (3 preceding siblings ...)
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 04/10] app/test: add security cpu crypto perftest Fan Zhang
@ 2019-10-07 16:28     ` Fan Zhang
  2019-10-08 16:23       ` Ananyev, Konstantin
  2019-10-09  8:29       ` Ananyev, Konstantin
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 06/10] app/test: add aesni_mb security cpu crypto autotest Fan Zhang
                       ` (4 subsequent siblings)
  9 siblings, 2 replies; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch adds rte_security support to the AESNI-MB PMD. The PMD now
initializes a security context instance, creates/deletes PMD-specific
security sessions, and processes crypto workloads in synchronous mode.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 drivers/crypto/aesni_mb/meson.build                |   2 +-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c         | 368 +++++++++++++++++++--
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c     |  92 +++++-
 drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |  21 +-
 4 files changed, 453 insertions(+), 30 deletions(-)

diff --git a/drivers/crypto/aesni_mb/meson.build b/drivers/crypto/aesni_mb/meson.build
index 3e1687416..e7b585168 100644
--- a/drivers/crypto/aesni_mb/meson.build
+++ b/drivers/crypto/aesni_mb/meson.build
@@ -23,4 +23,4 @@ endif
 
 sources = files('rte_aesni_mb_pmd.c', 'rte_aesni_mb_pmd_ops.c')
 allow_experimental_apis = true
-deps += ['bus_vdev']
+deps += ['bus_vdev', 'security']
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
index ce1144b95..a4cd518b7 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
@@ -8,6 +8,8 @@
 #include <rte_hexdump.h>
 #include <rte_cryptodev.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security.h>
+#include <rte_security_driver.h>
 #include <rte_bus_vdev.h>
 #include <rte_malloc.h>
 #include <rte_cpuflags.h>
@@ -19,6 +21,9 @@
 #define HMAC_MAX_BLOCK_SIZE 128
 static uint8_t cryptodev_driver_id;
 
+static enum aesni_mb_vector_mode vector_mode;
+/**< CPU vector instruction set mode */
+
 typedef void (*hash_one_block_t)(const void *data, void *digest);
 typedef void (*aes_keyexp_t)(const void *key, void *enc_exp_keys, void *dec_exp_keys);
 
@@ -808,6 +813,164 @@ auth_start_offset(struct rte_crypto_op *op, struct aesni_mb_session *session,
 			(UINT64_MAX - u_src + u_dst + 1);
 }
 
+union sec_userdata_field {
+	int status;
+	struct {
+		uint16_t is_gen_digest;
+		uint16_t digest_len;
+	};
+};
+
+struct sec_udata_digest_field {
+	uint32_t is_digest_gen;
+	uint32_t digest_len;
+};
+
+static inline int
+set_mb_job_params_sec(JOB_AES_HMAC *job, struct aesni_mb_sec_session *sec_sess,
+		void *buf, uint32_t buf_len, void *iv, void *aad, void *digest,
+		int *status, uint8_t *digest_idx)
+{
+	struct aesni_mb_session *session = &sec_sess->sess;
+	uint32_t cipher_offset = sec_sess->cipher_offset;
+	union sec_userdata_field udata;
+
+	if (unlikely(cipher_offset > buf_len))
+		return -EINVAL;
+
+	/* Set crypto operation */
+	job->chain_order = session->chain_order;
+
+	/* Set cipher parameters */
+	job->cipher_direction = session->cipher.direction;
+	job->cipher_mode = session->cipher.mode;
+
+	job->aes_key_len_in_bytes = session->cipher.key_length_in_bytes;
+
+	/* Set authentication parameters */
+	job->hash_alg = session->auth.algo;
+	job->iv = iv;
+
+	switch (job->hash_alg) {
+	case AES_XCBC:
+		job->u.XCBC._k1_expanded = session->auth.xcbc.k1_expanded;
+		job->u.XCBC._k2 = session->auth.xcbc.k2;
+		job->u.XCBC._k3 = session->auth.xcbc.k3;
+
+		job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+		job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		break;
+
+	case AES_CCM:
+		job->u.CCM.aad = (uint8_t *)aad + 18;
+		job->u.CCM.aad_len_in_bytes = session->aead.aad_len;
+		job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+		job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		job->iv++;
+		break;
+
+	case AES_CMAC:
+		job->u.CMAC._key_expanded = session->auth.cmac.expkey;
+		job->u.CMAC._skey1 = session->auth.cmac.skey1;
+		job->u.CMAC._skey2 = session->auth.cmac.skey2;
+		job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+		job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		break;
+
+	case AES_GMAC:
+		if (session->cipher.mode == GCM) {
+			job->u.GCM.aad = aad;
+			job->u.GCM.aad_len_in_bytes = session->aead.aad_len;
+		} else {
+			/* For GMAC */
+			job->u.GCM.aad = aad;
+			job->u.GCM.aad_len_in_bytes = buf_len;
+			job->cipher_mode = GCM;
+		}
+		job->aes_enc_key_expanded = &session->cipher.gcm_key;
+		job->aes_dec_key_expanded = &session->cipher.gcm_key;
+		break;
+
+	default:
+		job->u.HMAC._hashed_auth_key_xor_ipad =
+				session->auth.pads.inner;
+		job->u.HMAC._hashed_auth_key_xor_opad =
+				session->auth.pads.outer;
+
+		if (job->cipher_mode == DES3) {
+			job->aes_enc_key_expanded =
+				session->cipher.exp_3des_keys.ks_ptr;
+			job->aes_dec_key_expanded =
+				session->cipher.exp_3des_keys.ks_ptr;
+		} else {
+			job->aes_enc_key_expanded =
+				session->cipher.expanded_aes_keys.encode;
+			job->aes_dec_key_expanded =
+				session->cipher.expanded_aes_keys.decode;
+		}
+	}
+
+	/* Set digest output location */
+	if (job->hash_alg != NULL_HASH &&
+			session->auth.operation == RTE_CRYPTO_AUTH_OP_VERIFY) {
+		job->auth_tag_output = sec_sess->temp_digests[*digest_idx];
+		*digest_idx = (*digest_idx + 1) % MAX_JOBS;
+
+		udata.is_gen_digest = 0;
+		udata.digest_len = session->auth.req_digest_len;
+	} else {
+		udata.is_gen_digest = 1;
+		udata.digest_len = session->auth.req_digest_len;
+
+		if (session->auth.req_digest_len !=
+				session->auth.gen_digest_len) {
+			job->auth_tag_output =
+					sec_sess->temp_digests[*digest_idx];
+			*digest_idx = (*digest_idx + 1) % MAX_JOBS;
+		} else
+			job->auth_tag_output = digest;
+	}
+
+	/* A bit of a hack: the job structure only supports two user
+	 * data fields, but we need to pass four params (status,
+	 * direction, digest for verify, and digest length). To avoid
+	 * allocating a larger buffer for all four, we temporarily
+	 * encode the digest length and direction in the status
+	 * value.
+	 */
+	*status = udata.status;
+
+	/*
+	 * The multi-buffer library currently only supports returning a
+	 * truncated digest length as specified in the relevant IPsec RFCs.
+	 */
+
+	/* Set digest length */
+	job->auth_tag_output_len_in_bytes = session->auth.gen_digest_len;
+
+	/* Set IV parameters */
+	job->iv_len_in_bytes = session->iv.length;
+
+	/* Data Parameters */
+	job->src = buf;
+	job->dst = (uint8_t *)buf + cipher_offset;
+	job->cipher_start_src_offset_in_bytes = cipher_offset;
+	job->msg_len_to_cipher_in_bytes = buf_len - cipher_offset;
+	job->hash_start_src_offset_in_bytes = 0;
+	job->msg_len_to_hash_in_bytes = buf_len;
+
+	job->user_data = (void *)status;
+	job->user_data2 = digest;
+
+	return 0;
+}
+
 /**
  * Process a crypto operation and complete a JOB_AES_HMAC job structure for
  * submission to the multi buffer library for processing.
@@ -1100,6 +1263,35 @@ post_process_mb_job(struct aesni_mb_qp *qp, JOB_AES_HMAC *job)
 	return op;
 }
 
+static inline void
+post_process_mb_sec_job(JOB_AES_HMAC *job)
+{
+	void *user_digest = job->user_data2;
+	int *status = job->user_data;
+
+	switch (job->status) {
+	case STS_COMPLETED:
+		if (user_digest) {
+			union sec_userdata_field udata;
+
+			udata.status = *status;
+			if (udata.is_gen_digest) {
+				*status = RTE_CRYPTO_OP_STATUS_SUCCESS;
+				memcpy(user_digest, job->auth_tag_output,
+						udata.digest_len);
+			} else {
+				*status = (memcmp(job->auth_tag_output,
+					user_digest, udata.digest_len) != 0) ?
+						-1 : 0;
+			}
+		} else
+			*status = RTE_CRYPTO_OP_STATUS_SUCCESS;
+		break;
+	default:
+		*status = RTE_CRYPTO_OP_STATUS_ERROR;
+	}
+}
+
 /**
  * Process a completed JOB_AES_HMAC job and keep processing jobs until
  * get_completed_job return NULL
@@ -1136,6 +1328,32 @@ handle_completed_jobs(struct aesni_mb_qp *qp, JOB_AES_HMAC *job,
 	return processed_jobs;
 }
 
+static inline uint32_t
+handle_completed_sec_jobs(JOB_AES_HMAC *job, MB_MGR *mb_mgr)
+{
+	uint32_t processed = 0;
+
+	while (job != NULL) {
+		post_process_mb_sec_job(job);
+		job = IMB_GET_COMPLETED_JOB(mb_mgr);
+		processed++;
+	}
+
+	return processed;
+}
+
+static inline uint32_t
+flush_mb_sec_mgr(MB_MGR *mb_mgr)
+{
+	JOB_AES_HMAC *job = IMB_FLUSH_JOB(mb_mgr);
+	uint32_t processed = 0;
+
+	if (job)
+		processed = handle_completed_sec_jobs(job, mb_mgr);
+
+	return processed;
+}
+
 static inline uint16_t
 flush_mb_mgr(struct aesni_mb_qp *qp, struct rte_crypto_op **ops,
 		uint16_t nb_ops)
@@ -1239,6 +1457,105 @@ aesni_mb_pmd_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
 	return processed_jobs;
 }
 
+static MB_MGR *
+alloc_init_mb_mgr(void)
+{
+	MB_MGR *mb_mgr = alloc_mb_mgr(0);
+	if (mb_mgr == NULL)
+		return NULL;
+
+	switch (vector_mode) {
+	case RTE_AESNI_MB_SSE:
+		init_mb_mgr_sse(mb_mgr);
+		break;
+	case RTE_AESNI_MB_AVX:
+		init_mb_mgr_avx(mb_mgr);
+		break;
+	case RTE_AESNI_MB_AVX2:
+		init_mb_mgr_avx2(mb_mgr);
+		break;
+	case RTE_AESNI_MB_AVX512:
+		init_mb_mgr_avx512(mb_mgr);
+		break;
+	default:
+		AESNI_MB_LOG(ERR, "Unsupported vector mode %u\n", vector_mode);
+		free_mb_mgr(mb_mgr);
+		return NULL;
+	}
+
+	return mb_mgr;
+}
+
+static MB_MGR *sec_mb_mgrs[RTE_MAX_LCORE];
+
+int
+aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num)
+{
+	struct aesni_mb_sec_session *sec_sess = sess->sess_private_data;
+	JOB_AES_HMAC *job;
+	MB_MGR *mb_mgr;
+	uint32_t lcore_id = rte_lcore_id();
+	uint8_t digest_idx = sec_sess->digest_idx;
+	uint32_t i, processed = 0;
+	int ret = 0, errcnt = 0;
+
+	if (unlikely(sec_mb_mgrs[lcore_id] == NULL)) {
+		sec_mb_mgrs[lcore_id] = alloc_init_mb_mgr();
+
+		if (sec_mb_mgrs[lcore_id] == NULL) {
+			for (i = 0; i < num; i++)
+				status[i] = -ENOMEM;
+
+			return -num;
+		}
+	}
+
+	mb_mgr = sec_mb_mgrs[lcore_id];
+
+	for (i = 0; i < num; i++) {
+		void *seg_buf = buf[i].vec[0].iov_base;
+		uint32_t buf_len = buf[i].vec[0].iov_len;
+
+		job = IMB_GET_NEXT_JOB(mb_mgr);
+		if (unlikely(job == NULL)) {
+			processed += flush_mb_sec_mgr(mb_mgr);
+			job = IMB_GET_NEXT_JOB(mb_mgr);
+			if (!job) {
+				errcnt -= 1;
+				status[i] = -ENOMEM;
+				continue;
+			}
+		}
+
+		ret = set_mb_job_params_sec(job, sec_sess, seg_buf, buf_len,
+				iv[i], aad[i], digest[i], &status[i],
+				&digest_idx);
+		if (ret) {
+			processed++;
+			status[i] = ret;
+			errcnt -= 1;
+			continue;
+		}
+
+		/* Submit job to multi-buffer for processing */
+#ifdef RTE_LIBRTE_PMD_AESNI_MB_DEBUG
+		job = IMB_SUBMIT_JOB(mb_mgr);
+#else
+		job = IMB_SUBMIT_JOB_NOCHECK(mb_mgr);
+#endif
+
+		if (job)
+			processed += handle_completed_sec_jobs(job, mb_mgr);
+	}
+
+	while (processed < num)
+		processed += flush_mb_sec_mgr(mb_mgr);
+
+	return errcnt;
+}
+
 static int cryptodev_aesni_mb_remove(struct rte_vdev_device *vdev);
 
 static int
@@ -1248,8 +1565,9 @@ cryptodev_aesni_mb_create(const char *name,
 {
 	struct rte_cryptodev *dev;
 	struct aesni_mb_private *internals;
-	enum aesni_mb_vector_mode vector_mode;
+	struct rte_security_ctx *sec_ctx;
 	MB_MGR *mb_mgr;
+	char sec_name[RTE_DEV_NAME_MAX_LEN];
 
 	/* Check CPU for support for AES instruction set */
 	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
@@ -1283,35 +1601,14 @@ cryptodev_aesni_mb_create(const char *name,
 	dev->feature_flags = RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO |
 			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
 			RTE_CRYPTODEV_FF_CPU_AESNI |
-			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
+			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
+			RTE_CRYPTODEV_FF_SECURITY;
 
 
-	mb_mgr = alloc_mb_mgr(0);
+	mb_mgr = alloc_init_mb_mgr();
 	if (mb_mgr == NULL)
 		return -ENOMEM;
 
-	switch (vector_mode) {
-	case RTE_AESNI_MB_SSE:
-		dev->feature_flags |= RTE_CRYPTODEV_FF_CPU_SSE;
-		init_mb_mgr_sse(mb_mgr);
-		break;
-	case RTE_AESNI_MB_AVX:
-		dev->feature_flags |= RTE_CRYPTODEV_FF_CPU_AVX;
-		init_mb_mgr_avx(mb_mgr);
-		break;
-	case RTE_AESNI_MB_AVX2:
-		dev->feature_flags |= RTE_CRYPTODEV_FF_CPU_AVX2;
-		init_mb_mgr_avx2(mb_mgr);
-		break;
-	case RTE_AESNI_MB_AVX512:
-		dev->feature_flags |= RTE_CRYPTODEV_FF_CPU_AVX512;
-		init_mb_mgr_avx512(mb_mgr);
-		break;
-	default:
-		AESNI_MB_LOG(ERR, "Unsupported vector mode %u\n", vector_mode);
-		goto error_exit;
-	}
-
 	/* Set vector instructions mode supported */
 	internals = dev->data->dev_private;
 
@@ -1322,11 +1619,28 @@ cryptodev_aesni_mb_create(const char *name,
 	AESNI_MB_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
 			imb_get_version_str());
 
+	/* setup security operations */
+	snprintf(sec_name, sizeof(sec_name) - 1, "aes_mb_sec_%u",
+			dev->driver_id);
+	sec_ctx = rte_zmalloc_socket(sec_name,
+			sizeof(struct rte_security_ctx),
+			RTE_CACHE_LINE_SIZE, init_params->socket_id);
+	if (sec_ctx == NULL) {
+		AESNI_MB_LOG(ERR, "memory allocation failed\n");
+		goto error_exit;
+	}
+
+	sec_ctx->device = (void *)dev;
+	sec_ctx->ops = rte_aesni_mb_pmd_security_ops;
+	dev->security_ctx = sec_ctx;
+
 	return 0;
 
 error_exit:
 	if (mb_mgr)
 		free_mb_mgr(mb_mgr);
+	if (sec_ctx)
+		rte_free(sec_ctx);
 
 	rte_cryptodev_pmd_destroy(dev);
 
@@ -1367,6 +1681,7 @@ cryptodev_aesni_mb_remove(struct rte_vdev_device *vdev)
 	struct rte_cryptodev *cryptodev;
 	struct aesni_mb_private *internals;
 	const char *name;
+	uint32_t i;
 
 	name = rte_vdev_device_name(vdev);
 	if (name == NULL)
@@ -1379,6 +1694,9 @@ cryptodev_aesni_mb_remove(struct rte_vdev_device *vdev)
 	internals = cryptodev->data->dev_private;
 
 	free_mb_mgr(internals->mb_mgr);
+	for (i = 0; i < RTE_MAX_LCORE; i++)
+		if (sec_mb_mgrs[i])
+			free_mb_mgr(sec_mb_mgrs[i]);
 
 	return rte_cryptodev_pmd_destroy(cryptodev);
 }
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
index 8d15b99d4..f47df2d57 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
@@ -8,6 +8,7 @@
 #include <rte_common.h>
 #include <rte_malloc.h>
 #include <rte_cryptodev_pmd.h>
+#include <rte_security_driver.h>
 
 #include "rte_aesni_mb_pmd_private.h"
 
@@ -732,7 +733,8 @@ aesni_mb_pmd_qp_count(struct rte_cryptodev *dev)
 static unsigned
 aesni_mb_pmd_sym_session_get_size(struct rte_cryptodev *dev __rte_unused)
 {
-	return sizeof(struct aesni_mb_session);
+	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_session),
+			RTE_CACHE_LINE_SIZE);
 }
 
 /** Configure a aesni multi-buffer session from a crypto xform chain */
@@ -810,4 +812,92 @@ struct rte_cryptodev_ops aesni_mb_pmd_ops = {
 		.sym_session_clear	= aesni_mb_pmd_sym_session_clear
 };
 
+/** Configure an AESNI-MB security session */
+
+static int
+aesni_mb_security_session_create(void *dev,
+		struct rte_security_session_conf *conf,
+		struct rte_security_session *sess,
+		struct rte_mempool *mempool)
+{
+	struct rte_cryptodev *cdev = dev;
+	struct aesni_mb_private *internals = cdev->data->dev_private;
+	struct aesni_mb_sec_session *sess_priv;
+	int ret;
+
+	if (!conf->crypto_xform) {
+		AESNI_MB_LOG(ERR, "Invalid security session conf");
+		return -EINVAL;
+	}
+
+	if (conf->cpucrypto.cipher_offset < 0) {
+		AESNI_MB_LOG(ERR, "Invalid security session conf");
+		return -EINVAL;
+	}
+
+	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
+		AESNI_MB_LOG(ERR,
+				"Couldn't get object from session mempool");
+		return -ENOMEM;
+	}
+
+	sess_priv->cipher_offset = conf->cpucrypto.cipher_offset;
+
+	ret = aesni_mb_set_session_parameters(internals->mb_mgr,
+			&sess_priv->sess, conf->crypto_xform);
+	if (ret != 0) {
+		AESNI_MB_LOG(ERR, "failed to configure session parameters");
+		rte_mempool_put(mempool, sess_priv);
+		return ret;
+	}
+
+	sess->sess_private_data = (void *)sess_priv;
+
+	return ret;
+}
+
+static int
+aesni_mb_security_session_destroy(void *dev __rte_unused,
+		struct rte_security_session *sess)
+{
+	struct aesni_mb_sec_session *sess_priv =
+			get_sec_session_private_data(sess);
+
+	if (sess_priv) {
+		struct rte_mempool *sess_mp = rte_mempool_from_obj(
+				(void *)sess_priv);
+
+		memset(sess_priv, 0, sizeof(struct aesni_mb_sec_session));
+		set_sec_session_private_data(sess, NULL);
+
+		if (sess_mp == NULL) {
+			AESNI_MB_LOG(ERR, "failed fetch session mempool");
+			return -EINVAL;
+		}
+
+		rte_mempool_put(sess_mp, sess_priv);
+	}
+
+	return 0;
+}
+
+static unsigned int
+aesni_mb_sec_session_get_size(__rte_unused void *device)
+{
+	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_sec_session),
+			RTE_CACHE_LINE_SIZE);
+}
+
+static struct rte_security_ops aesni_mb_security_ops = {
+		.session_create = aesni_mb_security_session_create,
+		.session_get_size = aesni_mb_sec_session_get_size,
+		.session_update = NULL,
+		.session_stats_get = NULL,
+		.session_destroy = aesni_mb_security_session_destroy,
+		.set_pkt_metadata = NULL,
+		.capabilities_get = NULL,
+		.process_cpu_crypto_bulk = aesni_mb_sec_crypto_process_bulk,
+};
+
 struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops = &aesni_mb_pmd_ops;
+struct rte_security_ops *rte_aesni_mb_pmd_security_ops = &aesni_mb_security_ops;
diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
index b794d4bc1..64b58ca8e 100644
--- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
+++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
@@ -176,7 +176,6 @@ struct aesni_mb_qp {
 	 */
 } __rte_cache_aligned;
 
-/** AES-NI multi-buffer private session structure */
 struct aesni_mb_session {
 	JOB_CHAIN_ORDER chain_order;
 	struct {
@@ -265,16 +264,32 @@ struct aesni_mb_session {
 		/** AAD data length */
 		uint16_t aad_len;
 	} aead;
-} __rte_cache_aligned;
+};
+
+/** AES-NI multi-buffer private security session structure */
+struct aesni_mb_sec_session {
+	/** Base AESNI-MB crypto session */
+	struct aesni_mb_session sess;
+	uint8_t temp_digests[MAX_JOBS][DIGEST_LENGTH_MAX];
+	uint16_t digest_idx;
+	uint32_t cipher_offset;
+	MB_MGR *mb_mgr;
+};
 
 extern int
 aesni_mb_set_session_parameters(const MB_MGR *mb_mgr,
 		struct aesni_mb_session *sess,
 		const struct rte_crypto_sym_xform *xform);
 
+extern int
+aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
+		struct rte_security_vec buf[], void *iv[], void *aad[],
+		void *digest[], int status[], uint32_t num);
+
 /** device specific operations function pointer structure */
 extern struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops;
 
-
+/** device specific operations function pointer structure for rte_security */
+extern struct rte_security_ops *rte_aesni_mb_pmd_security_ops;
 
 #endif /* _RTE_AESNI_MB_PMD_PRIVATE_H_ */
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 06/10] app/test: add aesni_mb security cpu crypto autotest
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
                       ` (4 preceding siblings ...)
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 05/10] crypto/aesni_mb: add rte_security handler Fan Zhang
@ 2019-10-07 16:28     ` Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 07/10] app/test: add aesni_mb security cpu crypto perftest Fan Zhang
                       ` (3 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch adds CPU crypto unit tests for the AESNI_MB PMD.
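The tests in this patch are generated with an X-macro pattern: a single list of vector/op pairs is expanded once into one wrapper function per pair, and again (with a redefined `TEST_EXPAND`) into the test-suite table. A minimal, self-contained sketch of that pattern; the vector names and the `run_case` helper below are illustrative, not from the patch:

```c
#include <assert.h>

/* Illustrative test vectors; not the ones used by the patch. */
struct test_data { int key_len; };
static const struct test_data vec_a = { 16 };
static const struct test_data vec_b = { 32 };

/* Stand-in for the real test body; returns 0 on success. */
static int
run_case(const struct test_data *d, int op)
{
	return (d->key_len > 0 && op >= 0) ? 0 : -1;
}

/* One central vector/op list, re-expanded per use of TEST_EXPAND. */
#define ALL_CASES		\
	TEST_EXPAND(vec_a, 0)	\
	TEST_EXPAND(vec_b, 1)

/* First expansion: emit one wrapper function per vector/op pair. */
#define TEST_EXPAND(t, o)		\
static int test_##t##_##o(void)		\
{					\
	return run_case(&t, o);		\
}
ALL_CASES
#undef TEST_EXPAND
```

The patch applies the same trick a second time with a different `TEST_EXPAND` body to emit the `TEST_CASE_ST` entries of the suite table, so the vector list is maintained in exactly one place.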

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/test_security_cpu_crypto.c | 371 +++++++++++++++++++++++++++++++++++-
 1 file changed, 369 insertions(+), 2 deletions(-)

diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
index ca9a8dae6..a9853a0c0 100644
--- a/app/test/test_security_cpu_crypto.c
+++ b/app/test/test_security_cpu_crypto.c
@@ -19,12 +19,23 @@
 
 #include "test.h"
 #include "test_cryptodev.h"
+#include "test_cryptodev_blockcipher.h"
+#include "test_cryptodev_aes_test_vectors.h"
 #include "test_cryptodev_aead_test_vectors.h"
+#include "test_cryptodev_des_test_vectors.h"
+#include "test_cryptodev_hash_test_vectors.h"
 
 #define CPU_CRYPTO_TEST_MAX_AAD_LENGTH	16
 #define MAX_NB_SIGMENTS			4
 #define CACHE_WARM_ITER			2048
 
+#define TOP_ENC		BLOCKCIPHER_TEST_OP_ENCRYPT
+#define TOP_DEC		BLOCKCIPHER_TEST_OP_DECRYPT
+#define TOP_AUTH_GEN	BLOCKCIPHER_TEST_OP_AUTH_GEN
+#define TOP_AUTH_VER	BLOCKCIPHER_TEST_OP_AUTH_VERIFY
+#define TOP_ENC_AUTH	BLOCKCIPHER_TEST_OP_ENC_AUTH_GEN
+#define TOP_AUTH_DEC	BLOCKCIPHER_TEST_OP_AUTH_VERIFY_DEC
+
 enum buffer_assemble_option {
 	SGL_MAX_SEG,
 	SGL_ONE_SEG,
@@ -35,8 +46,8 @@ struct cpu_crypto_test_case {
 		uint8_t seg[MBUF_DATAPAYLOAD_SIZE];
 		uint32_t seg_len;
 	} seg_buf[MAX_NB_SIGMENTS];
-	uint8_t iv[MAXIMUM_IV_LENGTH];
-	uint8_t aad[CPU_CRYPTO_TEST_MAX_AAD_LENGTH];
+	uint8_t iv[MAXIMUM_IV_LENGTH * 2];
+	uint8_t aad[CPU_CRYPTO_TEST_MAX_AAD_LENGTH * 4];
 	uint8_t digest[DIGEST_BYTE_LENGTH_SHA512];
 } __rte_cache_aligned;
 
@@ -516,6 +527,11 @@ cpu_crypto_test_aead(const struct aead_test_data *tdata,
 	TEST_EXPAND(gcm_test_case_256_6, type)	\
 	TEST_EXPAND(gcm_test_case_256_7, type)
 
+/* test-vector/sgl-option */
+#define all_ccm_unit_test_cases \
+	TEST_EXPAND(ccm_test_case_128_1, SGL_ONE_SEG) \
+	TEST_EXPAND(ccm_test_case_128_2, SGL_ONE_SEG) \
+	TEST_EXPAND(ccm_test_case_128_3, SGL_ONE_SEG)
 
 #define TEST_EXPAND(t, o)						\
 static int								\
@@ -531,6 +547,7 @@ cpu_crypto_aead_dec_test_##t##_##o(void)				\
 
 all_gcm_unit_test_cases(SGL_ONE_SEG)
 all_gcm_unit_test_cases(SGL_MAX_SEG)
+all_ccm_unit_test_cases
 #undef TEST_EXPAND
 
 static struct unit_test_suite security_cpu_crypto_aesgcm_testsuite  = {
@@ -758,8 +775,358 @@ test_security_cpu_crypto_aesni_gcm_perf(void)
 			&security_cpu_crypto_aesgcm_perf_testsuite);
 }
 
+static struct rte_security_session *
+create_blockcipher_session(struct rte_security_ctx *ctx,
+		struct rte_mempool *sess_mp,
+		uint32_t op_mask,
+		const struct blockcipher_test_data *test_data,
+		uint32_t is_unit_test)
+{
+	struct rte_security_session_conf sess_conf = {0};
+	struct rte_crypto_sym_xform xforms[2] = { {0} };
+	struct rte_crypto_sym_xform *cipher_xform = NULL;
+	struct rte_crypto_sym_xform *auth_xform = NULL;
+	struct rte_crypto_sym_xform *xform;
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_CIPHER) {
+		cipher_xform = &xforms[0];
+		cipher_xform->type = RTE_CRYPTO_SYM_XFORM_CIPHER;
+
+		if (op_mask & TOP_ENC)
+			cipher_xform->cipher.op =
+				RTE_CRYPTO_CIPHER_OP_ENCRYPT;
+		else
+			cipher_xform->cipher.op =
+				RTE_CRYPTO_CIPHER_OP_DECRYPT;
+
+		cipher_xform->cipher.algo = test_data->crypto_algo;
+		cipher_xform->cipher.key.data = test_data->cipher_key.data;
+		cipher_xform->cipher.key.length = test_data->cipher_key.len;
+		cipher_xform->cipher.iv.offset = 0;
+		cipher_xform->cipher.iv.length = test_data->iv.len;
+
+		if (is_unit_test)
+			debug_hexdump(stdout, "cipher key:",
+					test_data->cipher_key.data,
+					test_data->cipher_key.len);
+	}
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_AUTH) {
+		auth_xform = &xforms[1];
+		auth_xform->type = RTE_CRYPTO_SYM_XFORM_AUTH;
+
+		if (op_mask & TOP_AUTH_GEN)
+			auth_xform->auth.op = RTE_CRYPTO_AUTH_OP_GENERATE;
+		else
+			auth_xform->auth.op = RTE_CRYPTO_AUTH_OP_VERIFY;
+
+		auth_xform->auth.algo = test_data->auth_algo;
+		auth_xform->auth.key.length = test_data->auth_key.len;
+		auth_xform->auth.key.data = test_data->auth_key.data;
+		auth_xform->auth.digest_length = test_data->digest.len;
+
+		if (is_unit_test)
+			debug_hexdump(stdout, "auth key:",
+					test_data->auth_key.data,
+					test_data->auth_key.len);
+	}
+
+	if (op_mask == TOP_ENC ||
+			op_mask == TOP_DEC)
+		xform = cipher_xform;
+	else if (op_mask == TOP_AUTH_GEN ||
+			op_mask == TOP_AUTH_VER)
+		xform = auth_xform;
+	else if (op_mask == TOP_ENC_AUTH) {
+		xform = cipher_xform;
+		xform->next = auth_xform;
+	} else if (op_mask == TOP_AUTH_DEC) {
+		xform = auth_xform;
+		xform->next = cipher_xform;
+	} else
+		return NULL;
+
+	if (test_data->cipher_offset < test_data->auth_offset)
+		return NULL;
+
+	sess_conf.action_type = RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO;
+	sess_conf.crypto_xform = xform;
+	sess_conf.cpucrypto.cipher_offset = test_data->cipher_offset -
+			test_data->auth_offset;
+
+	return rte_security_session_create(ctx, &sess_conf, sess_mp);
+}
+
+static inline int
+assemble_blockcipher_buf(struct cpu_crypto_test_case *data,
+		struct cpu_crypto_test_obj *obj,
+		uint32_t obj_idx,
+		uint32_t op_mask,
+		const struct blockcipher_test_data *test_data,
+		uint32_t is_unit_test)
+{
+	const uint8_t *src;
+	uint32_t src_len;
+	uint32_t offset;
+
+	if (op_mask == TOP_ENC_AUTH ||
+			op_mask == TOP_AUTH_GEN ||
+			op_mask == TOP_AUTH_VER)
+		offset = test_data->auth_offset;
+	else
+		offset = test_data->cipher_offset;
+
+	if (op_mask & TOP_ENC_AUTH) {
+		src = test_data->plaintext.data;
+		src_len = test_data->plaintext.len;
+		if (is_unit_test)
+			debug_hexdump(stdout, "plaintext:", src, src_len);
+	} else {
+		src = test_data->ciphertext.data;
+		src_len = test_data->ciphertext.len;
+		memcpy(data->digest, test_data->digest.data,
+				test_data->digest.len);
+		if (is_unit_test) {
+			debug_hexdump(stdout, "ciphertext:", src, src_len);
+			debug_hexdump(stdout, "digest:", test_data->digest.data,
+					test_data->digest.len);
+		}
+	}
+
+	if (src_len > MBUF_DATAPAYLOAD_SIZE)
+		return -ENOMEM;
+
+	memcpy(data->seg_buf[0].seg, src, src_len);
+	data->seg_buf[0].seg_len = src_len;
+	obj->vec[obj_idx][0].iov_base =
+			(void *)(data->seg_buf[0].seg + offset);
+	obj->vec[obj_idx][0].iov_len = src_len - offset;
+
+	obj->sec_buf[obj_idx].vec = obj->vec[obj_idx];
+	obj->sec_buf[obj_idx].num = 1;
+
+	memcpy(data->iv, test_data->iv.data, test_data->iv.len);
+	if (is_unit_test)
+		debug_hexdump(stdout, "iv:", test_data->iv.data,
+				test_data->iv.len);
+
+	obj->iv[obj_idx] = (void *)data->iv;
+	obj->digest[obj_idx] = (void *)data->digest;
+
+	return 0;
+}
+
+static int
+check_blockcipher_result(struct cpu_crypto_test_case *tcase,
+		uint32_t op_mask,
+		const struct blockcipher_test_data *test_data)
+{
+	int ret;
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_CIPHER) {
+		const char *err_msg1, *err_msg2;
+		const uint8_t *src_pt_ct;
+		uint32_t src_len;
+
+		if (op_mask & TOP_ENC) {
+			src_pt_ct = test_data->ciphertext.data;
+			src_len = test_data->ciphertext.len;
+			err_msg1 = CPU_CRYPTO_ERR_EXP_CT;
+			err_msg2 = CPU_CRYPTO_ERR_GEN_CT;
+		} else {
+			src_pt_ct = test_data->plaintext.data;
+			src_len = test_data->plaintext.len;
+			err_msg1 = CPU_CRYPTO_ERR_EXP_PT;
+			err_msg2 = CPU_CRYPTO_ERR_GEN_PT;
+		}
+
+		ret = memcmp(tcase->seg_buf[0].seg, src_pt_ct, src_len);
+		if (ret != 0) {
+			debug_hexdump(stdout, err_msg1, src_pt_ct, src_len);
+			debug_hexdump(stdout, err_msg2,
+					tcase->seg_buf[0].seg,
+					src_len);
+			return -1;
+		}
+	}
+
+	if (op_mask & TOP_AUTH_GEN) {
+		ret = memcmp(tcase->digest, test_data->digest.data,
+				test_data->digest.len);
+		if (ret != 0) {
+			debug_hexdump(stdout, "expect digest:",
+					test_data->digest.data,
+					test_data->digest.len);
+			debug_hexdump(stdout, "gen digest:",
+					tcase->digest,
+					test_data->digest.len);
+			return -1;
+		}
+	}
+
+	return 0;
+}
+
+static int
+cpu_crypto_test_blockcipher(const struct blockcipher_test_data *tdata,
+		uint32_t op_mask)
+{
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	int ret;
+
+	ut_params->sess = create_blockcipher_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			op_mask,
+			tdata,
+			1);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(1);
+	if (ret)
+		return ret;
+
+	tcase = ut_params->test_datas[0];
+	ret = assemble_blockcipher_buf(tcase, obj, 0, op_mask, tdata, 1);
+	if (ret < 0) {
+		printf("Test is not supported by the driver\n");
+		return ret;
+	}
+
+	run_test(ts_params->ctx, ut_params->sess, obj, 1);
+
+	ret = check_status(obj, 1);
+	if (ret < 0)
+		return ret;
+
+	ret = check_blockcipher_result(tcase, op_mask, tdata);
+	if (ret < 0)
+		return ret;
+
+	return 0;
+}
+
+/* Macro to cut down boilerplate when defining blockcipher test cases */
+/* test-vector-name/op */
+#define all_blockcipher_test_cases \
+	TEST_EXPAND(aes_test_data_1, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_1, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_1, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_1, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_2, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_2, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_2, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_2, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_3, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_3, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_3, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_3, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_4, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_4, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_4, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_4, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_5, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_5, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_5, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_5, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_6, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_6, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_6, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_6, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_7, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_7, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_7, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_7, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_8, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_8, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_8, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_8, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_9, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_9, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_9, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_9, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_10, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_10, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_11, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_11, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_12, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_12, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_12, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_12, TOP_AUTH_DEC) \
+	TEST_EXPAND(aes_test_data_13, TOP_ENC) \
+	TEST_EXPAND(aes_test_data_13, TOP_DEC) \
+	TEST_EXPAND(aes_test_data_13, TOP_ENC_AUTH) \
+	TEST_EXPAND(aes_test_data_13, TOP_AUTH_DEC) \
+	TEST_EXPAND(des_test_data_1, TOP_ENC) \
+	TEST_EXPAND(des_test_data_1, TOP_DEC) \
+	TEST_EXPAND(des_test_data_2, TOP_ENC) \
+	TEST_EXPAND(des_test_data_2, TOP_DEC) \
+	TEST_EXPAND(des_test_data_3, TOP_ENC) \
+	TEST_EXPAND(des_test_data_3, TOP_DEC) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_DEC) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_ENC_AUTH) \
+	TEST_EXPAND(triple_des128cbc_hmac_sha1_test_vector, TOP_AUTH_DEC) \
+	TEST_EXPAND(triple_des64cbc_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des64cbc_test_vector, TOP_DEC) \
+	TEST_EXPAND(triple_des128cbc_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des128cbc_test_vector, TOP_DEC) \
+	TEST_EXPAND(triple_des192cbc_test_vector, TOP_ENC) \
+	TEST_EXPAND(triple_des192cbc_test_vector, TOP_DEC) \
+
+#define TEST_EXPAND(t, o)						\
+static int								\
+cpu_crypto_blockcipher_test_##t##_##o(void)				\
+{									\
+	return cpu_crypto_test_blockcipher(&t, o);			\
+}
+
+all_blockcipher_test_cases
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesni_mb_testsuite  = {
+	.suite_name = "Security CPU Crypto AESNI-MB Unit Test Suite",
+	.setup = testsuite_setup,
+	.teardown = testsuite_teardown,
+	.unit_test_cases = {
+#define TEST_EXPAND(t, o)						\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_enc_test_##t##_##o),		\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_aead_dec_test_##t##_##o),		\
+
+	all_gcm_unit_test_cases(SGL_ONE_SEG)
+	all_ccm_unit_test_cases
+#undef TEST_EXPAND
+
+#define TEST_EXPAND(t, o)						\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+			cpu_crypto_blockcipher_test_##t##_##o),		\
+
+	all_blockcipher_test_cases
+#undef TEST_EXPAND
+
+	TEST_CASES_END() /**< NULL terminate unit test array */
+	},
+};
+
+static int
+test_security_cpu_crypto_aesni_mb(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD));
+
+	return unit_test_suite_runner(&security_cpu_crypto_aesni_mb_testsuite);
+}
+
 REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
 		test_security_cpu_crypto_aesni_gcm);
 
 REGISTER_TEST_COMMAND(security_aesni_gcm_perftest,
 		test_security_cpu_crypto_aesni_gcm_perf);
+
+REGISTER_TEST_COMMAND(security_aesni_mb_autotest,
+		test_security_cpu_crypto_aesni_mb);
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 07/10] app/test: add aesni_mb security cpu crypto perftest
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
                       ` (5 preceding siblings ...)
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 06/10] app/test: add aesni_mb security cpu crypto autotest Fan Zhang
@ 2019-10-07 16:28     ` Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 08/10] ipsec: add rte_security cpu_crypto action support Fan Zhang
                       ` (2 subsequent siblings)
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

Since the crypto perf application does not support rte_security, this patch
adds a simple AES-CBC-SHA1-HMAC CPU crypto performance test to the crypto
unit-test application. The test covers a range of key and data sizes with
single-buffer test items and reports throughput as well as cycle-count
performance information.
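The printed figures come from a simple TSC-based calculation: the cycle count for one burst divided by the burst size gives cycles per buffer, and the TSC frequency divided by that, scaled to millions, gives Mpps; Gbps follows as `rate * size * 8 / 1000`. A hedged sketch of that arithmetic (the function names below are ours, not DPDK's):

```c
#include <stdint.h>

/* Mpps for a burst of `burst_size` buffers that took `cycles`
 * TSC ticks on a core running at `tsc_hz`. */
static double
burst_mpps(uint64_t tsc_hz, uint64_t cycles, uint32_t burst_size)
{
	double cycles_per_buf = (double)cycles / burst_size;

	return ((double)tsc_hz / cycles_per_buf) / 1e6;
}

/* Gbps for a given buffer size, matching the test's
 * `rate * test_data_szs[i] * 8 / 1000` expression. */
static double
mpps_to_gbps(double mpps, uint32_t buf_size)
{
	return mpps * buf_size * 8 / 1000.0;
}
```

For example, 64000 cycles for a 32-buffer burst at 2 GHz works out to 2000 cycles per buffer, i.e. 1.0 Mpps, or about 8.19 Gbps at 1024-byte buffers.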

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 app/test/test_security_cpu_crypto.c | 194 ++++++++++++++++++++++++++++++++++++
 1 file changed, 194 insertions(+)

diff --git a/app/test/test_security_cpu_crypto.c b/app/test/test_security_cpu_crypto.c
index a9853a0c0..c3689d138 100644
--- a/app/test/test_security_cpu_crypto.c
+++ b/app/test/test_security_cpu_crypto.c
@@ -1122,6 +1122,197 @@ test_security_cpu_crypto_aesni_mb(void)
 	return unit_test_suite_runner(&security_cpu_crypto_aesni_mb_testsuite);
 }
 
+static inline void
+switch_blockcipher_enc_to_dec(struct blockcipher_test_data *tdata,
+		struct cpu_crypto_test_case *tcase, uint8_t *dst)
+{
+	memcpy(dst, tcase->seg_buf[0].seg, tcase->seg_buf[0].seg_len);
+	tdata->ciphertext.len = tcase->seg_buf[0].seg_len;
+	memcpy(tdata->digest.data, tcase->digest, tdata->digest.len);
+}
+
+static int
+cpu_crypto_test_blockcipher_perf(
+		const enum rte_crypto_cipher_algorithm cipher_algo,
+		uint32_t cipher_key_sz,
+		const enum rte_crypto_auth_algorithm auth_algo,
+		uint32_t auth_key_sz, uint32_t digest_sz,
+		uint32_t op_mask)
+{
+	struct blockcipher_test_data tdata = {0};
+	uint8_t plaintext[3000], ciphertext[3000];
+	struct cpu_crypto_testsuite_params *ts_params = &testsuite_params;
+	struct cpu_crypto_unittest_params *ut_params = &unittest_params;
+	struct cpu_crypto_test_obj *obj = &ut_params->test_obj;
+	struct cpu_crypto_test_case *tcase;
+	uint64_t hz = rte_get_tsc_hz(), time_start, time_now;
+	double rate, cycles_per_buf;
+	uint32_t test_data_szs[] = {64, 128, 256, 512, 1024, 2048};
+	uint32_t i, j;
+	uint32_t op_mask_opp = 0;
+	int ret;
+
+	if (op_mask & BLOCKCIPHER_TEST_OP_CIPHER)
+		op_mask_opp |= (~op_mask & BLOCKCIPHER_TEST_OP_CIPHER);
+	if (op_mask & BLOCKCIPHER_TEST_OP_AUTH)
+		op_mask_opp |= (~op_mask & BLOCKCIPHER_TEST_OP_AUTH);
+
+	tdata.plaintext.data = plaintext;
+	tdata.ciphertext.data = ciphertext;
+
+	tdata.cipher_key.len = cipher_key_sz;
+	tdata.auth_key.len = auth_key_sz;
+
+	gen_rand(tdata.cipher_key.data, cipher_key_sz / 8);
+	gen_rand(tdata.auth_key.data, auth_key_sz / 8);
+
+	tdata.crypto_algo = cipher_algo;
+	tdata.auth_algo = auth_algo;
+
+	tdata.digest.len = digest_sz;
+
+	ut_params->sess = create_blockcipher_session(ts_params->ctx,
+			ts_params->session_priv_mpool,
+			op_mask,
+			&tdata,
+			0);
+	if (!ut_params->sess)
+		return -1;
+
+	ret = allocate_buf(MAX_NUM_OPS_INFLIGHT);
+	if (ret)
+		return ret;
+
+	for (i = 0; i < RTE_DIM(test_data_szs); i++) {
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tdata.plaintext.len = test_data_szs[i];
+			gen_rand(plaintext, tdata.plaintext.len);
+
+			tdata.iv.len = 16;
+			gen_rand(tdata.iv.data, tdata.iv.len);
+
+			tcase = ut_params->test_datas[j];
+			ret = assemble_blockcipher_buf(tcase, obj, j,
+					op_mask,
+					&tdata,
+					0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		/* warm up cache */
+		for (j = 0; j < CACHE_WARM_ITER; j++)
+			run_test(ts_params->ctx, ut_params->sess, obj,
+					MAX_NUM_OPS_INFLIGHT);
+
+		time_start = rte_rdtsc();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_rdtsc();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("%s-%u-%s(%4uB) Enc %03.3fMpps (%03.3fGbps) ",
+			rte_crypto_cipher_algorithm_strings[cipher_algo],
+			cipher_key_sz * 8,
+			rte_crypto_auth_algorithm_strings[auth_algo],
+			test_data_szs[i],
+			rate, rate * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+			cycles_per_buf, cycles_per_buf / test_data_szs[i]);
+
+		for (j = 0; j < MAX_NUM_OPS_INFLIGHT; j++) {
+			tcase = ut_params->test_datas[j];
+
+			switch_blockcipher_enc_to_dec(&tdata, tcase,
+					ciphertext);
+			ret = assemble_blockcipher_buf(tcase, obj, j,
+					op_mask_opp,
+					&tdata,
+					0);
+			if (ret < 0) {
+				printf("Test is not supported by the driver\n");
+				return ret;
+			}
+		}
+
+		time_start = rte_rdtsc();
+
+		run_test(ts_params->ctx, ut_params->sess, obj,
+				MAX_NUM_OPS_INFLIGHT);
+
+		time_now = rte_rdtsc();
+
+		rate = time_now - time_start;
+		cycles_per_buf = rate / MAX_NUM_OPS_INFLIGHT;
+
+		rate = ((hz / cycles_per_buf)) / 1000000;
+
+		printf("%s-%u-%s(%4uB) Dec %03.3fMpps (%03.3fGbps) ",
+			rte_crypto_cipher_algorithm_strings[cipher_algo],
+			cipher_key_sz * 8,
+			rte_crypto_auth_algorithm_strings[auth_algo],
+			test_data_szs[i],
+			rate, rate * test_data_szs[i] * 8 / 1000);
+		printf("cycles per buf %03.3f per byte %03.3f\n",
+				cycles_per_buf,
+				cycles_per_buf / test_data_szs[i]);
+	}
+
+	return 0;
+}
+
+/* cipher-algo/cipher-key-len/auth-algo/auth-key-len/digest-len/op */
+#define all_block_cipher_perf_test_cases				\
+	TEST_EXPAND(_AES_CBC, 128, _NULL, 0, 0, TOP_ENC)		\
+	TEST_EXPAND(_NULL, 0, _SHA1_HMAC, 160, 20, TOP_AUTH_GEN)	\
+	TEST_EXPAND(_AES_CBC, 128, _SHA1_HMAC, 160, 20, TOP_ENC_AUTH)
+
+#define TEST_EXPAND(a, b, c, d, e, f)					\
+static int								\
+cpu_crypto_blockcipher_perf##a##_##b##c##_##f(void)			\
+{									\
+	return cpu_crypto_test_blockcipher_perf(RTE_CRYPTO_CIPHER##a,	\
+			b / 8, RTE_CRYPTO_AUTH##c, d / 8, e, f);	\
+}									\
+
+all_block_cipher_perf_test_cases
+#undef TEST_EXPAND
+
+static struct unit_test_suite security_cpu_crypto_aesni_mb_perf_testsuite  = {
+	.suite_name = "Security CPU Crypto AESNI-MB Perf Test Suite",
+	.setup = testsuite_setup,
+	.teardown = testsuite_teardown,
+	.unit_test_cases = {
+#define TEST_EXPAND(a, b, c, d, e, f)					\
+	TEST_CASE_ST(ut_setup, ut_teardown,				\
+		cpu_crypto_blockcipher_perf##a##_##b##c##_##f),	\
+
+	all_block_cipher_perf_test_cases
+#undef TEST_EXPAND
+
+	TEST_CASES_END() /**< NULL terminate unit test array */
+	},
+};
+
+static int
+test_security_cpu_crypto_aesni_mb_perf(void)
+{
+	gbl_driver_id =	rte_cryptodev_driver_id_get(
+			RTE_STR(CRYPTODEV_NAME_AESNI_MB_PMD));
+
+	return unit_test_suite_runner(
+			&security_cpu_crypto_aesni_mb_perf_testsuite);
+}
+
+
 REGISTER_TEST_COMMAND(security_aesni_gcm_autotest,
 		test_security_cpu_crypto_aesni_gcm);
 
@@ -1130,3 +1321,6 @@ REGISTER_TEST_COMMAND(security_aesni_gcm_perftest,
 
 REGISTER_TEST_COMMAND(security_aesni_mb_autotest,
 		test_security_cpu_crypto_aesni_mb);
+
+REGISTER_TEST_COMMAND(security_aesni_mb_perftest,
+		test_security_cpu_crypto_aesni_mb_perf);
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 08/10] ipsec: add rte_security cpu_crypto action support
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
                       ` (6 preceding siblings ...)
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 07/10] app/test: add aesni_mb security cpu crypto perftest Fan Zhang
@ 2019-10-07 16:28     ` Fan Zhang
  2019-10-08 23:28       ` Ananyev, Konstantin
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 09/10] examples/ipsec-secgw: add security " Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 10/10] doc: update security cpu process description Fan Zhang
  9 siblings, 1 reply; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch updates the ipsec library to handle the newly introduced
RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 lib/librte_ipsec/crypto.h   |  24 +++
 lib/librte_ipsec/esp_inb.c  | 200 ++++++++++++++++++++++--
 lib/librte_ipsec/esp_outb.c | 369 +++++++++++++++++++++++++++++++++++++++++---
 lib/librte_ipsec/sa.c       |  53 ++++++-
 lib/librte_ipsec/sa.h       |  29 ++++
 lib/librte_ipsec/ses.c      |   4 +-
 6 files changed, 643 insertions(+), 36 deletions(-)

diff --git a/lib/librte_ipsec/crypto.h b/lib/librte_ipsec/crypto.h
index f8fbf8d4f..901c8c7de 100644
--- a/lib/librte_ipsec/crypto.h
+++ b/lib/librte_ipsec/crypto.h
@@ -179,4 +179,28 @@ lksd_none_cop_prepare(struct rte_crypto_op *cop,
 	__rte_crypto_sym_op_attach_sym_session(sop, cs);
 }
 
+typedef void* (*_set_icv_f)(void *val, struct rte_mbuf *ml, uint32_t icv_off);
+
+static inline void *
+set_icv_va_pa(void *val, struct rte_mbuf *ml, uint32_t icv_off)
+{
+	union sym_op_data *icv = val;
+
+	icv->va = rte_pktmbuf_mtod_offset(ml, void *, icv_off);
+	icv->pa = rte_pktmbuf_iova_offset(ml, icv_off);
+
+	return icv->va;
+}
+
+static inline void *
+set_icv_va(void *val, struct rte_mbuf *ml, uint32_t icv_off)
+{
+	void **icv_va = val;
+
+	*icv_va = rte_pktmbuf_mtod_offset(ml, void *, icv_off);
+
+	return *icv_va;
+}
+
 #endif /* _CRYPTO_H_ */
diff --git a/lib/librte_ipsec/esp_inb.c b/lib/librte_ipsec/esp_inb.c
index 8e3ecbc64..c4476e819 100644
--- a/lib/librte_ipsec/esp_inb.c
+++ b/lib/librte_ipsec/esp_inb.c
@@ -105,6 +105,78 @@ inb_cop_prepare(struct rte_crypto_op *cop,
 	}
 }
 
+static inline int
+inb_cpu_crypto_proc_prepare(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb,
+	uint32_t pofs, uint32_t plen,
+	struct rte_security_vec *buf, struct iovec *cur_vec,
+	void *iv)
+{
+	struct rte_mbuf *ms;
+	struct iovec *vec = cur_vec;
+	struct aead_gcm_iv *gcm;
+	struct aesctr_cnt_blk *ctr;
+	uint64_t *ivp;
+	uint32_t algo;
+	uint32_t left;
+	uint32_t off = 0, n_seg = 0;
+
+	ivp = rte_pktmbuf_mtod_offset(mb, uint64_t *,
+		pofs + sizeof(struct rte_esp_hdr));
+	algo = sa->algo_type;
+
+	switch (algo) {
+	case ALGO_TYPE_AES_GCM:
+		gcm = (struct aead_gcm_iv *)iv;
+		aead_gcm_iv_fill(gcm, ivp[0], sa->salt);
+		off = sa->ctp.cipher.offset + pofs;
+		left = plen - sa->ctp.cipher.length;
+		break;
+	case ALGO_TYPE_AES_CBC:
+	case ALGO_TYPE_3DES_CBC:
+		copy_iv(iv, ivp, sa->iv_len);
+		off = sa->ctp.auth.offset + pofs;
+		left = plen - sa->ctp.auth.length;
+		break;
+	case ALGO_TYPE_AES_CTR:
+		copy_iv(iv, ivp, sa->iv_len);
+		off = sa->ctp.auth.offset + pofs;
+		left = plen - sa->ctp.auth.length;
+		ctr = (struct aesctr_cnt_blk *)iv;
+		aes_ctr_cnt_blk_fill(ctr, ivp[0], sa->salt);
+		break;
+	case ALGO_TYPE_NULL:
+		left = plen - sa->ctp.cipher.length;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	ms = mbuf_get_seg_ofs(mb, &off);
+	if (!ms)
+		return -1;
+
+	while (n_seg < RTE_LIBRTE_IP_FRAG_MAX_FRAG && left && ms) {
+		uint32_t len = RTE_MIN(left, ms->data_len - off);
+
+		vec->iov_base = rte_pktmbuf_mtod_offset(ms, void *, off);
+		vec->iov_len = len;
+
+		left -= len;
+		vec++;
+		n_seg++;
+		ms = ms->next;
+		off = 0;
+	}
+
+	if (left)
+		return -1;
+
+	buf->vec = cur_vec;
+	buf->num = n_seg;
+
+	return n_seg;
+}
+
 /*
  * Helper function for prepare() to deal with situation when
  * ICV is spread by two segments. Tries to move ICV completely into the
@@ -139,20 +211,21 @@ move_icv(struct rte_mbuf *ml, uint32_t ofs)
  */
 static inline void
 inb_pkt_xprepare(const struct rte_ipsec_sa *sa, rte_be64_t sqc,
-	const union sym_op_data *icv)
+	uint8_t *icv_va, void *aad_buf, uint32_t aad_off)
 {
 	struct aead_gcm_aad *aad;
 
 	/* insert SQN.hi between ESP trailer and ICV */
 	if (sa->sqh_len != 0)
-		insert_sqh(sqn_hi32(sqc), icv->va, sa->icv_len);
+		insert_sqh(sqn_hi32(sqc), icv_va, sa->icv_len);
 
 	/*
 	 * fill AAD fields, if any (aad fields are placed after icv),
 	 * right now we support only one AEAD algorithm: AES-GCM.
 	 */
 	if (sa->aad_len != 0) {
-		aad = (struct aead_gcm_aad *)(icv->va + sa->icv_len);
+		aad = aad_buf ? aad_buf :
+				(struct aead_gcm_aad *)(icv_va + aad_off);
 		aead_gcm_aad_fill(aad, sa->spi, sqc, IS_ESN(sa));
 	}
 }
@@ -162,13 +235,15 @@ inb_pkt_xprepare(const struct rte_ipsec_sa *sa, rte_be64_t sqc,
  */
 static inline int32_t
 inb_pkt_prepare(const struct rte_ipsec_sa *sa, const struct replay_sqn *rsn,
-	struct rte_mbuf *mb, uint32_t hlen, union sym_op_data *icv)
+	struct rte_mbuf *mb, uint32_t hlen, _set_icv_f set_icv, void *icv_val,
+	void *aad_buf)
 {
 	int32_t rc;
 	uint64_t sqn;
 	uint32_t clen, icv_len, icv_ofs, plen;
 	struct rte_mbuf *ml;
 	struct rte_esp_hdr *esph;
+	void *icv_va;
 
 	esph = rte_pktmbuf_mtod_offset(mb, struct rte_esp_hdr *, hlen);
 
@@ -226,8 +301,8 @@ inb_pkt_prepare(const struct rte_ipsec_sa *sa, const struct replay_sqn *rsn,
 	if (sa->aad_len + sa->sqh_len > rte_pktmbuf_tailroom(ml))
 		return -ENOSPC;
 
-	icv->va = rte_pktmbuf_mtod_offset(ml, void *, icv_ofs);
-	icv->pa = rte_pktmbuf_iova_offset(ml, icv_ofs);
+	icv_va = set_icv(icv_val, ml, icv_ofs);
+	inb_pkt_xprepare(sa, sqn, icv_va, aad_buf, sa->icv_len);
 
 	/*
 	 * if esn is used then high-order 32 bits are also used in ICV
@@ -238,7 +313,6 @@ inb_pkt_prepare(const struct rte_ipsec_sa *sa, const struct replay_sqn *rsn,
 	mb->pkt_len += sa->sqh_len;
 	ml->data_len += sa->sqh_len;
 
-	inb_pkt_xprepare(sa, sqn, icv);
 	return plen;
 }
 
@@ -265,7 +339,8 @@ esp_inb_pkt_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 	for (i = 0; i != num; i++) {
 
 		hl = mb[i]->l2_len + mb[i]->l3_len;
-		rc = inb_pkt_prepare(sa, rsn, mb[i], hl, &icv);
+		rc = inb_pkt_prepare(sa, rsn, mb[i], hl, set_icv_va_pa,
+				(void *)&icv, NULL);
 		if (rc >= 0) {
 			lksd_none_cop_prepare(cop[k], cs, mb[i]);
 			inb_cop_prepare(cop[k], sa, mb[i], &icv, hl, rc);
@@ -512,7 +587,6 @@ tun_process(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb[],
 	return k;
 }
 
-
 /*
  * *process* function for tunnel packets
  */
@@ -625,6 +699,114 @@ esp_inb_pkt_process(struct rte_ipsec_sa *sa, struct rte_mbuf *mb[],
 	return n;
 }
 
+/*
+ * process packets using sync crypto engine
+ */
+static uint16_t
+esp_inb_cpu_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num,
+		esp_inb_process_t process)
+{
+	int32_t rc;
+	uint32_t i, hl, n, p;
+	struct rte_ipsec_sa *sa;
+	struct replay_sqn *rsn;
+	void *icv_va;
+	uint32_t sqn[num];
+	uint32_t dr[num];
+	uint8_t sqh_len;
+
+	/* cpu crypto specific variables */
+	struct rte_security_vec buf[num];
+	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
+	uint32_t vec_idx = 0;
+	uint64_t iv_buf[num][IPSEC_MAX_IV_QWORD];
+	void *iv[num];
+	int status[num];
+	uint8_t aad_buf[num][sizeof(struct aead_gcm_aad)];
+	void *aad[num];
+	void *digest[num];
+	uint32_t k;
+
+	sa = ss->sa;
+	rsn = rsn_acquire(sa);
+	sqh_len = sa->sqh_len;
+
+	k = 0;
+	for (i = 0; i != num; i++) {
+		hl = mb[i]->l2_len + mb[i]->l3_len;
+		rc = inb_pkt_prepare(sa, rsn, mb[i], hl, set_icv_va,
+				(void *)&icv_va, (void *)aad_buf[k]);
+		if (rc >= 0) {
+			iv[k] = (void *)iv_buf[k];
+			aad[k] = (void *)aad_buf[k];
+			digest[k] = (void *)icv_va;
+
+			rc = inb_cpu_crypto_proc_prepare(sa, mb[i], hl,
+					rc, &buf[k], &vec[vec_idx], iv[k]);
+			if (rc < 0) {
+				dr[i - k] = i;
+				continue;
+			}
+
+			vec_idx += rc;
+			k++;
+		} else
+			dr[i - k] = i;
+	}
+
+	/* move mbufs that failed preparation beyond the good ones */
+	if (k != num) {
+		rte_errno = EBADMSG;
+
+		if (unlikely(k == 0))
+			return 0;
+
+		move_bad_mbufs(mb, dr, num, num - k);
+	}
+
+	/* process the packets */
+	n = 0;
+	rc = rte_security_process_cpu_crypto_bulk(ss->security.ctx,
+			ss->security.ses, buf, iv, aad, digest, status, k);
+	/* move failed process packets to dr */
+	for (i = 0; i < k; i++) {
+		if (status[i]) {
+			dr[n++] = i;
+			rte_errno = EBADMSG;
+		}
+	}
+
+	/* move bad packets to the back */
+	if (n)
+		move_bad_mbufs(mb, dr, k, n);
+
+	/* process packets */
+	p = process(sa, mb, sqn, dr, k - n, sqh_len);
+
+	if (p != k - n && p != 0)
+		move_bad_mbufs(mb, dr, k - n, k - n - p);
+
+	if (p != num)
+		rte_errno = EBADMSG;
+
+	return p;
+}
+
+uint16_t
+esp_inb_tun_cpu_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_inb_cpu_crypto_pkt_process(ss, mb, num, tun_process);
+}
+
+uint16_t
+esp_inb_trs_cpu_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_inb_cpu_crypto_pkt_process(ss, mb, num, trs_process);
+}
+
 /*
  * process group of ESP inbound tunnel packets.
  */
diff --git a/lib/librte_ipsec/esp_outb.c b/lib/librte_ipsec/esp_outb.c
index 55799a867..ecfc4cd3f 100644
--- a/lib/librte_ipsec/esp_outb.c
+++ b/lib/librte_ipsec/esp_outb.c
@@ -104,7 +104,7 @@ outb_cop_prepare(struct rte_crypto_op *cop,
 static inline int32_t
 outb_tun_pkt_prepare(struct rte_ipsec_sa *sa, rte_be64_t sqc,
 	const uint64_t ivp[IPSEC_MAX_IV_QWORD], struct rte_mbuf *mb,
-	union sym_op_data *icv, uint8_t sqh_len)
+	_set_icv_f set_icv, void *icv_val, uint8_t sqh_len)
 {
 	uint32_t clen, hlen, l2len, pdlen, pdofs, plen, tlen;
 	struct rte_mbuf *ml;
@@ -177,8 +177,8 @@ outb_tun_pkt_prepare(struct rte_ipsec_sa *sa, rte_be64_t sqc,
 	espt->pad_len = pdlen;
 	espt->next_proto = sa->proto;
 
-	icv->va = rte_pktmbuf_mtod_offset(ml, void *, pdofs);
-	icv->pa = rte_pktmbuf_iova_offset(ml, pdofs);
+	/* set icv va/pa value(s) */
+	set_icv(icv_val, ml, pdofs);
 
 	return clen;
 }
@@ -189,14 +189,14 @@ outb_tun_pkt_prepare(struct rte_ipsec_sa *sa, rte_be64_t sqc,
  */
 static inline void
 outb_pkt_xprepare(const struct rte_ipsec_sa *sa, rte_be64_t sqc,
-	const union sym_op_data *icv)
+	uint8_t *icv_va, void *aad_buf)
 {
 	uint32_t *psqh;
 	struct aead_gcm_aad *aad;
 
 	/* insert SQN.hi between ESP trailer and ICV */
 	if (sa->sqh_len != 0) {
-		psqh = (uint32_t *)(icv->va - sa->sqh_len);
+		psqh = (uint32_t *)(icv_va - sa->sqh_len);
 		psqh[0] = sqn_hi32(sqc);
 	}
 
@@ -205,7 +205,7 @@ outb_pkt_xprepare(const struct rte_ipsec_sa *sa, rte_be64_t sqc,
 	 * right now we support only one AEAD algorithm: AES-GCM .
 	 */
 	if (sa->aad_len != 0) {
-		aad = (struct aead_gcm_aad *)(icv->va + sa->icv_len);
+		aad = aad_buf;
 		aead_gcm_aad_fill(aad, sa->spi, sqc, IS_ESN(sa));
 	}
 }
@@ -242,11 +242,12 @@ esp_outb_tun_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 		gen_iv(iv, sqc);
 
 		/* try to update the packet itself */
-		rc = outb_tun_pkt_prepare(sa, sqc, iv, mb[i], &icv,
-					  sa->sqh_len);
+		rc = outb_tun_pkt_prepare(sa, sqc, iv, mb[i], set_icv_va_pa,
+				(void *)&icv, sa->sqh_len);
 		/* success, setup crypto op */
 		if (rc >= 0) {
-			outb_pkt_xprepare(sa, sqc, &icv);
+			outb_pkt_xprepare(sa, sqc, icv.va,
+					(void *)(icv.va + sa->icv_len));
 			lksd_none_cop_prepare(cop[k], cs, mb[i]);
 			outb_cop_prepare(cop[k], sa, iv, &icv, 0, rc);
 			k++;
@@ -270,7 +271,7 @@ esp_outb_tun_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 static inline int32_t
 outb_trs_pkt_prepare(struct rte_ipsec_sa *sa, rte_be64_t sqc,
 	const uint64_t ivp[IPSEC_MAX_IV_QWORD], struct rte_mbuf *mb,
-	uint32_t l2len, uint32_t l3len, union sym_op_data *icv,
+	uint32_t l2len, uint32_t l3len, _set_icv_f set_icv, void *icv_val,
 	uint8_t sqh_len)
 {
 	uint8_t np;
@@ -340,8 +341,7 @@ outb_trs_pkt_prepare(struct rte_ipsec_sa *sa, rte_be64_t sqc,
 	espt->pad_len = pdlen;
 	espt->next_proto = np;
 
-	icv->va = rte_pktmbuf_mtod_offset(ml, void *, pdofs);
-	icv->pa = rte_pktmbuf_iova_offset(ml, pdofs);
+	set_icv(icv_val, ml, pdofs);
 
 	return clen;
 }
@@ -381,11 +381,12 @@ esp_outb_trs_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 		gen_iv(iv, sqc);
 
 		/* try to update the packet itself */
-		rc = outb_trs_pkt_prepare(sa, sqc, iv, mb[i], l2, l3, &icv,
-					  sa->sqh_len);
+		rc = outb_trs_pkt_prepare(sa, sqc, iv, mb[i], l2, l3,
+				set_icv_va_pa, (void *)&icv, sa->sqh_len);
 		/* success, setup crypto op */
 		if (rc >= 0) {
-			outb_pkt_xprepare(sa, sqc, &icv);
+			outb_pkt_xprepare(sa, sqc, icv.va,
+					(void *)(icv.va + sa->icv_len));
 			lksd_none_cop_prepare(cop[k], cs, mb[i]);
 			outb_cop_prepare(cop[k], sa, iv, &icv, l2 + l3, rc);
 			k++;
@@ -403,6 +404,335 @@ esp_outb_trs_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 	return k;
 }
 
+static inline int
+outb_cpu_crypto_proc_prepare(struct rte_mbuf *m, const struct rte_ipsec_sa *sa,
+		uint32_t hlen, uint32_t plen,
+		struct rte_security_vec *buf, struct iovec *cur_vec, void *iv)
+{
+	struct rte_mbuf *ms;
+	uint64_t *ivp = iv;
+	struct aead_gcm_iv *gcm;
+	struct aesctr_cnt_blk *ctr;
+	struct iovec *vec = cur_vec;
+	uint32_t left;
+	uint32_t off = 0;
+	uint32_t n_seg = 0;
+	uint32_t algo;
+
+	algo = sa->algo_type;
+
+	switch (algo) {
+	case ALGO_TYPE_AES_GCM:
+		gcm = iv;
+		aead_gcm_iv_fill(gcm, ivp[0], sa->salt);
+		off = sa->ctp.cipher.offset + hlen;
+		left = sa->ctp.cipher.length + plen;
+		break;
+	case ALGO_TYPE_AES_CBC:
+	case ALGO_TYPE_3DES_CBC:
+		off = sa->ctp.auth.offset + hlen;
+		left = sa->ctp.auth.length + plen;
+		break;
+	case ALGO_TYPE_AES_CTR:
+		off = sa->ctp.auth.offset + hlen;
+		left = sa->ctp.auth.length + plen;
+		ctr = iv;
+		aes_ctr_cnt_blk_fill(ctr, ivp[0], sa->salt);
+		break;
+	case ALGO_TYPE_NULL:
+		left = sa->ctp.cipher.length + plen;
+		break;
+	default:
+		return -EINVAL;
+	}
+
+	ms = mbuf_get_seg_ofs(m, &off);
+	if (!ms)
+		return -1;
+
+	while (n_seg < m->nb_segs && left && ms) {
+		uint32_t len = RTE_MIN(left, ms->data_len - off);
+
+		vec->iov_base = rte_pktmbuf_mtod_offset(ms, void *, off);
+		vec->iov_len = len;
+
+		left -= len;
+		vec++;
+		n_seg++;
+		ms = ms->next;
+		off = 0;
+	}
+
+	if (left)
+		return -1;
+
+	buf->vec = cur_vec;
+	buf->num = n_seg;
+
+	return n_seg;
+}
+
+static uint16_t
+esp_outb_tun_cpu_crypto_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	uint64_t sqn;
+	rte_be64_t sqc;
+	struct rte_ipsec_sa *sa;
+	struct rte_security_ctx *ctx;
+	struct rte_security_session *rss;
+	void *icv_va;
+	uint32_t dr[num];
+	uint32_t i, n;
+	int32_t rc;
+
+	/* cpu crypto specific variables */
+	struct rte_security_vec buf[num];
+	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
+	uint32_t vec_idx = 0;
+	uint64_t iv_buf[num][IPSEC_MAX_IV_QWORD];
+	void *iv[num];
+	int status[num];
+	uint8_t aad_buf[num][sizeof(struct aead_gcm_aad)];
+	void *aad[num];
+	void *digest[num];
+	uint32_t k;
+
+	sa = ss->sa;
+	ctx = ss->security.ctx;
+	rss = ss->security.ses;
+
+	k = 0;
+	n = num;
+	sqn = esn_outb_update_sqn(sa, &n);
+	if (n != num)
+		rte_errno = EOVERFLOW;
+
+	for (i = 0; i != n; i++) {
+		sqc = rte_cpu_to_be_64(sqn + i);
+		gen_iv(iv_buf[k], sqc);
+
+		/* try to update the packet itself */
+		rc = outb_tun_pkt_prepare(sa, sqc, iv_buf[k], mb[i], set_icv_va,
+				(void *)&icv_va, sa->sqh_len);
+
+		/* success, setup crypto op */
+		if (rc >= 0) {
+			iv[k] = (void *)iv_buf[k];
+			aad[k] = (void *)aad_buf[k];
+			digest[k] = (void *)icv_va;
+
+			outb_pkt_xprepare(sa, sqc, icv_va, aad[k]);
+
+			rc = outb_cpu_crypto_proc_prepare(mb[i], sa,
+					0, rc, &buf[k], &vec[vec_idx], iv[k]);
+			if (rc < 0) {
+				dr[i - k] = i;
+				rte_errno = -rc;
+				continue;
+			}
+
+			vec_idx += rc;
+			k++;
+		/* failure, put packet into the death-row */
+		} else {
+			dr[i - k] = i;
+			rte_errno = -rc;
+		}
+	}
+
+	/* copy not prepared mbufs beyond good ones */
+	if (k != n && k != 0)
+		move_bad_mbufs(mb, dr, n, n - k);
+
+	if (unlikely(k == 0)) {
+		rte_errno = EBADMSG;
+		return 0;
+	}
+
+	/* process the packets */
+	n = 0;
+	rc = rte_security_process_cpu_crypto_bulk(ctx, rss, buf, iv, aad,
+			digest, status, k);
+	/* move failed process packets to dr */
+	if (rc < 0)
+		for (i = 0; i < k; i++) {
+			if (status[i])
+				dr[n++] = i;
+		}
+
+	if (n)
+		move_bad_mbufs(mb, dr, k, n);
+
+	return k - n;
+}
+
+static uint16_t
+esp_outb_trs_cpu_crypto_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	uint64_t sqn;
+	rte_be64_t sqc;
+	struct rte_ipsec_sa *sa;
+	struct rte_security_ctx *ctx;
+	struct rte_security_session *rss;
+	void *icv_va;
+	uint32_t dr[num];
+	uint32_t i, n;
+	uint32_t l2, l3;
+	int32_t rc;
+
+	/* cpu crypto specific variables */
+	struct rte_security_vec buf[num];
+	struct iovec vec[RTE_LIBRTE_IP_FRAG_MAX_FRAG * num];
+	uint32_t vec_idx = 0;
+	uint64_t iv_buf[num][IPSEC_MAX_IV_QWORD];
+	void *iv[num];
+	int status[num];
+	uint8_t aad_buf[num][sizeof(struct aead_gcm_aad)];
+	void *aad[num];
+	void *digest[num];
+	uint32_t k;
+
+	sa = ss->sa;
+	ctx = ss->security.ctx;
+	rss = ss->security.ses;
+
+	k = 0;
+	n = num;
+	sqn = esn_outb_update_sqn(sa, &n);
+	if (n != num)
+		rte_errno = EOVERFLOW;
+
+	for (i = 0; i != n; i++) {
+		l2 = mb[i]->l2_len;
+		l3 = mb[i]->l3_len;
+
+		sqc = rte_cpu_to_be_64(sqn + i);
+		gen_iv(iv_buf[k], sqc);
+
+		/* try to update the packet itself */
+		rc = outb_trs_pkt_prepare(sa, sqc, iv_buf[k], mb[i], l2, l3,
+				set_icv_va, (void *)&icv_va, sa->sqh_len);
+
+		/* success, setup crypto op */
+		if (rc >= 0) {
+			iv[k] = (void *)iv_buf[k];
+			aad[k] = (void *)aad_buf[k];
+			digest[k] = (void *)icv_va;
+
+			outb_pkt_xprepare(sa, sqc, icv_va, aad[k]);
+
+			rc = outb_cpu_crypto_proc_prepare(mb[i], sa,
+					l2 + l3, rc, &buf[k], &vec[vec_idx],
+					iv[k]);
+			if (rc < 0) {
+				dr[i - k] = i;
+				rte_errno = -rc;
+				continue;
+			}
+
+			vec_idx += rc;
+			k++;
+		/* failure, put packet into the death-row */
+		} else {
+			dr[i - k] = i;
+			rte_errno = -rc;
+		}
+	}
+
+	/* copy not prepared mbufs beyond good ones */
+	if (k != n && k != 0)
+		move_bad_mbufs(mb, dr, n, n - k);
+
+	if (unlikely(k == 0)) {
+		rte_errno = EBADMSG;
+		return 0;
+	}
+
+	/* process the packets */
+	n = 0;
+	rc = rte_security_process_cpu_crypto_bulk(ctx, rss, buf, iv, aad,
+			digest, status, k);
+	/* move failed process packets to dr */
+	if (rc < 0)
+		for (i = 0; i < k; i++) {
+			if (status[i])
+				dr[n++] = i;
+		}
+
+	if (n)
+		move_bad_mbufs(mb, dr, k, n);
+
+	return k - n;
+}
+
+uint16_t
+esp_outb_tun_cpu_crypto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	struct rte_ipsec_sa *sa = ss->sa;
+	uint32_t icv_len;
+	void *icv;
+	uint16_t n;
+	uint16_t i;
+
+	n = esp_outb_tun_cpu_crypto_process(ss, mb, num);
+
+	icv_len = sa->icv_len;
+
+	for (i = 0; i < n; i++) {
+		struct rte_mbuf *ml = rte_pktmbuf_lastseg(mb[i]);
+
+		mb[i]->pkt_len -= sa->sqh_len;
+		ml->data_len -= sa->sqh_len;
+
+		icv = rte_pktmbuf_mtod_offset(ml, void *,
+				ml->data_len - icv_len);
+		remove_sqh(icv, sa->icv_len);
+	}
+
+	return n;
+}
+
+uint16_t
+esp_outb_tun_cpu_crypto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_tun_cpu_crypto_process(ss, mb, num);
+}
+
+uint16_t
+esp_outb_trs_cpu_crypto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	struct rte_ipsec_sa *sa = ss->sa;
+	uint32_t icv_len;
+	void *icv;
+	uint16_t n;
+	uint16_t i;
+
+	n = esp_outb_trs_cpu_crypto_process(ss, mb, num);
+	icv_len = sa->icv_len;
+
+	for (i = 0; i < n; i++) {
+		struct rte_mbuf *ml = rte_pktmbuf_lastseg(mb[i]);
+
+		mb[i]->pkt_len -= sa->sqh_len;
+		ml->data_len -= sa->sqh_len;
+
+		icv = rte_pktmbuf_mtod_offset(ml, void *,
+				ml->data_len - icv_len);
+		remove_sqh(icv, sa->icv_len);
+	}
+
+	return n;
+}
+
+uint16_t
+esp_outb_trs_cpu_crypto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
+{
+	return esp_outb_trs_cpu_crypto_process(ss, mb, num);
+}
+
 /*
  * process outbound packets for SA with ESN support,
 * for algorithms that require SQN.hibits to be implicitly included
@@ -410,8 +740,8 @@ esp_outb_trs_prepare(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
  * In that case we have to move ICV bytes back to their proper place.
  */
 uint16_t
-esp_outb_sqh_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
-	uint16_t num)
+esp_outb_sqh_process(const struct rte_ipsec_session *ss,
+	struct rte_mbuf *mb[], uint16_t num)
 {
 	uint32_t i, k, icv_len, *icv;
 	struct rte_mbuf *ml;
@@ -498,7 +828,8 @@ inline_outb_tun_pkt_process(const struct rte_ipsec_session *ss,
 		gen_iv(iv, sqc);
 
 		/* try to update the packet itself */
-		rc = outb_tun_pkt_prepare(sa, sqc, iv, mb[i], &icv, 0);
+		rc = outb_tun_pkt_prepare(sa, sqc, iv, mb[i], set_icv_va_pa,
+				(void *)&icv, 0);
 
 		k += (rc >= 0);
 
@@ -552,7 +883,7 @@ inline_outb_trs_pkt_process(const struct rte_ipsec_session *ss,
 
 		/* try to update the packet itself */
 		rc = outb_trs_pkt_prepare(sa, sqc, iv, mb[i],
-				l2, l3, &icv, 0);
+				l2, l3, set_icv_va_pa, (void *)&icv, 0);
 
 		k += (rc >= 0);
 
diff --git a/lib/librte_ipsec/sa.c b/lib/librte_ipsec/sa.c
index 23d394b46..b8d55a1c7 100644
--- a/lib/librte_ipsec/sa.c
+++ b/lib/librte_ipsec/sa.c
@@ -544,9 +544,9 @@ lksd_proto_prepare(const struct rte_ipsec_session *ss,
  * - inbound/outbound for RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
  * - outbound for RTE_SECURITY_ACTION_TYPE_NONE when ESN is disabled
  */
-static uint16_t
-pkt_flag_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
-	uint16_t num)
+uint16_t
+esp_outb_pkt_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num)
 {
 	uint32_t i, k;
 	uint32_t dr[num];
@@ -599,12 +599,48 @@ lksd_none_pkt_func_select(const struct rte_ipsec_sa *sa,
 	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV6):
 		pf->prepare = esp_outb_tun_prepare;
 		pf->process = (sa->sqh_len != 0) ?
-			esp_outb_sqh_process : pkt_flag_process;
+			esp_outb_sqh_process : esp_outb_pkt_flag_process;
 		break;
 	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TRANS):
 		pf->prepare = esp_outb_trs_prepare;
 		pf->process = (sa->sqh_len != 0) ?
-			esp_outb_sqh_process : pkt_flag_process;
+			esp_outb_sqh_process : esp_outb_pkt_flag_process;
+		break;
+	default:
+		rc = -ENOTSUP;
+	}
+
+	return rc;
+}
+
+static int
+cpu_crypto_pkt_func_select(const struct rte_ipsec_sa *sa,
+		struct rte_ipsec_sa_pkt_func *pf)
+{
+	int32_t rc;
+
+	static const uint64_t msk = RTE_IPSEC_SATP_DIR_MASK |
+			RTE_IPSEC_SATP_MODE_MASK;
+
+	rc = 0;
+	switch (sa->type & msk) {
+	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TUNLV4):
+	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TUNLV6):
+		pf->process = esp_inb_tun_cpu_crypto_pkt_process;
+		break;
+	case (RTE_IPSEC_SATP_DIR_IB | RTE_IPSEC_SATP_MODE_TRANS):
+		pf->process = esp_inb_trs_cpu_crypto_pkt_process;
+		break;
+	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV4):
+	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TUNLV6):
+		pf->process = (sa->sqh_len != 0) ?
+			esp_outb_tun_cpu_crypto_sqh_process :
+			esp_outb_tun_cpu_crypto_flag_process;
+		break;
+	case (RTE_IPSEC_SATP_DIR_OB | RTE_IPSEC_SATP_MODE_TRANS):
+		pf->process = (sa->sqh_len != 0) ?
+			esp_outb_trs_cpu_crypto_sqh_process :
+			esp_outb_trs_cpu_crypto_flag_process;
 		break;
 	default:
 		rc = -ENOTSUP;
@@ -672,13 +708,16 @@ ipsec_sa_pkt_func_select(const struct rte_ipsec_session *ss,
 	case RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL:
 		if ((sa->type & RTE_IPSEC_SATP_DIR_MASK) ==
 				RTE_IPSEC_SATP_DIR_IB)
-			pf->process = pkt_flag_process;
+			pf->process = esp_outb_pkt_flag_process;
 		else
 			pf->process = inline_proto_outb_pkt_process;
 		break;
 	case RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL:
 		pf->prepare = lksd_proto_prepare;
-		pf->process = pkt_flag_process;
+		pf->process = esp_outb_pkt_flag_process;
+		break;
+	case RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
+		rc = cpu_crypto_pkt_func_select(sa, pf);
 		break;
 	default:
 		rc = -ENOTSUP;
diff --git a/lib/librte_ipsec/sa.h b/lib/librte_ipsec/sa.h
index 51e69ad05..770d36b8b 100644
--- a/lib/librte_ipsec/sa.h
+++ b/lib/librte_ipsec/sa.h
@@ -156,6 +156,14 @@ uint16_t
 inline_inb_trs_pkt_process(const struct rte_ipsec_session *ss,
 	struct rte_mbuf *mb[], uint16_t num);
 
+uint16_t
+esp_inb_tun_cpu_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_inb_trs_cpu_crypto_pkt_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
 /* outbound processing */
 
 uint16_t
@@ -170,6 +178,10 @@ uint16_t
 esp_outb_sqh_process(const struct rte_ipsec_session *ss, struct rte_mbuf *mb[],
 	uint16_t num);
 
+uint16_t
+esp_outb_pkt_flag_process(const struct rte_ipsec_session *ss,
+	struct rte_mbuf *mb[], uint16_t num);
+
 uint16_t
 inline_outb_tun_pkt_process(const struct rte_ipsec_session *ss,
 	struct rte_mbuf *mb[], uint16_t num);
@@ -182,4 +194,21 @@ uint16_t
 inline_proto_outb_pkt_process(const struct rte_ipsec_session *ss,
 	struct rte_mbuf *mb[], uint16_t num);
 
+uint16_t
+esp_outb_tun_cpu_crypto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_outb_tun_cpu_crypto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_outb_trs_cpu_crypto_sqh_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+uint16_t
+esp_outb_trs_cpu_crypto_flag_process(const struct rte_ipsec_session *ss,
+		struct rte_mbuf *mb[], uint16_t num);
+
+
 #endif /* _SA_H_ */
diff --git a/lib/librte_ipsec/ses.c b/lib/librte_ipsec/ses.c
index 82c765a33..eaa8c17b7 100644
--- a/lib/librte_ipsec/ses.c
+++ b/lib/librte_ipsec/ses.c
@@ -19,7 +19,9 @@ session_check(struct rte_ipsec_session *ss)
 			return -EINVAL;
 		if ((ss->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO ||
 				ss->type ==
-				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) &&
+				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL ||
+				ss->type ==
+				RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) &&
 				ss->security.ctx == NULL)
 			return -EINVAL;
 	}
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 09/10] examples/ipsec-secgw: add security cpu_crypto action support
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
                       ` (7 preceding siblings ...)
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 08/10] ipsec: add rte_security cpu_crypto action support Fan Zhang
@ 2019-10-07 16:28     ` " Fan Zhang
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 10/10] doc: update security cpu process description Fan Zhang
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

Since the ipsec library now supports the cpu_crypto security action type,
this patch updates the ipsec-secgw sample application with the new action
type "cpu-crypto". The patch also includes a number of test scripts to
demonstrate the correctness of the implementation.
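As the added test defs files show, an SA rule selects the new action type by
appending the "type cpu-crypto" token to the rule (the keys and SPIs below are
illustrative placeholders in the same style as the test configs):

```
sa out 7 cipher_algo aes-128-cbc \
cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
auth_algo sha1-hmac \
auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
mode transport type cpu-crypto
```

Unlike the inline action types, no portid option is required, since the
processing happens on the CPU rather than on a specific port.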

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 examples/ipsec-secgw/ipsec.c                       | 35 ++++++++++++++++++++++
 examples/ipsec-secgw/ipsec_process.c               |  7 +++--
 examples/ipsec-secgw/sa.c                          | 13 ++++++--
 examples/ipsec-secgw/test/run_test.sh              | 10 +++++++
 .../test/trs_3descbc_sha1_common_defs.sh           |  8 ++---
 .../test/trs_3descbc_sha1_cpu_crypto_defs.sh       |  5 ++++
 .../test/trs_aescbc_sha1_common_defs.sh            |  8 ++---
 .../test/trs_aescbc_sha1_cpu_crypto_defs.sh        |  5 ++++
 .../test/trs_aesctr_sha1_common_defs.sh            |  8 ++---
 .../test/trs_aesctr_sha1_cpu_crypto_defs.sh        |  5 ++++
 .../ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh |  5 ++++
 .../test/trs_aesgcm_mb_cpu_crypto_defs.sh          |  7 +++++
 .../test/tun_3descbc_sha1_common_defs.sh           |  8 ++---
 .../test/tun_3descbc_sha1_cpu_crypto_defs.sh       |  5 ++++
 .../test/tun_aescbc_sha1_common_defs.sh            |  8 ++---
 .../test/tun_aescbc_sha1_cpu_crypto_defs.sh        |  5 ++++
 .../test/tun_aesctr_sha1_common_defs.sh            |  8 ++---
 .../test/tun_aesctr_sha1_cpu_crypto_defs.sh        |  5 ++++
 .../ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh |  5 ++++
 .../test/tun_aesgcm_mb_cpu_crypto_defs.sh          |  7 +++++
 20 files changed, 138 insertions(+), 29 deletions(-)
 create mode 100644 examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
 create mode 100644 examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh

diff --git a/examples/ipsec-secgw/ipsec.c b/examples/ipsec-secgw/ipsec.c
index 1145ca1c0..02b9443a8 100644
--- a/examples/ipsec-secgw/ipsec.c
+++ b/examples/ipsec-secgw/ipsec.c
@@ -10,6 +10,7 @@
 #include <rte_crypto.h>
 #include <rte_security.h>
 #include <rte_cryptodev.h>
+#include <rte_ipsec.h>
 #include <rte_ethdev.h>
 #include <rte_mbuf.h>
 #include <rte_hash.h>
@@ -51,6 +52,19 @@ set_ipsec_conf(struct ipsec_sa *sa, struct rte_security_ipsec_xform *ipsec)
 	ipsec->esn_soft_limit = IPSEC_OFFLOAD_ESN_SOFTLIMIT;
 }
 
+static int32_t
+compute_cipher_offset(struct ipsec_sa *sa)
+{
+	int32_t offset;
+
+	if (sa->aead_algo == RTE_CRYPTO_AEAD_AES_GCM)
+		return 0;
+
+	offset = (sa->iv_len + sizeof(struct rte_esp_hdr));
+
+	return offset;
+}
+
 int
 create_lookaside_session(struct ipsec_ctx *ipsec_ctx, struct ipsec_sa *sa)
 {
@@ -117,6 +131,25 @@ create_lookaside_session(struct ipsec_ctx *ipsec_ctx, struct ipsec_sa *sa)
 				"SEC Session init failed: err: %d\n", ret);
 				return -1;
 			}
+		} else if (sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
+			struct rte_security_ctx *ctx =
+				(struct rte_security_ctx *)
+				rte_cryptodev_get_sec_ctx(
+					ipsec_ctx->tbl[cdev_id_qp].id);
+
+			/* Set IPsec parameters in conf */
+			sess_conf.cpucrypto.cipher_offset =
+					compute_cipher_offset(sa);
+
+			set_ipsec_conf(sa, &(sess_conf.ipsec));
+			sa->security_ctx = ctx;
+			sa->sec_session = rte_security_session_create(ctx,
+				&sess_conf, ipsec_ctx->session_priv_pool);
+			if (sa->sec_session == NULL) {
+				RTE_LOG(ERR, IPSEC,
+				"SEC Session init failed\n");
+				return -1;
+			}
 		} else {
 			RTE_LOG(ERR, IPSEC, "Inline not supported\n");
 			return -1;
@@ -512,6 +545,8 @@ ipsec_enqueue(ipsec_xform_fn xform_func, struct ipsec_ctx *ipsec_ctx,
 						sa->security_ctx,
 						sa->sec_session, pkts[i], NULL);
 			continue;
+		default:
+			continue;
 		}
 
 		RTE_ASSERT(sa->cdev_id_qp < ipsec_ctx->nb_qps);
diff --git a/examples/ipsec-secgw/ipsec_process.c b/examples/ipsec-secgw/ipsec_process.c
index 868f1a28d..1932b631f 100644
--- a/examples/ipsec-secgw/ipsec_process.c
+++ b/examples/ipsec-secgw/ipsec_process.c
@@ -101,7 +101,8 @@ fill_ipsec_session(struct rte_ipsec_session *ss, struct ipsec_ctx *ctx,
 		}
 		ss->crypto.ses = sa->crypto_session;
 	/* setup session action type */
-	} else if (sa->type == RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL) {
+	} else if (sa->type == RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL ||
+			sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
 		if (sa->sec_session == NULL) {
 			rc = create_lookaside_session(ctx, sa);
 			if (rc != 0)
@@ -227,8 +228,8 @@ ipsec_process(struct ipsec_ctx *ctx, struct ipsec_traffic *trf)
 
 		/* process packets inline */
 		else if (sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_CRYPTO ||
-				sa->type ==
-				RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL) {
+			sa->type == RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL ||
+			sa->type == RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) {
 
 			satp = rte_ipsec_sa_type(ips->sa);
 
diff --git a/examples/ipsec-secgw/sa.c b/examples/ipsec-secgw/sa.c
index c3cf3bd1f..ba773346f 100644
--- a/examples/ipsec-secgw/sa.c
+++ b/examples/ipsec-secgw/sa.c
@@ -570,6 +570,9 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
 				RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL;
 			else if (strcmp(tokens[ti], "no-offload") == 0)
 				rule->type = RTE_SECURITY_ACTION_TYPE_NONE;
+			else if (strcmp(tokens[ti], "cpu-crypto") == 0)
+				rule->type =
+					RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO;
 			else {
 				APP_CHECK(0, status, "Invalid input \"%s\"",
 						tokens[ti]);
@@ -624,10 +627,13 @@ parse_sa_tokens(char **tokens, uint32_t n_tokens,
 	if (status->status < 0)
 		return;
 
-	if ((rule->type != RTE_SECURITY_ACTION_TYPE_NONE) && (portid_p == 0))
+	if ((rule->type != RTE_SECURITY_ACTION_TYPE_NONE && rule->type !=
+			RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO) &&
+			(portid_p == 0))
 		printf("Missing portid option, falling back to non-offload\n");
 
-	if (!type_p || !portid_p) {
+	if (!type_p || (!portid_p && rule->type !=
+			RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO)) {
 		rule->type = RTE_SECURITY_ACTION_TYPE_NONE;
 		rule->portid = -1;
 	}
@@ -709,6 +715,9 @@ print_one_sa_rule(const struct ipsec_sa *sa, int inbound)
 	case RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL:
 		printf("lookaside-protocol-offload ");
 		break;
+	case RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
+		printf("cpu-crypto-accelerated ");
+		break;
 	}
 	printf("\n");
 }
diff --git a/examples/ipsec-secgw/test/run_test.sh b/examples/ipsec-secgw/test/run_test.sh
index 8055a4c04..bcaf91715 100755
--- a/examples/ipsec-secgw/test/run_test.sh
+++ b/examples/ipsec-secgw/test/run_test.sh
@@ -32,15 +32,21 @@ usage()
 }
 
 LINUX_TEST="tun_aescbc_sha1 \
+tun_aescbc_sha1_cpu_crypto \
 tun_aescbc_sha1_esn \
 tun_aescbc_sha1_esn_atom \
 tun_aesgcm \
+tun_aesgcm_cpu_crypto \
+tun_aesgcm_mb_cpu_crypto \
 tun_aesgcm_esn \
 tun_aesgcm_esn_atom \
 trs_aescbc_sha1 \
+trs_aescbc_sha1_cpu_crypto \
 trs_aescbc_sha1_esn \
 trs_aescbc_sha1_esn_atom \
 trs_aesgcm \
+trs_aesgcm_cpu_crypto \
+trs_aesgcm_mb_cpu_crypto \
 trs_aesgcm_esn \
 trs_aesgcm_esn_atom \
 tun_aescbc_sha1_old \
@@ -49,17 +55,21 @@ trs_aescbc_sha1_old \
 trs_aesgcm_old \
 tun_aesctr_sha1 \
 tun_aesctr_sha1_old \
+tun_aesctr_sha1_cpu_crypto \
 tun_aesctr_sha1_esn \
 tun_aesctr_sha1_esn_atom \
 trs_aesctr_sha1 \
+trs_aesctr_sha1_cpu_crypto \
 trs_aesctr_sha1_old \
 trs_aesctr_sha1_esn \
 trs_aesctr_sha1_esn_atom \
 tun_3descbc_sha1 \
+tun_3descbc_sha1_cpu_crypto \
 tun_3descbc_sha1_old \
 tun_3descbc_sha1_esn \
 tun_3descbc_sha1_esn_atom \
 trs_3descbc_sha1 \
+trs_3descbc_sha1_cpu_crypto \
 trs_3descbc_sha1_old \
 trs_3descbc_sha1_esn \
 trs_3descbc_sha1_esn_atom"
diff --git a/examples/ipsec-secgw/test/trs_3descbc_sha1_common_defs.sh b/examples/ipsec-secgw/test/trs_3descbc_sha1_common_defs.sh
index bb4cef6a9..eda2ddf0c 100644
--- a/examples/ipsec-secgw/test/trs_3descbc_sha1_common_defs.sh
+++ b/examples/ipsec-secgw/test/trs_3descbc_sha1_common_defs.sh
@@ -32,14 +32,14 @@ cipher_key \
 de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 sa in 9 cipher_algo 3des-cbc \
 cipher_key \
 de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 #SA out rules
 sa out 7 cipher_algo 3des-cbc \
@@ -47,7 +47,7 @@ cipher_key \
 de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 #SA out rules
 sa out 9 cipher_algo 3des-cbc \
@@ -55,7 +55,7 @@ cipher_key \
 de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 #Routing rules
 rt ipv4 dst ${REMOTE_IPV4}/32 port 0
diff --git a/examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..a864a8886
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_3descbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_3descbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aescbc_sha1_common_defs.sh b/examples/ipsec-secgw/test/trs_aescbc_sha1_common_defs.sh
index e2621e0df..49b7b0713 100644
--- a/examples/ipsec-secgw/test/trs_aescbc_sha1_common_defs.sh
+++ b/examples/ipsec-secgw/test/trs_aescbc_sha1_common_defs.sh
@@ -31,27 +31,27 @@ sa in 7 cipher_algo aes-128-cbc \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 sa in 9 cipher_algo aes-128-cbc \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 #SA out rules
 sa out 7 cipher_algo aes-128-cbc \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 #SA out rules
 sa out 9 cipher_algo aes-128-cbc \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 #Routing rules
 rt ipv4 dst ${REMOTE_IPV4}/32 port 0
diff --git a/examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..b515cd9f8
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aescbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_aescbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aesctr_sha1_common_defs.sh b/examples/ipsec-secgw/test/trs_aesctr_sha1_common_defs.sh
index 9c213e3cc..428322307 100644
--- a/examples/ipsec-secgw/test/trs_aesctr_sha1_common_defs.sh
+++ b/examples/ipsec-secgw/test/trs_aesctr_sha1_common_defs.sh
@@ -31,27 +31,27 @@ sa in 7 cipher_algo aes-128-ctr \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 sa in 9 cipher_algo aes-128-ctr \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 #SA out rules
 sa out 7 cipher_algo aes-128-ctr \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 #SA out rules
 sa out 9 cipher_algo aes-128-ctr \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode transport
+mode transport ${SGW_CFG_XPRM}
 
 #Routing rules
 rt ipv4 dst ${REMOTE_IPV4}/32 port 0
diff --git a/examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..745a2a02b
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aesctr_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_aesctr_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
new file mode 100644
index 000000000..8917122da
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aesgcm_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/trs_aesgcm_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
new file mode 100644
index 000000000..26943321f
--- /dev/null
+++ b/examples/ipsec-secgw/test/trs_aesgcm_mb_cpu_crypto_defs.sh
@@ -0,0 +1,7 @@
+#! /bin/bash
+
+. ${DIR}/trs_aesgcm_defs.sh
+
+CRYPTO_DEV=${CRYPTO_DEV:-'--vdev="crypto_aesni_mb0"'}
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_3descbc_sha1_common_defs.sh b/examples/ipsec-secgw/test/tun_3descbc_sha1_common_defs.sh
index dd802d6be..a583ef605 100644
--- a/examples/ipsec-secgw/test/tun_3descbc_sha1_common_defs.sh
+++ b/examples/ipsec-secgw/test/tun_3descbc_sha1_common_defs.sh
@@ -32,14 +32,14 @@ cipher_key \
 de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv4-tunnel src ${REMOTE_IPV4} dst ${LOCAL_IPV4}
+mode ipv4-tunnel src ${REMOTE_IPV4} dst ${LOCAL_IPV4} ${SGW_CFG_XPRM}
 
 sa in 9 cipher_algo 3des-cbc \
 cipher_key \
 de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv6-tunnel src ${REMOTE_IPV6} dst ${LOCAL_IPV6}
+mode ipv6-tunnel src ${REMOTE_IPV6} dst ${LOCAL_IPV6} ${SGW_CFG_XPRM}
 
 #SA out rules
 sa out 7 cipher_algo 3des-cbc \
@@ -47,14 +47,14 @@ cipher_key \
 de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv4-tunnel src ${LOCAL_IPV4} dst ${REMOTE_IPV4}
+mode ipv4-tunnel src ${LOCAL_IPV4} dst ${REMOTE_IPV4} ${SGW_CFG_XPRM}
 
 sa out 9 cipher_algo 3des-cbc \
 cipher_key \
 de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv6-tunnel src ${LOCAL_IPV6} dst ${REMOTE_IPV6}
+mode ipv6-tunnel src ${LOCAL_IPV6} dst ${REMOTE_IPV6} ${SGW_CFG_XPRM}
 
 #Routing rules
 rt ipv4 dst ${REMOTE_IPV4}/32 port 0
diff --git a/examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..747141f62
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_3descbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_3descbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aescbc_sha1_common_defs.sh b/examples/ipsec-secgw/test/tun_aescbc_sha1_common_defs.sh
index 4025da232..ac0232d2c 100644
--- a/examples/ipsec-secgw/test/tun_aescbc_sha1_common_defs.sh
+++ b/examples/ipsec-secgw/test/tun_aescbc_sha1_common_defs.sh
@@ -31,26 +31,26 @@ sa in 7 cipher_algo aes-128-cbc \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv4-tunnel src ${REMOTE_IPV4} dst ${LOCAL_IPV4}
+mode ipv4-tunnel src ${REMOTE_IPV4} dst ${LOCAL_IPV4} ${SGW_CFG_XPRM}
 
 sa in 9 cipher_algo aes-128-cbc \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv6-tunnel src ${REMOTE_IPV6} dst ${LOCAL_IPV6}
+mode ipv6-tunnel src ${REMOTE_IPV6} dst ${LOCAL_IPV6} ${SGW_CFG_XPRM}
 
 #SA out rules
 sa out 7 cipher_algo aes-128-cbc \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv4-tunnel src ${LOCAL_IPV4} dst ${REMOTE_IPV4}
+mode ipv4-tunnel src ${LOCAL_IPV4} dst ${REMOTE_IPV4} ${SGW_CFG_XPRM}
 
 sa out 9 cipher_algo aes-128-cbc \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv6-tunnel src ${LOCAL_IPV6} dst ${REMOTE_IPV6}
+mode ipv6-tunnel src ${LOCAL_IPV6} dst ${REMOTE_IPV6} ${SGW_CFG_XPRM}
 
 #Routing rules
 rt ipv4 dst ${REMOTE_IPV4}/32 port 0
diff --git a/examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..56076fa50
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aescbc_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_aescbc_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aesctr_sha1_common_defs.sh b/examples/ipsec-secgw/test/tun_aesctr_sha1_common_defs.sh
index a3ac3a698..523c396c9 100644
--- a/examples/ipsec-secgw/test/tun_aesctr_sha1_common_defs.sh
+++ b/examples/ipsec-secgw/test/tun_aesctr_sha1_common_defs.sh
@@ -31,26 +31,26 @@ sa in 7 cipher_algo aes-128-ctr \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv4-tunnel src ${REMOTE_IPV4} dst ${LOCAL_IPV4}
+mode ipv4-tunnel src ${REMOTE_IPV4} dst ${LOCAL_IPV4} ${SGW_CFG_XPRM}
 
 sa in 9 cipher_algo aes-128-ctr \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv6-tunnel src ${REMOTE_IPV6} dst ${LOCAL_IPV6}
+mode ipv6-tunnel src ${REMOTE_IPV6} dst ${LOCAL_IPV6} ${SGW_CFG_XPRM}
 
 #SA out rules
 sa out 7 cipher_algo aes-128-ctr \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv4-tunnel src ${LOCAL_IPV4} dst ${REMOTE_IPV4}
+mode ipv4-tunnel src ${LOCAL_IPV4} dst ${REMOTE_IPV4} ${SGW_CFG_XPRM}
 
 sa out 9 cipher_algo aes-128-ctr \
 cipher_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
 auth_algo sha1-hmac \
 auth_key de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef:de:ad:be:ef \
-mode ipv6-tunnel src ${LOCAL_IPV6} dst ${REMOTE_IPV6}
+mode ipv6-tunnel src ${LOCAL_IPV6} dst ${REMOTE_IPV6} ${SGW_CFG_XPRM}
 
 #Routing rules
 rt ipv4 dst ${REMOTE_IPV4}/32 port 0
diff --git a/examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
new file mode 100644
index 000000000..3af680533
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aesctr_sha1_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_aesctr_sha1_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
new file mode 100644
index 000000000..5bf1c0ae5
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aesgcm_cpu_crypto_defs.sh
@@ -0,0 +1,5 @@
+#! /bin/bash
+
+. ${DIR}/tun_aesgcm_defs.sh
+
+SGW_CFG_XPRM='type cpu-crypto'
diff --git a/examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh b/examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh
new file mode 100644
index 000000000..039b8095e
--- /dev/null
+++ b/examples/ipsec-secgw/test/tun_aesgcm_mb_cpu_crypto_defs.sh
@@ -0,0 +1,7 @@
+#! /bin/bash
+
+. ${DIR}/tun_aesgcm_defs.sh
+
+CRYPTO_DEV=${CRYPTO_DEV:-'--vdev="crypto_aesni_mb0"'}
+
+SGW_CFG_XPRM='type cpu-crypto'
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* [dpdk-dev] [PATCH v2 10/10] doc: update security cpu process description
  2019-10-07 16:28   ` [dpdk-dev] [PATCH v2 " Fan Zhang
                       ` (8 preceding siblings ...)
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 09/10] examples/ipsec-secgw: add security " Fan Zhang
@ 2019-10-07 16:28     ` Fan Zhang
  9 siblings, 0 replies; 84+ messages in thread
From: Fan Zhang @ 2019-10-07 16:28 UTC (permalink / raw)
  To: dev; +Cc: konstantin.ananyev, declan.doherty, akhil.goyal, Fan Zhang

This patch updates the programmer's guide and release notes for the
newly added security CPU crypto process description.

Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
---
 doc/guides/cryptodevs/aesni_gcm.rst    |   6 ++
 doc/guides/cryptodevs/aesni_mb.rst     |   7 +++
 doc/guides/prog_guide/rte_security.rst | 112 ++++++++++++++++++++++++++++++++-
 doc/guides/rel_notes/release_19_11.rst |   7 +++
 4 files changed, 131 insertions(+), 1 deletion(-)

diff --git a/doc/guides/cryptodevs/aesni_gcm.rst b/doc/guides/cryptodevs/aesni_gcm.rst
index 15002aba7..e1c4f9d24 100644
--- a/doc/guides/cryptodevs/aesni_gcm.rst
+++ b/doc/guides/cryptodevs/aesni_gcm.rst
@@ -9,6 +9,12 @@ The AES-NI GCM PMD (**librte_pmd_aesni_gcm**) provides poll mode crypto driver
 support for utilizing Intel multi buffer library (see AES-NI Multi-buffer PMD documentation
 to learn more about it, including installation).
 
+The AES-NI GCM PMD also supports rte_security. A security session can be
+created and then used with the ``rte_security_process_cpu_crypto_bulk``
+function call to process symmetric crypto synchronously with all algorithms
+specified below. In this mode the PMD supports scatter-gather buffers
+(``num`` in ``rte_security_vec`` can be greater than ``1``). Please refer to
+the ``rte_security`` programmer's guide for more detail.
+
 Features
 --------
 
diff --git a/doc/guides/cryptodevs/aesni_mb.rst b/doc/guides/cryptodevs/aesni_mb.rst
index 1eff2b073..1a3ddd850 100644
--- a/doc/guides/cryptodevs/aesni_mb.rst
+++ b/doc/guides/cryptodevs/aesni_mb.rst
@@ -12,6 +12,13 @@ support for utilizing Intel multi buffer library, see the white paper
 
 The AES-NI MB PMD has currently only been tested on Fedora 21 64-bit with gcc.
 
+The AES-NI MB PMD also supports rte_security. A security session can be
+created and then used with the ``rte_security_process_cpu_crypto_bulk``
+function call to process symmetric crypto synchronously with all algorithms
+specified below. However, it does not support scatter-gather buffers, so the
+``num`` value in ``rte_security_vec`` can only be ``1``. Please refer to the
+``rte_security`` programmer's guide for more detail.
+
 Features
 --------
 
diff --git a/doc/guides/prog_guide/rte_security.rst b/doc/guides/prog_guide/rte_security.rst
index 7d0734a37..39bcc2e69 100644
--- a/doc/guides/prog_guide/rte_security.rst
+++ b/doc/guides/prog_guide/rte_security.rst
@@ -296,6 +296,56 @@ Just like IPsec, in case of PDCP also header addition/deletion, cipher/
 de-cipher, integrity protection/verification is done based on the action
 type chosen.
 
+
+Synchronous CPU Crypto
+~~~~~~~~~~~~~~~~~~~~~~
+
+RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO:
+This action type allows a burst of symmetric crypto workloads using the same
+algorithm, key, and direction to be processed synchronously by CPU cycles.
+
+The buffers are handed to the crypto device for symmetric crypto
+processing. The device will encrypt or decrypt the buffers based on the key(s)
+and algorithm(s) specified and preprocessed in the security session. Unlike
+the inline or lookaside modes, when the function returns, the user can expect
+that each buffer has either been processed successfully or has the error
+number assigned to the corresponding index of the status array.
+
+E.g. in case of IPsec, the application will use CPU cycles to process both
+stack and crypto workload synchronously.
+
+.. code-block:: console
+
+         Egress Data Path
+                 |
+        +--------|--------+
+        |  egress IPsec   |
+        |        |        |
+        | +------V------+ |
+        | | SADB lookup | |
+        | +------|------+ |
+        | +------V------+ |
+        | |   Desc      | |
+        | +------|------+ |
+        +--------V--------+
+                 |
+        +--------V--------+
+        |    L2 Stack     |
+        +-----------------+
+        |                 |
+        |   Synchronous   |   <------ Using CPU instructions
+        |  Crypto Process |
+        |                 |
+        +--------V--------+
+        |  L2 Stack Post  |   <------ Add tunnel, ESP header etc.
+        +--------|--------+
+                 |
+        +--------|--------+
+        |       NIC       |
+        +--------|--------+
+                 V
+
+
 Device Features and Capabilities
 ---------------------------------
 
@@ -491,6 +541,7 @@ Security Session configuration structure is defined as ``rte_security_session_co
                 struct rte_security_ipsec_xform ipsec;
                 struct rte_security_macsec_xform macsec;
                 struct rte_security_pdcp_xform pdcp;
+                struct rte_security_cpu_crypto_xform cpu_crypto;
         };
         /**< Configuration parameters for security session */
         struct rte_crypto_sym_xform *crypto_xform;
@@ -515,9 +566,12 @@ Offload.
         RTE_SECURITY_ACTION_TYPE_INLINE_PROTOCOL,
         /**< All security protocol processing is performed inline during
          * transmission */
-        RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
+        RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
         /**< All security protocol processing including crypto is performed
          * on a lookaside accelerator */
+        RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
+        /**< Crypto processing for security protocol is performed by CPU
+         * synchronously */
     };
 
 The ``rte_security_session_protocol`` is defined as
@@ -587,6 +641,10 @@ PDCP related configuration parameters are defined in ``rte_security_pdcp_xform``
         uint32_t hfn_threshold;
     };
 
+For the CPU Crypto processing action, the application should attach the
+initialized ``crypto_xform`` to the security session configuration to specify
+the algorithm, key, direction, and other necessary fields required to perform
+the crypto operation.
+
 
 Security API
 ~~~~~~~~~~~~
@@ -650,3 +708,55 @@ it is only valid to have a single flow to map to that security session.
         +-------+            +--------+    +-----+
         |  Eth  | ->  ... -> |   ESP  | -> | END |
         +-------+            +--------+    +-----+
+
+
+Process bulk crypto workload using CPU instructions
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The inline and lookaside modes depend on external HW to complete the
+workload. Alternatively, the user can use rte_security to process symmetric
+crypto synchronously with CPU instructions.
+
+When creating the security session the user needs to fill the
+``rte_security_session_conf`` parameter with the ``action_type`` field set to
+``RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO``, and point ``crypto_xform`` to a
+properly initialized cryptodev xform. The user then passes the
+``rte_security_session_conf`` instance to ``rte_security_session_create()``
+along with the security context pointer belonging to a certain SW crypto
+device. The crypto device may or may not support this action type or the
+algorithm / key sizes specified in the ``crypto_xform``; if it does, the
+function will return the created security session.
+
+The user then can use this session to process the crypto workload synchronously.
+Instead of using mbuf ``next`` pointers, synchronous CPU crypto processing uses
+a special structure ``rte_security_vec`` to describe scatter-gather buffers.
+
+.. code-block:: c
+
+    struct rte_security_vec {
+        struct iovec *vec;
+        uint32_t num;
+    };
+
+The structure ``rte_security_vec`` is used to store scatter-gather buffer
+pointers, where ``vec`` points to an array of buffer segments and ``num``
+indicates the number of segments.
+
+Please note that not all crypto devices support scatter-gather buffer
+processing; please check the ``cryptodev`` guide for more details.
+
+The API of the synchronous CPU crypto process is
+
+.. code-block:: c
+
+    int
+    rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
+            struct rte_security_session *sess,
+            struct rte_security_vec buf[], void *iv[], void *aad[],
+            void *digest[], int status[], uint32_t num);
+
+This function will process ``num`` ``rte_security_vec`` buffers using the
+content stored in the ``iv`` and ``aad`` arrays. The API only supports
+in-place operation, so ``buf`` will be overwritten with the encrypted or
+decrypted values when successfully processed. Otherwise a negative value will
+be returned and the error number will be set at the corresponding index of
+the status array.
diff --git a/doc/guides/rel_notes/release_19_11.rst b/doc/guides/rel_notes/release_19_11.rst
index f971b3f77..3d89ab643 100644
--- a/doc/guides/rel_notes/release_19_11.rst
+++ b/doc/guides/rel_notes/release_19_11.rst
@@ -72,6 +72,13 @@ New Features
   Added a symmetric crypto PMD for Marvell NITROX V security processor.
   See the :doc:`../cryptodevs/nitrox` guide for more details on this new
 
+* **Added synchronous Crypto burst API with CPU for RTE_SECURITY.**
+
+  A new API ``rte_security_process_cpu_crypto_bulk`` is introduced in the
+  security library to process crypto workloads in bulk using CPU
+  instructions. The AESNI_MB and AESNI_GCM PMDs, as well as the unit tests
+  and the ipsec-secgw sample application, are updated to support this
+  feature.
+
 
 Removed Items
 -------------
-- 
2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH v2 01/10] security: introduce CPU Crypto action type and API
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 01/10] security: introduce CPU Crypto action type and API Fan Zhang
@ 2019-10-08 13:42       ` Ananyev, Konstantin
  0 siblings, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-10-08 13:42 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: Doherty, Declan, akhil.goyal

Hi Fan,

> 
> This patch introduces the new RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action
> type to the security library. The type represents performing crypto
> operations with CPU cycles. The patch also includes a new API to process
> crypto operations in bulk and the function pointers for PMDs.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>  lib/librte_security/rte_security.c           | 11 ++++++
>  lib/librte_security/rte_security.h           | 53 +++++++++++++++++++++++++++-
>  lib/librte_security/rte_security_driver.h    | 22 ++++++++++++
>  lib/librte_security/rte_security_version.map |  1 +
>  4 files changed, 86 insertions(+), 1 deletion(-)
> 
> diff --git a/lib/librte_security/rte_security.c b/lib/librte_security/rte_security.c
> index bc81ce15d..cdd1ee6af 100644
> --- a/lib/librte_security/rte_security.c
> +++ b/lib/librte_security/rte_security.c
> @@ -141,3 +141,14 @@ rte_security_capability_get(struct rte_security_ctx *instance,
> 
>  	return NULL;
>  }
> +
> +int
> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> +		struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num)
> +{
> +	RTE_FUNC_PTR_OR_ERR_RET(*instance->ops->process_cpu_crypto_bulk, -1);
> +	return instance->ops->process_cpu_crypto_bulk(sess, buf, iv,
> +			aad, digest, status, num);
> +}
> diff --git a/lib/librte_security/rte_security.h b/lib/librte_security/rte_security.h
> index aaafdfcd7..0caf5d697 100644
> --- a/lib/librte_security/rte_security.h
> +++ b/lib/librte_security/rte_security.h
> @@ -18,6 +18,7 @@ extern "C" {
>  #endif
> 
>  #include <sys/types.h>
> +#include <sys/uio.h>
> 
>  #include <netinet/in.h>
>  #include <netinet/ip.h>
> @@ -289,6 +290,20 @@ struct rte_security_pdcp_xform {
>  	uint32_t hfn_ovrd;
>  };
> 
> +struct rte_security_cpu_crypto_xform {
> +	/** For cipher/authentication crypto operations the authentication may
> +	 * cover more content than the cipher. E.g., for IPsec ESP encryption
> +	 * with AES-CBC and SHA1-HMAC, the encryption happens after the ESP
> +	 * header but the whole packet (apart from the MAC header) is
> +	 * authenticated. The cipher_offset field is used to derive the cipher
> +	 * data pointer from the buffer to be processed.
> +	 *
> +	 * NOTE this parameter shall be ignored by AEAD algorithms, since they
> +	 * use the same offset for cipher and authentication.
> +	 */
> +	int32_t cipher_offset;
> +};
> +
>  /**
>   * Security session action type.
>   */
> @@ -303,10 +318,14 @@ enum rte_security_session_action_type {
>  	/**< All security protocol processing is performed inline during
>  	 * transmission
>  	 */
> -	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL
> +	RTE_SECURITY_ACTION_TYPE_LOOKASIDE_PROTOCOL,
>  	/**< All security protocol processing including crypto is performed
>  	 * on a lookaside accelerator
>  	 */
> +	RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO
> +	/**< Crypto processing for security protocol is processed by CPU
> +	 * synchronously
> +	 */
>  };
> 
>  /** Security session protocol definition */
> @@ -332,6 +351,7 @@ struct rte_security_session_conf {
>  		struct rte_security_ipsec_xform ipsec;
>  		struct rte_security_macsec_xform macsec;
>  		struct rte_security_pdcp_xform pdcp;
> +		struct rte_security_cpu_crypto_xform cpucrypto;
>  	};
>  	/**< Configuration parameters for security session */
>  	struct rte_crypto_sym_xform *crypto_xform;
> @@ -665,6 +685,37 @@ const struct rte_security_capability *
>  rte_security_capability_get(struct rte_security_ctx *instance,
>  			    struct rte_security_capability_idx *idx);
> 
> +/**
> + * Security vector structure, contains pointer to vector array and the length
> + * of the array
> + */
> +struct rte_security_vec {
> +	struct iovec *vec;
> +	uint32_t num;
> +};
> +
> +/**
> + * Processing bulk crypto workload with CPU
> + *
> + * @param	instance	security instance.
> + * @param	sess		security session
> + * @param	buf		array of buffer SGL vectors
> + * @param	iv		array of IV pointers
> + * @param	aad		array of AAD pointers
> + * @param	digest		array of digest pointers
> + * @param	status		array of status for the function to return
> + * @param	num		number of elements in each array
> + * @return
> + *  - On success, 0
> + *  - On any failure, -1

I think it is much better to return the number of successfully processed
entries (or the number of failed entries - whatever is your preference).
Then the user can easily determine whether he needs to walk through status
(and if yes, till what point) or not at all.
Sorry if I wasn't clear in my previous comment.

> + */
> +__rte_experimental
> +int
> +rte_security_process_cpu_crypto_bulk(struct rte_security_ctx *instance,
> +		struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num);
> +
>  #ifdef __cplusplus
>  }
>  #endif
> diff --git a/lib/librte_security/rte_security_driver.h b/lib/librte_security/rte_security_driver.h
> index 1b561f852..fe940fffa 100644
> --- a/lib/librte_security/rte_security_driver.h
> +++ b/lib/librte_security/rte_security_driver.h
> @@ -132,6 +132,26 @@ typedef int (*security_get_userdata_t)(void *device,
>  typedef const struct rte_security_capability *(*security_capabilities_get_t)(
>  		void *device);
> 
> +/**
> + * Process security operations in bulk using CPU accelerated method.
> + *
> + * @param	sess		Security session structure.
> + * @param	buf		Buffer to the vectors to be processed.
> + * @param	iv		IV pointers.
> + * @param	aad		AAD pointers.
> + * @param	digest		Digest pointers.
> + * @param	status		Array of status value.
> + * @param	num		Number of elements in each array.
> + * @return
> + *  - On success, 0
> + *  - On any failure, -1
> + */
> +
> +typedef int (*security_process_cpu_crypto_bulk_t)(
> +		struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num);
> +
>  /** Security operations function pointer table */
>  struct rte_security_ops {
>  	security_session_create_t session_create;
> @@ -150,6 +170,8 @@ struct rte_security_ops {
>  	/**< Get userdata associated with session which processed the packet. */
>  	security_capabilities_get_t capabilities_get;
>  	/**< Get security capabilities. */
> +	security_process_cpu_crypto_bulk_t process_cpu_crypto_bulk;
> +	/**< Process data in bulk. */
>  };
> 
>  #ifdef __cplusplus
> diff --git a/lib/librte_security/rte_security_version.map b/lib/librte_security/rte_security_version.map
> index 53267bf3c..2132e7a00 100644
> --- a/lib/librte_security/rte_security_version.map
> +++ b/lib/librte_security/rte_security_version.map
> @@ -18,4 +18,5 @@ EXPERIMENTAL {
>  	rte_security_get_userdata;
>  	rte_security_session_stats_get;
>  	rte_security_session_update;
> +	rte_security_process_cpu_crypto_bulk;
>  };
> --
> 2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH v2 02/10] crypto/aesni_gcm: add rte_security handler
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 02/10] crypto/aesni_gcm: add rte_security handler Fan Zhang
@ 2019-10-08 13:44       ` Ananyev, Konstantin
  0 siblings, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-10-08 13:44 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: Doherty, Declan, akhil.goyal



> 
> This patch adds rte_security support to the AESNI-GCM PMD. The PMD now
> initializes a security context instance, creates/deletes PMD-specific
> security sessions, and processes crypto workloads in synchronous mode,
> with scatter-gather list buffers supported.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd.c         | 97 +++++++++++++++++++++++-
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c     | 95 +++++++++++++++++++++++
>  drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h | 23 ++++++
>  drivers/crypto/aesni_gcm/meson.build             |  2 +-
>  4 files changed, 215 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> index 1006a5c4d..2e91bf149 100644
> --- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> +++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd.c
> @@ -6,6 +6,7 @@
>  #include <rte_hexdump.h>
>  #include <rte_cryptodev.h>
>  #include <rte_cryptodev_pmd.h>
> +#include <rte_security_driver.h>
>  #include <rte_bus_vdev.h>
>  #include <rte_malloc.h>
>  #include <rte_cpuflags.h>
> @@ -174,6 +175,56 @@ aesni_gcm_get_session(struct aesni_gcm_qp *qp, struct rte_crypto_op *op)
>  	return sess;
>  }
> 
> +static __rte_always_inline int
> +process_gcm_security_sgl_buf(struct aesni_gcm_security_session *sess,
> +		struct rte_security_vec *buf, uint8_t *iv,
> +		uint8_t *aad, uint8_t *digest)
> +{
> +	struct aesni_gcm_session *session = &sess->sess;
> +	uint8_t *tag;
> +	uint32_t i;
> +
> +	sess->init(&session->gdata_key, &sess->gdata_ctx, iv, aad,
> +			(uint64_t)session->aad_length);
> +
> +	for (i = 0; i < buf->num; i++) {
> +		struct iovec *vec = &buf->vec[i];
> +
> +		sess->update(&session->gdata_key, &sess->gdata_ctx,
> +				vec->iov_base, vec->iov_base, vec->iov_len);
> +	}
> +
> +	switch (session->op) {
> +	case AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION:
> +		if (session->req_digest_length != session->gen_digest_length)
> +			tag = sess->temp_digest;
> +		else
> +			tag = digest;
> +
> +		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
> +				session->gen_digest_length);
> +
> +		if (session->req_digest_length != session->gen_digest_length)
> +			memcpy(digest, sess->temp_digest,
> +					session->req_digest_length);
> +		break;
> +
> +	case AESNI_GCM_OP_AUTHENTICATED_DECRYPTION:
> +		tag = sess->temp_digest;
> +
> +		sess->finalize(&session->gdata_key, &sess->gdata_ctx, tag,
> +				session->gen_digest_length);
> +
> +		if (memcmp(tag, digest,	session->req_digest_length) != 0)
> +			return -1;
> +		break;
> +	default:
> +		return -1;
> +	}
> +
> +	return 0;
> +}
> +
>  /**
>   * Process a crypto operation, calling
>   * the GCM API from the multi buffer library.
> @@ -488,8 +539,10 @@ aesni_gcm_create(const char *name,
>  {
>  	struct rte_cryptodev *dev;
>  	struct aesni_gcm_private *internals;
> +	struct rte_security_ctx *sec_ctx;
>  	enum aesni_gcm_vector_mode vector_mode;
>  	MB_MGR *mb_mgr;
> +	char sec_name[RTE_DEV_NAME_MAX_LEN];
> 
>  	/* Check CPU for support for AES instruction set */
>  	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
> @@ -524,7 +577,8 @@ aesni_gcm_create(const char *name,
>  			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
>  			RTE_CRYPTODEV_FF_CPU_AESNI |
>  			RTE_CRYPTODEV_FF_OOP_SGL_IN_LB_OUT |
> -			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
> +			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
> +			RTE_CRYPTODEV_FF_SECURITY;
> 
>  	mb_mgr = alloc_mb_mgr(0);
>  	if (mb_mgr == NULL)
> @@ -587,6 +641,21 @@ aesni_gcm_create(const char *name,
> 
>  	internals->max_nb_queue_pairs = init_params->max_nb_queue_pairs;
> 
> +	/* setup security operations */
> +	snprintf(sec_name, sizeof(sec_name) - 1, "aes_gcm_sec_%u",
> +			dev->driver_id);
> +	sec_ctx = rte_zmalloc_socket(sec_name,
> +			sizeof(struct rte_security_ctx),
> +			RTE_CACHE_LINE_SIZE, init_params->socket_id);
> +	if (sec_ctx == NULL) {
> +		AESNI_GCM_LOG(ERR, "memory allocation failed\n");
> +		goto error_exit;
> +	}
> +
> +	sec_ctx->device = (void *)dev;
> +	sec_ctx->ops = rte_aesni_gcm_pmd_security_ops;
> +	dev->security_ctx = sec_ctx;
> +
>  #if IMB_VERSION_NUM >= IMB_VERSION(0, 50, 0)
>  	AESNI_GCM_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
>  			imb_get_version_str());
> @@ -641,6 +710,8 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
>  	if (cryptodev == NULL)
>  		return -ENODEV;
> 
> +	rte_free(cryptodev->security_ctx);
> +
>  	internals = cryptodev->data->dev_private;
> 
>  	free_mb_mgr(internals->mb_mgr);
> @@ -648,6 +719,30 @@ aesni_gcm_remove(struct rte_vdev_device *vdev)
>  	return rte_cryptodev_pmd_destroy(cryptodev);
>  }
> 
> +int
> +aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num)
> +{
> +	struct aesni_gcm_security_session *session =
> +			get_sec_session_private_data(sess);
> +	uint32_t i;
> +	int errcnt = 0;
> +
> +	if (unlikely(!session))
> +		return -num;

You return negative status (error), but don't set each status[] value.


> +
> +	for (i = 0; i < num; i++) {
> +		status[i] = process_gcm_security_sgl_buf(session, &buf[i],
> +				(uint8_t *)iv[i], (uint8_t *)aad[i],
> +				(uint8_t *)digest[i]);
> +		if (unlikely(status[i]))
> +			errcnt -= 1;
> +	}
> +
> +	return errcnt;
> +}
> +
>  static struct rte_vdev_driver aesni_gcm_pmd_drv = {
>  	.probe = aesni_gcm_probe,
>  	.remove = aesni_gcm_remove
> diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
> index 2f66c7c58..cc71dbd60 100644
> --- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
> +++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_ops.c
> @@ -7,6 +7,7 @@
>  #include <rte_common.h>
>  #include <rte_malloc.h>
>  #include <rte_cryptodev_pmd.h>
> +#include <rte_security_driver.h>
> 
>  #include "aesni_gcm_pmd_private.h"
> 
> @@ -316,6 +317,85 @@ aesni_gcm_pmd_sym_session_clear(struct rte_cryptodev *dev,
>  	}
>  }
> 
> +static int
> +aesni_gcm_security_session_create(void *dev,
> +		struct rte_security_session_conf *conf,
> +		struct rte_security_session *sess,
> +		struct rte_mempool *mempool)
> +{
> +	struct rte_cryptodev *cdev = dev;
> +	struct aesni_gcm_private *internals = cdev->data->dev_private;
> +	struct aesni_gcm_security_session *sess_priv;
> +	int ret;
> +
> +	if (!conf->crypto_xform) {
> +		AESNI_GCM_LOG(ERR, "Invalid security session conf");
> +		return -EINVAL;
> +	}
> +
> +	if (conf->crypto_xform->type == RTE_CRYPTO_SYM_XFORM_AUTH) {
> +		AESNI_GCM_LOG(ERR, "GMAC is not supported in security session");
> +		return -EINVAL;
> +	}
> +
> +
> +	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
> +		AESNI_GCM_LOG(ERR,
> +				"Couldn't get object from session mempool");
> +		return -ENOMEM;
> +	}
> +
> +	ret = aesni_gcm_set_session_parameters(internals->ops,
> +				&sess_priv->sess, conf->crypto_xform);
> +	if (ret != 0) {
> +		AESNI_GCM_LOG(ERR, "Failed configure session parameters");
> +
> +		/* Return session to mempool */
> +		rte_mempool_put(mempool, (void *)sess_priv);
> +		return ret;
> +	}
> +
> +	sess_priv->pre = internals->ops[sess_priv->sess.key].pre;
> +	sess_priv->init = internals->ops[sess_priv->sess.key].init;
> +	if (sess_priv->sess.op == AESNI_GCM_OP_AUTHENTICATED_ENCRYPTION) {
> +		sess_priv->update =
> +			internals->ops[sess_priv->sess.key].update_enc;
> +		sess_priv->finalize =
> +			internals->ops[sess_priv->sess.key].finalize_enc;
> +	} else {
> +		sess_priv->update =
> +			internals->ops[sess_priv->sess.key].update_dec;
> +		sess_priv->finalize =
> +			internals->ops[sess_priv->sess.key].finalize_dec;
> +	}
> +
> +	sess->sess_private_data = sess_priv;
> +
> +	return 0;
> +}
> +
> +static int
> +aesni_gcm_security_session_destroy(void *dev __rte_unused,
> +		struct rte_security_session *sess)
> +{
> +	void *sess_priv = get_sec_session_private_data(sess);
> +
> +	if (sess_priv) {
> +		struct rte_mempool *sess_mp = rte_mempool_from_obj(sess_priv);
> +
> +		memset(sess, 0, sizeof(struct aesni_gcm_security_session));
> +		set_sec_session_private_data(sess, NULL);
> +		rte_mempool_put(sess_mp, sess_priv);
> +	}
> +	return 0;
> +}
> +
> +static unsigned int
> +aesni_gcm_sec_session_get_size(__rte_unused void *device)
> +{
> +	return sizeof(struct aesni_gcm_security_session);
> +}
> +
>  struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
>  		.dev_configure		= aesni_gcm_pmd_config,
>  		.dev_start		= aesni_gcm_pmd_start,
> @@ -336,4 +416,19 @@ struct rte_cryptodev_ops aesni_gcm_pmd_ops = {
>  		.sym_session_clear	= aesni_gcm_pmd_sym_session_clear
>  };
> 
> +static struct rte_security_ops aesni_gcm_security_ops = {
> +		.session_create = aesni_gcm_security_session_create,
> +		.session_get_size = aesni_gcm_sec_session_get_size,
> +		.session_update = NULL,
> +		.session_stats_get = NULL,
> +		.session_destroy = aesni_gcm_security_session_destroy,
> +		.set_pkt_metadata = NULL,
> +		.capabilities_get = NULL,
> +		.process_cpu_crypto_bulk =
> +				aesni_gcm_sec_crypto_process_bulk,
> +};
> +
>  struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops = &aesni_gcm_pmd_ops;
> +
> +struct rte_security_ops *rte_aesni_gcm_pmd_security_ops =
> +		&aesni_gcm_security_ops;
> diff --git a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
> index 56b29e013..ed3f6eb2e 100644
> --- a/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
> +++ b/drivers/crypto/aesni_gcm/aesni_gcm_pmd_private.h
> @@ -114,5 +114,28 @@ aesni_gcm_set_session_parameters(const struct aesni_gcm_ops *ops,
>   * Device specific operations function pointer structure */
>  extern struct rte_cryptodev_ops *rte_aesni_gcm_pmd_ops;
> 
> +/**
> + * Security session structure.
> + */
> +struct aesni_gcm_security_session {
> +	/** Temp digest for decryption */
> +	uint8_t temp_digest[DIGEST_LENGTH_MAX];
> +	/** GCM operations */
> +	aesni_gcm_pre_t pre;
> +	aesni_gcm_init_t init;
> +	aesni_gcm_update_t update;
> +	aesni_gcm_finalize_t finalize;
> +	/** AESNI-GCM session */
> +	struct aesni_gcm_session sess;
> +	/** AESNI-GCM context */
> +	struct gcm_context_data gdata_ctx;
> +};
> +
> +extern int
> +aesni_gcm_sec_crypto_process_bulk(struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num);
> +
> +extern struct rte_security_ops *rte_aesni_gcm_pmd_security_ops;
> 
>  #endif /* _RTE_AESNI_GCM_PMD_PRIVATE_H_ */
> diff --git a/drivers/crypto/aesni_gcm/meson.build b/drivers/crypto/aesni_gcm/meson.build
> index 3a6e332dc..f6e160bb3 100644
> --- a/drivers/crypto/aesni_gcm/meson.build
> +++ b/drivers/crypto/aesni_gcm/meson.build
> @@ -22,4 +22,4 @@ endif
> 
>  allow_experimental_apis = true
>  sources = files('aesni_gcm_pmd.c', 'aesni_gcm_pmd_ops.c')
> -deps += ['bus_vdev']
> +deps += ['bus_vdev', 'security']
> --
> 2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH v2 05/10] crypto/aesni_mb: add rte_security handler
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 05/10] crypto/aesni_mb: add rte_security handler Fan Zhang
@ 2019-10-08 16:23       ` Ananyev, Konstantin
  2019-10-09  8:29       ` Ananyev, Konstantin
  1 sibling, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-10-08 16:23 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: Doherty, Declan, akhil.goyal


Hi Fan,
 
> This patch adds rte_security support to the AESNI-MB PMD. The PMD now
> initializes a security context instance, creates/deletes PMD-specific security
> sessions, and processes crypto workloads in synchronous mode.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>  drivers/crypto/aesni_mb/meson.build                |   2 +-
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c         | 368 +++++++++++++++++++--
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c     |  92 +++++-
>  drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h |  21 +-
>  4 files changed, 453 insertions(+), 30 deletions(-)
> 
> diff --git a/drivers/crypto/aesni_mb/meson.build b/drivers/crypto/aesni_mb/meson.build
> index 3e1687416..e7b585168 100644
> --- a/drivers/crypto/aesni_mb/meson.build
> +++ b/drivers/crypto/aesni_mb/meson.build
> @@ -23,4 +23,4 @@ endif
> 
>  sources = files('rte_aesni_mb_pmd.c', 'rte_aesni_mb_pmd_ops.c')
>  allow_experimental_apis = true
> -deps += ['bus_vdev']
> +deps += ['bus_vdev', 'security']
> diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
> index ce1144b95..a4cd518b7 100644
> --- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
> +++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd.c
> @@ -8,6 +8,8 @@
>  #include <rte_hexdump.h>
>  #include <rte_cryptodev.h>
>  #include <rte_cryptodev_pmd.h>
> +#include <rte_security.h>
> +#include <rte_security_driver.h>
>  #include <rte_bus_vdev.h>
>  #include <rte_malloc.h>
>  #include <rte_cpuflags.h>
> @@ -19,6 +21,9 @@
>  #define HMAC_MAX_BLOCK_SIZE 128
>  static uint8_t cryptodev_driver_id;
> 
> +static enum aesni_mb_vector_mode vector_mode;
> +/**< CPU vector instruction set mode */
> +
>  typedef void (*hash_one_block_t)(const void *data, void *digest);
>  typedef void (*aes_keyexp_t)(const void *key, void *enc_exp_keys, void *dec_exp_keys);
> 
> @@ -808,6 +813,164 @@ auth_start_offset(struct rte_crypto_op *op, struct aesni_mb_session *session,
>  			(UINT64_MAX - u_src + u_dst + 1);
>  }
> 
> +union sec_userdata_field {
> +	int status;
> +	struct {
> +		uint16_t is_gen_digest;
> +		uint16_t digest_len;
> +	};
> +};
> +
> +struct sec_udata_digest_field {
> +	uint32_t is_digest_gen;
> +	uint32_t digest_len;
> +};
> +
> +static inline int
> +set_mb_job_params_sec(JOB_AES_HMAC *job, struct aesni_mb_sec_session *sec_sess,
> +		void *buf, uint32_t buf_len, void *iv, void *aad, void *digest,
> +		int *status, uint8_t *digest_idx)
> +{
> +	struct aesni_mb_session *session = &sec_sess->sess;
> +	uint32_t cipher_offset = sec_sess->cipher_offset;
> +	union sec_userdata_field udata;
> +
> +	if (unlikely(cipher_offset > buf_len))
> +		return -EINVAL;
> +
> +	/* Set crypto operation */
> +	job->chain_order = session->chain_order;
> +
> +	/* Set cipher parameters */
> +	job->cipher_direction = session->cipher.direction;
> +	job->cipher_mode = session->cipher.mode;
> +
> +	job->aes_key_len_in_bytes = session->cipher.key_length_in_bytes;
> +
> +	/* Set authentication parameters */
> +	job->hash_alg = session->auth.algo;
> +	job->iv = iv;
> +
> +	switch (job->hash_alg) {
> +	case AES_XCBC:
> +		job->u.XCBC._k1_expanded = session->auth.xcbc.k1_expanded;
> +		job->u.XCBC._k2 = session->auth.xcbc.k2;
> +		job->u.XCBC._k3 = session->auth.xcbc.k3;
> +
> +		job->aes_enc_key_expanded =
> +				session->cipher.expanded_aes_keys.encode;
> +		job->aes_dec_key_expanded =
> +				session->cipher.expanded_aes_keys.decode;
> +		break;
> +
> +	case AES_CCM:
> +		job->u.CCM.aad = (uint8_t *)aad + 18;
> +		job->u.CCM.aad_len_in_bytes = session->aead.aad_len;
> +		job->aes_enc_key_expanded =
> +				session->cipher.expanded_aes_keys.encode;
> +		job->aes_dec_key_expanded =
> +				session->cipher.expanded_aes_keys.decode;
> +		job->iv++;
> +		break;
> +
> +	case AES_CMAC:
> +		job->u.CMAC._key_expanded = session->auth.cmac.expkey;
> +		job->u.CMAC._skey1 = session->auth.cmac.skey1;
> +		job->u.CMAC._skey2 = session->auth.cmac.skey2;
> +		job->aes_enc_key_expanded =
> +				session->cipher.expanded_aes_keys.encode;
> +		job->aes_dec_key_expanded =
> +				session->cipher.expanded_aes_keys.decode;
> +		break;
> +
> +	case AES_GMAC:
> +		if (session->cipher.mode == GCM) {
> +			job->u.GCM.aad = aad;
> +			job->u.GCM.aad_len_in_bytes = session->aead.aad_len;
> +		} else {
> +			/* For GMAC */
> +			job->u.GCM.aad = aad;
> +			job->u.GCM.aad_len_in_bytes = buf_len;
> +			job->cipher_mode = GCM;
> +		}
> +		job->aes_enc_key_expanded = &session->cipher.gcm_key;
> +		job->aes_dec_key_expanded = &session->cipher.gcm_key;
> +		break;
> +
> +	default:
> +		job->u.HMAC._hashed_auth_key_xor_ipad =
> +				session->auth.pads.inner;
> +		job->u.HMAC._hashed_auth_key_xor_opad =
> +				session->auth.pads.outer;

Same question as for v1:
This seems like too many branches on the data path.
We'll have only one job type (alg) per session.
Can we have a prefilled job struct template with all common fields already set up,
and then at process() just copy it over and update the few fields that have to differ
(like msg_len_to_cipher_in_bytes)?
Even if the whole job struct (184B) is too big to copy as a whole, we can at least
copy the contents of u (24B) in one go, can't we?
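The template idea can be sketched with stand-in types. The struct and field names below are illustrative only, not the real intel-ipsec-mb `JOB_AES_HMAC` layout, and `job_from_template()` is a hypothetical helper:

```c
#include <stdint.h>
#include <string.h>

/* Stand-in for the multi-buffer job struct; fields are illustrative. */
struct mb_job {
	int chain_order;
	int cipher_mode;
	int hash_alg;
	uint64_t aes_key_len_in_bytes;
	const void *iv;
	const void *src;
	void *dst;
	uint64_t msg_len_to_cipher_in_bytes;
};

/* Hypothetical security session carrying a template that was fully
 * populated once, at session-create time. */
struct sec_session {
	struct mb_job tmpl;
};

/* Data path: one struct copy instead of a per-algorithm switch, then
 * patch only the per-packet fields. */
static void
job_from_template(struct mb_job *job, const struct sec_session *ss,
		void *buf, uint64_t len, const void *iv)
{
	*job = ss->tmpl;
	job->src = buf;
	job->dst = buf;
	job->iv = iv;
	job->msg_len_to_cipher_in_bytes = len;
}
```

All algorithm-dependent decisions happen once at session creation; the per-packet cost collapses to a memcpy-sized copy plus a handful of stores.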


> +
> +		if (job->cipher_mode == DES3) {
> +			job->aes_enc_key_expanded =
> +				session->cipher.exp_3des_keys.ks_ptr;
> +			job->aes_dec_key_expanded =
> +				session->cipher.exp_3des_keys.ks_ptr;
> +		} else {
> +			job->aes_enc_key_expanded =
> +				session->cipher.expanded_aes_keys.encode;
> +			job->aes_dec_key_expanded =
> +				session->cipher.expanded_aes_keys.decode;
> +		}
> +	}
> +
> +	/* Set digest output location */
> +	if (job->hash_alg != NULL_HASH &&
> +			session->auth.operation == RTE_CRYPTO_AUTH_OP_VERIFY) {
> +		job->auth_tag_output = sec_sess->temp_digests[*digest_idx];
> +		*digest_idx = (*digest_idx + 1) % MAX_JOBS;
> +
> +		udata.is_gen_digest = 0;
> +		udata.digest_len = session->auth.req_digest_len;
> +	} else {
> +		udata.is_gen_digest = 1;
> +		udata.digest_len = session->auth.req_digest_len;
> +
> +		if (session->auth.req_digest_len !=
> +				session->auth.gen_digest_len) {
> +			job->auth_tag_output =
> +					sec_sess->temp_digests[*digest_idx];
> +			*digest_idx = (*digest_idx + 1) % MAX_JOBS;
> +		} else
> +			job->auth_tag_output = digest;
> +	}
> +
> +	/* A bit of hack here, since job structure only supports
> +	 * 2 user data fields and we need 4 params to be passed
> +	 * (status, direction, digest for verify, and length of
> +	 * digest), we set the status value as digest length +
> +	 * direction here temporarily to avoid creating longer
> +	 * buffer to store all 4 params.
> +	 */
> +	*status = udata.status;
> +
> +	/*
> +	 * Multi-buffer library current only support returning a truncated
> +	 * digest length as specified in the relevant IPsec RFCs
> +	 */
> +
> +	/* Set digest length */
> +	job->auth_tag_output_len_in_bytes = session->auth.gen_digest_len;
> +
> +	/* Set IV parameters */
> +	job->iv_len_in_bytes = session->iv.length;
> +
> +	/* Data Parameters */
> +	job->src = buf;
> +	job->dst = (uint8_t *)buf + cipher_offset;
> +	job->cipher_start_src_offset_in_bytes = cipher_offset;
> +	job->msg_len_to_cipher_in_bytes = buf_len - cipher_offset;
> +	job->hash_start_src_offset_in_bytes = 0;
> +	job->msg_len_to_hash_in_bytes = buf_len;
> +
> +	job->user_data = (void *)status;
> +	job->user_data2 = digest;
> +
> +	return 0;
> +}
> +
>  /**
>   * Process a crypto operation and complete a JOB_AES_HMAC job structure for
>   * submission to the multi buffer library for processing.
> @@ -1100,6 +1263,35 @@ post_process_mb_job(struct aesni_mb_qp *qp, JOB_AES_HMAC *job)
>  	return op;
>  }
> 
> +static inline void
> +post_process_mb_sec_job(JOB_AES_HMAC *job)
> +{
> +	void *user_digest = job->user_data2;
> +	int *status = job->user_data;
> +
> +	switch (job->status) {
> +	case STS_COMPLETED:
> +		if (user_digest) {
> +			union sec_userdata_field udata;
> +
> +			udata.status = *status;
> +			if (udata.is_gen_digest) {
> +				*status = RTE_CRYPTO_OP_STATUS_SUCCESS;
> +				memcpy(user_digest, job->auth_tag_output,
> +						udata.digest_len);
> +			} else {
> +				*status = (memcmp(job->auth_tag_output,
> +					user_digest, udata.digest_len) != 0) ?
> +						-1 : 0;
> +			}
> +		} else
> +			*status = RTE_CRYPTO_OP_STATUS_SUCCESS;

Same question as for v1:
how about multiple process() functions instead of branches on the data path?
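One way to realize this suggestion, sketched here with hypothetical names and stand-in types, is to pick a post-process callback once at session-create time and store it in the session, so the generate-vs-verify branch disappears from the per-packet path:

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-session post-process callback, chosen once at
 * session creation instead of branching on is_gen_digest per packet. */
typedef void (*post_process_fn)(int *status, void *user_digest,
		const uint8_t *tag, size_t tag_len);

/* Encrypt/generate path: copy the computed tag out to the user. */
static void
post_gen_digest(int *status, void *user_digest, const uint8_t *tag,
		size_t tag_len)
{
	memcpy(user_digest, tag, tag_len);
	*status = 0;
}

/* Decrypt/verify path: compare the computed tag with the supplied one. */
static void
post_verify_digest(int *status, void *user_digest, const uint8_t *tag,
		size_t tag_len)
{
	*status = (memcmp(tag, user_digest, tag_len) != 0) ? -1 : 0;
}

/* Stand-in session: the callback is selected from auth.operation once. */
struct sec_session {
	post_process_fn post;
	size_t tag_len;
};
```

The completion handler then just calls `sess->post(...)` unconditionally.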

> +		break;
> +	default:
> +		*status = RTE_CRYPTO_OP_STATUS_ERROR;
> +	}
> +}
> +
>  /**
>   * Process a completed JOB_AES_HMAC job and keep processing jobs until
>   * get_completed_job return NULL
> @@ -1136,6 +1328,32 @@ handle_completed_jobs(struct aesni_mb_qp *qp, JOB_AES_HMAC *job,
>  	return processed_jobs;
>  }
> 
> +static inline uint32_t
> +handle_completed_sec_jobs(JOB_AES_HMAC *job, MB_MGR *mb_mgr)
> +{
> +	uint32_t processed = 0;
> +
> +	while (job != NULL) {
> +		post_process_mb_sec_job(job);
> +		job = IMB_GET_COMPLETED_JOB(mb_mgr);
> +		processed++;
> +	}
> +
> +	return processed;
> +}
> +
> +static inline uint32_t
> +flush_mb_sec_mgr(MB_MGR *mb_mgr)
> +{
> +	JOB_AES_HMAC *job = IMB_FLUSH_JOB(mb_mgr);
> +	uint32_t processed = 0;
> +
> +	if (job)
> +		processed = handle_completed_sec_jobs(job, mb_mgr);
> +
> +	return processed;
> +}
> +
>  static inline uint16_t
>  flush_mb_mgr(struct aesni_mb_qp *qp, struct rte_crypto_op **ops,
>  		uint16_t nb_ops)
> @@ -1239,6 +1457,105 @@ aesni_mb_pmd_dequeue_burst(void *queue_pair, struct rte_crypto_op **ops,
>  	return processed_jobs;
>  }
> 
> +static MB_MGR *
> +alloc_init_mb_mgr(void)
> +{
> +	MB_MGR *mb_mgr = alloc_mb_mgr(0);
> +	if (mb_mgr == NULL)
> +		return NULL;
> +
> +	switch (vector_mode) {
> +	case RTE_AESNI_MB_SSE:
> +		init_mb_mgr_sse(mb_mgr);
> +		break;
> +	case RTE_AESNI_MB_AVX:
> +		init_mb_mgr_avx(mb_mgr);
> +		break;
> +	case RTE_AESNI_MB_AVX2:
> +		init_mb_mgr_avx2(mb_mgr);
> +		break;
> +	case RTE_AESNI_MB_AVX512:
> +		init_mb_mgr_avx512(mb_mgr);
> +		break;
> +	default:
> +		AESNI_MB_LOG(ERR, "Unsupported vector mode %u\n", vector_mode);
> +		free_mb_mgr(mb_mgr);
> +		return NULL;
> +	}
> +
> +	return mb_mgr;
> +}
> +
> +static MB_MGR *sec_mb_mgrs[RTE_MAX_LCORE];
> +
> +int
> +aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num)
> +{
> +	struct aesni_mb_sec_session *sec_sess = sess->sess_private_data;
> +	JOB_AES_HMAC *job;
> +	static MB_MGR *mb_mgr;
> +	uint32_t lcore_id = rte_lcore_id();
> +	uint8_t digest_idx = sec_sess->digest_idx;
> +	uint32_t i, processed = 0;
> +	int ret = 0, errcnt = 0;
> +
> +	if (unlikely(sec_mb_mgrs[lcore_id] == NULL)) {

I don't think this is completely safe.
For non-EAL threads rte_lcore_id() == -1,
so at the very least you need to check that lcore_id < RTE_MAX_LCORE.
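A minimal sketch of the guarded lookup. `MAX_LCORE` and `mgr_table[]` are stand-ins for `RTE_MAX_LCORE` and `sec_mb_mgrs[]`; in DPDK, `rte_lcore_id()` returns `LCORE_ID_ANY` (i.e. `UINT32_MAX`) on non-EAL threads, which indexes far past the table if unchecked:

```c
#include <stdint.h>
#include <stddef.h>

#define MAX_LCORE 128u			/* stand-in for RTE_MAX_LCORE */
#define LCORE_ID_ANY UINT32_MAX		/* rte_lcore_id() off-EAL result */

static void *mgr_table[MAX_LCORE];	/* stand-in for sec_mb_mgrs[] */

/* Reject out-of-range ids before indexing the per-lcore table, so a
 * non-EAL thread (lcore_id == LCORE_ID_ANY) cannot read or write past
 * the end of the array. */
static void *
get_lcore_mgr(uint32_t lcore_id)
{
	if (lcore_id >= MAX_LCORE)
		return NULL;		/* caller must handle non-EAL case */
	return mgr_table[lcore_id];
}
```

The process function would then bail out (or fall back to a lock-protected shared manager) when the lookup returns NULL.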


> +		sec_mb_mgrs[lcore_id] = alloc_init_mb_mgr();
> +
> +		if (sec_mb_mgrs[lcore_id] == NULL) {
> +			for (i = 0; i < num; i++)
> +				status[i] = -ENOMEM;
> +
> +			return -num;
> +		}
> +	}
> +
> +	mb_mgr = sec_mb_mgrs[lcore_id];
> +
> +	for (i = 0; i < num; i++) {
> +		void *seg_buf = buf[i].vec[0].iov_base;
> +		uint32_t buf_len = buf[i].vec[0].iov_len;
> +
> +		job = IMB_GET_NEXT_JOB(mb_mgr);
> +		if (unlikely(job == NULL)) {
> +			processed += flush_mb_sec_mgr(mb_mgr);
> +
> +			job = IMB_GET_NEXT_JOB(mb_mgr);
> +			if (!job) {
> +				errcnt -= 1;
> +				status[i] = -ENOMEM;
> +			}
> +		}
> +
> +		ret = set_mb_job_params_sec(job, sec_sess, seg_buf, buf_len,
> +				iv[i], aad[i], digest[i], &status[i],
> +				&digest_idx);

I still don't understand the purpose of passing a digest_idx pointer here...
Why not just:

ret = set_mb_job_params_sec(job, sec_sess, seg_buf, buf_len,
				iv[i], aad[i], digest[i], &status[i],
				digest_idx);
digest_idx = (digest_idx + 1) % MAX_JOBS;

Second, I am not sure what the purpose of storing digest_idx
inside the session (sess->digest_idx) is at all.
As far as I can see you never update it, so it seems
digest_idx = 0
at the start of that function would be enough?



> +				/* Submit job to multi-buffer for processing */
> +		if (ret) {
> +			processed++;
> +			status[i] = ret;
> +			errcnt -= 1;
> +			continue;
> +		}
> +
> +#ifdef RTE_LIBRTE_PMD_AESNI_MB_DEBUG
> +		job = IMB_SUBMIT_JOB(mb_mgr);
> +#else
> +		job = IMB_SUBMIT_JOB_NOCHECK(mb_mgr);
> +#endif
> +
> +		if (job)
> +			processed += handle_completed_sec_jobs(job, mb_mgr);
> +	}
> +
> +	while (processed < num)
> +		processed += flush_mb_sec_mgr(mb_mgr);
> +
> +	return errcnt;
> +}
> +
>  static int cryptodev_aesni_mb_remove(struct rte_vdev_device *vdev);
> 
>  static int
> @@ -1248,8 +1565,9 @@ cryptodev_aesni_mb_create(const char *name,
>  {
>  	struct rte_cryptodev *dev;
>  	struct aesni_mb_private *internals;
> -	enum aesni_mb_vector_mode vector_mode;
> +	struct rte_security_ctx *sec_ctx;
>  	MB_MGR *mb_mgr;
> +	char sec_name[RTE_DEV_NAME_MAX_LEN];
> 
>  	/* Check CPU for support for AES instruction set */
>  	if (!rte_cpu_get_flag_enabled(RTE_CPUFLAG_AES)) {
> @@ -1283,35 +1601,14 @@ cryptodev_aesni_mb_create(const char *name,
>  	dev->feature_flags = RTE_CRYPTODEV_FF_SYMMETRIC_CRYPTO |
>  			RTE_CRYPTODEV_FF_SYM_OPERATION_CHAINING |
>  			RTE_CRYPTODEV_FF_CPU_AESNI |
> -			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT;
> +			RTE_CRYPTODEV_FF_OOP_LB_IN_LB_OUT |
> +			RTE_CRYPTODEV_FF_SECURITY;
> 
> 
> -	mb_mgr = alloc_mb_mgr(0);
> +	mb_mgr = alloc_init_mb_mgr();
>  	if (mb_mgr == NULL)
>  		return -ENOMEM;
> 
> -	switch (vector_mode) {
> -	case RTE_AESNI_MB_SSE:
> -		dev->feature_flags |= RTE_CRYPTODEV_FF_CPU_SSE;
> -		init_mb_mgr_sse(mb_mgr);
> -		break;
> -	case RTE_AESNI_MB_AVX:
> -		dev->feature_flags |= RTE_CRYPTODEV_FF_CPU_AVX;
> -		init_mb_mgr_avx(mb_mgr);
> -		break;
> -	case RTE_AESNI_MB_AVX2:
> -		dev->feature_flags |= RTE_CRYPTODEV_FF_CPU_AVX2;
> -		init_mb_mgr_avx2(mb_mgr);
> -		break;
> -	case RTE_AESNI_MB_AVX512:
> -		dev->feature_flags |= RTE_CRYPTODEV_FF_CPU_AVX512;
> -		init_mb_mgr_avx512(mb_mgr);
> -		break;
> -	default:
> -		AESNI_MB_LOG(ERR, "Unsupported vector mode %u\n", vector_mode);
> -		goto error_exit;
> -	}
> -
>  	/* Set vector instructions mode supported */
>  	internals = dev->data->dev_private;
> 
> @@ -1322,11 +1619,28 @@ cryptodev_aesni_mb_create(const char *name,
>  	AESNI_MB_LOG(INFO, "IPSec Multi-buffer library version used: %s\n",
>  			imb_get_version_str());
> 
> +	/* setup security operations */
> +	snprintf(sec_name, sizeof(sec_name) - 1, "aes_mb_sec_%u",
> +			dev->driver_id);
> +	sec_ctx = rte_zmalloc_socket(sec_name,
> +			sizeof(struct rte_security_ctx),
> +			RTE_CACHE_LINE_SIZE, init_params->socket_id);
> +	if (sec_ctx == NULL) {
> +		AESNI_MB_LOG(ERR, "memory allocation failed\n");
> +		goto error_exit;
> +	}
> +
> +	sec_ctx->device = (void *)dev;
> +	sec_ctx->ops = rte_aesni_mb_pmd_security_ops;
> +	dev->security_ctx = sec_ctx;
> +
>  	return 0;
> 
>  error_exit:
>  	if (mb_mgr)
>  		free_mb_mgr(mb_mgr);
> +	if (sec_ctx)
> +		rte_free(sec_ctx);
> 
>  	rte_cryptodev_pmd_destroy(dev);
> 
> @@ -1367,6 +1681,7 @@ cryptodev_aesni_mb_remove(struct rte_vdev_device *vdev)
>  	struct rte_cryptodev *cryptodev;
>  	struct aesni_mb_private *internals;
>  	const char *name;
> +	uint32_t i;
> 
>  	name = rte_vdev_device_name(vdev);
>  	if (name == NULL)
> @@ -1379,6 +1694,9 @@ cryptodev_aesni_mb_remove(struct rte_vdev_device *vdev)
>  	internals = cryptodev->data->dev_private;
> 
>  	free_mb_mgr(internals->mb_mgr);
> +	for (i = 0; i < RTE_MAX_LCORE; i++)
> +		if (sec_mb_mgrs[i])
> +			free_mb_mgr(sec_mb_mgrs[i]);
> 
>  	return rte_cryptodev_pmd_destroy(cryptodev);
>  }
> diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
> index 8d15b99d4..f47df2d57 100644
> --- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
> +++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_ops.c
> @@ -8,6 +8,7 @@
>  #include <rte_common.h>
>  #include <rte_malloc.h>
>  #include <rte_cryptodev_pmd.h>
> +#include <rte_security_driver.h>
> 
>  #include "rte_aesni_mb_pmd_private.h"
> 
> @@ -732,7 +733,8 @@ aesni_mb_pmd_qp_count(struct rte_cryptodev *dev)
>  static unsigned
>  aesni_mb_pmd_sym_session_get_size(struct rte_cryptodev *dev __rte_unused)
>  {
> -	return sizeof(struct aesni_mb_session);
> +	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_session),
> +			RTE_CACHE_LINE_SIZE);
>  }
> 
>  /** Configure a aesni multi-buffer session from a crypto xform chain */
> @@ -810,4 +812,92 @@ struct rte_cryptodev_ops aesni_mb_pmd_ops = {
>  		.sym_session_clear	= aesni_mb_pmd_sym_session_clear
>  };
> 
> +/** Set session authentication parameters */
> +
> +static int
> +aesni_mb_security_session_create(void *dev,
> +		struct rte_security_session_conf *conf,
> +		struct rte_security_session *sess,
> +		struct rte_mempool *mempool)
> +{
> +	struct rte_cryptodev *cdev = dev;
> +	struct aesni_mb_private *internals = cdev->data->dev_private;
> +	struct aesni_mb_sec_session *sess_priv;
> +	int ret;
> +
> +	if (!conf->crypto_xform) {
> +		AESNI_MB_LOG(ERR, "Invalid security session conf");
> +		return -EINVAL;
> +	}
> +
> +	if (conf->cpucrypto.cipher_offset < 0) {
> +		AESNI_MB_LOG(ERR, "Invalid security session conf");
> +		return -EINVAL;
> +	}
> +
> +	if (rte_mempool_get(mempool, (void **)(&sess_priv))) {
> +		AESNI_MB_LOG(ERR,
> +				"Couldn't get object from session mempool");
> +		return -ENOMEM;
> +	}
> +
> +	sess_priv->cipher_offset = conf->cpucrypto.cipher_offset;
> +
> +	ret = aesni_mb_set_session_parameters(internals->mb_mgr,
> +			&sess_priv->sess, conf->crypto_xform);
> +	if (ret != 0) {
> +		AESNI_MB_LOG(ERR, "failed configure session parameters");
> +
> +		rte_mempool_put(mempool, sess_priv);
> +	}
> +
> +	sess->sess_private_data = (void *)sess_priv;
> +
> +	return ret;
> +}
> +
> +static int
> +aesni_mb_security_session_destroy(void *dev __rte_unused,
> +		struct rte_security_session *sess)
> +{
> +	struct aesni_mb_sec_session *sess_priv =
> +			get_sec_session_private_data(sess);
> +
> +	if (sess_priv) {
> +		struct rte_mempool *sess_mp = rte_mempool_from_obj(
> +				(void *)sess_priv);
> +
> +		memset(sess, 0, sizeof(struct aesni_mb_sec_session));
> +		set_sec_session_private_data(sess, NULL);
> +
> +		if (sess_mp == NULL) {
> +			AESNI_MB_LOG(ERR, "failed fetch session mempool");
> +			return -EINVAL;
> +		}
> +
> +		rte_mempool_put(sess_mp, sess_priv);
> +	}
> +
> +	return 0;
> +}
> +
> +static unsigned int
> +aesni_mb_sec_session_get_size(__rte_unused void *device)
> +{
> +	return RTE_ALIGN_CEIL(sizeof(struct aesni_mb_sec_session),
> +			RTE_CACHE_LINE_SIZE);
> +}
> +
> +static struct rte_security_ops aesni_mb_security_ops = {
> +		.session_create = aesni_mb_security_session_create,
> +		.session_get_size = aesni_mb_sec_session_get_size,
> +		.session_update = NULL,
> +		.session_stats_get = NULL,
> +		.session_destroy = aesni_mb_security_session_destroy,
> +		.set_pkt_metadata = NULL,
> +		.capabilities_get = NULL,
> +		.process_cpu_crypto_bulk = aesni_mb_sec_crypto_process_bulk,
> +};
> +
>  struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops = &aesni_mb_pmd_ops;
> +struct rte_security_ops *rte_aesni_mb_pmd_security_ops = &aesni_mb_security_ops;
> diff --git a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
> index b794d4bc1..64b58ca8e 100644
> --- a/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
> +++ b/drivers/crypto/aesni_mb/rte_aesni_mb_pmd_private.h
> @@ -176,7 +176,6 @@ struct aesni_mb_qp {
>  	 */
>  } __rte_cache_aligned;
> 
> -/** AES-NI multi-buffer private session structure */
>  struct aesni_mb_session {
>  	JOB_CHAIN_ORDER chain_order;
>  	struct {
> @@ -265,16 +264,32 @@ struct aesni_mb_session {
>  		/** AAD data length */
>  		uint16_t aad_len;
>  	} aead;
> -} __rte_cache_aligned;
> +};
> +
> +/** AES-NI multi-buffer private security session structure */
> +struct aesni_mb_sec_session {
> +	/**< Unique Queue Pair Name */
> +	struct aesni_mb_session sess;
> +	uint8_t temp_digests[MAX_JOBS][DIGEST_LENGTH_MAX];

Same question as for v1:
Probably better to move these temp_digests[][] to the very end,
so that all read-only data is grouped together?
Another thought - do you need them here at all?
Can't we just allocate
temp_digests[MAX_JOBS][DIGEST_LENGTH_MAX];
on the stack inside the process() function?

> +	uint16_t digest_idx;
> +	uint32_t cipher_offset;
> +	MB_MGR *mb_mgr;
> +};
> 
>  extern int
>  aesni_mb_set_session_parameters(const MB_MGR *mb_mgr,
>  		struct aesni_mb_session *sess,
>  		const struct rte_crypto_sym_xform *xform);
> 
> +extern int
> +aesni_mb_sec_crypto_process_bulk(struct rte_security_session *sess,
> +		struct rte_security_vec buf[], void *iv[], void *aad[],
> +		void *digest[], int status[], uint32_t num);
> +
>  /** device specific operations function pointer structure */
>  extern struct rte_cryptodev_ops *rte_aesni_mb_pmd_ops;
> 
> -
> +/** device specific operations function pointer structure for rte_security */
> +extern struct rte_security_ops *rte_aesni_mb_pmd_security_ops;
> 
>  #endif /* _RTE_AESNI_MB_PMD_PRIVATE_H_ */
> --
> 2.14.5


^ permalink raw reply	[flat|nested] 84+ messages in thread

* Re: [dpdk-dev] [PATCH v2 08/10] ipsec: add rte_security cpu_crypto action support
  2019-10-07 16:28     ` [dpdk-dev] [PATCH v2 08/10] ipsec: add rte_security cpu_crypto action support Fan Zhang
@ 2019-10-08 23:28       ` Ananyev, Konstantin
  0 siblings, 0 replies; 84+ messages in thread
From: Ananyev, Konstantin @ 2019-10-08 23:28 UTC (permalink / raw)
  To: Zhang, Roy Fan, dev; +Cc: Doherty, Declan, akhil.goyal

Hi Fan,
Comments for the inbound part inline.
As far as I can see, the majority of my v1 comments are still not addressed.
Please check.
Konstantin

> 
> This patch updates the ipsec library to handle the newly introduced
> RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO action.
> 
> Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> ---
>  lib/librte_ipsec/crypto.h   |  24 +++
>  lib/librte_ipsec/esp_inb.c  | 200 ++++++++++++++++++++++--
>  lib/librte_ipsec/esp_outb.c | 369 +++++++++++++++++++++++++++++++++++++++++---
>  lib/librte_ipsec/sa.c       |  53 ++++++-
>  lib/librte_ipsec/sa.h       |  29 ++++
>  lib/librte_ipsec/ses.c      |   4 +-
>  6 files changed, 643 insertions(+), 36 deletions(-)
> 
> diff --git a/lib/librte_ipsec/crypto.h b/lib/librte_ipsec/crypto.h
> index f8fbf8d4f..901c8c7de 100644
> --- a/lib/librte_ipsec/crypto.h
> +++ b/lib/librte_ipsec/crypto.h
> @@ -179,4 +179,28 @@ lksd_none_cop_prepare(struct rte_crypto_op *cop,
>  	__rte_crypto_sym_op_attach_sym_session(sop, cs);
>  }
> 
> +typedef void* (*_set_icv_f)(void *val, struct rte_mbuf *ml, uint32_t icv_off);
> +
> +static inline void *
> +set_icv_va_pa(void *val, struct rte_mbuf *ml, uint32_t icv_off)
> +{
> +	union sym_op_data *icv = val;
> +
> +	icv->va = rte_pktmbuf_mtod_offset(ml, void *, icv_off);
> +	icv->pa = rte_pktmbuf_iova_offset(ml, icv_off);
> +
> +	return icv->va;
> +}
> +
> +static inline void *
> +set_icv_va(__rte_unused void *val, __rte_unused struct rte_mbuf *ml,
> +		__rte_unused uint32_t icv_off)
> +{
> +	void **icv_va = val;
> +
> +	*icv_va = rte_pktmbuf_mtod_offset(ml, void *, icv_off);
> +
> +	return *icv_va;
> +}
> +
>  #endif /* _CRYPTO_H_ */
> diff --git a/lib/librte_ipsec/esp_inb.c b/lib/librte_ipsec/esp_inb.c
> index 8e3ecbc64..c4476e819 100644
> --- a/lib/librte_ipsec/esp_inb.c
> +++ b/lib/librte_ipsec/esp_inb.c
> @@ -105,6 +105,78 @@ inb_cop_prepare(struct rte_crypto_op *cop,
>  	}
>  }
> 
> +static inline int
> +inb_cpu_crypto_proc_prepare(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb,
> +	uint32_t pofs, uint32_t plen,
> +	struct rte_security_vec *buf, struct iovec *cur_vec,
> +	void *iv)
> +{
> +	struct rte_mbuf *ms;
> +	struct iovec *vec = cur_vec;
> +	struct aead_gcm_iv *gcm;
> +	struct aesctr_cnt_blk *ctr;
> +	uint64_t *ivp;
> +	uint32_t algo;
> +	uint32_t left;
> +	uint32_t off = 0, n_seg = 0;

Same comment as for v1:
Please separate variable definition and value assignment.
Mixing them makes the code hard to read, plus we don't do that in the rest of
the library, so it is better to follow the rest of the code's style.

> +
> +	ivp = rte_pktmbuf_mtod_offset(mb, uint64_t *,
> +		pofs + sizeof(struct rte_esp_hdr));
> +	algo = sa->algo_type;
> +
> +	switch (algo) {
> +	case ALGO_TYPE_AES_GCM:
> +		gcm = (struct aead_gcm_iv *)iv;
> +		aead_gcm_iv_fill(gcm, ivp[0], sa->salt);
> +		off = sa->ctp.cipher.offset + pofs;
> +		left = plen - sa->ctp.cipher.length;
> +		break;
> +	case ALGO_TYPE_AES_CBC:
> +	case ALGO_TYPE_3DES_CBC:
> +		copy_iv(iv, ivp, sa->iv_len);
> +		off = sa->ctp.auth.offset + pofs;
> +		left = plen - sa->ctp.auth.length;
> +		break;
> +	case ALGO_TYPE_AES_CTR:
> +		copy_iv(iv, ivp, sa->iv_len);
> +		off = sa->ctp.auth.offset + pofs;
> +		left = plen - sa->ctp.auth.length;
> +		ctr = (struct aesctr_cnt_blk *)iv;
> +		aes_ctr_cnt_blk_fill(ctr, ivp[0], sa->salt);
> +		break;
> +	case ALGO_TYPE_NULL:
> +		left = plen - sa->ctp.cipher.length;
> +		break;
> +	default:
> +		return -EINVAL;

How can we end up here?
If we have an unknown algorithm, shouldn't we fail at the init stage?

> +	}
> +
> +	ms = mbuf_get_seg_ofs(mb, &off);
> +	if (!ms)
> +		return -1;

Same comment as for v1:
inb_pkt_prepare() should already check that we have a valid packet,
so I don't think there is a need to check for any failure here.
Another thing: our ESP header will be in the first segment for sure,
so do we need get_seg_ofs() here at all?


> +
> +	while (n_seg < RTE_LIBRTE_IP_FRAG_MAX_FRAG && left && ms) {
> +		uint32_t len = RTE_MIN(left, ms->data_len - off);


Again, same comments as for v1:

- I don't think this is right; we shouldn't impose additional limitations on
the number of segments in the packet.

- The whole construction seems a bit over-complicated here...
Why not just have a separate function that fills iovec[] from the mbuf
and returns an error if there are not enough iovec[] entries?
Something like:

static inline int
mbuf_to_iovec(const struct rte_mbuf *mb, uint32_t ofs, uint32_t len,
	struct iovec vec[], uint32_t num)
{
	uint32_t i;
	const struct rte_mbuf *ms;

	if (mb->nb_segs > num)
		return -mb->nb_segs;

	vec[0].iov_base = rte_pktmbuf_mtod_offset(mb, void *, ofs);
	vec[0].iov_len = mb->data_len - ofs;

	for (i = 1, ms = mb->next; ms != NULL; ms = ms->next, i++) {
		vec[i].iov_base = rte_pktmbuf_mtod(ms, void *);
		vec[i].iov_len = ms->data_len;
	}

	/* trim the last entry so the total covers exactly len bytes */
	vec[i - 1].iov_len -= (mb->pkt_len - ofs) - len;
	return i;
}

Then we can use that function to fill our iovec[] in a loop.

- Looking at this function, it seems to consist of 2 separate parts:
1. calculate the offset and generate the IV;
2. set up iovec[].
It is probably worth splitting it into 2 separate functions along those lines.
That would make it much easier to read/understand.

> +
> +		vec->iov_base = rte_pktmbuf_mtod_offset(ms, void *, off);
> +		vec->iov_len = len;
> +
> +		left -= len;
> +		vec++;
> +		n_seg++;
> +		ms = ms->next;
> +		off = 0;
> +	}
> +
> +	if (left)
> +		return -1;
> +
> +	buf->vec = cur_vec;
> +	buf->num = n_seg;
> +
> +	return n_seg;
> +}
> +
>  /*
>   * Helper function for prepare() to deal with situation when
>   * ICV is spread by two segments. Tries to move ICV completely into the
> @@ -139,20 +211,21 @@ move_icv(struct rte_mbuf *ml, uint32_t ofs)
>   */
>  static inline void
>  inb_pkt_xprepare(const struct rte_ipsec_sa *sa, rte_be64_t sqc,
> -	const union sym_op_data *icv)
> +	uint8_t *icv_va, void *aad_buf, uint32_t aad_off)
>  {
>  	struct aead_gcm_aad *aad;
> 
>  	/* insert SQN.hi between ESP trailer and ICV */
>  	if (sa->sqh_len != 0)
> -		insert_sqh(sqn_hi32(sqc), icv->va, sa->icv_len);
> +		insert_sqh(sqn_hi32(sqc), icv_va, sa->icv_len);
> 
>  	/*
>  	 * fill AAD fields, if any (aad fields are placed after icv),
>  	 * right now we support only one AEAD algorithm: AES-GCM.
>  	 */
>  	if (sa->aad_len != 0) {
> -		aad = (struct aead_gcm_aad *)(icv->va + sa->icv_len);
> +		aad = aad_buf ? aad_buf :
> +				(struct aead_gcm_aad *)(icv_va + aad_off);
>  		aead_gcm_aad_fill(aad, sa->spi, sqc, IS_ESN(sa));
>  	}
>  }
> @@ -162,13 +235,15 @@ inb_pkt_xprepare(const struct rte_ipsec_sa *sa, rte_be64_t sqc,
>   */
>  static inline int32_t
>  inb_pkt_prepare(const struct rte_ipsec_sa *sa, const struct replay_sqn *rsn,
> -	struct rte_mbuf *mb, uint32_t hlen, union sym_op_data *icv)
> +	struct rte_mbuf *mb, uint32_t hlen, _set_icv_f set_icv, void *icv_val,
> +	void *aad_buf)

This whole construct with another function pointer, overloaded arguments, etc.,
looks a bit clumsy and overcomplicated.
I think it would be much cleaner and easier to re-arrange the code like this:

1. update inb_pkt_xprepare() to take the AAD buffer pointer and AAD length as parameters:

static inline void
inb_pkt_xprepare(const struct rte_ipsec_sa *sa, rte_be64_t sqc,
        const union sym_op_data *icv, void *aad, uint32_t aad_len)
{
        /* insert SQN.hi between ESP trailer and ICV */
        if (sa->sqh_len != 0)
                insert_sqh(sqn_hi32(sqc), icv->va, sa->icv_len);

        /*
         * fill AAD fields, if any (aad fields are placed after icv),
         * right now we support only one AEAD algorithm: AES-GCM.
         */
        if (aad_len != 0)
                aead_gcm_aad_fill(aad, sa->spi, sqc, IS_ESN(sa));
}

2. split inb_pkt_prepare() into 2 reusable helper functions:

/*
 * retrieve and reconstruct SQN, then check it, then
 * convert it back into network byte order.
 */
static inline int
inb_get_sqn(const struct rte_ipsec_sa *sa, const struct replay_sqn *rsn,
        struct rte_mbuf *mb, uint32_t hlen, rte_be64_t *sqc)
{
        int32_t rc;
        uint64_t sqn;
        struct rte_esp_hdr *esph;

        esph = rte_pktmbuf_mtod_offset(mb, struct rte_esp_hdr *, hlen);

        /*
         * retrieve and reconstruct SQN, then check it, then
         * convert it back into network byte order.
         */
        sqn = rte_be_to_cpu_32(esph->seq);
        if (IS_ESN(sa))
                sqn = reconstruct_esn(rsn->sqn, sqn, sa->replay.win_sz);

        rc = esn_inb_check_sqn(rsn, sa, sqn);
        if (rc == 0)
                *sqc = rte_cpu_to_be_64(sqn);

        return rc;
}

static inline int32_t
inb_prepare(const struct rte_ipsec_sa *sa, struct rte_mbuf *mb,
        uint32_t hlen, uint32_t aad_len, union sym_op_data *icv)
{
        uint32_t clen, icv_len, icv_ofs, plen;
        struct rte_mbuf *ml;

        /* start packet manipulation */
        plen = mb->pkt_len;
        plen = plen - hlen;

        /* check that packet has a valid length */
        clen = plen - sa->ctp.cipher.length;
        if ((int32_t)clen < 0 || (clen & (sa->pad_align - 1)) != 0)
                return -EBADMSG;

        /* find ICV location */
        icv_len = sa->icv_len;
        icv_ofs = mb->pkt_len - icv_len;

        ml = mbuf_get_seg_ofs(mb, &icv_ofs);

        /*
         * if ICV is spread by two segments, then try to
         * move ICV completely into the last segment.
         */
        if (ml->data_len < icv_ofs + icv_len) {

                ml = move_icv(ml, icv_ofs);
                if (ml == NULL)
                        return -ENOSPC;

                /* new ICV location */
                icv_ofs = 0;
        }

        icv_ofs += sa->sqh_len;

        /* we have to allocate space for AAD somewhere,
         * right now - just use free trailing space at the last segment.
         * Would probably be more convenient to reserve space for AAD
         * inside rte_crypto_op itself
         * (again for IV space is already reserved inside cop).
         */
        if (aad_len + sa->sqh_len > rte_pktmbuf_tailroom(ml))
                return -ENOSPC;

        icv->va = rte_pktmbuf_mtod_offset(ml, void *, icv_ofs);
        icv->pa = rte_pktmbuf_iova_offset(ml, icv_ofs);

        /*
         * if esn is used then high-order 32 bits are also used in ICV
         * calculation but are not transmitted, update packet length
         * to be consistent with auth data length and offset, this will
         * be subtracted from packet length in post crypto processing
         */
        mb->pkt_len += sa->sqh_len;
        ml->data_len += sa->sqh_len;

        return plen;
}

3. Now inb_pkt_prepare() becomes a simple sequential invocation of these 3 sub-functions
with the right parameters:

static inline int32_t
inb_pkt_prepare(const struct rte_ipsec_sa *sa, const struct replay_sqn *rsn,
        struct rte
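The call sequence described in step 3 could be sketched in isolation like this. The types and helper bodies below are stubs standing in for the real DPDK ones (the real helpers take rte_ipsec_sa/rte_mbuf arguments as shown above, and `aad`/`aad_len` would come from the SA rather than parameters); only the composition order matters here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* stand-ins for the real DPDK types; helper bodies stubbed for illustration */
typedef uint64_t rte_be64_t;
struct sa; struct rsn; struct mbuf;
union sym_op_data { void *va; };

static int
inb_get_sqn(const struct sa *sa, const struct rsn *rsn,
	struct mbuf *mb, uint32_t hlen, rte_be64_t *sqc)
{
	/* stub: real helper reconstructs and checks the SQN */
	(void)sa; (void)rsn; (void)mb; (void)hlen;
	*sqc = 1;
	return 0;
}

static int32_t
inb_prepare(const struct sa *sa, struct mbuf *mb,
	uint32_t hlen, uint32_t aad_len, union sym_op_data *icv)
{
	/* stub: real helper validates lengths and locates the ICV */
	(void)sa; (void)mb; (void)hlen; (void)aad_len;
	icv->va = NULL;
	return 64; /* pretend payload length */
}

static void
inb_pkt_xprepare(const struct sa *sa, rte_be64_t sqc,
	const union sym_op_data *icv, void *aad, uint32_t aad_len)
{
	/* stub: real helper inserts SQN.hi and fills AAD */
	(void)sa; (void)sqc; (void)icv; (void)aad; (void)aad_len;
}

/* the composed function: check SQN, prepare packet, fill auxiliary fields */
static int32_t
inb_pkt_prepare(const struct sa *sa, const struct rsn *rsn,
	struct mbuf *mb, uint32_t hlen, uint32_t aad_len, void *aad)
{
	int32_t rc;
	rte_be64_t sqc;
	union sym_op_data icv;

	rc = inb_get_sqn(sa, rsn, mb, hlen, &sqc);
	if (rc != 0)
		return rc;

	rc = inb_prepare(sa, mb, hlen, aad_len, &icv);
	if (rc < 0)
		return rc;

	inb_pkt_xprepare(sa, sqc, &icv, aad, aad_len);
	return rc; /* payload length from inb_prepare() */
}
```

The payoff of this arrangement is that the synchronous (cpu-crypto) path can call inb_get_sqn()/inb_prepare() directly without the function-pointer indirection.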