From mboxrd@z Thu Jan 1 00:00:00 1970
From: Xueming Li
To: Jack Bond-Preston
CC: "stable@dpdk.org", Kai Ji, Wathsala Vithanage
Subject: Re: [PATCH 23.11] crypto/openssl: make per-QP auth context clones
Date: Tue, 13 Aug 2024 06:12:35 +0000
References: <20240812134619.4018767-1-jack.bond-preston@foss.arm.com>
In-Reply-To: <20240812134619.4018767-1-jack.bond-preston@foss.arm.com>
List-Id: patches for DPDK stable branches

Hi Jack,

Thanks for the prompt response. The patch is queued to the 23.11.2 release candidate queue.

Regards,
Xueming

From: Jack Bond-Preston <jack.bond-preston@foss.arm.com>
Sent: Monday, August 12, 2024 9:46 PM
To: Xueming Li <xuemingl@nvidia.com>
Cc: stable@dpdk.org <stable@dpdk.org>; Kai Ji <kai.ji@intel.com>; Wathsala Vithanage <wathsala.vithanage@arm.com>
Subject: [PATCH 23.11] crypto/openssl: make per-QP auth context clones

[ upstream commit 17d5bc6135afdb38ddf02595bfa15aa5142d80b1 ]

Currently EVP auth ctxs (e.g. EVP_MD_CTX, EVP_MAC_CTX) are allocated,
populated by copying from the openssl_session, and then freed for every
auth operation (i.e. per packet). This is very inefficient, and avoidable.

Make each openssl_session hold an array of structures, containing
pointers to per-queue-pair cipher and auth context copies. These are
populated on first use by allocating a new context and copying from the
main context. These copies can then be used in a thread-safe manner by
different worker lcores simultaneously. Consequently the auth context
allocation and copy only has to happen once - the first time a given qp
uses an openssl_session. This brings about a large performance boost.
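
The lazy per-queue-pair clone described above can be sketched roughly as
follows. This is only an illustration under stated assumptions, not the
driver code: toy_ctx stands in for an OpenSSL context such as EVP_MD_CTX,
toy_session for openssl_session, and every toy_* name is hypothetical. It
also assumes each queue pair is serviced by a single lcore, so slot qp_id
is only ever touched by its owner and needs no locking.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an OpenSSL auth context (e.g. EVP_MD_CTX). */
struct toy_ctx {
	uint8_t key[16];
	uint64_t state;
};

/* Hypothetical stand-in for openssl_session: one main context plus a
 * flexible array of lazily populated per-queue-pair copies.
 */
struct toy_session {
	struct toy_ctx *main_ctx;
	uint16_t ctx_copies_len;
	struct toy_ctx *qp_ctx[];
};

/* Counts successful duplications, purely for illustration. */
static unsigned int toy_dup_calls;

static struct toy_ctx *
toy_ctx_dup(const struct toy_ctx *src)
{
	struct toy_ctx *copy = malloc(sizeof(*copy));

	if (copy != NULL) {
		memcpy(copy, src, sizeof(*copy));
		toy_dup_calls++;
	}
	return copy;
}

static struct toy_session *
toy_session_create(uint16_t nb_qps)
{
	struct toy_session *sess;

	/* calloc() leaves every per-qp slot NULL, i.e. "not cloned yet". */
	sess = calloc(1, sizeof(*sess) + nb_qps * sizeof(sess->qp_ctx[0]));
	if (sess == NULL)
		return NULL;
	sess->main_ctx = calloc(1, sizeof(*sess->main_ctx));
	if (sess->main_ctx == NULL) {
		free(sess);
		return NULL;
	}
	sess->ctx_copies_len = nb_qps;
	return sess;
}

/* Return the copy owned by qp_id, cloning the main context on first use. */
static struct toy_ctx *
toy_get_local_ctx(struct toy_session *sess, uint16_t qp_id)
{
	/* If the array is not being used, fall back to the main context. */
	if (sess->ctx_copies_len == 0)
		return sess->main_ctx;

	assert(qp_id < sess->ctx_copies_len);
	if (sess->qp_ctx[qp_id] == NULL)
		sess->qp_ctx[qp_id] = toy_ctx_dup(sess->main_ctx);
	return sess->qp_ctx[qp_id];
}

static void
toy_session_destroy(struct toy_session *sess)
{
	/* free(NULL) is a no-op, so unused slots need no special casing. */
	for (uint16_t i = 0; i < sess->ctx_copies_len; i++)
		free(sess->qp_ctx[i]);
	free(sess->main_ctx);
	free(sess);
}
```

In this sketch, repeated calls to toy_get_local_ctx() with the same qp_id
return the same pointer, so the allocation and copy cost is paid once per
(session, queue pair) rather than once per packet.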

Throughput performance uplift measurements for HMAC-SHA1 generate on
Ampere Altra Max platform:
1 worker lcore
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          0.63 |               1.42 |   123.5% |
|             256 |          2.24 |               4.40 |    96.4% |
|            1024 |          6.15 |               9.26 |    50.6% |
|            2048 |          8.68 |              11.38 |    31.1% |
|            4096 |         10.92 |              12.84 |    17.6% |

8 worker lcores
|   buffer sz (B) |   prev (Gbps) |   optimised (Gbps) |   uplift |
|-----------------+---------------+--------------------+----------|
|              64 |          0.93 |              11.35 |  1122.5% |
|             256 |          3.70 |              35.30 |   853.7% |
|            1024 |         15.22 |              74.27 |   387.8% |
|            2048 |         30.20 |              91.08 |   201.6% |
|            4096 |         56.92 |             102.76 |    80.5% |

Cc: stable@dpdk.org

Signed-off-by: Jack Bond-Preston <jack.bond-preston@foss.arm.com>
Acked-by: Kai Ji <kai.ji@intel.com>
Reviewed-by: Wathsala Vithanage <wathsala.vithanage@arm.com>
---
 drivers/crypto/openssl/compat.h              |  26 +++
 drivers/crypto/openssl/openssl_pmd_private.h |  25 ++-
 drivers/crypto/openssl/rte_openssl_pmd.c     | 176 +++++++++++++++----
 drivers/crypto/openssl/rte_openssl_pmd_ops.c |   7 +-
 4 files changed, 193 insertions(+), 41 deletions(-)

diff --git a/drivers/crypto/openssl/compat.h b/drivers/crypto/openssl/compat.h
index 9f9167c4f1..e1814fea8c 100644
--- a/drivers/crypto/openssl/compat.h
+++ b/drivers/crypto/openssl/compat.h
@@ -5,6 +5,32 @@
 #ifndef __RTA_COMPAT_H__
 #define __RTA_COMPAT_H__
 
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+static __rte_always_inline void
+free_hmac_ctx(EVP_MAC_CTX *ctx)
+{
+       EVP_MAC_CTX_free(ctx);
+}
+
+static __rte_always_inline void
+free_cmac_ctx(EVP_MAC_CTX *ctx)
+{
+       EVP_MAC_CTX_free(ctx);
+}
+#else
+static __rte_always_inline void
+free_hmac_ctx(HMAC_CTX *ctx)
+{
+       HMAC_CTX_free(ctx);
+}
+
+static __rte_always_inline void
+free_cmac_ctx(CMAC_CTX *ctx)
+{
+       CMAC_CTX_free(ctx);
+}
+#endif
+
 #if (OPENSSL_VERSION_NUMBER < 0x10100000L)
 
 static __rte_always_inline int
diff --git a/drivers/crypto/openssl/openssl_pmd_private.h b/drivers/crypto/openssl/openssl_pmd_private.h
index 370de1d53b..aa3f466e74 100644
--- a/drivers/crypto/openssl/openssl_pmd_private.h
+++ b/drivers/crypto/openssl/openssl_pmd_private.h
@@ -80,6 +80,20 @@ struct openssl_qp {
          */
 } __rte_cache_aligned;
 
+struct evp_ctx_pair {
+       EVP_CIPHER_CTX *cipher;
+       union {
+               EVP_MD_CTX *auth;
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+               EVP_MAC_CTX *hmac;
+               EVP_MAC_CTX *cmac;
+#else
+               HMAC_CTX *hmac;
+               CMAC_CTX *cmac;
+#endif
+       };
+};
+
 /** OPENSSL crypto private session structure */
 struct openssl_session {
         enum openssl_chain_order chain_order;
@@ -168,11 +182,12 @@ struct openssl_session {
 
         uint16_t ctx_copies_len;
         /* < number of entries in ctx_copies */
-       EVP_CIPHER_CTX *qp_ctx[];
-       /**< Flexible array member of per-queue-pair pointers to copies of EVP
-        * context structure. Cipher contexts are not safe to use from multiple
-        * cores simultaneously, so maintaining these copies allows avoiding
-        * per-buffer copying into a temporary context.
+       struct evp_ctx_pair qp_ctx[];
+       /**< Flexible array member of per-queue-pair structures, each containing
+        * pointers to copies of the cipher and auth EVP contexts. Cipher
+        * contexts are not safe to use from multiple cores simultaneously, so
+        * maintaining these copies allows avoiding per-buffer copying into a
+        * temporary context.
          */
 } __rte_cache_aligned;
 
diff --git a/drivers/crypto/openssl/rte_openssl_pmd.c b/drivers/crypto/openssl/rte_openssl_pmd.c
index 7518ffa3be..101111e85b 100644
--- a/drivers/crypto/openssl/rte_openssl_pmd.c
+++ b/drivers/crypto/openssl/rte_openssl_pmd.c
@@ -894,40 +894,45 @@ openssl_set_session_parameters(struct openssl_session *sess,
 void
 openssl_reset_session(struct openssl_session *sess)
 {
+       /* Free all the qp_ctx entries. */
         for (uint16_t i = 0; i < sess->ctx_copies_len; i++) {
-               if (sess->qp_ctx[i] != NULL) {
-                       EVP_CIPHER_CTX_free(sess->qp_ctx[i]);
-                       sess->qp_ctx[i] = NULL;
+               if (sess->qp_ctx[i].cipher != NULL) {
+                       EVP_CIPHER_CTX_free(sess->qp_ctx[i].cipher);
+                       sess->qp_ctx[i].cipher = NULL;
+               }
+
+               switch (sess->auth.mode) {
+               case OPENSSL_AUTH_AS_AUTH:
+                       EVP_MD_CTX_destroy(sess->qp_ctx[i].auth);
+                       sess->qp_ctx[i].auth = NULL;
+                       break;
+               case OPENSSL_AUTH_AS_HMAC:
+                       free_hmac_ctx(sess->qp_ctx[i].hmac);
+                       sess->qp_ctx[i].hmac = NULL;
+                       break;
+               case OPENSSL_AUTH_AS_CMAC:
+                       free_cmac_ctx(sess->qp_ctx[i].cmac);
+                       sess->qp_ctx[i].cmac = NULL;
+                       break;
                 }
         }
 
         EVP_CIPHER_CTX_free(sess->cipher.ctx);
 
-       if (sess->chain_order == OPENSSL_CHAIN_CIPHER_BPI)
-               EVP_CIPHER_CTX_free(sess->cipher.bpi_ctx);
-
         switch (sess->auth.mode) {
         case OPENSSL_AUTH_AS_AUTH:
                 EVP_MD_CTX_destroy(sess->auth.auth.ctx);
                 break;
         case OPENSSL_AUTH_AS_HMAC:
-               EVP_PKEY_free(sess->auth.hmac.pkey);
-# if OPENSSL_VERSION_NUMBER >= 0x30000000L
-               EVP_MAC_CTX_free(sess->auth.hmac.ctx);
-# else
-               HMAC_CTX_free(sess->auth.hmac.ctx);
-# endif
+               free_hmac_ctx(sess->auth.hmac.ctx);
                 break;
         case OPENSSL_AUTH_AS_CMAC:
-# if OPENSSL_VERSION_NUMBER >= 0x30000000L
-               EVP_MAC_CTX_free(sess->auth.cmac.ctx);
-# else
-               CMAC_CTX_free(sess->auth.cmac.ctx);
-# endif
-               break;
-       default:
+               free_cmac_ctx(sess->auth.cmac.ctx);
                 break;
         }
+
+       if (sess->chain_order == OPENSSL_CHAIN_CIPHER_BPI)
+               EVP_CIPHER_CTX_free(sess->cipher.bpi_ctx);
 }
 
 /** Provide session for operation */
@@ -1469,6 +1474,9 @@ process_openssl_auth_mac(struct rte_mbuf *mbuf_src, uint8_t *dst, int offset,
         if (m == 0)
                 goto process_auth_err;
 
+       if (EVP_MAC_init(ctx, NULL, 0, NULL) <= 0)
+               goto process_auth_err;
+
         src = rte_pktmbuf_mtod_offset(m, uint8_t *, offset);
 
         l = rte_pktmbuf_data_len(m) - offset;
@@ -1495,11 +1503,9 @@ process_openssl_auth_mac(struct rte_mbuf *mbuf_src, uint8_t *dst, int offset,
         if (EVP_MAC_final(ctx, dst, &dstlen, DIGEST_LENGTH_MAX) != 1)
                 goto process_auth_err;
 
-       EVP_MAC_CTX_free(ctx);
         return 0;
 
 process_auth_err:
-       EVP_MAC_CTX_free(ctx);
         OPENSSL_LOG(ERR, "Process openssl auth failed");
         return -EINVAL;
 }
@@ -1618,7 +1624,7 @@ get_local_cipher_ctx(struct openssl_session *sess, struct openssl_qp *qp)
         if (sess->ctx_copies_len == 0)
                 return sess->cipher.ctx;
 
-       EVP_CIPHER_CTX **lctx = &sess->qp_ctx[qp->id];
+       EVP_CIPHER_CTX **lctx = &sess->qp_ctx[qp->id].cipher;
 
         if (unlikely(*lctx == NULL)) {
 #if OPENSSL_VERSION_NUMBER >= 0x30200000L
@@ -1645,6 +1651,112 @@ get_local_cipher_ctx(struct openssl_session *sess, struct openssl_qp *qp)
         return *lctx;
 }
 
+static inline EVP_MD_CTX *
+get_local_auth_ctx(struct openssl_session *sess, struct openssl_qp *qp)
+{
+       /* If the array is not being used, just return the main context. */
+       if (sess->ctx_copies_len == 0)
+               return sess->auth.auth.ctx;
+
+       EVP_MD_CTX **lctx = &sess->qp_ctx[qp->id].auth;
+
+       if (unlikely(*lctx == NULL)) {
+#if OPENSSL_VERSION_NUMBER >= 0x30100000L
+               /* EVP_MD_CTX_dup() added in OSSL 3.1 */
+               *lctx = EVP_MD_CTX_dup(sess->auth.auth.ctx);
+#else
+               *lctx = EVP_MD_CTX_new();
+               EVP_MD_CTX_copy(*lctx, sess->auth.auth.ctx);
+#endif
+       }
+
+       return *lctx;
+}
+
+#if OPENSSL_VERSION_NUMBER >=3D 0x30000000L
+static inline EVP_MAC_CTX *
+#else
+static inline HMAC_CTX *
+#endif
+get_local_hmac_ctx(struct openssl_session *sess, struct openssl_qp *qp)
+{
+#if (OPENSSL_VERSION_NUMBER >= 0x30000000L && OPENSSL_VERSION_NUMBER < 0x30003000L)
+       /* For OpenSSL versions 3.0.0 <= v < 3.0.3, re-initing of
+        * EVP_MAC_CTXs is broken, and doesn't actually reset their
+        * state. This was fixed in OSSL commit c9ddc5af5199 ("Avoid
+        * undefined behavior of provided macs on EVP_MAC
+        * reinitialization"). In cases where the fix is not present,
+        * fall back to duplicating the context every buffer as a
+        * workaround, at the cost of performance.
+        */
+       RTE_SET_USED(qp);
+       return EVP_MAC_CTX_dup(sess->auth.hmac.ctx);
+#else
+       if (sess->ctx_copies_len == 0)
+               return sess->auth.hmac.ctx;
+
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+       EVP_MAC_CTX **lctx =
+#else
+       HMAC_CTX **lctx =
+#endif
+               &sess->qp_ctx[qp->id].hmac;
+
+       if (unlikely(*lctx == NULL)) {
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+               *lctx = EVP_MAC_CTX_dup(sess->auth.hmac.ctx);
+#else
+               *lctx = HMAC_CTX_new();
+               HMAC_CTX_copy(*lctx, sess->auth.hmac.ctx);
+#endif
+       }
+
+       return *lctx;
+#endif
+}
+
+#if OPENSSL_VERSION_NUMBER >=3D 0x30000000L
+static inline EVP_MAC_CTX *
+#else
+static inline CMAC_CTX *
+#endif
+get_local_cmac_ctx(struct openssl_session *sess, struct openssl_qp *qp)
+{
+#if (OPENSSL_VERSION_NUMBER >= 0x30000000L && OPENSSL_VERSION_NUMBER < 0x30003000L)
+       /* For OpenSSL versions 3.0.0 <= v < 3.0.3, re-initing of
+        * EVP_MAC_CTXs is broken, and doesn't actually reset their
+        * state. This was fixed in OSSL commit c9ddc5af5199 ("Avoid
+        * undefined behavior of provided macs on EVP_MAC
+        * reinitialization"). In cases where the fix is not present,
+        * fall back to duplicating the context every buffer as a
+        * workaround, at the cost of performance.
+        */
+       RTE_SET_USED(qp);
+       return EVP_MAC_CTX_dup(sess->auth.cmac.ctx);
+#else
+       if (sess->ctx_copies_len == 0)
+               return sess->auth.cmac.ctx;
+
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+       EVP_MAC_CTX **lctx =
+#else
+       CMAC_CTX **lctx =
+#endif
+               &sess->qp_ctx[qp->id].cmac;
+
+       if (unlikely(*lctx == NULL)) {
+#if OPENSSL_VERSION_NUMBER >= 0x30000000L
+               *lctx = EVP_MAC_CTX_dup(sess->auth.cmac.ctx);
+#else
+               *lctx = CMAC_CTX_new();
+               CMAC_CTX_copy(*lctx, sess->auth.cmac.ctx);
+#endif
+       }
+
+       return *lctx;
+#endif
+}
+
 /** Process auth/cipher combined operation */
 static void
 process_openssl_combined_op(struct openssl_qp *qp, struct rte_crypto_op *op,
@@ -1893,42 +2005,40 @@ process_openssl_auth_op(struct openssl_qp *qp, struct rte_crypto_op *op,
 
         switch (sess->auth.mode) {
         case OPENSSL_AUTH_AS_AUTH:
-               ctx_a = EVP_MD_CTX_create();
-               EVP_MD_CTX_copy_ex(ctx_a, sess->auth.auth.ctx);
+               ctx_a = get_local_auth_ctx(sess, qp);
                 status = process_openssl_auth(mbuf_src, dst,
                                 op->sym->auth.data.offset, NULL, NULL, srclen,
                                 ctx_a, sess->auth.auth.evp_algo);
-               EVP_MD_CTX_destroy(ctx_a);
                 break;
         case OPENSSL_AUTH_AS_HMAC:
+               ctx_h = get_local_hmac_ctx(sess, qp);
 # if OPENSSL_VERSION_NUMBER >= 0x30000000L
-               ctx_h = EVP_MAC_CTX_dup(sess->auth.hmac.ctx);
                 status = process_openssl_auth_mac(mbuf_src, dst,
                                 op->sym->auth.data.offset, srclen,
                                 ctx_h);
 # else
-               ctx_h = HMAC_CTX_new();
-               HMAC_CTX_copy(ctx_h, sess->auth.hmac.ctx);
                 status = process_openssl_auth_hmac(mbuf_src, dst,
                                 op->sym->auth.data.offset, srclen,
                                 ctx_h);
-               HMAC_CTX_free(ctx_h);
 # endif
+#if (OPENSSL_VERSION_NUMBER >= 0x30000000L && OPENSSL_VERSION_NUMBER < 0x30003000L)
+               EVP_MAC_CTX_free(ctx_h);
+#endif
                 break;
         case OPENSSL_AUTH_AS_CMAC:
+               ctx_c = get_local_cmac_ctx(sess, qp);
 # if OPENSSL_VERSION_NUMBER >= 0x30000000L
-               ctx_c = EVP_MAC_CTX_dup(sess->auth.cmac.ctx);
                 status = process_openssl_auth_mac(mbuf_src, dst,
                                 op->sym->auth.data.offset, srclen,
                                 ctx_c);
 # else
-               ctx_c = CMAC_CTX_new();
-               CMAC_CTX_copy(ctx_c, sess->auth.cmac.ctx);
                 status = process_openssl_auth_cmac(mbuf_src, dst,
                                 op->sym->auth.data.offset, srclen,
                                 ctx_c);
-               CMAC_CTX_free(ctx_c);
 # endif
+#if (OPENSSL_VERSION_NUMBER >= 0x30000000L && OPENSSL_VERSION_NUMBER < 0x30003000L)
+               EVP_MAC_CTX_free(ctx_c);
+#endif
                 break;
         default:
                 status = -1;
diff --git a/drivers/crypto/openssl/rte_openssl_pmd_ops.c b/drivers/crypto/openssl/rte_openssl_pmd_ops.c
index 4209c6ab6f..1bbb855a59 100644
--- a/drivers/crypto/openssl/rte_openssl_pmd_ops.c
+++ b/drivers/crypto/openssl/rte_openssl_pmd_ops.c
@@ -805,7 +805,7 @@ openssl_pmd_sym_session_get_size(struct rte_cryptodev *dev)
                 unsigned int max_nb_qps = ((struct openssl_private *)
                                 dev->data->dev_private)->max_nb_qpairs;
                 return sizeof(struct openssl_session) +
-                               (sizeof(void *) * max_nb_qps);
+                               (sizeof(struct evp_ctx_pair) * max_nb_qps);
         }
 
         /*
@@ -818,10 +818,11 @@ openssl_pmd_sym_session_get_size(struct rte_cryptodev *dev)
 
         /*
          * Otherwise, the size of the flexible array member should be enough to
-        * fit pointers to per-qp contexts.
+        * fit pointers to per-qp contexts. This is twice the number of queue
+        * pairs, to allow for auth and cipher contexts.
          */
         return sizeof(struct openssl_session) +
-               (sizeof(void *) * dev->data->nb_queue_pairs);
+               (sizeof(struct evp_ctx_pair) * dev->data->nb_queue_pairs);
 }
 
 /** Returns the size of the asymmetric session structure */
--
2.34.1

--_000_CH3PR12MB865825ED984CC0D399C6E6DBA1862CH3PR12MB8658namp_--