From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 5DED7A04DC; Tue, 20 Oct 2020 00:04:46 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 2AC72ACDC; Tue, 20 Oct 2020 00:04:44 +0200 (CEST) Received: from EUR02-AM5-obe.outbound.protection.outlook.com (mail-eopbgr00085.outbound.protection.outlook.com [40.107.0.85]) by dpdk.org (Postfix) with ESMTP id 6D5E8ACBA for ; Tue, 20 Oct 2020 00:04:40 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=LJ2hX17rRGW+v3rlwv24hI5Py4tmF2Gr59t2PeQwaGY=; b=MPtkeBrxcRGEubfguoxFbdEEt/YT9EC2Z25SxvpDS05ETjHxKDLNCwB0OOZbFamrNp3n7O7+yXey77R5w2eT2t9fo38OXpUtHb2J/8bm3TUEXBgBVR/IENgpUIUrvF/+KZI84fSjqgEVjMyL8F2WNPOj0kSw9EeDhE+QF0a++Gw= Received: from AM0P190CA0007.EURP190.PROD.OUTLOOK.COM (2603:10a6:208:190::17) by DB7PR08MB3402.eurprd08.prod.outlook.com (2603:10a6:10:4f::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3477.25; Mon, 19 Oct 2020 22:04:36 +0000 Received: from AM5EUR03FT041.eop-EUR03.prod.protection.outlook.com (2603:10a6:208:190:cafe::90) by AM0P190CA0007.outlook.office365.com (2603:10a6:208:190::17) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3477.22 via Frontend Transport; Mon, 19 Oct 2020 22:04:36 +0000 X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 63.35.35.123) smtp.mailfrom=arm.com; dpdk.org; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com;dpdk.org; dmarc=pass action=none header.from=arm.com; Received-SPF: Pass (protection.outlook.com: domain of arm.com designates 63.35.35.123 as permitted sender) receiver=protection.outlook.com; client-ip=63.35.35.123; helo=64aa7808-outbound-1.mta.getcheckrecipient.com; Received: from 64aa7808-outbound-1.mta.getcheckrecipient.com (63.35.35.123) by AM5EUR03FT041.mail.protection.outlook.com (10.152.17.186) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3477.21 via Frontend Transport; Mon, 19 Oct 2020 22:04:36 +0000 Received: ("Tessian outbound ba2270a55485:v64"); Mon, 19 Oct 2020 22:04:36 +0000 X-CR-MTA-TID: 64aa7808 Received: from 3a24dc54664f.1 by 64aa7808-outbound-1.mta.getcheckrecipient.com id 336E59B7-93A1-4DCB-B61B-A628BA59A578.1; Mon, 19 Oct 2020 22:04:31 +0000 Received: from EUR04-DB3-obe.outbound.protection.outlook.com by 64aa7808-outbound-1.mta.getcheckrecipient.com with ESMTPS id 3a24dc54664f.1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384); Mon, 19 Oct 2020 22:04:31 +0000 ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=e5S3QZiuBwwwm0oAAY4xOgb8FOifgShK4svcj+GMbcpS2vtw4pb95mg9rXNYSxA1SWBSkiZ4rP879ZodXvbmwplvmkZ5gGpD4U/t7qn9I0DUBonBO2uas+xxvtjMHSUUwoc2WLouQw9vzTlc3tdjue5RJzfzyandapqmapBN5fb60ESwH/BC0Ll9qdidzLlFExNge5BazLiqggYmI2R7gsLYhfzTpoaNotnTwNH+mVSEAb6baSZoPuef8Hs5n5d2WXu+fC9+tIyGYHElmJkkgyIuzrkJIpQvNLeqPDI424uY1BZXdIIFd9k2+JmgW6vIT8xS/K44JZg4Xh6/LtDHuw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=LJ2hX17rRGW+v3rlwv24hI5Py4tmF2Gr59t2PeQwaGY=; b=aX4AkCDg0Joz3gRdrj5C1xgX9CxAZLaUxTMMGUNnZG6DHDPsss6vBa1FpyBLhf6mzw7jSOd1jt9hWDRDQOgAgWi+rXB8vGBXzeOceF7EWzo+OTdIPnHRrrSSxUdHcx/0QLKx4iPKKyQ08T9Rk5NfdPa+IcIyXD44aHEW21+erkrvRGrdG2jw4M7AZDnp7y0+ZRVCt22oHRPeFYFf/AnVCyU/tVX3UihbrvWiIZV23mYDOaRvpWFppmetAEJL0z/Q1KhrnKWZGRpQJol2KVJcqkXhaOyhEzWPjInIZyii2fR/EYk2g5Pa4lK75WqsXOrzJtZG5P+RWl1GaaNuR+V3mA== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=armh.onmicrosoft.com; s=selector2-armh-onmicrosoft-com; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=LJ2hX17rRGW+v3rlwv24hI5Py4tmF2Gr59t2PeQwaGY=; b=MPtkeBrxcRGEubfguoxFbdEEt/YT9EC2Z25SxvpDS05ETjHxKDLNCwB0OOZbFamrNp3n7O7+yXey77R5w2eT2t9fo38OXpUtHb2J/8bm3TUEXBgBVR/IENgpUIUrvF/+KZI84fSjqgEVjMyL8F2WNPOj0kSw9EeDhE+QF0a++Gw= Received: from DBAPR08MB5814.eurprd08.prod.outlook.com (2603:10a6:10:1b1::6) by DB6PR08MB2776.eurprd08.prod.outlook.com (2603:10a6:6:1d::15) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3477.21; Mon, 19 Oct 2020 22:04:27 +0000 Received: from DBAPR08MB5814.eurprd08.prod.outlook.com ([fe80::7814:9c1:781f:475d]) by DBAPR08MB5814.eurprd08.prod.outlook.com ([fe80::7814:9c1:781f:475d%4]) with mapi id 15.20.3477.028; Mon, 19 Oct 2020 22:04:26 +0000 From: Honnappa Nagarahalli To: "Ananyev, Konstantin" , "Gujjar, Abhinandan S" , "dev@dpdk.org" , "Doherty, Declan" CC: "jerinj@marvell.com" , "Akhil.goyal@nxp.com" , "Vangati, Narender" , nd , Honnappa Nagarahalli , nd Thread-Topic: [dpdk-dev] [v2 1/2] cryptodev: support enqueue callback functions Thread-Index: AQHWhsTNpueHJn2Z1E+jIJevuNB9zqlrRWzAgAAl9gCAAW0pMIAGJnUAgAAH/lCAAp3uAIAAgn/ggADASgCAASzNQIAHHMUAgAA/FzCADrUggIABX7xwgAAmGoCAA4qBcIAAg8GAgABouoCAC5bYoA== Date: Mon, 19 Oct 2020 22:04:26 +0000 Message-ID: References: <1599549024-195051-1-git-send-email-abhinandan.gujjar@intel.com> In-Reply-To: Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-ts-tracking-id: E8F4E3A51D20294F90FDD66412948799.0 x-checkrecipientchecked: true Authentication-Results-Original: intel.com; dkim=none (message not signed) header.d=none;intel.com; dmarc=none action=none header.from=arm.com; x-originating-ip: [107.77.220.130] x-ms-publictraffictype: Email X-MS-Office365-Filtering-HT: Tenant X-MS-Office365-Filtering-Correlation-Id: dda1fa1c-7bff-4148-4358-08d8747af7f6 x-ms-traffictypediagnostic: DB6PR08MB2776:|DB7PR08MB3402: x-ld-processed: f34e5979-57d9-4aaa-ad4d-b122a662184d,ExtAddr x-ms-exchange-transport-forked: True X-Microsoft-Antispam-PRVS: x-checkrecipientrouted: true nodisclaimer: true x-ms-oob-tlc-oobclassifiers: OLM:10000;OLM:10000; X-MS-Exchange-SenderADCheck: 1 X-Microsoft-Antispam-Untrusted: BCL:0; X-Microsoft-Antispam-Message-Info-Original: iui4mKdAX64lFE85lUKoy2Y4WeqQN4wq5CJl1wA9Y1MXqD57Up49BTaQal8mPtyZw6wpl1MQqIjHGNaihfT2DmFzpO+eh9mGPmY+IOF+eNBptm0ON2Mf3aIlrHU7vB7i/hpA09fL9/WYj+Ji/l2NuCl0s8Nf9uHGC6PLc8S2CZIpu+bFOKvwn7XC6Zkb835bSCUSFlJJuJvZv/qf3shJ/jy/O6bs8J6dgvZEl1VQt8rGU5gdXkZAsrhcwXlaa8unfvWR3hAi5RRPHhDuIHHEAoBaHXaWhZ6/7XKS9mDhUN1O7B7E+eFWP3L/tCop1vYIzUcVlZJLrfWtxAqPtVRxL4sMRR7zStwFwVwBpqksMa57IBbW8xWa+CStukAfzxjs9r8oEntg/As3M3LdI8iuRg== X-Forefront-Antispam-Report-Untrusted: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:; IPV:NLI; SFV:NSPM; H:DBAPR08MB5814.eurprd08.prod.outlook.com; PTR:; CAT:NONE; SFS:(4636009)(366004)(136003)(39860400002)(376002)(346002)(396003)(9686003)(66946007)(4326008)(52536014)(6506007)(53546011)(110136005)(30864003)(54906003)(83380400001)(76116006)(7696005)(33656002)(71200400001)(86362001)(186003)(55016002)(5660300002)(316002)(478600001)(8676002)(66556008)(64756008)(66476007)(66446008)(8936002)(26005)(2906002)(966005)(559001)(579004); DIR:OUT; SFP:1101; x-ms-exchange-antispam-messagedata: HQK4KudEKMfjTcvmDTeaoxJWRWvCew0Hjpy6MLsWBm1r1hGctZMPqtW2KAbv3p4GLNr2x8ZfAK7smltcPJOrlg+btmFPV79wewgqnY0dWwR06q6bZX6/NsGzh1YO84Gg+/1A0QkL3AwFosvSSZp/b+3YPJLeESryOY5RPKIJLb/eKZqydNx4IV1lsc67J5FB6m26Tj+dI7ZpMjzBFZMxHUZwu+73zwOEF0sl1vOAK2sZ1BQ9pXSltBLvfeg1VF8+VikHLf0X7LScQs6glrFYxXF/a68pkRrg28XSHB8lVu7/5YxfGcUwE0I7ScSbRm2duvmb/wIqIDW21x9PsBs19efQ1cNlHHF9RRFJk6aPIzfU+svN/8I5+7ov9SuefDB6mFqNwUCpaLML/G2L+YHV/UayywtT3JkxqMQZwSSDnkjCNLWUUFoTUBUaFjEyOkCA862RVPIheIHgB/YsKCGKQq7oIUma9pGPaCyuMMuKkJ1qD+Hh33V/QMaW7iVSMvXi038/7omnDUV4TAkcQIsJQdzBdZo5zRMIv8Y2mvnLbRd5cmd8OpqtyPz8AFzjGjMvrBhYAiyZINm4E0ZUhgBOpxPuRN/pskkKVX8qGixB28f9cyKck3UcXKJO26z62E0p4UbRUJoMzHtorCPVmCOaGg== Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB6PR08MB2776 Original-Authentication-Results: intel.com; dkim=none (message not signed) header.d=none;intel.com; dmarc=none action=none header.from=arm.com; X-EOPAttributedMessage: 0 X-MS-Exchange-Transport-CrossTenantHeadersStripped: AM5EUR03FT041.eop-EUR03.prod.protection.outlook.com X-MS-Office365-Filtering-Correlation-Id-Prvs: 9ded8831-44de-451c-883c-08d8747af20e X-Microsoft-Antispam: BCL:0; X-Microsoft-Antispam-Message-Info: bE7nhLnYjgQB2YxLSra0dXRNVQEY5zn/4D4weUmnn4V9fU4RT558OVFt1CrSmP0e6cZntk1k1ZyhF3V1A4msOgAtHu5SdaDQYlpYNjt98YZaUQnDaA3crg3z/Qg5SH7zpoplSa/1qjmDiZl7cZxAno8N0oUGf4oxaDeJpYgp8i9VjBPJcr32+PRP818XR6tJ/dopavT/6IvjECa55e5Aq7np8Q/EPRhq9DajOSOZZZut7liK6S9ar3AlwclrF/onjSLAMrFZxG3DoQ7kFHA/ZDjHT7367SyjfCMvHUuiEkW7B0Z7PS943dQOLzaotxex0os5oYCXEn0RPxgRGBSLo8olUWlsTkB6c5J7je6A4HMR/mEn1J1t8EBtUZGBSVxvMZ6/cZ2U64ri8r6hRCV7BpZ17MAqIJ8NyjfwSvdaAyln1NqqkpfCZm1MvZ7OobI2d04Y2J+OV+G7+T76r8ve9QeHSC4wqdJLN8z/sl5cYXo= X-Forefront-Antispam-Report: CIP:63.35.35.123; CTRY:IE; LANG:en; SCL:1; SRV:; IPV:CAL; SFV:NSPM; H:64aa7808-outbound-1.mta.getcheckrecipient.com; PTR:ec2-63-35-35-123.eu-west-1.compute.amazonaws.com; CAT:NONE; SFS:(4636009)(396003)(39860400002)(136003)(376002)(346002)(46966005)(966005)(30864003)(186003)(316002)(36906005)(478600001)(83380400001)(47076004)(82740400003)(9686003)(7696005)(8936002)(356005)(8676002)(6506007)(53546011)(81166007)(70206006)(26005)(2906002)(82310400003)(55016002)(52536014)(70586007)(86362001)(33656002)(336012)(5660300002)(54906003)(110136005)(4326008); DIR:OUT; SFP:1101; X-OriginatorOrg: arm.com X-MS-Exchange-CrossTenant-OriginalArrivalTime: 19 Oct 2020 22:04:36.7085 (UTC) X-MS-Exchange-CrossTenant-Network-Message-Id: dda1fa1c-7bff-4148-4358-08d8747af7f6 X-MS-Exchange-CrossTenant-Id: f34e5979-57d9-4aaa-ad4d-b122a662184d X-MS-Exchange-CrossTenant-OriginalAttributedTenantConnectingIp: TenantId=f34e5979-57d9-4aaa-ad4d-b122a662184d; Ip=[63.35.35.123]; Helo=[64aa7808-outbound-1.mta.getcheckrecipient.com] X-MS-Exchange-CrossTenant-AuthSource: AM5EUR03FT041.eop-EUR03.prod.protection.outlook.com X-MS-Exchange-CrossTenant-AuthAs: Anonymous X-MS-Exchange-CrossTenant-FromEntityHeader: HybridOnPrem X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB7PR08MB3402 Subject: Re: [dpdk-dev] [v2 1/2] cryptodev: support enqueue callback functions X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" > > > > > Hi Abhinandan, > > > > > > > > > > > > > > > > > Hi Konstantin & Honnappa, > > > > > > > > > > > > Thanks for all the inputs and feedback. > > > > > > > > > > > > @Ananyev, Konstantin, > > > > > > I have measured the perf with and without callback on xeon. > > > > > > Here are the > > > > > numbers: > > > > > > > > > > > > ./app/dpdk-test-crypto-perf -l 6-7 > > > > > > --vdev=3D"crypto_openssl0,socket_id=3D0,max_nb_sessions=3D128" = -- > > > > > > --ptest throughput --devtype crypto_openssl --optype > > > > > > cipher-then-auth --cipher-algo aes-cbc --cipher-op encrypt > > > > > > --cipher-key-sz 16 --auth-algo sha1-hmac --auth-op generate > > > > > > --auth-key-sz 64 --digest-sz > > > > > > 12 --total-ops 10000000 --burst-sz 32 --buffer-sz 64 > > > > > > > > > > > > With callback(+ RCU - totally opaque to data-plane threads) > > > > > > lcore id Buf Size Burst Size Enqueued Dequeued F= ailed Enq > > > Failed > > > > > Deq MOps Gbps Cycles/Buf > > > > > > 7 64 32 10000000 = 10000000 0 0 > > > > > 0.8129 0.4162 2694.09 > > > > > > 7 64 32 10000000 = 10000000 0 0 > > > > > 0.8152 0.4174 2686.31 > > > > > > 7 64 32 10000000 = 10000000 0 0 > > > > > 0.8198 0.4197 2671.48 > > > > > > > > > > > > Without callback: > > > > > > lcore id Buf Size Burst Size Enqueued Dequeued F= ailed Enq > > > Failed > > > > > Deq MOps Gbps Cycles/Buf > > > > > > > > > > > > 7 64 32 10000000 = 10000000 0 0 > > > > > 0.8234 0.4216 2659.81 > > > > > > 7 64 32 10000000 = 10000000 0 0 > > > > > 0.8247 0.4222 2655.63 > > > > > > 7 64 32 10000000 = 10000000 0 0 > > > > > 0.8123 0.4159 2695.90 > > > > > > > > > > > > > > > Just to cofirm: > > > > > You implemented crypto enqueue callbacks using RCU QSBR online > > > > > /offline (as suggested below) and numbers above are for: > > > > > 1) callback code in place and some dummy callback installed > > > > That's right. (+ RCU calling online + offline APIs inside > > > > rte_cryptodev_enqueue_burst()) > > > > > 2) callback code in place but no callbacks installed > > > > No callback code. i.e. Original code. > > > > > > Ok, and if I get things right - difference between mean values is ~15= cycles? > > Yes. May be, number are more stable on isolated core. Let's consider wo= rst > case too. >=20 > Ok. >=20 > > > That's seems like very good result to me. > > > Can I suggest to run one more test for your new callback code in > > > place, but no actual callbacks installed? > > lcore id Buf Size Burst Size Enqueued Dequeued Failed En= q Failed Deq > MOps Gbps Cycles/Buf > > > > 7 64 32 10000000 10000000 = 0 0 0.8220 > 0.4209 2664.12 > > 7 64 32 10000000 10000000 = 0 0 0.8245 > 0.4221 2656.14 > > 7 64 32 10000000 10000000 = 0 0 0.8261 > 0.4229 2651.15 >=20 > So, if I can read numbers properly for not-armed callback impact is negle= ctable. > It is hard to say much without seeing the actual code, but from the numbe= rs > above, I think it is a good result and we can go ahead with that approach= . > Honnappa, Akhil, Jerin do you have any objections to such approach in pri= nciple? The numbers look good. I guess this needs to be tested on Arm platforms as = well. It would be good to get the next version of the patch (along with the= test case), others can test from there. > Konstantin >=20 > > > Thanks > > > Konstantin > > > > > > > > > > > > > Is my understanding correct here? > > > > > Thanks > > > > > Konstantin > > > > > > > > > > > > > > > > > > > > > > Regards > > > > > > Abhinandan > > > > > > > > > > > > > -----Original Message----- > > > > > > > From: Ananyev, Konstantin > > > > > > > Sent: Tuesday, September 29, 2020 2:33 PM > > > > > > > To: Honnappa Nagarahalli ; > > > > > > > Gujjar, Abhinandan S ; > > > > > > > dev@dpdk.org; Doherty, Declan > > > > > > > Cc: jerinj@marvell.com; Akhil.goyal@nxp.com; Vangati, > > > > > > > Narender ; nd ; nd > > > > > > > > > > > > > > Subject: RE: [dpdk-dev] [v2 1/2] cryptodev: support enqueue > > > > > > > callback functions > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > +#ifdef RTE_CRYPTODEV_CALLBACKS int > > > > > > > > > > > > > > > > > > +rte_cryptodev_rcu_qsbr_add(uint8_t > > > > > > > > > > > > > > > > > > +dev_id, struct rte_rcu_qsbr > > > > > > > > > > > > > > > > > > +*qsbr) { > > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > > + struct rte_cryptodev *dev; > > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > > + if (!rte_cryptodev_pmd_is_valid_dev(d= ev_id)) { > > > > > > > > > > > > > > > > > > + CDEV_LOG_ERR("Invalid dev_id=3D%" > > > > > PRIu8, > > > > > > > > > dev_id); > > > > > > > > > > > > > > > > > > + return -EINVAL; > > > > > > > > > > > > > > > > > > + } > > > > > > > > > > > > > > > > > > + > > > > > > > > > > > > > > > > > > + dev =3D &rte_crypto_devices[dev_id]; > > > > > > > > > > > > > > > > > > + dev->qsbr =3D qsbr; > > > > > > > > > > > > > > > > > > + return 0; } > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > So if I understand your patch correctly > > > > > > > > > > > > > > > > > you propose a new working model for > > > > > > > > > > > > > > > > > crypto-devs: > > > > > > > > > > > > > > > > > 1. Control-plane has to allocate/setup > > > > > > > > > > > > > > > > > rcu_qsbr and do rte_cryptodev_rcu_qsbr_ad= d(). > > > > > > > > > > > > > > > > > 2. Data-plane has somehow to obtain > > > > > > > > > > > > > > > > > pointer to that rcu_qsbr and wrap > > > > > > > > > > > > > > > > > cryptodev_enqueue() > > > > > > > > > > > > > > > > > with rcu_qsbr_quiescent() or > > > > > > > > > > > rcu_qsbr_online()/rcu_qsbr_offline(). > > > > > > > > > > > > > > > > Yes. I think, it is not a new model. It is > > > > > > > > > > > > > > > > same as RCU integration with > > > > > > > > > > > > > LPM. > > > > > > > > > > > > > > > > Please refer: > > > > > > > > > > > > > > > > https://patches.dpdk.org/cover/73673/ > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I am talking about new working model for > > > > > > > > > > > > > > > crypto-dev > > > > > > > > > > > enqueue/dequeue. > > > > > > > > > > > > > > > As I said above now it becomes data-plane > > > > > > > > > > > > > > > thread responsibility > > > > > > > to: > > > > > > > > > > > > > > > -somehow to obtain pointer to that rcu_qsbr > > > > > > > > > > > > > > > for each cryptodev it is > > > > > > > > > > > > > using. > > > > > > > > > > > > > > > -call rcu sync functions > > > > > > > > > > > > > > > (quiescent/online/offline) on a regular > > > > > > > > > basis. > > > > > > > > > > > > > > It is not on regular basis. When data plane > > > > > > > > > > > > > > comes up, they report > > > > > > > > > online. > > > > > > > > > > > > > > They report quiescent when they are done with > > > > > > > > > > > > > > critical section or shared > > > > > > > > > > > > > structure. > > > > > > > > > > > > > > > > > > > > > > > > > > I understand that, but it means all existing > > > > > > > > > > > > > apps have to be changed that > > > > > > > > > > > way. > > > > > > > > > > > > > > > > > > > > > > > > > > > All though, there is some dataplane changes > > > > > > > > > > > > > > involved here, I don't think, it > > > > > > > > > > > > > is major. > > > > > > > > > > > > > > > > > > > > > > > > > > I still think our goal here should be to make no > > > > > > > > > > > > > visible changes to the dataplane. > > > > > > > > > > > > > I.E. all necessary data-plane changes need to be > > > > > > > > > > > > > hidden inside CB invocation part. > > > > > > > > > > > > Please note that this is being implemented using > > > > > > > > > > > > the memory reclamation framework documented at > > > > > > > > > > > > https://doc.dpdk.org/guides/prog_guide/rcu_lib.htm > > > > > > > > > > > > l#re > > > > > > > > > > > > sour > > > > > > > > > > > > ce-r > > > > > > > > > > > > ecla > > > > > > > > > > > > mati > > > > > > > > > > > > on-framework-for-dpdk > > > > > > > > > > > > > > > > > > > > > > > > While using RCU there are couple of trade-offs > > > > > > > > > > > > that applications have to > > > > > > > > > > > consider: > > > > > > > > > > > > 1) Performance - reporting the quiescent state too > > > > > > > > > > > > often results in performance impact on data plane > > > > > > > > > > > > 2) Amount of outstanding memory to reclaim - > > > > > > > > > > > > reporting less often results in more outstanding > > > > > > > > > > > > memory to reclaim > > > > > > > > > > > > > > > > > > > > > > > > Hence, the quiescent state reporting is left to the= application. > > > > > > > > > > > > The application decides how often it reports the > > > > > > > > > > > > quiescent state and has control > > > > > > > > > > > over the data plane performance and the outstanding > > > > > > > > > > > memory to > > > > > > > reclaim. > > > > > > > > > > > > > > > > > > > > > > > > When you say "new working model for crypto-dev > > > > > > > > > > > > enqueue/dequeue", > > > > > > > > > > > > > > > > > > > > > > > > 1) are you comparing these with existing > > > > > > > > > > > > crypto-dev enqueue/dequeue > > > > > > > > > > > APIs? If yes, these are new APIs, it is not breaking = anything. > > > > > > > > > > > > 2) are you comparing these with existing call back > > > > > > > > > > > > functions in ethdev enqueue/dequeue APIs? If yes, > > > > > > > > > > > > agree that this is a new model. But, it is > > > > > > > > > > > possible to support what ethdev supports along with > > > > > > > > > > > the RCU method used in this patch. > > > > > > > > > > > > > > > > > > > > > > What I am talking about: > > > > > > > > > > > > > > > > > > > > > > Existing cryptodev enqueue/dequeue model doesn't > > > > > > > > > > > require for the user to manage any RCU QSBR state man= ually. > > > > > > > > > > > I believe that addition of ability to add/remove > > > > > > > > > > > enqueue/dequeue callbacks shouldn't change existing > > > > > > > > > > > working > > > > > model. > > > > > > > > > > > I think that adding/removing such callbacks has to > > > > > > > > > > > be opaque to the user DP code and shouldn't require > > > > > > > > > > > user to change > > > it. > > > > > > > > > > > Same as we have now for ethdev callback implementatio= n. > > > > > > > > > > The ethdev callback implementation conveniently leaves > > > > > > > > > > the problem of > > > > > > > > > freeing memory to the user to resolve, it does not handle= the > issue. > > > > > > > > > > Hence, it "looks" to be opaque to the DP code. > > > > > > > > > > However, if the application has to implement a safe > > > > > > > > > > way to free the call back memory, its > > > > > > > > > DP is affected based on call backs are being used or not. > > > > > > > > > > > > > > > > > > Yes, I think that's big drawback in initial ethdev > > > > > > > > > callback implementation - it simply ignores DP/CP sync pr= oblem > completely. > > > > > > > > > Though I think it is possible to have both here: > > > > > > > > > keep callback "opaque" to DP code and provide some sync > > > > > > > > > mechanism between DP/CP. > > > > > > > > > Hopefully one day we can fix ethdev callbacks too. > > > > > > > > The solution we develop can be used in ethdev too. > > > > > > > > > > > > > > > > > > > > > > > > > > > > I think that forcing DP code to be aware that > > > > > > > > > > > callbacks are present or not and to modify its > > > > > > > > > > > behaviour depending on that nearly voids the purpose = of > having callbacks at all. > > > > > > > > > > > In that case DP can just invoke callback function > > > > > > > > > > > directly from it's > > > > > > > > > codepath . > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Note that now data-plane thread would have > > > > > > > > > > > > > > > to do that always > > > > > > > > > > > > > > > - even if there are now callbacks installed > > > > > > > > > > > > > > > for that cryptodev queue > > > > > > > > > > > right now. > > > > > > > > > > > > > > > All that changes behaviour of existing apps > > > > > > > > > > > > > > > and I presume would reduce adoption of that = fature. > > > > > > > > > > > > If I understand this correct, you are talking > > > > > > > > > > > > about a case where in the application might be > > > > > > > > > > > > registering/unregistering multiple times during > > > > > > > > > > > > its lifetime. In this case, yes, the application > > > > > > > > > > > > might be reporting the > > > > > > > > > > > quiescent state even when it has not registered the c= all backs. > > > > > > > > > > > But, it has the flexibility to not report it if it > > > > > > > > > > > implements additional > > > > > logic. > > > > > > > > > > > > Note that we are assuming that the application has > > > > > > > > > > > > to report quiescent state only for using callback f= unctions. > > > > > > > > > > > > Most probably the application has > > > > > > > > > > > other requirements to use RCU. > > > > > > > > > > > > Why not support what is done for ethdev call back > > > > > > > > > > > > functions along with > > > > > > > > > > > providing RCU method? > > > > > > > > > > > > > > > > > > > > > > > > > > There is always trade off involved! > > > > > > > > > > > > > > In the previous patch, you suggested that some > > > > > > > > > > > > > > lazy app may not free up the memory allocated b= y add cb. > > > > > > > > > > > > > > For such apps, this patch has sync mechanism > > > > > > > > > > > > > > with some additional cost of CP & DP > > > > > > > > > changes. > > > > > > > > > > > > > > > > > > > > > > > > > > Sigh, it is not about laziness of the app. > > > > > > > > > > > > > The problem with current ethedev cb mechanism > > > > > > > > > > > > > and yours > > > > > > > > > > > > > V1 (which was just a clone of it) - CP doesn't > > > > > > > > > > > > > know when it is safe after CB removal to free rel= ated > memory. > > > > > > > > > > > > > > > > > > > > > > > > > > > > I still think all this callback mechanism > > > > > > > > > > > > > > > should be totally opaque to data-plane > > > > > > > > > > > > > > > threads - user shouldn't change his app code > > > > > > > > > > > > > > > depending on would some enqueue/dequeue > > > > > > > > > > > > > > > callbacks be > > > > > > > > > > > installed or not. > > > > > > > > > > > > > > I am not sure, how that can be implemented > > > > > > > > > > > > > > with existing RCU > > > > > > > > > design. > > > > > > > > > > > > > > > > > > > > > > > > > > As I said below the simplest way - with calling > > > > > > > > > > > > > rcu onine/offline inside CB invocation block. > > > > > > > > > > > > > That's why I asked you - did you try that > > > > > > > > > > > > > approach and what is the perf numbers? > > > > > > > > > > > > > I presume with no callbacks installed the perf > > > > > > > > > > > > > change should be nearly > > > > > > > > > > > zero. > > > > > > > > > > > > > > > > > > > > > > > > > > > @Honnappa Nagarahalli, Do you have any suggesti= ons? > > > > > > > > > > > > Reporting quiescent state in the call back > > > > > > > > > > > > functions has several > > > > > > > > > > > disadvantages: > > > > > > > > > > > > 1) it will have performance impacts and the > > > > > > > > > > > > impacts will increase as the > > > > > > > > > > > number of data plane threads increase. > > > > > > > > > > > > 2) It will require additional configuration > > > > > > > > > > > > parameters to control how often the quiescent > > > > > > > > > > > > state is reported to control the performance > > > > > > > > > impact. > > > > > > > > > > > > 3) Does not take advantage of the fact that most > > > > > > > > > > > > probably the application is using RCU already > > > > > > > > > > > > 4) There are few difficulties as well, please see b= elow. > > > > > > > > > > > > > > > > > > > > > > I suggested Abhinandan to use RCU library because it > > > > > > > > > > > is already there, and I thought it would be good not > > > > > > > > > > > to re-implement > > > > > the wheel. > > > > > > > > > > > Though if you feel librte_rcu doesn't match that > > > > > > > > > > > task - fine, let's do it without librte_rcu. > > > > > > > > > > > After all, what we need here - just an atomic ref > > > > > > > > > > > count per queue that we are going to increment at > > > > > > > > > > > entering and leaving list of callbacks inside enqueue= /dequeue. > > > > > > > > > > Ok, looks like I missed the point that a queue is used > > > > > > > > > > by a single data plane > > > > > > > > > thread. > > > > > > > > > > Along with ref count increment you need the memory > > > > > > > > > > orderings to avoid > > > > > > > > > race conditions. These will be the same ones used in RCU. > > > > > > > > > > On the control plane, you need to read this counter > > > > > > > > > > and poll for the > > > > > > > > > counter updates. All this is same cost as in RCU. > > > > > > > > > > > > > > > > > > Agree. > > > > > > > > > > > > > > > > > > > To control the cost, you will have to control the rate > > > > > > > > > > of quiescent state reporting and might have to > > > > > > > > > expose this as a configuration parameter. > > > > > > > > > > > > > > > > > > > > The other important information you have to consider > > > > > > > > > > is if the thread is making any blocking calls, which > > > > > > > > > > may be in some other library. The thread is supposed > > > > > > > > > > to call rcu_qsbr_thread_offline API before calling a > > > > > > > > > blocking call. This allows the RCU to know that this > > > > > > > > > particular thread will not report quiescent state. The > > > > > > > > > cryptodev library might not have > > > > > > > that information. > > > > > > > > > > > > > > > > > > > > If you want to go ahead with this design, you can > > > > > > > > > > still use RCU with single thread configuration (like > > > > > > > > > > you have mentioned > > > > > > > > > > below) and hide the > > > > > > > > > details from the application. > > > > > > > > > > > > > > > > > > Yes, same thought here - use rcu_qsbr online/offline > > > > > > > > > for DP part and hide actual sync details inside callback = code. > > > > > > > > We can give it a try. If we can have the performance > > > > > > > > numbers, we can decide about how to control how often > > > > > > > > these APIs are called on the data > > > > > > > plane. > > > > > > > > > > > > > > To avoid misunderstanding: I am talking about calling > > > > > > > online/offline with every > > > > > > > cryptodev_enqueue() traversal over CB list. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > That seems quite a big change and I > > > > > > > > > > > > > > > > > don't think it is acceptable for most use= rs. > > > > > > > > > > > > > > > > > From my perspective adding/installing > > > > > > > > > > > > > > > > > call-backs to the dev has to be opaque > > > > > > > > > > > > > > > > > to the data- > > > plane code. > > > > > > > > > > > > > > > > > Also note that different callbacks can > > > > > > > > > > > > > > > > > be installed by different entities > > > > > > > > > > > > > > > > > (libs) and might have no idea about each > > > > > > > > > other. > > > > > > > > > > > > > > > > > That's why I thought it would be better > > > > > > > > > > > > > > > > > to make all this RCU stuff internal insid= e cryptodev: > > > > > > > > > > > > > > > > > hide all this rcu_qsbr > > > > > > > > > > > > > > > > > allocation/setup inside cryptod somehow > > > > > > > > > > > > > > > > > to > > > > > > > > > > > > > > > obtain pointer to that rcu_qsbr ev > > > > > > > > > > > > > > > init/queue setup > > > > > > > > > > > > > > > > > invoke > > > > > > > > > > > > > > > > > rcu_qsbr_online()/rcu_qsbr_offline() > > > > > > > > > > > > > > > > > inside > > > > > > > > > > > > > > > cryptodev_enqueue(). > > > > > > > > > > > > This will bring in the application related > > > > > > > > > > > > information such as the thread ID > > > > > > > > > > > into the library. > > > > > > > > > > > > > > > > > > > > > > I don't think it would. > > > > > > > > > > > Cryptodev enqueue/dequeue functions are not supposed > > > > > > > > > > > to be thread safe (same as rx/tx burst). > > > > > > > > > > > So we can always use RCU with just one thread(thread_= id =3D 0). > > > > > > > > > > Agree, the memory that needs to be freed is accessed > > > > > > > > > > by a single thread > > > > > > > > > on the data plane. RCU with one thread would suffice. > > > > > > > > > > > > > > > > > > > > > But as I said above - if you feel RCU lib is an > > > > > > > > > > > overhead here, that's fine - I think it would be > > > > > > > > > > > easy enough to do without > > > > > librte_rcu. > > > > > > > > > > > > > > > > > > > > > > > If the same API calls are being made from multiple > > > > > > > > > > > > data plane threads, you need a way to configure > > > > > > > > > > > > that information to the library. So, it is better > > > > > > > > > > > > to leave those details for the application to > > > > > > > handle. > > > > > > > > > > > > > > > > > > > > > > > > > > > > I have already tried exploring above stuffs= . > > > > > > > > > > > > > > > > There are too many > > > > > > > > > > > > > constraints. > > > > > > > > > > > > > > > > The changes don't fit in, as per RCU design= . > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Hmm could you be more specific here - what > > > > > > > > > > > > > > > constraints are you referring to? > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > Moreover, having rcu api under > > > > > > > > > > > > > > > > enqueue_burst() will affect the > > > > > > > > > > > > > > > performance. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > It most likely will. Though my expectation > > > > > > > > > > > > > > > it will affect performance only when some > > > > > > > > > > > > > > > callbacks are installed. My thought > > > > > > > > > > > here: > > > > > > > > > > > > > > > callback function by itself will affect > > > > > > > > > > > > > > > cryptdev_enqueue performance anyway, > > > > > > > > > > > > > > With existing callback design, I have measured > > > > > > > > > > > > > > the performance(with > > > > > > > > > > > > > crypto perf test) on xeon. > > > > > > > > > > > > > > It was almost negligible and same was shared wi= th > Declan. > > > > > > > > > > > > > > > > > > > > > > > > > > I am asking about different thing: did you try > > > > > > > > > > > > > alternate approach I described, that wouldn't > > > > > > > > > > > > > require changes in the user data- > > > > > > > > > plane code. > > > > > > > > > > > > > > > > > > > > > > > > > > > That is one of the reasons, I didn't want to > > > > > > > > > > > > > > add to many stuffs in to the > > > > > > > > > > > > > callback. > > > > > > > > > > > > > > The best part of existing design is crypto lib > > > > > > > > > > > > > > is not much > > > > > modified. > > > > > > > > > > > > > > The changes are either pushed to CP or DP. > > > > > > > > > > > > > > > > > > > > > > > > > > > > so adding extra overhead for sync is probably o= k here. > > > > > > > > > > > > > > > > > > > > > > > > > > I think that extra overhead when callbacks are > > > > > > > > > > > > > present is expected and probably acceptable. > > > > > > > > > > > > > Changes in the upper-layer data-plane code - prob= ably not. > > > > > > > > > > > > > > > > > > > > > > > > > > > > Though for situation when no callbacks are > > > > > > > > > > > > > > > installed > > > > > > > > > > > > > > > - perfomance should be left unaffected (or > > > > > > > > > > > > > > > impact should be as small > > > > > > > > > > > as possible). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > The changes are more on control plane > > > > > > > > > > > > > > > > side, which is one > > > > > > > time. > > > > > > > > > > > > > > > > The data plane changes are minimal. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I still think upper layer data-plane code > > > > > > > > > > > > > > > should stay unaffected (zero changes). > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >