From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 4D487A04B6;
	Mon, 12 Oct 2020 15:04:27 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id 21FF81D6FD;
	Mon, 12 Oct 2020 15:04:25 +0200 (CEST)
Received: from mga04.intel.com (mga04.intel.com [192.55.52.120])
 by dpdk.org (Postfix) with ESMTP id BCED71D6FA
 for <dev@dpdk.org>; Mon, 12 Oct 2020 15:04:22 +0200 (CEST)
IronPort-SDR: 2SGtKQkRBEvFMCfeLO/nkrH+QDG/qgzq2Yl2eu4Xzl9ML19uohni557FTWKiV93puFzMFpTIpG
 XlYAG9zPqNcA==
X-IronPort-AV: E=McAfee;i="6000,8403,9771"; a="163095210"
X-IronPort-AV: E=Sophos;i="5.77,366,1596524400"; d="scan'208";a="163095210"
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga006.jf.intel.com ([10.7.209.51])
 by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384;
 12 Oct 2020 06:02:27 -0700
IronPort-SDR: cMwNUU/X+tND5ughV7wxRtL00YzvH4cLjJ4RY7Y5zD8DvJb6Iqt5A/UyaPL7SOn57utIT6dPCF
 Uz0Is/xWkZjA==
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.77,366,1596524400"; d="scan'208";a="317916950"
Received: from orsmsx606.amr.corp.intel.com ([10.22.229.19])
 by orsmga006.jf.intel.com with ESMTP; 12 Oct 2020 06:02:27 -0700
Received: from orsmsx601.amr.corp.intel.com (10.22.229.14) by
 ORSMSX606.amr.corp.intel.com (10.22.229.19) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id
 15.1.1713.5; Mon, 12 Oct 2020 06:02:27 -0700
Received: from ORSEDG601.ED.cps.intel.com (10.7.248.6) by
 orsmsx601.amr.corp.intel.com (10.22.229.14) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.1713.5
 via Frontend Transport; Mon, 12 Oct 2020 06:02:27 -0700
Received: from NAM04-CO1-obe.outbound.protection.outlook.com (104.47.45.59) by
 edgegateway.intel.com (134.134.137.102) with Microsoft SMTP Server
 (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.1.1713.5; Mon, 12 Oct 2020 06:02:26 -0700
ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none;
 b=j6lz/CBWQN62wlDlHzR/DRE4cWP/WI9SnTp64pD98jjVMdTKBIa3taHsBqqAq0EFa118vdF1B07dzs79ccGrNYQ+8lfZZ1h4VSphV6FF+icVwgcqfQiyYinx2LXCvgvqHRRTR/t+moSaOQqRAt4menB1b8qZ2DZIUy65YSWjigO50VQlDcBykWRJ6dye3/Xo3oZWaNZUPyL+v2Ao/VG3D0vZA3VCdxKbJrk5IGl0kiqvbvGt5ZSsOf/fZ89JPplZ/VnGqKjfesNb37ODdkW1otGq2hSNP3xQXjo0s7xoKe3hRaDPE9SU9UC6AqIWP4HzHd/7/rDLTWdWah6PWfY9Kw==
ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; 
 s=arcselector9901;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=g9BpLyphKdeiI6HrprWc/9VldghZ0aWCBORKLME/kYA=;
 b=lPnv0GoKsXAFNnp9yiVQBxJytRO4BFcZJUNBAoZ/OIbx+EvfSiRbtvM2umUrzsuW77NaKQnys2hrwj3THG+FqoTHmlHDWHHBcUG9oguDqgDzpSPiGn88FiU+mdcfDvrqt1ZqlXWseNhSB4WKL7o3WHMzvCgaW6BXLTA8gqdc9nII6sU9u0H+SFGVglonVIqUfUr/j4H/SB57ljYXHfhPnY3YFRpsnv6bXWwySksw/N0k1fZLXvdfJJTRnVsz7kSTjeTAutSwxbn9/+NxBmwF5TgAyFImsUq5DyXb8TXWNORtqMM70ISgLqPWOuqitt5hqhefltqW9wRhLZ60PG90cw==
ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass
 smtp.mailfrom=intel.com; dmarc=pass action=none header.from=intel.com;
 dkim=pass header.d=intel.com; arc=none
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=intel.onmicrosoft.com; 
 s=selector2-intel-onmicrosoft-com;
 h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck;
 bh=g9BpLyphKdeiI6HrprWc/9VldghZ0aWCBORKLME/kYA=;
 b=lwWPdsh1uRnAiD9/SmJkC7CtwxNVv9yYu9YxKHXZhFooi+mQSbstdeZ6r/jvtUg89rWeRPh9pSkEuhUrRYUkbRKwW5PKbxYkW9Jb9Z6/4BCOjNTAFoy0ruwE3M0yZIVBnNDSXkrDEFZoOZlD4aLMLFdjaWAVEEqR20EFMqzFjs4=
Received: from BYAPR11MB3301.namprd11.prod.outlook.com (2603:10b6:a03:7f::26)
 by BYAPR11MB3302.namprd11.prod.outlook.com (2603:10b6:a03:7b::21)
 with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.3455.28; Mon, 12 Oct
 2020 13:02:23 +0000
Received: from BYAPR11MB3301.namprd11.prod.outlook.com
 ([fe80::f5a4:3f6b:ade3:296b]) by BYAPR11MB3301.namprd11.prod.outlook.com
 ([fe80::f5a4:3f6b:ade3:296b%3]) with mapi id 15.20.3455.029; Mon, 12 Oct 2020
 13:02:23 +0000
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Gujjar, Abhinandan S" <abhinandan.gujjar@intel.com>, Honnappa Nagarahalli
 <Honnappa.Nagarahalli@arm.com>, "dev@dpdk.org" <dev@dpdk.org>, "Doherty,
 Declan" <declan.doherty@intel.com>
CC: "jerinj@marvell.com" <jerinj@marvell.com>, "Akhil.goyal@nxp.com"
 <akhil.goyal@nxp.com>, "Vangati, Narender" <narender.vangati@intel.com>, nd
 <nd@arm.com>, nd <nd@arm.com>
Thread-Topic: [dpdk-dev] [v2 1/2] cryptodev: support enqueue callback functions
Thread-Index: AQHWhsTbY+pAzzKww0euN2orABmUValrRWzAgAAl9gCAAW0pMIAGJnUAgAAH/lCAAp3uAIAAgn/ggADASgCAASzNQIAHHMUAgAA/FzCADrUggIABX7xwgAAmGoCAA4qBcIAAg8GAgABl8oA=
Date: Mon, 12 Oct 2020 13:02:23 +0000
Message-ID: <BYAPR11MB33018C7BE9176F515691ABB49A070@BYAPR11MB3301.namprd11.prod.outlook.com>
References: <1599549024-195051-1-git-send-email-abhinandan.gujjar@intel.com>
 <BYAPR11MB3301233DDE876BF3D19463909A210@BYAPR11MB3301.namprd11.prod.outlook.com>
 <MWHPR11MB183809AE6FF88ED2F2A6C252E8210@MWHPR11MB1838.namprd11.prod.outlook.com>
 <BYAPR11MB3301D3074A678BB10B1FE7AF9A3E0@BYAPR11MB3301.namprd11.prod.outlook.com>
 <MWHPR11MB1838DAEE401C006C913238C9E83A0@MWHPR11MB1838.namprd11.prod.outlook.com>
 <BYAPR11MB3301D8480B010A152FB134159A3A0@BYAPR11MB3301.namprd11.prod.outlook.com>
 <DBAPR08MB581470913C8FEE2C30FFB28698380@DBAPR08MB5814.eurprd08.prod.outlook.com>
 <DM6PR11MB330896B2863AEA0F2D24B4B39A380@DM6PR11MB3308.namprd11.prod.outlook.com>
 <DBAPR08MB581492EEC7314057462689C298380@DBAPR08MB5814.eurprd08.prod.outlook.com>
 <BYAPR11MB3301FFC21A26E5A37A9921219A390@BYAPR11MB3301.namprd11.prod.outlook.com>
 <DBAPR08MB581477EA5ADBB7D16BCCAC9498320@DBAPR08MB5814.eurprd08.prod.outlook.com>
 <BYAPR11MB3301C7055A8032649A55A6DA9A320@BYAPR11MB3301.namprd11.prod.outlook.com>
 <MWHPR11MB18389204DD0D50F8D17D888AE80B0@MWHPR11MB1838.namprd11.prod.outlook.com>
 <BYAPR11MB3301CB4401FF52F3E8737C229A080@BYAPR11MB3301.namprd11.prod.outlook.com>
 <MWHPR11MB183850366615C3D379A4ED34E8080@MWHPR11MB1838.namprd11.prod.outlook.com>
 <BYAPR11MB33018F892F6D56FD523E22D49A060@BYAPR11MB3301.namprd11.prod.outlook.com>
 <MWHPR11MB1838238F63C3C4E15B1553CCE8070@MWHPR11MB1838.namprd11.prod.outlook.com>
In-Reply-To: <MWHPR11MB1838238F63C3C4E15B1553CCE8070@MWHPR11MB1838.namprd11.prod.outlook.com>
Accept-Language: en-GB, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
dlp-product: dlpe-windows
dlp-reaction: no-action
dlp-version: 11.5.1.3
authentication-results: intel.com; dkim=none (message not signed)
 header.d=none;intel.com; dmarc=none action=none header.from=intel.com;
x-originating-ip: [46.7.39.127]
x-ms-publictraffictype: Email
x-ms-office365-filtering-correlation-id: 59c2c180-09c1-4713-1d12-08d86eaf0fc9
x-ms-traffictypediagnostic: BYAPR11MB3302:
x-ld-processed: 46c98d88-e344-4ed4-8496-4ed7712e255d,ExtAddr
x-ms-exchange-transport-forked: True
x-microsoft-antispam-prvs: <BYAPR11MB33027578AC1E5E1CF5028C9B9A070@BYAPR11MB3302.namprd11.prod.outlook.com>
x-ms-oob-tlc-oobclassifiers: OLM:10000;
x-ms-exchange-senderadcheck: 1
x-microsoft-antispam: BCL:0;
x-microsoft-antispam-message-info: gCqyX2sS3HqDS3zvpjgwVFjkMJZfZURstDVw+48MkRo7CLAxofV6/1MVB5FnmoQV9AXlAEuUMZtuaUviUIvpFxqnzPlIo9UPV7YeENH2hIRMzsMIRIKsgbRbvynrWQSENZMYk3HCptNXdMpr7IoVTbWi51epicLxkeQX1/vvybvvo3/CMi4Xuubj39ZOxx0J+am0DG7H6XTUJSSMLmicPbHQa/Uw68nzPk+vKTZkz103T+DnG2i/qRY4JW3AbJx8HPp3uID5Hn384wXptHXM5rmAHZcWIpjg/c8MjUf3lwhsrPvP3TQ83oBnF7p08nIYBVYiUjs9Lo4MNOz/j6M/vXuaXvHtyN9SVn0hxmI4OGD+qWhXaV2fnopOUFmb97MVgNelJbJ9w9MaIJ6v7G8obg==
x-forefront-antispam-report: CIP:255.255.255.255; CTRY:; LANG:en; SCL:1; SRV:;
 IPV:NLI; SFV:NSPM; H:BYAPR11MB3301.namprd11.prod.outlook.com; PTR:; CAT:NONE;
 SFS:(4636009)(396003)(346002)(376002)(136003)(366004)(39860400002)(9686003)(2906002)(52536014)(478600001)(83080400001)(71200400001)(5660300002)(316002)(8676002)(55016002)(83380400001)(186003)(110136005)(6636002)(86362001)(66556008)(66476007)(66946007)(76116006)(966005)(53546011)(54906003)(8936002)(30864003)(33656002)(4326008)(26005)(7696005)(66446008)(6506007)(64756008)(559001)(579004);
 DIR:OUT; SFP:1102; 
x-ms-exchange-antispam-messagedata: pa2sSBa3NbOVwW65VN68pT7JAYUiG4qvnCN4+CBr2q81nQ5Ak1eW5wSrJs6MmjjSz5BrUbq3bYIh79j3OTMz0Y5ZG03JSuso89JYNoHyLHD694ugklGSy24XrqrJDumgfnATMDP8ATp0ks7TP9LMjx+TGMpWzQsc92XhYs1PCLiJhKrmEhn8GFt0u41AW+YWin24Q6PHKRfyKQ+siYo2hCNsIuglZHTp8sPFCnV3ZLXPoL/1sEFmsQlD4FMg42wF0vF67WVKlR3ERVuVxtk/DmFpGOBOh1jcVQmC3K4VKygidN/iFDNxZCq/2EhoB00dLKpIZqZ8g7MVGYc2Yq3diUg7hrSAqSrWdNoKjk07wLy7qi/SI80pH5lkKYLPWwznEWTn04uK16KC+vHeYTZKaxfxAK63O1xp9v1g1ASn4J3DZCFkd/crDnPp7jMg1dHoROlBFiJfyvZYcIoQ1+f+UUSsY81DmID56NdFrgESf6RO00C7yN8kgBjzgIyLQ2VGUy7IghpH9X2nnAUVX+LjfKlPTz5iTT8iqnn7Ao30qsb263y7HX1abhuldwUzWd6nyGfzRzWk9wsx+fJdHfaQig220c4B6/AUI0X4Gctf6CcNXREdIl0uK/QvrpzIdut+KAOuNx5/1AkeDDvWaAtEXA==
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-MS-Exchange-CrossTenant-AuthAs: Internal
X-MS-Exchange-CrossTenant-AuthSource: BYAPR11MB3301.namprd11.prod.outlook.com
X-MS-Exchange-CrossTenant-Network-Message-Id: 59c2c180-09c1-4713-1d12-08d86eaf0fc9
X-MS-Exchange-CrossTenant-originalarrivaltime: 12 Oct 2020 13:02:23.5056 (UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: 46c98d88-e344-4ed4-8496-4ed7712e255d
X-MS-Exchange-CrossTenant-mailboxtype: HOSTED
X-MS-Exchange-CrossTenant-userprincipalname: /sq+ZOzgb9aXG5aNUzgYITD43LIzFzEsuw5sfyY4I0UNk1wNQNzFAdqKE9IzrW/oC4qATAnI+VgQMLR05gJCI0fzlecUkJQsdcKVLBaOnN0=
X-MS-Exchange-Transport-CrossTenantHeadersStamped: BYAPR11MB3302
X-OriginatorOrg: intel.com
Subject: Re: [dpdk-dev] [v2 1/2] cryptodev: support enqueue callback
	functions
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>


Hi Abhinandan,
=20
> Hi Konstantin,
>=20
> > -----Original Message-----
> > From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > Sent: Monday, October 12, 2020 4:33 AM
> > To: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Honnappa
> > Nagarahalli <Honnappa.Nagarahalli@arm.com>; dev@dpdk.org; Doherty,
> > Declan <declan.doherty@intel.com>
> > Cc: jerinj@marvell.com; Akhil.goyal@nxp.com; Vangati, Narender
> > <narender.vangati@intel.com>; nd <nd@arm.com>; nd <nd@arm.com>
> > Subject: RE: [dpdk-dev] [v2 1/2] cryptodev: support enqueue callback fu=
nctions
> >
> >
> >
> > Hi Abhinandan,
> >
> > >
> > > Hi Konstantin,
> > >
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > Sent: Friday, October 9, 2020 8:10 PM
> > > > To: Gujjar, Abhinandan S <abhinandan.gujjar@intel.com>; Honnappa
> > > > Nagarahalli <Honnappa.Nagarahalli@arm.com>; dev@dpdk.org; Doherty,
> > > > Declan <declan.doherty@intel.com>
> > > > Cc: jerinj@marvell.com; Akhil.goyal@nxp.com; Vangati, Narender
> > > > <narender.vangati@intel.com>; nd <nd@arm.com>; nd <nd@arm.com>
> > > > Subject: RE: [dpdk-dev] [v2 1/2] cryptodev: support enqueue callbac=
k
> > > > functions
> > > >
> > > > Hi Abhinandan,
> > > >
> > > > >
> > > > > Hi Konstantin & Honnappa,
> > > > >
> > > > > Thanks for all the inputs and feedback.
> > > > >
> > > > > @Ananyev, Konstantin,
> > > > > I have measured the perf with and without callback on xeon. Here
> > > > > are the
> > > > numbers:
> > > > >
> > > > > ./app/dpdk-test-crypto-perf -l 6-7
> > > > > --vdev=3D"crypto_openssl0,socket_id=3D0,max_nb_sessions=3D128" --
> > > > > --ptest throughput --devtype crypto_openssl --optype
> > > > > cipher-then-auth --cipher-algo aes-cbc --cipher-op encrypt
> > > > > --cipher-key-sz 16 --auth-algo sha1-hmac --auth-op generate
> > > > > --auth-key-sz 64 --digest-sz
> > > > > 12 --total-ops 10000000 --burst-sz 32 --buffer-sz 64
> > > > >
> > > > > With callback(+ RCU - totally opaque to data-plane threads)
> > > > >     lcore id    Buf Size  Burst Size    Enqueued    Dequeued  Fai=
led Enq
> > Failed
> > > > Deq        MOps        Gbps  Cycles/Buf
> > > > >            7          64          32                10000000    1=
0000000           0           0
> > > > 0.8129      0.4162     2694.09
> > > > >            7          64          32                10000000    1=
0000000           0           0
> > > > 0.8152      0.4174     2686.31
> > > > >            7          64          32                10000000    1=
0000000           0           0
> > > > 0.8198      0.4197     2671.48
> > > > >
> > > > > Without callback:
> > > > >     lcore id    Buf Size  Burst Size    Enqueued    Dequeued  Fai=
led Enq
> > Failed
> > > > Deq        MOps        Gbps  Cycles/Buf
> > > > >
> > > > >            7          64          32               10000000    10=
000000           0           0
> > > > 0.8234      0.4216     2659.81
> > > > >            7          64          32               10000000    10=
000000           0           0
> > > > 0.8247      0.4222     2655.63
> > > > >            7          64          32               10000000    10=
000000           0           0
> > > > 0.8123      0.4159     2695.90
> > > >
> > > >
> > > > Just to cofirm:
> > > > You implemented crypto enqueue callbacks using RCU QSBR online
> > > > /offline (as suggested below) and numbers above are for:
> > > > 1) callback code in place and some dummy callback installed
> > > That's right. (+ RCU calling online + offline APIs inside
> > > rte_cryptodev_enqueue_burst())
> > > > 2) callback code in place but no callbacks installed
> > > No callback code. i.e. Original code.
> >
> > Ok, and if I get things right - difference between mean values is ~15 c=
ycles?
> Yes. May be, number are more stable on isolated core. Let's consider wors=
t case too.

Ok.

> > That's seems like very good result to me.
> > Can I suggest to run one more test for your new callback code in place,=
 but no
> > actual callbacks installed?
>     lcore id    Buf Size  Burst Size    Enqueued    Dequeued  Failed Enq =
 Failed Deq        MOps        Gbps  Cycles/Buf
>=20
>            7          64          32    10000000    10000000           0 =
          0      0.8220      0.4209     2664.12
>            7          64          32    10000000    10000000           0 =
          0      0.8245      0.4221     2656.14
>            7          64          32    10000000    10000000           0 =
          0      0.8261      0.4229     2651.15

So, if I can read numbers properly for not-armed callback impact is neglect=
able.
It is hard to say much without seeing the actual code, but from the numbers=
 above,
I think it is a good result and we can go ahead with that approach.
Honnappa, Akhil, Jerin do you have any objections to such approach in princ=
iple?
Konstantin

> > Thanks
> > Konstantin
> >
> > > >
> > > > Is my understanding correct here?
> > > > Thanks
> > > > Konstantin
> > > >
> > > >
> > > > >
> > > > > Regards
> > > > > Abhinandan
> > > > >
> > > > > > -----Original Message-----
> > > > > > From: Ananyev, Konstantin <konstantin.ananyev@intel.com>
> > > > > > Sent: Tuesday, September 29, 2020 2:33 PM
> > > > > > To: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>; Gujjar=
,
> > > > > > Abhinandan S <abhinandan.gujjar@intel.com>; dev@dpdk.org;
> > > > > > Doherty, Declan <declan.doherty@intel.com>
> > > > > > Cc: jerinj@marvell.com; Akhil.goyal@nxp.com; Vangati, Narender
> > > > > > <narender.vangati@intel.com>; nd <nd@arm.com>; nd <nd@arm.com>
> > > > > > Subject: RE: [dpdk-dev] [v2 1/2] cryptodev: support enqueue
> > > > > > callback functions
> > > > > >
> > > > > >
> > > > > > >
> > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > +#ifdef RTE_CRYPTODEV_CALLBACKS int
> > > > > > > > > > > > > > > > > +rte_cryptodev_rcu_qsbr_add(uint8_t
> > > > > > > > > > > > > > > > > +dev_id, struct rte_rcu_qsbr
> > > > > > > > > > > > > > > > > +*qsbr) {
> > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > +	struct rte_cryptodev *dev;
> > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > +	if (!rte_cryptodev_pmd_is_valid_dev(dev=
_id)) {
> > > > > > > > > > > > > > > > > +		CDEV_LOG_ERR("Invalid dev_id=3D%"
> > > > PRIu8,
> > > > > > > > dev_id);
> > > > > > > > > > > > > > > > > +		return -EINVAL;
> > > > > > > > > > > > > > > > > +	}
> > > > > > > > > > > > > > > > > +
> > > > > > > > > > > > > > > > > +	dev =3D &rte_crypto_devices[dev_id];
> > > > > > > > > > > > > > > > > +	dev->qsbr =3D qsbr;
> > > > > > > > > > > > > > > > > +	return 0; }
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > So if I understand your patch correctly you
> > > > > > > > > > > > > > > > propose a new working model for
> > > > > > > > > > > > > > > > crypto-devs:
> > > > > > > > > > > > > > > > 1. Control-plane has to allocate/setup
> > > > > > > > > > > > > > > > rcu_qsbr and do rte_cryptodev_rcu_qsbr_add(=
).
> > > > > > > > > > > > > > > > 2. Data-plane has somehow to obtain pointer
> > > > > > > > > > > > > > > > to that rcu_qsbr and wrap
> > > > > > > > > > > > > > > > cryptodev_enqueue()
> > > > > > > > > > > > > > > >    with rcu_qsbr_quiescent()  or
> > > > > > > > > > rcu_qsbr_online()/rcu_qsbr_offline().
> > > > > > > > > > > > > > > Yes. I think, it is not a new model. It is
> > > > > > > > > > > > > > > same as RCU integration with
> > > > > > > > > > > > LPM.
> > > > > > > > > > > > > > > Please refer:
> > > > > > > > > > > > > > > https://patches.dpdk.org/cover/73673/
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I am talking about new working model for
> > > > > > > > > > > > > > crypto-dev
> > > > > > > > > > enqueue/dequeue.
> > > > > > > > > > > > > > As I said above now it becomes data-plane threa=
d
> > > > > > > > > > > > > > responsibility
> > > > > > to:
> > > > > > > > > > > > > >  -somehow to obtain pointer to that rcu_qsbr fo=
r
> > > > > > > > > > > > > > each cryptodev it is
> > > > > > > > > > > > using.
> > > > > > > > > > > > > >  -call rcu sync functions
> > > > > > > > > > > > > > (quiescent/online/offline) on a regular
> > > > > > > > basis.
> > > > > > > > > > > > > It is not on regular basis. When data plane comes
> > > > > > > > > > > > > up, they report
> > > > > > > > online.
> > > > > > > > > > > > > They report quiescent when they are done with
> > > > > > > > > > > > > critical section or shared
> > > > > > > > > > > > structure.
> > > > > > > > > > > >
> > > > > > > > > > > > I understand that, but it means all existing apps
> > > > > > > > > > > > have to be changed that
> > > > > > > > > > way.
> > > > > > > > > > > >
> > > > > > > > > > > > > All though, there is some dataplane changes
> > > > > > > > > > > > > involved here, I don't think, it
> > > > > > > > > > > > is major.
> > > > > > > > > > > >
> > > > > > > > > > > > I still think our goal here should be to make no
> > > > > > > > > > > > visible changes to the dataplane.
> > > > > > > > > > > > I.E. all necessary data-plane changes need to be
> > > > > > > > > > > > hidden inside CB invocation part.
> > > > > > > > > > > Please note that this is being implemented using the
> > > > > > > > > > > memory reclamation framework documented at
> > > > > > > > > > > https://doc.dpdk.org/guides/prog_guide/rcu_lib.html#r=
e
> > > > > > > > > > > sour
> > > > > > > > > > > ce-r
> > > > > > > > > > > ecla
> > > > > > > > > > > mati
> > > > > > > > > > > on-framework-for-dpdk
> > > > > > > > > > >
> > > > > > > > > > > While using RCU there are couple of trade-offs that
> > > > > > > > > > > applications have to
> > > > > > > > > > consider:
> > > > > > > > > > > 1) Performance - reporting the quiescent state too
> > > > > > > > > > > often results in performance impact on data plane
> > > > > > > > > > > 2) Amount of outstanding memory to reclaim - reportin=
g
> > > > > > > > > > > less often results in more outstanding memory to
> > > > > > > > > > > reclaim
> > > > > > > > > > >
> > > > > > > > > > > Hence, the quiescent state reporting is left to the a=
pplication.
> > > > > > > > > > > The application decides how often it reports the
> > > > > > > > > > > quiescent state and has control
> > > > > > > > > > over the data plane performance and the outstanding
> > > > > > > > > > memory to
> > > > > > reclaim.
> > > > > > > > > > >
> > > > > > > > > > > When you say "new working model for crypto-dev
> > > > > > > > > > > enqueue/dequeue",
> > > > > > > > > > >
> > > > > > > > > > > 1) are you comparing these with existing crypto-dev
> > > > > > > > > > > enqueue/dequeue
> > > > > > > > > > APIs? If yes, these are new APIs, it is not breaking an=
ything.
> > > > > > > > > > > 2) are you comparing these with existing call back
> > > > > > > > > > > functions in ethdev enqueue/dequeue APIs? If yes,
> > > > > > > > > > > agree that this is a new model. But, it is
> > > > > > > > > > possible to support what ethdev supports along with the
> > > > > > > > > > RCU method used in this patch.
> > > > > > > > > >
> > > > > > > > > > What I am talking about:
> > > > > > > > > >
> > > > > > > > > > Existing cryptodev enqueue/dequeue model doesn't requir=
e
> > > > > > > > > > for the user to manage any RCU QSBR state manually.
> > > > > > > > > > I believe that addition of ability to add/remove
> > > > > > > > > > enqueue/dequeue callbacks shouldn't change existing
> > > > > > > > > > working
> > > > model.
> > > > > > > > > > I think that adding/removing such callbacks has to be
> > > > > > > > > > opaque to the user DP code and shouldn't require user t=
o change
> > it.
> > > > > > > > > > Same as we have now for ethdev callback implementation.
> > > > > > > > > The ethdev callback implementation conveniently leaves th=
e
> > > > > > > > > problem of
> > > > > > > > freeing memory to the user to resolve, it does not handle t=
he issue.
> > > > > > > > > Hence, it "looks" to be opaque to the DP code. However, i=
f
> > > > > > > > > the application has to implement a safe way to free the
> > > > > > > > > call back memory, its
> > > > > > > > DP is affected based on call backs are being used or not.
> > > > > > > >
> > > > > > > > Yes, I think that's big drawback in initial ethdev callback
> > > > > > > > implementation - it simply ignores DP/CP sync problem compl=
etely.
> > > > > > > > Though I think it is possible to have both here:
> > > > > > > >  keep callback "opaque" to DP code and provide some sync
> > > > > > > > mechanism between DP/CP.
> > > > > > > > Hopefully one day we can fix ethdev callbacks too.
> > > > > > > The solution we develop can be used in ethdev too.
> > > > > > >
> > > > > > > >
> > > > > > > > > > I think that forcing DP code to be aware that callbacks
> > > > > > > > > > are present or not and to modify its behaviour dependin=
g
> > > > > > > > > > on that nearly voids the purpose of having callbacks at=
 all.
> > > > > > > > > > In that case DP can just invoke callback function
> > > > > > > > > > directly from it's
> > > > > > > > codepath .
> > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Note that now data-plane thread would have to d=
o
> > > > > > > > > > > > > > that always
> > > > > > > > > > > > > > - even if there are now callbacks installed for
> > > > > > > > > > > > > > that cryptodev queue
> > > > > > > > > > right now.
> > > > > > > > > > > > > > All that changes behaviour of existing apps and
> > > > > > > > > > > > > > I presume would reduce adoption of  that fature=
.
> > > > > > > > > > > If I understand this correct, you are talking about a
> > > > > > > > > > > case where in the application might be
> > > > > > > > > > > registering/unregistering multiple times during its
> > > > > > > > > > > lifetime. In this case, yes, the application might be
> > > > > > > > > > > reporting the
> > > > > > > > > > quiescent state even when it has not registered the cal=
l backs.
> > > > > > > > > > But, it has the flexibility to not report it if it
> > > > > > > > > > implements additional
> > > > logic.
> > > > > > > > > > > Note that we are assuming that the application has to
> > > > > > > > > > > report quiescent state only for using callback functi=
ons.
> > > > > > > > > > > Most probably the application has
> > > > > > > > > > other requirements to use RCU.
> > > > > > > > > > > Why not support what is done for ethdev call back
> > > > > > > > > > > functions along with
> > > > > > > > > > providing RCU method?
> > > > > > > > > > >
> > > > > > > > > > > > > There is always trade off involved!
> > > > > > > > > > > > > In the previous patch, you suggested that some
> > > > > > > > > > > > > lazy app may not free up the memory allocated by =
add cb.
> > > > > > > > > > > > > For such apps, this patch has sync mechanism with
> > > > > > > > > > > > > some additional cost of CP & DP
> > > > > > > > changes.
> > > > > > > > > > > >
> > > > > > > > > > > > Sigh, it is not about laziness of the app.
> > > > > > > > > > > > The problem with current ethedev cb mechanism and
> > > > > > > > > > > > yours
> > > > > > > > > > > > V1 (which was just a clone of it) - CP doesn't know
> > > > > > > > > > > > when it is safe after CB removal to free related me=
mory.
> > > > > > > > > > > >
> > > > > > > > > > > > > > I still think all this callback mechanism shoul=
d
> > > > > > > > > > > > > > be totally opaque to data-plane threads - user
> > > > > > > > > > > > > > shouldn't change his app code depending on woul=
d
> > > > > > > > > > > > > > some enqueue/dequeue callbacks be
> > > > > > > > > > installed or not.
> > > > > > > > > > > > > I am not sure, how that can be implemented with
> > > > > > > > > > > > > existing RCU
> > > > > > > > design.
> > > > > > > > > > > >
> > > > > > > > > > > > As I said below the simplest way - with calling rcu
> > > > > > > > > > > > onine/offline inside CB invocation block.
> > > > > > > > > > > > That's why I asked you - did you try that approach
> > > > > > > > > > > > and what is the perf numbers?
> > > > > > > > > > > > I presume with no callbacks installed the perf
> > > > > > > > > > > > change should be nearly
> > > > > > > > > > zero.
> > > > > > > > > > > >
> > > > > > > > > > > > > @Honnappa Nagarahalli, Do you have any suggestion=
s?
> > > > > > > > > > > Reporting quiescent state in the call back functions
> > > > > > > > > > > has several
> > > > > > > > > > disadvantages:
> > > > > > > > > > > 1) it will have performance impacts and the impacts
> > > > > > > > > > > will increase as the
> > > > > > > > > > number of data plane threads increase.
> > > > > > > > > > > 2) It will require additional configuration parameter=
s
> > > > > > > > > > > to control how often the quiescent state is reported
> > > > > > > > > > > to control the performance
> > > > > > > > impact.
> > > > > > > > > > > 3) Does not take advantage of the fact that most
> > > > > > > > > > > probably the application is using RCU already
> > > > > > > > > > > 4) There are few difficulties as well, please see bel=
ow.
> > > > > > > > > >
> > > > > > > > > > I suggested Abhinandan to use RCU library because it is
> > > > > > > > > > already there, and I thought it would be good not to
> > > > > > > > > > re-implement
> > > > the wheel.
> > > > > > > > > > Though if you feel librte_rcu doesn't match that task -
> > > > > > > > > > fine, let's do it without librte_rcu.
> > > > > > > > > > After all, what we need here - just an atomic ref count
> > > > > > > > > > per queue that we are going to increment at entering an=
d
> > > > > > > > > > leaving list of callbacks inside enqueue/dequeue.
> > > > > > > > > Ok, looks like I missed the point that a queue is used by
> > > > > > > > > a single data plane
> > > > > > > > thread.
> > > > > > > > > Along with ref count increment you need the memory
> > > > > > > > > orderings to avoid
> > > > > > > > race conditions. These will be the same ones used in RCU.
> > > > > > > > > On the control plane, you need to read this counter and
> > > > > > > > > poll for the
> > > > > > > > counter updates. All this is same cost as in RCU.
> > > > > > > >
> > > > > > > > Agree.
> > > > > > > >
> > > > > > > > > To control the cost, you
> > > > > > > > > will have to control the rate of quiescent state reportin=
g
> > > > > > > > > and might have to
> > > > > > > > expose this as a configuration parameter.
> > > > > > > > >
> > > > > > > > > The other important information you have to consider is i=
f
> > > > > > > > > the thread is making any blocking calls, which may be in
> > > > > > > > > some other library. The thread is supposed to call
> > > > > > > > > rcu_qsbr_thread_offline API before calling a
> > > > > > > > blocking call. This allows the RCU to know that this
> > > > > > > > particular thread will not report quiescent state. The
> > > > > > > > cryptodev library might not have
> > > > > > that information.
> > > > > > > > >
> > > > > > > > > If you want to go ahead with this design, you can still
> > > > > > > > > use RCU with single thread configuration (like you have
> > > > > > > > > mentioned
> > > > > > > > > below) and hide the
> > > > > > > > details from the application.
> > > > > > > >
> > > > > > > > Yes,  same thought here -  use rcu_qsbr online/offline for
> > > > > > > > DP part and hide actual sync details inside callback code.
> > > > > > > We can give it a try. If we can have the performance numbers,
> > > > > > > we can decide about how to control how often these APIs are
> > > > > > > called on the data
> > > > > > plane.
> > > > > >
> > > > > > To avoid misunderstanding: I am talking about calling
> > > > > > online/offline with every
> > > > > > cryptodev_enqueue() traversal over CB list.
> > > > > >
> > > > > > >
> > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > That seems quite a big change and I don't
> > > > > > > > > > > > > > > > think it is acceptable for most users.
> > > > > > > > > > > > > > > > From my perspective adding/installing
> > > > > > > > > > > > > > > > call-backs to the dev has to be opaque to t=
he data-
> > plane code.
> > > > > > > > > > > > > > > > Also note that different callbacks can be
> > > > > > > > > > > > > > > > installed by different entities (libs) and
> > > > > > > > > > > > > > > > might have no idea about each
> > > > > > > > other.
> > > > > > > > > > > > > > > > That's why I thought it would be better to
> > > > > > > > > > > > > > > > make all this RCU stuff internal inside cry=
ptodev:
> > > > > > > > > > > > > > > >     hide all this rcu_qsbr allocation/setup
> > > > > > > > > > > > > > > > inside cryptod somehow to
> > > > > > > > > > > > > > obtain pointer to that rcu_qsbr ev init/queue
> > > > > > > > > > > > > > setup
> > > > > > > > > > > > > > > >     invoke
> > > > > > > > > > > > > > > > rcu_qsbr_online()/rcu_qsbr_offline()
> > > > > > > > > > > > > > > > inside
> > > > > > > > > > > > > > cryptodev_enqueue().
> > > > > > > > > > > This will bring in the application related informatio=
n
> > > > > > > > > > > such as the thread ID
> > > > > > > > > > into the library.
> > > > > > > > > >
> > > > > > > > > > I don't think it would.
> > > > > > > > > > Cryptodev enqueue/dequeue functions are not supposed to
> > > > > > > > > > be thread safe (same as rx/tx burst).
> > > > > > > > > > So we can always use RCU with just one thread(thread_id=
 =3D 0).
> > > > > > > > > Agree, the memory that needs to be freed is accessed by a
> > > > > > > > > single thread
> > > > > > > > on the data plane. RCU with one thread would suffice.
> > > > > > > > >
> > > > > > > > > > But as I said above - if you feel RCU lib is an overhea=
d
> > > > > > > > > > here, that's fine - I think it would be easy enough to
> > > > > > > > > > do without
> > > > librte_rcu.
> > > > > > > > > >
> > > > > > > > > > > If the same API calls are being made from multiple
> > > > > > > > > > > data plane threads, you need a way to configure that
> > > > > > > > > > > information to the library. So, it is better to leave
> > > > > > > > > > > those details for the application to
> > > > > > handle.
> > > > > > > > > > >
> > > > > > > > > > > > > > > I have already tried exploring above stuffs.
> > > > > > > > > > > > > > > There are too many
> > > > > > > > > > > > constraints.
> > > > > > > > > > > > > > > The changes don't fit in, as per RCU design.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Hmm could you be more specific here - what
> > > > > > > > > > > > > > constraints are you referring to?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Moreover, having rcu api under enqueue_burst(=
)
> > > > > > > > > > > > > > > will affect the
> > > > > > > > > > > > > > performance.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > It most likely will. Though my expectation it
> > > > > > > > > > > > > > will affect performance only when some callback=
s
> > > > > > > > > > > > > > are installed. My thought
> > > > > > > > > > here:
> > > > > > > > > > > > > > callback function by itself will affect
> > > > > > > > > > > > > > cryptdev_enqueue performance anyway,
> > > > > > > > > > > > > With existing callback design, I have measured th=
e
> > > > > > > > > > > > > performance(with
> > > > > > > > > > > > crypto perf test) on xeon.
> > > > > > > > > > > > > It was almost negligible and same was shared with=
 Declan.
> > > > > > > > > > > >
> > > > > > > > > > > > I am asking about different thing: did you try
> > > > > > > > > > > > alternate approach I described, that wouldn't
> > > > > > > > > > > > require changes in the user data-
> > > > > > > > plane code.
> > > > > > > > > > > >
> > > > > > > > > > > > > That is one of the reasons, I didn't want to add
> > > > > > > > > > > > > to many stuffs in to the
> > > > > > > > > > > > callback.
> > > > > > > > > > > > > The best part of existing design is crypto lib is
> > > > > > > > > > > > > not much
> > > > modified.
> > > > > > > > > > > > > The changes are either pushed to CP or DP.
> > > > > > > > > > > > >
> > > > > > > > > > > > > so adding extra overhead for sync is probably ok =
here.
> > > > > > > > > > > >
> > > > > > > > > > > > I think that extra overhead when callbacks are
> > > > > > > > > > > > present is expected and probably acceptable.
> > > > > > > > > > > > Changes in the upper-layer data-plane code - probab=
ly not.
> > > > > > > > > > > >
> > > > > > > > > > > > > > Though for situation when no callbacks are
> > > > > > > > > > > > > > installed
> > > > > > > > > > > > > > - perfomance should be left unaffected (or
> > > > > > > > > > > > > > impact should be as small
> > > > > > > > > > as possible).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The changes are more on control plane side,
> > > > > > > > > > > > > > > which is one
> > > > > > time.
> > > > > > > > > > > > > > > The data plane changes are minimal.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I still think upper layer data-plane code shoul=
d
> > > > > > > > > > > > > > stay unaffected (zero changes).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > >
> > > > > > > > > <snip>