From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 63871A3218 for ; Mon, 21 Oct 2019 15:48:00 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id A536C1BF4B; Mon, 21 Oct 2019 15:47:59 +0200 (CEST) Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id B2EF91BF49 for ; Mon, 21 Oct 2019 15:47:57 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 21 Oct 2019 06:47:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.67,323,1566889200"; d="scan'208";a="222468366" Received: from irsmsx153.ger.corp.intel.com ([163.33.192.75]) by fmsmga004.fm.intel.com with ESMTP; 21 Oct 2019 06:47:54 -0700 Received: from irsmsx104.ger.corp.intel.com ([169.254.5.252]) by IRSMSX153.ger.corp.intel.com ([169.254.9.119]) with mapi id 14.03.0439.000; Mon, 21 Oct 2019 14:47:54 +0100 From: "Ananyev, Konstantin" To: Akhil Goyal , "'dev@dpdk.org'" , "De Lara Guarch, Pablo" , 'Thomas Monjalon' , "Zhang, Roy Fan" , "Doherty, Declan" CC: 'Anoob Joseph' , Hemant Agrawal Thread-Topic: [RFC PATCH 1/9] security: introduce CPU Crypto action type and API Thread-Index: AQHVYm4Y+AedzaNgY0qWMAmu5GwNVqcbQpEAgAArCICAAuADAIAARA4AgAYiKoCAAbQS0IABqquAgAZiqNCAAPA4gIABtviQgAu3KZCAAoImAIAExwxQgAA3zQCAAZ1E0IADFFEAgAZGHRCAAsIfgIAAUayAgAM4hICAB/TasIADCaEAgASiaLA= Date: Mon, 21 Oct 2019 13:47:53 +0000 Message-ID: <2601191342CEEE43887BDE71AB97725801A8C6D6AC@IRSMSX104.ger.corp.intel.com> References: <20190903154046.55992-1-roy.fan.zhang@intel.com> <20190903154046.55992-2-roy.fan.zhang@intel.com> <9F7182E3F746AB4EA17801C148F3C6043369D686@IRSMSX101.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772580191926A17@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772580191962CD5@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772580191966116@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772580191966C23@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB977258019196A767@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB977258019196D53D@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB977258019196F386@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB977258019197206C@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB977258019197446B@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB97725801A8C69C5E@IRSMSX104.ger.corp.intel.com> In-Reply-To: Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiZTBjZGU4MjQtNWY1YS00NmNmLTgxN2MtMTExODhhNDBkMWJkIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoic0RzZTlvbjZnTm1KbngzXC9uYzl5Vis1U3pBeXE0NStqTTI5UGM5V29NQ1NJalExZjB0dzY0OXFXaVZVaEdsd3oifQ== x-ctpclassification: CTP_NT dlp-product: dlpe-windows dlp-version: 11.2.0.6 dlp-reaction: no-action x-originating-ip: [163.33.239.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Akhil, =20 > Added my comments inline with your draft. > > > > > > Hi Akhil, > > > > > > BTW, to be honest, I don't consider current rte_cryptodev_sym_sessi= on > > > > construct for multiple device_ids: > > > > __extension__ struct { > > > > void *data; > > > > uint16_t refcnt; > > > > } sess_data[0]; > > > > /**< Driver specific session material, variable size */ > > > > > > > Yes I also feel the same. I was also not in favor of this when it was= introduced. > > > Please go ahead and remove this. I have no issues with that. > > > > If you are not happy with that structure, and admit there are issues wi= th it, > > why do you push for reusing it for cpu-crypto API? > > Why not to take step back, take into account current drawbacks > > and define something that (hopefully) would suite us better? > > Again new API will be experimental for some time, so we'll > > have some opportunity to see does it works and if not fix it. >=20 > [Akhil] This structure is serving some use case which is agreed upon in t= he > Community, we cannot just remove a feature altogether. I understand that, but we don't suggest to remove anything that already her= e. We are talking about extending existing/adding new API. =20 All our debates around how much we can reuse from existing one and what new needs to be added. > Rather it is Intel's Use case only. >=20 > > > > About removing data[] from existing rte_cryptodev_sym_session - > > Personally would like to do that, but the change seems to be too massiv= e. > > Definitely not ready for such effort right now. > > >=20 > [snip].. >=20 > > > > Ok, then my suggestion: > > Let's at least write down all points about crypto-dev approach where we > > disagree and then probably try to resolve them one by one.... > > If we fail to make an agreement/progress in next week or so, > > (and no more reviews from the community) > > will have bring that subject to TB meeting to decide. > > Sounds fair to you? > Agreed > > > > List is below. > > Please add/correct me, if I missed something. > > > > Konstantin >=20 > Before going into comparison, we should define the requirement as well. Good point. > What I understood from the patchset, > "You need a synchronous API to perform crypto operations on raw data usin= g SW PMDs" > So, > - no crypto-ops, > - no separate enq-deq, only single process API for data path > - Do not need any value addition to the session parameters. > (You would need some parameters from the crypto-op which > Are constant per session and since you wont use crypto-op, > You need some place to store that) Yes, this is correct, I think. >=20 > Now as per your mail, the comparison > 1. extra input parameters to create/init rte_(cpu)_sym_session. >=20 > Will leverage existing 6B gap inside rte_crypto_*_xform between 'algo' an= d 'key' fields. > New fields will be optional and would be used by PMD only when cpu-crypto= session is requested. > For lksd-crypto session PMD is free to ignore these fields. > No ABI breakage is required. >=20 > [Akhil] Agreed, no issues. >=20 > 2. cpu-crypto create/init. > a) Our suggestion - introduce new API for that: > - rte_crypto_cpu_sym_init() that would init completely opaque rt= e_crypto_cpu_sym_session. > - struct rte_crypto_cpu_sym_session_ops {(*process)(...); (*clear= ); /*whatever else we'll need *'}; > - rte_crypto_cpu_sym_get_ops(const struct rte_crypto_sym_xform *x= forms) > that would return const struct rte_crypto_cpu_sym_session_ops *= based on input xforms. > Advantages: > 1) totally opaque data structure (no ABI breakages in future), PMD writ= er is totally free > with it format and contents. >=20 > [Akhil] It will have breakage at some point till we don't hit the union s= ize. Not sure, what union you are talking about? > Rather I don't suspect there will be more parameters added. > Or do we really care about the ABI breakage when the argument is about > the correct place to add a piece of code or do we really agree to add cod= e > anywhere just to avoid that breakage. I am talking about maintaining it in future. if your struct is not seen externally, no chances to introduce ABI breakage= .=20 >=20 > 2) each session entity is self-contained, user doesn't need to bring alo= ng dev_id etc. > dev_id is needed only at init stage, after that user will use sessi= on ops to perform > all operations on that session (process(), clear(), etc.). >=20 > [Akhil] There is nothing called as session ops in current DPDK. True, but it doesn't mean we can't/shouldn't have it. > What you are proposing > is a new concept which doesn't have any extra benefit, rather it is addin= g complexity > to have two different code paths for session create. >=20 >=20 > 3) User can decide does he wants to store ops[] pointer on a per session= basis, > or on a per group of same sessions, or... >=20 > [Akhil] Will the user really care which process API should be called from= the PMD. > Rather it should be driver's responsibility to store that in the session = private data > which would be opaque to the user. As per my suggestion same process func= tion can > be added to multiple sessions or a single session can be managed inside t= he PMD. In that case we either need to have a function per session (stored internal= ly), or make decision (branches) at run-time. But as I said in other mail - I am ok to add small shim structure here: either rte_crypto_cpu_sym_session { void *ses; struct rte_crypto_cpu_sym_se= ssion_ops ops; } or rte_crypto_cpu_sym_session { void *ses; struct rte_crypto_cpu_sym_sessio= n_ops *ops; }=20 And merge rte_crypto_cpu_sym_init() and rte_crypto_cpu_sym_get_ops() into o= ne (init). >=20 >=20 > 4) No mandatory mempools for private sessions. User can allocate memory = for cpu-crypto > session whenever he likes. >=20 > [Akhil] you mean session private data?=20 Yes. > You would need that memory anyways, user will be > allocating that already. You do not need to manage that. What I am saying - right now user has no choice but to allocate it via memp= ool. Which is probably not the best options for all cases. >=20 > Disadvantages: > 5) Extra changes in control path > 6) User has to store session_ops pointer explicitly. >=20 > [Akhil] More disadvantages: > - All supporting PMDs will need to maintain TWO types of session for the > same crypto processing. Suppose a fix or a new feature(or algo) is added,= PMD owner > will need to add code in both the session create APIs. Hence more mainten= ance and > error prone. I think majority of code for both paths will be common, plus even we'll reu= se current sym_session_init() - changes in PMD session_init() code will be unavoidable.=20 But yes, it will be new entry in devops, that PMD will have to support. Ok to add it as 7) to the list. > - Stacks which will be using these new APIs also need to maintain two > code path for the same processing while doing session initialization > for sync and async That's the same as #5 above, I think. >=20 >=20 > b) Your suggestion - reuse existing rte_cryptodev_sym_session_init()= and existing rte_cryptodev_sym_session > structure. > Advantages: > 1) allows to reuse same struct and init/create/clear() functions. > Probably less changes in control path. > Disadvantages: > 2) rte_cryptodev_sym_session. sess_data[] is indexed by driver_id, which= means that > we can't use the same rte_cryptodev_sym_session to hold private sess= ions pointers > for both sync and async mode for the same device. > So the only option we have - make PMD devops->sym_sess= ion_configure() > always create a session that can work in both cpu and lksd modes. > For some implementations that would probably mean that under the hoo= d PMD would create > 2 different session structs (sync/async) and then use one or another= depending on from what API been called. > Seems doable, but ...: > - will contradict with statement from 1: > " New fields will be optional and would be used by PMD only when c= pu-crypto session is requested." > Now it becomes mandatory for all apps to specify cp= u-crypto related parameters too, > even if they don't plan to use that mode - i.e. behavior change, = existing app change. > - might cause extra space overhead. >=20 > [Akhil] It will not contradict with #1, you will only have few checks in = the session init PMD > Which support this mode, find appropriate values and set the appropriate = process() in it. > User should be able to call, legacy enq-deq as well as the new process() = without any issue. > User would be at runtime will be able to change the datapath. > So this is not a disadvantage, it would be additional flexibility for the= user. Ok, but that's what I am saying - if PMD would *always* have to create a se= ssion that can handle both modes (sync/async), then user would *always* have to provide parameter= s for both modes too. Otherwise if let say user didn't setup sync specific parameters at all, wha= t PMD should do? - return with error? - init session that can be used with async path only? My current assumption is #1. If #2, then how user will be able to distinguish is that session valid for = both modes, or only for one?=20 >=20 >=20 > 3) not possible to store device (not driver) specific data within the se= ssion, but I think it is not really needed right now. > So probably minor compared to 2.b.2. >=20 > [Akhil] So lets omit this for current discussion. And I hope we can find = some way to deal with it. I don't think there is an easy way to fix that with existing API. >=20 >=20 > Actually #3 follows from #2, but decided to have them separated. >=20 > 3. process() parameters/behavior > a) Our suggestion: user stores ptr to session ops (or to (*process) i= tself) and just does: > session_ops->process(sess, ...); > Advantages: > 1) fastest possible execution path > 2) no need to carry on dev_id for data-path >=20 > [Akhil] I don't see any overhead of carrying dev id, at least it would be= inline with the > current DPDK methodology. If we'll add process() into rte_cryptodev itself (same as we have enqueue_b= urst/dequeue_burst), then it will be an ABI breakage. Also there are discussions to get rid of that approach completely: http://mails.dpdk.org/archives/dev/2019-September/144674.html So I am not sure this is a recommended way these days. > What you are suggesting is a new way to get the things done without much = benefit. Would help with ABI stability plus better performance, isn't it enough? > Also I don't see any performance difference as crypto workload is heavier= than > Code cycles, so that wont matter. It depends. Suppose function call costs you ~30 cycles. If you have burst of big packets (let say crypto for each will take ~2K cyc= les) that belong to the same session, then yes you wouldn't notice these extra 30 cycles at = all. If you have burst of small packets (let say crypto for each will take ~300 = cycles) each belongs to different session, then it will cost you ~10% extra. > So IMO, there is no advantage in your suggestion as well. >=20 >=20 > Disadvantages: > 3) user has to carry on session_ops pointer explicitly > b) Your suggestion: add (*cpu_process) inside rte_cryptodev_ops and = then: > rte_crypto_cpu_sym_process(uint8_t dev_id, rte_cryptodev_sym_sess= ion *sess, /*data parameters*/) {... > rte_cryptodevs[dev_id].dev_ops->cpu_process(ses, ...= ); > /*and then inside PMD specifc process: */ > pmd_private_session =3D sess->sess_data[this_pmd_dri= ver_id].data; > /* and then most likely either */ > pmd_private_session->process(pmd_private_session, ..= .); > /* or jump based on session/input data */ > Advantages: > 1) don't see any... > Disadvantages: > 2) User has to carry on dev_id inside data-path > 3) Extra level of indirection (plus data dependency) - both for data and= instructions. > Possible slowdown compared to a) (not measured). >=20 > Having said all this, if the disagreements cannot be resolved, you can go= for a pmd API specific > to your PMDs, I don't think it is good idea. PMD specific API is sort of deprecated path, also there is no clean way to = use it within the libraries. > because as per my understanding the solution doesn't look scalable to oth= er PMDs. > Your approach is aligned only to Intel , will not benefit others like ope= nssl which is used by all > vendors. I feel quite opposite, from my perspective majority of SW backed PMDs will = benefit from it. And I don't see anything Intel specific in my proposals above.=20 About openssl PMD: I am not an expert here, but looking at the code, I thin= k it will fit really well. Look yourself at its internal functions: process_openssl_auth_op/process_op= enssl_crypto_op, I think they doing exactly the same - they use sync API underneath, and the= y are session based (AFAIK you don't need any device/queue data, everything that needed for cry= pto/auth is stored inside session). Konstantin=20 =20