From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id ED041A00BE; Wed, 30 Oct 2019 15:23:23 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id F16D01BFC7; Wed, 30 Oct 2019 15:23:22 +0100 (CET) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id CF0051BFC0 for ; Wed, 30 Oct 2019 15:23:20 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 30 Oct 2019 07:23:19 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.68,247,1569308400"; d="scan'208";a="374903731" Received: from irsmsx109.ger.corp.intel.com ([163.33.3.23]) by orsmga005.jf.intel.com with ESMTP; 30 Oct 2019 07:23:17 -0700 Received: from irsmsx104.ger.corp.intel.com ([169.254.5.252]) by IRSMSX109.ger.corp.intel.com ([169.254.13.90]) with mapi id 14.03.0439.000; Wed, 30 Oct 2019 14:23:16 +0000 From: "Ananyev, Konstantin" To: Akhil Goyal , "'dev@dpdk.org'" , "De Lara Guarch, Pablo" , 'Thomas Monjalon' , "Zhang, Roy Fan" , "Doherty, Declan" CC: 'Anoob Joseph' , Hemant Agrawal Thread-Topic: [RFC PATCH 1/9] security: introduce CPU Crypto action type and API Thread-Index: AQHVYm4Y+AedzaNgY0qWMAmu5GwNVqcbQpEAgAArCICAAuADAIAARA4AgAYiKoCAAbQS0IABqquAgAZiqNCAAPA4gIABtviQgAu3KZCAAoImAIAExwxQgAA3zQCAAZ1E0IADFFEAgAZGHRCAAsIfgIAAUayAgAM4hICAB/TasIADCaEAgASiaLCAAasPgIAAIn3wgAE2PoCAC0aWsA== Date: Wed, 30 Oct 2019 14:23:16 +0000 Message-ID: <2601191342CEEE43887BDE71AB97725801A8C7292F@IRSMSX104.ger.corp.intel.com> References: <2601191342CEEE43887BDE71AB97725801A8C6E152@IRSMSX104.ger.corp.intel.com> In-Reply-To: Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiYjZlZTZkZTAtOTM0ZC00MDE0LTkxNWMtNmY5ZGRjOTc3MTYwIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoiU2Z2ODBiWHBrRm5kSjJXVmtaK0lON1hEejZ3YVliSmRcL2dPMGFpSGRBNjlXUXJpVThBaWJqSjVEM0NmYStHUWUifQ== x-ctpclassification: CTP_NT dlp-product: dlpe-windows dlp-version: 11.2.0.6 dlp-reaction: no-action x-originating-ip: [163.33.239.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Akhil, > > > > > Added my comments inline with your draft. > > > > > [snip].. > > > > > > > > > > > > > > > > > Ok, then my suggestion: > > > > > > Let's at least write down all points about crypto-dev approach = where we > > > > > > disagree and then probably try to resolve them one by one.... > > > > > > If we fail to make an agreement/progress in next week or so, > > > > > > (and no more reviews from the community) > > > > > > will have bring that subject to TB meeting to decide. > > > > > > Sounds fair to you? > > > > > Agreed > > > > > > > > > > > > List is below. > > > > > > Please add/correct me, if I missed something. > > > > > > > > > > > > Konstantin > > > > > > > > > > Before going into comparison, we should define the requirement as= well. > > > > > > > > Good point. > > > > > > > > > What I understood from the patchset, > > > > > "You need a synchronous API to perform crypto operations on raw d= ata > > using > > > > SW PMDs" > > > > > So, > > > > > - no crypto-ops, > > > > > - no separate enq-deq, only single process API for data path > > > > > - Do not need any value addition to the session parameters. > > > > > (You would need some parameters from the crypto-op which > > > > > Are constant per session and since you wont use crypto-op, > > > > > You need some place to store that) > > > > > > > > Yes, this is correct, I think. > > > > > > > > > > > > > > Now as per your mail, the comparison > > > > > 1. extra input parameters to create/init rte_(cpu)_sym_session. > > > > > > > > > > Will leverage existing 6B gap inside rte_crypto_*_xform between '= algo' > > and > > > > 'key' fields. > > > > > New fields will be optional and would be used by PMD only when cp= u- > > crypto > > > > session is requested. > > > > > For lksd-crypto session PMD is free to ignore these fields. > > > > > No ABI breakage is required. > > > > > > > > > > [Akhil] Agreed, no issues. > > > > > > > > > > 2. cpu-crypto create/init. > > > > > a) Our suggestion - introduce new API for that: > > > > > - rte_crypto_cpu_sym_init() that would init completely op= aque > > > > rte_crypto_cpu_sym_session. > > > > > - struct rte_crypto_cpu_sym_session_ops {(*process)(...);= (*clear); > > > > /*whatever else we'll need *'}; > > > > > - rte_crypto_cpu_sym_get_ops(const struct rte_crypto_sym_= xform > > > > *xforms) > > > > > that would return const struct rte_crypto_cpu_sym_sessi= on_ops > > *based > > > > on input xforms. > > > > > Advantages: > > > > > 1) totally opaque data structure (no ABI breakages in future), = PMD > > > > writer is totally free > > > > > with it format and contents. > > > > > > > > > > [Akhil] It will have breakage at some point till we don't hit the= union size. > > > > > > > > Not sure, what union you are talking about? > > > > > > Union of xforms in rte_security_session_conf > > > > Hmm, how does it relates here? > > I thought we discussing pure rte_cryptodev_sym_session, no? > > > > > > > > > > > > > > Rather I don't suspect there will be more parameters added. > > > > > Or do we really care about the ABI breakage when the argument is = about > > > > > the correct place to add a piece of code or do we really agree to= add code > > > > > anywhere just to avoid that breakage. > > > > > > > > I am talking about maintaining it in future. > > > > if your struct is not seen externally, no chances to introduce ABI = breakage. > > > > > > > > > > > > > > 2) each session entity is self-contained, user doesn't need to b= ring along > > > > dev_id etc. > > > > > dev_id is needed only at init stage, after that user will u= se session ops > > > > to perform > > > > > all operations on that session (process(), clear(), etc.). > > > > > > > > > > [Akhil] There is nothing called as session ops in current DPDK. > > > > > > > > True, but it doesn't mean we can't/shouldn't have it. > > > > > > We can have it if it is not adding complexity for the user. Creating = 2 different > > code > > > Paths for user is not desirable for the stack developers. > > > > > > > > > > > > What you are proposing > > > > > is a new concept which doesn't have any extra benefit, rather it = is adding > > > > complexity > > > > > to have two different code paths for session create. > > > > > > > > > > > > > > > 3) User can decide does he wants to store ops[] pointer on a per= session > > > > basis, > > > > > or on a per group of same sessions, or... > > > > > > > > > > [Akhil] Will the user really care which process API should be cal= led from the > > > > PMD. > > > > > Rather it should be driver's responsibility to store that in the = session private > > > > data > > > > > which would be opaque to the user. As per my suggestion same proc= ess > > > > function can > > > > > be added to multiple sessions or a single session can be managed = inside the > > > > PMD. > > > > > > > > In that case we either need to have a function per session (stored = internally), > > > > or make decision (branches) at run-time. > > > > But as I said in other mail - I am ok to add small shim structure h= ere: > > > > either rte_crypto_cpu_sym_session { void *ses; struct > > > > rte_crypto_cpu_sym_session_ops ops; } > > > > or rte_crypto_cpu_sym_session { void *ses; struct > > > > rte_crypto_cpu_sym_session_ops *ops; } > > > > And merge rte_crypto_cpu_sym_init() and rte_crypto_cpu_sym_get_ops(= ) > > into > > > > one (init). > > > > > > Again that will be a separate API call from the user perspective whic= h is not > > good. > > > > > > > > > > > > > > > > > > > > > > 4) No mandatory mempools for private sessions. User can allocate > > > > memory for cpu-crypto > > > > > session whenever he likes. > > > > > > > > > > [Akhil] you mean session private data? > > > > > > > > Yes. > > > > > > > > > You would need that memory anyways, user will be > > > > > allocating that already. You do not need to manage that. > > > > > > > > What I am saying - right now user has no choice but to allocate it = via > > mempool. > > > > Which is probably not the best options for all cases. > > > > > > > > > > > > > > Disadvantages: > > > > > 5) Extra changes in control path > > > > > 6) User has to store session_ops pointer explicitly. > > > > > > > > > > [Akhil] More disadvantages: > > > > > - All supporting PMDs will need to maintain TWO types of session = for the > > > > > same crypto processing. Suppose a fix or a new feature(or algo) i= s added, > > PMD > > > > owner > > > > > will need to add code in both the session create APIs. Hence more > > > > maintenance and > > > > > error prone. > > > > > > > > I think majority of code for both paths will be common, plus even w= e'll reuse > > > > current sym_session_init() - > > > > changes in PMD session_init() code will be unavoidable. > > > > But yes, it will be new entry in devops, that PMD will have to supp= ort. > > > > Ok to add it as 7) to the list. > > > > > > > > > - Stacks which will be using these new APIs also need to maintain= two > > > > > code path for the same processing while doing session initializat= ion > > > > > for sync and async > > > > > > > > That's the same as #5 above, I think. > > > > > > > > > > > > > > > > > > > b) Your suggestion - reuse existing rte_cryptodev_sym_sessio= n_init() and > > > > existing rte_cryptodev_sym_session > > > > > structure. > > > > > Advantages: > > > > > 1) allows to reuse same struct and init/create/clear() functions= . > > > > > Probably less changes in control path. > > > > > Disadvantages: > > > > > 2) rte_cryptodev_sym_session. sess_data[] is indexed by driver_i= d, > > > > which means that > > > > > we can't use the same rte_cryptodev_sym_session to hold priv= ate > > > > sessions pointers > > > > > for both sync and async mode for the same device. > > > > > So the only option we have - make PMD devops- > > > > >sym_session_configure() > > > > > always create a session that can work in both cpu and lksd m= odes. > > > > > For some implementations that would probably mean that under= the > > > > hood PMD would create > > > > > 2 different session structs (sync/async) and then use one or= another > > > > depending on from what API been called. > > > > > Seems doable, but ...: > > > > > - will contradict with statement from 1: > > > > > " New fields will be optional and would be used by PMD onl= y when > > > > cpu-crypto session is requested." > > > > > Now it becomes mandatory for all apps to sp= ecify cpu-crypto > > > > related parameters too, > > > > > even if they don't plan to use that mode - i.e. behavior = change, > > > > existing app change. > > > > > - might cause extra space overhead. > > > > > > > > > > [Akhil] It will not contradict with #1, you will only have few ch= ecks in the > > > > session init PMD > > > > > Which support this mode, find appropriate values and set the appr= opriate > > > > process() in it. > > > > > User should be able to call, legacy enq-deq as well as the new pr= ocess() > > > > without any issue. > > > > > User would be at runtime will be able to change the datapath. > > > > > So this is not a disadvantage, it would be additional flexibility= for the user. > > > > > > > > Ok, but that's what I am saying - if PMD would *always* have to cre= ate a > > > > session that can handle > > > > both modes (sync/async), then user would *always* have to provide > > parameters > > > > for both modes too. > > > > Otherwise if let say user didn't setup sync specific parameters at = all, what > > PMD > > > > should do? > > > > - return with error? > > > > - init session that can be used with async path only? > > > > My current assumption is #1. > > > > If #2, then how user will be able to distinguish is that session va= lid for both > > > > modes, or only for one? > > > > > > I would say a 3rd option, do nothing if sync params are not set. > > > Probably have a debug print in the PMD(which support sync mode) to sp= ecify > > that > > > session is not configured properly for sync mode. > > > > So, just print warning and proceed with init session that can be used w= ith async > > path only? > > Then it sounds the same as #2 above. > > Which actually means that sync mode parameters for sym_session_init() > > becomes optional. > > Then we need an API to provide to the user information what modes > > (sync+async/async only) is supported by that session for given dev_id. > > And user would have to query/retain this information at control-path, > > and store it somewhere in user-space together with session pointer and = dev_ids > > to use later at data-path (same as we do now for session type). > > That definitely requires changes in control-path to start using it. > > Plus the fact that this value can differ for different dev_ids for the = same session - > > doesn't make things easier here. >=20 > API wont be required to specify that. Feature flag will be sufficient, no= t a big change > From the application perspective. >=20 > Here is some pseudo code just to elaborate my understanding. This will ne= ed some >=20 > From application, > If(dev_info->feature_flags & RTE_CRYPTODEV_FF_SYNC) { > /* set additional params in crypto xform */ > } >=20 > Now in the driver, > pmd_sym_session_configure(dev,xform,sess,mempool) { > ... > If(dev_info->feature_flags & RTE_CRYPTODEV_FF_SYNC > && xform->/*sync params are set*/) { > /*Assign process function pointer in sess->priv_data*/ > } /* It may return error if FF_SYNC is set and params are not correct. Then all apps will always *have to* setup sync parameters in xform. What you suggest is *mandatory* sync mode: user always has to setup sync mode params if PMD does support it (no matter does he plan to use sync mode= or not). =20 Which means behavior change in existing apps. > It would be upto the driver whether it support both SYNC and ASY= NC.*/ > } >=20 > Now the new sync API >=20 > pmd_process(...) { > If(dev_info->feature_flags & RTE_CRYPTODEV_FF_SYNC > && sess_priv->process !=3D NULL) > sess_priv->process(...); > else > ASSERT("sync mode not configured properly or not supported"); > } >=20 > In the data path, there is no extra processing happening. > Even in case of your suggestion, you should have these type of error chec= ks, > You cannot blindly trust on the application that the pointers are correct= . >=20 > > > > > Internally the PMD will not store the process() API in the session pr= iv data > > > And while calling the first packet, devops->process will give an asse= rt that > > session > > > Is not configured for sync mode. The session validation would be done= in any > > case > > > your suggestion or mine. So no extra overhead at runtime. > > > > I believe that after session_init() user should get either an error or > > valid session handler that he can use at runtime. > > Pushing session validation to runtime doesn't seem like a good idea. > > > It may get a warning from the PMD, that FF_SYNC is set but params are not > Correct/available. See above. I think warning is not enough. There should be a clear way (API) for developer to realize is the created s= ession can be used by sync API data-path or not.=20 >=20 > > > > > > > > > > > > > > > > > > > > > > > > > > 3) not possible to store device (not driver) specific data withi= n the > > > > session, but I think it is not really needed right now. > > > > > So probably minor compared to 2.b.2. > > > > > > > > > > [Akhil] So lets omit this for current discussion. And I hope we c= an find some > > > > way to deal with it. > > > > > > > > I don't think there is an easy way to fix that with existing API. > > > > > > > > > > > > > > > > > > > Actually #3 follows from #2, but decided to have them separated. > > > > > > > > > > 3. process() parameters/behavior > > > > > a) Our suggestion: user stores ptr to session ops (or to (*pr= ocess) itself) > > and > > > > just does: > > > > > session_ops->process(sess, ...); > > > > > Advantages: > > > > > 1) fastest possible execution path > > > > > 2) no need to carry on dev_id for data-path > > > > > > > > > > [Akhil] I don't see any overhead of carrying dev id, at least it = would be > > inline > > > > with the > > > > > current DPDK methodology. > > > > > > > > If we'll add process() into rte_cryptodev itself (same as we have > > > > enqueue_burst/dequeue_burst), > > > > then it will be an ABI breakage. > > > > Also there are discussions to get rid of that approach completely: > > > > > > https://eur01.safelinks.protection.outlook.com/?url=3Dhttp%3A%2F%2Fmail= s.dpd > > k.org%2Farchives%2Fdev%2F2019- > > September%2F144674.html&data=3D02%7C01%7Cakhil.goyal%40nxp.com%7 > > C1859dc1d29cd45a51e9908d7571784bb%7C686ea1d3bc2b4c6fa92cd99c5c301 > > 635%7C0%7C0%7C637073630835415165&sdata=3DBz9jgisyVzRJNt1BijtvSlurh > > JU1vXBbynNwlMDjaco%3D&reserved=3D0 > > > > So I am not sure this is a recommended way these days. > > > > > > We can either have it in rte_cryptodev or in rte_cryptodev_ops whiche= ver > > > is good for you. > > > > > > Whether it is ABI breakage or not, as per your requirements, this is = the correct > > > approach. Do you agree with this or not? > > > > I think it is possible approach, but not the best one: > > it looks quite flakey to me (see all these uncertainty with sym_session= _init > > above), > > plus introduces extra overhead at data-path. >=20 > Uncertainties can be handled appropriately using a feature flag >=20 > > > > > > > > Now handling the API/ABI breakage is a separate story. In 19.11 relea= se we > > > Are not much concerned about the ABI breakages, this was discussed in > > > community. So adding a new dev_ops wouldn't have been an issue. > > > Now since we are so close to RC1 deadline, we should come up with som= e > > > other solution for next release. May be having a pmd API in 20.02 and > > > converting it into formal one in 20.11 > > > > > > > > > > > > > > > What you are suggesting is a new way to get the things done witho= ut much > > > > benefit. > > > > > > > > Would help with ABI stability plus better performance, isn't it eno= ugh? > > > > > > > > > Also I don't see any performance difference as crypto workload is= heavier > > than > > > > > Code cycles, so that wont matter. > > > > > > > > It depends. > > > > Suppose function call costs you ~30 cycles. > > > > If you have burst of big packets (let say crypto for each will take= ~2K cycles) > > that > > > > belong > > > > to the same session, then yes you wouldn't notice these extra 30 cy= cles at all. > > > > If you have burst of small packets (let say crypto for each will ta= ke ~300 > > cycles) > > > > each > > > > belongs to different session, then it will cost you ~10% extra. > > > > > > Let us do some profiling on openssl with both the approaches and find= out the > > > difference. > > > > > > > > > > > > So IMO, there is no advantage in your suggestion as well. > > > > > > > > > > > > > > > Disadvantages: > > > > > 3) user has to carry on session_ops pointer explicitly > > > > > b) Your suggestion: add (*cpu_process) inside rte_cryptodev_= ops and > > then: > > > > > rte_crypto_cpu_sym_process(uint8_t dev_id, > > rte_cryptodev_sym_session > > > > *sess, /*data parameters*/) {... > > > > > rte_cryptodevs[dev_id].dev_ops->cpu_process(= ses, ...); > > > > > /*and then inside PMD specifc process: */ > > > > > pmd_private_session =3D sess- > > >sess_data[this_pmd_driver_id].data; > > > > > /* and then most likely either */ > > > > > pmd_private_session->process(pmd_private_ses= sion, ...); > > > > > /* or jump based on session/input data */ > > > > > Advantages: > > > > > 1) don't see any... > > > > > Disadvantages: > > > > > 2) User has to carry on dev_id inside data-path > > > > > 3) Extra level of indirection (plus data dependency) - both for = data and > > > > instructions. > > > > > Possible slowdown compared to a) (not measured). > > > > > > > > > > Having said all this, if the disagreements cannot be resolved, yo= u can go > > for a > > > > pmd API specific > > > > > to your PMDs, > > > > > > > > I don't think it is good idea. > > > > PMD specific API is sort of deprecated path, also there is no clean= way to use > > it > > > > within the libraries. > > > > > > I know that this is a deprecated path, we can use it until we are not= allowed > > > to break ABI/API > > > > > > > > > > > > because as per my understanding the solution doesn't look scalabl= e to > > other > > > > PMDs. > > > > > Your approach is aligned only to Intel , will not benefit others = like openssl > > > > which is used by all > > > > > vendors. > > > > > > > > I feel quite opposite, from my perspective majority of SW backed PM= Ds will > > > > benefit from it. > > > > And I don't see anything Intel specific in my proposals above. > > > > About openssl PMD: I am not an expert here, but looking at the code= , I think > > it > > > > will fit really well. > > > > Look yourself at its internal functions: > > > > process_openssl_auth_op/process_openssl_crypto_op, > > > > I think they doing exactly the same - they use sync API underneath,= and they > > are > > > > session based > > > > (AFAIK you don't need any device/queue data, everything that needed= for > > > > crypto/auth is stored inside session). > > > > > > > By vendor specific, I mean, > > > - no PMD would like to have 2 different variants of session Init APIs= for doing > > the same stuff. > > > - stacks will become vendor specific while using 2 separate session c= reate APIs. > > No stack would > > > Like to support 2 variants of session create- one for HW PMDs and one= for SW > > PMDs. > > > > I think what you refer on has nothing to do with 'vendor specific'. > > I would name it 'extra overhead for PMD and stack writers'. > > Yes, for sure there is extra overhead (as always with new API) - > > for both producer (PMD writer) and consumer (stack writer): > > New function(s) to support, probably more tests to create/run, etc. > > Though this API is optional - if PMD/stack maintainer doesn't see > > value in it, they are free not to support it. > > From other side, re-using rte_cryptodev_sym_session_init() > > wouldn't help anyway - both data-path and control-path would differ > > from async mode anyway. > > BTW, right now to support different HW flavors > > we do have 4 different control and data-paths for both > > ipsec-secgw and librte_ipsec: > > lkds-none/lksd-proto/inline-crypto/inline-proto. > > And that is considered to be ok. >=20 > No that is not ok. We cannot add new paths for every other case. What I am saying: if let-say lookaside-proto/inline-crypto/inline-proto deserves its own case in rte_security/rte_crypto API, I don't understand why cpu-crypto doesn't. > Those 4 are controlled using 2 set of APIs. Yes there are 2 API sets (rte_cryptodev/rte_security), but in fact if you look at ipsec-secgw and librte_ipsec we have 4 different= code paths. For both create_session() and ipsec_enqueue() we have a big switch() with 4= different cases. Nearly the same for librte_ipsec - we have different prepare/process function pointers for each security type. =20 > We should try our best to > Have minimum overhead to the application writer. This pain was also discu= ssed > In the one of DPDK conference as well. > DPDK is not a standalone entity, there are stacks running over it always. > We should not add API for every other use case when we have an alternativ= e > Approach with the existing API set. >=20 > Now introducing another one would add to that pain and a lot of work for > Both producer and consumer. If I would see a clean approach to implement desired functionality without introducing new API - I would definitely support it. The problem is that from my perspective, what you suggesting with existing API will bring more drawbacks then positi= ves. BTW, our first approach (via rte_security) does reuse existing API, so if adding new API is the main concern - let's reconsider that path. =20 > It would be interesting to see how much performance difference will be th= ere in the > Two approaches. As per my understanding it wont be much as compared to th= e > Extra work that you will be inducing. >=20 > -Akhil >=20 > > Honestly, I don't understand why SW backed implementations > > can't have their own path that would suite them most. > > Konstantin > > > > > > > > > >