From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 23846A2EDB for ; Mon, 30 Sep 2019 14:27:45 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 0650E378E; Mon, 30 Sep 2019 14:27:44 +0200 (CEST) Received: from mga17.intel.com (mga17.intel.com [192.55.52.151]) by dpdk.org (Postfix) with ESMTP id 76E4B343C for ; Mon, 30 Sep 2019 14:27:41 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga107.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 30 Sep 2019 05:27:40 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.64,567,1559545200"; d="scan'208";a="194160618" Received: from irsmsx110.ger.corp.intel.com ([163.33.3.25]) by orsmga003.jf.intel.com with ESMTP; 30 Sep 2019 05:27:36 -0700 Received: from irsmsx105.ger.corp.intel.com ([169.254.7.164]) by irsmsx110.ger.corp.intel.com ([169.254.15.189]) with mapi id 14.03.0439.000; Mon, 30 Sep 2019 13:22:47 +0100 From: "Ananyev, Konstantin" To: Akhil Goyal , "'dev@dpdk.org'" , "De Lara Guarch, Pablo" , 'Thomas Monjalon' CC: "Zhang, Roy Fan" , "Doherty, Declan" , 'Anoob Joseph' Thread-Topic: [RFC PATCH 1/9] security: introduce CPU Crypto action type and API Thread-Index: AQHVYm4Y+AedzaNgY0qWMAmu5GwNVqcbQpEAgAArCICAAuADAIAARA4AgAYiKoCAAbQS0IABqquAgAZiqNCAAPA4gIABtviQgAu3KZCAAoImAIAExwxQ Date: Mon, 30 Sep 2019 12:22:46 +0000 Message-ID: <2601191342CEEE43887BDE71AB977258019196D53D@irsmsx105.ger.corp.intel.com> References: <20190903154046.55992-1-roy.fan.zhang@intel.com> <20190903154046.55992-2-roy.fan.zhang@intel.com> <9F7182E3F746AB4EA17801C148F3C6043369D686@IRSMSX101.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772580191926A17@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772580191962CD5@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772580191966116@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB9772580191966C23@irsmsx105.ger.corp.intel.com> <2601191342CEEE43887BDE71AB977258019196A767@irsmsx105.ger.corp.intel.com> In-Reply-To: Accept-Language: en-IE, en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiOTdlZGY3MjgtMDgxZC00NTRmLWJkZDEtM2FlMTg5ODBiODI3IiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoiNVJwSVcrU2JMMnJhRExEcU1oVUxVUW9NYjQ1TVBhdXFlb1VoYzhPMzFSQkhSVWw0OUw1ZTVMYjc0OVFRdFdIcyJ9 x-ctpclassification: CTP_NT dlp-product: dlpe-windows dlp-version: 11.2.0.6 dlp-reaction: no-action x-originating-ip: [163.33.239.182] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [RFC PATCH 1/9] security: introduce CPU Crypto action type and API X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" Hi Akhil, > > > > > > > > > > > This action type allows the burst of symmetric crypto= workload > > using > > > > > the > > > > > > > > > same > > > > > > > > > > > algorithm, key, and direction being processed by CPU = cycles > > > > > > > synchronously. > > > > > > > > > > > This flexible action type does not require external h= ardware > > > > > involvement, > > > > > > > > > > > having the crypto workload processed synchronously, a= nd is > > more > > > > > > > > > performant > > > > > > > > > > > than Cryptodev SW PMD due to the saved cycles on remo= ved > > "async > > > > > > > mode > > > > > > > > > > > simulation" as well as 3 cacheline access of the cryp= to ops. > > > > > > > > > > > > > > > > > > > > Does that mean application will not call the > > cryptodev_enqueue_burst > > > > > and > > > > > > > > > corresponding dequeue burst. > > > > > > > > > > > > > > > > > > Yes, instead it just call rte_security_process_cpu_crypto= _bulk(...) > > > > > > > > > > > > > > > > > > > It would be a new API something like process_packets an= d it will > > have > > > > > the > > > > > > > > > crypto processed packets while returning from the API? > > > > > > > > > > > > > > > > > > Yes, though the plan is that API will operate on raw data= buffers, > > not > > > > > mbufs. > > > > > > > > > > > > > > > > > > > > > > > > > > > > > I still do not understand why we cannot do with the con= ventional > > > > > crypto lib > > > > > > > > > only. > > > > > > > > > > As far as I can understand, you are not doing any proto= col > > processing > > > > > or > > > > > > > any > > > > > > > > > value add > > > > > > > > > > To the crypto processing. IMO, you just need a synchron= ous > > crypto > > > > > > > processing > > > > > > > > > API which > > > > > > > > > > Can be defined in cryptodev, you don't need to re-creat= e a crypto > > > > > session > > > > > > > in > > > > > > > > > the name of > > > > > > > > > > Security session in the driver just to do a synchronous= processing. > > > > > > > > > > > > > > > > > > I suppose your question is why not to have > > > > > > > > > rte_crypot_process_cpu_crypto_bulk(...) instead? > > > > > > > > > The main reason is that would require disruptive changes = in existing > > > > > > > cryptodev > > > > > > > > > API > > > > > > > > > (would cause ABI/API breakage). > > > > > > > > > Session for RTE_SECURITY_ACTION_TYPE_CPU_CRYPTO need > > some > > > > > extra > > > > > > > > > information > > > > > > > > > that normal crypto_sym_xform doesn't contain > > > > > > > > > (cipher offset from the start of the buffer, might be som= ething extra > > in > > > > > > > future). > > > > > > > > > > > > > > > > Cipher offset will be part of rte_crypto_op. > > > > > > > > > > > > > > fill/read (+ alloc/free) is one of the main things that slowd= own current > > > > > crypto-op > > > > > > > approach. > > > > > > > That's why the general idea - have all data that wouldn't cha= nge from > > packet > > > > > to > > > > > > > packet > > > > > > > included into the session and setup it once at session_init()= . > > > > > > > > > > > > I agree that you cannot use crypto-op. > > > > > > You can have the new API in crypto. > > > > > > As per the current patch, you only need cipher_offset which you= can have > > it as > > > > > a parameter until > > > > > > You get it approved in the crypto xform. I believe it will be b= eneficial in > > case of > > > > > other crypto cases as well. > > > > > > We can have cipher offset at both places(crypto-op and cipher_x= form). It > > will > > > > > give flexibility to the user to > > > > > > override it. > > > > > > > > > > After having another thought on your proposal: > > > > > Probably we can introduce new rte_crypto_sym_xform_types for CPU > > related > > > > > stuff here? > > > > > > > > I also thought of adding new xforms, but that wont serve the purpos= e for > > may be all the cases. > > > > You would be needing all information currently available in the cur= rent > > xforms. > > > > So if you are adding new fields in the new xform, the size will be = more than > > that of the union of xforms. > > > > ABI breakage would still be there. > > > > > > > > If you think a valid compression of the AEAD xform can be done, the= n that > > can be done for each of the > > > > Xforms and we can have a solution to this issue. > > > > > > I think that we can re-use iv.offset for our purposes (for crypto off= set). > > > So for now we can make that path work without any ABI breakage. > > > Fan, please feel free to correct me here, if I missed something. > > > If in future we would need to add some extra information it might > > > require ABI breakage, though by now I don't envision anything particu= lar to > > add. > > > Anyway, if there is no objection to go that way, we can try to make > > > these changes for v2. > > > > > > > Actually, after looking at it more deeply it appears not that easy as I= thought it > > would be :) > > Below is a very draft version of proposed API additions. > > I think it avoids ABI breakages right now and provides enough flexibili= ty for > > future extensions (if any). > > For now, it doesn't address your comments about naming conventions (_CP= U_ > > vs _SYNC_) , etc. > > but I suppose is comprehensive enough to provide a main idea beyond it. > > Akhil and other interested parties, please try to review and provide fe= edback > > ASAP, > > as related changes would take some time and we still like to hit 19.11 = deadline. > > Konstantin > > > > diff --git a/lib/librte_cryptodev/rte_crypto_sym.h > > b/lib/librte_cryptodev/rte_crypto_sym.h > > index bc8da2466..c03069e23 100644 > > --- a/lib/librte_cryptodev/rte_crypto_sym.h > > +++ b/lib/librte_cryptodev/rte_crypto_sym.h > > @@ -103,6 +103,9 @@ rte_crypto_cipher_operation_strings[]; > > * > > * This structure contains data relating to Cipher (Encryption and Dec= ryption) > > * use to create a session. > > + * Actually I was wrong saying that we don't have free space inside xf= orms. > > + * Making key struct packed (see below) allow us to regain 6B that cou= ld be > > + * used for future extensions. > > */ > > struct rte_crypto_cipher_xform { > > enum rte_crypto_cipher_operation op; > > @@ -116,7 +119,25 @@ struct rte_crypto_cipher_xform { > > struct { > > const uint8_t *data; /**< pointer to key data */ > > uint16_t length; /**< key length in bytes */ > > - } key; > > + } __attribute__((__packed__)) key; > > + > > + /** > > + * offset for cipher to start within user provided data buffer= . > > + * Fan suggested another (and less space consuming way) - > > + * reuse iv.offset space below, by changing: > > + * struct {uint16_t offset, length;} iv; > > + * to uunamed union: > > + * union { > > + * struct {uint16_t offset, length;} iv; > > + * struct {uint16_t iv_len, crypto_offset} cpu_crypto_para= m; > > + * }; > > + * Both approaches seems ok to me in general. >=20 > No strong opinions here. OK with this one. >=20 > > + * Comments/suggestions are welcome. > > + */ > > + uint16_t offset; After another thought - it is probably a bit better to have offset as a sep= arate field. In that case we can use the same xforms to create both type of sessions. > > + > > + uint8_t reserved1[4]; > > + > > /**< Cipher key > > * > > * For the RTE_CRYPTO_CIPHER_AES_F8 mode of operation, key.data= will > > @@ -284,7 +305,7 @@ struct rte_crypto_auth_xform { > > struct { > > const uint8_t *data; /**< pointer to key data */ > > uint16_t length; /**< key length in bytes */ > > - } key; > > + } __attribute__((__packed__)) key; > > /**< Authentication key data. > > * The authentication key length MUST be less than or equal to = the > > * block size of the algorithm. It is the callers responsibilit= y to > > @@ -292,6 +313,8 @@ struct rte_crypto_auth_xform { > > * (for example RFC 2104, FIPS 198a). > > */ > > > > + uint8_t reserved1[6]; > > + > > struct { > > uint16_t offset; > > /**< Starting point for Initialisation Vector or Counte= r, > > @@ -376,7 +399,12 @@ struct rte_crypto_aead_xform { > > struct { > > const uint8_t *data; /**< pointer to key data */ > > uint16_t length; /**< key length in bytes */ > > - } key; > > + } __attribute__((__packed__)) key; > > + > > + /** offset for cipher to start within data buffer */ > > + uint16_t cipher_offset; > > + > > + uint8_t reserved1[4]; > > > > struct { > > uint16_t offset; > > diff --git a/lib/librte_cryptodev/rte_cryptodev.h > > b/lib/librte_cryptodev/rte_cryptodev.h > > index e175b838c..c0c7bfed7 100644 > > --- a/lib/librte_cryptodev/rte_cryptodev.h > > +++ b/lib/librte_cryptodev/rte_cryptodev.h > > @@ -1272,6 +1272,101 @@ void * > > rte_cryptodev_sym_session_get_user_data( > > struct rte_cryptodev_sym_sessio= n *sess); > > > > +/* > > + * After several thoughts decided not to try to squeeze CPU_CRYPTO > > + * into existing rte_crypto_sym_session structure/API, but instead > > + * introduce an extentsion to it via new fully opaque > > + * struct rte_crypto_cpu_sym_session and additional related API. >=20 >=20 > What all things do we need to squeeze? > In this proposal I do not see the new struct cpu_sym_session defined her= e. The plan is to have it totally opaque to the user, i.e. just: struct rte_crypto_cpu_sym_session; in public header files. > I believe you will have same lib API/struct for cpu_sym_session and sym_= session. I thought about such way, but there are few things that looks clumsy to me: 1. Right now there is no 'type' (or so) field inside rte_cryptodev_sym_sess= ion, so it is not possible to easy distinguish what session do you have: lksd_sy= m or cpu_sym. In theory, there is a hole of 4B inside rte_cryptodev_sym_session, so we ca= n add some extra field here, but in that case we wouldn't be able to use the same xform for both = lksd_sym or cpu_sym (which seems really plausible thing for me). 2. Majority of rte_cryptodev_sym_session fields I think are unnecessary fo= r rte_crypto_cpu_sym_session: sess_data[], opaque_data, user_data, nb_drivers. All that consumes space, that could be used somewhere else instead. 3. I am a bit reluctant to touch existing rte_cryptodev API - to avoid any = breakages I can't foresee right now. >From other side - if we'll add new functions/structs for cpu_sym_session we= can mark it and keep it for some time as experimental, so further changes (if needed) w= ould still be possible. > I am not sure if that would be needed. > It would be internal to the driver that if synchronous processing is supp= orted(from feature flag) and > Have relevant fields in xform(the newly added ones which are packed as pe= r your suggestions) set, > It will create that type of session. >=20 >=20 > > + * Main points: > > + * - Current crypto-dev API is reasonably mature and it is desirable > > + * to keep it unchanged (API/ABI stability). From other side, this > > + * new sync API is new one and probably would require extra changes. > > + * Having it as a new one allows to mark it as experimental, without > > + * affecting existing one. > > + * - Fully opaque cpu_sym_session structure gives more flexibility > > + * to the PMD writers and again allows to avoid ABI breakages in fut= ure. > > + * - process() function per set of xforms > > + * allows to expose different process() functions for different > > + * xform combinations. PMD writer can decide, does he wants to > > + * push all supported algorithms into one process() function, > > + * or spread it across several ones. > > + * I.E. More flexibility for PMD writer. >=20 > Which process function should be chosen is internal to PMD, how would tha= t info > be visible to the application or the library. These will get stored in th= e session private > data. It would be upto the PMD writer, to store the per session process f= unction in > the session private data. >=20 > Process function would be a dev ops just like enc/deq operations and it s= hould call > The respective process API stored in the session private data. That model (via devops) is possible, but has several drawbacks from my pers= pective: 1. It means we'll need to pass dev_id as a parameter to process() function. Though in fact dev_id is not a relevant information for us here (all we need is pointer to the session and pointer to the fuction to call) and I tried to avoid using it in data-path functions for that API. 2. As you pointed in that case it will be just one process() function per d= evice. So if PMD would like to have several process() functions for different type= of sessions =20 (let say one per alg) first thing it has to do inside it's process() - read= session data and based on that, do a jump/call to particular internal sub-routine. Something like: driver_id =3D get_pmd_driver_id(); priv_ses =3D ses->sess_data[driver_id]; Then either: switch(priv_sess->alg) {case XXX: process_XXX(priv_sess, ...);break;...} OR=20 priv_ses->process(priv_sess, ...); to select and call the proper function. Looks like totally unnecessary overhead to me. Though if we'll have ability to query/extract some sort session_ops based o= n the xform - we can avoid this extra de-refererence+jump/call thing. >=20 > I am not sure if you would need a new session init API for this as nothin= g would be visible to > the app or lib. >=20 > > + * - Not storing process() pointer inside the session - > > + * Allows user to choose does he want to store a process() pointer > > + * per session, or per group of sessions for that device that share > > + * the same input xforms. I.E. extra flexibility for the user, > > + * plus allows us to keep cpu_sym_session totally opaque, see above. >=20 > If multiple sessions need to be processed via the same process function, > PMD would save the same process in all the sessions, I don't think there = would > be any perf overhead with that. I think it would, see above. >=20 > > + * Sketched usage model: > > + * .... > > + * /* control path, alloc/init session */ > > + * int32_t sz =3D rte_crypto_cpu_sym_session_size(dev_id, &xform); > > + * struct rte_crypto_cpu_sym_session *ses =3D user_alloc(..., sz); > > + * rte_crypto_cpu_sym_process_t process =3D > > + * rte_crypto_cpu_sym_session_func(dev_id, &xform); > > + * rte_crypto_cpu_sym_session_init(dev_id, ses, &xform); > > + * ... > > + * /* data-path*/ > > + * process(ses, ....); > > + * .... > > + * /* control path, termiante/free session */ > > + * rte_crypto_cpu_sym_session_fini(dev_id, ses); > > + */ > > + > > +/** > > + * vector structure, contains pointer to vector array and the length > > + * of the array > > + */ > > +struct rte_crypto_vec { > > + struct iovec *vec; > > + uint32_t num; > > +}; > > + > > +/* > > + * Data-path bulk process crypto function. > > + */ > > +typedef void (*rte_crypto_cpu_sym_process_t)( > > + struct rte_crypto_cpu_sym_session *sess, > > + struct rte_crypto_vec buf[], void *iv[], void *aad[], > > + void *digest[], int status[], uint32_t num); > > +/* > > + * for given device return process function specific to input xforms > > + * on error - return NULL and set rte_errno value. > > + * Note that for same input xfroms for the same device should return > > + * the same process function. > > + */ > > +__rte_experimental > > +rte_crypto_cpu_sym_process_t > > +rte_crypto_cpu_sym_session_func(uint8_t dev_id, > > + const struct rte_crypto_sym_xform *xforms); > > + > > +/* > > + * Return required session size in bytes for given set of xforms. > > + * if xforms =3D=3D NULL, then return the max possible session size, > > + * that would fit session for any supported by the device algorithm. > > + * if CPU mode is not supported at all, or requeted in xform > > + * algorithm is not supported, then return -ENOTSUP. > > + */ > > +__rte_experimental > > +int > > +rte_crypto_cpu_sym_session_size(uint8_t dev_id, > > + const struct rte_crypto_sym_xform *xforms); > > + > > +/* > > + * Initialize session. > > + * It is caller responsibility to allocate enough space for it. > > + * See rte_crypto_cpu_sym_session_size above. > > + */ > > +__rte_experimental > > +int rte_crypto_cpu_sym_session_init(uint8_t dev_id, > > + struct rte_crypto_cpu_sym_session *sess, > > + const struct rte_crypto_sym_xform *xforms); > > + > > +__rte_experimental > > +void > > +rte_crypto_cpu_sym_session_fini(uint8_t dev_id, > > + struct rte_crypto_cpu_sym_session *sess); > > + > > + > > #ifdef __cplusplus > > } > > #endif > > diff --git a/lib/librte_cryptodev/rte_cryptodev_pmd.h > > b/lib/librte_cryptodev/rte_cryptodev_pmd.h > > index defe05ea0..ed7e63fab 100644 > > --- a/lib/librte_cryptodev/rte_cryptodev_pmd.h > > +++ b/lib/librte_cryptodev/rte_cryptodev_pmd.h > > @@ -310,6 +310,20 @@ typedef void (*cryptodev_sym_free_session_t)(struc= t > > rte_cryptodev *dev, > > typedef void (*cryptodev_asym_free_session_t)(struct rte_cryptodev *de= v, > > struct rte_cryptodev_asym_session *sess); > > > > +typedef int (*cryptodev_cpu_sym_session_size_t) (struct rte_cryptodev = *dev, > > + const struct rte_crypto_sym_xform *xforms); > > + > > +typedef int (*cryptodev_cpu_sym_session_init_t) (struct rte_cryptodev = *dev, > > + struct rte_crypto_cpu_sym_session *sess, > > + const struct rte_crypto_sym_xform *xforms); > > + > > +typedef void (*cryptodev_cpu_sym_session_fini_t) (struct rte_cryptodev= *dev, > > + struct rte_crypto_cpu_sym_session *sess); > > + > > +typedef rte_crypto_cpu_sym_process_t (*cryptodev_cpu_sym_session_func_= t) > > ( > > + struct rte_cryptodev *dev, > > + const struct rte_crypto_sym_xform *xforms); > > + > > /** Crypto device operations function pointer table */ > > struct rte_cryptodev_ops { > > cryptodev_configure_t dev_configure; /**< Configure device. = */ > > @@ -343,6 +357,11 @@ struct rte_cryptodev_ops { > > /**< Clear a Crypto sessions private data. */ > > cryptodev_asym_free_session_t asym_session_clear; > > /**< Clear a Crypto sessions private data. */ > > + > > + cryptodev_cpu_sym_session_size_t sym_cpu_session_get_size; > > + cryptodev_cpu_sym_session_func_t sym_cpu_session_get_func; > > + cryptodev_cpu_sym_session_init_t sym_cpu_session_init; > > + cryptodev_cpu_sym_session_fini_t sym_cpu_session_fini; > > }; > > > > > >