DPDK usage discussions
* [dpdk-users] Memory requirements for crypto devices (QAT and AESNI) (using DPDK-17.02)
From: Chinmaya Dwibedy @ 2017-05-05 12:50 UTC
  To: users

Hi All,


We are using DPDK-17.02. We use crypto via hardware (QAT) and software
acceleration (AESNI). There is a one-to-one mapping between crypto devices
and worker cores. What are the memory requirements for the following?

1)           Creation of one physical crypto device.

2)           Creation of one AESNI MB virtual crypto device.

Thereafter we configure a device with the default number of queue pairs,
as shown below.


#define CDEV_MP_CACHE_SZ 64

rte_cryptodev_info_get(cdev_id, &info);

dev_conf.nb_queue_pairs = info.max_nb_queue_pairs;
dev_conf.session_mp.nb_objs = info.sym.max_nb_sessions;
dev_conf.socket_id = SOCKET_ID_ANY;
dev_conf.session_mp.cache_size = CDEV_MP_CACHE_SZ;

rte_cryptodev_configure(cdev_id, &dev_conf);


How do we calculate the minimum memory required to configure each HW and
each SW crypto device? We then allocate and set up a queue pair for a
device as follows. As of now we use one queue pair per device, and the
number of descriptors per queue pair is set to 2k. If we increase the
number of descriptors, will it improve throughput?


#define CDEV_MP_NB_OBJS 2048

qp_conf.nb_descriptors = CDEV_MP_NB_OBJS;

rte_cryptodev_queue_pair_setup(cdev_id, 0, &qp_conf, dev_conf.socket_id);


We create one session for symmetric cryptographic operations per IPsec
security association. How much memory is required to hold the session data
structure?


The intent is to calculate the memory requirements in advance (before EAL
initialization) and, based on the available memory, figure out how many
crypto devices can be initialized (note: our application initializes the
AESNI vdevs without using the EAL command-line option). Say there are 24
worker cores and we need 24 AESNI crypto vdevs, but there is not enough
hugepage memory to create them. In that case we will allocate more
hugepages, then call rte_eal_init() and expect it to succeed.
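
A minimal sketch of that planning step, assuming a per-device byte estimate is already known. The function name hugepages_needed and its parameters are hypothetical, not part of DPDK, and the result ignores every other hugepage consumer in the application (mbuf pools, rings, memzone alignment), so it is only a lower bound.

```c
#include <stdint.h>

/* Number of hugepages needed to hold `ndev` crypto devices that each need
 * `per_dev_bytes`. All three inputs are caller-supplied estimates; this is
 * plain round-up division and does not query DPDK or the kernel. */
static uint64_t hugepages_needed(uint64_t per_dev_bytes, uint64_t ndev,
                                 uint64_t hugepage_bytes)
{
    if (hugepage_bytes == 0)
        return 0; /* no meaningful answer without a page size */

    uint64_t total = per_dev_bytes * ndev;
    return (total + hugepage_bytes - 1) / hugepage_bytes;
}
```

For example, 24 devices at a hypothetical 1,000,000 bytes each with 2 MB (2097152-byte) hugepages would need at least 12 pages for the crypto devices alone.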


Thank you in advance for your suggestion and time.



Regards,

Chinmaya


* Re: [dpdk-users] Memory requirements for crypto devices (QAT and AESNI) (using DPDK-17.02)
From: Chinmaya Dwibedy @ 2017-05-08 13:54 UTC
  To: users

Hi,

Can anyone please respond to this email? Thank you in advance for your
suggestions and time.

Regards,
Chinmaya


* Re: [dpdk-users] Memory requirements for crypto devices (QAT and AESNI) (using DPDK-17.02)
From: Trahe, Fiona @ 2017-05-09 15:52 UTC
  To: Chinmaya Dwibedy, users; +Cc: Trahe, Fiona, Kusztal, ArkadiuszX

Hi Chinmaya,

> > How to calculate the minimum memory required to configure per HW and per
> > SW crypto device. Then we allocate and set up a receive queue pair for a
> > device as follows. As of now we use one queue per device and number of
> > descriptors per queue pair is set to 2k. If we increase the number of
> > descriptors, will it improve the performance in terms of throughput?
> >
> >
[Fiona]
The QAT device can serve only a certain number of requests in parallel,
which is far smaller than 2k, so increasing the number of descriptors
won't increase throughput. In fact 2k is probably excessive and could
lead to longer latency if the queue is being filled up.
I would suggest trying values of 1k, 512 and 256; if you see no
reduction in throughput, you can use a smaller queue and save some memory.
The optimal size partly depends on how bursty your traffic is.

> > #define CDEV_MP_NB_OBJS 2048
> >
> > qp_conf.nb_descriptors = CDEV_MP_NB_OBJS;
> >
> > rte_cryptodev_queue_pair_setup (cdev_id, 0, &qp_conf, dev_conf.socket_id)
> >
> >
[Fiona] Memory for each QAT queue pair (max 2 sym qps per QAT device) is the sum of:
TX queue = qp_conf.nb_descriptors * 128 bytes
+
RX queue = qp_conf.nb_descriptors * 32 bytes
+
op cookies (used for SGL metadata) = qp_conf.nb_descriptors * 264 bytes
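
The formula above can be sketched as a small helper, which also makes it easy to compare the queue depths suggested earlier (2k, 1k, 512, 256). The macro and function names are hypothetical; the per-descriptor byte counts are the ones quoted in this reply for DPDK 17.02 and may differ in other releases.

```c
#include <stddef.h>

/* Per-descriptor sizes as quoted in this reply (assumed, DPDK 17.02 QAT PMD). */
#define QAT_TX_DESC_BYTES  128u  /* TX (request) ring entry  */
#define QAT_RX_DESC_BYTES   32u  /* RX (response) ring entry */
#define QAT_COOKIE_BYTES   264u  /* op cookie, SGL metadata  */

/* Bytes needed for one QAT queue pair with `nb_descriptors` entries. */
static size_t qat_qp_bytes(size_t nb_descriptors)
{
    return nb_descriptors *
           (QAT_TX_DESC_BYTES + QAT_RX_DESC_BYTES + QAT_COOKIE_BYTES);
}
```

With these figures, 2048 descriptors come to 868352 bytes per queue pair and 256 descriptors to 108544 bytes, so halving the queue depth halves the queue-pair memory.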

The op mempool size is entirely up to the user and is not bound to any device or PMD.


The session mempool is per device (though this will change in 17.08).
The QAT session struct is 576 bytes long, plus memory for the
bpi_ctx and inst pointers.
The number of sessions in the pool is passed in to rte_cryptodev_configure().
This should be <= max_nb_sessions for that device, which can be queried using
rte_cryptodev_info_get().
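
As a rough sketch of the session-pool arithmetic, assuming the session size is supplied by the caller: the helper names below are hypothetical, and the result covers only the session structs themselves. It leaves out the bpi_ctx and inst allocations mentioned above as well as any per-object overhead the mempool library adds.

```c
#include <stddef.h>

/* Keep the requested session count within the device limit reported by
 * rte_cryptodev_info_get() (passed in here as a plain number). */
static size_t clamp_sessions(size_t requested, size_t max_nb_sessions)
{
    return requested < max_nb_sessions ? requested : max_nb_sessions;
}

/* Payload bytes for a pool of `nb_sessions` sessions of `session_bytes`
 * each. Mempool headers, cache and alignment padding are NOT included. */
static size_t session_pool_payload_bytes(size_t nb_sessions,
                                         size_t session_bytes)
{
    return nb_sessions * session_bytes;
}
```

For instance, 2048 sessions of 576 bytes give a payload of 1179648 bytes; the real pool will be somewhat larger.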



* Re: [dpdk-users] Memory requirements for crypto devices (QAT and AESNI) (using DPDK-17.02)
From: Chinmaya Dwibedy @ 2017-05-10  9:35 UTC
  To: Trahe, Fiona; +Cc: users, Kusztal, ArkadiuszX

Hi Fiona,

Thanks a lot for your valuable feedback. I reviewed the code again and
worked out the memory requirements for a crypto device. Kindly review the
figures below and let me know if anything is wrong.

*AESNI SW Crypto Device (dpdk 17.02)*

During initialization of the AESNI vdev (via rte_eal_vdev_init(), called by
the DPDK application):

1)   Allocates memzone for cryptodev data structure: 128 bytes.

2)   Allocates memzone for cryptodev device private data: 12 bytes.

During configuration of a device (via rte_cryptodev_configure()):

1)   Allocates memzone for queue_pairs metadata: 8 bytes.

2)   Allocates memory required for session mempool: 2048*848 = 1736704 bytes.

(Note: size of element: 848 bytes, number of elements: 2048, number of
queue pairs: 1, number of sessions: 2048.)

During queue pair setup (via rte_cryptodev_queue_pair_setup()):

1)   Allocates memzone for queue pair data structure: 52928 bytes.

2)   Allocates memory for ring (to place processed operations on): 52928
bytes.

*Total memory required per AESNI vdev: 1842708 bytes (1.757 MB)*



*QAT HW Crypto Device (dpdk 17.02)*

During initialization of the QAT device (via rte_cryptodev_pci_probe(); QAT
devices are discovered during the EAL PCI probe, which is executed at DPDK
initialization):

1)   Allocates memzone for cryptodev data structure: 128 bytes.

2)   Allocates memzone for cryptodev device private data: 80 bytes.

During configuration of a device (via rte_cryptodev_configure()):

1)   Allocates memzone for queue_pairs metadata: 8 bytes.

2)   Allocates memory required for session mempool: 2048*592 = 1212416 bytes.

(Note: size of element: 592 bytes, number of elements: 2048, number of
queue pairs: 1, number of sessions: 2048.)

During queue pair setup (via rte_cryptodev_queue_pair_setup()):

1)   Allocates memzone for queue pair data structure: 320 bytes.

2)   Allocates memory for qat PMD op cookie pointer: 16384 bytes.

3)   Allocates memory for Tx queue: 262144 bytes.

4)   Allocates memory for Rx queue: 65536 bytes.

*Total memory required per QAT device: 1557016 bytes (1.485 MB)*
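
The per-device totals are simply the sums of the itemized allocations, so they can be checked mechanically. The sketch below hard-codes the items exactly as listed in this message (DPDK 17.02, 2048 sessions, one queue pair with 2048 descriptors); the array and function names are hypothetical.

```c
#include <stddef.h>

#define ARRAY_LEN(a) (sizeof(a) / sizeof((a)[0]))

/* Itemized allocations for one AESNI MB vdev, in the order listed above. */
static const size_t aesni_items[] = {
    128,            /* cryptodev data structure */
    12,             /* device private data      */
    8,              /* queue_pairs metadata     */
    2048u * 848u,   /* session mempool          */
    52928,          /* queue pair structure     */
    52928,          /* processed-ops ring       */
};

/* Itemized allocations for one QAT device, in the order listed above. */
static const size_t qat_items[] = {
    128,            /* cryptodev data structure */
    80,             /* device private data      */
    8,              /* queue_pairs metadata     */
    2048u * 592u,   /* session mempool          */
    320,            /* queue pair structure     */
    16384,          /* op cookie pointers       */
    262144,         /* Tx queue                 */
    65536,          /* Rx queue                 */
};

/* Add up `n` byte counts. */
static size_t sum_bytes(const size_t *items, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += items[i];
    return total;
}

/* Convert bytes to MB using 1 MB = 1024 * 1024 bytes. */
static double bytes_to_mb(size_t bytes)
{
    return (double)bytes / (1024.0 * 1024.0);
}
```

Calling sum_bytes(aesni_items, ARRAY_LEN(aesni_items)) or the QAT equivalent returns the per-device byte total, and bytes_to_mb() converts it for comparison against the hugepage budget.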

Regards,

Chinmaya


* Re: [dpdk-users] Memory requirements for crypto devices (QAT and AESNI) (using DPDK-17.02)
From: Kusztal, ArkadiuszX @ 2017-05-15 11:52 UTC
  To: Chinmaya Dwibedy, Trahe, Fiona; +Cc: users

Hi Chinmaya,

Sorry for the delayed answer.

As for QAT, a few corrections.

2)   Allocates memory for qat PMD op cookie pointer: 16384 bytes
There will be as many cookie pointers as nb_descriptors (so in this case 2048).
Each cookie struct takes 704 bytes: 256 bytes are needed for buffers in in-place operation and 256 for out-of-place, each plus 16 bytes, which is padded to 320 bytes due to the 64B alignment constraint; then align(320 + 320 + 16) = 704 bytes.
So it will be 704 * 2048 = 1441792 bytes.
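
The alignment arithmetic can be reproduced with a generic round-up helper. This is a sketch under one reading of the explanation above: each of the two buffer lists is 16 + 256 bytes, padded to a 64-byte boundary, and the enclosing cookie adds 16 more bytes before being padded again. The function names are hypothetical and the layout is not taken from the QAT PMD headers.

```c
#include <assert.h>
#include <stddef.h>

/* Round `n` up to the next multiple of `align` (must be a power of two). */
static size_t align_up(size_t n, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Size of one op cookie under the assumed layout described above. */
static size_t qat_cookie_bytes(void)
{
    size_t sgl = align_up(16 + 256, 64);   /* 272 rounds up to 320 */
    return align_up(sgl + sgl + 16, 64);   /* 656 rounds up to 704 */
}

/* Cookie memory for a queue pair with `nb_descriptors` entries. */
static size_t qat_cookie_total(size_t nb_descriptors)
{
    return nb_descriptors * qat_cookie_bytes();
}
```

For 2048 descriptors this gives 704 * 2048 = 1441792 bytes, matching the corrected figure.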

The qat_session size should be 568 bytes.

Regards,
Arek


* Re: [dpdk-users] Memory requirements for crypto devices (QAT and AESNI) (using DPDK-17.02)
From: Chinmaya Dwibedy @ 2017-05-16  5:51 UTC
  To: Kusztal, ArkadiuszX; +Cc: Trahe, Fiona, users

Thank you, Arek, for the corrections.

