From: Chinmaya Dwibedy
Date: Wed, 10 May 2017 15:05:36 +0530
To: "Trahe, Fiona"
Cc: "users@dpdk.org", "Kusztal, ArkadiuszX"
Subject: Re: [dpdk-users] Memory requirements for crypto devices (QAT and AESNI) (using DPDK-17.02)
In-Reply-To: <348A99DA5F5B7549AA880327E580B435891F4159@IRSMSX101.ger.corp.intel.com>
List-Id: DPDK usage discussions

Hi Fiona,

Thanks a lot for your valuable feedback. I reviewed the code once again and
worked out the memory requirements for each crypto device. Kindly review the
figures below and feel free to point out anything that is wrong.

*AESNI SW Crypto Device (dpdk 17.02)*

During initialization of the AESNI vdev (via rte_eal_vdev_init(), called by
the DPDK application):

1) Allocates memzone for cryptodev data structure: 128 bytes.
2) Allocates memzone for cryptodev device private data: 12 bytes.

During configuration of the device (via rte_cryptodev_configure()):

1) Allocates memzone for queue_pairs metadata: 8 bytes.
2) Allocates memory required for session mempool: 2048*848 = 1736704 bytes
   (Note: size of element: 848 bytes, number of elements: 2048, number of
   queue pairs: 1, number of sessions: 2048).

During queue pair setup (via rte_cryptodev_queue_pair_setup()):

1) Allocates memzone for queue pair data structure: 52928 bytes.
2) Allocates memory for ring (to place processed operations on): 52928 bytes.

*Total memory required per AESNI vdev: 1842708 bytes (1.757MB)*

*QAT HW Crypto Device (dpdk 17.02)*

During initialization of the QAT device (via rte_cryptodev_pci_probe(); QAT
devices are discovered during the EAL PCI probe, which is executed at DPDK
initialization):

1) Allocates memzone for cryptodev data structure: 128 bytes.
2) Allocates memzone for cryptodev device private data: 80 bytes.

During configuration of the device (via rte_cryptodev_configure()):

1) Allocates memzone for queue_pairs metadata: 8 bytes.
2) Allocates memory required for session mempool: 2048*592 = 1212416 bytes
   (Note: size of element: 592 bytes, number of elements: 2048, number of
   queue pairs: 1, number of sessions: 2048).

During queue pair setup (via rte_cryptodev_queue_pair_setup()):

1) Allocates memzone for queue pair data structure: 320 bytes.
2) Allocates memory for QAT PMD op cookie pointers: 16384 bytes.
3) Allocates memory for Tx queue: 262144 bytes.
4) Allocates memory for Rx queue: 65536 bytes.

*Total memory required per QAT device: 1557016 bytes (1.485MB)*

Regards,
Chinmaya

On Tue, May 9, 2017 at 9:22 PM, Trahe, Fiona wrote:
> Hi Chinmaya,
>
> > -----Original Message-----
> > From: users [mailto:users-bounces@dpdk.org] On Behalf Of Chinmaya Dwibedy
> > Sent: Monday, May 8, 2017 2:54 PM
> > To: users@dpdk.org
> > Subject: Re: [dpdk-users] Memory requirements for crypto devices (QAT and
> > AESNI) (using DPDK-17.02)
> >
> > Hi,
> >
> > Can anyone please respond to this email? Thank you in advance for your
> > suggestion and time.
> >
> > Regards,
> > Chinmaya
> >
> > On Fri, May 5, 2017 at 6:20 PM, Chinmaya Dwibedy wrote:
> >
> > > Hi All,
> > >
> > > We are using DPDK-17.02. We use crypto via hardware (QAT) and software
> > > acceleration (AESNI). There is a one-to-one mapping between crypto
> > > device and worker core. What are the memory requirements for the
> > > following?
> > >
> > > 1) Creation of one physical crypto device.
> > >
> > > 2) Creation of one AESNI MB virtual crypto device.
> > >
> > > Thereafter we configure the device with the default number of queue
> > > pairs to set up for the device, as shown below.
> > >
> > > #define CDEV_MP_CACHE_SZ 64
> > >
> > > rte_cryptodev_info_get(cdev_id, &info);
> > > dev_conf.nb_queue_pairs = info.max_nb_queue_pairs;
> > > dev_conf.session_mp.nb_objs = info.sym.max_nb_sessions;
> > > dev_conf.socket_id = SOCKET_ID_ANY;
> > > dev_conf.session_mp.cache_size = CDEV_MP_CACHE_SZ;
> > > rte_cryptodev_configure(cdev_id, &dev_conf);
> > >
> > > How do we calculate the minimum memory required to configure each HW
> > > and each SW crypto device? Then we allocate and set up a receive queue
> > > pair for the device as follows. As of now we use one queue per device
> > > and the number of descriptors per queue pair is set to 2k. If we
> > > increase the number of descriptors, will it improve performance in
> > > terms of throughput?
> > >
>
> [Fiona]
> The QAT device can serve only a certain number of requests in parallel,
> which is far smaller than 2k, so increasing the number of descriptors
> won't speed up throughput. In fact 2k is probably excessive and could
> lead to longer latency if the queue is being filled up.
> I would suggest trying values of 1k, 512 and 256; if you see no
> reduction in throughput you can use a smaller queue and save some memory.
> The optimal size partly depends on how bursty your traffic is.
>
> > > #define CDEV_MP_NB_OBJS 2048
> > >
> > > qp_conf.nb_descriptors = CDEV_MP_NB_OBJS;
> > >
> > > rte_cryptodev_queue_pair_setup(cdev_id, 0, &qp_conf, dev_conf.socket_id);
> > >
>
> [Fiona] Memory for each QAT queue pair (max 2 sym qps per QAT device):
> TX queue = qp_conf.nb_descriptors * 128 bytes
> +
> RX queue = qp_conf.nb_descriptors * 32 bytes
> +
> op cookies (used for sgl meta-data) = qp_conf.nb_descriptors * 264 bytes
>
> op mempool size is totally up to the user and is not bound to any device
> or PMD.
>
> Session mempool is per device (though this will change in 17.08).
> QAT session struct is 576 bytes long + memory for
> bpi_ctx and inst pointers.
> The number of sessions in the pool is passed in to rte_cryptodev_configure().
> This should be <= max_nb_sessions for that device, which can be queried
> using rte_cryptodev_info_get().
>
> > > We create a session for symmetric cryptographic operations per IPsec
> > > security association. What is the memory required to hold the session
> > > data structure?
> > >
> > > The intent behind this is to calculate the memory requirements in
> > > advance (before EAL initialization) and, based upon the available
> > > memory, figure out how many crypto devices can be initialized (note:
> > > our application initializes the AESNI vdev without using the EAL
> > > command line option). Say there are 24 worker cores and we need 24
> > > AESNI crypto vdevs, but there is not sufficient hugepage memory for
> > > creating them. In that case, we will allocate more hugepages, then
> > > call rte_eal_init() and expect it to succeed.
> > >
> > > Thank you in advance for your suggestion and time.
> > >
> > > Regards,
> > > Chinmaya
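
The per-queue-pair formula in Fiona's reply can be turned into a small calculator for sizing hugepage memory before EAL initialization. The following is only a minimal sketch in plain C, not DPDK code: the helper names are hypothetical, the per-descriptor sizes (128, 32 and 264 bytes) are the ones quoted in the reply for the QAT PMD in DPDK 17.02, and the session element size is a parameter (592 bytes is the value reported in this thread). Mempool headers, per-core caches and alignment padding are not accounted for, so treat the result as a lower bound.

```c
#include <stdint.h>

/* Per-descriptor sizes quoted in this thread for the QAT PMD (DPDK 17.02). */
#define QAT_TX_DESC_BYTES 128u /* TX queue entry */
#define QAT_RX_DESC_BYTES  32u /* RX queue entry */
#define QAT_COOKIE_BYTES  264u /* op cookie (sgl meta-data) */

/* Hypothetical helper: bytes used by one QAT queue pair
 * (TX queue + RX queue + op cookies) for a given descriptor count. */
static uint64_t qat_qp_bytes(uint32_t nb_descriptors)
{
    return (uint64_t)nb_descriptors *
           (QAT_TX_DESC_BYTES + QAT_RX_DESC_BYTES + QAT_COOKIE_BYTES);
}

/* Hypothetical helper: raw payload of a session mempool,
 * ignoring mempool header, cache and alignment overhead. */
static uint64_t session_pool_bytes(uint32_t nb_sessions, uint32_t elt_size)
{
    return (uint64_t)nb_sessions * elt_size;
}

/* Hypothetical helper: lower-bound estimate for one QAT device with
 * nb_qps identically sized queue pairs and one session mempool. */
static uint64_t qat_dev_estimate_bytes(uint32_t nb_qps,
                                       uint32_t nb_descriptors,
                                       uint32_t nb_sessions,
                                       uint32_t session_elt_size)
{
    return (uint64_t)nb_qps * qat_qp_bytes(nb_descriptors) +
           session_pool_bytes(nb_sessions, session_elt_size);
}
```

For example, qat_qp_bytes(2048) evaluates to 868352 bytes, while qat_qp_bytes(512) evaluates to 217088 bytes, which shows the saving from a smaller queue. Note that this counts the full op cookies (264 bytes each) as in Fiona's reply, not the 16384-byte cookie pointer array listed in the breakdown at the top of this message.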