From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Verma, Shally" <Shally.Verma@cavium.com>
To: Tomasz Jozwiak <tomaszx.jozwiak@intel.com>, "dev@dpdk.org", "fiona.trahe@intel.com", "akhil.goyal@nxp.com", "pablo.de.lara.guarch@intel.com"
Date: Fri, 12 Oct 2018 10:15:58 +0000
In-Reply-To: <1538400427-20164-3-git-send-email-tomaszx.jozwiak@intel.com>
References: <1538400427-20164-1-git-send-email-tomaszx.jozwiak@intel.com> <1538400427-20164-3-git-send-email-tomaszx.jozwiak@intel.com>
Subject: Re: [dpdk-dev] [PATCH 2/3] app/compress-perf: add performance measurement

Hi TomaszX,

Sorry for the delay in response. Comments inline.

>-----Original Message-----
>From: dev On Behalf Of Tomasz Jozwiak
>Sent: 01 October 2018 18:57
>To: dev@dpdk.org; fiona.trahe@intel.com; tomaszx.jozwiak@intel.com; akhil.goyal@nxp.com; pablo.de.lara.guarch@intel.com
>Cc: De@dpdk.org; Lara@dpdk.org; Guarch@dpdk.org
>Subject: [dpdk-dev] [PATCH 2/3] app/compress-perf: add performance measurement
>
>External Email
>
>Added performance measurement part into compression perf. test.
>
>Signed-off-by: De Lara Guarch, Pablo
>Signed-off-by: Tomasz Jozwiak
>---
> app/test-compress-perf/main.c | 844 ++++++++++++++++++++++++++++++++++++++++++
> 1 file changed, 844 insertions(+)
>
>diff --git a/app/test-compress-perf/main.c b/app/test-compress-perf/main.c
>index f52b98d..093dfaf 100644
>--- a/app/test-compress-perf/main.c
>+++ b/app/test-compress-perf/main.c
>@@ -5,13 +5,721 @@
> #include
> #include
> #include
>+#include
> #include
>
> #include "comp_perf_options.h"
>
>+#define NUM_MAX_XFORMS 16
>+#define NUM_MAX_INFLIGHT_OPS 512
>+#define EXPANSE_RATIO 1.05
>+#define MIN_ISAL_SIZE 8
>+
>+#define DIV_CEIL(a, b)  ((a) / (b) + ((a) % (b) != 0))
>+
>+static int
>+param_range_check(uint16_t size, const struct rte_param_log2_range *range)
>+{
>+	unsigned int next_size;
>+
>+	/* Check lower/upper bounds */
>+	if (size < range->min)
>+		return -1;
>+
>+	if (size > range->max)
>+		return -1;
>+
>+	/* If range is actually only one value, size is correct */
>+	if (range->increment == 0)
>+		return 0;
>+
>+	/* Check if value is one of the supported sizes */
>+	for (next_size = range->min; next_size <= range->max;
>+			next_size += range->increment)
>+		if (size == next_size)
>+			return 0;
>+
>+	return -1;
>+}
>+
>+static int
>+comp_perf_check_capabilities(struct comp_test_data *test_data)
>+{
>+	const struct rte_compressdev_capabilities *cap;
>+
>+	cap = rte_compressdev_capability_get(test_data->cdev_id,
>+					     RTE_COMP_ALGO_DEFLATE);
>+
>+	if (cap == NULL) {
>+		RTE_LOG(ERR, USER1,
>+			"Compress device does not support DEFLATE\n");
>+		return -1;
>+	}
>+
>+	uint64_t comp_flags = cap->comp_feature_flags;
>+
>+	/* Huffman enconding */
>+	if (test_data->huffman_enc == RTE_COMP_HUFFMAN_FIXED &&
>+			(comp_flags & RTE_COMP_FF_HUFFMAN_FIXED) == 0) {
>+		RTE_LOG(ERR, USER1,
>+			"Compress device does not supported Fixed Huffman\n");
>+		return -1;
>+	}
>+
>+	if (test_data->huffman_enc == RTE_COMP_HUFFMAN_DYNAMIC &&
>+			(comp_flags & RTE_COMP_FF_HUFFMAN_DYNAMIC) == 0) {
>+		RTE_LOG(ERR, USER1,
>+			"Compress device does not supported Dynamic Huffman\n");
>+		return -1;
>+	}
>+
>+	/* Window size */
>+	if (test_data->window_sz != -1) {
>+		if (param_range_check(test_data->window_sz, &cap->window_size)

What if cap->window_size is 0, i.e. implementation default?
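A minimal sketch of how such a case could be handled, reusing only names already present in the patch. Treating an all-zero window_size range as "use the PMD's default" is an assumption on my side, not something the patch states:

	/* Hypothetical guard: treat an all-zero range as "PMD default
	 * window size" and skip the range check in that case.
	 */
	if (cap->window_size.min == 0 && cap->window_size.max == 0) {
		RTE_LOG(INFO, USER1,
			"Compress device reports a default window size only\n");
	} else if (param_range_check(test_data->window_sz,
			&cap->window_size) < 0) {
		RTE_LOG(ERR, USER1,
			"Compress device does not support this window size\n");
		return -1;
	}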
>+				< 0) {
>+			RTE_LOG(ERR, USER1,
>+				"Compress device does not support "
>+				"this window size\n");
>+			return -1;
>+		}
>+	} else
>+		/* Set window size to PMD maximum if none was specified */
>+		test_data->window_sz = cap->window_size.max;
>+
>+	/* Check if chained mbufs is supported */
>+	if (test_data->max_sgl_segs > 1 &&
>+			(comp_flags & RTE_COMP_FF_OOP_SGL_IN_SGL_OUT) == 0) {
>+		RTE_LOG(INFO, USER1, "Compress device does not support "
>+				"chained mbufs. Max SGL segments set to 1\n");
>+		test_data->max_sgl_segs = 1;
>+	}
>+
>+	/* Level 0 support */
>+	if (test_data->level.min == 0 &&
>+			(comp_flags & RTE_COMP_FF_NONCOMPRESSED_BLOCKS) == 0) {
>+		RTE_LOG(ERR, USER1, "Compress device does not support "
>+				"level 0 (no compression)\n");
>+		return -1;
>+	}
>+
>+	return 0;
>+}
>+
>+static int
>+comp_perf_allocate_memory(struct comp_test_data *test_data)
>+{
>+	/* Number of segments for input and output
>+	 * (compression and decompression)
>+	 */
>+	uint32_t total_segs = DIV_CEIL(test_data->input_data_sz,
>+			test_data->seg_sz);
>+	test_data->comp_buf_pool = rte_pktmbuf_pool_create("comp_buf_pool",
>+				total_segs,
>+				0, 0, test_data->seg_sz + RTE_PKTMBUF_HEADROOM,
>+				rte_socket_id());
>+	if (test_data->comp_buf_pool == NULL) {
>+		RTE_LOG(ERR, USER1, "Mbuf mempool could not be created\n");
>+		return -1;
>+	}
>+
>+	test_data->decomp_buf_pool = rte_pktmbuf_pool_create("decomp_buf_pool",
>+				total_segs,
>+				0, 0, test_data->seg_sz + RTE_PKTMBUF_HEADROOM,
>+				rte_socket_id());
>+	if (test_data->decomp_buf_pool == NULL) {
>+		RTE_LOG(ERR, USER1, "Mbuf mempool could not be created\n");
>+		return -1;
>+	}
>+
>+	test_data->total_bufs = DIV_CEIL(total_segs, test_data->max_sgl_segs);
>+
>+	test_data->op_pool = rte_comp_op_pool_create("op_pool",
>+				test_data->total_bufs,
>+				0, 0, rte_socket_id());
>+	if (test_data->op_pool == NULL) {
>+		RTE_LOG(ERR, USER1, "Comp op mempool could not be created\n");
>+		return -1;
>+	}
>+
>+	/*
>+	 * Compressed data might be a bit larger than input data,
>+	 * if data cannot be compressed

Possible only if it's zlib format, right? Or deflate as well?
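For what it's worth, raw deflate can also expand incompressible data: the worst case is stored blocks, roughly 5 bytes of header per up-to-64 KB block, plus a small fixed overhead for optional zlib/gzip framing. A hedged sketch of a tighter bound than a flat EXPANSE_RATIO (deflate_bound() is a made-up helper, not an existing API):

	/* Sketch only: worst-case output size for incompressible input with
	 * raw DEFLATE stored blocks (~5 bytes of header per 64 KB block),
	 * plus a little slack for optional zlib/gzip framing.
	 */
	static size_t
	deflate_bound(size_t src_sz)
	{
		return src_sz + 5 * (src_sz / 65535 + 1) + 18;
	}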
>+	 */
>+	test_data->compressed_data = rte_zmalloc_socket(NULL,
>+				test_data->input_data_sz * EXPANSE_RATIO
>+						+ MIN_ISAL_SIZE, 0,
>+				rte_socket_id());
>+	if (test_data->compressed_data == NULL) {
>+		RTE_LOG(ERR, USER1, "Memory to hold the data from the input "
>+				"file could not be allocated\n");
>+		return -1;
>+	}
>+
>+	test_data->decompressed_data = rte_zmalloc_socket(NULL,
>+				test_data->input_data_sz, 0,
>+				rte_socket_id());
>+	if (test_data->decompressed_data == NULL) {
>+		RTE_LOG(ERR, USER1, "Memory to hold the data from the input "
>+				"file could not be allocated\n");
>+		return -1;
>+	}
>+
>+	test_data->comp_bufs = rte_zmalloc_socket(NULL,
>+			test_data->total_bufs * sizeof(struct rte_mbuf *),
>+			0, rte_socket_id());
>+	if (test_data->comp_bufs == NULL) {
>+		RTE_LOG(ERR, USER1, "Memory to hold the compression mbufs"
>+				" could not be allocated\n");
>+		return -1;
>+	}
>+
>+	test_data->decomp_bufs = rte_zmalloc_socket(NULL,
>+			test_data->total_bufs * sizeof(struct rte_mbuf *),
>+			0, rte_socket_id());
>+	if (test_data->decomp_bufs == NULL) {
>+		RTE_LOG(ERR, USER1, "Memory to hold the decompression mbufs"
>+				" could not be allocated\n");
>+		return -1;
>+	}
>+	return 0;
>+}
>+
>+static int
>+comp_perf_dump_input_data(struct comp_test_data *test_data)
>+{
>+	FILE *f = fopen(test_data->input_file, "r");
>+
>+	if (f == NULL) {
>+		RTE_LOG(ERR, USER1, "Input file could not be opened\n");
>+		return -1;
>+	}
>+
>+	if (fseek(f, 0, SEEK_END) != 0) {
>+		RTE_LOG(ERR, USER1, "Size of input could not be calculated\n");
>+		goto err;
>+	}
>+	size_t actual_file_sz = ftell(f);
>+	/* If extended input data size has not been set,
>+	 * input data size = file size
>+	 */
>+
>+	if (test_data->input_data_sz == 0)
>+		test_data->input_data_sz = actual_file_sz;
>+
>+	if (fseek(f, 0, SEEK_SET) != 0) {
>+		RTE_LOG(ERR, USER1, "Size of input could not be calculated\n");
>+		goto err;
>+	}
>+
>+	test_data->input_data = rte_zmalloc_socket(NULL,
>+			test_data->input_data_sz, 0, rte_socket_id());
>+
>+	if (test_data->input_data == NULL) {
>+		RTE_LOG(ERR, USER1, "Memory to hold the data from the input "
>+				"file could not be allocated\n");
>+		goto err;
>+	}
>+
>+	size_t remaining_data = test_data->input_data_sz;
>+	uint8_t *data = test_data->input_data;
>+
>+	while (remaining_data > 0) {
>+		size_t data_to_read = RTE_MIN(remaining_data, actual_file_sz);
>+
>+		if (fread(data, data_to_read, 1, f) != 1) {
>+			RTE_LOG(ERR, USER1, "Input file could not be read\n");
>+			goto err;
>+		}
>+		if (fseek(f, 0, SEEK_SET) != 0) {
>+			RTE_LOG(ERR, USER1,
>+				"Size of input could not be calculated\n");
>+			goto err;
>+		}
>+		remaining_data -= data_to_read;
>+		data += data_to_read;

It looks like it will run a 2nd time only if the input file size is smaller than the input data size, in which case it will just keep filling the input buffer with repeated data.
Is that the intention here?

>+	}
>+
>+	if (test_data->input_data_sz > actual_file_sz)
>+		RTE_LOG(INFO, USER1,
>+			"%zu bytes read from file %s, extending the file %.2f times\n",
>+			test_data->input_data_sz, test_data->input_file,
>+			(double)test_data->input_data_sz/actual_file_sz);
>+	else
>+		RTE_LOG(INFO, USER1,
>+			"%zu bytes read from file %s\n",
>+			test_data->input_data_sz, test_data->input_file);
>+
>+	fclose(f);
>+
>+	return 0;
>+
>+err:
>+	fclose(f);
>+	rte_free(test_data->input_data);
>+	test_data->input_data = NULL;
>+
>+	return -1;
>+}
>+
>+static int
>+comp_perf_initialize_compressdev(struct comp_test_data *test_data)
>+{
>+	uint8_t enabled_cdev_count;
>+	uint8_t enabled_cdevs[RTE_COMPRESS_MAX_DEVS];
>+
>+	enabled_cdev_count = rte_compressdev_devices_get(test_data->driver_name,
>+			enabled_cdevs, RTE_COMPRESS_MAX_DEVS);
>+	if (enabled_cdev_count == 0) {
>+		RTE_LOG(ERR, USER1, "No compress devices type %s available\n",
>+				test_data->driver_name);
>+		return -EINVAL;
>+	}
>+
>+	if (enabled_cdev_count > 1)
>+		RTE_LOG(INFO, USER1,
>+			"Only the first compress device will be used\n");
>+
>+	test_data->cdev_id = enabled_cdevs[0];
>+
>+	if (comp_perf_check_capabilities(test_data) < 0)
>+		return -1;
>+
>+	/* Configure compressdev (one device, one queue pair) */
>+	struct rte_compressdev_config config = {
>+		.socket_id = rte_socket_id(),
>+		.nb_queue_pairs = 1,
>+		.max_nb_priv_xforms = NUM_MAX_XFORMS,
>+		.max_nb_streams = 0
>+	};
>+
>+	if (rte_compressdev_configure(test_data->cdev_id, &config) < 0) {
>+		RTE_LOG(ERR, USER1, "Device configuration failed\n");
>+		return -1;
>+	}
>+
>+	if (rte_compressdev_queue_pair_setup(test_data->cdev_id, 0,
>+			NUM_MAX_INFLIGHT_OPS, rte_socket_id()) < 0) {
>+		RTE_LOG(ERR, USER1, "Queue pair setup failed\n");
>+		return -1;
>+	}
>+
>+	if (rte_compressdev_start(test_data->cdev_id) < 0) {
>+		RTE_LOG(ERR, USER1, "Device could not be started\n");
>+		return -1;
>+	}
>+
>+	return 0;
>+}
>+
>+static int
>+prepare_bufs(struct comp_test_data *test_data)
>+{
>+	uint32_t remaining_data = test_data->input_data_sz;
>+	uint8_t *input_data_ptr = test_data->input_data;
>+	size_t data_sz;
>+	uint8_t *data_addr;
>+	uint32_t i, j;
>+
>+	for (i = 0; i < test_data->total_bufs; i++) {
>+		/* Allocate data in input mbuf and copy data from input file */
>+		test_data->decomp_bufs[i] =
>+			rte_pktmbuf_alloc(test_data->decomp_buf_pool);
>+		if (test_data->decomp_bufs[i] == NULL) {
>+			RTE_LOG(ERR, USER1, "Could not allocate mbuf\n");
mbuf\n"); >+ return -1; >+ } >+ >+ data_sz =3D RTE_MIN(remaining_data, test_data->seg_sz); >+ data_addr =3D (uint8_t *) rte_pktmbuf_append( >+ test_data->decomp_bufs[i], data_sz= ); >+ if (data_addr =3D=3D NULL) { >+ RTE_LOG(ERR, USER1, "Could not append data\n"); >+ return -1; >+ } >+ rte_memcpy(data_addr, input_data_ptr, data_sz); >+ >+ input_data_ptr +=3D data_sz; >+ remaining_data -=3D data_sz; >+ >+ /* Already one segment in the mbuf */ >+ uint16_t segs_per_mbuf =3D 1; >+ >+ /* Chain mbufs if needed for input mbufs */ >+ while (segs_per_mbuf < test_data->max_sgl_segs >+ && remaining_data > 0) { >+ struct rte_mbuf *next_seg =3D >+ rte_pktmbuf_alloc(test_data->decomp_buf_po= ol); >+ >+ if (next_seg =3D=3D NULL) { >+ RTE_LOG(ERR, USER1, >+ "Could not allocate mbuf\n"); >+ return -1; >+ } >+ >+ data_sz =3D RTE_MIN(remaining_data, test_data->seg= _sz); >+ data_addr =3D (uint8_t *)rte_pktmbuf_append(next_s= eg, >+ data_sz); >+ >+ if (data_addr =3D=3D NULL) { >+ RTE_LOG(ERR, USER1, "Could not append data= \n"); Since a new buffer per segment is allocated, so is it possible for append t= o fail? think, this check is redundant here. >+ return -1; >+ } >+ >+ rte_memcpy(data_addr, input_data_ptr, data_sz); >+ input_data_ptr +=3D data_sz; >+ remaining_data -=3D data_sz; >+ >+ if (rte_pktmbuf_chain(test_data->decomp_bufs[i], >+ next_seg) < 0) { >+ RTE_LOG(ERR, USER1, "Could not chain mbufs= \n"); >+ return -1; >+ } >+ segs_per_mbuf++; >+ } >+ >+ /* Allocate data in output mbuf */ >+ test_data->comp_bufs[i] =3D >+ rte_pktmbuf_alloc(test_data->comp_buf_pool); >+ if (test_data->comp_bufs[i] =3D=3D NULL) { >+ RTE_LOG(ERR, USER1, "Could not allocate mbuf\n"); >+ return -1; >+ } >+ data_addr =3D (uint8_t *) rte_pktmbuf_append( >+ test_data->comp_bufs[i], >+ test_data->seg_sz); >+ if (data_addr =3D=3D NULL) { >+ RTE_LOG(ERR, USER1, "Could not append data\n"); >+ return -1; >+ } >+ >+ /* Chain mbufs if needed for output mbufs */ >+ for (j =3D 1; j < segs_per_mbuf; j++) { >+ struct rte_mbuf *next_seg =3D >+ rte_pktmbuf_alloc(test_data->comp_buf_pool= ); >+ >+ if (next_seg =3D=3D NULL) { >+ RTE_LOG(ERR, USER1, >+ "Could not allocate mbuf\n"); >+ return -1; >+ } >+ >+ data_addr =3D (uint8_t *)rte_pktmbuf_append(next_s= eg, >+ test_data->seg_sz); >+ >+ if (data_addr =3D=3D NULL) { >+ RTE_LOG(ERR, USER1, "Could not append data= \n"); >+ return -1; >+ } >+ >+ if (rte_pktmbuf_chain(test_data->comp_bufs[i], >+ next_seg) < 0) { >+ RTE_LOG(ERR, USER1, "Could not chain mbufs= \n"); >+ return -1; >+ } >+ } >+ } >+ >+ return 0; >+} >+ >+static void >+free_bufs(struct comp_test_data *test_data) >+{ >+ uint32_t i; >+ >+ for (i =3D 0; i < test_data->total_bufs; i++) { >+ rte_pktmbuf_free(test_data->comp_bufs[i]); >+ rte_pktmbuf_free(test_data->decomp_bufs[i]); >+ } >+ rte_free(test_data->comp_bufs); >+ rte_free(test_data->decomp_bufs); >+} >+ >+static int >+main_loop(struct comp_test_data *test_data, uint8_t level, >+ enum rte_comp_xform_type type, >+ uint8_t *output_data_ptr, >+ size_t *output_data_sz, >+ unsigned int benchmarking) >+{ >+ uint8_t dev_id =3D test_data->cdev_id; >+ uint32_t i, iter, num_iter; >+ struct rte_comp_op **ops, **deq_ops; >+ void *priv_xform =3D NULL; >+ struct rte_comp_xform xform; >+ size_t output_size =3D 0; >+ struct rte_mbuf **input_bufs, **output_bufs; >+ int res =3D 0; >+ int allocated =3D 0; >+ >+ if (test_data =3D=3D NULL || !test_data->burst_sz) { >+ RTE_LOG(ERR, USER1, >+ "Unknow burst size\n"); >+ return -1; >+ } >+ >+ ops =3D rte_zmalloc_socket(NULL, >+ 2 * test_data->total_bufs * 
>+		0, rte_socket_id());
>+
>+	if (ops == NULL) {
>+		RTE_LOG(ERR, USER1,
>+			"Can't allocate memory for ops strucures\n");
>+		return -1;
>+	}
>+
>+	deq_ops = &ops[test_data->total_bufs];
>+
>+	if (type == RTE_COMP_COMPRESS) {
>+		xform = (struct rte_comp_xform) {
>+			.type = RTE_COMP_COMPRESS,
>+			.compress = {
>+				.algo = RTE_COMP_ALGO_DEFLATE,
>+				.deflate.huffman = test_data->huffman_enc,
>+				.level = level,
>+				.window_size = test_data->window_sz,
>+				.chksum = RTE_COMP_CHECKSUM_NONE,
>+				.hash_algo = RTE_COMP_HASH_ALGO_NONE
>+			}
>+		};
>+		input_bufs = test_data->decomp_bufs;
>+		output_bufs = test_data->comp_bufs;
>+	} else {
>+		xform = (struct rte_comp_xform) {
>+			.type = RTE_COMP_DECOMPRESS,
>+			.decompress = {
>+				.algo = RTE_COMP_ALGO_DEFLATE,
>+				.chksum = RTE_COMP_CHECKSUM_NONE,
>+				.window_size = test_data->window_sz,
>+				.hash_algo = RTE_COMP_HASH_ALGO_NONE
>+			}
>+		};
>+		input_bufs = test_data->comp_bufs;
>+		output_bufs = test_data->decomp_bufs;
>+	}
>+
>+	/* Create private xform */
>+	if (rte_compressdev_private_xform_create(dev_id, &xform,
>+			&priv_xform) < 0) {
>+		RTE_LOG(ERR, USER1, "Private xform could not be created\n");
>+		res = -1;
>+		goto end;
>+	}
>+
>+	uint64_t tsc_start, tsc_end, tsc_duration;
>+
>+	tsc_start = tsc_end = tsc_duration = 0;
>+	if (benchmarking) {
>+		tsc_start = rte_rdtsc();
>+		num_iter = test_data->num_iter;
>+	} else
>+		num_iter = 1;

Looks like in the same code we're doing both benchmarking and functional validation. It can be reorganised to keep the validation test separate, like it is done in crypto_perf.

>+
>+	for (iter = 0; iter < num_iter; iter++) {
>+		uint32_t total_ops = test_data->total_bufs;
>+		uint32_t remaining_ops = test_data->total_bufs;
>+		uint32_t total_deq_ops = 0;
>+		uint32_t total_enq_ops = 0;
>+		uint16_t ops_unused = 0;
>+		uint16_t num_enq = 0;
>+		uint16_t num_deq = 0;
>+
>+		output_size = 0;
>+
>+		while (remaining_ops > 0) {
>+			uint16_t num_ops = RTE_MIN(remaining_ops,
>+					test_data->burst_sz);
>+			uint16_t ops_needed = num_ops - ops_unused;
>+
>+			/*
>+			 * Move the unused operations from the previous
>+			 * enqueue_burst call to the front, to maintain order
>+			 */
>+			if ((ops_unused > 0) && (num_enq > 0)) {
>+				size_t nb_b_to_mov =
>+					ops_unused * sizeof(struct rte_comp_op *);
>+
>+				memmove(ops, &ops[num_enq], nb_b_to_mov);
>+			}
>+
>+			/* Allocate compression operations */
>+			if (ops_needed && !rte_comp_op_bulk_alloc(
>+						test_data->op_pool,
>+						&ops[ops_unused],
>+						ops_needed)) {
>+				RTE_LOG(ERR, USER1,
>+					"Could not allocate enough operations\n");
>+				res = -1;
>+				goto end;
>+			}
>+			allocated += ops_needed;
>+
>+			for (i = 0; i < ops_needed; i++) {
>+				/*
>+				 * Calculate next buffer to attach to operation
>+				 */
>+				uint32_t buf_id = total_enq_ops + i +
>+						ops_unused;
>+				uint16_t op_id = ops_unused + i;
>+				/* Reset all data in output buffers */
>+				struct rte_mbuf *m = output_bufs[buf_id];
>+
>+				m->pkt_len = test_data->seg_sz * m->nb_segs;

Isn't pkt_len set already when we call rte_pktmbuf_append() and chain()?

>+				while (m) {
>+					m->data_len = m->buf_len - m->data_off;

Same question, shouldn't rte_pktmbuf_append() adjust data_len as well per each mbuf?
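For reference, a sketch of what this reset amounts to (the helper name is made up, it is not in the patch). rte_pktmbuf_append()/rte_pktmbuf_chain() do set pkt_len/data_len once when the buffers are prepared, but the dst mbufs are reused across iterations and across the compress/decompress runs, where their lengths may already have been trimmed to op->produced, so presumably the full capacity has to be restored here:

	/* Illustrative only: restore a chained dst mbuf to its full
	 * segment capacity before reusing it as a destination buffer.
	 */
	static void
	reset_dst_mbuf(struct rte_mbuf *m, uint32_t seg_sz)
	{
		struct rte_mbuf *seg;

		m->pkt_len = seg_sz * m->nb_segs;
		for (seg = m; seg != NULL; seg = seg->next)
			seg->data_len = seg->buf_len - seg->data_off;
	}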
>+					m = m->next;
>+				}
>+				ops[op_id]->m_src = input_bufs[buf_id];
>+				ops[op_id]->m_dst = output_bufs[buf_id];
>+				ops[op_id]->src.offset = 0;
>+				ops[op_id]->src.length =
>+					rte_pktmbuf_pkt_len(input_bufs[buf_id]);
>+				ops[op_id]->dst.offset = 0;
>+				ops[op_id]->flush_flag = RTE_COMP_FLUSH_FINAL;
>+				ops[op_id]->input_chksum = buf_id;
>+				ops[op_id]->private_xform = priv_xform;
>+			}
>+
>+			num_enq = rte_compressdev_enqueue_burst(dev_id, 0, ops,
>+								num_ops);
>+			ops_unused = num_ops - num_enq;
>+			remaining_ops -= num_enq;
>+			total_enq_ops += num_enq;
>+
>+			num_deq = rte_compressdev_dequeue_burst(dev_id, 0,
>+							deq_ops,
>+							test_data->burst_sz);
>+			total_deq_ops += num_deq;
>+			if (benchmarking == 0) {
>+				for (i = 0; i < num_deq; i++) {
>+					struct rte_comp_op *op = deq_ops[i];
>+					const void *read_data_addr =
>+						rte_pktmbuf_read(op->m_dst, 0,
>+						op->produced, output_data_ptr);
>+					if (read_data_addr == NULL) {
>+						RTE_LOG(ERR, USER1,
>+					"Could not copy buffer in destination\n");
>+						res = -1;
>+						goto end;
>+					}
>+
>+					if (read_data_addr != output_data_ptr)
>+						rte_memcpy(output_data_ptr,
>+							rte_pktmbuf_mtod(
>+							  op->m_dst, uint8_t *),
>+							op->produced);
>+					output_data_ptr += op->produced;
>+					output_size += op->produced;
>+
>+				}
>+			}
>+
>+			if (iter == num_iter - 1) {
>+				for (i = 0; i < num_deq; i++) {

Why is the dst mbuf data_len adjusted only for the last iteration?
Shouldn't it be done for each dequeued op?
And, for benchmarking, do we even need to set data and pkt len on the dst mbuf?

>+					struct rte_comp_op *op = deq_ops[i];
>+					struct rte_mbuf *m = op->m_dst;
>+
>+					m->pkt_len = op->produced;
>+					uint32_t remaining_data = op->produced;
>+					uint16_t data_to_append;
>+
>+					while (remaining_data > 0) {
>+						data_to_append =
>+							RTE_MIN(remaining_data,
>+							test_data->seg_sz);
>+						m->data_len = data_to_append;
>+						remaining_data -=
>+								data_to_append;
>+						m = m->next;

Should break if m->next == NULL.
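For example (sketch only, reusing the patch's own variables):

	while (remaining_data > 0) {
		data_to_append = RTE_MIN(remaining_data,
					 test_data->seg_sz);
		m->data_len = data_to_append;
		remaining_data -= data_to_append;
		m = m->next;
		if (m == NULL)	/* don't walk past the last segment */
			break;
	}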
>+					}
>+				}
>+			}
>+			rte_mempool_put_bulk(test_data->op_pool,
>+					     (void **)deq_ops, num_deq);
>+			allocated -= num_deq;
>+		}
>+
>+		/* Dequeue the last operations */
>+		while (total_deq_ops < total_ops) {
>+			num_deq = rte_compressdev_dequeue_burst(dev_id, 0,
>+					deq_ops, test_data->burst_sz);
>+			total_deq_ops += num_deq;
>+			if (benchmarking == 0) {
>+				for (i = 0; i < num_deq; i++) {
>+					struct rte_comp_op *op = deq_ops[i];
>+					const void *read_data_addr =
>+						rte_pktmbuf_read(op->m_dst, 0,
>+						op->produced, output_data_ptr);
>+					if (read_data_addr == NULL) {
>+						RTE_LOG(ERR, USER1,
>+					"Could not copy buffer in destination\n");
>+						res = -1;
>+						goto end;
>+					}
>+
>+					if (read_data_addr != output_data_ptr)
>+						rte_memcpy(output_data_ptr,
>+							rte_pktmbuf_mtod(
>+							op->m_dst, uint8_t *),
>+							op->produced);
>+					output_data_ptr += op->produced;
>+					output_size += op->produced;
>+
>+				}
>+			}
>+
>+			if (iter == num_iter - 1) {
>+				for (i = 0; i < num_deq; i++) {
>+					struct rte_comp_op *op = deq_ops[i];
>+					struct rte_mbuf *m = op->m_dst;
>+
>+					m->pkt_len = op->produced;
>+					uint32_t remaining_data = op->produced;
>+					uint16_t data_to_append;
>+
>+					while (remaining_data > 0) {
>+						data_to_append =
>+							RTE_MIN(remaining_data,
>+							test_data->seg_sz);
>+						m->data_len = data_to_append;
>+						remaining_data -=
>+								data_to_append;
>+						m = m->next;
>+					}
>+				}
>+			}
>+			rte_mempool_put_bulk(test_data->op_pool,
>+					     (void **)deq_ops, num_deq);
>+			allocated -= num_deq;
>+		}
>+	}
>+
>+	if (benchmarking) {
>+		tsc_end = rte_rdtsc();
>+		tsc_duration = tsc_end - tsc_start;
>+
>+		if (type == RTE_COMP_COMPRESS)

The test covers stateless operations only, so can we add a perf test type, like: test type: perf, op type: STATELESS/STATEFUL?
Also, why do we need --max-num-sgl-segs as an input option from the user? Shouldn't input_sz and seg_sz internally decide on num-segs? Or is it added to serve some other purpose?

Thanks
Shally

>+			test_data->comp_tsc_duration[level] =
>+					tsc_duration / num_iter;
>+		else
>+			test_data->decomp_tsc_duration[level] =
>+					tsc_duration / num_iter;
>+	}
>+
>+	if (benchmarking == 0 && output_data_sz)
>+		*output_data_sz = output_size;
>+end:
>+	rte_mempool_put_bulk(test_data->op_pool, (void **)ops, allocated);
>+	rte_compressdev_private_xform_free(dev_id, priv_xform);
>+	rte_free(ops);
>+	return res;
>+}
>+
> int
> main(int argc, char **argv)
> {
>+	uint8_t level, level_idx = 0;
>+	uint8_t i;
> 	int ret;
> 	struct comp_test_data *test_data;
>
>@@ -43,9 +751,145 @@ main(int argc, char **argv)
> 		goto err;
> 	}
>
>+	if (comp_perf_initialize_compressdev(test_data) < 0) {
>+		ret = EXIT_FAILURE;
>+		goto err;
>+	}
>+
>+	if (comp_perf_dump_input_data(test_data) < 0) {
>+		ret = EXIT_FAILURE;
>+		goto err;
>+	}
>+
>+	if (comp_perf_allocate_memory(test_data) < 0) {
>+		ret = EXIT_FAILURE;
>+		goto err;
>+	}
>+
>+	if (prepare_bufs(test_data) < 0) {
>+		ret = EXIT_FAILURE;
>+		goto err;
>+	}
>+
>+	if (test_data->level.inc != 0)
>+		level = test_data->level.min;
>+	else
>+		level = test_data->level.list[0];
>+
>+	size_t comp_data_sz;
>+	size_t decomp_data_sz;
>+
>+	printf("Burst size = %u\n", test_data->burst_sz);
>+	printf("File size = %zu\n", test_data->input_data_sz);
>+
>+	printf("%6s%12s%17s%19s%21s%15s%21s%23s%16s\n",
>+		"Level", "Comp size", "Comp ratio [%]",
>+		"Comp [Cycles/it]", "Comp [Cycles/Byte]", "Comp [Gbps]",
>+		"Decomp [Cycles/it]", "Decomp [Cycles/Byte]", "Decomp [Gbps]");
>+
>+	while (level <= test_data->level.max) {
>+		/*
>+		 * Run a first iteration, to verify compression and
>+		 * get the compression ratio for the level
>+		 */
>+		if (main_loop(test_data, level, RTE_COMP_COMPRESS,
>+			      test_data->compressed_data,
>+			      &comp_data_sz, 0) < 0) {
>+			ret = EXIT_FAILURE;
>+			goto err;
>+		}
>+
>+		if (main_loop(test_data, level, RTE_COMP_DECOMPRESS,
>+			      test_data->decompressed_data,
>+			      &decomp_data_sz, 0) < 0) {
>+			ret = EXIT_FAILURE;
>+			goto err;
>+		}
>+
>+		if (decomp_data_sz != test_data->input_data_sz) {
>+			RTE_LOG(ERR, USER1,
>+		"Decompressed data length not equal to input data length\n");
>+			RTE_LOG(ERR, USER1,
>+				"Decompressed size = %zu, expected = %zu\n",
>+				decomp_data_sz, test_data->input_data_sz);
>+			ret = EXIT_FAILURE;
>+			goto err;
>+		} else {
>+			if (memcmp(test_data->decompressed_data,
>+					test_data->input_data,
>+					test_data->input_data_sz) != 0) {
>+				RTE_LOG(ERR, USER1,
>+			"Decompressed data is not the same as file data\n");
>+				ret = EXIT_FAILURE;
>+				goto err;
>+			}
>+		}
>+
>+		double ratio = (double) comp_data_sz /
>+						test_data->input_data_sz * 100;
>+
>+		/*
>+		 * Run the tests twice, discarding the first performance
>+		 * results, before the cache is warmed up
>+		 */
>+		for (i = 0; i < 2; i++) {
>+			if (main_loop(test_data, level, RTE_COMP_COMPRESS,
>+					NULL, NULL, 1) < 0) {
>+				ret = EXIT_FAILURE;
>+				goto err;
>+			}
>+		}
>+
>+		for (i = 0; i < 2; i++) {
>+			if (main_loop(test_data, level, RTE_COMP_DECOMPRESS,
>+					NULL, NULL, 1) < 0) {
>+				ret = EXIT_FAILURE;
>+				goto err;
>+			}
>+		}
>+
>+		uint64_t comp_tsc_duration =
>+				test_data->comp_tsc_duration[level];
>+		double comp_tsc_byte = (double)comp_tsc_duration /
>+						test_data->input_data_sz;
>+		double comp_gbps = rte_get_tsc_hz() / comp_tsc_byte * 8 /
>+				1000000000;
>+		uint64_t decomp_tsc_duration =
>+				test_data->decomp_tsc_duration[level];
>+		double decomp_tsc_byte = (double)decomp_tsc_duration /
>+						test_data->input_data_sz;
>+		double decomp_gbps = rte_get_tsc_hz() / decomp_tsc_byte * 8 /
>+				1000000000;
>+
>+		printf("%6u%12zu%17.2f%19"PRIu64"%21.2f"
>+			"%15.2f%21"PRIu64"%23.2f%16.2f\n",
>+			level, comp_data_sz, ratio, comp_tsc_duration,
>+			comp_tsc_byte, comp_gbps, decomp_tsc_duration,
>+			decomp_tsc_byte, decomp_gbps);
>+
>+		if (test_data->level.inc != 0)
>+			level += test_data->level.inc;
>+		else {
>+			if (++level_idx == test_data->level.count)
>+				break;
>+			level = test_data->level.list[level_idx];
>+		}
>+	}
>+
> 	ret = EXIT_SUCCESS;
>
> err:
>+	if (test_data->cdev_id != -1)
>+		rte_compressdev_stop(test_data->cdev_id);
>+
>+	free_bufs(test_data);
>+	rte_free(test_data->compressed_data);
>+	rte_free(test_data->decompressed_data);
>+	rte_free(test_data->input_data);
>+	rte_mempool_free(test_data->comp_buf_pool);
>+	rte_mempool_free(test_data->decomp_buf_pool);
>+	rte_mempool_free(test_data->op_pool);
>+
> 	rte_free(test_data);
>
> 	return ret;
>--
>2.7.4