From: Slava Ovsiienko
To: "Iremonger, Bernard", "dev@dpdk.org"
CC: "Yigit, Ferruh"
Date: Mon, 10 Jun 2019 04:39:47 +0000
References: <1558936043-6259-1-git-send-email-viacheslavo@mellanox.com> <8CEF83825BEC744B83065625E567D7C260DAD4FD@IRSMSX108.ger.corp.intel.com>
In-Reply-To: <8CEF83825BEC744B83065625E567D7C260DAD4FD@IRSMSX108.ger.corp.intel.com>
Subject: Re: [dpdk-dev] [RFC] app/testpmd: add profiling for Rx/Tx burst routines

Hi, Bernard

Thanks for the comment.

> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
> > + uint64_t start_tx_tsc;
>
> Should the RTE_TEST_PMD_RECORD_CORE_CYCLES macro be checked here too?
>

I think it should not. All of the options:
RTE_TEST_PMD_RECORD_CORE_CYCLES
RTE_TEST_PMD_RECORD_CORE_TX_CYCLES
RTE_TEST_PMD_RECORD_CORE_RX_CYCLES
are supposed to be defined independently. I've compiled for all 8 possible
CORE_xx_CYCLES combinations.

RTE_TEST_PMD_RECORD_CORE_TX_CYCLES uses the dedicated TSC start point "start_tx_tsc".
RTE_TEST_PMD_RECORD_CORE_CYCLES and RTE_TEST_PMD_RECORD_CORE_RX_CYCLES share "start_rx_tsc".

With best regards,
Slava (Viacheslav)

> -----Original Message-----
> From: Iremonger, Bernard
> Sent: Friday, June 7, 2019 19:08
> To: Slava Ovsiienko; dev@dpdk.org
> Cc: Yigit, Ferruh
> Subject: RE: [dpdk-dev] [RFC] app/testpmd: add profiling for Rx/Tx burst
> routines
>
> Hi Viacheslav,
>
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Viacheslav
> > Ovsiienko
> > Sent: Monday, May 27, 2019 6:47 AM
> > To: dev@dpdk.org
> > Cc: Yigit, Ferruh
> > Subject: [dpdk-dev] [RFC] app/testpmd: add profiling for Rx/Tx burst
> > routines
> >
> > There is a testpmd configuration option called
> > RTE_TEST_PMD_RECORD_CORE_CYCLES. If it is turned on, the testpmd
> > application measures the CPU clocks spent within the forwarding loop.
> > This time is the sum of the execution times of rte_eth_rx_burst(),
> > rte_eth_tx_burst(), rte_delay_us(),
> > rte_pktmbuf_free() and so on, depending on the fwd mode set.
> >
> > While debugging and optimizing the performance of datapath burst
> > routines, it would be useful to see the pure execution times of these
> > routines.
> > It is proposed to add separate profiling
> > options:
> >
> > CONFIG_RTE_TEST_PMD_RECORD_CORE_TX_CYCLES
> > enables gathering profiling data for transmit datapath,
> > ticks spent within rte_eth_tx_burst()
> >
> > CONFIG_RTE_TEST_PMD_RECORD_CORE_RX_CYCLES
> > enables gathering profiling data for receive datapath,
> > ticks spent within rte_eth_rx_burst()
> >
> > Signed-off-by: Viacheslav Ovsiienko
> > ---
> > app/test-pmd/csumonly.c | 25 ++++++++++++-------------
> > app/test-pmd/flowgen.c | 25 +++++++++++++------------
> > app/test-pmd/icmpecho.c | 26 +++++++++++++-------------
> > app/test-pmd/iofwd.c | 24 ++++++++++++------------
> > app/test-pmd/macfwd.c | 24 +++++++++++++-----------
> > app/test-pmd/macswap.c | 26 ++++++++++++++------------
> > app/test-pmd/rxonly.c | 17 ++++++-----------
> > app/test-pmd/softnicfwd.c | 24 ++++++++++++------------
> > app/test-pmd/testpmd.c | 32 ++++++++++++++++++++++++++++++++
> > app/test-pmd/testpmd.h | 40 ++++++++++++++++++++++++++++++++++++++++
> > app/test-pmd/txonly.c | 23 +++++++++++------------
> > config/common_base | 2 ++
> > 12 files changed, 180 insertions(+), 108 deletions(-)
> >
> > diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> > index f4f2a7b..251e179 100644
> > --- a/app/test-pmd/csumonly.c
> > +++ b/app/test-pmd/csumonly.c
> > @@ -710,19 +710,19 @@ struct simple_gre_hdr {
> > uint16_t nb_segments = 0;
> > int ret;
> >
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
> > + uint64_t start_tx_tsc;
>
> Should the RTE_TEST_PMD_RECORD_CORE_CYCLES macro be checked here too?
>
> > #endif
> > -
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - start_tsc = rte_rdtsc();
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES) || \
> > + defined(RTE_TEST_PMD_RECORD_CORE_RX_CYCLES)
> > + uint64_t start_rx_tsc;
> > #endif
> >
> > /* receive a burst of packet */
> > + TEST_PMD_CORE_CYC_RX_START(start_rx_tsc);
> > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
> > nb_pkt_per_burst);
> > + TEST_PMD_CORE_CYC_RX_ADD(fs, start_rx_tsc);
> > if (unlikely(nb_rx == 0))
> > return;
> > #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
> > @@ -982,8 +982,10 @@ struct simple_gre_hdr {
> > printf("Preparing packet burst to transmit failed: %s\n",
> > rte_strerror(rte_errno));
> >
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, tx_pkts_burst,
> > nb_prep);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> >
> > /*
> > * Retry if necessary
> > @@ -992,8 +994,10 @@ struct simple_gre_hdr {
> > retry = 0;
> > while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> > rte_delay_us(burst_tx_delay_time);
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > &tx_pkts_burst[nb_tx], nb_rx - nb_tx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > }
> > }
> > fs->tx_packets += nb_tx;
> > @@ -1010,12 +1014,7 @@ struct simple_gre_hdr {
> > rte_pktmbuf_free(tx_pkts_burst[nb_tx]);
> > } while (++nb_tx < nb_rx);
> > }
> > -
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - end_tsc = rte_rdtsc();
> > - core_cycles = (end_tsc - start_tsc);
> > - fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
> > -#endif
> > + TEST_PMD_CORE_CYC_FWD_ADD(fs, start_rx_tsc);
> > }
> >
> > struct fwd_engine csum_fwd_engine = {
> > diff --git a/app/test-pmd/flowgen.c b/app/test-pmd/flowgen.c
> > index 3214e3c..b128e68 100644
> > --- a/app/test-pmd/flowgen.c
> > +++ b/app/test-pmd/flowgen.c
> > @@ -130,20 +130,21 @@
> > uint16_t i;
> > uint32_t retry;
> > uint64_t tx_offloads;
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > -#endif
> > static int next_flow = 0;
> >
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - start_tsc = rte_rdtsc();
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
>
> Should the RTE_TEST_PMD_RECORD_CORE_CYCLES macro be checked here too?
>
> > + uint64_t start_tx_tsc;
> > +#endif
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES) || \
> > + defined(RTE_TEST_PMD_RECORD_CORE_RX_CYCLES)
> > + uint64_t start_rx_tsc;
> > #endif
> >
> > /* Receive a burst of packets and discard them. */
> > + TEST_PMD_CORE_CYC_RX_START(start_rx_tsc);
> > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
> > nb_pkt_per_burst);
> > + TEST_PMD_CORE_CYC_RX_ADD(fs, start_rx_tsc);
> > fs->rx_packets += nb_rx;
> >
> > for (i = 0; i < nb_rx; i++)
> > @@ -212,7 +213,9 @@
> > next_flow = (next_flow + 1) % cfg_n_flows;
> > }
> >
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
> > nb_pkt);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > /*
> > * Retry if necessary
> > */
> > @@ -220,8 +223,10 @@
> > retry = 0;
> > while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> > rte_delay_us(burst_tx_delay_time);
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > &pkts_burst[nb_tx], nb_rx - nb_tx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > }
> > }
> > fs->tx_packets += nb_tx;
> > @@ -239,11 +244,7 @@
> > rte_pktmbuf_free(pkts_burst[nb_tx]);
> > } while (++nb_tx < nb_pkt);
> > }
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - end_tsc = rte_rdtsc();
> > - core_cycles = (end_tsc - start_tsc);
> > - fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
> > -#endif
> > + TEST_PMD_CORE_CYC_FWD_ADD(fs, start_rx_tsc);
> > }
> >
> > struct fwd_engine flow_gen_engine = {
> > diff --git a/app/test-pmd/icmpecho.c b/app/test-pmd/icmpecho.c
> > index 55d266d..a539fe8 100644
> > --- a/app/test-pmd/icmpecho.c
> > +++ b/app/test-pmd/icmpecho.c
> > @@ -293,21 +293,22 @@
> > uint32_t cksum;
> > uint8_t i;
> > int l2_len;
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > -#endif
> >
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - start_tsc = rte_rdtsc();
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
> > + uint64_t start_tx_tsc;
> > +#endif
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES) || \
> > + defined(RTE_TEST_PMD_RECORD_CORE_RX_CYCLES)
> > + uint64_t start_rx_tsc;
> > #endif
> >
> > /*
> > * First, receive a burst of packets.
> > */
> > + TEST_PMD_CORE_CYC_RX_START(start_rx_tsc);
> > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
> > nb_pkt_per_burst);
> > + TEST_PMD_CORE_CYC_RX_ADD(fs, start_rx_tsc);
> > if (unlikely(nb_rx == 0))
> > return;
> >
> > @@ -487,8 +488,10 @@
> >
> > /* Send back ICMP echo replies, if any. */
> > if (nb_replies > 0) {
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
> > nb_replies);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > /*
> > * Retry if necessary
> > */
> > @@ -497,10 +500,12 @@
> > while (nb_tx < nb_replies &&
> > retry++ < burst_tx_retry_num) {
> > rte_delay_us(burst_tx_delay_time);
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx += rte_eth_tx_burst(fs->tx_port,
> > fs->tx_queue,
> > &pkts_burst[nb_tx],
> > nb_replies - nb_tx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > }
> > }
> > fs->tx_packets += nb_tx;
> > @@ -514,12 +519,7 @@
> > } while (++nb_tx < nb_replies);
> > }
> > }
> > -
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - end_tsc = rte_rdtsc();
> > - core_cycles = (end_tsc - start_tsc);
> > - fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
> > -#endif
> > + TEST_PMD_CORE_CYC_FWD_ADD(fs, start_rx_tsc);
> > }
> >
> > struct fwd_engine icmp_echo_engine = {
> > diff --git a/app/test-pmd/iofwd.c b/app/test-pmd/iofwd.c
> > index 9dce76e..dc66a88 100644
> > --- a/app/test-pmd/iofwd.c
> > +++ b/app/test-pmd/iofwd.c
> > @@ -51,21 +51,21 @@
> > uint16_t nb_tx;
> > uint32_t retry;
> >
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
>
> Should the RTE_TEST_PMD_RECORD_CORE_CYCLES macro be checked here too?
>
> > + uint64_t start_tx_tsc;
> > #endif
> > -
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - start_tsc = rte_rdtsc();
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES) || \
> > + defined(RTE_TEST_PMD_RECORD_CORE_RX_CYCLES)
> > + uint64_t start_rx_tsc;
> > #endif
> >
> > /*
> > * Receive a burst of packets and forward them.
> > */
> > + TEST_PMD_CORE_CYC_RX_START(start_rx_tsc);
> > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
> > pkts_burst, nb_pkt_per_burst);
> > + TEST_PMD_CORE_CYC_RX_ADD(fs, start_rx_tsc);
> > if (unlikely(nb_rx == 0))
> > return;
> > fs->rx_packets += nb_rx;
> > @@ -73,8 +73,10 @@
> > #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
> > fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
> > #endif
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > pkts_burst, nb_rx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > /*
> > * Retry if necessary
> > */
> > @@ -82,8 +84,10 @@
> > retry = 0;
> > while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> > rte_delay_us(burst_tx_delay_time);
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > &pkts_burst[nb_tx], nb_rx - nb_tx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > }
> > }
> > fs->tx_packets += nb_tx;
> > @@ -96,11 +100,7 @@
> > rte_pktmbuf_free(pkts_burst[nb_tx]);
> > } while (++nb_tx < nb_rx);
> > }
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - end_tsc = rte_rdtsc();
> > - core_cycles = (end_tsc - start_tsc);
> > - fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
> > -#endif
> > + TEST_PMD_CORE_CYC_FWD_ADD(fs, start_rx_tsc);
> > }
> >
> > struct fwd_engine io_fwd_engine = {
> > diff --git a/app/test-pmd/macfwd.c b/app/test-pmd/macfwd.c
> > index 7cac757..2fd38ea 100644
> > --- a/app/test-pmd/macfwd.c
> > +++ b/app/test-pmd/macfwd.c
> > @@ -56,21 +56,23 @@
> > uint16_t i;
> > uint64_t ol_flags = 0;
> > uint64_t tx_offloads;
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > +
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
>
> Should the RTE_TEST_PMD_RECORD_CORE_CYCLES macro be checked here too?
>
> > + uint64_t start_tx_tsc;
> > #endif
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES) || \
> > + defined(RTE_TEST_PMD_RECORD_CORE_RX_CYCLES)
> > + uint64_t start_rx_tsc;
> >
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - start_tsc = rte_rdtsc();
> > #endif
> >
> > /*
> > * Receive a burst of packets and forward them.
> > */
> > + TEST_PMD_CORE_CYC_RX_START(start_rx_tsc);
> > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
> > nb_pkt_per_burst);
> > + TEST_PMD_CORE_CYC_RX_ADD(fs, start_rx_tsc);
> > if (unlikely(nb_rx == 0))
> > return;
> >
> > @@ -103,7 +105,9 @@
> > mb->vlan_tci = txp->tx_vlan_id;
> > mb->vlan_tci_outer = txp->tx_vlan_id_outer;
> > }
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
> > nb_rx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > /*
> > * Retry if necessary
> > */
> > @@ -111,8 +115,10 @@
> > retry = 0;
> > while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> > rte_delay_us(burst_tx_delay_time);
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > &pkts_burst[nb_tx], nb_rx - nb_tx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > }
> > }
> >
> > @@ -126,11 +132,7 @@
> > rte_pktmbuf_free(pkts_burst[nb_tx]);
> > } while (++nb_tx < nb_rx);
> > }
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - end_tsc = rte_rdtsc();
> > - core_cycles = (end_tsc - start_tsc);
> > - fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
> > -#endif
> > + TEST_PMD_CORE_CYC_FWD_ADD(fs, start_rx_tsc);
> > }
> >
> > struct fwd_engine mac_fwd_engine = {
> > diff --git a/app/test-pmd/macswap.c b/app/test-pmd/macswap.c
> > index 71af916..b22acdb 100644
> > --- a/app/test-pmd/macswap.c
> > +++ b/app/test-pmd/macswap.c
> > @@ -86,21 +86,22 @@
> > uint16_t nb_rx;
> > uint16_t nb_tx;
> > uint32_t retry;
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > -#endif
> >
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - start_tsc = rte_rdtsc();
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
>
> Should the RTE_TEST_PMD_RECORD_CORE_CYCLES macro be checked here too?
>
> > + uint64_t start_tx_tsc;
> > +#endif
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES) || \
> > + defined(RTE_TEST_PMD_RECORD_CORE_RX_CYCLES)
> > + uint64_t start_rx_tsc;
> > #endif
> >
> > /*
> > * Receive a burst of packets and forward them.
> > */
> > + TEST_PMD_CORE_CYC_RX_START(start_rx_tsc);
> > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
> > nb_pkt_per_burst);
> > + TEST_PMD_CORE_CYC_RX_ADD(fs, start_rx_tsc);
> > if (unlikely(nb_rx == 0))
> > return;
> >
> > @@ -112,7 +113,10 @@
> >
> > do_macswap(pkts_burst, nb_rx, txp);
> >
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
> > nb_rx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > +
> > /*
> > * Retry if necessary
> > */
> > @@ -120,8 +124,10 @@
> > retry = 0;
> > while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> > rte_delay_us(burst_tx_delay_time);
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > &pkts_burst[nb_tx], nb_rx - nb_tx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > }
> > }
> > fs->tx_packets += nb_tx;
> > @@ -134,11 +140,7 @@
> > rte_pktmbuf_free(pkts_burst[nb_tx]);
> > } while (++nb_tx < nb_rx);
> > }
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - end_tsc = rte_rdtsc();
> > - core_cycles = (end_tsc - start_tsc);
> > - fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
> > -#endif
> > + TEST_PMD_CORE_CYC_FWD_ADD(fs, start_rx_tsc);
> > }
> >
> > struct fwd_engine mac_swap_engine = {
> > diff --git a/app/test-pmd/rxonly.c b/app/test-pmd/rxonly.c
> > index 5c65fc4..d1da357 100644
> > --- a/app/test-pmd/rxonly.c
> > +++ b/app/test-pmd/rxonly.c
> > @@ -50,19 +50,18 @@
> > uint16_t nb_rx;
> > uint16_t i;
> >
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > -
> > - start_tsc = rte_rdtsc();
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES) || \
> > + defined(RTE_TEST_PMD_RECORD_CORE_RX_CYCLES)
> > + uint64_t start_rx_tsc;
> > #endif
> >
> > /*
> > * Receive a burst of packets.
> > */
> > + TEST_PMD_CORE_CYC_RX_START(start_rx_tsc);
> > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue, pkts_burst,
> > nb_pkt_per_burst);
> > + TEST_PMD_CORE_CYC_RX_ADD(fs, start_rx_tsc);
> > if (unlikely(nb_rx == 0))
> > return;
> >
> > @@ -73,11 +72,7 @@
> > for (i = 0; i < nb_rx; i++)
> > rte_pktmbuf_free(pkts_burst[i]);
> >
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - end_tsc = rte_rdtsc();
> > - core_cycles = (end_tsc - start_tsc);
> > - fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
> > -#endif
> > + TEST_PMD_CORE_CYC_FWD_ADD(fs, start_rx_tsc);
> > }
> >
> > struct fwd_engine rx_only_engine = {
> > diff --git a/app/test-pmd/softnicfwd.c b/app/test-pmd/softnicfwd.c
> > index 94e6669..9b2b0e6 100644
> > --- a/app/test-pmd/softnicfwd.c
> > +++ b/app/test-pmd/softnicfwd.c
> > @@ -87,35 +87,39 @@ struct tm_hierarchy {
> > uint16_t nb_tx;
> > uint32_t retry;
> >
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
>
> Should the RTE_TEST_PMD_RECORD_CORE_CYCLES macro be checked here too?
>
> > + uint64_t start_tx_tsc;
> > #endif
> > -
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - start_tsc = rte_rdtsc();
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES) || \
> > + defined(RTE_TEST_PMD_RECORD_CORE_RX_CYCLES)
> > + uint64_t start_rx_tsc;
> > #endif
> >
> > /* Packets Receive */
> > + TEST_PMD_CORE_CYC_RX_START(start_rx_tsc);
> > nb_rx = rte_eth_rx_burst(fs->rx_port, fs->rx_queue,
> > pkts_burst, nb_pkt_per_burst);
> > + TEST_PMD_CORE_CYC_RX_ADD(fs, start_rx_tsc);
> > fs->rx_packets += nb_rx;
> >
> > #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
> > fs->rx_burst_stats.pkt_burst_spread[nb_rx]++;
> > #endif
> >
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > pkts_burst, nb_rx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> >
> > /* Retry if necessary */
> > if (unlikely(nb_tx < nb_rx) && fs->retry_enabled) {
> > retry = 0;
> > while (nb_tx < nb_rx && retry++ < burst_tx_retry_num) {
> > rte_delay_us(burst_tx_delay_time);
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > &pkts_burst[nb_tx], nb_rx - nb_tx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > }
> > }
> > fs->tx_packets += nb_tx;
> > @@ -130,11 +134,7 @@ struct tm_hierarchy {
> > rte_pktmbuf_free(pkts_burst[nb_tx]);
> > } while (++nb_tx < nb_rx);
> > }
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - end_tsc = rte_rdtsc();
> > - core_cycles = (end_tsc - start_tsc);
> > - fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
> > -#endif
> > + TEST_PMD_CORE_CYC_FWD_ADD(fs, start_rx_tsc);
> > }
> >
> > static void
> > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
> > index f0061d9..de8478f 100644
> > --- a/app/test-pmd/testpmd.c
> > +++ b/app/test-pmd/testpmd.c
> > @@ -1483,6 +1483,12 @@ struct extmem_param {
> > #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > uint64_t fwd_cycles = 0;
> > #endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_RX_CYCLES
> > + uint64_t rx_cycles = 0;
> > +#endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_TX_CYCLES
> > + uint64_t tx_cycles = 0;
> > +#endif
> > uint64_t total_recv = 0;
> > uint64_t total_xmit = 0;
> > struct rte_port *port;
> > @@ -1513,6 +1519,12 @@ struct extmem_param {
> > #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > fwd_cycles += fs->core_cycles;
> > #endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_RX_CYCLES
> > + rx_cycles += fs->core_rx_cycles;
> > +#endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_TX_CYCLES
> > + tx_cycles += fs->core_tx_cycles;
> > +#endif
> > }
> > for (i = 0; i < cur_fwd_config.nb_fwd_ports; i++) {
> > uint8_t j;
> > @@ -1648,6 +1660,20 @@ struct extmem_param {
> > (unsigned int)(fwd_cycles / total_recv),
> > fwd_cycles, total_recv);
> > #endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_RX_CYCLES
> > + if (total_recv > 0)
> > + printf("\n rx CPU cycles/packet=%u (total cycles="
> > + "%"PRIu64" / total RX packets=%"PRIu64")\n",
> > + (unsigned int)(rx_cycles / total_recv),
> > + rx_cycles, total_recv);
> > +#endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_TX_CYCLES
> > + if (total_xmit > 0)
> > + printf("\n tx CPU cycles/packet=%u (total cycles="
> > + "%"PRIu64" / total TX packets=%"PRIu64")\n",
> > + (unsigned int)(tx_cycles / total_xmit),
> > + tx_cycles, total_xmit);
> > +#endif
> > }
> >
> > void
> > @@ -1678,6 +1704,12 @@ struct extmem_param {
> > #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > fs->core_cycles = 0;
> > #endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_RX_CYCLES
> > + fs->core_rx_cycles = 0;
> > +#endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_TX_CYCLES
> > + fs->core_tx_cycles = 0;
> > +#endif
> > }
> > }
> >
> > diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h
> > index 1d9b7a2..4e8af8a 100644
> > --- a/app/test-pmd/testpmd.h
> > +++ b/app/test-pmd/testpmd.h
> > @@ -130,12 +130,52 @@ struct fwd_stream {
> > #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > uint64_t core_cycles; /**< used for RX and TX processing */
> > #endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_TX_CYCLES
> > + uint64_t core_tx_cycles; /**< used for tx_burst processing */
> > +#endif
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_RX_CYCLES
> > + uint64_t core_rx_cycles; /**< used for rx_burst processing */
> > +#endif
> > #ifdef RTE_TEST_PMD_RECORD_BURST_STATS
> > struct pkt_burst_stats rx_burst_stats;
> > struct pkt_burst_stats tx_burst_stats;
> > #endif
> > };
> >
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
> > +#define TEST_PMD_CORE_CYC_TX_START(a) {a = rte_rdtsc(); }
> > +#else
> > +#define TEST_PMD_CORE_CYC_TX_START(a)
> > +#endif
> > +
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES) || \
> > + defined(RTE_TEST_PMD_RECORD_CORE_RX_CYCLES)
> > +#define TEST_PMD_CORE_CYC_RX_START(a) {a = rte_rdtsc(); }
> > +#else
> > +#define TEST_PMD_CORE_CYC_RX_START(a)
> > +#endif
> > +
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > +#define TEST_PMD_CORE_CYC_FWD_ADD(fs, s) \
> > +{uint64_t end_tsc = rte_rdtsc(); fs->core_cycles += end_tsc - (s); }
> > +#else
> > +#define TEST_PMD_CORE_CYC_FWD_ADD(fs, s)
> > +#endif
> > +
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_TX_CYCLES
> > +#define TEST_PMD_CORE_CYC_TX_ADD(fs, s) \
> > +{uint64_t end_tsc = rte_rdtsc(); fs->core_tx_cycles += end_tsc - (s); }
> > +#else
> > +#define TEST_PMD_CORE_CYC_TX_ADD(fs, s)
> > +#endif
> > +
> > +#ifdef RTE_TEST_PMD_RECORD_CORE_RX_CYCLES
> > +#define TEST_PMD_CORE_CYC_RX_ADD(fs, s) \
> > +{uint64_t end_tsc = rte_rdtsc(); fs->core_rx_cycles += end_tsc - (s); }
> > +#else
> > +#define TEST_PMD_CORE_CYC_RX_ADD(fs, s)
> > +#endif
> > +
> > /** Descriptor for a single flow. */
> > struct port_flow {
> > struct port_flow *next; /**< Next flow in list. */
> > diff --git a/app/test-pmd/txonly.c b/app/test-pmd/txonly.c
> > index fdfca14..fe3045a 100644
> > --- a/app/test-pmd/txonly.c
> > +++ b/app/test-pmd/txonly.c
> > @@ -241,16 +241,16 @@
> > uint32_t retry;
> > uint64_t ol_flags = 0;
> > uint64_t tx_offloads;
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - uint64_t start_tsc;
> > - uint64_t end_tsc;
> > - uint64_t core_cycles;
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_TX_CYCLES)
> > + uint64_t start_tx_tsc;
> > +#endif
> > +#if defined(RTE_TEST_PMD_RECORD_CORE_CYCLES)
> > + uint64_t start_rx_tsc;
> > #endif
> >
> > #ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - start_tsc = rte_rdtsc();
> > + TEST_PMD_CORE_CYC_RX_START(start_rx_tsc);
> > #endif
> > -
> > mbp = current_fwd_lcore()->mbp;
> > txp = &ports[fs->tx_port];
> > tx_offloads = txp->dev_conf.txmode.offloads;
> > @@ -302,7 +302,9 @@
> > if (nb_pkt == 0)
> > return;
> >
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx = rte_eth_tx_burst(fs->tx_port, fs->tx_queue, pkts_burst,
> > nb_pkt);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > /*
> > * Retry if necessary
> > */
> > @@ -310,8 +312,10 @@
> > retry = 0;
> > while (nb_tx < nb_pkt && retry++ < burst_tx_retry_num) {
> > rte_delay_us(burst_tx_delay_time);
> > + TEST_PMD_CORE_CYC_TX_START(start_tx_tsc);
> > nb_tx += rte_eth_tx_burst(fs->tx_port, fs->tx_queue,
> > &pkts_burst[nb_tx], nb_pkt - nb_tx);
> > + TEST_PMD_CORE_CYC_TX_ADD(fs, start_tx_tsc);
> > }
> > }
> > fs->tx_packets += nb_tx;
> > @@ -334,12 +338,7 @@
> > rte_pktmbuf_free(pkts_burst[nb_tx]);
> > } while (++nb_tx < nb_pkt);
> > }
> > -
> > -#ifdef RTE_TEST_PMD_RECORD_CORE_CYCLES
> > - end_tsc = rte_rdtsc();
> > - core_cycles = (end_tsc - start_tsc);
> > - fs->core_cycles = (uint64_t) (fs->core_cycles + core_cycles);
> > -#endif
> > + TEST_PMD_CORE_CYC_FWD_ADD(fs, start_rx_tsc);
> > }
> >
> > static void
> > diff --git a/config/common_base b/config/common_base
> > index 6b96e0e..6e84af4 100644
> > --- a/config/common_base
> > +++ b/config/common_base
> > @@ -998,6 +998,8 @@ CONFIG_RTE_PROC_INFO=n
> > #
> > CONFIG_RTE_TEST_PMD=y
> > CONFIG_RTE_TEST_PMD_RECORD_CORE_CYCLES=n
> > +CONFIG_RTE_TEST_PMD_RECORD_CORE_RX_CYCLES=n
> > +CONFIG_RTE_TEST_PMD_RECORD_CORE_TX_CYCLES=n
> > CONFIG_RTE_TEST_PMD_RECORD_BURST_STATS=n
>
> Should the RECORD macros be documented in the run_app.rst file?
>
> > #
> > --
> > 1.8.3.1
>
> Regards,
>
> Bernard
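
For reference, the scheme discussed in this thread (three independently
selectable options, with the forwarding and Rx measurements sharing one
start timestamp and Tx using its own) can be sketched outside DPDK roughly
as below. This is only an illustration under stated assumptions, not the
code from the patch: read_tsc() is a hypothetical stand-in for rte_rdtsc()
that advances by a fixed step, struct stream_stats is a hypothetical
reduction of struct fwd_stream, and the macro names are shortened. The
macros are also wrapped in do { } while (0), which differs from the bare
braces in the patch; the wrapper keeps "MACRO(x);" a single statement
inside an unbraced if/else.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Build options; each one can be removed independently of the others. */
#define RECORD_CORE_CYCLES    /* whole forwarding iteration */
#define RECORD_CORE_RX_CYCLES /* rx burst only */
#define RECORD_CORE_TX_CYCLES /* tx burst only */

/* Hypothetical stand-in for rte_rdtsc(): each read advances 10 ticks. */
static uint64_t fake_tsc;
static uint64_t read_tsc(void) { fake_tsc += 10; return fake_tsc; }

/* Hypothetical reduction of struct fwd_stream to the three counters. */
struct stream_stats {
	uint64_t core_cycles;
	uint64_t core_rx_cycles;
	uint64_t core_tx_cycles;
};

/* Tx has a dedicated start point. */
#ifdef RECORD_CORE_TX_CYCLES
#define CYC_TX_START(a) do { (a) = read_tsc(); } while (0)
#define CYC_TX_ADD(fs, s) \
	do { (fs)->core_tx_cycles += read_tsc() - (s); } while (0)
#else
#define CYC_TX_START(a) do { } while (0)
#define CYC_TX_ADD(fs, s) do { } while (0)
#endif

/* The forwarding and Rx measurements share one start point. */
#if defined(RECORD_CORE_CYCLES) || defined(RECORD_CORE_RX_CYCLES)
#define CYC_RX_START(a) do { (a) = read_tsc(); } while (0)
#else
#define CYC_RX_START(a) do { } while (0)
#endif

#ifdef RECORD_CORE_RX_CYCLES
#define CYC_RX_ADD(fs, s) \
	do { (fs)->core_rx_cycles += read_tsc() - (s); } while (0)
#else
#define CYC_RX_ADD(fs, s) do { } while (0)
#endif

#ifdef RECORD_CORE_CYCLES
#define CYC_FWD_ADD(fs, s) \
	do { (fs)->core_cycles += read_tsc() - (s); } while (0)
#else
#define CYC_FWD_ADD(fs, s) do { } while (0)
#endif

/* One forwarding iteration; the burst calls are left as comments. */
void forward_once(struct stream_stats *fs)
{
#if defined(RECORD_CORE_TX_CYCLES)
	uint64_t start_tx_tsc;
#endif
#if defined(RECORD_CORE_CYCLES) || defined(RECORD_CORE_RX_CYCLES)
	uint64_t start_rx_tsc;
#endif
	assert(fs != NULL);

	CYC_RX_START(start_rx_tsc);
	/* rte_eth_rx_burst() would run here */
	CYC_RX_ADD(fs, start_rx_tsc);

	CYC_TX_START(start_tx_tsc);
	/* rte_eth_tx_burst() would run here */
	CYC_TX_ADD(fs, start_tx_tsc);

	/* The whole-iteration counter reuses the Rx start point. */
	CYC_FWD_ADD(fs, start_rx_tsc);
}
```

With all three options defined and the fixed-step clock above, one call to
forward_once() on a zeroed struct stream_stats should leave 10 ticks in
each of the Rx and Tx counters and 40 ticks in the whole-iteration counter.
Removing any of the three #define lines should still compile, because a
disabled macro never evaluates its timestamp argument.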