From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Pei, Yulong"
To: Vladyslav Buslov, "Zhang, Helin", "Wu, Jingjing", "Yigit, Ferruh"
CC: "dev@dpdk.org"
Date: Sat, 1 Apr 2017 02:01:58 +0000
Message-ID: <188971FCDA171749BED5DA74ABF3E6F03B6ACF0D@shsmsx102.ccr.corp.intel.com>
References: <1488365813-12442-1-git-send-email-vladyslav.buslov@harmonicinc.com>
In-Reply-To: <1488365813-12442-1-git-send-email-vladyslav.buslov@harmonicinc.com>
Subject: Re: [dpdk-dev] [PATCH] net/i40e: add packet prefetch
List-Id: DPDK patches and discussions

Hi All,

In non-vector mode, without this patch, single-core performance reaches 37.576 Mpps with 64-byte packets.
With this patch applied, single-core performance drops to 34.343 Mpps with 64-byte packets.

Best Regards
Yulong Pei

-----Original Message-----
From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Vladyslav Buslov
Sent: Wednesday, March 1, 2017 6:57 PM
To: Zhang, Helin; Wu, Jingjing; Yigit, Ferruh
Cc: dev@dpdk.org
Subject: [dpdk-dev] [PATCH] net/i40e: add packet prefetch

Prefetch both cache lines of mbuf and first cache line of payload if CONFIG_RTE_PMD_PACKET_PREFETCH is set.

Signed-off-by: Vladyslav Buslov
---
 drivers/net/i40e/i40e_rxtx.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/drivers/net/i40e/i40e_rxtx.c b/drivers/net/i40e/i40e_rxtx.c
index 48429cc..2b4e5c9 100644
--- a/drivers/net/i40e/i40e_rxtx.c
+++ b/drivers/net/i40e/i40e_rxtx.c
@@ -100,6 +100,12 @@
 #define I40E_TX_OFFLOAD_NOTSUP_MASK \
 		(PKT_TX_OFFLOAD_MASK ^ I40E_TX_OFFLOAD_MASK)
 
+#ifdef RTE_PMD_PACKET_PREFETCH
+#define rte_packet_prefetch(p)	rte_prefetch0(p)
+#else
+#define rte_packet_prefetch(p)	do {} while (0)
+#endif
+
 static uint16_t i40e_xmit_pkts_simple(void *tx_queue,
 				      struct rte_mbuf **tx_pkts,
 				      uint16_t nb_pkts);
@@ -495,6 +501,9 @@ i40e_rx_scan_hw_ring(struct i40e_rx_queue *rxq)
 		/* Translate descriptor info to mbuf parameters */
 		for (j = 0; j < nb_dd; j++) {
 			mb = rxep[j].mbuf;
+			rte_packet_prefetch(
+				RTE_PTR_ADD(mb->buf_addr,
+						RTE_PKTMBUF_HEADROOM));
 			qword1 = rte_le_to_cpu_64(\
 				rxdp[j].wb.qword1.status_error_len);
 			pkt_len = ((qword1 & I40E_RXD_QW1_LENGTH_PBUF_MASK) >>
@@ -578,9 +587,11 @@ i40e_rx_alloc_bufs(struct i40e_rx_queue *rxq)
 
 	rxdp = &rxq->rx_ring[alloc_idx];
 	for (i = 0; i < rxq->rx_free_thresh; i++) {
-		if (likely(i < (rxq->rx_free_thresh - 1)))
+		if (likely(i < (rxq->rx_free_thresh - 1))) {
 			/* Prefetch next mbuf */
-			rte_prefetch0(rxep[i + 1].mbuf);
+			rte_packet_prefetch(rxep[i + 1].mbuf->cacheline0);
+			rte_packet_prefetch(rxep[i + 1].mbuf->cacheline1);
+		}
 
 		mb = rxep[i].mbuf;
 		rte_mbuf_refcnt_set(mb, 1);
@@ -752,7 +763,8 @@ i40e_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts, uint16_t nb_pkts)
 				I40E_RXD_QW1_LENGTH_PBUF_SHIFT) - rxq->crc_len;
 
 		rxm->data_off = RTE_PKTMBUF_HEADROOM;
-		rte_prefetch0(RTE_PTR_ADD(rxm->buf_addr, RTE_PKTMBUF_HEADROOM));
+		rte_packet_prefetch(RTE_PTR_ADD(rxm->buf_addr,
+					RTE_PKTMBUF_HEADROOM));
 		rxm->nb_segs = 1;
 		rxm->next = NULL;
 		rxm->pkt_len = rx_packet_len;
@@ -939,7 +951,7 @@ i40e_recv_scattered_pkts(void *rx_queue,
 		first_seg->ol_flags |= pkt_flags;
 
 		/* Prefetch data of first segment, if configured to do so. */
-		rte_prefetch0(RTE_PTR_ADD(first_seg->buf_addr,
+		rte_packet_prefetch(RTE_PTR_ADD(first_seg->buf_addr,
 			first_seg->data_off));
 		rx_pkts[nb_rx++] = first_seg;
 		first_seg = NULL;
-- 
2.1.4