From: "Zhang, Qi Z"
To: "Wu, Wenjun1", "dev@dpdk.org", "Yang, Qiming"
CC: "Van Haaren, Harry", "Su, Simei"
Subject: RE: [PATCH v4] net/ice: improve performance of RX timestamp offload
Date: Tue, 1 Mar 2022 11:07:12 +0000
Message-ID: <393580b4f35248e2ac2ae68bc2a7f9c5@intel.com>
References: <20220222062612.335622-1-wenjun1.wu@intel.com> <20220228073607.2249410-1-wenjun1.wu@intel.com>
In-Reply-To: <20220228073607.2249410-1-wenjun1.wu@intel.com>
List-Id: DPDK patches and discussions

> -----Original Message-----
> From: Wu, Wenjun1
> Sent: Monday, February 28, 2022 3:36 PM
> To: dev@dpdk.org; Zhang, Qi Z; Yang, Qiming
> Cc: Van Haaren, Harry; Su, Simei; Wu, Wenjun1
> Subject: [PATCH v4] net/ice: improve performance of RX timestamp offload
>
> Previously, each time a burst of packets is received, SW reads HW register
> and assembles it and the timestamp from descriptor together to get the
> complete 64 bits timestamp.
>
> This patch optimizes the algorithm. The SW only needs to check the
> monotonicity of the low 32bits timestamp to avoid crossing borders.
> Each time before SW receives a burst of packets, it should check the time
> difference between current time and last update time to avoid the low 32
> bits timestamp cycling twice.

Overall, the patch looks good to me and we can cc-stable for LTS, but I'd
like to defer this to the next release, as we are close to the release date
and don't want to risk merging complex changes at this moment.

Regards
Qi

>
> Signed-off-by: Wenjun Wu
>
> ---
> v4: rework initialization behavior
> v3: add missing conditional compilation
> v2: add conditional compilation
> ---
>  drivers/net/ice/ice_ethdev.h |   3 +
>  drivers/net/ice/ice_rxtx.c   | 118 +++++++++++++++++++++++++----------
>  2 files changed, 88 insertions(+), 33 deletions(-)
>
> diff --git a/drivers/net/ice/ice_ethdev.h b/drivers/net/ice/ice_ethdev.h
> index 3ed580d438..6778941d7d 100644
> --- a/drivers/net/ice/ice_ethdev.h
> +++ b/drivers/net/ice/ice_ethdev.h
> @@ -554,6 +554,9 @@ struct ice_adapter {
>  	struct rte_timecounter tx_tstamp_tc;
>  	bool ptp_ena;
>  	uint64_t time_hw;
> +	uint32_t hw_time_high; /* high 32 bits of timestamp */
> +	uint32_t hw_time_low; /* low 32 bits of timestamp */
> +	uint64_t hw_time_update; /* SW time of HW record updating */
>  	struct ice_fdir_prof_info fdir_prof_info[ICE_MAX_PTGS];
>  	struct ice_rss_prof_info rss_prof_info[ICE_MAX_PTGS];
>  	/* True if DCF state of the associated PF is on */
> diff --git a/drivers/net/ice/ice_rxtx.c b/drivers/net/ice/ice_rxtx.c
> index 4f218bcd0d..4b0bcd4863 100644
> --- a/drivers/net/ice/ice_rxtx.c
> +++ b/drivers/net/ice/ice_rxtx.c
> @@ -1574,9 +1574,10 @@ ice_rx_scan_hw_ring(struct ice_rx_queue *rxq)
>  	uint64_t pkt_flags = 0;
>  	uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl;
>  #ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
> +	bool is_tsinit = false;
> +	uint64_t ts_ns;
>  	struct ice_vsi *vsi = rxq->vsi;
>  	struct ice_hw *hw = ICE_VSI_TO_HW(vsi);
> -	uint64_t ts_ns;
>  	struct ice_adapter *ad = rxq->vsi->adapter;
>  #endif
>  	rxdp =
>  		&rxq->rx_ring[rxq->rx_tail];
> @@ -1588,8 +1589,14 @@ ice_rx_scan_hw_ring(struct ice_rx_queue *rxq)
>  	if (!(stat_err0 & (1 << ICE_RX_FLEX_DESC_STATUS0_DD_S)))
>  		return 0;
>
> -	if (rxq->offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP)
> -		rxq->hw_register_set = 1;
> +#ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
> +	if (rxq->offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) {
> +		uint64_t sw_cur_time = rte_get_timer_cycles() / (rte_get_timer_hz() / 1000);
> +
> +		if (unlikely(sw_cur_time - ad->hw_time_update > 4))
> +			is_tsinit = 1;
> +	}
> +#endif
>
>  	/**
>  	 * Scan LOOK_AHEAD descriptors at a time to determine which
> @@ -1625,14 +1632,26 @@ ice_rx_scan_hw_ring(struct ice_rx_queue *rxq)
>  			rxd_to_pkt_fields_ops[rxq->rxdid](rxq, mb, &rxdp[j]);
>  #ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
>  			if (ice_timestamp_dynflag > 0) {
> -				ts_ns = ice_tstamp_convert_32b_64b(hw, ad,
> -					rxq->hw_register_set,
> -					rte_le_to_cpu_32(rxdp[j].wb.flex_ts.ts_high));
> -				rxq->hw_register_set = 0;
> +				rxq->time_high =
> +				rte_le_to_cpu_32(rxdp[j].wb.flex_ts.ts_high);
> +				if (unlikely(is_tsinit)) {
> +					ts_ns = ice_tstamp_convert_32b_64b(hw, ad, 1,
> +						rxq->time_high);
> +					ad->hw_time_low = (uint32_t)ts_ns;
> +					ad->hw_time_high = (uint32_t)(ts_ns >> 32);
> +					is_tsinit = false;
> +				} else {
> +					if (rxq->time_high < ad->hw_time_low)
> +						ad->hw_time_high += 1;
> +					ts_ns = (uint64_t)ad->hw_time_high << 32 | rxq->time_high;
> +					ad->hw_time_low = rxq->time_high;
> +				}
> +				ad->hw_time_update = rte_get_timer_cycles() /
> +					(rte_get_timer_hz() / 1000);
>  				*RTE_MBUF_DYNFIELD(mb,
> -					ice_timestamp_dynfield_offset,
> -					rte_mbuf_timestamp_t *) = ts_ns;
> -				mb->ol_flags |= ice_timestamp_dynflag;
> +					ice_timestamp_dynfield_offset,
> +					rte_mbuf_timestamp_t *) = ts_ns;
> +				pkt_flags |= ice_timestamp_dynflag;
>  			}
>
>  			if (ad->ptp_ena && ((mb->packet_type &
> @@ -1831,14 +1850,19 @@ ice_recv_scattered_pkts(void *rx_queue,
>  	uint64_t pkt_flags;
>  	uint32_t *ptype_tbl =
>  		rxq->vsi->adapter->ptype_tbl;
>  #ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
> +	bool is_tsinit = false;
> +	uint64_t ts_ns;
>  	struct ice_vsi *vsi = rxq->vsi;
>  	struct ice_hw *hw = ICE_VSI_TO_HW(vsi);
> -	uint64_t ts_ns;
>  	struct ice_adapter *ad = rxq->vsi->adapter;
> -#endif
>
> -	if (rxq->offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP)
> -		rxq->hw_register_set = 1;
> +	if (rxq->offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) {
> +		uint64_t sw_cur_time = rte_get_timer_cycles() / (rte_get_timer_hz() / 1000);
> +
> +		if (unlikely(sw_cur_time - ad->hw_time_update > 4))
> +			is_tsinit = true;
> +	}
> +#endif
>
>  	while (nb_rx < nb_pkts) {
>  		rxdp = &rx_ring[rx_id];
> @@ -1951,14 +1975,25 @@ ice_recv_scattered_pkts(void *rx_queue,
>  		pkt_flags = ice_rxd_error_to_pkt_flags(rx_stat_err0);
>  #ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
>  		if (ice_timestamp_dynflag > 0) {
> -			ts_ns = ice_tstamp_convert_32b_64b(hw, ad,
> -				rxq->hw_register_set,
> -				rte_le_to_cpu_32(rxd.wb.flex_ts.ts_high));
> -			rxq->hw_register_set = 0;
> -			*RTE_MBUF_DYNFIELD(first_seg,
> -				ice_timestamp_dynfield_offset,
> -				rte_mbuf_timestamp_t *) = ts_ns;
> -			first_seg->ol_flags |= ice_timestamp_dynflag;
> +			rxq->time_high =
> +			   rte_le_to_cpu_32(rxd.wb.flex_ts.ts_high);
> +			if (unlikely(is_tsinit)) {
> +				ts_ns = ice_tstamp_convert_32b_64b(hw, ad, 1, rxq->time_high);
> +				ad->hw_time_low = (uint32_t)ts_ns;
> +				ad->hw_time_high = (uint32_t)(ts_ns >> 32);
> +				is_tsinit = false;
> +			} else {
> +				if (rxq->time_high < ad->hw_time_low)
> +					ad->hw_time_high += 1;
> +				ts_ns = (uint64_t)ad->hw_time_high << 32 | rxq->time_high;
> +				ad->hw_time_low = rxq->time_high;
> +			}
> +			ad->hw_time_update = rte_get_timer_cycles() /
> +					     (rte_get_timer_hz() / 1000);
> +			*RTE_MBUF_DYNFIELD(rxm,
> +					   (ice_timestamp_dynfield_offset),
> +					   rte_mbuf_timestamp_t *) = ts_ns;
> +			pkt_flags |= ice_timestamp_dynflag;
>  		}
>
>  		if (ad->ptp_ena && ((first_seg->packet_type & RTE_PTYPE_L2_MASK)
> @@ -2325,14 +2360,19 @@ ice_recv_pkts(void *rx_queue,
>  	uint64_t pkt_flags;
>  	uint32_t *ptype_tbl = rxq->vsi->adapter->ptype_tbl;
>  #ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
> +	bool is_tsinit = false;
> +	uint64_t ts_ns;
>  	struct ice_vsi *vsi = rxq->vsi;
>  	struct ice_hw *hw = ICE_VSI_TO_HW(vsi);
> -	uint64_t ts_ns;
>  	struct ice_adapter *ad = rxq->vsi->adapter;
> -#endif
>
> -	if (rxq->offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP)
> -		rxq->hw_register_set = 1;
> +	if (rxq->offloads & RTE_ETH_RX_OFFLOAD_TIMESTAMP) {
> +		uint64_t sw_cur_time = rte_get_timer_cycles() / (rte_get_timer_hz() / 1000);
> +
> +		if (unlikely(sw_cur_time - ad->hw_time_update > 4))
> +			is_tsinit = 1;
> +	}
> +#endif
>
>  	while (nb_rx < nb_pkts) {
>  		rxdp = &rx_ring[rx_id];
> @@ -2386,14 +2426,25 @@ ice_recv_pkts(void *rx_queue,
>  		pkt_flags = ice_rxd_error_to_pkt_flags(rx_stat_err0);
>  #ifndef RTE_LIBRTE_ICE_16BYTE_RX_DESC
>  		if (ice_timestamp_dynflag > 0) {
> -			ts_ns = ice_tstamp_convert_32b_64b(hw, ad,
> -				rxq->hw_register_set,
> -				rte_le_to_cpu_32(rxd.wb.flex_ts.ts_high));
> -			rxq->hw_register_set = 0;
> +			rxq->time_high =
> +			   rte_le_to_cpu_32(rxd.wb.flex_ts.ts_high);
> +			if (unlikely(is_tsinit)) {
> +				ts_ns = ice_tstamp_convert_32b_64b(hw, ad, 1, rxq->time_high);
> +				ad->hw_time_low = (uint32_t)ts_ns;
> +				ad->hw_time_high = (uint32_t)(ts_ns >> 32);
> +				is_tsinit = false;
> +			} else {
> +				if (rxq->time_high < ad->hw_time_low)
> +					ad->hw_time_high += 1;
> +				ts_ns = (uint64_t)ad->hw_time_high << 32 | rxq->time_high;
> +				ad->hw_time_low = rxq->time_high;
> +			}
> +			ad->hw_time_update = rte_get_timer_cycles() /
> +					     (rte_get_timer_hz() / 1000);
>  			*RTE_MBUF_DYNFIELD(rxm,
> -				ice_timestamp_dynfield_offset,
> -				rte_mbuf_timestamp_t *) = ts_ns;
> -			rxm->ol_flags |= ice_timestamp_dynflag;
> +					   (ice_timestamp_dynfield_offset),
> +					   rte_mbuf_timestamp_t *) = ts_ns;
> +			pkt_flags |= ice_timestamp_dynflag;
>  		}
>
>  		if (ad->ptp_ena && ((rxm->packet_type & RTE_PTYPE_L2_MASK) ==
> @@ -2408,6 +2459,7 @@ ice_recv_pkts(void *rx_queue,
>  		/* copy old mbuf to rx_pkts */
>  		rx_pkts[nb_rx++] = rxm;
>  	}
> +
>  	rxq->rx_tail = rx_id;
>  	/**
>  	 * If the number of free RX descriptors is greater than the RX free
> --
> 2.25.1
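
For readers following the thread, the idea in the commit message (extend the
32-bit descriptor timestamp to 64 bits by watching for the low word going
backwards, and fall back to a full resync when the cached state is too old)
can be sketched in isolation. This is only an illustration, not the driver's
code: struct ts_state, ts_extend(), TS_STALE_MS and the full_hw_time parameter
are hypothetical names, and the slow path here is a simplified stand-in for
what ice_tstamp_convert_32b_64b() does with a real register read.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical cached state; the patch keeps similar fields in ice_adapter. */
struct ts_state {
	uint32_t hi;         /* cached high 32 bits of the 64-bit time */
	uint32_t lo;         /* low 32 bits seen on the previous packet */
	uint64_t updated_ms; /* software clock (ms) at the last update */
	bool valid;          /* false until the first full resync */
};

/* Assumed staleness bound in ms, mirroring the "> 4" check in the patch. */
#define TS_STALE_MS 4

/*
 * Extend a 32-bit descriptor timestamp to 64 bits.
 * full_hw_time stands in for an expensive 64-bit clock read and is only
 * consulted on the slow path (no valid state, or state older than the bound).
 */
uint64_t
ts_extend(struct ts_state *st, uint32_t desc_lo, uint64_t now_ms,
	  uint64_t full_hw_time)
{
	assert(st != NULL);

	if (!st->valid || now_ms - st->updated_ms > TS_STALE_MS) {
		uint32_t clk_lo = (uint32_t)full_hw_time;
		uint32_t clk_hi = (uint32_t)(full_hw_time >> 32);

		/*
		 * Simplifying assumption: the packet was stamped shortly
		 * before the clock was read, so a larger low word means the
		 * clock wrapped in between and the packet belongs to the
		 * previous high word.
		 */
		if (desc_lo > clk_lo)
			clk_hi -= 1;
		st->hi = clk_hi;
		st->valid = true;
	} else if (desc_lo < st->lo) {
		/* Fast path: low word went backwards, so it wrapped once. */
		st->hi += 1;
	}

	st->lo = desc_lo;
	st->updated_ms = now_ms;
	return ((uint64_t)st->hi << 32) | desc_lo;
}
```

The fast path is only sound if at most one wrap of the low word can occur
between two updates, which is why a staleness check has to run before each
burst; the bound of 4 is taken from the patch and not derived here.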