From: Shahed Shaikh <shshaikh@marvell.com>
To: dev@dpdk.org
Date: Thu, 2 Jan 2020 09:59:03 -0800
Message-ID: <20200102175903.9556-2-shshaikh@marvell.com>
X-Mailer: git-send-email 2.12.0
In-Reply-To: <20200102175903.9556-1-shshaikh@marvell.com>
References: <20200102175903.9556-1-shshaikh@marvell.com>
Subject: [dpdk-dev] [PATCH 2/2] net/qede: enhance transmit data path CPU utilization

Use a lightweight transmit handler that serves the non-offloaded Tx data
path. This gives a CPU utilization improvement of ~8%.
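Illustrative sketch (not part of this patch): the Tx handler is chosen
once, at configure time, from the offloads the application requests, so
an application opts into the lightweight path simply by not requesting
tunnel, TSO, or VLAN-insert Tx offloads. In the snippet below, port_id,
nb_rxq, and nb_txq are placeholder names:

  #include <rte_ethdev.h>

  /* Checksum-only Tx offloads still select qede_xmit_pkts_regular;
   * adding DEV_TX_OFFLOAD_TCP_TSO, DEV_TX_OFFLOAD_VLAN_INSERT, or
   * DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM would select qede_xmit_pkts.
   */
  struct rte_eth_conf port_conf = { 0 };

  port_conf.txmode.offloads = DEV_TX_OFFLOAD_IPV4_CKSUM |
                              DEV_TX_OFFLOAD_TCP_CKSUM;
  rte_eth_dev_configure(port_id, nb_rxq, nb_txq, &port_conf);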
Signed-off-by: Shahed Shaikh
---
 drivers/net/qede/qede_ethdev.c |  15 +++-
 drivers/net/qede/qede_rxtx.c   | 125 +++++++++++++++++++++++++++++++++
 drivers/net/qede/qede_rxtx.h   |   2 +
 3 files changed, 141 insertions(+), 1 deletion(-)

diff --git a/drivers/net/qede/qede_ethdev.c b/drivers/net/qede/qede_ethdev.c
index 47e90096a..055f046e2 100644
--- a/drivers/net/qede/qede_ethdev.c
+++ b/drivers/net/qede/qede_ethdev.c
@@ -270,8 +270,10 @@ qede_interrupt_handler(void *param)
 static void
 qede_assign_rxtx_handlers(struct rte_eth_dev *dev)
 {
+	uint64_t tx_offloads = dev->data->dev_conf.txmode.offloads;
 	struct qede_dev *qdev = dev->data->dev_private;
 	struct ecore_dev *edev = &qdev->edev;
+	bool use_tx_offload = false;
 
 	if (ECORE_IS_CMT(edev)) {
 		dev->rx_pkt_burst = qede_recv_pkts_cmt;
@@ -287,7 +289,18 @@ qede_assign_rxtx_handlers(struct rte_eth_dev *dev)
 		dev->rx_pkt_burst = qede_recv_pkts_regular;
 	}
 
-	dev->tx_pkt_burst = qede_xmit_pkts;
+	use_tx_offload = !!(tx_offloads &
+			    (DEV_TX_OFFLOAD_OUTER_IPV4_CKSUM | /* tunnel */
+			     DEV_TX_OFFLOAD_TCP_TSO | /* tso */
+			     DEV_TX_OFFLOAD_VLAN_INSERT)); /* vlan insert */
+
+	if (use_tx_offload) {
+		DP_INFO(edev, "Assigning qede_xmit_pkts\n");
+		dev->tx_pkt_burst = qede_xmit_pkts;
+	} else {
+		DP_INFO(edev, "Assigning qede_xmit_pkts_regular\n");
+		dev->tx_pkt_burst = qede_xmit_pkts_regular;
+	}
 }
 
 static void
diff --git a/drivers/net/qede/qede_rxtx.c b/drivers/net/qede/qede_rxtx.c
index 3b486a0a4..985e49f1c 100644
--- a/drivers/net/qede/qede_rxtx.c
+++ b/drivers/net/qede/qede_rxtx.c
@@ -2234,6 +2234,131 @@ qede_mpls_tunn_tx_sanity_check(struct rte_mbuf *mbuf,
 }
 #endif
 
+uint16_t
+qede_xmit_pkts_regular(void *p_txq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
+{
+	struct qede_tx_queue *txq = p_txq;
+	struct qede_dev *qdev = txq->qdev;
+	struct ecore_dev *edev = &qdev->edev;
+	struct eth_tx_1st_bd *bd1;
+	struct eth_tx_2nd_bd *bd2;
+	struct eth_tx_3rd_bd *bd3;
+	struct rte_mbuf *m_seg = NULL;
+	struct rte_mbuf *mbuf;
+	struct qede_tx_entry *sw_tx_ring;
+	uint16_t nb_tx_pkts;
+	uint16_t bd_prod;
+	uint16_t idx;
+	uint16_t nb_frags = 0;
+	uint16_t nb_pkt_sent = 0;
+	uint8_t nbds;
+	uint64_t tx_ol_flags;
+	/* BD1 */
+	uint16_t bd1_bf;
+	uint8_t bd1_bd_flags_bf;
+
+	if (unlikely(txq->nb_tx_avail < txq->tx_free_thresh)) {
+		PMD_TX_LOG(DEBUG, txq, "send=%u avail=%u free_thresh=%u",
+			   nb_pkts, txq->nb_tx_avail, txq->tx_free_thresh);
+		qede_process_tx_compl(edev, txq);
+	}
+
+	nb_tx_pkts = nb_pkts;
+	bd_prod = rte_cpu_to_le_16(ecore_chain_get_prod_idx(&txq->tx_pbl));
+	sw_tx_ring = txq->sw_tx_ring;
+
+	while (nb_tx_pkts--) {
+		/* Init flags/values */
+		nbds = 0;
+		bd1 = NULL;
+		bd2 = NULL;
+		bd3 = NULL;
+		bd1_bf = 0;
+		bd1_bd_flags_bf = 0;
+		nb_frags = 0;
+
+		mbuf = *tx_pkts++;
+		assert(mbuf);
+
+
+		/* Check minimum TX BDS availability against available BDs */
+		if (unlikely(txq->nb_tx_avail < mbuf->nb_segs))
+			break;
+
+		tx_ol_flags = mbuf->ol_flags;
+		bd1_bd_flags_bf |= 1 << ETH_TX_1ST_BD_FLAGS_START_BD_SHIFT;
+
+		if (unlikely(txq->nb_tx_avail <
+			     ETH_TX_MIN_BDS_PER_NON_LSO_PKT))
+			break;
+		bd1_bf |=
+			(mbuf->pkt_len & ETH_TX_DATA_1ST_BD_PKT_LEN_MASK)
+			<< ETH_TX_DATA_1ST_BD_PKT_LEN_SHIFT;
+
+		/* Offload the IP checksum in the hardware */
+		if (tx_ol_flags & PKT_TX_IP_CKSUM)
+			bd1_bd_flags_bf |=
+				1 << ETH_TX_1ST_BD_FLAGS_IP_CSUM_SHIFT;
+
+		/* L4 checksum offload (tcp or udp) */
+		if ((tx_ol_flags & (PKT_TX_IPV4 | PKT_TX_IPV6)) &&
+		    (tx_ol_flags & (PKT_TX_UDP_CKSUM | PKT_TX_TCP_CKSUM)))
+			bd1_bd_flags_bf |=
+				1 << ETH_TX_1ST_BD_FLAGS_L4_CSUM_SHIFT;
+
+		/* Fill the entry in the SW ring and the BDs in the FW ring */
+		idx = TX_PROD(txq);
+		sw_tx_ring[idx].mbuf = mbuf;
+
+		/* BD1 */
+		bd1 = (struct eth_tx_1st_bd *)ecore_chain_produce(&txq->tx_pbl);
+		memset(bd1, 0, sizeof(struct eth_tx_1st_bd));
+		nbds++;
+
+		/* Map MBUF linear data for DMA and set in the BD1 */
+		QEDE_BD_SET_ADDR_LEN(bd1, rte_mbuf_data_iova(mbuf),
+				     mbuf->data_len);
+		bd1->data.bitfields = rte_cpu_to_le_16(bd1_bf);
+		bd1->data.bd_flags.bitfields = bd1_bd_flags_bf;
+
+		/* Handle fragmented MBUF */
+		if (unlikely(mbuf->nb_segs > 1)) {
+			m_seg = mbuf->next;
+
+			/* Encode scatter gather buffer descriptors */
+			nb_frags = qede_encode_sg_bd(txq, m_seg, &bd2, &bd3,
+						     nbds - 1);
+		}
+
+		bd1->data.nbds = nbds + nb_frags;
+
+		txq->nb_tx_avail -= bd1->data.nbds;
+		txq->sw_tx_prod++;
+		bd_prod =
+			rte_cpu_to_le_16(ecore_chain_get_prod_idx(&txq->tx_pbl));
+#ifdef RTE_LIBRTE_QEDE_DEBUG_TX
+		print_tx_bd_info(txq, bd1, bd2, bd3, tx_ol_flags);
+#endif
+		nb_pkt_sent++;
+		txq->xmit_pkts++;
+	}
+
+	/* Write value of prod idx into bd_prod */
+	txq->tx_db.data.bd_prod = bd_prod;
+	rte_wmb();
+	rte_compiler_barrier();
+	DIRECT_REG_WR_RELAXED(edev, txq->doorbell_addr, txq->tx_db.raw);
+	rte_wmb();
+
+	/* Check again for Tx completions */
+	qede_process_tx_compl(edev, txq);
+
+	PMD_TX_LOG(DEBUG, txq, "to_send=%u sent=%u bd_prod=%u core=%d",
+		   nb_pkts, nb_pkt_sent, TX_PROD(txq), rte_lcore_id());
+
+	return nb_pkt_sent;
+}
+
 uint16_t
 qede_xmit_pkts(void *p_txq, struct rte_mbuf **tx_pkts, uint16_t nb_pkts)
 {
diff --git a/drivers/net/qede/qede_rxtx.h b/drivers/net/qede/qede_rxtx.h
index a4c634e88..d7ff870b2 100644
--- a/drivers/net/qede/qede_rxtx.h
+++ b/drivers/net/qede/qede_rxtx.h
@@ -275,6 +275,8 @@ uint16_t qede_xmit_pkts(void *p_txq, struct rte_mbuf **tx_pkts,
 		       uint16_t nb_pkts);
 uint16_t qede_xmit_pkts_cmt(void *p_txq, struct rte_mbuf **tx_pkts,
 			    uint16_t nb_pkts);
+uint16_t qede_xmit_pkts_regular(void *p_txq, struct rte_mbuf **tx_pkts,
+				uint16_t nb_pkts);
 uint16_t qede_xmit_prep_pkts(void *p_txq, struct rte_mbuf **tx_pkts,
 			     uint16_t nb_pkts);
-- 
2.17.1
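Usage sketch (illustrative, not part of the patch): packets sent through
the regular handler may still request IP/L4 checksum offload, which it
encodes in the BD1 flags, while TSO, VLAN insertion, and tunnel offloads
are honored only by qede_xmit_pkts. Here m, port_id, and queue_id are
placeholder names:

  /* Checksum requests are handled by the lightweight path via the
   * ETH_TX_1ST_BD_FLAGS_{IP,L4}_CSUM bits set in its main loop.
   */
  m->ol_flags |= PKT_TX_IPV4 | PKT_TX_IP_CKSUM | PKT_TX_TCP_CKSUM;
  uint16_t nb_sent = rte_eth_tx_burst(port_id, queue_id, &m, 1);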