From: Yong Wang
To: Stephen Hemminger, dev@dpdk.org
Cc: Bill Hong, Stephen Hemminger
Date: Wed, 11 Feb 2015 21:15:55 +0000
Subject: Re: [dpdk-dev] [PATCH 3/7] vmxnet3: add support for multi-segment transmit

On 12/16/14, 9:13 PM, "Stephen Hemminger" wrote:

>From: Stephen Hemminger
>
>Change the send loop to support multi-segment mbufs.
>The VMXNET3 API has start-of-packet and end-of-packet flags, so it
>is not hard to send multi-segment mbufs.
>
>Also, update the descriptor as a 32-bit value rather than toggling
>bitfields, which is slower and error-prone.
>Based on code in the earlier driver and on the Linux kernel driver.
>
>Add a compiler barrier to make sure that updates of earlier descriptors
>are completed prior to the update of the generation bit on the
>start-of-packet descriptor.
>
>Signed-off-by: Stephen Hemminger
>Signed-off-by: Bill Hong
>---
> lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c | 126 +++++++++++++---------------------
> 1 file changed, 48 insertions(+), 78 deletions(-)
>
>diff --git a/lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c b/lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c
>index 8e15784..7cb0b93 100644
>--- a/lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c
>+++ b/lib/librte_pmd_vmxnet3/vmxnet3_rxtx.c
>@@ -312,13 +312,9 @@ vmxnet3_tq_tx_complete(vmxnet3_tx_queue_t *txq)
> 		VMXNET3_ASSERT(txq->cmd_ring.base[tcd->txdIdx].txd.eop == 1);
> #endif
> 		mbuf = txq->cmd_ring.buf_info[tcd->txdIdx].m;
>-		if (unlikely(mbuf == NULL))
>-			rte_panic("EOP desc does not point to a valid mbuf");
>-		else
>-			rte_pktmbuf_free(mbuf);
>-
>-
>+		rte_pktmbuf_free_seg(mbuf);
> 		txq->cmd_ring.buf_info[tcd->txdIdx].m = NULL;
>+
> 		/* Mark the txd for which tcd was generated as completed */
> 		vmxnet3_cmd_ring_adv_next2comp(&txq->cmd_ring);
>
>@@ -336,13 +332,8 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
> 		  uint16_t nb_pkts)
> {
> 	uint16_t nb_tx;
>-	Vmxnet3_TxDesc *txd = NULL;
>-	vmxnet3_buf_info_t *tbi = NULL;
>-	struct vmxnet3_hw *hw;
>-	struct rte_mbuf *txm;
> 	vmxnet3_tx_queue_t *txq = tx_queue;
>-
>-	hw = txq->hw;
>+	struct vmxnet3_hw *hw = txq->hw;
>
> 	if (unlikely(txq->stopped)) {
> 		PMD_TX_LOG(DEBUG, "Tx queue is stopped.");
>@@ -354,75 +345,60 @@ vmxnet3_xmit_pkts(void *tx_queue, struct rte_mbuf **tx_pkts,
>
> 	nb_tx = 0;
> 	while (nb_tx < nb_pkts) {
>+		Vmxnet3_GenericDesc *gdesc;
>+		vmxnet3_buf_info_t *tbi;
>+		uint32_t first2fill, avail, dw2;
>+		struct rte_mbuf *txm = tx_pkts[nb_tx];
>+		struct rte_mbuf *m_seg = txm;
>+
>+		/* Is command ring full? */
>+		avail = vmxnet3_cmd_ring_desc_avail(&txq->cmd_ring);
>+		if (txm->nb_segs > avail) {
>+			++txq->stats.tx_ring_full;
>+			break;
>+		}

You need to check against VMXNET3_MAX_TXD_PER_PKT here, as this is the
maximum number of tx descriptors allowed for a non-TSO packet (see the
sketch at the end of this mail).

Food for thought: if avail is more than 0, it might be better to
continue when the next packet fits into the remaining descriptors,
rather than break here so that all remaining packets get freed.

>
>-		if (vmxnet3_cmd_ring_desc_avail(&txq->cmd_ring)) {
>-			int copy_size = 0;
>-
>-			txm = tx_pkts[nb_tx];
>-			/* Don't support scatter packets yet, free them if met */
>-			if (txm->nb_segs != 1) {
>-				PMD_TX_LOG(DEBUG, "Don't support scatter packets yet, drop!");
>-				rte_pktmbuf_free(tx_pkts[nb_tx]);
>-				txq->stats.drop_total++;
>-
>-				nb_tx++;
>-				continue;
>-			}
>-
>-			txd = (Vmxnet3_TxDesc *)(txq->cmd_ring.base + txq->cmd_ring.next2fill);
>-			if (rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
>-				struct Vmxnet3_TxDataDesc *tdd;
>-
>-				tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
>-				copy_size = rte_pktmbuf_pkt_len(txm);
>-				rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *), copy_size);
>-			}

Why do you remove this code block? It uses a special data ring for
small packets, which results in a substantial performance improvement.
Please refer to commit 2e849373 for details.

Related to this, I wonder what tests you have done, as that's not clear
from the commit descriptions. Do we have any coverage of perf
regressions?
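To be concrete about what I mean, one possible way to keep the copy
path would be something along these lines inside the new fill loop,
reusing the identifiers from the removed block (an untested sketch
only; the data-ring indexing would need to be re-checked against the
multi-segment bookkeeping):

	/* Untested sketch: keep the data-ring fast path for small,
	 * single-segment packets inside the new fill loop. */
	if (txm->nb_segs == 1 &&
	    rte_pktmbuf_pkt_len(txm) <= VMXNET3_HDR_COPY_SIZE) {
		struct Vmxnet3_TxDataDesc *tdd;

		tdd = txq->data_ring.base + txq->cmd_ring.next2fill;
		rte_memcpy(tdd->data, rte_pktmbuf_mtod(txm, char *),
			   rte_pktmbuf_pkt_len(txm));
		/* point the descriptor at the data ring instead of
		 * the mbuf's buffer */
		gdesc->txd.addr = rte_cpu_to_le_64(txq->data_ring.basePA +
				txq->cmd_ring.next2fill *
				sizeof(struct Vmxnet3_TxDataDesc));
	} else {
		gdesc->txd.addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
	}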
>-
>-			/* Fill the tx descriptor */
>+		/* use the previous gen bit for the SOP desc */
>+		dw2 = (txq->cmd_ring.gen ^ 0x1) << VMXNET3_TXD_GEN_SHIFT;
>+		first2fill = txq->cmd_ring.next2fill;
>+		do {
>+			/* Remember the transmit buffer for cleanup */
> 			tbi = txq->cmd_ring.buf_info + txq->cmd_ring.next2fill;
>-			tbi->bufPA = RTE_MBUF_DATA_DMA_ADDR(txm);
>-			if (copy_size)
>-				txd->addr = rte_cpu_to_le_64(txq->data_ring.basePA +
>-							txq->cmd_ring.next2fill *
>-							sizeof(struct Vmxnet3_TxDataDesc));
>-			else
>-				txd->addr = tbi->bufPA;
>-			txd->len = txm->data_len;
>+			tbi->m = m_seg;
>
>-			/* Mark the last descriptor as End of Packet. */
>-			txd->cq = 1;
>-			txd->eop = 1;
>+			/* NB: the following assumes that VMXNET3 maximum
>+			   transmit buffer size (16K) is greater than
>+			   maximum sizeof mbuf segment size. */

Can you add VMXNET3_ASSERT(VMXNET3_MAX_TX_BUF_SIZE >= txd->len)?

>+			gdesc = txq->cmd_ring.base + txq->cmd_ring.next2fill;
>+			gdesc->txd.addr = RTE_MBUF_DATA_DMA_ADDR(m_seg);
>+			gdesc->dword[2] = dw2 | m_seg->data_len;
>+			gdesc->dword[3] = 0;
>
>-			/* Add VLAN tag if requested */
>-			if (txm->ol_flags & PKT_TX_VLAN_PKT) {
>-				txd->ti = 1;
>-				txd->tci = rte_cpu_to_le_16(txm->vlan_tci);
>-			}
>+			/* move to the next2fill descriptor */
>+			vmxnet3_cmd_ring_adv_next2fill(&txq->cmd_ring);
>
>-			/* Record current mbuf for freeing it later in tx complete */
>-#ifdef RTE_LIBRTE_VMXNET3_DEBUG_DRIVER
>-			VMXNET3_ASSERT(txm);
>-#endif
>-			tbi->m = txm;
>+			/* use the right gen for non-SOP desc */
>+			dw2 = txq->cmd_ring.gen << VMXNET3_TXD_GEN_SHIFT;
>+		} while ((m_seg = m_seg->next) != NULL);
>
>-			/* Set the offloading mode to default */
>-			txd->hlen = 0;
>-			txd->om = VMXNET3_OM_NONE;
>-			txd->msscof = 0;
>+		/* Update the EOP descriptor */
>+		gdesc->dword[3] |= VMXNET3_TXD_EOP | VMXNET3_TXD_CQ;
>
>-			/* finally flip the GEN bit of the SOP desc */
>-			txd->gen = txq->cmd_ring.gen;
>-			txq->shared->ctrl.txNumDeferred++;
>+		/* Add VLAN tag if present */
>+		gdesc = txq->cmd_ring.base + first2fill;
>+		if (txm->ol_flags & PKT_TX_VLAN_PKT) {
>+			gdesc->txd.ti = 1;
>+			gdesc->txd.tci = txm->vlan_tci;
>+		}
>
>-			/* move to the next2fill descriptor */
>-			vmxnet3_cmd_ring_adv_next2fill(&txq->cmd_ring);
>-			nb_tx++;
>+		/* TODO: Add transmit checksum offload here */
>-		} else {
>-			PMD_TX_LOG(DEBUG, "No free tx cmd desc(s)");
>-			txq->stats.drop_total += (nb_pkts - nb_tx);
>-			break;
>-		}
>+		/* flip the GEN bit on the SOP */
>+		rte_compiler_barrier();
>+		gdesc->dword[2] ^= VMXNET3_TXD_GEN;
>+
>+		txq->shared->ctrl.txNumDeferred++;
>+		nb_tx++;
> 	}
>
> 	PMD_TX_LOG(DEBUG, "vmxnet3 txThreshold: %u", txq->shared->ctrl.txThreshold);
>@@ -722,12 +698,6 @@ vmxnet3_dev_tx_queue_setup(struct rte_eth_dev *dev,
>
> 	PMD_INIT_FUNC_TRACE();
>
>-	if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOMULTSEGS) !=
>-	    ETH_TXQ_FLAGS_NOMULTSEGS) {
>-		PMD_INIT_LOG(ERR, "TX Multi segment not support yet");
>-		return -EINVAL;
>-	}
>-
> 	if ((tx_conf->txq_flags & ETH_TXQ_FLAGS_NOOFFLOADS) !=
> 	    ETH_TXQ_FLAGS_NOOFFLOADS) {
> 		PMD_INIT_LOG(ERR, "TX not support offload function yet");
>-- 
>2.1.3
>
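To make the VMXNET3_MAX_TXD_PER_PKT comment above concrete, the guard
I have in mind would sit just before the ring-space test, roughly as
follows (an untested sketch; VMXNET3_MAX_TXD_PER_PKT is the existing
per-packet descriptor limit from the vmxnet3 shared headers, and the
drop accounting follows the old single-segment path):

	/* Sketch: a non-TSO packet with more segments than the device
	 * accepts per packet can never be sent; drop it up front. */
	if (unlikely(txm->nb_segs > VMXNET3_MAX_TXD_PER_PKT)) {
		rte_pktmbuf_free(txm);
		txq->stats.drop_total++;
		nb_tx++;
		continue;
	}

	avail = vmxnet3_cmd_ring_desc_avail(&txq->cmd_ring);
	if (txm->nb_segs > avail) {
		/* The "food for thought" above would replace this break
		 * with logic that skips to the next, smaller packet; note
		 * that skipping interacts with the tx_burst return-value
		 * contract, so it would need care. */
		++txq->stats.tx_ring_full;
		break;
	}

Either way, the limit has to be enforced before any descriptor is
written, since backing out partially filled descriptors is awkward.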