From mboxrd@z Thu Jan 1 00:00:00 1970
From: Harman Kalra
To: Pavan Nikhilesh Bhagavatula, "Jerin Jacob Kollanukkaran", Nithin Kumar Dabilpuram, Kiran Kumar Kokkilagadda
CC: "dev@dpdk.org", Harman Kalra
Date: Sat, 27 Jul 2019 20:41:09 +0000
Message-ID: <1564260052-28926-1-git-send-email-hkalra@marvell.com>
List-Id: DPDK patches and discussions
Subject: [dpdk-dev] [PATCH v2 1/2] net/octeontx2: fix ptp performance issue

A huge drop in per-core Mpps was observed when the PTP stack is enabled.
The bottleneck arises because the HW serialises the transfer of all SQEs
that request timestamp capture on the same send DMA path. Hence only
those packets which require timestamp capture should set SETTSTMP in the
send mem alg.
With this patch, timestamping is done only for those packets that have
PKT_TX_IEEE1588_TMST set.

Fixes: fb3ae0951abd ("net/octeontx2: support Tx")
Fixes: 8980a153006b ("event/octeontx2: support PTP for SSO")

Signed-off-by: Harman Kalra
---
 drivers/event/octeontx2/otx2_evdev.h       |  7 ++++++-
 drivers/event/octeontx2/otx2_worker.h      | 10 +++++++--
 drivers/event/octeontx2/otx2_worker_dual.h | 14 +++++++++++--
 drivers/net/octeontx2/otx2_ethdev.c        |  2 --
 drivers/net/octeontx2/otx2_rx.c            |  3 ++-
 drivers/net/octeontx2/otx2_rx.h            | 24 +++++++++++++---------
 drivers/net/octeontx2/otx2_tx.h            | 19 +++++++++++++----
 7 files changed, 57 insertions(+), 22 deletions(-)

diff --git a/drivers/event/octeontx2/otx2_evdev.h b/drivers/event/octeontx2/otx2_evdev.h
index 9c9718f6f..5cd80e3b2 100644
--- a/drivers/event/octeontx2/otx2_evdev.h
+++ b/drivers/event/octeontx2/otx2_evdev.h
@@ -25,6 +25,7 @@
 #define OTX2_SSO_SQB_LIMIT		(0x180)
 #define OTX2_SSO_XAQ_SLACK		(8)
 #define OTX2_SSO_XAQ_CACHE_CNT		(0x7)
+#define OTX2_SSO_WQE_SG_PTR		(9)
 
 /* SSO LF register offsets (BAR2) */
 #define SSO_LF_GGRP_OP_ADD_WORK0	(0x0ull)
@@ -222,10 +223,14 @@ otx2_wqe_to_mbuf(uint64_t get_work1, const uint64_t mbuf, uint8_t port_id,
 		 const void * const lookup_mem)
 {
 	struct nix_wqe_hdr_s *wqe = (struct nix_wqe_hdr_s *)get_work1;
+	uint64_t val = mbuf_init.value | (uint64_t)port_id << 48;
+
+	if (flags & NIX_RX_OFFLOAD_TSTAMP_F)
+		val |= NIX_TIMESYNC_RX_OFFSET;
 
 	otx2_nix_cqe_to_mbuf((struct nix_cqe_hdr_s *)wqe, tag,
 			     (struct rte_mbuf *)mbuf, lookup_mem,
-			     mbuf_init.value | (uint64_t)port_id << 48, flags);
+			     val, flags);
 
 }
 
diff --git a/drivers/event/octeontx2/otx2_worker.h b/drivers/event/octeontx2/otx2_worker.h
index 3c847d223..76f91bb59 100644
--- a/drivers/event/octeontx2/otx2_worker.h
+++ b/drivers/event/octeontx2/otx2_worker.h
@@ -18,6 +18,7 @@ otx2_ssogws_get_work(struct otx2_ssogws *ws, struct rte_event *ev,
 		     const uint32_t flags, const void * const lookup_mem)
 {
 	union otx2_sso_event event;
+	uint64_t tstamp_ptr;
 	uint64_t get_work1;
 	uint64_t mbuf;
 
@@ -69,8 +70,10 @@ otx2_ssogws_get_work(struct otx2_ssogws *ws, struct rte_event *ev,
 		otx2_wqe_to_mbuf(get_work1, mbuf, event.sub_event_type,
 				 (uint32_t) event.get_work0, flags, lookup_mem);
 		/* Extracting tstamp, if PTP enabled*/
+		tstamp_ptr = *(uint64_t *)(((struct nix_wqe_hdr_s *)get_work1)
+					     + OTX2_SSO_WQE_SG_PTR);
 		otx2_nix_mbuf_to_tstamp((struct rte_mbuf *)mbuf, ws->tstamp,
-					flags);
+					flags, (uint64_t *)tstamp_ptr);
 		get_work1 = mbuf;
 	}
 
@@ -86,6 +89,7 @@ otx2_ssogws_get_work_empty(struct otx2_ssogws *ws, struct rte_event *ev,
 			   const uint32_t flags)
 {
 	union otx2_sso_event event;
+	uint64_t tstamp_ptr;
 	uint64_t get_work1;
 	uint64_t mbuf;
 
@@ -131,8 +135,10 @@ otx2_ssogws_get_work_empty(struct otx2_ssogws *ws, struct rte_event *ev,
 		otx2_wqe_to_mbuf(get_work1, mbuf, event.sub_event_type,
 				 (uint32_t) event.get_work0, flags, NULL);
 		/* Extracting tstamp, if PTP enabled*/
+		tstamp_ptr = *(uint64_t *)(((struct nix_wqe_hdr_s *)get_work1)
+					     + OTX2_SSO_WQE_SG_PTR);
 		otx2_nix_mbuf_to_tstamp((struct rte_mbuf *)mbuf, ws->tstamp,
-					flags);
+					flags, (uint64_t *)tstamp_ptr);
 		get_work1 = mbuf;
 	}
 
diff --git a/drivers/event/octeontx2/otx2_worker_dual.h b/drivers/event/octeontx2/otx2_worker_dual.h
index 4a72f424d..5134e3d52 100644
--- a/drivers/event/octeontx2/otx2_worker_dual.h
+++ b/drivers/event/octeontx2/otx2_worker_dual.h
@@ -21,6 +21,7 @@ otx2_ssogws_dual_get_work(struct otx2_ssogws_state *ws,
 {
 	const uint64_t set_gw = BIT_ULL(16) | 1;
 	union otx2_sso_event event;
+	uint64_t tstamp_ptr;
 	uint64_t get_work1;
 	uint64_t mbuf;
 
@@ -70,8 +71,17 @@ otx2_ssogws_dual_get_work(struct otx2_ssogws_state *ws,
 	    event.event_type == RTE_EVENT_TYPE_ETHDEV) {
 		otx2_wqe_to_mbuf(get_work1, mbuf, event.sub_event_type,
 				 (uint32_t) event.get_work0, flags, lookup_mem);
-		/* Extracting tstamp, if PTP enabled*/
-		otx2_nix_mbuf_to_tstamp((struct rte_mbuf *)mbuf, tstamp, flags);
+		/* Extracting tstamp, if PTP enabled. CGX will prepend the
+		 * timestamp at the start of the packet data and it can be
+		 * derived from WQE dword 9, which corresponds to the SG iova.
+		 * rte_pktmbuf_mtod_offset could be used for this purpose, but
+		 * it brings down performance as it reads mbuf->buf_addr, which
+		 * is generally not in cache on the fast path.
+		 */
+		tstamp_ptr = *(uint64_t *)(((struct nix_wqe_hdr_s *)get_work1)
+					     + OTX2_SSO_WQE_SG_PTR);
+		otx2_nix_mbuf_to_tstamp((struct rte_mbuf *)mbuf, tstamp, flags,
+					(uint64_t *)tstamp_ptr);
 		get_work1 = mbuf;
 	}
 
diff --git a/drivers/net/octeontx2/otx2_ethdev.c b/drivers/net/octeontx2/otx2_ethdev.c
index b018b25b7..595c8003a 100644
--- a/drivers/net/octeontx2/otx2_ethdev.c
+++ b/drivers/net/octeontx2/otx2_ethdev.c
@@ -874,8 +874,6 @@ otx2_nix_form_default_desc(struct otx2_eth_txq *txq)
 		send_mem = (struct nix_send_mem_s *)(txq->cmd +
 						(send_hdr->w0.sizem1 << 1));
 		send_mem->subdc = NIX_SUBDC_MEM;
-		send_mem->dsz = 0x0;
-		send_mem->wmem = 0x1;
 		send_mem->alg = NIX_SENDMEMALG_SETTSTMP;
 		send_mem->addr = txq->dev->tstamp.tx_tstamp_iova;
 	}
diff --git a/drivers/net/octeontx2/otx2_rx.c b/drivers/net/octeontx2/otx2_rx.c
index deefe9588..701efc858 100644
--- a/drivers/net/octeontx2/otx2_rx.c
+++ b/drivers/net/octeontx2/otx2_rx.c
@@ -68,7 +68,8 @@ nix_recv_pkts(void *rx_queue, struct rte_mbuf **rx_pkts,
 
 		otx2_nix_cqe_to_mbuf(cq, cq->tag, mbuf, lookup_mem, mbuf_init,
 				     flags);
-		otx2_nix_mbuf_to_tstamp(mbuf, rxq->tstamp, flags);
+		otx2_nix_mbuf_to_tstamp(mbuf, rxq->tstamp, flags,
+				(uint64_t *)((uint8_t *)mbuf + data_off));
 		rx_pkts[packets++] = mbuf;
 		otx2_prefetch_store_keep(mbuf);
 		head++;
diff --git a/drivers/net/octeontx2/otx2_rx.h b/drivers/net/octeontx2/otx2_rx.h
index e150f38d7..d12e8b809 100644
--- a/drivers/net/octeontx2/otx2_rx.h
+++ b/drivers/net/octeontx2/otx2_rx.h
@@ -50,22 +50,26 @@ union mbuf_initializer {
 
 static __rte_always_inline void
 otx2_nix_mbuf_to_tstamp(struct rte_mbuf *mbuf,
-			struct otx2_timesync_info *tstamp, const uint16_t flag)
+			struct otx2_timesync_info *tstamp, const uint16_t flag,
+			uint64_t *tstamp_ptr)
 {
 	if ((flag & NIX_RX_OFFLOAD_TSTAMP_F) &&
-	    mbuf->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC &&
 	    (mbuf->data_off == RTE_PKTMBUF_HEADROOM +
 	     NIX_TIMESYNC_RX_OFFSET)) {
-		uint64_t *tstamp_ptr;
 
-		/* Deal with rx timestamp */
-		tstamp_ptr = rte_pktmbuf_mtod_offset(mbuf, uint64_t *,
-						     -NIX_TIMESYNC_RX_OFFSET);
+		/* Reading the rx timestamp inserted by CGX, viz. at the
+		 * start of the packet data.
+		 */
 		mbuf->timestamp = rte_be_to_cpu_64(*tstamp_ptr);
-		tstamp->rx_tstamp = mbuf->timestamp;
-		tstamp->rx_ready = 1;
-		mbuf->ol_flags |= PKT_RX_IEEE1588_PTP | PKT_RX_IEEE1588_TMST
-			| PKT_RX_TIMESTAMP;
+		/* PKT_RX_IEEE1588_TMST flag needs to be set only in case
+		 * PTP packets are received.
+		 */
+		if (mbuf->packet_type == RTE_PTYPE_L2_ETHER_TIMESYNC) {
+			tstamp->rx_tstamp = mbuf->timestamp;
+			tstamp->rx_ready = 1;
+			mbuf->ol_flags |= PKT_RX_IEEE1588_PTP |
+				PKT_RX_IEEE1588_TMST | PKT_RX_TIMESTAMP;
+		}
 	}
 }
 
diff --git a/drivers/net/octeontx2/otx2_tx.h b/drivers/net/octeontx2/otx2_tx.h
index b75a220ea..494ba3884 100644
--- a/drivers/net/octeontx2/otx2_tx.h
+++ b/drivers/net/octeontx2/otx2_tx.h
@@ -43,18 +43,29 @@ otx2_nix_xmit_prepare_tstamp(uint64_t *cmd, const uint64_t *send_mem_desc,
 	if (flags & NIX_TX_OFFLOAD_TSTAMP_F) {
 		struct nix_send_mem_s *send_mem;
 		uint16_t off = (no_segdw - 1) << 1;
+		const uint8_t is_ol_tstamp = !(ol_flags & PKT_TX_IEEE1588_TMST);
 
 		send_mem = (struct nix_send_mem_s *)(cmd + off);
-		if (flags & NIX_TX_MULTI_SEG_F)
+		if (flags & NIX_TX_MULTI_SEG_F) {
 			/* Retrieving the default desc values */
 			cmd[off] = send_mem_desc[6];
 
+			/* Using compiler barrier to avoid violation of C
+			 * aliasing rules.
+			 */
+			rte_compiler_barrier();
+		}
+
 		/* Packets for which PKT_TX_IEEE1588_TMST is not set, tx tstamp
-		 * should not be updated at tx tstamp registered address, rather
-		 * a dummy address which is eight bytes ahead would be updated
+		 * should not be recorded, hence changing the alg type to
+		 * NIX_SENDMEMALG_SET and also changing send mem addr field to
+		 * next 8 bytes so that it does not corrupt the actual tx
+		 * tstamp registered address.
 		 */
+		send_mem->alg = NIX_SENDMEMALG_SETTSTMP - (is_ol_tstamp);
+
 		send_mem->addr = (rte_iova_t)((uint64_t *)send_mem_desc[7] +
-				!(ol_flags & PKT_TX_IEEE1588_TMST));
+					      (is_ol_tstamp));
 	}
 }
 
-- 
2.18.0
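
The Tx-side selection made in otx2_nix_xmit_prepare_tstamp() can be looked
at in isolation. What follows is a minimal standalone sketch, not driver
code: the EX_* constants, struct ex_send_mem and ex_prepare_tstamp() are
hypothetical stand-ins, and the numeric values are assumptions chosen only
so that the plain "set" alg is one less than the "set timestamp" alg, which
is the relationship the subtraction in the patch relies on.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver definitions. The values are
 * assumptions; the only property used is SET == SETTSTMP - 1.
 */
#define EX_SENDMEMALG_SET	0x0u
#define EX_SENDMEMALG_SETTSTMP	0x1u
#define EX_TX_IEEE1588_TMST	(1ULL << 51)	/* hypothetical flag bit */

/* Hypothetical, simplified view of the send mem sub-descriptor. */
struct ex_send_mem {
	uint8_t alg;
	uint64_t addr;
};

/* Branchless selection: a packet without the timestamp flag gets the
 * plain SET alg and an address 8 bytes past the registered timestamp
 * location, so the real timestamp slot is never overwritten.
 */
static inline void
ex_prepare_tstamp(struct ex_send_mem *sm, uint64_t tstamp_iova,
		  uint64_t ol_flags)
{
	/* 1 when the packet does NOT request a Tx timestamp, else 0. */
	const uint8_t no_tstamp = !(ol_flags & EX_TX_IEEE1588_TMST);

	assert(sm != NULL);
	sm->alg = (uint8_t)(EX_SENDMEMALG_SETTSTMP - no_tstamp);
	sm->addr = tstamp_iova + (uint64_t)no_tstamp * sizeof(uint64_t);
}
```

Calling ex_prepare_tstamp() with the flag set keeps the timestamp alg and
the registered address; calling it with the flag clear falls back to the
plain set alg and the dummy slot 8 bytes further on. Avoiding a branch
here keeps the per-packet cost constant on the fast path.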