From mboxrd@z Thu Jan 1 00:00:00 1970
From: Jerin Jacob
To: Honnappa Nagarahalli
CC: bruce.richardson@intel.com, pablo.de.lara.guarch@intel.com,
 dev@dpdk.org, yipeng1.wang@intel.com, dharmik.thakkar@arm.com,
 gavin.hu@arm.com, nd@arm.com, thomas@monjalon.net,
 ferruh.yigit@intel.com, hemant.agrawal@nxp.com,
 chaozhu@linux.vnet.ibm.com
Date: Sat, 3 Nov 2018 15:40:55 +0000
Message-ID: <20181103154039.GA25488@jerin>
References: <1540532253-112591-1-git-send-email-honnappa.nagarahalli@arm.com>
 <1540532253-112591-5-git-send-email-honnappa.nagarahalli@arm.com>
 <20181103115240.GA3608@jerin>
In-Reply-To: <20181103115240.GA3608@jerin>
Subject: Re: [dpdk-dev] [PATCH v7 4/5] hash: add lock-free read-write
 concurrency
List-Id: DPDK patches and discussions <dev.dpdk.org>
-----Original Message-----
> Date: Sat, 3 Nov 2018 11:52:54 +0000
> From: Jerin Jacob
> To: Honnappa Nagarahalli
> CC: bruce.richardson@intel.com, pablo.de.lara.guarch@intel.com,
>  dev@dpdk.org, yipeng1.wang@intel.com, dharmik.thakkar@arm.com,
>  gavin.hu@arm.com, nd@arm.com, thomas@monjalon.net,
>  ferruh.yigit@intel.com, hemant.agrawal@nxp.com
> Subject: Re: [dpdk-dev] [PATCH v7 4/5] hash: add lock-free read-write
>  concurrency
>
> -----Original Message-----
> > Date: Fri, 26 Oct 2018 00:37:32 -0500
> > From: Honnappa Nagarahalli
> > To: bruce.richardson@intel.com, pablo.de.lara.guarch@intel.com
> > CC: dev@dpdk.org, yipeng1.wang@intel.com,
> >  honnappa.nagarahalli@arm.com, dharmik.thakkar@arm.com,
> >  gavin.hu@arm.com, nd@arm.com
> > Subject: [dpdk-dev] [PATCH v7 4/5] hash: add lock-free read-write
> >  concurrency
> > X-Mailer: git-send-email 2.7.4
> >
> > Add lock-free read-write concurrency. This is achieved by the
> > following changes.
> >
> > 1) Add memory ordering to avoid race conditions. The only race
> > condition that can occur is using the key store element before the
> > key write is completed. Hence, the release memory order is used
> > while inserting the element. Any other race condition is caught by
> > the key comparison. Memory orderings are added only where needed.
> > For example, reads in the writer's context do not need memory
> > ordering, as there is a single writer.
> >
> > key_idx in the bucket entry and pdata in the key store element are
> > used for synchronisation. key_idx is used to release an inserted
> > entry in the bucket to the reader. Use of pdata for synchronisation
> > is required because an update of an existing entry changes only
> > pdata, without updating key_idx.
> >
> > 2) The reader-writer concurrency issue caused by moving keys to
> > their alternative locations during key insert is solved by
> > introducing a global counter (tbl_chng_cnt) that indicates a change
> > in the table.
> >
> > 3) Add a flag to enable reader-writer concurrency at run time.
> >
> > Signed-off-by: Honnappa Nagarahalli
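The scheme described in point (1) is the classic release/acquire
publish pattern. As a minimal standalone sketch (the names key_slot,
key_store, publish and lookup here only mirror the rte_hash layout;
this is not the actual library code):

#include <stdint.h>
#include <string.h>

#define EMPTY_SLOT 0

struct key_slot { char key[16]; void *pdata; };  /* fixed 16-byte keys */

static struct key_slot key_store[64];
static uint32_t key_idx[8];     /* bucket entries; 0 means empty */

/* Writer: fill the key slot first, then publish its index with a
 * release store. Being the single writer, its own reads need no
 * ordering. */
static void publish(int ent, uint32_t idx, const char *key, void *pdata)
{
        memcpy(key_store[idx].key, key, sizeof(key_store[idx].key));
        key_store[idx].pdata = pdata;
        /* release: the writes above become visible before key_idx does */
        __atomic_store_n(&key_idx[ent], idx, __ATOMIC_RELEASE);
}

/* Reader: the acquire load pairs with the release store above, so it
 * observes either EMPTY_SLOT or a fully written key, never a
 * half-written one. */
static void *lookup(int ent, const char *key)
{
        uint32_t idx = __atomic_load_n(&key_idx[ent], __ATOMIC_ACQUIRE);

        if (idx != EMPTY_SLOT &&
            memcmp(key_store[idx].key, key,
                   sizeof(key_store[idx].key)) == 0)
                return key_store[idx].pdata;
        return NULL;
}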
> Hi Honnappa,
>
> This patch is causing a _~24%_ performance regression in mpps/core
> with 64B packets, with l3fwd in EM mode on octeontx.
>
> Example command to reproduce, with 2 cores + 2 ports and l3fwd in
> hash mode (-E):
>
> # l3fwd -v -c 0xf00000 -n 4 -- -P -E -p 0x3 --config="(0, 0, 23),(1, 0, 22)"
>
> Observations:
>
> 1) When the hash lookup is a _success_, the regression is only 3%,
> which makes sense given the additional new atomic instructions.
>
> What I mean by lookup _success_ is configuring the traffic generator
> as below, so packets match the entries defined in
> ipv4_l3fwd_em_route_array() in examples/l3fwd/l3fwd_em.c:
>
> dest.ip   port0 201.0.0.0
> src.ip    port0 200.20.0.1
> dest.port port0 102
> src.port  port0 12
>
> dest.ip   port1 101.0.0.0
> src.ip    port1 100.10.0.1
> dest.port port1 101
> src.port  port1 11
>
> tx.type   IPv4+TCP
>
> 2) When the hash lookup _fails_, the per-core mpps regression comes
> to around 24% with 64B packet size.
>
> What I mean by lookup _failure_ is configuring the traffic generator
> not to hit the 5-tuples defined in ipv4_l3fwd_em_route_array() in
> examples/l3fwd/l3fwd_em.c.
>
> 3) perf top _without_ this patch:
>
>   37.30%  l3fwd         [.] em_main_loop
>   22.40%  l3fwd         [.] rte_hash_lookup
>   13.05%  l3fwd         [.] nicvf_recv_pkts_cksum
>    9.70%  l3fwd         [.] nicvf_xmit_pkts
>    6.18%  l3fwd         [.] ipv4_hash_crc
>    4.77%  l3fwd         [.] nicvf_fill_rbdr
>    4.50%  l3fwd         [.] nicvf_single_pool_free_xmited_buffers
>    1.16%  libc-2.28.so  [.] memcpy
>    0.47%  l3fwd         [.] common_ring_mp_enqueue
>    0.44%  l3fwd         [.] common_ring_mc_dequeue
>    0.03%  l3fwd         [.] strerror_r@plt
>
> 4) perf top _with_ this patch:
>
>   47.41%  l3fwd         [.] rte_hash_lookup
>   23.55%  l3fwd         [.] em_main_loop
>    9.53%  l3fwd         [.] nicvf_recv_pkts_cksum
>    6.95%  l3fwd         [.] nicvf_xmit_pkts
>    4.63%  l3fwd         [.] ipv4_hash_crc
>    3.30%  l3fwd         [.] nicvf_fill_rbdr
>    3.29%  l3fwd         [.] nicvf_single_pool_free_xmited_buffers
>    0.76%  libc-2.28.so  [.] memcpy
>    0.30%  l3fwd         [.] common_ring_mp_enqueue
>    0.25%  l3fwd         [.] common_ring_mc_dequeue
>    0.04%  l3fwd         [.] strerror_r@plt
>
> 5) Based on the assembly, most of the cycles in rte_hash_lookup are
> spent around key_idx = __atomic_load_n(&bkt->key_idx[i], ...) (which
> is an LDAR) and "if (bkt->sig_current[i] == sig && key_idx !=
> EMPTY_SLOT) {".
>
> 6) Since this patch is big and does the 3 things mentioned above, it
> is difficult to pinpoint the exact cause. But my primary analysis
> points to item (1) (adding the atomic barriers). I need to spend more
> cycles to find the exact cause.

+ Adding the POWERPC maintainer, as POWERPC is most likely also
impacted by this patch. __atomic_load_n(__ATOMIC_ACQUIRE) looks like it
will be just a mov instruction on x86, so x86 may not be much impacted.

I analyzed it further; it is a plain LD vs
__atomic_load_n(__ATOMIC_ACQUIRE) issue. The outer
__rte_hash_lookup_with_hash() has only around two __atomic_load_n()
operations, which cause only around a 1% regression.

But this patch puts "two" __atomic_load_n() calls in each
search_one_bucket(), and in the worst case that loops around 16 times,
i.e. 32 LDARs per packet. That explains the 24% drop in the lookup-miss
case and the ~3% drop in the lookup-success case.

So the size of this patch's regression depends on how many cycles an
LDAR takes on a given ARMv8 platform and on how many LDAR instructions
it can issue at a given point in time.
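For reference, the difference being measured is just the following two
loads (a sketch, not library code). On AArch64, GCC emits a plain LDR
for the first function and an LDAR for the second; LDAR additionally
orders all subsequent loads, which is what makes 32 of them per packet
expensive:

#include <stdint.h>

uint32_t plain_load(const uint32_t *p)
{
        return *p;                                      /* LDR  */
}

uint32_t acquire_load(const uint32_t *p)
{
        return __atomic_load_n(p, __ATOMIC_ACQUIRE);    /* LDAR */
}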
IMO, this scheme won't work. I think, if we are introducing such a
performance-critical feature, we need to put it behind a function
pointer scheme, so that an application that does not need the feature
can use plain loads. We already have a lot of flags in the hash library
to define run-time behavior; I think it makes sense to select the
function pointer based on such flags and have a performance-effective
solution based on application requirements.
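A rough sketch of the kind of dispatch meant here; every name in it is
hypothetical rather than the actual rte_hash API. The implementation is
chosen once, at create time, from the configuration flags, so the
per-packet path pays for acquire semantics only when lock-free
concurrency was actually requested:

#include <stdint.h>

struct hash { uint32_t flags; /* ... buckets, key store ... */ };

typedef int32_t (*lookup_fn)(const struct hash *h, const void *key,
                             void **data);

/* plain-load variant: no writer in the fast path, or rwlock mode */
static int32_t lookup_plain(const struct hash *h, const void *key,
                            void **data)
{ (void)h; (void)key; (void)data; /* ... plain LDs ... */ return -1; }

/* lock-free variant: acquire loads pairing with release stores */
static int32_t lookup_lf(const struct hash *h, const void *key,
                         void **data)
{ (void)h; (void)key; (void)data; /* ... LDAR loads ... */ return -1; }

#define FLAG_RW_CONCURRENCY_LF (1u << 3)        /* hypothetical bit */

/* selected once when the table is created, not per packet */
static lookup_fn select_lookup(uint32_t flags)
{
        return (flags & FLAG_RW_CONCURRENCY_LF) ?
                        lookup_lf : lookup_plain;
}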
Just to prove the above root-cause analysis, the following patch fixes
the performance issue. I know it is NOT correct in the context of this
patch; I am pasting it only in case someone wants to see the cost of a
plain LD vs __atomic_load_n(__ATOMIC_ACQUIRE) on a given platform.

On a different note, I think it makes sense to use an RCU-based
structure in such cases to avoid the performance issue. liburcu has a
good hash library for exactly these cases (very few writes, mostly
reads); a minimal sketch is at the end of this mail.

/Jerin

@@ -1135,27 +1134,21 @@ search_one_bucket(const struct rte_hash *h, const void *key,
                        uint16_t sig, void **data, const struct rte_hash_bucket *bkt)
 {
        int i;
-       uint32_t key_idx;
-       void *pdata;
        struct rte_hash_key *k, *keys = h->key_store;

        for (i = 0; i < RTE_HASH_BUCKET_ENTRIES; i++) {
-               key_idx = __atomic_load_n(&bkt->key_idx[i],
-                                       __ATOMIC_ACQUIRE);
-               if (bkt->sig_current[i] == sig && key_idx != EMPTY_SLOT) {
+               if (bkt->sig_current[i] == sig &&
+                               bkt->key_idx[i] != EMPTY_SLOT) {
                        k = (struct rte_hash_key *) ((char *)keys +
-                                       key_idx * h->key_entry_size);
-                       pdata = __atomic_load_n(&k->pdata,
-                                       __ATOMIC_ACQUIRE);
-
+                                       bkt->key_idx[i] * h->key_entry_size);
                        if (rte_hash_cmp_eq(key, k->key, h) == 0) {
                                if (data != NULL)
-                                       *data = pdata;
+                                       *data = k->pdata;
                                /*
                                 * Return index where key is stored,
                                 * subtracting the first dummy index
                                 */
-                               return key_idx - 1;
+                               return bkt->key_idx[i] - 1;
                        }
                }
        }

> A use case like l3fwd in hash mode, where the writer does not update
> anything in the fast path (i.e., the insert op is not in the fast
> path), will be impacted by this patch.
>
> 7) Have you checked the l3fwd lookup-failure use case in your
> environment? If so, please share your observations; if not, could you
> please check it?
>
> 8) IMO, such a performance regression is not acceptable for the l3fwd
> use case, where the hash insert op is done in the slow path.
>
> 9) Is anyone else facing this problem?
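Returning to the liburcu suggestion above, here is a minimal sketch of
what the read path looks like with urcu's lock-free hash table
(rculfhash). It assumes the default urcu flavour, a caller-provided
hash value, and that the calling thread has already called
rcu_register_thread(); it is an illustration, not a drop-in replacement
for rte_hash:

#include <urcu.h>
#include <urcu/rculfhash.h>
#include <stdint.h>

struct entry {
        uint32_t key;
        void *pdata;
        struct cds_lfht_node node;      /* hash table linkage */
};

static int match(struct cds_lfht_node *node, const void *key)
{
        struct entry *e = caa_container_of(node, struct entry, node);

        return e->key == *(const uint32_t *)key;
}

static void *lookup(struct cds_lfht *ht, uint32_t key, unsigned long hash)
{
        struct cds_lfht_iter iter;
        struct cds_lfht_node *node;
        void *pdata = NULL;

        rcu_read_lock();        /* no acquire load per bucket entry */
        cds_lfht_lookup(ht, hash, match, &key, &iter);
        node = cds_lfht_iter_get_node(&iter);
        if (node != NULL)
                pdata = caa_container_of(node, struct entry, node)->pdata;
        rcu_read_unlock();

        return pdata;
}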