From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id AFCFB1B17A for ; Fri, 28 Sep 2018 01:45:49 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 27 Sep 2018 16:45:48 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.54,312,1534834800"; d="scan'208";a="84015230" Received: from fmsmsx107.amr.corp.intel.com ([10.18.124.205]) by FMSMGA003.fm.intel.com with ESMTP; 27 Sep 2018 16:45:32 -0700 Received: from fmsmsx123.amr.corp.intel.com (10.18.125.38) by fmsmsx107.amr.corp.intel.com (10.18.124.205) with Microsoft SMTP Server (TLS) id 14.3.319.2; Thu, 27 Sep 2018 16:45:31 -0700 Received: from fmsmsx151.amr.corp.intel.com ([169.254.7.87]) by fmsmsx123.amr.corp.intel.com ([169.254.7.84]) with mapi id 14.03.0319.002; Thu, 27 Sep 2018 16:45:31 -0700 From: "Wang, Yipeng1" To: Honnappa Nagarahalli , "Richardson, Bruce" , "De Lara Guarch, Pablo" CC: "dev@dpdk.org" , "honnappa.nagarahalli@dpdk.org" , "gavin.hu@arm.com" , "steve.capper@arm.com" , "ola.liljedahl@arm.com" , "nd@arm.com" , "Gobriel, Sameh" Thread-Topic: [dpdk-dev] [PATCH 0/4] Address reader-writer concurrency in rte_hash Thread-Index: AQHURgTuMbp3QXuwrESo2qvnFE2pcaUE6Y7g Date: Thu, 27 Sep 2018 23:45:31 +0000 Message-ID: References: <1536253938-192391-1-git-send-email-honnappa.nagarahalli@arm.com> In-Reply-To: <1536253938-192391-1-git-send-email-honnappa.nagarahalli@arm.com> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: dlp-product: dlpe-windows dlp-version: 11.0.400.15 dlp-reaction: no-action x-ctpclassification: CTP_NT x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiOTkxOWJjZWYtMmIyNy00OWNmLTgzYWItYTRkMzFkMzMyNGMwIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoiUWc0MEVobjBWdmNzK2dqa0JSVmJkYlwvanQ5cFlFQzFTZkt0SjJ0OWp6ZFNvaTFwSmVtWnlMZjRkbUtJZ2ZGSkoifQ== x-originating-ip: [10.1.200.106] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH 0/4] Address reader-writer concurrency in rte_hash X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Thu, 27 Sep 2018 23:45:50 -0000 Hi Honnappa, Reply inlined: >-----Original Message----- > > Currently, reader-writer concurrency problems in rte_hash are > addressed using reader-writer locks. Use of reader-writer locks > results in following issues: > > 1) In many of the use cases for the hash table, writer threads > are running on control plane. If the writer is preempted while > holding the lock, it will block the readers for an extended period > resulting in packet drops. This problem seems to apply for platform= s > with transactional memory support as well because of the algorithm > used for rte_rwlock_write_lock_tm: > > static inline void > rte_rwlock_write_lock_tm(rte_rwlock_t *rwl) > { > if (likely(rte_try_tm(&rwl->cnt))) > return; > rte_rwlock_write_lock(rwl); > } > > i.e. there is a posibility of using rte_rwlock_write_lock in > failure cases. [Wang, Yipeng] In our test, TSX failure happens very rarely on a TSX platf= orm. But we agree that without TSX, the current rte_rwlock implementation may make the writer= to hold a lock for a period of time. > 2) Reader-writer lock based solution does not address the following > issue. > rte_hash_lookup_xxx APIs return the index of the element in > the key store. Application(reader) can use that index to reference > other data structures in its scope. Because of this, the > index should not be freed till the application completes > using the index. [Wang, Yipeng] I agree on this use case. But I think we should provide new= API functions for deletion to users who want this behavior, without changing the meaning of current API if that is possible. > Current code: > Cores Lookup Lookup > with add > 2 474 246 > 4 935 579 > 6 1387 1048 > 8 1766 1480 > 10 2119 1951 > 12 2546 2441 > > With this patch: > Cores Lookup Lookup > with add > 2 291 211 > 4 297 196 > 6 304 198 > 8 309 202 > 10 315 205 > 12 319 209 > [Wang, Yipeng] It would be good if you could provide the platform informati= on on these results. Another comment is as you know we also have a new extension to rte_hash to = enable extendable buckets and partial-key hashing. Thanks for the comments btw. Could you che= ck if your lockless scheme also applies to those extensions?