From: "Carrillo, Erik G"
To: "Wiles, Keith"
CC: "rsanford@akamai.com", "dev@dpdk.org"
Date: Wed, 23 Aug 2017 16:19:20 +0000
Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements ***

> -----Original Message-----
> From: Wiles, Keith
> Sent: Wednesday, August 23, 2017 10:02 AM
> To: Carrillo, Erik G
> Cc: rsanford@akamai.com; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 0/3] *** timer library enhancements ***
>
>
> > On Aug 23, 2017, at 9:47 AM, Gabriel Carrillo wrote:
> >
> > In the current implementation of the DPDK timer library, timers can be
> > created and set to be handled by a target lcore by adding them to a
> > skiplist that corresponds to that lcore.  However, if an application
> > enables multiple lcores, and each of these lcores repeatedly attempts
> > to install timers on the same target lcore, overall application
> > throughput will be reduced as all lcores contend to acquire the lock
> > guarding the single skiplist of pending timers.
> >
> > This patchset addresses this scenario by adding an array of skiplists
> > to each lcore's priv_timer struct, such that when lcore i installs a
> > timer on lcore k, the timer will be added to the ith skiplist for
> > lcore k.
> > If lcore j installs a timer on lcore k simultaneously, lcores i and j
> > can both proceed since they will be acquiring different locks for
> > different lists.
> >
> > When lcore k processes its pending timers, it will traverse each
> > skiplist in its array and acquire a skiplist's lock while a run list
> > is broken out; meanwhile, all other lists can continue to be modified.
> > Then, all run lists for lcore k are collected and traversed together
> > so timers are executed in their global order.
>
> What is the performance and/or latency added to the timeout now?
>
> I worry about the case when just about all of the cores are enabled,
> which could be as high as 128 or more now.

There is a case in the timer_perf_autotest that runs rte_timer_manage with
zero timers that can give a sense of the added latency.  When run with one
lcore, it completes in around 25 cycles.  When run with 43 lcores (the
highest I have access to at the moment), rte_timer_manage completes in
around 155 cycles.  So it looks like each added lcore adds around 3 cycles
of overhead for checking empty lists in my testing.

>
> One option is to have the lcore j that wants to install a timer on
> lcore k pass a message via a ring to lcore k to add that timer.  We
> could even add that logic into setting a timer on a different lcore
> than the caller in the current API.  The ring would be a multi-producer
> and single consumer, we still have the lock.  What am I missing here?
>

I did try this approach: initially I had a multi-producer single-consumer
ring that would hold requests to add or delete a timer from lcore k's
skiplist, but it didn't really give an appreciable increase in my test
application throughput.  In profiling this solution, the hotspot had moved
from acquiring the skiplist's spinlock to the rte_atomic32_cmpset that the
multi-producer ring code uses to manipulate the head pointer.

Then, I tried multiple single-producer single-consumer rings per target
lcore.  This removed the ring hotspot, but the performance didn't increase
as much as with the proposed solution.  These solutions also add overhead
to rte_timer_manage, as it would have to process the rings and then
process the skiplists.

One other thing to note is that a solution that uses such messages changes
the use models for the timer.  One interesting example is:

- lcore i enqueues a message to install a timer on lcore k
- lcore k runs rte_timer_manage, processes its messages, and adds the
  timer to its list
- lcore i then enqueues a message to stop the same timer, now owned by
  lcore k
- lcore k does not run rte_timer_manage again
- lcore i wants to free the timer, but it might not be safe to do so

Even though lcore i has successfully enqueued the request to stop the
timer (and delete it from lcore k's pending list), the timer hasn't
actually been deleted from the list yet, so freeing it could corrupt the
list.  This case exists in the existing timer stress tests.

Another interesting scenario is:

- lcore i resets a timer to install it on lcore k
- lcore j resets the same timer to install it on lcore k
- then, lcore k runs timer_manage

Lcore j's message obviates lcore i's message, and it would be wasted work
for lcore k to process it, so we should mark it to be skipped over.
Handling all the edge cases was more complex than the proposed solution.
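
To make the per-installer layout a bit more concrete, here is a rough
sketch of the idea from the cover letter.  The type and function names
below are illustrative only and do not match the actual patch; the real
changes are in lib/librte_timer/rte_timer.c.

#include <rte_config.h>
#include <rte_lcore.h>
#include <rte_spinlock.h>
#include <rte_timer.h>

/*
 * Illustrative sketch only -- not the actual patch.  One pending list
 * per possible installing lcore; each list has its own lock, so
 * installers running on different lcores never contend with each other.
 * Locks are assumed to be initialized with rte_spinlock_init() at
 * startup (omitted here).
 */
struct pending_list_sketch {
	rte_spinlock_t lock;
	struct rte_timer *head;   /* skiplist head in the real code */
};

struct priv_timer_sketch {
	struct pending_list_sketch lists[RTE_MAX_LCORE];
};

struct priv_timer_sketch priv_timer_sketch[RTE_MAX_LCORE];

/*
 * lcore i arming a timer on target lcore k touches only list [k][i], so
 * a simultaneous install from lcore j (which uses list [k][j]) is not
 * blocked.
 */
void
install_timer_sketch(unsigned int target_lcore, struct rte_timer *tim)
{
	unsigned int installer = rte_lcore_id();
	struct pending_list_sketch *list =
		&priv_timer_sketch[target_lcore].lists[installer];

	rte_spinlock_lock(&list->lock);
	/*
	 * The real code does an ordered skiplist insert; a plain push via
	 * the sl_next[] link array from rte_timer.h is enough to show the
	 * locking structure here.
	 */
	tim->sl_next[0] = list->head;
	list->head = tim;
	rte_spinlock_unlock(&list->lock);
}

When lcore k runs rte_timer_manage, it would take each list's lock only
long enough to break out the expired timers into a run list, then merge
the per-installer run lists and execute the callbacks in global expiry
order, as described above.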
> >
> > Gabriel Carrillo (3):
> >   timer: add per-installer pending lists for each lcore
> >   timer: handle timers installed from non-EAL threads
> >   doc: update timer lib docs
> >
> >  doc/guides/prog_guide/timer_lib.rst |  19 ++-
> >  lib/librte_timer/rte_timer.c        | 329 +++++++++++++++++++++++-------------
> >  lib/librte_timer/rte_timer.h        |   9 +-
> >  3 files changed, 231 insertions(+), 126 deletions(-)
> >
> > --
> > 2.6.4
> >
>
> Regards,
> Keith