From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by dpdk.org (Postfix) with ESMTP id AB7655B40 for ; Fri, 7 Dec 2018 20:21:57 +0100 (CET) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga008.fm.intel.com ([10.253.24.58]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 07 Dec 2018 11:21:56 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,327,1539673200"; d="scan'208";a="107846451" Received: from fmsmsx104.amr.corp.intel.com ([10.18.124.202]) by fmsmga008.fm.intel.com with ESMTP; 07 Dec 2018 11:21:56 -0800 Received: from fmsmsx116.amr.corp.intel.com (10.18.116.20) by fmsmsx104.amr.corp.intel.com (10.18.124.202) with Microsoft SMTP Server (TLS) id 14.3.408.0; Fri, 7 Dec 2018 11:21:56 -0800 Received: from fmsmsx115.amr.corp.intel.com ([169.254.4.121]) by fmsmsx116.amr.corp.intel.com ([169.254.2.70]) with mapi id 14.03.0415.000; Fri, 7 Dec 2018 11:21:56 -0800 From: "Carrillo, Erik G" To: Stephen Hemminger CC: "rsanford@akamai.com" , "jerin.jacob@caviumnetworks.com" , "pbhagavatula@caviumnetworks.com" , "dev@dpdk.org" Thread-Topic: [dpdk-dev] [PATCH v2 1/2] timer: allow timer management in shared memory Thread-Index: AQHUjlgcEqrwTMsZq02FnMLak8q7b6Vzn5gA Date: Fri, 7 Dec 2018 19:21:55 +0000 Message-ID: References: <1543534514-183766-1-git-send-email-erik.g.carrillo@intel.com> <1544205180-31546-1-git-send-email-erik.g.carrillo@intel.com> <1544205180-31546-2-git-send-email-erik.g.carrillo@intel.com> <20181207101011.4e0275b2@xeon-e3> In-Reply-To: <20181207101011.4e0275b2@xeon-e3> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiNTExNDU1ODAtNjU2OS00YzBkLTlkYmEtNjczYTI3ZWVkZjY4IiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE3LjEwLjE4MDQuNDkiLCJUcnVzdGVkTGFiZWxIYXNoIjoiTUJTNDdnVnM1cUNLYzRsK1VyMjhWeXZvRDgzXC8wY0pCcTZhSjhrTFVOcGl4cURaazFpOWdtTFlPdWMzbnlqdU0ifQ== x-ctpclassification: CTP_NT x-originating-ip: [10.1.200.108] Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v2 1/2] timer: allow timer management in shared memory X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 07 Dec 2018 19:21:58 -0000 Hi Stephen, Thanks for the review. Some responses in-line: > -----Original Message----- > From: Stephen Hemminger [mailto:stephen@networkplumber.org] > Sent: Friday, December 7, 2018 12:10 PM > To: Carrillo, Erik G > Cc: rsanford@akamai.com; jerin.jacob@caviumnetworks.com; > pbhagavatula@caviumnetworks.com; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v2 1/2] timer: allow timer management in > shared memory >=20 > On Fri, 7 Dec 2018 11:52:59 -0600 > Erik Gabriel Carrillo wrote: >=20 > > Currently, the timer library uses a per-process table of structures to > > manage skiplists of timers presumably because timers contain arbitrary > > function pointers whose value may not resolve properly in other > > processes. > > > > However, if the same callback is used handle all timers, and that > > callback is only invoked in one process, then it woud be safe to allow > > the data structures to be allocated in shared memory, and to allow > > secondary processes to modify the timer lists. This would let timers > > be used in more multi-process scenarios. > > > > The library's global variables are wrapped with a struct, and an array > > of these structures is created in shared memory. The original APIs > > are updated to reference the zeroth entry in the array. This maintains > > the original behavior for both primary and secondary processes since > > the set intersection of their coremasks should be empty [1]. New APIs > > are introduced to enable the allocation/deallocation of other entries > > in the array. > > > > New variants of the APIs used to start and stop timers are introduced; > > they allow a caller to specify which array entry should be used to > > locate the timer list to insert into or delete from. > > > > Finally, a new variant of rte_timer_manage() is introduced, which > > allows a caller to specify which array entry should be used to locate > > the timer lists to process; it can also process multiple timer lists > > per invocation. > > > > [1] > > https://doc.dpdk.org/guides/prog_guide/multi_proc_support.html#multi- > p > > rocess-limitations > > > > Signed-off-by: Erik Gabriel Carrillo >=20 > Makes sense but it looks to me like an ABI breakage. Experimental isn't g= oing > to work for this. For APIs that existed prior to this patch, I've duplicated them in a "19.02= " node in=20 the map file; I only marked new APIs as experimental. I versioned each AP= I in order to maintain the prior interface as well. I tested ABI compatibility with devtools/validate-abi.sh; it reported no errors detected. So I believ= e this won't break the ABI, but if I need to change something I certainly will. >=20 > > +static uint32_t default_data_id; // id set to zero automatically >=20 > C++ style comments are not allowed per DPDK coding style. > Best to just drop the comment, it is stating the obvious. >=20 Sure - will do. > > -/* Init the timer library. */ > > +static inline int > > +timer_data_valid(uint32_t id) > > +{ > > + return !!(rte_timer_data_arr[id].internal_flags & FL_ALLOCATED); } >=20 > Don't need inline on static functions. > ... >=20 > > +MAP_STATIC_SYMBOL(int rte_timer_manage(void), > > +rte_timer_manage_v1902); BIND_DEFAULT_SYMBOL(rte_timer_manage, > > +_v1902, 19.02); > > + > > +int __rte_experimental > > +rte_timer_alt_manage(uint32_t timer_data_id, > > + unsigned int *poll_lcores, > > + int nb_poll_lcores, > > + rte_timer_alt_manage_cb_t f) > > +{ > > + union rte_timer_status status; > > + struct rte_timer *tim, *next_tim, **pprev; > > + struct rte_timer *run_first_tims[RTE_MAX_LCORE]; > > + unsigned int runlist_lcore_ids[RTE_MAX_LCORE]; > > + unsigned int this_lcore =3D rte_lcore_id(); > > + struct rte_timer *prev[MAX_SKIPLIST_DEPTH + 1]; > > + uint64_t cur_time; > > + int i, j, ret; > > + int nb_runlists =3D 0; > > + struct rte_timer_data *data; > > + struct priv_timer *privp; > > + uint32_t poll_lcore; > > + > > + TIMER_DATA_VALID_GET_OR_ERR_RET(timer_data_id, data, - > EINVAL); > > + > > + /* timer manager only runs on EAL thread with valid lcore_id */ > > + assert(this_lcore < RTE_MAX_LCORE); > > + > > + __TIMER_STAT_ADD(data->priv_timer, manage, 1); > > + > > + if (poll_lcores =3D=3D NULL) { > > + poll_lcores =3D (unsigned int []){rte_lcore_id()}; >=20 >=20 > This isn't going to be safe. It assigns poll_lcores to an array allocated= on the > stack. >=20 poll_lcores is allowed to be NULL when rte_timer_alt_manage() is called fo= r convenience; if it is NULL, then we create an array on the stack=20 containing one item and point poll_lcores at it. poll_lcores only needs to= be valid for the invocation of the function, so pointing to an array on the st= ack seems fine. Did I miss the point? > > + > > + for (i =3D 0, poll_lcore =3D poll_lcores[i]; i < nb_poll_lcores; > > + poll_lcore =3D poll_lcores[++i]) { > > + privp =3D &data->priv_timer[poll_lcore]; > > + > > + /* optimize for the case where per-cpu list is empty */ > > + if (privp->pending_head.sl_next[0] =3D=3D NULL) > > + continue; > > + cur_time =3D rte_get_timer_cycles(); > > + > > +#ifdef RTE_ARCH_64 > > + /* on 64-bit the value cached in the pending_head.expired > will > > + * be updated atomically, so we can consult that for a quick > > + * check here outside the lock > > + */ > > + if (likely(privp->pending_head.expire > cur_time)) > > + continue; > > +#endif >=20 >=20 > This code needs to be optimized so that application can call this at a ve= ry high > rate without performance impact.