From: "Burakov, Anatoly"
Date: Thu, 13 Dec 2018 12:26:35 +0000
To: Honnappa Nagarahalli, Stephen Hemminger
Cc: "Ananyev, Konstantin", dev@dpdk.org, nd, Dharmik Thakkar, Malvika Gupta, "Gavin Hu (Arm Technology China)"
Subject: Re: [dpdk-dev] [RFC 2/3] tqs: add thread quiescent state library

On 11-Dec-18 6:40 AM, Honnappa Nagarahalli wrote:
>>>>>>> +
>>>>>>> +/* Add a reader thread, running on an lcore, to the list of threads
>>>>>>> + * reporting their quiescent state on a TQS variable.
>>>>>>> + */
>>>>>>> +int __rte_experimental
>>>>>>> +rte_tqs_register_lcore(struct rte_tqs *v, unsigned int lcore_id) {
>>>>>>> +	TQS_RETURN_IF_TRUE((v == NULL || lcore_id >= RTE_TQS_MAX_LCORE),
>>>>>>> +		-EINVAL);
>>>>>>
>>>>>> It is not very good practice to make a function return different
>>>>>> values and behave differently in debug and non-debug mode.
>>>>>> I'd say that for the slow path (functions in .c) it is always good
>>>>>> to check input parameters.
>>>>>> For the fast path (functions in .h) we sometimes skip such checking,
>>>>>> but debug mode can probably use RTE_ASSERT() or so.
>>>>>
>>>>> Makes sense, I will change this in the next version.
>>>>>
>>>>>> lcore_id >= RTE_TQS_MAX_LCORE
>>>>>>
>>>>>> Is this limitation really necessary?
>>>>>
>>>>> I added this limitation because currently a DPDK application cannot
>>>>> take a mask more than 64 bits wide. Otherwise, this should be as big
>>>>> as RTE_MAX_LCORE.
>>>>> I see that in the case of the '-lcores' option, the number of lcores
>>>>> can be more than the number of PEs. In this case, we still need a MAX
>>>>> limit (but it can be bigger than 64).
>>>>>
>>>>>> First, it means that only lcores can use that API (at least the
>>>>>> data-path part); second, even today many machines have more than
>>>>>> 64 cores.
>>>>>> I think you can easily avoid such a limitation if, instead of
>>>>>> requiring lcore_id as an input parameter, you just make it return
>>>>>> the index of the next available entry in w[].
>>>>>> Then tqs_update() can take that index as an input parameter.
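
For concreteness, the allocation-based registration being suggested here
could look roughly like the sketch below. Every name in it (the trimmed-down
structs, the slot array w[], tqs_register_sketch()) is an assumption made
for illustration only; it is not the RFC's actual API or layout.

#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define TQS_SKETCH_MAX_THREADS 128	/* assumed bound, not limited to 64 */

/* Illustrative per-thread slot; field names are assumptions. */
struct tqs_thread_slot {
	bool in_use;		/* claimed by a registered thread */
	uint64_t cnt;		/* last quiescent-state counter reported */
};

struct rte_tqs_sketch {
	unsigned int num_slots;	/* <= TQS_SKETCH_MAX_THREADS */
	struct tqs_thread_slot w[TQS_SKETCH_MAX_THREADS];
};

/* Claim the first free entry in w[] and return its index, so callers no
 * longer need to pass an lcore_id; -ENOSPC when all entries are taken. */
static int
tqs_register_sketch(struct rte_tqs_sketch *v)
{
	unsigned int i;

	if (v == NULL)
		return -EINVAL;

	for (i = 0; i < v->num_slots; i++) {
		bool expected = false;

		if (__atomic_compare_exchange_n(&v->w[i].in_use, &expected,
				true, false, __ATOMIC_ACQ_REL,
				__ATOMIC_RELAXED))
			return (int)i;
	}
	return -ENOSPC;
}

/* The data-path call would then be tqs_update(v, idx) with the returned
 * index instead of an lcore_id. */
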
>>>>> I had thought about a similar approach based on IDs. I was
>>>>> concerned that an ID would be one more thing to manage for the
>>>>> application. But I see the limitations of the current approach now.
>>>>> I will change it to allocation based. This will support even
>>>>> non-EAL pthreads as well.
>>>>
>>>> Yes, with such an approach non-lcore threads will be able to use it also.
>>>>
>>> I realized that rte_tqs_register_lcore/rte_tqs_unregister_lcore need to
>>> be efficient, as they can be called from the worker's packet processing
>>> loop (rte_event_dequeue_burst allows blocking, so the worker thread needs
>>> to call rte_tqs_unregister_lcore before calling rte_event_dequeue_burst
>>> and rte_tqs_register_lcore before starting packet processing). Allocating
>>> the thread ID in these functions will make them more complex.
>>>
>>> I suggest that we change the argument 'lcore_id' to 'thread_id'. The
>>> application could use 'lcore_id' as 'thread_id' if threads are mapped to
>>> physical cores 1:1.
>>>
>>> If the threads are not mapped 1:1 to physical cores, the threads need to
>>> use a thread_id in the range of 0 - RTE_TQS_MAX_THREADS. I do not see
>>> that DPDK has a thread_id concept. For TQS, the thread IDs are global
>>> (i.e. not per TQS variable). I could provide APIs to do the thread ID
>>> allocation, but I think the thread ID allocation should not be part of
>>> this library. Such a thread ID might be useful for other libraries.
>>>
>> Thread id is problematic since Glibc doesn't want to give it out.
>> You have to roll your own function to do gettid().
>> It is not as easy as just that. Plus, what about preemption?
>
> Agree. I looked into this further. The rte_gettid function uses a system
> call (BSD and Linux). I am not clear on the space of the ID returned
> (as well). I do not think it is guaranteed that it will be within the
> narrow range that is required here.
>
> My suggestion would be to add a set of APIs that would allow for allocation
> of thread IDs which are within a given range of 0 to

System-provided thread IDs would probably also be potentially non-unique in
a multiprocess scenario?

--
Thanks,
Anatoly
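
As a rough illustration of the "allocate thread IDs in a fixed 0..N range"
idea discussed above, a minimal allocator could look like the sketch below.
All names and the locking choice are assumptions of this sketch (not DPDK or
RFC API), and a multi-process setup would additionally need the bitmap kept
in shared memory, which this sketch ignores.

#include <errno.h>
#include <pthread.h>
#include <stdint.h>

#define THREAD_ID_MAX 128	/* assumed upper bound for the ID space */

static uint64_t id_bitmap[(THREAD_ID_MAX + 63) / 64];	/* 1 bit per ID */
static pthread_mutex_t id_lock = PTHREAD_MUTEX_INITIALIZER;

/* Hand out the lowest free ID in [0, THREAD_ID_MAX); -ENOSPC if exhausted.
 * Allocation is a slow-path operation here, so a mutex is acceptable. */
static int
thread_id_alloc(void)
{
	int id = -ENOSPC;
	unsigned int i;

	pthread_mutex_lock(&id_lock);
	for (i = 0; i < THREAD_ID_MAX; i++) {
		if (!(id_bitmap[i / 64] & (UINT64_C(1) << (i % 64)))) {
			id_bitmap[i / 64] |= UINT64_C(1) << (i % 64);
			id = (int)i;
			break;
		}
	}
	pthread_mutex_unlock(&id_lock);
	return id;
}

/* Return an ID previously handed out by thread_id_alloc(). */
static void
thread_id_free(unsigned int id)
{
	pthread_mutex_lock(&id_lock);
	id_bitmap[id / 64] &= ~(UINT64_C(1) << (id % 64));
	pthread_mutex_unlock(&id_lock);
}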