Subject: Re: [dpdk-dev] [PATCH v3 6/7] power: support monitoring multiple Rx queues
From: "Burakov, Anatoly"
To: "Ananyev, Konstantin", "dev@dpdk.org", "Hunt, David"
Cc: "Loftus, Ciara"
Date: Tue, 29 Jun 2021 12:05:08 +0100

On 29-Jun-21 1:07 AM, Ananyev, Konstantin wrote:
>>>> +static uint16_t
>>>> +clb_multiwait(uint16_t port_id, uint16_t qidx,
>>>> +		struct rte_mbuf **pkts __rte_unused, uint16_t nb_rx,
>>>> +		uint16_t max_pkts __rte_unused, void *addr __rte_unused)
>>>> +{
>>>> +	const unsigned int lcore = rte_lcore_id();
>>>> +	const union queue q = {.portid = port_id, .qid = qidx};
>>>> +	const bool empty = nb_rx == 0;
>>>> +	struct pmd_core_cfg *q_conf;
>>>> +
>>>> +	q_conf = &lcore_cfg[lcore];
>>>> +
>>>> +	/* early exit */
>>>> +	if (likely(!empty)) {
>>>> +		q_conf->empty_poll_stats = 0;
>>>> +	} else {
>>>> +		/* do we care about this particular queue? */
>>>> +		if (!queue_is_power_save(q_conf, &q))
>>>> +			return nb_rx;
>>>
>>> I still don't understand the need of 'special' power_save queue here...
>>> Why we can't just have a function:
>>>
>>> get_number_of_queues_whose_sequential_empty_polls_less_then_threshold(struct pmd_core_cfg *lcore_cfg),
>>> and then just:
>>>
>>> /* all queues have at least EMPTYPOLL_MAX sequential empty polls */
>>> if (get_number_of_queues_whose_sequential_empty_polls_less_then_threshold(q_conf) == 0) {
>>> 	/* go into power-save mode here */
>>> }
>>
>> Okay, let's go through this step by step :)
>>
>> Let's suppose we have three queues - q0, q1 and q2. We want to sleep
>> whenever there's no traffic on *all of them*, however we cannot know
>> that until we have checked all of them.
>>
>> So, let's suppose that q0, q1 and q2 were empty all this time, but now
>> some traffic arrived at q2 while we're still checking q0.
>> We see that q0
>> is empty, and all of the queues were empty for the last N polls, so we
>> think we will be safe to sleep at q0 despite the fact that traffic has
>> just arrived at q2.
>> This is not an issue with MONITOR mode because we will be able to see if
>> current Rx ring descriptor is busy or not via the NIC callback, *but
>> this is not possible* with PAUSE and SCALE modes, because they don't
>> have the sneaky lookahead function of MONITOR! So, with PAUSE and SCALE
>> modes, it is possible to end up in a situation where you *think* you
>> don't have any traffic, but you actually do, you just haven't checked
>> the relevant queue yet.
>
> I think such situation is unavoidable.
> Yes, traffic can arrive to *any* queue at *any* time.
> With your example above - user choose q2 as 'special' queue, but
> traffic actually arrives on q0 or q1.
> And yes, if user choose PAUSE or SCALE methods he *can* miss the traffic,
> because as you said for these methods there is no notification mechanisms.
> I think there are just unavoidable limitations with these power-save methods.
>
>> In order to prevent this from happening, we do not sleep on every queue,
>> instead we sleep *once* per loop.
>
> Yes, totally agree we shouldn't sleep on *every* queue.
> We need to go to sleep when there is no traffic on *any* of queues we monitor.
>
>> That is, we check q0, check q1, check
>> q2, and only then we decide whether we want to sleep or not.
>
>> Of course, with such scheme it is still possible to e.g. sleep in q2
>> while there's traffic waiting in q0,
>
> Yes, exactly.
>
>> but worst case is less bad with
>> this scheme, because we'll be doing at worst 1 extra sleep.
>
> Hmm, I think it would be one extra sleep anyway.
>
>> Whereas with what you're suggesting, if we had e.g. 10 queues to poll,
>> and we checked q1 but traffic has just arrived at q0, we'll be sleeping
>> at q1, then we'll be sleeping at q2, then we'll be sleeping at q3, then
>> we'll be sleeping at q4, then we'll be sleeping at q5.... and 9 sleeps
>> later we finally reach q0 and find out after all this time that we
>> shouldn't have slept in the first place.
>
> Ah ok, I think I understand now what you are saying.
> Sure, to avoid such situation, we'll need to maintain extra counters and
> update them properly when we go to sleep.
> I should state it clearly at the beginning.
> It might be easier to explain what I meant by code snippet:
>
> lcore_conf needs 2 counters:
> uint64_t nb_queues_ready_to_sleep;
> uint64_t nb_sleeps;
>
> Plus each queue needs 2 counters:
> uint64_t nb_empty_polls;
> uint64_t nb_sleeps;
>
> Now, at rx_callback():
>
> /* check did sleep happen since previous call,
>    if yes, then reset queue counters */
> if (queue->nb_sleeps != lcore_conf->nb_sleeps) {
> 	queue->nb_sleeps = lcore_conf->nb_sleeps;
> 	queue->nb_empty_polls = 0;
> }
>
> /* packet arrived, reset counters */
> if (nb_rx != 0) {
> 	/* queue is not 'ready_to_sleep' any more */
> 	if (queue->nb_empty_polls > EMPTYPOLL_MAX)
> 		lcore_conf->nb_queues_ready_to_sleep--;
> 	queue->nb_empty_polls = 0;
>
> /* empty poll */
> } else {
> 	/* queue reaches EMPTYPOLL_MAX threshold, mark it as 'ready_to_sleep' */
> 	if (queue->nb_empty_polls == EMPTYPOLL_MAX)
> 		lcore_conf->nb_queues_ready_to_sleep++;
> 	queue->nb_empty_polls++;
> }
>
> /* no traffic on any queue for at least EMPTYPOLL_MAX iterations */
> if (lcore_conf->nb_queues_ready_to_sleep == lcore_conf->n_queues) {
> 	/* update counters and sleep */
> 	lcore_conf->nb_sleeps++;
> 	lcore_conf->nb_queues_ready_to_sleep = 0;
> 	goto_sleep();
> }

OK, this sounds like it is actually doable :)
I'll prototype and see if it works.

>
>> Hopefully you get the point now :)
>>
>> So, the idea here is, for any N queues, sleep only once, not N times.
>>
>>>
>>>> +
>>>> +	/*
>>>> +	 * we can increment unconditionally here because if there were
>>>> +	 * non-empty polls in other queues assigned to this core, we
>>>> +	 * dropped the counter to zero anyway.
>>>> +	 */
>>>> +	q_conf->empty_poll_stats++;
>>>> +	if (unlikely(q_conf->empty_poll_stats > EMPTYPOLL_MAX)) {
>>>> +		struct rte_power_monitor_cond pmc[RTE_MAX_ETHPORTS];
>>>
>>> I think you need here:
>>> struct rte_power_monitor_cond pmc[q_conf->n_queues];
>>
>> I think VLAs are generally agreed upon to be unsafe, so I'm avoiding
>> them here.
>
> Wonder why?
> These days DPDK uses VLA in dozens of places...

Well, if that's the case, I can use it here also :)
Realistically, the n_queues value will be very small, so it shouldn't be
a big issue.

-- 
Thanks,
Anatoly
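
[Editor's sketch] Below is a minimal, self-contained C sketch of the counter-based
scheme Konstantin describes in the snippet above, assuming hypothetical names: the
structs (pmd_queue_cfg, pmd_core_cfg), helpers (queue_poll_update, lcore_can_sleep),
and the EMPTYPOLL_MAX value are illustrative placeholders, not the code that ended up
in the patch.

/*
 * Hypothetical sketch of the per-queue / per-lcore counter scheme.
 * Names and the threshold value are placeholders for illustration.
 */
#include <stdint.h>
#include <stdbool.h>

#define EMPTYPOLL_MAX 512 /* illustrative threshold for consecutive empty polls */

struct pmd_queue_cfg {
	uint64_t nb_empty_polls; /* consecutive empty polls on this queue */
	uint64_t nb_sleeps;      /* lcore sleep count seen at last poll */
};

struct pmd_core_cfg {
	uint64_t nb_queues;                /* queues polled by this lcore */
	uint64_t nb_queues_ready_to_sleep; /* queues past the empty-poll threshold */
	uint64_t nb_sleeps;                /* sleeps taken by this lcore so far */
};

/* called from the Rx callback on every poll of every queue */
static inline void
queue_poll_update(struct pmd_core_cfg *lcore, struct pmd_queue_cfg *q,
		uint16_t nb_rx)
{
	/* a sleep happened since this queue was last polled: resync counters */
	if (q->nb_sleeps != lcore->nb_sleeps) {
		q->nb_sleeps = lcore->nb_sleeps;
		q->nb_empty_polls = 0;
	}

	if (nb_rx != 0) {
		/* traffic arrived: queue is no longer ready to sleep */
		if (q->nb_empty_polls > EMPTYPOLL_MAX)
			lcore->nb_queues_ready_to_sleep--;
		q->nb_empty_polls = 0;
	} else {
		/* crossing the threshold marks the queue as ready to sleep */
		if (q->nb_empty_polls == EMPTYPOLL_MAX)
			lcore->nb_queues_ready_to_sleep++;
		q->nb_empty_polls++;
	}
}

/* returns true when the lcore should enter its power-save state now */
static inline bool
lcore_can_sleep(struct pmd_core_cfg *lcore)
{
	/* sleep only when every queue handled by this lcore is ready */
	if (lcore->nb_queues_ready_to_sleep != lcore->nb_queues)
		return false;

	lcore->nb_sleeps++;
	lcore->nb_queues_ready_to_sleep = 0;
	return true;
}

The resync check at the top of queue_poll_update() is what keeps the bookkeeping O(1)
per poll: after a sleep only the lcore-wide nb_sleeps counter changes, and each queue
lazily zeroes its own nb_empty_polls the next time it is polled, so the sleeping path
never has to walk all queues.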