From: "Liang, Ma"
Date: Mon, 30 Jul 2018 16:32:56 +0100
To: Jerin Jacob
Cc: "Van Haaren, Harry", "Elo, Matias (Nokia - FI/Espoo)", "dev@dpdk.org"
Subject: Re: [dpdk-dev] eventdev: method for finding out unlink status
Message-ID: <20180730153256.GA5887@sivswdev01.ir.intel.com>
In-Reply-To: <20180730103638.GA26701@jerin>
References: <20180730075408.GA14117@jerin> <80CC5C07-0D73-4F86-9F93-0AB78DEF2BFD@nokia.com> <20180730092921.GA22242@jerin> <20180730103638.GA26701@jerin>
List-Id: DPDK patches and discussions

On 30 Jul 16:06, Jerin Jacob wrote:
> -----Original Message-----
> > Date: Mon, 30 Jul 2018 09:38:01 +0000
> > From: "Van Haaren, Harry"
> > To: Jerin Jacob, "Elo, Matias (Nokia - FI/Espoo)"
> > CC: "dev@dpdk.org"
> > Subject: RE: [dpdk-dev] eventdev: method for finding out unlink status
> >
> > > From: Jerin Jacob [mailto:jerin.jacob@caviumnetworks.com]
> > > Sent: Monday, July 30, 2018 10:29 AM
> > > To: Elo, Matias (Nokia - FI/Espoo)
> > > Cc: dev@dpdk.org; Van Haaren, Harry
> > > Subject: Re: [dpdk-dev] eventdev: method for finding out unlink status
> > >
> > > -----Original Message-----
> > > > Date: Mon, 30 Jul 2018 09:17:47 +0000
> > > > From: "Elo, Matias (Nokia - FI/Espoo)"
> > > > To: Jerin Jacob
> > > > CC: "dev@dpdk.org", "Van Haaren, Harry"
> > > > Subject: Re: [dpdk-dev] eventdev: method for finding out unlink status
> > > > x-mailer: Apple Mail (2.3445.9.1)
> > > >
> > > > >>
> > > > >> In bug report https://bugs.dpdk.org/show_bug.cgi?id=60 we have been discussing
> > > > >> issues related to events ending up in wrong ports after calling
> > > > >> rte_event_port_unlink(). In addition to finding a few bugs we have identified a
> > > > >> need for a new API call (or documentation extension) for an application to be
> > > > >
> > > > > From a HW perspective, a documentation extension should be enough: adding
> > > > > "there may be pre-scheduled events and the application is responsible to process them"
> > > > > on unlink(). Since dequeue() reports which queue an event was dequeued from, the
> > > > > application can always take action based on that (i.e., is the event
> > > > > pre- or post-unlink).
> > > >
> > > > At least in case of SW eventdev the problem is how the application can know that
> > > > it has processed all pre-scheduled events. E.g. dequeue may return nothing but since
> > > > the scheduler is running as a separate process events may still end up in the unlinked
> > > > port asynchronously.
> > >
> > > Can't we do dequeue() in a loop to get all the events from the port? If
> > > dequeue returns zero events then the port is drained. Right?
> >
> > Nope - because the scheduler might not have performed and "Acked" the
> > unlink(), and internally it has *just* scheduled an event, but it wasn't
> > available in the dequeue ring yet.
> >
> > Aka, it's racy behavior - and we need a way to retrieve this "Unlink Ack"
> > from the scheduler (which runs in another thread in event/sw).
>
> OK. Some bits specific to event/sw.

We will address it. BTW: OPDL does not support unlink at runtime, so, if needed, we should suggest that users query the CAP bits first.

> > > > >> able to find out when an unlink() call has finished and no new events are
> > > > >> scheduled anymore to the particular event port. This is required e.g. when doing
> > > > >> clean-up after an application thread stops processing events.
> > > > >
> > > > > If the thread is stopping then it is better to call dev_stop(). At least in HW
> > > > > implementation,
> > > >
> > > > For an application doing dynamic load balancing stopping the whole eventdev is not an
> > > > option.
> > >
> > > OK. Makes sense. Doing unlink() and link() in fastpath is not a
> > > problem.
> >
> > Correct
> >
> > > Changing core assignment to event port is a problem without stop(). I guess your
> > > application in general would be OK with that constraint.
> >
> > I don't think that the eventdev API requires 1:1 Lcore / Port mapping, so really a
> > PMD should be able to handle any thread calling any port.
> >
> > The event/sw PMD allows any thread to call dequeue/enqueue any port,
> > so long as it is not being accessed by another thread.
>
> Yes. True. The eventdev API does not require 1:1 Lcore/Port mapping.
> Just like event/sw requires some bits to clear "Unlink Ack", at least in
> our HW implementation we need to clear some bits when we change the lcore to port
> mapping.
> Currently we are doing it in the stop() call. If there is a real valid use
> case to change lcore to port mapping without stop, we would like to
> propose an API to flush/clear state on Lcore/port mapping change.
> It can be a NOP for event/sw.
>
> > > > > A given event port assigned to a new lcore other than
> > > > > its previous one then we need to do some clean up at port level.
> > > >
> > > > In my case I'm mapping an event port per thread statically (basically thread_id == port_id),
> > > > so this shouldn't be an issue.
> >
> > This is the common case - but I don't think we should demand it.
> > There is a valid scale-down model which just polls *all* ports using
> > a single lcore, instead of unlink() of multiple ports.
> >
> > For this "runtime scale down" use-case the missing information is being
> > able to identify when an unlink is complete. After that (and ensuring the
> > port buffer is empty) the application can be guaranteed that there are no
> > more events going to be sent to that port, and the application can take
> > the worker lcore out of its polling-loop and put it to sleep.
> >
> > As mentioned before, I think an "unlinks_in_progress()" function is perhaps
> > the easiest way to achieve this functionality, as it allows relatively simple
> > tracking of unlinks() using an atomic counter in sw. (Implementation details
> > become complex when we have a separate core running event/sw, separate cores
> > polling, and a control-plane thread calling unlink...)
> >
> > I think the end result we're hoping for is something like pseudo code below,
> > (keep in mind that the event/sw has a service-core thread running it, so no
> > application code there):
> >
> > int worker_poll = 1;
> >
> > worker() {
> >     while(worker_poll) {
> >         // eventdev_dequeue_burst() etc
> >     }
> >     go_to_sleep(1);
> > }
> >
> > control_plane_scale_down() {
> >     unlink(evdev, worker, queue_id);
> >     while(unlinks_in_progress(evdev) > 0)
> >         usleep(100);
> >
> >     /* here we know that the unlink is complete.
> >      * so we can now stop the worker from polling */
> >     worker_poll = 0;
> > }
>
> Makes sense. Instead of rte_event_is_unlink_in_progress(), how about
> adding a callback in rte_event_port_unlink() which will be called on
> unlink completion? It would remove the need for ONE more API.
>
> Anyway it is RC2 now, so we can not accept a new feature. So we will have
> time for a deprecation notice.
>
> > Hope my pseudo-code makes pseudo-sense :)
> >
> > -Harry
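
To make the race discussed above concrete, here is a minimal single-threaded sketch in C. It is only a toy model and not event/sw or DPDK code: `toy_sched`, `toy_unlink()`, `toy_sched_tick()`, `toy_dequeue_burst()` and `toy_scale_down()` are all hypothetical names, and the scheduler thread is replaced by an explicit "tick" call so the ordering is deterministic. The assumption is that the scheduler may still deliver one event under its stale view of the link before it acks the unlink.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a software scheduler feeding one port. All names are
 * hypothetical; nothing here is DPDK or event/sw code. */
struct toy_sched {
    bool linked;              /* scheduler's own view of the queue->port link */
    int  unlinks_in_progress; /* unlink requests not yet acked by the scheduler */
    int  queue_backlog;       /* events still waiting in the queue */
    int  port_ring;           /* events already placed in the port's dequeue ring */
};

/* Control plane: request an unlink. The scheduler acks it on a later tick. */
static void toy_unlink(struct toy_sched *s)
{
    s->unlinks_in_progress++;
}

/* One scheduler iteration: it may still deliver one event under its stale
 * view of the link, and only afterwards acks any pending unlink. */
static void toy_sched_tick(struct toy_sched *s)
{
    if (s->linked && s->queue_backlog > 0) {
        s->queue_backlog--;
        s->port_ring++;
    }
    if (s->unlinks_in_progress > 0) {
        s->linked = false;
        s->unlinks_in_progress = 0;
    }
}

/* Worker: take everything currently in the port ring. */
static int toy_dequeue_burst(struct toy_sched *s)
{
    int n = s->port_ring;
    s->port_ring = 0;
    return n;
}

/* Scale-down as in the pseudo code: unlink, wait for the ack, then drain.
 * Returns the number of events that arrived after the unlink request. */
static int toy_scale_down(struct toy_sched *s)
{
    int late = 0;

    toy_unlink(s);
    while (s->unlinks_in_progress > 0)
        toy_sched_tick(s); /* stands in for usleep() while the scheduler runs */
    assert(!s->linked);

    late += toy_dequeue_burst(s);
    return late;
}
```

In this model, a worker that calls `toy_dequeue_burst()` right after `toy_unlink()` gets zero events, yet the next `toy_sched_tick()` still places one event in the port ring. Waiting for `unlinks_in_progress` to reach zero before the final drain, as `toy_scale_down()` does, is what catches that late event.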