From: Tomasz Duszynski
To: Ferruh Yigit
Cc: Tomasz Duszynski, dev@dpdk.org
Date: Wed, 25 Oct 2017 11:28:46 +0200
Subject: Re: [dpdk-dev] [PATCH 2/2] examples/kni: stop lcores while doing kni ops
Message-ID: <20171025092846.GA4293@tdu>
In-Reply-To: <06faf3fd-ff00-bcab-e462-0ddd2a35728d@intel.com>
References: <1508154348-10988-1-git-send-email-tdu@semihalf.com>
 <1508154348-10988-3-git-send-email-tdu@semihalf.com>
 <06faf3fd-ff00-bcab-e462-0ddd2a35728d@intel.com>

Hi Ferruh,

On Fri, Oct 20, 2017 at 05:42:13PM -0700, Ferruh Yigit wrote:
> On 10/16/2017 4:45 AM, Tomasz Duszynski wrote:
> > Since the transmit and receive functions should not be invoked when
> > the device is stopped, stop lcores during kni ops and restart them
> > after device is started once again.
>
> Hi Tomasz,
>
> Are you observing any error or unexpected behavior because of rx/tx functions? I
> am not sure about the patch, please check below logs, and trying to understand
> scope of the patch, can you please give more details what happens if this patch
> is missing?

Right, calling rx/tx functions after the device was stopped will break
things. For instance, bringing the interface up calls rte_eth_dev_stop(),
which frees port resources. If that happens (and it usually does) while
rx/tx is in progress, the driver breaks, because the resources used by the
library function calls are bogus at that point.
So we end up with SIGSEGV.

According to the DPDK documentation, rx/tx functions cannot be called
before dev_start(). Thus I think the cores that do rx/tx should be stopped
just before rte_eth_dev_stop() is called.

That could be fixed inside the PMD, but it would require locks in the
rx/tx path, which is rather a no-go.

>
> >
> > Signed-off-by: Tomasz Duszynski
> > ---
> >  examples/kni/main.c | 28 ++++++++++++++++++++++++++++
> >  1 file changed, 28 insertions(+)
> >
> > diff --git a/examples/kni/main.c b/examples/kni/main.c
> > index cb48fb5..5c50448 100644
> > --- a/examples/kni/main.c
> > +++ b/examples/kni/main.c
> > @@ -166,6 +166,23 @@ static int kni_change_mtu(uint16_t port_id, unsigned int new_mtu);
> >  static int kni_config_network_interface(uint16_t port_id, uint8_t if_up);
> >
> >  static rte_atomic32_t kni_stop = RTE_ATOMIC32_INIT(0);
> > +static rte_atomic32_t kni_restart = RTE_ATOMIC32_INIT(0);
> > +
> > +static void
> > +kni_stop_lcores(void)
> > +{
> > +	unsigned int i;
> > +
> > +	rte_atomic32_inc(&kni_restart);
> > +	rte_atomic32_inc(&kni_stop);
> > +
> > +	RTE_LCORE_FOREACH(i) {
> > +		if (i == rte_lcore_id())
> > +			continue;
>
> This function called by port Rx core [1], and since the thread can't wait itself
> to finish, specially if nb_kni > 1, the Rx core still can do some work even
> after exit from this function.

Ack.

>
> > +
> > +		rte_eal_wait_lcore(i);
>
> The API documentation says: "To be executed on the MASTER lcore only."
> Not sure what happens when called from slave core, as we did here.

OK.

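For reference, a minimal sketch of parking the rx/tx loops before the port
is stopped without taking a lock per burst. It is not the patch quoted
here: it uses plain C11 atomics and pthreads rather than DPDK lcores,
every name in it (pause_workers(), resume_workers(), port_started,
bad_bursts, ...) is hypothetical, and it assumes the control path runs on
its own thread instead of on one of the rx/tx cores, which sidesteps the
self-wait problem raised above.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NB_WORKERS 2

static atomic_bool pause_req;           /* control path asks workers to park */
static atomic_bool quit_req;            /* ends the worker loops */
static atomic_int parked;               /* workers currently parked */
static atomic_bool port_started = true; /* stand-in for the ethdev state */
static atomic_long bad_bursts;          /* bursts issued on a stopped port */

static void *
worker_loop(void *arg)
{
	(void)arg;
	while (!atomic_load(&quit_req)) {
		if (atomic_load(&pause_req)) {
			/* Acknowledge the request, then spin until released. */
			atomic_fetch_add(&parked, 1);
			while (atomic_load(&pause_req))
				sched_yield();
			atomic_fetch_sub(&parked, 1);
			continue;
		}
		/* Stand-in for an rx/tx burst: only one flag read on the hot path. */
		if (!atomic_load(&port_started))
			atomic_fetch_add(&bad_bursts, 1);
	}
	return NULL;
}

/* Returns once every worker has acknowledged and is parked. */
static void
pause_workers(void)
{
	atomic_store(&pause_req, true);
	while (atomic_load(&parked) != NB_WORKERS)
		sched_yield();
}

/* Releases the workers and waits until all of them have left the park loop. */
static void
resume_workers(void)
{
	atomic_store(&pause_req, false);
	while (atomic_load(&parked) != 0)
		sched_yield();
}

/* Runs `rounds` stop/start cycles; returns bursts seen on a stopped port. */
long
demo_pause_resume(int rounds)
{
	pthread_t tid[NB_WORKERS];
	int i;

	assert(rounds >= 0);
	for (i = 0; i < NB_WORKERS; i++) {
		if (pthread_create(&tid[i], NULL, worker_loop, NULL) != 0) {
			atomic_store(&quit_req, true);
			return -1;
		}
	}
	for (i = 0; i < rounds; i++) {
		pause_workers();
		atomic_store(&port_started, false); /* where rte_eth_dev_stop() would go */
		atomic_store(&port_started, true);  /* where rte_eth_dev_start() would go */
		resume_workers();
	}
	atomic_store(&quit_req, true);
	for (i = 0; i < NB_WORKERS; i++)
		pthread_join(tid[i], NULL);
	return atomic_load(&bad_bursts);
}
```

The port state only changes while every worker is parked, so the burst
path never observes a stopped port. Whether one extra atomic read per
loop iteration is acceptable in a real rx/tx path is a separate question.
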
>
> > +	}
> > +}
> >
> >  /* Print out statistics on packets handled */
> >  static void
> > @@ -712,6 +729,7 @@ kni_change_mtu(uint16_t port_id, unsigned int new_mtu)
> >
> >  	RTE_LOG(INFO, APP, "Change MTU of port %d to %u\n", port_id, new_mtu);
> >
> > +	kni_stop_lcores();
> >  	/* Stop specific port */
> >  	rte_eth_dev_stop(port_id);
> >
> > @@ -755,6 +773,8 @@ kni_config_network_interface(uint16_t port_id, uint8_t if_up)
> >  	RTE_LOG(INFO, APP, "Configure network interface of %d %s\n",
> >  					port_id, if_up ? "up" : "down");
> >
> > +	kni_stop_lcores();
> > +
> >  	if (if_up != 0) { /* Configure network interface up */
> >  		rte_eth_dev_stop(port_id);
> >  		ret = rte_eth_dev_start(port_id);
> > @@ -911,6 +931,7 @@ main(int argc, char** argv)
> >  	}
> >  	check_all_ports_link_status(nb_sys_ports, ports_mask);
> >
> > +restart:
> >  	/* Launch per-lcore function on every lcore */
> >  	rte_eal_mp_remote_launch(main_loop, NULL, CALL_MASTER);
> >  	RTE_LCORE_FOREACH_SLAVE(i) {
> > @@ -918,6 +939,13 @@ main(int argc, char** argv)
> >  			return -1;
> >  	}
> >
> > +	if (rte_atomic32_read(&kni_restart)) {
> > +		rte_atomic32_dec(&kni_stop);
> > +		rte_atomic32_dec(&kni_restart);
>
> kni_stop_lcores() called per port, so it is possible that kni_stop and
> kni_restart increased parallel, many times. But this decrement is per
> application, so they will be decremented sequentially, casing app stop - start
> unnecessarily.

Right.

>
> > +
> > +		goto restart;
>
> This will cause assigning tasks to cores again, and will produce all related
> logs again, if you enable debug logs you will see it [2]. I believe confusing to
> have those logs every time mtu updated etc...

Right.

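On the counter issue above, a small sketch of one way several concurrent
requests could collapse into a single restart: a flag consumed with an
atomic exchange, instead of a counter that is incremented per port and
decremented once per pass. It uses plain C11 atomics, not rte_atomic32_t;
request_restart(), consume_restart() and restarts_for_requests() are
hypothetical names, and this is an illustration rather than a proposed
change to examples/kni.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

static atomic_bool restart_pending; /* set by any port callback */

/* Called from each per-port callback; repeated calls do not accumulate. */
static void
request_restart(void)
{
	atomic_store(&restart_pending, true);
}

/* Returns true once per batch of requests and clears the flag. */
static bool
consume_restart(void)
{
	return atomic_exchange(&restart_pending, false);
}

/* Issues nb_requests requests, then counts how many restarts they trigger. */
int
restarts_for_requests(int nb_requests)
{
	int restarts = 0;
	int i;

	assert(nb_requests >= 0);
	for (i = 0; i < nb_requests; i++)
		request_restart();
	while (consume_restart())
		restarts++;
	return restarts;
}
```

With a counter, three requests issued before the main loop gets to run
would lead to three stop/start passes; with the flag they lead to one.
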
>
> > +	}
> > +
> >  	/* Release resources */
> >  	for (port = 0; port < nb_sys_ports; port++) {
> >  		if (!(ports_mask & (1 << port)))
> > --
> > 2.7.4
> >
>
> [1]
> main_loop
>   kni_ingress
>     rte_kni_handle_request
>       kni_change_mtu
>       ||
>       kni_config_network_interface
>         kni_stop_lcores
>
>
> [2]
> APP: Change MTU of port 0 to 1402
> PMD: ixgbe_set_rx_function(): Vector rx enabled, please make sure RX burst size
> no less than 4 (port=0).
> PMD: ixgbe_dev_link_status_print(): Port 0: Link Down
> PMD: ixgbe_dev_link_status_print(): PCI Address: 0000:08:00.1
> APP: Lcore 1 is reading from port 0
> APP: Lcore 2 is writing to port 0
> APP: Lcore 3 is reading from port 1
> APP: Lcore 4 is writing to port 1
> APP: Lcore 5 has nothing to do
> APP: Lcore 6 has nothing to do
> APP: Lcore 7 has nothing to do
> APP: Lcore 8 has nothing to do
> APP: Lcore 9 has nothing to do
> APP: Lcore 10 has nothing to do
> APP: Lcore 11 has nothing to do
> APP: Lcore 12 has nothing to do
> APP: Lcore 13 has nothing to do
> APP: Lcore 14 has nothing to do
> APP: Lcore 15 has nothing to do
> APP: Lcore 16 has nothing to do
> APP: Lcore 17 has nothing to do
> APP: Lcore 18 has nothing to do
> APP: Lcore 19 has nothing to do
> APP: Lcore 20 has nothing to do
> APP: Lcore 21 has nothing to do
> APP: Lcore 22 has nothing to do
> APP: Lcore 23 has nothing to do
> APP: Lcore 24 has nothing to do
> APP: Lcore 25 has nothing to do
> APP: Lcore 26 has nothing to do
> APP: Lcore 27 has nothing to do
> APP: Lcore 28 has nothing to do
> APP: Lcore 29 has nothing to do
> APP: Lcore 30 has nothing to do
> APP: Lcore 31 has nothing to do
> APP: Lcore 0 has nothing to do

--
- Tomasz Duszyński