From: "Lu, Wenzhuo"
To: Olivier MATZ, "dev@dpdk.org"
Date: Tue, 17 May 2016 08:20:00 +0000
Subject: Re: [dpdk-dev] [PATCH 3/4] ixgbe: automatic link recovery on VF
Message-ID: <6A0DE07E22DDAD4C9103DF62FEBC090903468C20@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <573ACD59.3010806@6wind.com>

Hi Olivier,

> -----Original Message-----
> From: Olivier MATZ [mailto:olivier.matz@6wind.com]
> Sent: Tuesday, May 17, 2016 3:51 PM
> To: Lu, Wenzhuo; dev@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH 3/4] ixgbe: automatic link recovery on VF
>
> Hi Wenzhuo,
>
> On 05/17/2016 03:11 AM, Lu, Wenzhuo wrote:
> >> -----Original Message-----
> >> From: Olivier Matz [mailto:olivier.matz@6wind.com]
> >>
> >> If I understand well, ixgbevf_dev_link_up_down_handler() is called by
> >> ixgbevf_recv_pkts_fake() on a dataplane core. It means that the core
> >> that acquired the lock will loop during 100us + 1sec at least.
> >> If this core was also in charge of polling other queues of other
> >> ports, or timers, many packets will be dropped (even with a 100us
> >> loop). I don't think it is acceptable to actively wait inside a rx function.
> >>
> >> I think it would avoid many issues to delegate this work to the
> >> application, maybe by notifying it that the port is in a bad state
> >> and must be restarted. The application could then properly stop
> >> polling the queues, and stop and restart the port in a separate thread,
> >> without bothering the dataplane cores.
> > Thanks for the comments.
> > Yes, you're right. I had a wrong assumption that every queue is handled by one
> > core.
> > But surely it's not right, we cannot tell how the users will deploy their system.
> >
> > I plan to update this patch set. The solution now is, first let the
> > users choose if they want this auto-reset feature. If so, we will
> > apply another series rx/tx functions which have lock. So we can stop the rx/tx
> > of the bad ports.
> > And we also apply a reset API for users.
> > The APPs should call this API in their management thread or so.
> > It means APPs should guarantee the thread safe for the API.
> > You see, there're 2 things,
> > 1, Lock the rx/tx to stop them for users.
> > 2, Apply a resetting API for users, and every NIC can do their own
> > job. APPs need not to worry about the difference between different NICs.
> >
> > Surely, it's not *automatic* now. The reason is DPDK doesn't guarantee
> > the thread safe. So the operations have to be left to the APPs and let them to
> > guarantee the thread safe.
> >
> > And if the users choose not using auto-reset feature, we will leave
> > this work to the APP :)
>
> Yes, I think having 2 modes is a good approach:
>
> - the first mode would let the application know a reset has to
>   be performed, without active loop or lock in the rx/tx funcs.
> - the second mode would transparently manage the reset in the driver,
>   but may lock the core during some time.

For the second mode, at first we wanted to let the driver manage the reset
transparently. But the bad news is that the operations in the driver layer are
not thread safe. If we want the reset to be transparent, we need a whole new
mechanism to guarantee thread safety for the operations in the driver layer.
Obviously, that needs to be discussed and cannot be finished in this release.
So for now we provide a reset API for the APP, and let the APP call this API and
guarantee thread safety for all the operations.
It's not transparent, but it seems to be what we can do at this stage.

> By the way, you talk about a reset API, why not just using the usual stop/start
> functions? I think it would work the same.

For ixgbe/igb, stop/start is enough. But for i40e, some other work has to be
done (for example, the queue resources have to be re-initialized).
So we are thinking about introducing a new API, so that each NIC can do what it
has to do.

> Regards,
> Olivier
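
To make the split discussed in this thread a bit more concrete, here is a rough sketch in plain C with no DPDK dependency. Everything in it is hypothetical and for illustration only: sketch_port, sketch_port_ops, sketch_rx_burst(), sketch_port_reset() and the ixgbe_like_ops/i40e_like_ops tables are made-up names, not the ixgbe/i40e driver code and not an existing DPDK API. The idea is that the rx path only checks a flag and never waits, while the reset runs from a management thread once the application has stopped polling, and each NIC type supplies its own steps.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not DPDK or driver code. */
struct sketch_port;

/* Per-NIC operations, so one reset entry point can serve different NICs. */
struct sketch_port_ops {
    void (*stop)(struct sketch_port *p);
    void (*start)(struct sketch_port *p);
    void (*reinit_queues)(struct sketch_port *p); /* NULL if stop/start is enough */
};

struct sketch_port {
    const struct sketch_port_ops *ops;
    atomic_bool need_reset; /* set when the port enters a bad state */
    bool started;
    int queue_inits;        /* counts queue re-initializations, for illustration */
};

static void sketch_port_init(struct sketch_port *p,
                             const struct sketch_port_ops *ops)
{
    p->ops = ops;
    p->started = true;
    p->queue_inits = 0;
    atomic_init(&p->need_reset, false);
}

/* Stands in for the driver noticing that the link/PF went away. */
static void sketch_mark_bad(struct sketch_port *p)
{
    atomic_store_explicit(&p->need_reset, true, memory_order_release);
}

/* What the application's management thread would check. */
static bool sketch_port_needs_reset(struct sketch_port *p)
{
    return atomic_load_explicit(&p->need_reset, memory_order_acquire);
}

/* rx path: no lock and no active wait; a bad port simply yields no packets.
 * 'burst' stands in for the number of packets a healthy port would return. */
static int sketch_rx_burst(struct sketch_port *p, int burst)
{
    if (sketch_port_needs_reset(p) || !p->started)
        return 0;
    return burst;
}

/* Reset entry point. The caller must have stopped polling this port on the
 * dataplane cores first; thread safety is the application's job here. */
static void sketch_port_reset(struct sketch_port *p)
{
    assert(p != NULL && p->ops != NULL);
    p->ops->stop(p);
    if (p->ops->reinit_queues != NULL)
        p->ops->reinit_queues(p);
    p->ops->start(p);
    atomic_store_explicit(&p->need_reset, false, memory_order_release);
}

static void sketch_stop(struct sketch_port *p)          { p->started = false; }
static void sketch_start(struct sketch_port *p)         { p->started = true; }
static void sketch_reinit_queues(struct sketch_port *p) { p->queue_inits++; }

/* A NIC for which stop/start is enough (the ixgbe/igb case above). */
static const struct sketch_port_ops ixgbe_like_ops = {
    .stop = sketch_stop, .start = sketch_start, .reinit_queues = NULL,
};

/* A NIC that also has to re-initialize its queues (the i40e case above). */
static const struct sketch_port_ops i40e_like_ops = {
    .stop = sketch_stop, .start = sketch_start,
    .reinit_queues = sketch_reinit_queues,
};
```

In such a design the management thread would watch sketch_port_needs_reset(), ask the dataplane cores to stop polling that port, and only then call sketch_port_reset(). How the application is notified, and how it quiesces its cores, is left open here just as it is in the discussion.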