From: Olivier MATZ <olivier.matz@6wind.com>
To: "Lu, Wenzhuo", "dev@dpdk.org"
Date: Tue, 17 May 2016 09:50:49 +0200
Message-ID: <573ACD59.3010806@6wind.com>
In-Reply-To: <6A0DE07E22DDAD4C9103DF62FEBC090903468932@shsmsx102.ccr.corp.intel.com>
Subject: Re: [dpdk-dev] [PATCH 3/4] ixgbe: automatic link recovery on VF

Hi Wenzhuo,

On 05/17/2016 03:11 AM, Lu, Wenzhuo wrote:
>> -----Original Message-----
>> From: Olivier Matz [mailto:olivier.matz@6wind.com]
>>
>> If I understand well, ixgbevf_dev_link_up_down_handler() is called by
>> ixgbevf_recv_pkts_fake() on a dataplane core. It means that the core
>> that acquired the lock will loop for at least 100us + 1sec. If this
>> core is also in charge of polling other queues of other ports, or
>> timers, many packets will be dropped (even with a 100us loop). I don't
>> think it is acceptable to actively wait inside a rx function.
>>
>> I think it would avoid many issues to delegate this work to the
>> application, maybe by notifying it that the port is in a bad state
>> and must be restarted. The application could then properly stop
>> polling the queues, and stop and restart the port in a separate
>> thread, without bothering the dataplane cores.
>
> Thanks for the comments.
> Yes, you're right. I had a wrong assumption that every queue is
> handled by one core. But surely that's not right; we cannot tell how
> users will deploy their systems.
>
> I plan to update this patch set. The solution now is: first, let the
> users choose whether they want this auto-reset feature. If so, we
> will provide another set of rx/tx functions which take a lock, so we
> can stop the rx/tx of the bad ports. We will also provide a reset API
> for users. Applications should call this API in their management
> thread or similar, meaning applications must guarantee thread safety
> for the API. So there are 2 things:
> 1. Lock the rx/tx to stop them for users.
> 2. Provide a reset API for users, so every NIC can do its own job and
>    applications need not worry about differences between NICs.
>
> Surely, it's not *automatic* now. The reason is that DPDK doesn't
> guarantee thread safety, so the operations have to be left to the
> applications, which must guarantee thread safety themselves.
>
> And if the users choose not to use the auto-reset feature, we will
> leave this work to the application :)

Yes, I think having 2 modes is a good approach:
- the first mode would let the application know a reset has to be
  performed, without an active loop or lock in the rx/tx functions;
- the second mode would transparently manage the reset in the driver,
  but may lock the core for some time.

By the way, you mention a reset API: why not just use the usual
stop/start functions? I think it would work the same.

Regards,
Olivier