From: Declan Doherty
To: Kyle Larose, "users@dpdk.org", "dev@dpdk.org"
Date: Fri, 12 May 2017 15:55:49 +0100
Message-ID: <9aabdec0-81b2-689d-9f2e-838d93c67ccb@intel.com>
Subject: Re: [dpdk-dev] active_backup link bonding and mac address
List-Id: DPDK patches and discussions

On 12/05/2017 3:31 PM, Kyle Larose wrote:
> I'm adding the dev mailing list/link bonding maintainer, because I've done some more investigation and I'm beginning to think something is wrong.
>
>> -----Original Message-----
>> From: Kyle Larose
>> Sent: Thursday, May 11, 2017 4:55 PM
>> To: users@dpdk.org
>> Subject: active_backup link bonding and mac address
>>
>> Hey fellow DPDK users,
>>
>> I have a question about the link bond pmd.
>>
>> I am running 4 X710 interfaces in a link bond pmd for my application. In
>> LACP mode, everything works fine. But, in active_backup mode, if the primary
>> link fails, my application stops working.
>> The reason is that I'm still
>> sending packets with the original MAC address of the link bond pmd, which is
>> that of the original primary slave. However, the new primary is not in
>> promiscuous mode, so traffic coming back with that MAC address drops.
>>
>> What should I be doing here:
>>
>> 1) Should I be listening for the changes in the state of the primary, and
>> updating the MAC address I use to send? (I have it cached for efficiency)
>> 2) Should the driver be placing the interface into promiscuous mode to allow
>> for this, similar to what LACP does?
>> 3) Should the driver be overwriting the MAC on egress, similar to what the
>> tlb driver seems to do (in bond_ethdev_tx_burst_tlb)
>>
>> I'm fine with #1, but it seems to break the goal of having the link bond pmd
>> be transparent to the application.
>>
>
> I checked the mac address of the link bond interface after the failover, and it did not change.
> It still had the MAC address of the first slave that was added. This seems incompatible with
> solution number 1 that I suggested above, which means either it the link bond device should
> update its address, or it should be promiscuous at the slave level.
>
> FWIW, I'm using 16.07. I have reproduced this on testpmd by looking at port state. (with some
> fiddling -- needed to prevent it from starting the slave interfaces, and turn off its default
> promiscuous mode.)
>
> Does anyone have any input on this problem?
>
> Thanks,
>
> Kyle
>

Kyle, sorry I didn't see the post in the users list. I think the issue
is that the new primary is missing the bond MAC address from its valid
MAC list, hence it is dropping the ingress packets after a fail-over
event. Placing all the slave devices into promiscuous mode, as you
suggest in option 2, would probably make the issue go away, but I don't
think it's the correct solution. I think we should just be adding the
bond MAC to each slave device's valid MAC list.
As only one bond slave is active at any time, this won't cause any
issues for the network. It also means that fail-over is transparent to
your application and there is no need for MAC rewrites, which would
invalidate existing ARP table entries at downstream endpoints.
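
To make the proposal concrete, here is a rough sketch of the filtering
behaviour in plain C. It is a toy model only, not the bonding PMD code:
struct toy_port, toy_port_mac_add(), toy_port_accepts() and
toy_bond_sync_mac() are all hypothetical names invented for
illustration, and the four-entry filter table is an arbitrary
assumption. In DPDK itself the per-port call would presumably be
rte_eth_dev_mac_addr_add(), but that mapping is an assumption and the
sketch does not use it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAC_LEN  6
#define MAX_MACS 4 /* arbitrary toy limit, not a real NIC table size */

/* Hypothetical stand-in for a slave port's unicast MAC filter. */
struct toy_port {
    uint8_t macs[MAX_MACS][MAC_LEN]; /* valid destination MACs */
    size_t  n_macs;
    bool    promiscuous;
};

/* Start a port with only its own MAC in the filter, promiscuous off. */
static void toy_port_init(struct toy_port *p, const uint8_t own_mac[MAC_LEN])
{
    assert(p != NULL && own_mac != NULL);
    memset(p, 0, sizeof(*p));
    memcpy(p->macs[0], own_mac, MAC_LEN);
    p->n_macs = 1;
}

/* Add a MAC to the filter. Returns 0 on success (or if the MAC is
 * already present) and -1 if the table is full. */
static int toy_port_mac_add(struct toy_port *p, const uint8_t mac[MAC_LEN])
{
    for (size_t i = 0; i < p->n_macs; i++)
        if (memcmp(p->macs[i], mac, MAC_LEN) == 0)
            return 0;
    if (p->n_macs == MAX_MACS)
        return -1;
    memcpy(p->macs[p->n_macs++], mac, MAC_LEN);
    return 0;
}

/* Would this port accept a unicast frame addressed to dst? */
static bool toy_port_accepts(const struct toy_port *p,
                             const uint8_t dst[MAC_LEN])
{
    if (p->promiscuous)
        return true;
    for (size_t i = 0; i < p->n_macs; i++)
        if (memcmp(p->macs[i], dst, MAC_LEN) == 0)
            return true;
    return false;
}

/* The proposed fix: give every slave the bond MAC as an extra valid
 * address. Returns the number of slaves whose table was full. */
static int toy_bond_sync_mac(struct toy_port *slaves, size_t n_slaves,
                             const uint8_t bond_mac[MAC_LEN])
{
    int failed = 0;
    for (size_t i = 0; i < n_slaves; i++)
        if (toy_port_mac_add(&slaves[i], bond_mac) != 0)
            failed++;
    return failed;
}
```

With two slaves where the bond takes the first slave's MAC, the second
slave rejects frames addressed to the bond MAC until
toy_bond_sync_mac() has been called; afterwards it accepts them while
staying out of promiscuous mode, so the application can keep using the
same source MAC across a fail-over.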