From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
	by dpdk.org (Postfix) with ESMTP id C090DC3B0
	for ; Fri, 19 Feb 2016 07:12:44 +0100 (CET)
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
	by fmsmga103.fm.intel.com with ESMTP; 18 Feb 2016 22:12:43 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.22,469,1449561600"; d="scan'208";a="919087264"
Received: from shwdeisgchi083.ccr.corp.intel.com (HELO [10.239.67.119]) ([10.239.67.119])
	by fmsmga002.fm.intel.com with ESMTP; 18 Feb 2016 22:11:37 -0800
To: Yuanhan Liu , dev@dpdk.org
References: <1450321921-27799-1-git-send-email-yuanhan.liu@linux.intel.com>
	<1454043483-24579-1-git-send-email-yuanhan.liu@linux.intel.com>
	<1454043483-24579-7-git-send-email-yuanhan.liu@linux.intel.com>
From: "Tan, Jianfeng"
Message-ID: <56C6B218.6080501@intel.com>
Date: Fri, 19 Feb 2016 14:11:36 +0800
User-Agent: Mozilla/5.0 (Windows NT 6.3; WOW64; rv:38.0) Gecko/20100101 Thunderbird/38.6.0
MIME-Version: 1.0
In-Reply-To: <1454043483-24579-7-git-send-email-yuanhan.liu@linux.intel.com>
Content-Type: text/plain; charset=windows-1252; format=flowed
Content-Transfer-Encoding: 7bit
Cc: Victor Kaplansky , "Michael S. Tsirkin"
Subject: Re: [dpdk-dev] [PATCH v3 6/8] vhost: handle VHOST_USER_SEND_RARP request
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
X-List-Received-Date: Fri, 19 Feb 2016 06:12:45 -0000

Hi Yuanhan,

On 1/29/2016 12:58 PM, Yuanhan Liu wrote:
> While in former patch we enabled GUEST_ANNOUNCE feature, so that the
> guest OS will broadcast a GARP message after migration to notify the
> switch about the new location of migrated VM, the thing is that
> GUEST_ANNOUNCE is enabled since kernel v3.5 only. For older kernel,
> VHOST_USER_SEND_RARP request comes to rescue.
>
> The payload of this new request is the mac address of the migrated VM,
> with that, we could construct a RARP message, and then broadcast it
> to host interfaces.
>
> That's how this patch works:
>
> - list all interfaces, with the help of SIOCGIFCONF ioctl command
>
> - construct an RARP message and broadcast it
>
> Cc: Thibaut Collet
> Signed-off-by: Yuanhan Liu
> ---
...
> +
> +/*
> + * Broadcast a RARP message to all interfaces, to update
> + * switch's mac table
> + */
> +int
> +user_send_rarp(struct VhostUserMsg *msg)
> +{
> +	uint8_t *mac = (uint8_t *)&msg->payload.u64;
> +	uint8_t rarp[RARP_BUF_SIZE];
> +	struct ifconf ifc = {0, };
> +	struct ifreq *ifr;
> +	int nr = 16;
> +	int fd;
> +	uint32_t i;
> +
> +	RTE_LOG(DEBUG, VHOST_CONFIG,
> +		":: mac: %02x:%02x:%02x:%02x:%02x:%02x\n",
> +		mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);
> +
> +	make_rarp_packet(rarp, mac);
> +
> +	/*
> +	 * Get all interfaces
> +	 */
> +	fd = socket(AF_INET, SOCK_DGRAM, 0);
> +	if (fd < 0) {
> +		perror("failed to create AF_INET socket");
> +		return -1;
> +	}
> +
> +again:
> +	ifc.ifc_len = sizeof(*ifr) * nr;
> +	ifc.ifc_buf = realloc(ifc.ifc_buf, ifc.ifc_len);
> +
> +	if (ioctl(fd, SIOCGIFCONF, &ifc) < 0) {
> +		perror("failed at SIOCGIFCONF");
> +		close(fd);
> +		return -1;
> +	}
> +
> +	if (ifc.ifc_len == (int)sizeof(struct ifreq) * nr) {
> +		/*
> +		 * current ifc_buf is not big enough to hold
> +		 * all interfaces; double it and try again.
> +		 */
> +		nr *= 2;
> +		goto again;
> +	}
> +
> +	ifr = (struct ifreq *)ifc.ifc_buf;
> +	for (i = 0; i < ifc.ifc_len / sizeof(struct ifreq); i++)
> +		send_rarp(ifr[i].ifr_name, rarp);
> +
> +	close(fd);
> +
> +	return 0;
> +}

From how user_send_rarp() is implemented, if I understand it correctly,
it broadcasts the RARP packet to all host interfaces, which I don't think
is appropriate. The RARP packet should be sent only within the VM's own
L2 network.
You should not assume that all interfaces maintained in the kernel are
in the same L2 network. Even worse, this could cause problems with
overlay networking, where two VMs in two different overlay networks can
have the same MAC address.

What I suggest here is to move user_send_rarp() into
rte_vhost_dequeue_burst(), controlled by a flag, so that the RARP packet
is broadcast only in its own L2 network.

Thanks,
Jianfeng
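
P.S. To make the flag idea above concrete, here is a minimal,
self-contained sketch. It is not the patch's code and not the real vhost
library API: every sketch_* name, the struct layout, and the fixed-size
packet array are assumptions made purely for illustration. The control
path only records the MAC and sets a flag; the data path, on its next
dequeue, places one RARP frame at the head of the burst, so the
application forwards it through whatever L2 network that vhost port
already belongs to.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAC_LEN   6
#define SKETCH_RARP_SIZE 42	/* 14-byte Ethernet header + 28-byte ARP body */

/* Hypothetical per-device state; the real vhost device struct differs. */
struct sketch_vhost_dev {
	atomic_bool rarp_pending;	/* set by control path, cleared by data path */
	uint8_t mac[SKETCH_MAC_LEN];	/* MAC carried by VHOST_USER_SEND_RARP */
};

/* Build a broadcast RARP "request reverse" frame for the given MAC. */
static void
sketch_make_rarp(uint8_t *buf, const uint8_t *mac)
{
	assert(buf != NULL && mac != NULL);
	memset(buf, 0, SKETCH_RARP_SIZE);

	memset(buf, 0xff, SKETCH_MAC_LEN);		/* dst: broadcast */
	memcpy(buf + 6, mac, SKETCH_MAC_LEN);		/* src: VM MAC */
	buf[12] = 0x80; buf[13] = 0x35;			/* ethertype: RARP */

	buf[14] = 0x00; buf[15] = 0x01;			/* htype: Ethernet */
	buf[16] = 0x08; buf[17] = 0x00;			/* ptype: IPv4 */
	buf[18] = SKETCH_MAC_LEN;			/* hlen */
	buf[19] = 4;					/* plen */
	buf[20] = 0x00; buf[21] = 0x03;			/* op: request reverse */
	memcpy(buf + 22, mac, SKETCH_MAC_LEN);		/* sender hw addr */
	/* bytes 28..31: sender IP, left as 0 */
	memcpy(buf + 32, mac, SKETCH_MAC_LEN);		/* target hw addr */
	/* bytes 38..41: target IP, left as 0 */
}

/* Control path: called when a VHOST_USER_SEND_RARP request arrives. */
static void
sketch_on_send_rarp(struct sketch_vhost_dev *dev, const uint8_t *mac)
{
	memcpy(dev->mac, mac, SKETCH_MAC_LEN);
	atomic_store(&dev->rarp_pending, true);
}

/*
 * Data path: stand-in for the dequeue function. Returns the number of
 * packets written to pkts[]. Only the RARP injection is shown; a real
 * dequeue would go on to fill the remaining slots with guest packets.
 */
static unsigned
sketch_dequeue_burst(struct sketch_vhost_dev *dev,
		     uint8_t pkts[][SKETCH_RARP_SIZE], unsigned count)
{
	unsigned n = 0;

	/* Consume the flag only if there is room for the frame. */
	if (count > 0 && atomic_exchange(&dev->rarp_pending, false))
		sketch_make_rarp(pkts[n++], dev->mac);

	return n;
}
```

With this shape the frame is emitted exactly once per request, on the
port of the migrated VM, and no host interface enumeration is needed.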