From: Pravin Shelar
To: Oleg Strikov
Cc: "dev@dpdk.org"
Date: Thu, 7 May 2015 18:19:56 -0700
Subject: Re: [dpdk-dev] Issues met while running openvswitch/dpdk/virtio inside the VM
List-Id: patches and discussions about DPDK

On Thu, May 7, 2015 at 9:22 AM, Oleg Strikov wrote:
> Hi DPDK users and developers,
>
> A few weeks ago I came up with the idea to run openvswitch with dpdk backend
> inside a qemu-kvm virtual machine. I don't have enough supported NICs yet and
> my plan was to start experimenting inside the virtualized environment,
> achieve functional state of all the components and then switch to the real
> hardware. An additional useful side-effect of doing things inside the vm is
> that issues can be easily reproduced by someone else in a different
> environment.
>
> I (fondly) hoped that running openvswitch/dpdk inside the vm would be
> simpler than running the same set of components on the real hardware.
> Unfortunately I met a bunch of issues on the way. All these issues lie on a
> borderline between dpdk and openvswitch but I think that you might be
> interested in my story. Please note that I still don't have
> openvswitch/dpdk working inside the vm. I definitely have some progress
> though.
>

Thanks for summarizing all the issues. DPDK testing is done on real
hardware, and we are planning to test it in a VM. This will certainly help
in fixing issues sooner.

> Q: Does it sound okay from functional (not performance) standpoint to run
> openvswitch/dpdk inside the vm? Do we want to be able to do this? Does
> anyone from the dpdk development team do this?
>
> ## Issue 1 ##
>
> Openvswitch requires backend pmd driver to provide N_CORES tx queues where
> N_CORES is the number of cores available on the machine (openvswitch counts
> the number of cpu* entries inside /sys/devices/system/node/node0/ folder).
> To my understanding it doesn't take into account the actual number of cores
> used by dpdk and just allocates tx queue for each available core.
> You may refer to this chunk of code for details:
> https://github.com/openvswitch/ovs/blob/master/lib/dpif-netdev.c#L1067
>

In the case of OVS-DPDK there is no separate DPDK thread, so all polling
cores are managed by OVS and there is no need to account for cores used by
DPDK. You can assign specific cores to OVS to limit the number of cores it
uses.

> This approach works fine on the real hardware but causes some issues when we
> run openvswitch/dpdk inside the virtual machine. I tried both emulated
> e1000 NIC and virtio NIC and neither of them worked out of the box.
> Emulated e1000 NIC doesn't support multiple tx queues at all (see
> http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_e1000/em_ethdev.c#n884) and
> virtio NIC doesn't support multiple tx queues by default. To enable
> multiple tx queues for virtio NIC I had to add the following line to the
> interface section of my libvirt config: ''
>

Good point. We should document this. Can you send a patch to update
README.DPDK?

> ## Issue 2 ##
>
> Openvswitch calls rte_eth_tx_queue_setup() twice for the same
> port_id/queue_id. First call takes place during device initialization (see
> call to dpdk_eth_dev_init() inside netdev_dpdk_init():
> https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L522).
> Second call takes place when openvswitch tries to add more tx queues to the
> device (see call to dpdk_eth_dev_init() inside netdev_dpdk_set_multiq():
> https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L697).
> Second call not only initializes new queues but tries to re-initialize
> existing ones.
>
> Unfortunately virtio driver can't handle second call of
> rte_eth_tx_queue_setup() and returns error here:
> http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_ethdev.c#n316
> This happens because memzone with the name portN_tvqN already exists when
> second call takes place (memzone has been created during the first call).
> To deal with this issue I had to manually add rte_memzone_lookup-based
> check for this situation and avoid allocation of a new memzone if it
> already exists.
>

This sounds like an issue with the virtio driver. I think we need to fix
DPDK upstream for this to work correctly.

> Q: Is it okay that openvswitch calls rte_eth_tx_queue_setup() twice? Right
> now I can't understand if it's the issue with the virtio pmd driver or
> incorrect API usage by openvswitch? Could someone shed some light on this
> so I can move forward and maybe propose a fix.
>
> ## Issue 3 ##
>
> This issue is also (somehow) related to the fact that openvswitch calls
> rte_eth_tx_queue_setup() twice. I fix the previous issue by the method
> described above and initialization finishes. The whole machinery starts to
> work but crashes at the very beginning (while fetching the first packet
> from the NIC maybe). This crash happens here:
> http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_rxtx.c#n588
> It takes place because vq_ring structure contains zeros instead of correct
> values:
> vq_ring = {num = 0, desc = 0x0, avail = 0x0, used = 0x0}
> My understanding is that vq_ring gets initialized after the first call to
> rte_eth_tx_queue_setup(), then overwritten by the second call to
> rte_eth_tx_queue_setup() but without an appropriate initialization for the
> second time. I'm trying to fix this issue right now.
>

This also sounds like a DPDK issue.

> Q: Does it sound like a realistic goal to make virtio driver work in
> openvswitch-like scenarios? I'm definitely not an expert in the area of
> dpdk and can't estimate time and resources required. Maybe it's better to
> wait until I get a proper hardware?
>

It would be nice to make OVS-DPDK work in a VM. As I said, I am also
planning to work on it. Thanks for the heads up.
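
For anyone reproducing Issue 1: a rough way to estimate how many tx queues
OVS will ask a port for is to count the cpu* entries under the node0 sysfs
directory mentioned above. The Python sketch below does only that. It is
not OVS code; the helper name count_node_cpus is hypothetical, and matching
only names of the form cpu<digits> (so that entries such as cpulist and
cpumap are skipped) is an assumption about what "cpu* entries" means.

```python
import os
import re

# Assumption: only entries named cpu<digits> count as cores; other
# sysfs entries such as cpulist or cpumap are ignored.
_CPU_ENTRY = re.compile(r"cpu\d+")


def count_node_cpus(node_dir="/sys/devices/system/node/node0"):
    """Return the number of cpu<N> entries in a NUMA node sysfs directory.

    Returns 0 if the directory does not exist or cannot be read.
    """
    try:
        names = os.listdir(node_dir)
    except OSError:
        return 0
    return sum(1 for name in names if _CPU_ENTRY.fullmatch(name))


if __name__ == "__main__":
    n = count_node_cpus()
    print(f"node0 exposes {n} cpu entries; a port likely needs {n} tx queues")
```

If the number printed inside the VM is larger than the number of tx queues
the emulated NIC offers, port setup can be expected to fail in the way
described under Issue 1.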