From: Oleg Strikov <oleg.strikov@canonical.com>
To: dev@dpdk.org
Subject: [dpdk-dev] Issues met while running openvswitch/dpdk/virtio inside the VM
Date: Thu, 7 May 2015 19:22:02 +0300
Message-ID: <CAEj-Pb43J8OCk9zf=KnvEGELtY-_5K4AMjhi5whjL2_qWsXsMQ@mail.gmail.com>
Hi DPDK users and developers,
A few weeks ago I came up with the idea to run openvswitch with the dpdk backend
inside a qemu-kvm virtual machine. I don't have enough supported NICs yet, and
my plan was to start experimenting inside the virtualized environment,
achieve a functional state of all the components, and then switch to the real
hardware. An additional useful side-effect of doing things inside the vm is
that issues can be easily reproduced by someone else in a different
environment.
I (fondly) hoped that running openvswitch/dpdk inside the vm would be
simpler than running the same set of components on the real hardware.
Unfortunately I met a bunch of issues on the way. All these issues lie on the
borderline between dpdk and openvswitch, but I think that you might be
interested in my story. Please note that I still don't have
openvswitch/dpdk working inside the vm. I definitely have some progress
though.
Q: Does it sound okay from a functional (not performance) standpoint to run
openvswitch/dpdk inside the vm? Do we want to be able to do this? Does
anyone from the dpdk development team do this?
## Issue 1 ##
Openvswitch requires the backend pmd driver to provide N_CORES tx queues, where
N_CORES is the number of cores available on the machine (openvswitch counts
the number of cpu* entries inside the /sys/devices/system/node/node0/ folder).
To my understanding it doesn't take into account the actual number of cores
used by dpdk and just allocates a tx queue for each available core. You may
refer to this chunk of code for details:
https://github.com/openvswitch/ovs/blob/master/lib/dpif-netdev.c#L1067
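For illustration only, here is a rough C sketch of that core-counting logic
as I read it (the real implementation lives in OVS's ovs-numa code and may
differ in details):

#include <ctype.h>
#include <dirent.h>
#include <string.h>

/* Count cpu* entries under node0, which is how I believe openvswitch
 * derives N_CORES.  My own illustration, not the actual OVS code. */
static int
count_node0_cores(void)
{
    DIR *dir = opendir("/sys/devices/system/node/node0");
    struct dirent *de;
    int n_cores = 0;

    if (dir == NULL)
        return -1;
    while ((de = readdir(dir)) != NULL) {
        /* match entries such as cpu0, cpu12; skip cpufreq, cpuidle, etc. */
        if (strncmp(de->d_name, "cpu", 3) == 0 &&
            isdigit((unsigned char)de->d_name[3]))
            n_cores++;
    }
    closedir(dir);
    return n_cores;
}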
This approach works fine on the real hardware but causes some issues when we
run openvswitch/dpdk inside the virtual machine. I tried both the emulated
e1000 NIC and the virtio NIC and neither of them worked out of the box.
The emulated e1000 NIC doesn't support multiple tx queues at all (see
http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_e1000/em_ethdev.c#n884) and
the virtio NIC doesn't support multiple tx queues by default. To enable
multiple tx queues for the virtio NIC I had to add the following line to the
interface section of my libvirt config: '<driver name="vhost" queues="4"/>'
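As a side note, the mismatch can at least be detected at runtime by comparing
what openvswitch asks for against what the device reports. This is only a
sketch of the kind of check I mean, not something openvswitch does today as
far as I know:

#include <stdio.h>
#include <rte_ethdev.h>

/* Sketch: warn when the requested number of tx queues exceeds what the
 * device (e.g. emulated e1000 or a single-queue virtio NIC) can provide. */
static int
check_txq_support(uint8_t port_id, uint16_t n_txq_wanted)
{
    struct rte_eth_dev_info dev_info;

    rte_eth_dev_info_get(port_id, &dev_info);
    if (n_txq_wanted > dev_info.max_tx_queues) {
        printf("port %u supports only %u tx queues, %u requested\n",
               (unsigned)port_id, (unsigned)dev_info.max_tx_queues,
               (unsigned)n_txq_wanted);
        return -1;
    }
    return 0;
}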
## Issue 2 ##
Openvswitch calls rte_eth_tx_queue_setup() twice for the same
port_id/queue_id. The first call takes place during device initialization (see
the call to dpdk_eth_dev_init() inside netdev_dpdk_init():
https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L522).
The second call takes place when openvswitch tries to add more tx queues to the
device (see the call to dpdk_eth_dev_init() inside netdev_dpdk_set_multiq():
https://github.com/openvswitch/ovs/blob/master/lib/netdev-dpdk.c#L697).
The second call not only initializes new queues but also tries to re-initialize
existing ones.
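For clarity, the call sequence boils down to something like the following
(queue sizes, port_conf and n_txq are made up for illustration; the only
important part is that rte_eth_tx_queue_setup() runs twice for queue 0):

/* first pass: dpdk_eth_dev_init() during netdev_dpdk_init() */
rte_eth_dev_configure(port_id, 1, 1, &port_conf);
rte_eth_tx_queue_setup(port_id, 0, 512, rte_eth_dev_socket_id(port_id), NULL);

/* second pass: netdev_dpdk_set_multiq() adds queues and re-runs the setup,
 * touching queue 0 again -- this is the call that fails on virtio */
rte_eth_dev_configure(port_id, 1, n_txq, &port_conf);
rte_eth_tx_queue_setup(port_id, 0, 512, rte_eth_dev_socket_id(port_id), NULL);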
Unfortunately the virtio driver can't handle the second call to
rte_eth_tx_queue_setup() and returns an error here:
http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_ethdev.c#n316
This happens because a memzone with the name portN_tvqN already exists when
the second call takes place (the memzone was created during the first call).
To deal with this issue I had to manually add an rte_memzone_lookup-based
check for this situation and avoid allocating a new memzone if it
already exists.
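The hack I'm using looks roughly like this (inside virtio_dev_queue_setup(),
around the memzone allocation; the variable names are from my reading of the
driver and may not match exactly):

const struct rte_memzone *mz;

/* reuse the memzone left over from the first rte_eth_tx_queue_setup()
 * call instead of failing because it already exists */
mz = rte_memzone_lookup(vq_name);
if (mz == NULL) {
    mz = rte_memzone_reserve_aligned(vq_name, vq->vq_ring_size,
                                     socket_id, 0, VIRTIO_PCI_VRING_ALIGN);
    if (mz == NULL) {
        rte_free(vq);
        return -ENOMEM;
    }
}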
Q: Is it okay that openvswitch calls rte_eth_tx_queue_setup() twice? Right
now I can't tell whether it's an issue with the virtio pmd driver or
incorrect API usage by openvswitch. Could someone shed some light on this
so I can move forward and maybe propose a fix?
## Issue 3 ##
This issue is also (somehow) related to the fact that openvswitch calls
rte_eth_tx_queue_setup() twice. I fixed the previous issue using the method
described above and initialization now finishes. The whole machinery starts to
work but crashes at the very beginning (maybe while fetching the first packet
from the NIC). The crash happens here:
http://dpdk.org/browse/dpdk/tree/lib/librte_pmd_virtio/virtio_rxtx.c#n588
It takes place because the vq_ring structure contains zeros instead of correct
values:
vq_ring = {num = 0, desc = 0x0, avail = 0x0, used = 0x0}
My understanding is that vq_ring gets initialized after the first call to
rte_eth_tx_queue_setup() and then overwritten by the second call to
rte_eth_tx_queue_setup(), but without an appropriate re-initialization the
second time. I'm trying to fix this issue right now.
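One direction I'm experimenting with (purely a sketch, I haven't verified
it's the right fix) is to make the tx queue setup path notice that the queue
already exists and skip the re-initialization, something like:

/* hypothetical guard at the top of virtio_dev_tx_queue_setup() */
if (dev->data->tx_queues[queue_idx] != NULL) {
    PMD_INIT_LOG(DEBUG, "tx queue %u already set up, reusing it",
                 (unsigned)queue_idx);
    return 0;
}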
Q: Does it sound like a realistic goal to make the virtio driver work in
openvswitch-like scenarios? I'm definitely not an expert in the area of
dpdk and can't estimate the time and resources required. Maybe it's better to
wait until I get proper hardware?
Thanks for helping,
Oleg