These are patches we are using in latest release to make our application work with virtio. Stephen Hemminger (2): virtio: initialize iopl when device is initialized virtio: allow running w/o vlan filtering lib/librte_eal/linuxapp/eal/eal.c | 5 +++++ lib/librte_pmd_virtio/virtio_ethdev.c | 6 ------ 2 files changed, 5 insertions(+), 6 deletions(-) -- 2.1.4
The virtio driver needs to use in/out instructions therefore it must initialize using iopl(2) system call. The problem is that virtio initialization happens very early, and any application that uses daemon() or calls eal_init later in another context will fail. The fix is to move the iopl into rte_eal_init. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- lib/librte_eal/linuxapp/eal/eal.c | 5 +++++ lib/librte_pmd_virtio/virtio_ethdev.c | 5 ----- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c index 16f9e7c..76481f7 100644 --- a/lib/librte_eal/linuxapp/eal/eal.c +++ b/lib/librte_eal/linuxapp/eal/eal.c @@ -764,6 +764,11 @@ rte_eal_init(int argc, char **argv) rte_panic("Cannot init IVSHMEM\n"); #endif +#ifdef RTE_LIBRTE_VIRTIO_PMD + if (rte_eal_iopl_init() != 0) + rte_panic("IOPL call failed - cannot use virtio PMD"); +#endif + if (rte_eal_memory_init() < 0) rte_panic("Cannot init memory\n"); diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index d239083..e2600de 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -1247,11 +1247,6 @@ static int rte_virtio_pmd_init(const char *name __rte_unused, const char *param __rte_unused) { - if (rte_eal_iopl_init() != 0) { - PMD_INIT_LOG(ERR, "IOPL call failed - cannot use virtio PMD"); - return -1; - } - rte_eth_driver_register(&rte_virtio_pmd); return 0; } -- 2.1.4
Vlan filtering is an option, and not a requirement. If host does not support filtering then it can be done in software. Signed-off-by: Stephen Hemminger <stephen@networkplumber.org> --- lib/librte_pmd_virtio/virtio_ethdev.c | 1 - 1 file changed, 1 deletion(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index e2600de..5e43411 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -1288,7 +1288,6 @@ virtio_dev_configure(struct rte_eth_dev *dev) && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) { PMD_DRV_LOG(NOTICE, "vlan filtering not available on this host"); - return -ENOTSUP; } if (pci_dev->driver->drv_flags & RTE_PCI_DRV_INTR_LSC) -- 2.1.4
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Friday, March 6, 2015 8:45 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH 2/2] virtio: allow running w/o vlan filtering
>
> Vlan filtering is an option, and not a requirement.
> If host does not support filtering then it can be done in software.
>
The question is that guest only send command, no real action to do the vlan filter.
So if both host and guest have no real action for vlan filter, who will do it?
Thanks
Changchun
> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen
> Hemminger
> Sent: Friday, March 6, 2015 8:45 AM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH 1/2] virtio: initialize iopl when device is initialized
>
> The virtio driver needs to use in/out instructions therefore it must initialize
> using iopl(2) system call. The problem is that virtio initialization happens very
> early, and any application that uses daemon() or calls eal_init later in another
> context will fail.
>
> The fix is to move the iopl into rte_eal_init.
>
Why need move virtio specific code into rte_eal_init?
thanks
Changchun
On Fri, 6 Mar 2015 03:41:25 +0000
"Ouyang, Changchun" <changchun.ouyang@intel.com> wrote:
>
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen
> > Hemminger
> > Sent: Friday, March 6, 2015 8:45 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH 1/2] virtio: initialize iopl when device is initialized
> >
> > The virtio driver needs to use in/out instructions therefore it must initialize
> > using iopl(2) system call. The problem is that virtio initialization happens very
> > early, and any application that uses daemon() or calls eal_init later in another
> > context will fail.
> >
> > The fix is to move the iopl into rte_eal_init.
> >
>
> Why need move virtio specific code into rte_eal_init?
> thanks
> Changchun
>
The issue is that virtio has no place it can do iopl() and have the IRQ thread
work. It only shows up on real code where application is daemon, not in a toy
demo or test application.
Right now:
gcc start
rte_virtio_pmd_init
iopl
main
daemon
fork
fork
Process is now child of init not original process
rte_eal_init
fork (pthread) for irq thread
irq thread
(no iopl permssion)
program start
rte_pmd_virtio_configure
So the only place where iopl() can be done in the proper context
so that the IRQ (and other helper threads in future) have the correct
permissions is in eal_init.
On Fri, 6 Mar 2015 03:39:47 +0000
"Ouyang, Changchun" <changchun.ouyang@intel.com> wrote:
>
>
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Stephen
> > Hemminger
> > Sent: Friday, March 6, 2015 8:45 AM
> > To: dev@dpdk.org
> > Subject: [dpdk-dev] [PATCH 2/2] virtio: allow running w/o vlan filtering
> >
> > Vlan filtering is an option, and not a requirement.
> > If host does not support filtering then it can be done in software.
> >
>
> The question is that guest only send command, no real action to do the vlan filter.
> So if both host and guest have no real action for vlan filter, who will do it?
>
> Thanks
> Changchun
>
>
The virtio driver has features.
Guest can not send commands to host where feature bit not enabled.
Application can call filter_set and check if filter worked or not.
Our code already had to do MAC and VLAN validation of incoming packets
therefore if hardware can't do vlan match, there is no problem.
I would expect other applications would do the same thing.
Failing during configuration is bad. DPDK API should never force
application to play "guess the working configuration" with the device
driver or do string match on "which device is this anyway"
Hello Stephen,
On Fri, Mar 6, 2015 at 5:20 PM, Stephen Hemminger <
stephen@networkplumber.org> wrote:
> The issue is that virtio has no place it can do iopl() and have the IRQ
> thread
> work. It only shows up on real code where application is daemon, not in a
> toy
> demo or test application.
>
> Right now:
> gcc start
> rte_virtio_pmd_init
> iopl
> main
> daemon
> fork
> fork
> Process is now child of init not original process
>
> rte_eal_init
> fork (pthread) for irq thread
> irq thread
> (no iopl permssion)
> program start
> rte_pmd_virtio_configure
>
>
> So the only place where iopl() can be done in the proper context
> so that the IRQ (and other helper threads in future) have the correct
> permissions is in eal_init.
>
Is eth_virtio_dev_init() not a better place rather than eal ?
I prefer avoiding #ifdef pmd in eal.
--
David Marchand
On Fri, 6 Mar 2015 17:33:58 +0100
David Marchand <david.marchand@6wind.com> wrote:
> Is eth_virtio_dev_init() not a better place rather than eal ?
>
> I prefer avoiding #ifdef pmd in eal.
No virtio_dev_init is called too late, after interrupt handling
threads are spawned.
/* Launch threads, called at application init(). */
int
rte_eal_init(int argc, char **argv)
{
int i, fctret, ret;
...
if (rte_eal_intr_init() < 0)
rte_panic("Cannot init interrupt-handling thread\n");
...
if (rte_eal_dev_init() < 0)
rte_panic("Cannot init pmd devices\n"
On Fri, Mar 6, 2015 at 5:55 PM, Stephen Hemminger <
stephen@networkplumber.org> wrote:
> No virtio_dev_init is called too late, after interrupt handling
> threads are spawned.
>
> /* Launch threads, called at application init(). */
> int
> rte_eal_init(int argc, char **argv)
> {
> int i, fctret, ret;
> ...
> if (rte_eal_intr_init() < 0)
> rte_panic("Cannot init interrupt-handling thread\n");
> ...
>
> if (rte_eal_dev_init() < 0)
> rte_panic("Cannot init pmd devices\n"
>
There is no "io level" loss when going through fork and/or daemon : this
affirmation contradicts the manual and my test programs.
Now, looking at the problem from scratch, the constructor in virtio pmd
does not call iopl().
It registers a driver with an init function called from rte_eal_dev_init().
So iopl() is not called at "_start" time (in both virtio static or shared
object cases), but in rte_eal_dev_init().
In the end, the fix would be to move rte_eal_intr_init() after
rte_eal_dev_init().
Can you test this ?
Thanks.
--
David Marchand
On Fri, 6 Mar 2015 23:04:33 +0100
David Marchand <david.marchand@6wind.com> wrote:
> In the end, the fix would be to move rte_eal_intr_init() after rte_eal_dev_init().
That would work, but was worried it would break other devices.
On Sat, Mar 7, 2015 at 12:43 AM, Stephen Hemminger <
stephen@networkplumber.org> wrote:
> On Fri, 6 Mar 2015 23:04:33 +0100
> David Marchand <david.marchand@6wind.com> wrote:
>
> > In the end, the fix would be to move rte_eal_intr_init() after
> rte_eal_dev_init().
>
>
> That would work, but was worried it would break other devices.
>
I had checked the pmds before proposing.
rte_intr_* api is called only from the pdev (meaning pci) drivers at the
moment.
rte_eal_dev_init() means we are still in the "pmd init" phase for the pdev
drivers which means we are still initialising per-driver things for them
since they have no idea of the devices to handle (the devices are
discovered later with the pci scan).
In this phase, the code tells that they are not registering interrupts (and
I don't see how they could register interrupts without knowing devices).
The only problem would be for future vdev drivers (current drivers do not
use rte_intr_*).
But this problem would occur because the driver/device datamodel is not
consistent.
If we could rework this to have devices instantiated at the same phase,
then the problem won't happen.
So, we can hope we will have reworked the driver/device datamodel in dpdk,
before this problem hits us.
I only had checked the pmds, so I may have missed something else that uses
interrupts (if any) but I think this is worth trying and fixing for 2.0.
Thomas, opinion ?
--
David Marchand
So, a little summary. On Fri, Mar 6, 2015 at 5:20 PM, Stephen Hemminger < stephen@networkplumber.org> wrote: > > > The virtio driver needs to use in/out instructions therefore it must > initialize > > > using iopl(2) system call. The problem is that virtio initialization > happens very > > > early, and any application that uses daemon() or calls eal_init later > in another > > > context will fail. > Part of this is wrong. The manual tells that this should be inherited. I added some fork and daemon to test programs of mine which confirmed the manual seems to be right. The real problem is that iopl is not called at "constructor" time for it to be inherited by irq thread. > The issue is that virtio has no place it can do iopl() and have the IRQ > thread > work. It only shows up on real code where application is daemon, not in a > toy > demo or test application. > Contrary to what you asserted, the problem does happen when using testpmd (if this is the toy you are talking about). So, I would say your real world application is no better than testpmd. I will post the right fixes. -- David Marchand
This patchset fixes a segfault in eal linux when using virtio pmd with link status change interrupts in place because of an incorrect io privilege level. Ouyang, Fixing this has revealed some problems to be fixed (for 2.0.0 at least for the first): - changing link status from qemu monitor triggers this error message (even if link status seems to be correctly set) testpmd> EAL: Error reading interrupts status for fd 0 - log macros in virtio pmd (especially for init) are still under a build option. They should be aligned on other pmds to make it easier to debug, see ixgbe pmd and/or thread: http://dpdk.org/ml/archives/dev/2014-September/005450.html Can you look at these ? Thanks. -- David Marchand David Marchand (2): eal/linux: move plugin load to very start of eal init virtio: change io privilege level as early as possible lib/librte_eal/linuxapp/eal/eal.c | 14 +++---- lib/librte_pmd_virtio/virtio_ethdev.c | 71 ++++++++++++++++++++------------- 2 files changed, 50 insertions(+), 35 deletions(-) -- 1.7.10.4
Loading shared libraries should be done at the very start of eal init so that the code statically built in dpdk and the code loaded from shared objects is handled (almost) the same way wrt to call to rte_eal_init(). The only thing that must be done before is filling the solib_list which is done by eal_parse_args(). Signed-off-by: David Marchand <david.marchand@6wind.com> --- lib/librte_eal/linuxapp/eal/eal.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c index 16f9e7c..c1c103d 100644 --- a/lib/librte_eal/linuxapp/eal/eal.c +++ b/lib/librte_eal/linuxapp/eal/eal.c @@ -725,6 +725,13 @@ rte_eal_init(int argc, char **argv) if (fctret < 0) exit(1); + TAILQ_FOREACH(solib, &solib_list, next) { + RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name); + solib->lib_handle = dlopen(solib->name, RTLD_NOW); + if (solib->lib_handle == NULL) + RTE_LOG(WARNING, EAL, "%s\n", dlerror()); + } + /* set log level as early as possible */ rte_set_log_level(internal_config.log_level); @@ -797,13 +804,6 @@ rte_eal_init(int argc, char **argv) rte_eal_mcfg_complete(); - TAILQ_FOREACH(solib, &solib_list, next) { - RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name); - solib->lib_handle = dlopen(solib->name, RTLD_NOW); - if (solib->lib_handle == NULL) - RTE_LOG(WARNING, EAL, "%s\n", dlerror()); - } - eal_thread_init_master(rte_config.master_lcore); ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN); -- 1.7.10.4
Playing with virtio link status triggers a segfault because of an incorrect io privilege level. To reproduce the problem, virtio device must be bound to igb_uio to have lsc enabled. $ lspci |grep Ethernet 00:03.0 Ethernet controller: Red Hat, Inc Virtio network device $ modprobe uio $ insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko $ echo 0000:00:03.0 > /sys/bus/pci/devices/0000\:00\:03.0/driver/unbind $ echo 1af4 1000 > /sys/bus/pci/drivers/igb_uio/new_id $ ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n 3 -w 0000:00:03.0 -- -i --txqflags=0xf01 [snip] EAL: PCI device 0000:00:03.0 on NUMA socket -1 EAL: probe driver: 1af4:1000 rte_virtio_pmd Interactive-mode selected Configuring Port 0 (socket 0) Port 0: DE:AD:DE:01:02:03 Checking link statuses... Port 0 Link Up - speed 10000 Mbps - full-duplex Done testpmd> Then, from qemu monitor: (qemu) set_link virtio-net-pci.0 off testpmd> Segmentation fault A call to rte_eal_iopl_init() in a specific constructor has been added so that, on Linux, iopl() is called asap: - for statically linked virtio pmd, the constructor will be called at binary start, - for shared virtio pmd (loaded using -d eal option), it will be called at dlopen() in rte_eal_init(). On linux, iopl() effects are inherited by children threads, so calling it in a constructor will ensure that any thread / process created after rte_eal_init() inherits the io privilege level needed by the pmd. We keep call to rte_eal_iopl_init() in the pmd init macro so that the pmd will not register if the io privilege level is insufficient. Besides, I moved rte_virtio_pmd_init() and related structure near its calling site, so that all init code can be looked at in a glance. The only change here is a new comment. Signed-off-by: David Marchand <david.marchand@6wind.com> --- lib/librte_pmd_virtio/virtio_ethdev.c | 71 ++++++++++++++++++++------------- 1 file changed, 43 insertions(+), 28 deletions(-) diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c index d239083..29332fa 100644 --- a/lib/librte_pmd_virtio/virtio_ethdev.c +++ b/lib/librte_pmd_virtio/virtio_ethdev.c @@ -1228,34 +1228,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv, return 0; } -static struct eth_driver rte_virtio_pmd = { - { - .name = "rte_virtio_pmd", - .id_table = pci_id_virtio_map, - }, - .eth_dev_init = eth_virtio_dev_init, - .dev_private_size = sizeof(struct virtio_hw), -}; - -/* - * Driver initialization routine. - * Invoked once at EAL init time. - * Register itself as the [Poll Mode] Driver of PCI virtio devices. - * Returns 0 on success. - */ -static int -rte_virtio_pmd_init(const char *name __rte_unused, - const char *param __rte_unused) -{ - if (rte_eal_iopl_init() != 0) { - PMD_INIT_LOG(ERR, "IOPL call failed - cannot use virtio PMD"); - return -1; - } - - rte_eth_driver_register(&rte_virtio_pmd); - return 0; -} - /* * Only 1 queue is supported, no queue release related operation */ @@ -1484,6 +1456,49 @@ __rte_unused uint8_t is_rx) return 0; } +/* + * Early init function, this is needed so that iopl() on linux is done before + * rte_eal_init (since it creates other threads that must inherit from the one + * that calls rte_eal_init). + */ +static void __attribute__((constructor, used)) +rte_virtio_early_init(void) +{ + /* return value is checked at pmd init time, no need to check here, see + * below. */ + rte_eal_iopl_init(); +} + +static struct eth_driver rte_virtio_pmd = { + { + .name = "rte_virtio_pmd", + .id_table = pci_id_virtio_map, + }, + .eth_dev_init = eth_virtio_dev_init, + .dev_private_size = sizeof(struct virtio_hw), +}; + +/* + * Driver initialization routine. + * Invoked once at EAL init time. + * Register itself as the [Poll Mode] Driver of PCI virtio devices. + * Returns 0 on success. + */ +static int +rte_virtio_pmd_init(const char *name __rte_unused, + const char *param __rte_unused) +{ + /* rte_eal_iopl_init() is already called by rte_virtio_early_init(). + * Call it again here, to be sure we did get the io level we wanted. */ + if (rte_eal_iopl_init() != 0) { + PMD_INIT_LOG(ERR, "IOPL call failed - cannot use virtio PMD"); + return -1; + } + + rte_eth_driver_register(&rte_virtio_pmd); + return 0; +} + static struct rte_driver rte_virtio_driver = { .type = PMD_PDEV, .init = rte_virtio_pmd_init, -- 1.7.10.4
On Mon, Mar 09, 2015 at 03:56:38PM +0100, David Marchand wrote:
> Loading shared libraries should be done at the very start of eal init so that
> the code statically built in dpdk and the code loaded from shared objects is
> handled (almost) the same way wrt to call to rte_eal_init().
> The only thing that must be done before is filling the solib_list which is done
> by eal_parse_args().
>
> Signed-off-by: David Marchand <david.marchand@6wind.com>
> ---
> lib/librte_eal/linuxapp/eal/eal.c | 14 +++++++-------
> 1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c
> index 16f9e7c..c1c103d 100644
> --- a/lib/librte_eal/linuxapp/eal/eal.c
> +++ b/lib/librte_eal/linuxapp/eal/eal.c
> @@ -725,6 +725,13 @@ rte_eal_init(int argc, char **argv)
> if (fctret < 0)
> exit(1);
>
> + TAILQ_FOREACH(solib, &solib_list, next) {
> + RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
> + solib->lib_handle = dlopen(solib->name, RTLD_NOW);
> + if (solib->lib_handle == NULL)
> + RTE_LOG(WARNING, EAL, "%s\n", dlerror());
> + }
> +
> /* set log level as early as possible */
> rte_set_log_level(internal_config.log_level);
>
> @@ -797,13 +804,6 @@ rte_eal_init(int argc, char **argv)
>
> rte_eal_mcfg_complete();
>
> - TAILQ_FOREACH(solib, &solib_list, next) {
> - RTE_LOG(INFO, EAL, "open shared lib %s\n", solib->name);
> - solib->lib_handle = dlopen(solib->name, RTLD_NOW);
> - if (solib->lib_handle == NULL)
> - RTE_LOG(WARNING, EAL, "%s\n", dlerror());
> - }
> -
> eal_thread_init_master(rte_config.master_lcore);
>
> ret = eal_thread_dump_affinity(cpuset, RTE_CPU_AFFINITY_STR_LEN);
> --
> 1.7.10.4
>
>
I don't see anything explicitly wrong with this, but at the same time it doesn't
seem to fix anything. Is there a particular bug that you're fixing in relation
to your cover letter here? Or is there some expectation that PMD's loaded in
this fashion expect the dpdk to be completely uninitalized? That would seem
like a strange operational requirement to me.
Neil
On Tue, Mar 10, 2015 at 10:08:24AM +0100, David Marchand wrote: > Hello Neil, > > On Mon, Mar 9, 2015 at 4:21 PM, Neil Horman <nhorman@tuxdriver.com> wrote: > > > On Mon, Mar 09, 2015 at 03:56:38PM +0100, David Marchand wrote: > > > Loading shared libraries should be done at the very start of eal init so > > that > > > the code statically built in dpdk and the code loaded from shared > > objects is > > > handled (almost) the same way wrt to call to rte_eal_init(). > > > The only thing that must be done before is filling the solib_list which > > is done > > > by eal_parse_args(). > > > > > > > > > I don't see anything explicitly wrong with this, but at the same time it > > doesn't > > seem to fix anything. Is there a particular bug that you're fixing in > > relation > > to your cover letter here? Or is there some expectation that PMD's loaded > > in > > this fashion expect the dpdk to be completely uninitalized? That would > > seem > > like a strange operational requirement to me. > > > > Well, at first, I wanted to fix the virtio pmd init issue (iopl() not > called at the right place wrt to other pthreads created in rte_eal_init()). Ah, this is what you were addressing: http://dpdk.org/ml/archives/dev/2015-March/014765.html > With next patch, this issue is fixed for statically builtin virtio pmd, but > for virtio pmd as a shared object, the dlopen comes too late. > So, yes, I moved the dlopen() for this reason. > But this doesn't do anything to help you. The goal, according to the above thread, is to initalize the pmd earlier so that you can call iopl prior to doing any forks (so that io privlidges are inherited). But both static and dynamic pmd have constructors that just register their driver structures. No initalization happens until rte_eal_dev_init is called. So this movement does nothing to change the time any given drivers init routine is called. > From a more general point of view, since we support both static and dso > pmds, I would say that this is more logical to have dlopen comes very > early, since static code is "loaded" even earlier : if the current pmds > needed more than just register to the driver list, they would already have > triggered segfaults and/or bugs. > No, not really. I suppose it doesn't hurt anything, but moving this earlier in a function doesn't really buy you anything, as statically allocate pmds are called by the gcc start code prior to an applications main routine running, so we're never actually going to get close to parity there, nor do we need to, because the actual init happens at rte_eal_dev_init, which is in parity for both static and dynamic drivers. > > I know this change comes really late for 2.0. > I am open to other ideas but I don't want to see more #ifdef <some feature> > in eal.c (especially for a pmd), this is a non sense. > > I would say that at least the patch 2 is needed for 2.0 : it fixes the > static case, but without patch 1 virtio pmd triggers a segfault on > interrupt receipt when built as a dso. > The static case suffers from problems as well I think, in that its possible to architect multiple processes that are not started from fork that use the same pmd, which would create the same issue. I think a better course of action would be to document the need for an application to call iopl before rte_eal_init. Neil > > -- > David Marchand
On Mon, Mar 09, 2015 at 03:56:39PM +0100, David Marchand wrote:
> Playing with virtio link status triggers a segfault because of an incorrect io
> privilege level.
>
> To reproduce the problem, virtio device must be bound to igb_uio to have lsc
> enabled.
>
> $ lspci |grep Ethernet
> 00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
> $ modprobe uio
> $ insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
> $ echo 0000:00:03.0 > /sys/bus/pci/devices/0000\:00\:03.0/driver/unbind
> $ echo 1af4 1000 > /sys/bus/pci/drivers/igb_uio/new_id
> $ ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n 3 -w 0000:00:03.0 -- -i --txqflags=0xf01
> [snip]
> EAL: PCI device 0000:00:03.0 on NUMA socket -1
> EAL: probe driver: 1af4:1000 rte_virtio_pmd
> Interactive-mode selected
> Configuring Port 0 (socket 0)
> Port 0: DE:AD:DE:01:02:03
> Checking link statuses...
> Port 0 Link Up - speed 10000 Mbps - full-duplex
> Done
> testpmd>
>
> Then, from qemu monitor:
> (qemu) set_link virtio-net-pci.0 off
>
> testpmd> Segmentation fault
>
> A call to rte_eal_iopl_init() in a specific constructor has been added so that,
> on Linux, iopl() is called asap:
> - for statically linked virtio pmd, the constructor will be called at binary
> start,
> - for shared virtio pmd (loaded using -d eal option), it will be called at
> dlopen() in rte_eal_init().
>
> On linux, iopl() effects are inherited by children threads, so calling it in a
> constructor will ensure that any thread / process created after rte_eal_init()
> inherits the io privilege level needed by the pmd.
>
> We keep call to rte_eal_iopl_init() in the pmd init macro so that the pmd will
> not register if the io privilege level is insufficient.
>
> Besides, I moved rte_virtio_pmd_init() and related structure near its calling
> site, so that all init code can be looked at in a glance.
> The only change here is a new comment.
>
> Signed-off-by: David Marchand <david.marchand@6wind.com>
> ---
> lib/librte_pmd_virtio/virtio_ethdev.c | 71 ++++++++++++++++++++-------------
> 1 file changed, 43 insertions(+), 28 deletions(-)
>
> diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
> index d239083..29332fa 100644
> --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> @@ -1228,34 +1228,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
> return 0;
> }
>
> -static struct eth_driver rte_virtio_pmd = {
> - {
> - .name = "rte_virtio_pmd",
> - .id_table = pci_id_virtio_map,
> - },
> - .eth_dev_init = eth_virtio_dev_init,
> - .dev_private_size = sizeof(struct virtio_hw),
> -};
> -
> -/*
> - * Driver initialization routine.
> - * Invoked once at EAL init time.
> - * Register itself as the [Poll Mode] Driver of PCI virtio devices.
> - * Returns 0 on success.
> - */
> -static int
> -rte_virtio_pmd_init(const char *name __rte_unused,
> - const char *param __rte_unused)
> -{
> - if (rte_eal_iopl_init() != 0) {
> - PMD_INIT_LOG(ERR, "IOPL call failed - cannot use virtio PMD");
> - return -1;
> - }
> -
> - rte_eth_driver_register(&rte_virtio_pmd);
> - return 0;
> -}
> -
> /*
> * Only 1 queue is supported, no queue release related operation
> */
> @@ -1484,6 +1456,49 @@ __rte_unused uint8_t is_rx)
> return 0;
> }
>
> +/*
> + * Early init function, this is needed so that iopl() on linux is done before
> + * rte_eal_init (since it creates other threads that must inherit from the one
> + * that calls rte_eal_init).
> + */
> +static void __attribute__((constructor, used))
> +rte_virtio_early_init(void)
> +{
> + /* return value is checked at pmd init time, no need to check here, see
> + * below. */
> + rte_eal_iopl_init();
> +}
> +
I don't see how this works for all cases. The constructor is called once when
the library is first loaded. What if you have multiple independent (i.e. not
forked children) processes that are using the dpdk in parallel? Only the
process that triggered the library load will have io permissions set
appropriately. I think what you need is to have every application that expects
to call through the transmit path or poll the receive path call iopl, which I
think speaks to having this requirement documented, so each application can call
iopl prior to calling fork/daemonize/etc.
Neil
Back on this old patch, it seems justified but nobody agreed.
--- a/lib/librte_pmd_virtio/virtio_ethdev.c
+++ b/lib/librte_pmd_virtio/virtio_ethdev.c
@@ -1288,7 +1288,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
&& !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) {
PMD_DRV_LOG(NOTICE,
"vlan filtering not available on this host");
- return -ENOTSUP;
}
2015-03-06 08:24, Stephen Hemminger:
> "Ouyang, Changchun" <changchun.ouyang@intel.com> wrote:
> > > From: Stephen Hemminger
> > > Vlan filtering is an option, and not a requirement.
> > > If host does not support filtering then it can be done in software.
> >
> > The question is that guest only send command, no real action to do the vlan filter.
> > So if both host and guest have no real action for vlan filter, who will do it?
>
> The virtio driver has features.
> Guest can not send commands to host where feature bit not enabled.
> Application can call filter_set and check if filter worked or not.
>
> Our code already had to do MAC and VLAN validation of incoming packets
> therefore if hardware can't do vlan match, there is no problem.
> I would expect other applications would do the same thing.
>
> Failing during configuration is bad. DPDK API should never force
> application to play "guess the working configuration" with the device
> driver or do string match on "which device is this anyway"
2015-03-06 08:20, Stephen Hemminger: > The issue is that virtio has no place it can do iopl() and have the IRQ thread > work. It only shows up on real code where application is daemon, not in a toy > demo or test application. > > Right now: > gcc start > rte_virtio_pmd_init > iopl > main > daemon > fork > fork > Process is now child of init not original process > > rte_eal_init > fork (pthread) for irq thread > irq thread > (no iopl permssion) > program start > rte_pmd_virtio_configure This call tree is wrong. Below is the right (and more complete) one: gcc start driver constructor rte_eal_driver_register (if .a) main rte_eal_init eal_parse_args rte_eal_pci_init rte_eal_memory_init rte_eal_intr_init pthread_create eal_intr_thread_main eal_intr_handle_interrupts eal_intr_process_interrupts virtio_interrupt_handler dlopen (if .so) driver constructor rte_eal_driver_register rte_eal_dev_init driver->init rte_virtio_pmd_init rte_eal_iopl_init rte_eth_driver_register pthread_create rte_eal_pci_probe driver->devinit rte_eth_dev_init rte_eth_dev_allocate eth_drv->eth_dev_init eth_virtio_dev_init virtio_resource_init > So the only place where iopl() can be done in the proper context > so that the IRQ (and other helper threads in future) have the correct > permissions is in eal_init. No there are 2 other possible solutions: 1) Move rte_eal_intr_init() after rte_eal_dev_init(). 2) Move dlopen() before rte_eal_intr_init() and call iopl in the constructor. With the second solution, we must also keep an iopl call in rte_virtio_pmd_init() to return an error if iopl fails. The second solution was proposed in the series sent by David: http://dpdk.org/ml/archives/dev/2015-March/014931.html http://dpdk.org/ml/archives/dev/2015-March/014932.html
I have comments for that. Pls see below. > -----Original Message----- > From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com] > Sent: Wednesday, July 29, 2015 8:57 PM > To: Ouyang, Changchun > Cc: dev@dpdk.org; Stephen Hemminger > Subject: Re: [dpdk-dev] [PATCH 2/2] virtio: allow running w/o vlan filtering > > Back on this old patch, it seems justified but nobody agreed. > > --- a/lib/librte_pmd_virtio/virtio_ethdev.c > +++ b/lib/librte_pmd_virtio/virtio_ethdev.c > @@ -1288,7 +1288,6 @@ virtio_dev_configure(struct rte_eth_dev *dev) > && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) { > PMD_DRV_LOG(NOTICE, > "vlan filtering not available on this host"); > - return -ENOTSUP; > } > > 2015-03-06 08:24, Stephen Hemminger: > > "Ouyang, Changchun" <changchun.ouyang@intel.com> wrote: > > > > From: Stephen Hemminger > > > > Vlan filtering is an option, and not a requirement. > > > > If host does not support filtering then it can be done in software. Yes, vlan filter is an option, but currently virtio driver has no software solution for vlan filter. So I would like to disable hw_vlan_filter in rxmode if the dev can't really support it rather than removing the return there. > > > > > > The question is that guest only send command, no real action to do the > vlan filter. > > > So if both host and guest have no real action for vlan filter, who will do it? > > > > The virtio driver has features. > > Guest can not send commands to host where feature bit not enabled. > > Application can call filter_set and check if filter worked or not. > > > > Our code already had to do MAC and VLAN validation of incoming packets There is vlan strip, but have no vlan filter in the rx function. > > therefore if hardware can't do vlan match, there is no problem. > > I would expect other applications would do the same thing. > > > > Failing during configuration is bad. DPDK API should never force > > application to play "guess the working configuration" with the device > > driver or do string match on "which device is this anyway"
Thomas, Changchun, On 29/07/2015 14:56, Thomas Monjalon wrote: > Back on this old patch, it seems justified but nobody agreed. > > --- a/lib/librte_pmd_virtio/virtio_ethdev.c > +++ b/lib/librte_pmd_virtio/virtio_ethdev.c > @@ -1288,7 +1288,6 @@ virtio_dev_configure(struct rte_eth_dev *dev) > && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) { > PMD_DRV_LOG(NOTICE, > "vlan filtering not available on this host"); > - return -ENOTSUP; > } > > 2015-03-06 08:24, Stephen Hemminger: >> "Ouyang, Changchun" <changchun.ouyang@intel.com> wrote: >>>> From: Stephen Hemminger >>>> Vlan filtering is an option, and not a requirement. >>>> If host does not support filtering then it can be done in software. +1 with Stephen, remove return -ENOTSUP; applications must not fail, software stacks will handle it. We did experiment some issues when testpmd was failing while it was supposed to run. A notice would be good enough. >>> >>> The question is that guest only send command, no real action to do the vlan filter. >>> So if both host and guest have no real action for vlan filter, who will do it? >> >> The virtio driver has features. >> Guest can not send commands to host where feature bit not enabled. >> Application can call filter_set and check if filter worked or not. >> >> Our code already had to do MAC and VLAN validation of incoming packets >> therefore if hardware can't do vlan match, there is no problem. >> I would expect other applications would do the same thing. >> >> Failing during configuration is bad. DPDK API should never force >> application to play "guess the working configuration" with the device >> driver or do string match on "which device is this anyway" Agree, it is not a failure of a configuration, it is a failure of negotiation of virtio's capabilities. Let's use another example: we do not expect a guest kernel to panic() because it is not properly negotiated? So why should a DPDK application fail and return -ENOTSUP? Thank you, Vincent
Hi Vincent, > -----Original Message----- > From: Vincent JARDIN [mailto:vincent.jardin@6wind.com] > Sent: Tuesday, August 4, 2015 8:52 PM > To: Thomas Monjalon; Ouyang, Changchun > Cc: dev@dpdk.org; Stephen Hemminger > Subject: Re: [dpdk-dev] [PATCH 2/2] virtio: allow running w/o vlan filtering > > Thomas, Changchun, > > On 29/07/2015 14:56, Thomas Monjalon wrote: > > Back on this old patch, it seems justified but nobody agreed. > > > > --- a/lib/librte_pmd_virtio/virtio_ethdev.c > > +++ b/lib/librte_pmd_virtio/virtio_ethdev.c > > @@ -1288,7 +1288,6 @@ virtio_dev_configure(struct rte_eth_dev *dev) > > && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) { > > PMD_DRV_LOG(NOTICE, > > "vlan filtering not available on this host"); > > - return -ENOTSUP; > > } > > > > 2015-03-06 08:24, Stephen Hemminger: > >> "Ouyang, Changchun" <changchun.ouyang@intel.com> wrote: > >>>> From: Stephen Hemminger > >>>> Vlan filtering is an option, and not a requirement. > >>>> If host does not support filtering then it can be done in software. > > +1 with Stephen, remove return -ENOTSUP; > > applications must not fail, software stacks will handle it. We did experiment Do you mean handling it in software stack outside the virtio pmd? AFAIK, inside virtio PMD, we have no codes to handle it currently. > some issues when testpmd was failing while it was supposed to run. A notice > would be good enough. > Use '--disable-hw-vlan-filter' in testpmd command line will allow it continue to work. You can have a try. > > >>> > >>> The question is that guest only send command, no real action to do the > vlan filter. > >>> So if both host and guest have no real action for vlan filter, who will do it? > >> > >> The virtio driver has features. > >> Guest can not send commands to host where feature bit not enabled. > >> Application can call filter_set and check if filter worked or not. > >> > >> Our code already had to do MAC and VLAN validation of incoming > >> packets therefore if hardware can't do vlan match, there is no problem. > >> I would expect other applications would do the same thing. > >> > >> Failing during configuration is bad. DPDK API should never force > >> application to play "guess the working configuration" with the device > >> driver or do string match on "which device is this anyway" > > Agree, it is not a failure of a configuration, it is a failure of negotiation of > virtio's capabilities. I am not sure which one is better when app configures one feature but fail to negotiate it with host(which means has no such capability to support this feature currently). 1)The driver cheat the app, and continue to do the rest of work(of course need some hints). 2)give hints and exit, then user re-run app with correct configuration. > > Let's use another example: we do not expect a guest kernel to panic() > because it is not properly negotiated? So why should a DPDK application fail > and return -ENOTSUP? I think user mode driver/app and kernel is different thing :-) Changchun
In Linux and BSD, if driver does not support filtering, the kernel takes
care of the situation.
A DPDK application will need a device layer; maybe some part of the
port/pipeline would
be a good place for it.
On Tue, Aug 4, 2015 at 6:01 PM, Ouyang, Changchun <
changchun.ouyang@intel.com> wrote:
>
> Hi Vincent,
>
> > -----Original Message-----
> > From: Vincent JARDIN [mailto:vincent.jardin@6wind.com]
> > Sent: Tuesday, August 4, 2015 8:52 PM
> > To: Thomas Monjalon; Ouyang, Changchun
> > Cc: dev@dpdk.org; Stephen Hemminger
> > Subject: Re: [dpdk-dev] [PATCH 2/2] virtio: allow running w/o vlan
> filtering
> >
> > Thomas, Changchun,
> >
> > On 29/07/2015 14:56, Thomas Monjalon wrote:
> > > Back on this old patch, it seems justified but nobody agreed.
> > >
> > > --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> > > +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> > > @@ -1288,7 +1288,6 @@ virtio_dev_configure(struct rte_eth_dev *dev)
> > > && !vtpci_with_feature(hw, VIRTIO_NET_F_CTRL_VLAN)) {
> > > PMD_DRV_LOG(NOTICE,
> > > "vlan filtering not available on this
> host");
> > > - return -ENOTSUP;
> > > }
> > >
> > > 2015-03-06 08:24, Stephen Hemminger:
> > >> "Ouyang, Changchun" <changchun.ouyang@intel.com> wrote:
> > >>>> From: Stephen Hemminger
> > >>>> Vlan filtering is an option, and not a requirement.
> > >>>> If host does not support filtering then it can be done in software.
> >
> > +1 with Stephen, remove return -ENOTSUP;
> >
> > applications must not fail, software stacks will handle it. We did
> experiment
> Do you mean handling it in software stack outside the virtio pmd?
> AFAIK, inside virtio PMD, we have no codes to handle it currently.
>
> > some issues when testpmd was failing while it was supposed to run. A
> notice
> > would be good enough.
> >
>
> Use '--disable-hw-vlan-filter' in testpmd command line will allow it
> continue to work.
> You can have a try.
>
> >
> > >>>
> > >>> The question is that guest only send command, no real action to do
> the
> > vlan filter.
> > >>> So if both host and guest have no real action for vlan filter, who
> will do it?
> > >>
> > >> The virtio driver has features.
> > >> Guest can not send commands to host where feature bit not enabled.
> > >> Application can call filter_set and check if filter worked or not.
> > >>
> > >> Our code already had to do MAC and VLAN validation of incoming
> > >> packets therefore if hardware can't do vlan match, there is no
> problem.
> > >> I would expect other applications would do the same thing.
> > >>
> > >> Failing during configuration is bad. DPDK API should never force
> > >> application to play "guess the working configuration" with the device
> > >> driver or do string match on "which device is this anyway"
> >
> > Agree, it is not a failure of a configuration, it is a failure of
> negotiation of
> > virtio's capabilities.
>
> I am not sure which one is better when app configures one feature but fail
> to negotiate it with host(which means has
> no such capability to support this feature currently).
> 1)The driver cheat the app, and continue to do the rest of work(of course
> need some hints).
> 2)give hints and exit, then user re-run app with correct configuration.
>
> >
> > Let's use another example: we do not expect a guest kernel to panic()
> > because it is not properly negotiated? So why should a DPDK application
> fail
> > and return -ENOTSUP?
> I think user mode driver/app and kernel is different thing :-)
>
> Changchun
>
>
> Use '--disable-hw-vlan-filter' in testpmd command line will allow it continue to work. > You can have a try. Yes, but not using this flag should not imply to exit. > I am not sure which one is better when app configures one feature but fail to negotiate it with host(which means has > no such capability to support this feature currently). > 1)The driver cheat the app, and continue to do the rest of work(of course need some hints). > 2)give hints and exit, then user re-run app with correct configuration. Same as Stephen: 3) give hints of capabilities, do not exit, then user app does whatever it wants (including exit if needed). Thank you,
On Tue, 10 Mar 2015 09:14:28 -0400
Neil Horman <nhorman@tuxdriver.com> wrote:
> On Mon, Mar 09, 2015 at 03:56:39PM +0100, David Marchand wrote:
> > Playing with virtio link status triggers a segfault because of an incorrect io
> > privilege level.
> >
> > To reproduce the problem, virtio device must be bound to igb_uio to have lsc
> > enabled.
> >
> > $ lspci |grep Ethernet
> > 00:03.0 Ethernet controller: Red Hat, Inc Virtio network device
> > $ modprobe uio
> > $ insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko
> > $ echo 0000:00:03.0 > /sys/bus/pci/devices/0000\:00\:03.0/driver/unbind
> > $ echo 1af4 1000 > /sys/bus/pci/drivers/igb_uio/new_id
> > $ ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n 3 -w 0000:00:03.0 -- -i --txqflags=0xf01
> > [snip]
> > EAL: PCI device 0000:00:03.0 on NUMA socket -1
> > EAL: probe driver: 1af4:1000 rte_virtio_pmd
> > Interactive-mode selected
> > Configuring Port 0 (socket 0)
> > Port 0: DE:AD:DE:01:02:03
> > Checking link statuses...
> > Port 0 Link Up - speed 10000 Mbps - full-duplex
> > Done
> > testpmd>
> >
> > Then, from qemu monitor:
> > (qemu) set_link virtio-net-pci.0 off
> >
> > testpmd> Segmentation fault
> >
> > A call to rte_eal_iopl_init() in a specific constructor has been added so that,
> > on Linux, iopl() is called asap:
> > - for statically linked virtio pmd, the constructor will be called at binary
> > start,
> > - for shared virtio pmd (loaded using -d eal option), it will be called at
> > dlopen() in rte_eal_init().
> >
> > On linux, iopl() effects are inherited by children threads, so calling it in a
> > constructor will ensure that any thread / process created after rte_eal_init()
> > inherits the io privilege level needed by the pmd.
> >
> > We keep call to rte_eal_iopl_init() in the pmd init macro so that the pmd will
> > not register if the io privilege level is insufficient.
> >
> > Besides, I moved rte_virtio_pmd_init() and related structure near its calling
> > site, so that all init code can be looked at in a glance.
> > The only change here is a new comment.
> >
> > Signed-off-by: David Marchand <david.marchand@6wind.com>
> > ---
> > lib/librte_pmd_virtio/virtio_ethdev.c | 71 ++++++++++++++++++++-------------
> > 1 file changed, 43 insertions(+), 28 deletions(-)
> >
> > diff --git a/lib/librte_pmd_virtio/virtio_ethdev.c b/lib/librte_pmd_virtio/virtio_ethdev.c
> > index d239083..29332fa 100644
> > --- a/lib/librte_pmd_virtio/virtio_ethdev.c
> > +++ b/lib/librte_pmd_virtio/virtio_ethdev.c
> > @@ -1228,34 +1228,6 @@ eth_virtio_dev_init(__rte_unused struct eth_driver *eth_drv,
> > return 0;
> > }
> >
> > -static struct eth_driver rte_virtio_pmd = {
> > - {
> > - .name = "rte_virtio_pmd",
> > - .id_table = pci_id_virtio_map,
> > - },
> > - .eth_dev_init = eth_virtio_dev_init,
> > - .dev_private_size = sizeof(struct virtio_hw),
> > -};
> > -
> > -/*
> > - * Driver initialization routine.
> > - * Invoked once at EAL init time.
> > - * Register itself as the [Poll Mode] Driver of PCI virtio devices.
> > - * Returns 0 on success.
> > - */
> > -static int
> > -rte_virtio_pmd_init(const char *name __rte_unused,
> > - const char *param __rte_unused)
> > -{
> > - if (rte_eal_iopl_init() != 0) {
> > - PMD_INIT_LOG(ERR, "IOPL call failed - cannot use virtio PMD");
> > - return -1;
> > - }
> > -
> > - rte_eth_driver_register(&rte_virtio_pmd);
> > - return 0;
> > -}
> > -
> > /*
> > * Only 1 queue is supported, no queue release related operation
> > */
> > @@ -1484,6 +1456,49 @@ __rte_unused uint8_t is_rx)
> > return 0;
> > }
> >
> > +/*
> > + * Early init function, this is needed so that iopl() on linux is done before
> > + * rte_eal_init (since it creates other threads that must inherit from the one
> > + * that calls rte_eal_init).
> > + */
> > +static void __attribute__((constructor, used))
> > +rte_virtio_early_init(void)
> > +{
> > + /* return value is checked at pmd init time, no need to check here, see
> > + * below. */
> > + rte_eal_iopl_init();
> > +}
> > +
>
> I don't see how this works for all cases. The constructor is called once when
> the library is first loaded. What if you have multiple independent (i.e. not
> forked children) processes that are using the dpdk in parallel? Only the
> process that triggered the library load will have io permissions set
> appropriately. I think what you need is to have every application that expects
> to call through the transmit path or poll the receive path call iopl, which I
> think speaks to having this requirement documented, so each application can call
> iopl prior to calling fork/daemonize/etc.
>
> Neil
I am still seeing this problem with DPDK 2.0 and 2.1.
It seems to me that doing the iopl init in eal_init is the only safe way.
Other workaround is to have application calling iopl_init before eal_init
but that kind of violates the current method of all things being initialized
by eal_init
On Tue, Sep 29, 2015 at 9:25 PM, Stephen Hemminger <
stephen@networkplumber.org> wrote:
> On Tue, 10 Mar 2015 09:14:28 -0400
> Neil Horman <nhorman@tuxdriver.com> wrote:
>
> >
> > I don't see how this works for all cases. The constructor is called
> once when
> > the library is first loaded. What if you have multiple independent
> (i.e. not
> > forked children) processes that are using the dpdk in parallel? Only the
> > process that triggered the library load will have io permissions set
> > appropriately. I think what you need is to have every application that
> expects
> > to call through the transmit path or poll the receive path call iopl,
> which I
> > think speaks to having this requirement documented, so each application
> can call
> > iopl prior to calling fork/daemonize/etc.
> >
>
> I am still seeing this problem with DPDK 2.0 and 2.1.
> It seems to me that doing the iopl init in eal_init is the only safe way.
> Other workaround is to have application calling iopl_init before eal_init
> but that kind of violates the current method of all things being
> initialized
> by eal_init
>
Putting it in the virtio pmd constructor is my preferred solution and we
don't need to pollute the eal for virtio (specific to x86, btw).
--
David Marchand
On Wed, Sep 30, 2015 at 10:28:53AM +0200, David Marchand wrote: > On Tue, Sep 29, 2015 at 9:25 PM, Stephen Hemminger < > stephen@networkplumber.org> wrote: > > > On Tue, 10 Mar 2015 09:14:28 -0400 > > Neil Horman <nhorman@tuxdriver.com> wrote: > > > > > > > > I don't see how this works for all cases. The constructor is called > > once when > > > the library is first loaded. What if you have multiple independent > > (i.e. not > > > forked children) processes that are using the dpdk in parallel? Only the > > > process that triggered the library load will have io permissions set > > > appropriately. I think what you need is to have every application that > > expects > > > to call through the transmit path or poll the receive path call iopl, > > which I > > > think speaks to having this requirement documented, so each application > > can call > > > iopl prior to calling fork/daemonize/etc. > > > > > > > I am still seeing this problem with DPDK 2.0 and 2.1. > > It seems to me that doing the iopl init in eal_init is the only safe way. > > Other workaround is to have application calling iopl_init before eal_init > > but that kind of violates the current method of all things being > > initialized > > by eal_init > > > > Putting it in the virtio pmd constructor is my preferred solution and we > don't need to pollute the eal for virtio (specific to x86, btw). > Preferred solution or not, you can't just call iopl from the constructor, because not all process will get appropriate permissions. It needs to be called by every process. What Stephen is saying is that your solution has use cases for which it doesn't work, and that needs to be solved. Neil > > -- > David Marchand
2015-09-30 10:52, Neil Horman: > On Wed, Sep 30, 2015 at 10:28:53AM +0200, David Marchand wrote: > > On Tue, Sep 29, 2015 at 9:25 PM, Stephen Hemminger < > > stephen@networkplumber.org> wrote: > > > > > On Tue, 10 Mar 2015 09:14:28 -0400 > > > Neil Horman <nhorman@tuxdriver.com> wrote: > > > > I don't see how this works for all cases. The constructor is called > > > once when > > > > the library is first loaded. What if you have multiple independent > > > (i.e. not > > > > forked children) processes that are using the dpdk in parallel? Only the > > > > process that triggered the library load will have io permissions set > > > > appropriately. I think what you need is to have every application that > > > expects > > > > to call through the transmit path or poll the receive path call iopl, > > > which I > > > > think speaks to having this requirement documented, so each application > > > can call > > > > iopl prior to calling fork/daemonize/etc. > > > > > > > > > > I am still seeing this problem with DPDK 2.0 and 2.1. > > > It seems to me that doing the iopl init in eal_init is the only safe way. > > > Other workaround is to have application calling iopl_init before eal_init > > > but that kind of violates the current method of all things being > > > initialized by eal_init > > > > Putting it in the virtio pmd constructor is my preferred solution and we > > don't need to pollute the eal for virtio (specific to x86, btw). > > Preferred solution or not, you can't just call iopl from the constructor, > because not all process will get appropriate permissions. It needs to be called > by every process. What Stephen is saying is that your solution has use cases > for which it doesn't work, and that needs to be solved. I think it may be solved by calling iopl in the constructor. We just need an extra call in rte_virtio_pmd_init() to detect iopl failures. We can also simply move rte_eal_intr_init() after rte_eal_dev_init(). Please read my previous post on this topic: http://thread.gmane.org/gmane.comp.networking.dpdk.devel/14761/focus=22341 About the multiprocess case, I don't see the problem as the RX/TX and interrupt threads are forked in the rte_eal_init() context which should call iopl even in secondary processes.
On Wed, 30 Sep 2015 17:37:05 +0200
Thomas Monjalon <thomas.monjalon@6wind.com> wrote:
> 2015-09-30 10:52, Neil Horman:
> > On Wed, Sep 30, 2015 at 10:28:53AM +0200, David Marchand wrote:
> > > On Tue, Sep 29, 2015 at 9:25 PM, Stephen Hemminger <
> > > stephen@networkplumber.org> wrote:
> > >
> > > > On Tue, 10 Mar 2015 09:14:28 -0400
> > > > Neil Horman <nhorman@tuxdriver.com> wrote:
> > > > > I don't see how this works for all cases. The constructor is called
> > > > once when
> > > > > the library is first loaded. What if you have multiple independent
> > > > (i.e. not
> > > > > forked children) processes that are using the dpdk in parallel? Only the
> > > > > process that triggered the library load will have io permissions set
> > > > > appropriately. I think what you need is to have every application that
> > > > expects
> > > > > to call through the transmit path or poll the receive path call iopl,
> > > > which I
> > > > > think speaks to having this requirement documented, so each application
> > > > can call
> > > > > iopl prior to calling fork/daemonize/etc.
> > > > >
> > > >
> > > > I am still seeing this problem with DPDK 2.0 and 2.1.
> > > > It seems to me that doing the iopl init in eal_init is the only safe way.
> > > > Other workaround is to have application calling iopl_init before eal_init
> > > > but that kind of violates the current method of all things being
> > > > initialized by eal_init
> > >
> > > Putting it in the virtio pmd constructor is my preferred solution and we
> > > don't need to pollute the eal for virtio (specific to x86, btw).
> >
> > Preferred solution or not, you can't just call iopl from the constructor,
> > because not all process will get appropriate permissions. It needs to be called
> > by every process. What Stephen is saying is that your solution has use cases
> > for which it doesn't work, and that needs to be solved.
>
> I think it may be solved by calling iopl in the constructor.
> We just need an extra call in rte_virtio_pmd_init() to detect iopl failures.
> We can also simply move rte_eal_intr_init() after rte_eal_dev_init().
> Please read my previous post on this topic:
> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/14761/focus=22341
>
> About the multiprocess case, I don't see the problem as the RX/TX and interrupt
> threads are forked in the rte_eal_init() context which should call iopl even in
> secondary processes.
Another alternative is to use ioperm to give access to certain regions.
But ioperm does not inherit to other threads, therefore it required some code on
startup to execute the syscall on each EAL thread.
On Wed, Sep 30, 2015 at 05:37:05PM +0200, Thomas Monjalon wrote:
> 2015-09-30 10:52, Neil Horman:
> > On Wed, Sep 30, 2015 at 10:28:53AM +0200, David Marchand wrote:
> > > On Tue, Sep 29, 2015 at 9:25 PM, Stephen Hemminger <
> > > stephen@networkplumber.org> wrote:
> > >
> > > > On Tue, 10 Mar 2015 09:14:28 -0400
> > > > Neil Horman <nhorman@tuxdriver.com> wrote:
> > > > > I don't see how this works for all cases. The constructor is called
> > > > once when
> > > > > the library is first loaded. What if you have multiple independent
> > > > (i.e. not
> > > > > forked children) processes that are using the dpdk in parallel? Only the
> > > > > process that triggered the library load will have io permissions set
> > > > > appropriately. I think what you need is to have every application that
> > > > expects
> > > > > to call through the transmit path or poll the receive path call iopl,
> > > > which I
> > > > > think speaks to having this requirement documented, so each application
> > > > can call
> > > > > iopl prior to calling fork/daemonize/etc.
> > > > >
> > > >
> > > > I am still seeing this problem with DPDK 2.0 and 2.1.
> > > > It seems to me that doing the iopl init in eal_init is the only safe way.
> > > > Other workaround is to have application calling iopl_init before eal_init
> > > > but that kind of violates the current method of all things being
> > > > initialized by eal_init
> > >
> > > Putting it in the virtio pmd constructor is my preferred solution and we
> > > don't need to pollute the eal for virtio (specific to x86, btw).
> >
> > Preferred solution or not, you can't just call iopl from the constructor,
> > because not all process will get appropriate permissions. It needs to be called
> > by every process. What Stephen is saying is that your solution has use cases
> > for which it doesn't work, and that needs to be solved.
>
> I think it may be solved by calling iopl in the constructor.
> We just need an extra call in rte_virtio_pmd_init() to detect iopl failures.
> We can also simply move rte_eal_intr_init() after rte_eal_dev_init().
> Please read my previous post on this topic:
> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/14761/focus=22341
>
> About the multiprocess case, I don't see the problem as the RX/TX and interrupt
> threads are forked in the rte_eal_init() context which should call iopl even in
> secondary processes.
>
I'm not talking about secondary processes here (i.e. processes forked from a
parent that was the process which initialized the dpdk). I'm referring to two
completely independent processes, both of which link to and use the dpdk.
Though I think we're saying the same thing. When you say 'constructor' above,
you don't mean 'constructor' in the strict sense, but rather the pmd init
routine (the one called from rte_eal_vdev_init and rte_eal_dev_init). If this
is the case, then yes, that works fine, since each process linking to the DPDK
will enter those routines and call iopl. In fact, if thats the case, then no
call is needed in the constructor at all.
Neil
I think David's patch should be in 2.2 (and 2.1.X if that starts out). It solves the problem for both link state and secondary processes. It may need to be updated for the current drivers/net/virtio layout.
On Tue, 10 Mar 2015 06:55:41 -0400
Neil Horman <nhorman@tuxdriver.com> wrote:
> On Tue, Mar 10, 2015 at 10:08:24AM +0100, David Marchand wrote:
> > Hello Neil,
> >
> > On Mon, Mar 9, 2015 at 4:21 PM, Neil Horman <nhorman@tuxdriver.com> wrote:
> >
> > > On Mon, Mar 09, 2015 at 03:56:38PM +0100, David Marchand wrote:
> > > > Loading shared libraries should be done at the very start of eal init so
> > > that
> > > > the code statically built in dpdk and the code loaded from shared
> > > objects is
> > > > handled (almost) the same way wrt to call to rte_eal_init().
> > > > The only thing that must be done before is filling the solib_list which
> > > is done
> > > > by eal_parse_args().
> > > >
> > >
> > >
> > > I don't see anything explicitly wrong with this, but at the same time it
> > > doesn't
> > > seem to fix anything. Is there a particular bug that you're fixing in
> > > relation
> > > to your cover letter here? Or is there some expectation that PMD's loaded
> > > in
> > > this fashion expect the dpdk to be completely uninitalized? That would
> > > seem
> > > like a strange operational requirement to me.
> > >
> >
> > Well, at first, I wanted to fix the virtio pmd init issue (iopl() not
> > called at the right place wrt to other pthreads created in rte_eal_init()).
> Ah, this is what you were addressing:
> http://dpdk.org/ml/archives/dev/2015-March/014765.html
>
> > With next patch, this issue is fixed for statically builtin virtio pmd, but
> > for virtio pmd as a shared object, the dlopen comes too late.
> > So, yes, I moved the dlopen() for this reason.
> >
> But this doesn't do anything to help you. The goal, according to the above
> thread, is to initalize the pmd earlier so that you can call iopl prior to doing
> any forks (so that io privlidges are inherited). But both static and dynamic
> pmd have constructors that just register their driver structures. No
> initalization happens until rte_eal_dev_init is called. So this movement does
> nothing to change the time any given drivers init routine is called.
>
> > From a more general point of view, since we support both static and dso
> > pmds, I would say that this is more logical to have dlopen comes very
> > early, since static code is "loaded" even earlier : if the current pmds
> > needed more than just register to the driver list, they would already have
> > triggered segfaults and/or bugs.
> >
> No, not really. I suppose it doesn't hurt anything, but moving this earlier in
> a function doesn't really buy you anything, as statically allocate pmds are
> called by the gcc start code prior to an applications main routine running, so
> we're never actually going to get close to parity there, nor do we need to,
> because the actual init happens at rte_eal_dev_init, which is in parity for both
> static and dynamic drivers.
>
> >
> > I know this change comes really late for 2.0.
> > I am open to other ideas but I don't want to see more #ifdef <some feature>
> > in eal.c (especially for a pmd), this is a non sense.
> >
> > I would say that at least the patch 2 is needed for 2.0 : it fixes the
> > static case, but without patch 1 virtio pmd triggers a segfault on
> > interrupt receipt when built as a dso.
> >
> The static case suffers from problems as well I think, in that its possible to
> architect multiple processes that are not started from fork that use the same
> pmd, which would create the same issue. I think a better course of action would
> be to document the need for an application to call iopl before rte_eal_init.
>
Given all this, I recommend that Thomas not apply this patch.
Please resubmit if there is a real problem with drivers (something in tree).
There are enough other bugs to fix without chasing ghosts.
On Thu, 1 Oct 2015 07:25:45 -0400
Neil Horman <nhorman@tuxdriver.com> wrote:
> On Wed, Sep 30, 2015 at 05:37:05PM +0200, Thomas Monjalon wrote:
> > 2015-09-30 10:52, Neil Horman:
> > > On Wed, Sep 30, 2015 at 10:28:53AM +0200, David Marchand wrote:
> > > > On Tue, Sep 29, 2015 at 9:25 PM, Stephen Hemminger <
> > > > stephen@networkplumber.org> wrote:
> > > >
> > > > > On Tue, 10 Mar 2015 09:14:28 -0400
> > > > > Neil Horman <nhorman@tuxdriver.com> wrote:
> > > > > > I don't see how this works for all cases. The constructor is called
> > > > > once when
> > > > > > the library is first loaded. What if you have multiple independent
> > > > > (i.e. not
> > > > > > forked children) processes that are using the dpdk in parallel? Only the
> > > > > > process that triggered the library load will have io permissions set
> > > > > > appropriately. I think what you need is to have every application that
> > > > > expects
> > > > > > to call through the transmit path or poll the receive path call iopl,
> > > > > which I
> > > > > > think speaks to having this requirement documented, so each application
> > > > > can call
> > > > > > iopl prior to calling fork/daemonize/etc.
> > > > > >
> > > > >
> > > > > I am still seeing this problem with DPDK 2.0 and 2.1.
> > > > > It seems to me that doing the iopl init in eal_init is the only safe way.
> > > > > Other workaround is to have application calling iopl_init before eal_init
> > > > > but that kind of violates the current method of all things being
> > > > > initialized by eal_init
> > > >
> > > > Putting it in the virtio pmd constructor is my preferred solution and we
> > > > don't need to pollute the eal for virtio (specific to x86, btw).
> > >
> > > Preferred solution or not, you can't just call iopl from the constructor,
> > > because not all process will get appropriate permissions. It needs to be called
> > > by every process. What Stephen is saying is that your solution has use cases
> > > for which it doesn't work, and that needs to be solved.
> >
> > I think it may be solved by calling iopl in the constructor.
> > We just need an extra call in rte_virtio_pmd_init() to detect iopl failures.
> > We can also simply move rte_eal_intr_init() after rte_eal_dev_init().
> > Please read my previous post on this topic:
> > http://thread.gmane.org/gmane.comp.networking.dpdk.devel/14761/focus=22341
> >
> > About the multiprocess case, I don't see the problem as the RX/TX and interrupt
> > threads are forked in the rte_eal_init() context which should call iopl even in
> > secondary processes.
> >
>
> I'm not talking about secondary processes here (i.e. processes forked from a
> parent that was the process which initialized the dpdk). I'm referring to two
> completely independent processes, both of which link to and use the dpdk.
>
> Though I think we're saying the same thing. When you say 'constructor' above,
> you don't mean 'constructor' in the strict sense, but rather the pmd init
> routine (the one called from rte_eal_vdev_init and rte_eal_dev_init). If this
> is the case, then yes, that works fine, since each process linking to the DPDK
> will enter those routines and call iopl. In fact, if thats the case, then no
> call is needed in the constructor at all.
I think this patch should be rebased and resubmitted for 2.2.
It fixes a real problem (virtio link state). The driver changed directory
and the the patch could be redone to minimize changes.
On Wed, Oct 14, 2015 at 2:07 AM, Stephen Hemminger <
stephen@networkplumber.org> wrote:
> On Thu, 1 Oct 2015 07:25:45 -0400
> Neil Horman <nhorman@tuxdriver.com> wrote:
>
> > On Wed, Sep 30, 2015 at 05:37:05PM +0200, Thomas Monjalon wrote:
> > > 2015-09-30 10:52, Neil Horman:
>
> > > I think it may be solved by calling iopl in the constructor.
> > > We just need an extra call in rte_virtio_pmd_init() to detect iopl
> failures.
> > > We can also simply move rte_eal_intr_init() after rte_eal_dev_init().
> > > Please read my previous post on this topic:
> > >
> http://thread.gmane.org/gmane.comp.networking.dpdk.devel/14761/focus=22341
> > >
> > > About the multiprocess case, I don't see the problem as the RX/TX and
> interrupt
> > > threads are forked in the rte_eal_init() context which should call
> iopl even in
> > > secondary processes.
> > >
> >
> > I'm not talking about secondary processes here (i.e. processes forked
> from a
> > parent that was the process which initialized the dpdk). I'm referring
> to two
> > completely independent processes, both of which link to and use the dpdk.
> >
> > Though I think we're saying the same thing. When you say 'constructor'
> above,
> > you don't mean 'constructor' in the strict sense, but rather the pmd init
> > routine (the one called from rte_eal_vdev_init and rte_eal_dev_init).
> If this
> > is the case, then yes, that works fine, since each process linking to
> the DPDK
> > will enter those routines and call iopl. In fact, if thats the case,
> then no
> > call is needed in the constructor at all.
>
> I think this patch should be rebased and resubmitted for 2.2.
>
> It fixes a real problem (virtio link state). The driver changed directory
> and the the patch could be redone to minimize changes.
>
I am testing Thomas proposal (just moving the intr_init() call), else I
will rebase this patch.
--
David Marchand
On Wed, Oct 14, 2015 at 10:00 AM, David Marchand <david.marchand@6wind.com>
wrote:
>
> I am testing Thomas proposal (just moving the intr_init() call), else I
> will rebase this patch.
>
Thomas approach works and is minimal.
I ran some autotest, did not see anything wrong (let aside the tests that
already fail on current head).
Looking at eal and the drivers, this looks fine as well.
So my patches can be dropped.
Let's go with this new patch.
--
David Marchand
For virtio-net pmd, the interrupt management thread must be created after this driver has initialised so that iopl() has been properly called and its effects are inherited by all eal children threads. Before this change, changing link status on a virtio-net device would trigger a segfault in the interrupt thread : $ mkdir -p /mnt/huge $ echo 256 > /proc/sys/vm/nr_hugepages $ mount -t hugetlbfs none /mnt/huge $ lspci |grep Ethernet 00:03.0 Ethernet controller: Red Hat, Inc Virtio network device $ modprobe uio $ insmod ./x86_64-native-linuxapp-gcc/kmod/igb_uio.ko $ echo 0000:00:03.0 > /sys/bus/pci/devices/0000\:00\:03.0/driver/unbind $ echo 1af4 1000 > /sys/bus/pci/drivers/igb_uio/new_id $ ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x6 -n 3 -w 0000:00:03.0 -- -i --txqflags=0xf01 --total-num-mbufs 2048 [snip] EAL: PCI device 0000:00:03.0 on NUMA socket -1 EAL: probe driver: 1af4:1000 rte_virtio_pmd Interactive-mode selected Configuring Port 0 (socket 0) Port 0: DE:AD:DE:01:02:03 Checking link statuses... Port 0 Link Up - speed 10000 Mbps - full-duplex Done testpmd> Then, from qemu monitor: (qemu) set_link virtio-net-pci.0 off testpmd> Segmentation fault Fixes: "eal: set iopl only when needed" (565b85d) Reported-by: Stephen Hemminger <shemming@brocade.com> Suggested-by: Thomas Monjalon <thomas.monjalon@6wind.com> Signed-off-by: David Marchand <david.marchand@6wind.com> --- lib/librte_eal/linuxapp/eal/eal.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/lib/librte_eal/linuxapp/eal/eal.c b/lib/librte_eal/linuxapp/eal/eal.c index 33e1067..e0ad1d7 100644 --- a/lib/librte_eal/linuxapp/eal/eal.c +++ b/lib/librte_eal/linuxapp/eal/eal.c @@ -817,9 +817,6 @@ rte_eal_init(int argc, char **argv) if (rte_eal_alarm_init() < 0) rte_panic("Cannot init interrupt-handling thread\n"); - if (rte_eal_intr_init() < 0) - rte_panic("Cannot init interrupt-handling thread\n"); - if (rte_eal_timer_init() < 0) rte_panic("Cannot init HPET or TSC timers\n"); @@ -845,6 +842,9 @@ rte_eal_init(int argc, char **argv) if (rte_eal_dev_init() < 0) rte_panic("Cannot init pmd devices\n"); + if (rte_eal_intr_init() < 0) + rte_panic("Cannot init interrupt-handling thread\n"); + RTE_LCORE_FOREACH_SLAVE(i) { /* -- 1.7.10.4
On Wed, Oct 14, 2015 at 2:05 AM, Stephen Hemminger <
stephen@networkplumber.org> wrote:
> Given all this, I recommend that Thomas not apply this patch.
> Please resubmit if there is a real problem with drivers (something in
> tree).
> There are enough other bugs to fix without chasing ghosts.
>
Yes, this patch can be dropped.
Removed it from patchwork.
--
David Marchand
On Wed, Oct 14, 2015 at 11:50 AM, David Marchand <david.marchand@6wind.com>
wrote:
> Fixes: "eal: set iopl only when needed" (565b85d)
>
Forgot the release note.
If this current patch is ok for all, I will resubmit with an update to
doc/guides/rel_notes/release_2_2.rst.
--
David Marchand
2015-10-14 13:32, David Marchand:
> On Wed, Oct 14, 2015 at 11:50 AM, David Marchand <david.marchand@6wind.com>
> wrote:
>
> > Fixes: "eal: set iopl only when needed" (565b85d)
> >
>
> Forgot the release note.
> If this current patch is ok for all, I will resubmit with an update to
> doc/guides/rel_notes/release_2_2.rst.
Applied with release note, thanks.
2015-08-05 12:49, Vincent JARDIN:
> > Use '--disable-hw-vlan-filter' in testpmd command line will allow it continue to work.
> > You can have a try.
>
> Yes, but not using this flag should not imply to exit.
>
> > I am not sure which one is better when app configures one feature but fail to negotiate it with host(which means has
> > no such capability to support this feature currently).
> > 1)The driver cheat the app, and continue to do the rest of work(of course need some hints).
> > 2)give hints and exit, then user re-run app with correct configuration.
>
> Same as Stephen:
>
> 3) give hints of capabilities, do not exit, then user app does whatever
> it wants (including exit if needed).
I would tend to apply this patch but as it was discussed we need some
clear acknowledgement.
Huawei?
This is a very old thread which has no conclusion. Now that we have more experience with virtio, could we try to fix it properly? http://dpdk.org/patch/3897/ 2015-10-21 15:58, Thomas Monjalon: > 2015-08-05 12:49, Vincent JARDIN: > > > Use '--disable-hw-vlan-filter' in testpmd command line will allow it continue to work. > > > You can have a try. > > > > Yes, but not using this flag should not imply to exit. > > > > > I am not sure which one is better when app configures one feature but fail to negotiate it with host(which means has > > > no such capability to support this feature currently). > > > 1)The driver cheat the app, and continue to do the rest of work(of course need some hints). > > > 2)give hints and exit, then user re-run app with correct configuration. > > > > Same as Stephen: > > > > 3) give hints of capabilities, do not exit, then user app does whatever > > it wants (including exit if needed). > > I would tend to apply this patch but as it was discussed we need some > clear acknowledgement. > Huawei?