DPDK patches and discussions
 help / color / mirror / Atom feed
From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: "Wang, Yinan" <yinan.wang@intel.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"matan@mellanox.com" <matan@mellanox.com>,
	"Xia, Chenbo" <chenbo.xia@intel.com>,
	"Liu, Yong" <yong.liu@intel.com>
Cc: "thomas@monjalon.net" <thomas@monjalon.net>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	"david.marchand@redhat.com" <david.marchand@redhat.com>
Subject: Re: [dpdk-dev] [PATCH 0/2] Fix vhost performance regression
Date: Fri, 24 Jul 2020 17:43:17 +0200	[thread overview]
Message-ID: <aae8a1a9-0592-3a15-08fa-ea16abd9b933@redhat.com> (raw)
In-Reply-To: <ea83bb41-a08d-5157-4f43-ac1db0026f2a@redhat.com>



On 7/24/20 10:54 AM, Maxime Coquelin wrote:
> 
> 
> On 7/24/20 9:06 AM, Maxime Coquelin wrote:
>> Hi Yinan,
>>
>> On 7/24/20 6:55 AM, Wang, Yinan wrote:
>>> Hi Maxime,
>>>
>>> The performance drop issue can be fixed, thanks!
>>> The multi-queues interrupt issue still exist w/ this patch set.
>>
>> Thanks for the test report, so that's only half good.
>> I'm setting up the multi-queues interrupt test case to further debug it.
> 
> I have now a reproducer, i.e. only interrupts are received on rxq0.
> 
> (gdb) p *((struct internal_list *)internal_list)->eth_dev->intr_handle
> $20 = {
>   {
>     vfio_dev_fd = 0,
>     uio_cfg_fd = 0
>   },
>   fd = 0,
>   type = RTE_INTR_HANDLE_VDEV,
>   max_intr = 2,
>   nb_efd = 1,
>   efd_counter_size = 8 '\b',
>   efds = {622, 621, 624, 626, 628, 630, 632, 634, 636, 638, 640, 642,
> 692, 646, 701, 650, 0 <repeats 496 times>},
>   elist = {{
>       status = 1,
>       fd = 622,
>       epfd = 645,
>       epdata = {
>         event = 2147483651,
>         data = 0x1,
>         cb_fun = 0x8af840 <eal_intr_proc_rxtx_intr>,
>         cb_arg = 0x7f4df0001580
>       }
>     }, {
>       status = 0,
>       fd = 0,
>       epfd = 0,
>       epdata = {
>         event = 0,
>         data = 0x0,
>         cb_fun = 0x0,
>         cb_arg = 0x0
>       }
>     } <repeats 511 times>},
>   intr_vec = 0x7f4df0007db0
> }
> 
> In above dump, we can see the efds are well set via the fix provided by
> Matan, but max_intr and nb_efd aren't so polling won't take them into
> account.
> 
> I'm working on a fix.

So it is a bit more complex than I imagined.
There are no DPDK API to update the FD in the epoll, so it seems we need
to do it directly in the driver by removing the old one and adding the
new one.

I have cooked a patch that makes it work, but I would like to know if
that would be acceptable  for this release? We could imagine introducing
new rte_epoll API to handle that properly in v20.11.

Matan, could you review below patch and confirm whether it is safe?

(The patch needs some style clean-up before being submitted).

Thanks in advance,
Maxime

==================================================================

diff --git a/drivers/net/vhost/rte_eth_vhost.c
b/drivers/net/vhost/rte_eth_vhost.c
index 14b7b59f67..f36ea4b24c 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -5,6 +5,7 @@
 #include <unistd.h>
 #include <pthread.h>
 #include <stdbool.h>
+#include <sys/epoll.h>

 #include <rte_mbuf.h>
 #include <rte_ethdev_driver.h>
@@ -593,7 +594,6 @@ eth_vhost_install_intr(struct rte_eth_dev *dev)
 {
        struct rte_vhost_vring vring;
        struct vhost_queue *vq;
-       int count = 0;
        int nb_rxq = dev->data->nb_rx_queues;
        int i;
        int ret;
@@ -623,6 +623,8 @@ eth_vhost_install_intr(struct rte_eth_dev *dev)

        VHOST_LOG(INFO, "Prepare intr vec\n");
        for (i = 0; i < nb_rxq; i++) {
+               dev->intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET
+ i;
+               dev->intr_handle->efds[i] = -1;
                vq = dev->data->rx_queues[i];
                if (!vq) {
                        VHOST_LOG(INFO, "rxq-%d not setup yet, skip!\n", i);
@@ -641,14 +643,12 @@ eth_vhost_install_intr(struct rte_eth_dev *dev)
                                "rxq-%d's kickfd is invalid, skip!\n", i);
                        continue;
                }
-               dev->intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET
+ i;
                dev->intr_handle->efds[i] = vring.kickfd;
-               count++;
                VHOST_LOG(INFO, "Installed intr vec for rxq-%d\n", i);
        }

-       dev->intr_handle->nb_efd = count;
-       dev->intr_handle->max_intr = count + 1;
+       dev->intr_handle->nb_efd = nb_rxq;
+       dev->intr_handle->max_intr = nb_rxq + 1;
        dev->intr_handle->type = RTE_INTR_HANDLE_VDEV;

        return 0;
@@ -836,8 +836,11 @@ vring_conf_update(int vid, struct rte_eth_dev
*eth_dev, uint16_t vring_id)
        struct rte_eth_conf *dev_conf = &eth_dev->data->dev_conf;
        struct pmd_internal *internal = eth_dev->data->dev_private;
        struct rte_vhost_vring vring;
+       struct rte_intr_handle *handle;
+       struct rte_epoll_event rev;
        int rx_idx = vring_id % 2 ? (vring_id - 1) >> 1 : -1;
        int ret = 0;
+       int epfd;

        /*
         * The vring kickfd may be changed after the new device
notification.
@@ -852,9 +855,17 @@ vring_conf_update(int vid, struct rte_eth_dev
*eth_dev, uint16_t vring_id)
                        return ret;

                if (vring.kickfd != eth_dev->intr_handle->efds[rx_idx]) {
+                       handle = eth_dev->intr_handle;
                        VHOST_LOG(INFO, "kickfd for rxq-%d was changed.\n",
                                          rx_idx);
-                       eth_dev->intr_handle->efds[rx_idx] = vring.kickfd;
+
+                       handle->efds[rx_idx] = vring.kickfd;
+                       epfd = handle->elist[rx_idx].epfd;
+                       rev = handle->elist[rx_idx];
+                       rev.fd = vring.kickfd;
+                       rte_epoll_ctl(epfd, EPOLL_CTL_DEL,
handle->elist[rx_idx].fd, &handle->elist[rx_idx]);
+                       handle->elist[rx_idx] = rev;
+                       rte_epoll_ctl(epfd, EPOLL_CTL_ADD, rev.fd,
&handle->elist[rx_idx]);
                }
        }


> Regards,
> Maxime
> 
>> Regards,
>> Maxime
>>
>>> BR,
>>> Yinan
>>>
>>>> -----Original Message-----
>>>> From: Maxime Coquelin <maxime.coquelin@redhat.com>
>>>> Sent: 2020?7?23? 21:09
>>>> To: dev@dpdk.org; matan@mellanox.com; Xia, Chenbo
>>>> <chenbo.xia@intel.com>; Liu, Yong <yong.liu@intel.com>; Wang, Yinan
>>>> <yinan.wang@intel.com>
>>>> Cc: thomas@monjalon.net; Yigit, Ferruh <ferruh.yigit@intel.com>;
>>>> david.marchand@redhat.com; Maxime Coquelin
>>>> <maxime.coquelin@redhat.com>
>>>> Subject: [PATCH 0/2] Fix vhost performance regression
>>>>
>>>> Hi,
>>>>
>>>> This series aims at fixing the performance degradation reported
>>>> by Intel QE. I managed to reproduce the issue, and this series
>>>> fixes it.
>>>>
>>>> I only tested the first test case provided in the Bz[0], but wanted
>>>> to send early for Intel QE to try and confirm it solves the issue.
>>>>
>>>> I will work on reproducing the other test cases, and see if this
>>>> also fixes them.
>>>>
>>>> Thanks to Intel QE team for finding this issue.
>>>> Maxime
>>>>
>>>> [0]: https://bugs.dpdk.org/show_bug.cgi?id=507#c0
>>>>
>>>> Maxime Coquelin (2):
>>>>   vhost: fix guest notification setting
>>>>   net/vhost: fix queue update
>>>>
>>>>  drivers/net/vhost/rte_eth_vhost.c | 25 ++++++-------------------
>>>>  lib/librte_vhost/vhost.c          | 24 ++++++++++++++++++++----
>>>>  lib/librte_vhost/vhost.h          |  5 +++++
>>>>  lib/librte_vhost/vhost_user.c     | 11 ++++++++---
>>>>  4 files changed, 39 insertions(+), 26 deletions(-)
>>>>
>>>> --
>>>> 2.26.2
>>>
>>
> 


  reply	other threads:[~2020-07-24 15:43 UTC|newest]

Thread overview: 11+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-23 13:08 Maxime Coquelin
2020-07-23 13:08 ` [dpdk-dev] [PATCH 1/2] vhost: fix guest notification setting Maxime Coquelin
2020-07-27 13:46   ` Xia, Chenbo
2020-07-23 13:08 ` [dpdk-dev] [PATCH 2/2] net/vhost: fix queue update Maxime Coquelin
2020-07-27 13:54   ` Xia, Chenbo
2020-07-24  4:55 ` [dpdk-dev] [PATCH 0/2] Fix vhost performance regression Wang, Yinan
2020-07-24  7:06   ` Maxime Coquelin
2020-07-24  8:54     ` Maxime Coquelin
2020-07-24 15:43       ` Maxime Coquelin [this message]
2020-07-27  8:28         ` Matan Azrad
2020-07-24 12:42     ` David Marchand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aae8a1a9-0592-3a15-08fa-ea16abd9b933@redhat.com \
    --to=maxime.coquelin@redhat.com \
    --cc=chenbo.xia@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=matan@mellanox.com \
    --cc=thomas@monjalon.net \
    --cc=yinan.wang@intel.com \
    --cc=yong.liu@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).