From: Maxime Coquelin <maxime.coquelin@redhat.com>
To: David Marchand <david.marchand@redhat.com>
Cc: dev <dev@dpdk.org>, Matan Azrad <matan@mellanox.com>,
"Xia, Chenbo" <chenbo.xia@intel.com>,
Marvin Liu <yong.liu@intel.com>,
"Wang, Yinan" <yinan.wang@intel.com>,
Thomas Monjalon <thomas@monjalon.net>,
"Yigit, Ferruh" <ferruh.yigit@intel.com>
Subject: Re: [dpdk-dev] [PATCH v3 3/3] net/vhost: fix interrupt mode
Date: Wed, 29 Jul 2020 15:19:42 +0200
Message-ID: <0d057cd7-47a2-542e-7dd5-1348c41ba48a@redhat.com>
In-Reply-To: <CAJFAV8yU=6aSc3Rs7ARwON9FW+Pi_pSo=mYNcBFmF2iKE9dqPQ@mail.gmail.com>
On 7/29/20 1:27 PM, David Marchand wrote:
> On Wed, Jul 29, 2020 at 11:20 AM Maxime Coquelin
> <maxime.coquelin@redhat.com> wrote:
>>
>> At .new_device() time, only the first vring pair is
>> now ready, other vrings are consfigured later.
>
> configured*
>
>>
>> Problem is that when application will setup and enable
>> interrupts, only the first queue pair Rx interrupt will
>> be enabled.
>>
>> This patch fixes the issue by setting the number of
>> max interrupts to the number of Rx queues that will be
>> later initialized. Then, as soon as a Rx vring is ready
>> and interrupt enabled by the application, it removes the
>> corresponding uninitialized epoll event, and install a
>
> installs*
>
>> new one with the valid FD.
>>
>> Fixes: 604052ae5395 ("net/vhost: support queue update")
>>
>> Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
>
> It seems a bit of a hack, but I _think_ the patch is good wrt races on
> epoll configuration.
>
> We are only touching the vhost pmd, in interrupt mode.
> The interrupt mode is not that frequently used (I found no usage in
> opensource projects).
> The vhost pmd is not used in OVS as it lags behind the vhost library
> and has limitations.
>
> So my opinion is that the risk of taking this patch rather than
> reverting the changes (which is not trivial iiuc) in the vhost library
> is acceptable.
>
>
> One comment below:
>
>> ---
>> drivers/net/vhost/rte_eth_vhost.c | 75 +++++++++++++++++++++++++++----
>> 1 file changed, 66 insertions(+), 9 deletions(-)
>>
>> diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
>> index 951929c663..237785dd66 100644
>> --- a/drivers/net/vhost/rte_eth_vhost.c
>> +++ b/drivers/net/vhost/rte_eth_vhost.c
>> @@ -5,6 +5,7 @@
>> #include <unistd.h>
>> #include <pthread.h>
>> #include <stdbool.h>
>> +#include <sys/epoll.h>
>>
>> #include <rte_mbuf.h>
>> #include <rte_ethdev_driver.h>
>> @@ -95,6 +96,8 @@ struct vhost_queue {
>> uint16_t port;
>> uint16_t virtqueue_id;
>> struct vhost_stats stats;
>> + int intr_enable;
>> + rte_spinlock_t intr_lock;
>> };
>>
>> struct pmd_internal {
>> @@ -524,6 +527,45 @@ find_internal_resource(char *ifname)
>> return list;
>> }
>>
>> +static int
>> +eth_vhost_update_intr(struct rte_eth_dev *eth_dev, uint16_t rxq_idx)
>> +{
>> + struct rte_intr_handle *handle = eth_dev->intr_handle;
>> + struct rte_epoll_event rev;
>> + int epfd, ret;
>> +
>> + if (handle->efds[rxq_idx] == handle->elist[rxq_idx].fd)
>> + return 0;
>
> Feel free to ignore if this situation can not happen.
>
> We are expecting only -1 -> valid fd transitions.
> Maybe add an error log if we are in another situation?
> This would indicate something quite broken.

That's a very good idea, I will add such a warning in v4.

Thanks,
Maxime

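
The check suggested above could look roughly like the sketch below. This is not the v4 patch: `classify_kickfd_change()` and the `kickfd_change` enum are hypothetical names, and `fprintf()` stands in for `VHOST_LOG(ERR, ...)` so the snippet builds without DPDK. It only assumes what the review states, namely that -1 -> valid fd is the sole expected transition.

```c
#include <stdio.h>

/* Hypothetical outcome codes; not part of the vhost PMD. */
enum kickfd_change {
	KICKFD_UNCHANGED,  /* installed fd already matches, nothing to do */
	KICKFD_EXPECTED,   /* -1 -> valid fd, the only transition expected */
	KICKFD_UNEXPECTED, /* anything else, worth an error log */
};

static enum kickfd_change
classify_kickfd_change(int installed_fd, int new_fd)
{
	if (installed_fd == new_fd)
		return KICKFD_UNCHANGED;

	if (installed_fd == -1 && new_fd >= 0)
		return KICKFD_EXPECTED;

	/* Stand-in for VHOST_LOG(ERR, ...) in the real driver. */
	fprintf(stderr, "unexpected kickfd change: %d -> %d\n",
		installed_fd, new_fd);
	return KICKFD_UNEXPECTED;
}
```

In the driver, `installed_fd` would presumably correspond to `handle->elist[rxq_idx].fd` and `new_fd` to `handle->efds[rxq_idx]`; whether the unexpected case should abort the update or only log is left open here.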
>
>
>> +
>> + VHOST_LOG(INFO, "kickfd for rxq-%d was changed, updating handler.\n",
>> + rxq_idx);
>> +
>> + /*
>> + * First remove invalid epoll event, and then install
>> + * the new one. May be solved with a proper API in the
>> + * future.
>> + */
>> + epfd = handle->elist[rxq_idx].epfd;
>> + rev = handle->elist[rxq_idx];
>> + ret = rte_epoll_ctl(epfd, EPOLL_CTL_DEL, rev.fd,
>> + &handle->elist[rxq_idx]);
>> + if (ret) {
>> + VHOST_LOG(ERR, "Delete epoll event failed.\n");
>> + return ret;
>> + }
>> +
>> + rev.fd = handle->efds[rxq_idx];
>> + handle->elist[rxq_idx] = rev;
>> + ret = rte_epoll_ctl(epfd, EPOLL_CTL_ADD, rev.fd,
>> + &handle->elist[rxq_idx]);
>> + if (ret) {
>> + VHOST_LOG(ERR, "Add epoll event failed.\n");
>> + return ret;
>> + }
>> +
>> + return 0;
>> +}
>> +
>> static int
>> eth_rxq_intr_enable(struct rte_eth_dev *dev, uint16_t qid)
>> {
>> @@ -537,6 +579,11 @@ eth_rxq_intr_enable(struct rte_eth_dev *dev, uint16_t qid)
>> return -1;
>> }
>>
>> + rte_spinlock_lock(&vq->intr_lock);
>> + vq->intr_enable = 1;
>> + ret = eth_vhost_update_intr(dev, qid);
>> + rte_spinlock_unlock(&vq->intr_lock);
>> +
>> ret = rte_vhost_get_vhost_vring(vq->vid, (qid << 1) + 1, &vring);
>> if (ret < 0) {
>> VHOST_LOG(ERR, "Failed to get rxq%d's vring\n", qid);
>> @@ -571,6 +618,8 @@ eth_rxq_intr_disable(struct rte_eth_dev *dev, uint16_t qid)
>> rte_vhost_enable_guest_notification(vq->vid, (qid << 1) + 1, 0);
>> rte_wmb();
>>
>> + vq->intr_enable = 0;
>> +
>> return 0;
>> }
>>
>> @@ -593,7 +642,6 @@ eth_vhost_install_intr(struct rte_eth_dev *dev)
>> {
>> struct rte_vhost_vring vring;
>> struct vhost_queue *vq;
>> - int count = 0;
>> int nb_rxq = dev->data->nb_rx_queues;
>> int i;
>> int ret;
>> @@ -623,6 +671,8 @@ eth_vhost_install_intr(struct rte_eth_dev *dev)
>>
>> VHOST_LOG(INFO, "Prepare intr vec\n");
>> for (i = 0; i < nb_rxq; i++) {
>> + dev->intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + i;
>> + dev->intr_handle->efds[i] = -1;
>> vq = dev->data->rx_queues[i];
>> if (!vq) {
>> VHOST_LOG(INFO, "rxq-%d not setup yet, skip!\n", i);
>> @@ -641,14 +691,12 @@ eth_vhost_install_intr(struct rte_eth_dev *dev)
>> "rxq-%d's kickfd is invalid, skip!\n", i);
>> continue;
>> }
>> - dev->intr_handle->intr_vec[i] = RTE_INTR_VEC_RXTX_OFFSET + i;
>> dev->intr_handle->efds[i] = vring.kickfd;
>> - count++;
>> VHOST_LOG(INFO, "Installed intr vec for rxq-%d\n", i);
>> }
>>
>> - dev->intr_handle->nb_efd = count;
>> - dev->intr_handle->max_intr = count + 1;
>> + dev->intr_handle->nb_efd = nb_rxq;
>> + dev->intr_handle->max_intr = nb_rxq + 1;
>> dev->intr_handle->type = RTE_INTR_HANDLE_VDEV;
>>
>> return 0;
>> @@ -835,6 +883,7 @@ vring_conf_update(int vid, struct rte_eth_dev *eth_dev, uint16_t vring_id)
>> {
>> struct rte_eth_conf *dev_conf = &eth_dev->data->dev_conf;
>> struct pmd_internal *internal = eth_dev->data->dev_private;
>> + struct vhost_queue *vq;
>> struct rte_vhost_vring vring;
>> int rx_idx = vring_id % 2 ? (vring_id - 1) >> 1 : -1;
>> int ret = 0;
>> @@ -853,12 +902,18 @@ vring_conf_update(int vid, struct rte_eth_dev *eth_dev, uint16_t vring_id)
>> vring_id);
>> return ret;
>> }
>> + eth_dev->intr_handle->efds[rx_idx] = vring.kickfd;
>>
>> - if (vring.kickfd != eth_dev->intr_handle->efds[rx_idx]) {
>> - VHOST_LOG(INFO, "kickfd for rxq-%d was changed.\n",
>> - rx_idx);
>> - eth_dev->intr_handle->efds[rx_idx] = vring.kickfd;
>> + vq = eth_dev->data->rx_queues[rx_idx];
>> + if (!vq) {
>> + VHOST_LOG(ERR, "rxq%d is not setup yet\n", rx_idx);
>> + return -1;
>> }
>> +
>> + rte_spinlock_lock(&vq->intr_lock);
>> + if (vq->intr_enable)
>> + ret = eth_vhost_update_intr(eth_dev, rx_idx);
>> + rte_spinlock_unlock(&vq->intr_lock);
>> }
>>
>> return ret;
>> @@ -1152,6 +1207,7 @@ eth_rx_queue_setup(struct rte_eth_dev *dev, uint16_t rx_queue_id,
>>
>> vq->mb_pool = mb_pool;
>> vq->virtqueue_id = rx_queue_id * VIRTIO_QNUM + VIRTIO_TXQ;
>> + rte_spinlock_init(&vq->intr_lock);
>> dev->data->rx_queues[rx_queue_id] = vq;
>>
>> return 0;
>> @@ -1173,6 +1229,7 @@ eth_tx_queue_setup(struct rte_eth_dev *dev, uint16_t tx_queue_id,
>> }
>>
>> vq->virtqueue_id = tx_queue_id * VIRTIO_QNUM + VIRTIO_RXQ;
>> + rte_spinlock_init(&vq->intr_lock);
>> dev->data->tx_queues[tx_queue_id] = vq;
>>
>> return 0;
>> --
>> 2.26.2
>>
>
>
>
> --
> David Marchand
>
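
The remove-then-install sequence described in the commit message can be illustrated with plain Linux epoll, outside of DPDK. This is a minimal sketch under stated assumptions: `swap_watched_fd()` is a hypothetical helper, it calls `epoll_ctl()` directly rather than `rte_epoll_ctl()`, and it treats a negative old fd as "nothing registered yet" instead of deleting a placeholder event as the patch does.

```c
#include <stddef.h>
#include <sys/epoll.h>

/*
 * Replace the fd watched for one queue. old_fd < 0 means no fd was
 * registered yet (the "-1" placeholder case). Returns 0 on success,
 * -1 on failure with errno set by epoll_ctl().
 */
static int
swap_watched_fd(int epfd, int old_fd, int new_fd)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = new_fd };

	if (old_fd == new_fd)
		return 0;

	/* Drop the stale registration first, if there is one. */
	if (old_fd >= 0 && epoll_ctl(epfd, EPOLL_CTL_DEL, old_fd, NULL) < 0)
		return -1;

	/* Then register the fd that is now valid. */
	return epoll_ctl(epfd, EPOLL_CTL_ADD, new_fd, &ev);
}
```

As in the patch, the two steps are not atomic, so a caller that can race with another updater would need to serialize them, which is what `intr_lock` does in the driver.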
Thread overview: 9+ messages
2020-07-29 9:19 [dpdk-dev] [PATCH v3 0/3] Fix Vhost regressions Maxime Coquelin
2020-07-29 9:19 ` [dpdk-dev] [PATCH v3 1/3] vhost: fix guest notification setting Maxime Coquelin
2020-07-29 9:19 ` [dpdk-dev] [PATCH v3 2/3] net/vhost: fix queue update Maxime Coquelin
2020-07-29 9:20 ` [dpdk-dev] [PATCH v3 3/3] net/vhost: fix interrupt mode Maxime Coquelin
2020-07-29 11:27 ` David Marchand
2020-07-29 13:19 ` Maxime Coquelin [this message]
2020-07-29 12:53 ` Maxime Coquelin
2020-07-29 13:24 ` Xia, Chenbo
2020-07-29 13:27 ` Maxime Coquelin