From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0A68442808; Mon, 27 Mar 2023 15:21:43 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id DA0DA40EE1; Mon, 27 Mar 2023 15:21:42 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by mails.dpdk.org (Postfix) with ESMTP id C9DCF40ED8 for ; Mon, 27 Mar 2023 15:21:41 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679923301; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=RFEvIKXSYDdOuBinFj4ru7SyoUCkHJesvspqxSHZ3do=; b=X26V6s/R85h/GU2pUzxn03kc7PLiLXXE7EfKAZuXwQx34fAc/oCVqR72MUjD+HyvRITdmK nFKZEW8PFSNkuWbrxiwl8438Ufl5sL3WKxxxkmWoFYL9e0l8T9xUH+TH7/leJ9lpJtI9KN 2phw98axlmb4uhaB7+G1Mz4d4Ya1feQ= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-28-Dp2EU7fZM2uF2y-Ruuy9DQ-1; Mon, 27 Mar 2023 09:21:40 -0400 X-MC-Unique: Dp2EU7fZM2uF2y-Ruuy9DQ-1 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.rdu2.redhat.com [10.11.54.8]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id A9EE13C10144; Mon, 27 Mar 2023 13:21:39 +0000 (UTC) Received: from [10.39.208.21] (unknown [10.39.208.21]) by smtp.corp.redhat.com (Postfix) with ESMTPS id A5014C15BB8; Mon, 27 Mar 2023 13:21:38 +0000 (UTC) Message-ID: <09508737-d567-444f-9efa-87b06726e336@redhat.com> Date: Mon, 27 Mar 2023 15:21:37 +0200 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:102.0) Gecko/20100101 Thunderbird/102.8.0 To: Eelco Chaudron , chenbo.xia@intel.com Cc: dev@dpdk.org, David Marchand References: <167992139724.45323.17979512439014217881.stgit@ebuild.local> From: Maxime Coquelin Subject: Re: [PATCH] vhost: add device op to offload the interrupt kick In-Reply-To: <167992139724.45323.17979512439014217881.stgit@ebuild.local> X-Scanned-By: MIMEDefang 3.1 on 10.11.54.8 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Language: en-US Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Hi Eelco, On 3/27/23 14:51, Eelco Chaudron wrote: > This patch adds an operation callback which gets called every time the > library wants to call eventfd_write(). This eventfd_write() call could > result in a system call, which could potentially block the PMD thread. > > The callback function can decide whether it's ok to handle the > eventfd_write() now or have the newly introduced function, > rte_vhost_notify_guest(), called at a later time. > > This can be used by 3rd party applications, like OVS, to avoid system > calls being called as part of the PMD threads. That's a good idea, please find some comments inline: > Signed-off-by: Eelco Chaudron > --- > lib/vhost/rte_vhost.h | 10 +++++++++- > lib/vhost/vhost.c | 21 +++++++++++++++++++++ > lib/vhost/vhost.h | 43 ++++++++++++++++++++++++++++--------------- > 3 files changed, 58 insertions(+), 16 deletions(-) > > diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h > index 58a5d4be92..af7a394d0f 100644 > --- a/lib/vhost/rte_vhost.h > +++ b/lib/vhost/rte_vhost.h > @@ -298,7 +298,13 @@ struct rte_vhost_device_ops { > */ > void (*guest_notified)(int vid); > > - void *reserved[1]; /**< Reserved for future extension */ > + /** > + * If this callback is registered, notification to the guest can > + * be handled by the front-end calling rte_vhost_notify_guest(). > + * If it's not handled, 'false' should be returned. This can be used > + * to remove the "slow" eventfd_write() syscall from the datapath. > + */ > + bool (*guest_notify)(int vid, uint16_t queue_id); > }; > > /** > @@ -433,6 +439,8 @@ void rte_vhost_log_used_vring(int vid, uint16_t vring_idx, > > int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable); > > +void rte_vhost_notify_guest(int vid, uint16_t queue_id); The new API needs to be tagged as experimental, and also documented. (I see rte_vhost_enable_guest_notification is not properly documented, so not a good example!) > + > /** > * Register vhost driver. path could be different for multiple > * instance support. > diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c > index ef37943817..ee090d78ef 100644 > --- a/lib/vhost/vhost.c > +++ b/lib/vhost/vhost.c > @@ -1467,6 +1467,27 @@ rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, int enable) > return ret; > } > > +void > +rte_vhost_notify_guest(int vid, uint16_t queue_id) > +{ > + struct virtio_net *dev = get_device(vid); > + struct vhost_virtqueue *vq; > + > + if (!dev || queue_id >= VHOST_MAX_VRING) > + return; > + > + vq = dev->virtqueue[queue_id]; > + if (!vq) > + return; > + > + rte_spinlock_lock(&vq->access_lock); > + > + if (vq->callfd >= 0) > + eventfd_write(vq->callfd, (eventfd_t)1); Maybe we should return an error of callfd is invalid or eventfd_write() failed. > + > + rte_spinlock_unlock(&vq->access_lock); > +} > + > void > rte_vhost_log_write(int vid, uint64_t addr, uint64_t len) > { > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h > index 8fdab13c70..39ad8260a1 100644 > --- a/lib/vhost/vhost.h > +++ b/lib/vhost/vhost.h > @@ -883,6 +883,30 @@ vhost_need_event(uint16_t event_idx, uint16_t new_idx, uint16_t old) > return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - old); > } > > +static __rte_always_inline void > +vhost_vring_kick_guest(struct virtio_net *dev, struct vhost_virtqueue *vq) > +{ > + if (dev->notify_ops->guest_notify) { > + uint16_t qid; > + for (qid = 0; qid < dev->nr_vring; qid++) { > + if (dev->virtqueue[qid] == vq) { > + if (dev->notify_ops->guest_notify(dev->vid, > + qid)) > + goto done; > + break; > + } > + } Since v22.11, you no more need to iterate through the port's virtqueues, David introduced index field in vhost_virtuqueue in v22.11 (57e414e3ec29 ("vhost: keep a reference to virtqueue index")). > + } > + eventfd_write(vq->callfd, (eventfd_t) 1); > + > +done: > + if (dev->flags & VIRTIO_DEV_STATS_ENABLED) > + vq->stats.guest_notifications++; > + if (dev->notify_ops->guest_notified) > + dev->notify_ops->guest_notified(dev->vid); > +} FYI, I have done almost the same refactoring in the VDUSE series I will soon send: https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commit/7089a8a6d1e89b9db90547eb5b4fef886f24fd5b My version also introduces a new counter in case the notification failed. Maybe we could also have a counter in case the notification has been "offloaded" to the application? > + > + > static __rte_always_inline void > vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) > { > @@ -905,21 +929,13 @@ vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue *vq) > if ((vhost_need_event(vhost_used_event(vq), new, old) && > (vq->callfd >= 0)) || > unlikely(!signalled_used_valid)) { > - eventfd_write(vq->callfd, (eventfd_t) 1); > - if (dev->flags & VIRTIO_DEV_STATS_ENABLED) > - vq->stats.guest_notifications++; > - if (dev->notify_ops->guest_notified) > - dev->notify_ops->guest_notified(dev->vid); > + vhost_vring_kick_guest(dev, vq); > } > } else { > /* Kick the guest if necessary. */ > if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT) > && (vq->callfd >= 0)) { > - eventfd_write(vq->callfd, (eventfd_t)1); > - if (dev->flags & VIRTIO_DEV_STATS_ENABLED) > - vq->stats.guest_notifications++; > - if (dev->notify_ops->guest_notified) > - dev->notify_ops->guest_notified(dev->vid); > + vhost_vring_kick_guest(dev, vq); > } > } > } > @@ -971,11 +987,8 @@ vhost_vring_call_packed(struct virtio_net *dev, struct vhost_virtqueue *vq) > if (vhost_need_event(off, new, old)) > kick = true; > kick: > - if (kick) { > - eventfd_write(vq->callfd, (eventfd_t)1); > - if (dev->notify_ops->guest_notified) > - dev->notify_ops->guest_notified(dev->vid); > - } > + if (kick) > + vhost_vring_kick_guest(dev, vq); > } > > static __rte_always_inline void >