From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 0B5C64284B; Mon, 27 Mar 2023 16:10:37 +0200 (CEST) Received: from mails.dpdk.org (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id 93A2C40EE1; Mon, 27 Mar 2023 16:10:36 +0200 (CEST) Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by mails.dpdk.org (Postfix) with ESMTP id DE42340ED8 for ; Mon, 27 Mar 2023 16:10:34 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1679926234; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Pk0zuN4XHgNPFXk0X6FeoDT+Nit9qL+MhEZTFUSfZmY=; b=fU/1udTJqOS04RTxxhVMS5DGeKFuB1kBLs5tByDn2ZOqWtNsr++HCTtGvrNmPzyNxWD6ul EitD4lk49PG0WrS4tjuyMguwRwiiCqnqQNyWiVgC/MPMsi/ZA7WbH1HzqkOzhQRdeSQKOG z+a0jGtuaXR8oU/ClSxIdR4zXmWUbKg= Received: from mail-ed1-f72.google.com (mail-ed1-f72.google.com [209.85.208.72]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-57-wCn5o03PO--c5WMfX18DkQ-1; Mon, 27 Mar 2023 10:10:32 -0400 X-MC-Unique: wCn5o03PO--c5WMfX18DkQ-1 Received: by mail-ed1-f72.google.com with SMTP id m18-20020a50d7d2000000b00501dfd867a4so12936774edj.20 for ; Mon, 27 Mar 2023 07:10:32 -0700 (PDT) X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; t=1679926231; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=l/ckQ9bJQHGzLHn+gqdPqCttkoKMzlE38GXKUrSQzdA=; b=cA5C7xFwYn5IWI1YJ80OmRTIAP/uTHTlqo32XIWUJ7KK8FwecDbYaA4k0Jf/8m/eV0 1Yotsqk4sMbc7KCyFohZLo+0DKn2KQ9ZH9ezBLp36MULoBMFyVlPSTmZjQKrfaFWpkme XQpVo3A5ymnBG5sXgiEC8haz+LnrD6rgaoTU2vh441XwwyibU1BERe7Br/YtzQyfyutl NLH12As/4d8ROKi2kLE9hcsTi8NBrCz71lxQYivMiJmZKzWx+UvK0Sc5i23d4RkB/9r+ 5yAhM2q5Eqg5NKWAcdx0bSR339b/Dj1jgb31cIOt9r1Sjxf4oyFxrIpH+BDHPN+oRA2B MGUA== X-Gm-Message-State: AAQBX9f9PSpWGp7LOxOl3aGknRtsJadaJ5fAh8K8ks3jD5GZSfvWjiX2 e2b0HBmtOCASzhasO/I3zr8Qq2ZEPRFogdnyfEDew7PFwmvziEwnE91nd9RKIbx6U+B8QHMU6XJ pLes= X-Received: by 2002:a17:906:5850:b0:935:3028:ff58 with SMTP id h16-20020a170906585000b009353028ff58mr12080372ejs.55.1679926231749; Mon, 27 Mar 2023 07:10:31 -0700 (PDT) X-Google-Smtp-Source: AKy350atwA40UyeEUoWZSV1aacX6oQdlhj3jletiW4DXcZMQLymxxv5+eNXoxloTE8qays/4NFJ38A== X-Received: by 2002:a17:906:5850:b0:935:3028:ff58 with SMTP id h16-20020a170906585000b009353028ff58mr12080334ejs.55.1679926231277; Mon, 27 Mar 2023 07:10:31 -0700 (PDT) Received: from [10.39.192.159] (5920ab7b.static.cust.trined.nl. [89.32.171.123]) by smtp.gmail.com with ESMTPSA id bb14-20020a1709070a0e00b0093f322187f0sm3399833ejc.189.2023.03.27.07.10.30 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Mon, 27 Mar 2023 07:10:30 -0700 (PDT) From: Eelco Chaudron To: Maxime Coquelin Cc: chenbo.xia@intel.com, dev@dpdk.org, David Marchand Subject: Re: [PATCH] vhost: add device op to offload the interrupt kick Date: Mon, 27 Mar 2023 16:10:29 +0200 X-Mailer: MailMate (1.14r5950) Message-ID: <223DB816-D925-4014-AF9F-2D8E6F0B5F79@redhat.com> In-Reply-To: <09508737-d567-444f-9efa-87b06726e336@redhat.com> References: <167992139724.45323.17979512439014217881.stgit@ebuild.local> <09508737-d567-444f-9efa-87b06726e336@redhat.com> MIME-Version: 1.0 X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Content-Type: text/plain Content-Transfer-Encoding: quoted-printable X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org On 27 Mar 2023, at 15:21, Maxime Coquelin wrote: > Hi Eelco, > > On 3/27/23 14:51, Eelco Chaudron wrote: >> This patch adds an operation callback which gets called every time the >> library wants to call eventfd_write(). This eventfd_write() call could >> result in a system call, which could potentially block the PMD thread. >> >> The callback function can decide whether it's ok to handle the >> eventfd_write() now or have the newly introduced function, >> rte_vhost_notify_guest(), called at a later time. >> >> This can be used by 3rd party applications, like OVS, to avoid system >> calls being called as part of the PMD threads. > > That's a good idea, please find some comments inline: Thanks for the review, see inline. > >> Signed-off-by: Eelco Chaudron >> --- >> lib/vhost/rte_vhost.h | 10 +++++++++- >> lib/vhost/vhost.c | 21 +++++++++++++++++++++ >> lib/vhost/vhost.h | 43 ++++++++++++++++++++++++++++-------------= -- >> 3 files changed, 58 insertions(+), 16 deletions(-) >> >> diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h >> index 58a5d4be92..af7a394d0f 100644 >> --- a/lib/vhost/rte_vhost.h >> +++ b/lib/vhost/rte_vhost.h >> @@ -298,7 +298,13 @@ struct rte_vhost_device_ops { >> =09 */ >> =09void (*guest_notified)(int vid); >> -=09void *reserved[1]; /**< Reserved for future extension */ >> +=09/** >> +=09 * If this callback is registered, notification to the guest can >> +=09 * be handled by the front-end calling rte_vhost_notify_guest(). >> +=09 * If it's not handled, 'false' should be returned. This can be used >> +=09 * to remove the "slow" eventfd_write() syscall from the datapath. >> +=09 */ >> +=09bool (*guest_notify)(int vid, uint16_t queue_id); >> }; >> /** >> @@ -433,6 +439,8 @@ void rte_vhost_log_used_vring(int vid, uint16_t vrin= g_idx, >> int rte_vhost_enable_guest_notification(int vid, uint16_t queue_id, i= nt enable); >> +void rte_vhost_notify_guest(int vid, uint16_t queue_id); > > The new API needs to be tagged as experimental, and also documented. (I > see rte_vhost_enable_guest_notification is not properly documented, so > not a good example!) I used exactly that as an example, so thought it should be good ;) Will add this in the next revision... >> + >> /** >> * Register vhost driver. path could be different for multiple >> * instance support. >> diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c >> index ef37943817..ee090d78ef 100644 >> --- a/lib/vhost/vhost.c >> +++ b/lib/vhost/vhost.c >> @@ -1467,6 +1467,27 @@ rte_vhost_enable_guest_notification(int vid, uint= 16_t queue_id, int enable) >> =09return ret; >> } >> +void >> +rte_vhost_notify_guest(int vid, uint16_t queue_id) >> +{ >> +=09struct virtio_net *dev =3D get_device(vid); >> +=09struct vhost_virtqueue *vq; >> + >> +=09if (!dev || queue_id >=3D VHOST_MAX_VRING) >> +=09=09return; >> + >> +=09vq =3D dev->virtqueue[queue_id]; >> +=09if (!vq) >> +=09=09return; >> + >> +=09rte_spinlock_lock(&vq->access_lock); >> + >> +=09if (vq->callfd >=3D 0) >> +=09=09eventfd_write(vq->callfd, (eventfd_t)1); > > Maybe we should return an error of callfd is invalid or eventfd_write() > failed. See below >> + >> +=09rte_spinlock_unlock(&vq->access_lock); >> +} >> + >> void >> rte_vhost_log_write(int vid, uint64_t addr, uint64_t len) >> { >> diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h >> index 8fdab13c70..39ad8260a1 100644 >> --- a/lib/vhost/vhost.h >> +++ b/lib/vhost/vhost.h >> @@ -883,6 +883,30 @@ vhost_need_event(uint16_t event_idx, uint16_t new_i= dx, uint16_t old) >> =09return (uint16_t)(new_idx - event_idx - 1) < (uint16_t)(new_idx - o= ld); >> } >> +static __rte_always_inline void >> +vhost_vring_kick_guest(struct virtio_net *dev, struct vhost_virtqueue *= vq) >> +{ >> +=09if (dev->notify_ops->guest_notify) { >> +=09=09uint16_t qid; >> +=09=09for (qid =3D 0; qid < dev->nr_vring; qid++) { >> +=09=09=09if (dev->virtqueue[qid] =3D=3D vq) { >> +=09=09=09=09if (dev->notify_ops->guest_notify(dev->vid, >> +=09=09=09=09=09=09=09=09 qid)) >> +=09=09=09=09=09goto done; >> +=09=09=09=09break; >> +=09=09=09} >> +=09=09} > > Since v22.11, you no more need to iterate through the port's virtqueues, > David introduced index field in vhost_virtuqueue in v22.11 (57e414e3ec29 > ("vhost: keep a reference to virtqueue index")). Thanks will change this, I did most of this a while back and did not re-che= ck. >> +=09} >> +=09eventfd_write(vq->callfd, (eventfd_t) 1); >> + >> +done: >> +=09if (dev->flags & VIRTIO_DEV_STATS_ENABLED) >> +=09=09vq->stats.guest_notifications++; >> +=09if (dev->notify_ops->guest_notified) >> +=09=09dev->notify_ops->guest_notified(dev->vid); >> +} > > FYI, I have done almost the same refactoring in the VDUSE series I will > soon send: > > https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commit/7089a8a6d1e89b9db9= 0547eb5b4fef886f24fd5b > > My version also introduces a new counter in case the notification > failed. Maybe we could also have a counter in case the notification has > been "offloaded" to the application? What would be the best approach!? I can wait to send a new rev until your p= atch is out. >> + >> + >> static __rte_always_inline void >> vhost_vring_call_split(struct virtio_net *dev, struct vhost_virtqueue = *vq) >> { >> @@ -905,21 +929,13 @@ vhost_vring_call_split(struct virtio_net *dev, str= uct vhost_virtqueue *vq) >> =09=09if ((vhost_need_event(vhost_used_event(vq), new, old) && >> =09=09=09=09=09(vq->callfd >=3D 0)) || >> =09=09=09=09unlikely(!signalled_used_valid)) { >> -=09=09=09eventfd_write(vq->callfd, (eventfd_t) 1); >> -=09=09=09if (dev->flags & VIRTIO_DEV_STATS_ENABLED) >> -=09=09=09=09vq->stats.guest_notifications++; >> -=09=09=09if (dev->notify_ops->guest_notified) >> -=09=09=09=09dev->notify_ops->guest_notified(dev->vid); >> +=09=09=09vhost_vring_kick_guest(dev, vq); >> =09=09} >> =09} else { >> =09=09/* Kick the guest if necessary. */ >> =09=09if (!(vq->avail->flags & VRING_AVAIL_F_NO_INTERRUPT) >> =09=09=09=09&& (vq->callfd >=3D 0)) { >> -=09=09=09eventfd_write(vq->callfd, (eventfd_t)1); >> -=09=09=09if (dev->flags & VIRTIO_DEV_STATS_ENABLED) >> -=09=09=09=09vq->stats.guest_notifications++; >> -=09=09=09if (dev->notify_ops->guest_notified) >> -=09=09=09=09dev->notify_ops->guest_notified(dev->vid); >> +=09=09=09vhost_vring_kick_guest(dev, vq); >> =09=09} >> =09} >> } >> @@ -971,11 +987,8 @@ vhost_vring_call_packed(struct virtio_net *dev, str= uct vhost_virtqueue *vq) >> =09if (vhost_need_event(off, new, old)) >> =09=09kick =3D true; >> kick: >> -=09if (kick) { >> -=09=09eventfd_write(vq->callfd, (eventfd_t)1); >> -=09=09if (dev->notify_ops->guest_notified) >> -=09=09=09dev->notify_ops->guest_notified(dev->vid); >> -=09} >> +=09if (kick) >> +=09=09vhost_vring_kick_guest(dev, vq); >> } >> static __rte_always_inline void >>