From: Eugenio Pérez
To: dev@dpdk.org, yong.liu@intel.com
Cc: Maxime Coquelin, Adrian Moreno Zapata, Jason Wang, "Michael S. Tsirkin"
Date: Wed, 29 Jan 2020 20:33:10 +0100
Message-Id: <20200129193310.9157-1-eperezma@redhat.com>
List-Id: DPDK patches and discussions
Subject: [dpdk-dev] [PATCH] vhost: flush shadow tx if there is no more packets

The current implementation of vhost_net in packed vring tries to fill the
shadow vector before sending any actual changes to the guest. While this
can be beneficial for throughput, it conflicts with some
bufferbloat-avoidance mechanisms such as the Linux kernel NAPI, which stops
transmitting packets if there are too many bytes/buffers in the driver.

To solve it, we flush the shadow packets at the end of
virtio_dev_tx_packed if we have starved the vring, i.e., the next buffer
is not available for the device.

Since this last check can be expensive because of the atomic operation
involved, we only perform it if we have not obtained the expected (count)
packets. If we do obtain "count" packets and no more packets are
available, the caller needs to call virtio_dev_tx_packed again.

Signed-off-by: Eugenio Pérez
---
 lib/librte_vhost/virtio_net.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c
index 21c311732..ac2842b2d 100644
--- a/lib/librte_vhost/virtio_net.c
+++ b/lib/librte_vhost/virtio_net.c
@@ -2133,6 +2133,20 @@ virtio_dev_tx_packed_zmbuf(struct virtio_net *dev,
 	return pkt_idx;
 }
 
+static __rte_always_inline bool
+next_desc_is_avail(const struct vhost_virtqueue *vq)
+{
+	bool wrap_counter = vq->avail_wrap_counter;
+	uint16_t next_used_idx = vq->last_used_idx + 1;
+
+	if (next_used_idx >= vq->size) {
+		next_used_idx -= vq->size;
+		wrap_counter ^= 1;
+	}
+
+	return desc_is_avail(&vq->desc_packed[next_used_idx], wrap_counter);
+}
+
 static __rte_noinline uint16_t
 virtio_dev_tx_packed(struct virtio_net *dev,
 		     struct vhost_virtqueue *vq,
@@ -2165,9 +2179,20 @@ virtio_dev_tx_packed(struct virtio_net *dev,
 
 	} while (remained);
 
-	if (vq->shadow_used_idx)
+	if (vq->shadow_used_idx) {
 		do_data_copy_dequeue(vq);
 
+		if (remained && !next_desc_is_avail(vq)) {
+			/*
+			 * The guest may be waiting to TX some buffers to
+			 * enqueue more to avoid bufferbloat, so we try to
+			 * reduce latency here.
+			 */
+			vhost_flush_dequeue_shadow_packed(dev, vq);
+			vhost_vring_call_packed(dev, vq);
+		}
+	}
+
 	return pkt_idx;
 }
 
-- 
2.18.1
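
For readers who want to try the flush condition outside the DPDK tree, here
is a minimal standalone sketch of the wrap-aware "is the next descriptor
available?" test and of the decision the patch adds at the end of
virtio_dev_tx_packed. Everything prefixed with toy_ or TOY_ is hypothetical
and exists only for illustration; it is not the vhost library's API. The
flag bit positions (AVAIL at bit 7, USED at bit 15) follow the virtio 1.1
packed ring layout. The sketch reads the flags with a plain load and ignores
memory ordering, which the real code has to take care of.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Descriptor flag bits as laid out by the virtio 1.1 packed ring. */
#define TOY_DESC_F_AVAIL (1u << 7)
#define TOY_DESC_F_USED  (1u << 15)

/* Hypothetical, stripped-down stand-in for a packed descriptor. */
struct toy_desc {
	uint16_t flags;
};

/* Hypothetical, stripped-down stand-in for struct vhost_virtqueue. */
struct toy_vq {
	struct toy_desc *desc;
	uint16_t size;
	uint16_t last_used_idx;
	bool avail_wrap_counter;
};

/*
 * A descriptor is available to the device when its AVAIL bit matches the
 * expected wrap counter and its USED bit does not.
 */
static inline bool
toy_desc_is_avail(const struct toy_desc *desc, bool wrap_counter)
{
	uint16_t flags = desc->flags; /* plain load: no ordering in this sketch */
	bool avail = (flags & TOY_DESC_F_AVAIL) != 0;
	bool used = (flags & TOY_DESC_F_USED) != 0;

	return avail == wrap_counter && used != wrap_counter;
}

/*
 * Same index arithmetic as next_desc_is_avail() in the patch: step one
 * slot past last_used_idx and flip the wrap counter when the index wraps.
 */
static inline bool
toy_next_desc_is_avail(const struct toy_vq *vq)
{
	bool wrap_counter = vq->avail_wrap_counter;
	uint16_t next = vq->last_used_idx + 1;

	assert(vq->size > 0);
	if (next >= vq->size) {
		next -= vq->size;
		wrap_counter = !wrap_counter;
	}

	return toy_desc_is_avail(&vq->desc[next], wrap_counter);
}

/*
 * Flush decision from the patch: only when something is shadowed, fewer
 * packets than requested were obtained, and the next slot is not available.
 */
static inline bool
toy_should_flush(const struct toy_vq *vq, uint16_t remained,
		 uint16_t shadow_used)
{
	return shadow_used != 0 && remained != 0 &&
		!toy_next_desc_is_avail(vq);
}
```

Because "remained" is tested before the availability check, the check is
skipped whenever the burst was completely filled, which is the cost-saving
ordering the commit message describes.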