From mboxrd@z Thu Jan  1 00:00:00 1970
From: Andrey Ignatov
To: dev@dpdk.org
Cc: Maxime Coquelin, Chenbo Xia, Wei Shen, Andrey Ignatov
Subject: [PATCH] vhost: optimize mbuf allocation in virtio Tx packed path
Date: Thu, 28 Mar 2024 16:33:38 -0700
Message-id: <20240328233338.56544-1-rdna@apple.com>
X-Mailer: git-send-email 2.39.3 (Apple Git-147)
List-Id: DPDK patches and discussions

Currently virtio_dev_tx_packed() always allocates the requested @count of
packets, no matter how many packets are really available on the virtio Tx
ring.
Later it has to free all packets it didn't use and if, for example, there
were zero available packets on the ring, then all @count mbufs would be
allocated just to be freed afterwards. This wastes CPU cycles, since
rte_pktmbuf_alloc_bulk() / rte_pktmbuf_free_bulk() do quite a lot of work.

Optimize it by using the same idea as virtio_dev_tx_split() uses on the Tx
split path: estimate the number of available entries on the ring and
allocate only that number of mbufs. On the split path the estimate is
cheap. On the packed path it's more work, since it requires checking the
flags of up to @count descriptors. Still, it's much less expensive than
the alloc/free pair.

The new get_nb_avail_entries_packed() function doesn't change how
virtio_dev_tx_packed() works with regard to memory barriers, since the
barrier between checking the flags and the other descriptor fields is
still in place later, in virtio_dev_tx_batch_packed() and
virtio_dev_tx_single_packed().

The difference for a guest transmitting ~17Gbps with MTU 1500, as seen in
a `perf record` / `perf report` (at lower pps the savings will be bigger):

* Before the change:

    Samples: 18K of event 'cycles:P', Event count (approx.): 19206831288
      Children      Self  Pid:Command
    - 100.00%   100.00%  798808:dpdk-worker1
       <... skip ...>
       - 99.09% pkt_burst_io_forward
          - 90.26% common_fwd_stream_receive
             - 90.04% rte_eth_rx_burst
                - 75.53% eth_vhost_rx
                   - 74.29% rte_vhost_dequeue_burst
                      - 71.48% virtio_dev_tx_packed_compliant
                         + 17.11% rte_pktmbuf_alloc_bulk
                         + 11.80% rte_pktmbuf_free_bulk
                         + 2.11% vhost_user_inject_irq
                           0.75% rte_pktmbuf_reset
                           0.53% __rte_pktmbuf_free_seg_via_array
                        0.88% vhost_queue_stats_update
                + 13.66% mlx5_rx_burst_vec
          + 8.69% common_fwd_stream_transmit

* After:

    Samples: 18K of event 'cycles:P', Event count (approx.): 19225310840
      Children      Self  Pid:Command
    - 100.00%   100.00%  859754:dpdk-worker1
       <... skip ...>
       - 98.61% pkt_burst_io_forward
          - 86.29% common_fwd_stream_receive
             - 85.84% rte_eth_rx_burst
                - 61.94% eth_vhost_rx
                   - 60.05% rte_vhost_dequeue_burst
                      - 55.98% virtio_dev_tx_packed_compliant
                         + 3.43% rte_pktmbuf_alloc_bulk
                         + 2.50% vhost_user_inject_irq
                           1.17% vhost_queue_stats_update
                           0.76% rte_rwlock_read_unlock
                           0.54% rte_rwlock_read_trylock
                + 22.21% mlx5_rx_burst_vec
          + 12.00% common_fwd_stream_transmit

It can be seen that virtio_dev_tx_packed_compliant() goes from 71.48% to
55.98%, with rte_pktmbuf_alloc_bulk() going from 17.11% to 3.43% and
rte_pktmbuf_free_bulk() going away completely.

Signed-off-by: Andrey Ignatov
---
 lib/vhost/virtio_net.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)

diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index 1359c5fb1f..b406b5d7d9 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -3484,6 +3484,35 @@ virtio_dev_tx_single_packed(struct virtio_net *dev,
 	return ret;
 }
 
+static __rte_always_inline uint16_t
+get_nb_avail_entries_packed(const struct vhost_virtqueue *__rte_restrict vq,
+			    uint16_t max_nb_avail_entries)
+{
+	const struct vring_packed_desc *descs = vq->desc_packed;
+	bool avail_wrap = vq->avail_wrap_counter;
+	uint16_t avail_idx = vq->last_avail_idx;
+	uint16_t nb_avail_entries = 0;
+	uint16_t flags;
+
+	while (nb_avail_entries < max_nb_avail_entries) {
+		flags = descs[avail_idx].flags;
+
+		if ((avail_wrap != !!(flags & VRING_DESC_F_AVAIL)) ||
+		    (avail_wrap == !!(flags & VRING_DESC_F_USED)))
+			return nb_avail_entries;
+
+		if (!(flags & VRING_DESC_F_NEXT))
+			++nb_avail_entries;
+
+		if (unlikely(++avail_idx >= vq->size)) {
+			avail_idx -= vq->size;
+			avail_wrap = !avail_wrap;
+		}
+	}
+
+	return nb_avail_entries;
+}
+
 __rte_always_inline
 static uint16_t
 virtio_dev_tx_packed(struct virtio_net *dev,
@@ -3497,6 +3526,10 @@ virtio_dev_tx_packed(struct virtio_net *dev,
 {
 	uint32_t pkt_idx = 0;
 
+	count = get_nb_avail_entries_packed(vq, count);
+	if (count == 0)
+		return 0;
+
 	if (rte_pktmbuf_alloc_bulk(mbuf_pool, pkts, count)) {
 		vq->stats.mbuf_alloc_failed += count;
 		return 0;
-- 
2.41.0