From: David Marchand <david.marchand@redhat.com>
To: dev@dpdk.org
Cc: maxime.coquelin@redhat.com, olivier.matz@6wind.com, fbl@sysclose.org,
 i.maximets@ovn.org, chenbo.xia@intel.com, ian.stokes@intel.com,
 stable@dpdk.org, Jijiang Liu, Yuanhan Liu
Date: Mon, 3 May 2021 15:26:46 +0200
Message-Id: <20210503132646.16076-5-david.marchand@redhat.com>
In-Reply-To: <20210503132646.16076-1-david.marchand@redhat.com>
References: <20210401095243.18211-1-david.marchand@redhat.com>
 <20210503132646.16076-1-david.marchand@redhat.com>
Subject: [dpdk-dev] [PATCH v3 4/4] vhost: fix offload flags in Rx path

The vhost library currently configures Tx offloading (PKT_TX_*) on any
packet received from a guest virtio device that asks for some offloading.

This is problematic, as Tx offloading is something the application must
ask for: the application needs to configure its devices to support every
offload in use (IP/TCP checksumming, TSO...), and the various l2/l3/l4
lengths must be set following any processing that happened in the
application itself.

On the other hand, received packets are not marked with the current
packet's l3/l4 checksum status.
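As an illustration (not part of the patch itself): a minimal sketch of how
an application could opt in to the compliant behavior introduced by this
series. The socket path is a placeholder and error handling is reduced to
the minimum; rte_vhost_driver_register() and rte_vhost_driver_start() are
the existing vhost-user registration API.

	#include <rte_vhost.h>

	/* Sketch: register a vhost-user socket with the new flag so that
	 * dequeued mbufs carry Rx offload metadata (PKT_RX_*) instead of
	 * the legacy Tx flags. "/tmp/vhost.sock" is a placeholder path.
	 */
	static int
	register_compliant_vhost_socket(void)
	{
		const char *path = "/tmp/vhost.sock";

		if (rte_vhost_driver_register(path,
				RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS) != 0)
			return -1;
		return rte_vhost_driver_start(path);
	}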
Copy the virtio PMD Rx processing to fix those offload flags, with some
differences:
- accept VIRTIO_NET_HDR_GSO_ECN and VIRTIO_NET_HDR_GSO_UDP,
- ignore anything but the VIRTIO_NET_HDR_F_NEEDS_CSUM flag (to comply
  with the virtio spec).

Some applications might rely on the current behavior, so it is left
untouched by default. A new RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS flag
is added to enable the new behavior.

The vhost example has been updated for the new behavior: TSO is applied
to any packet marked LRO.

Fixes: 859b480d5afd ("vhost: add guest offload setting")
Cc: stable@dpdk.org

Signed-off-by: David Marchand <david.marchand@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
---
Changes since v2:
- introduced a new flag to keep the existing behavior as the default,
- packets with an unrecognised offload request are passed to the
  application with no offload metadata rather than dropped,
- ignored VIRTIO_NET_HDR_F_DATA_VALID, since the virtio spec states that
  the virtio driver is not allowed to use this flag when transmitting
  packets.

Changes since v1:
- updated the vhost example,
- restored VIRTIO_NET_HDR_GSO_ECN and VIRTIO_NET_HDR_GSO_UDP support,
- restored the log on buggy offload requests.

---
 doc/guides/prog_guide/vhost_lib.rst    |  12 ++
 doc/guides/rel_notes/release_21_05.rst |   6 +
 drivers/net/vhost/rte_eth_vhost.c      |   2 +-
 examples/vhost/main.c                  |  44 +++---
 lib/vhost/rte_vhost.h                  |   1 +
 lib/vhost/socket.c                     |   5 +-
 lib/vhost/vhost.c                      |   6 +-
 lib/vhost/vhost.h                      |  14 +-
 lib/vhost/virtio_net.c                 | 185 ++++++++++++++++++++++---
 9 files changed, 222 insertions(+), 53 deletions(-)

diff --git a/doc/guides/prog_guide/vhost_lib.rst b/doc/guides/prog_guide/vhost_lib.rst
index dc29229167..042875a9ca 100644
--- a/doc/guides/prog_guide/vhost_lib.rst
+++ b/doc/guides/prog_guide/vhost_lib.rst
@@ -118,6 +118,18 @@ The following is an overview of some key Vhost API functions:
 
     It is disabled by default.
 
+  - ``RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS``
+
+    Since v16.04, the vhost library forwards checksum and GSO requests for
+    packets received from a virtio driver by filling Tx offload metadata in
+    the mbuf. This behavior is inconsistent with other drivers but it is
+    left untouched for existing applications that might rely on it.
+
+    This flag disables the legacy behavior and instead asks vhost to simply
+    populate Rx offload metadata in the mbuf.
+
+    It is disabled by default.
+
 * ``rte_vhost_driver_set_features(path, features)``
 
   This function sets the feature bits the vhost-user driver supports. The
diff --git a/doc/guides/rel_notes/release_21_05.rst b/doc/guides/rel_notes/release_21_05.rst
index b3224dc332..1cb06ce487 100644
--- a/doc/guides/rel_notes/release_21_05.rst
+++ b/doc/guides/rel_notes/release_21_05.rst
@@ -329,6 +329,12 @@ API Changes
   ``policer_action_recolor_supported`` and ``policer_action_drop_supported``
   have been removed.
 
+* vhost: The vhost library currently populates received mbufs from a virtio
+  driver with Tx offload flags while not filling Rx offload flags.
+  While this behavior is arguable, it is kept untouched.
+  A new flag ``RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS`` has been added to ask
+  for a behavior compliant with the mbuf offload API.
+
 
 ABI Changes
 -----------
 
diff --git a/drivers/net/vhost/rte_eth_vhost.c b/drivers/net/vhost/rte_eth_vhost.c
index d198fc8a8e..281379d6a3 100644
--- a/drivers/net/vhost/rte_eth_vhost.c
+++ b/drivers/net/vhost/rte_eth_vhost.c
@@ -1505,7 +1505,7 @@ rte_pmd_vhost_probe(struct rte_vdev_device *dev)
 	int ret = 0;
 	char *iface_name;
 	uint16_t queues;
-	uint64_t flags = 0;
+	uint64_t flags = RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS;
 	uint64_t disable_flags = 0;
 	int client_mode = 0;
 	int iommu_support = 0;
diff --git a/examples/vhost/main.c b/examples/vhost/main.c
index ff48ba270d..64295aaf7e 100644
--- a/examples/vhost/main.c
+++ b/examples/vhost/main.c
@@ -19,6 +19,7 @@
 #include <rte_log.h>
 #include <rte_string_fns.h>
 #include <rte_malloc.h>
+#include <rte_net.h>
 #include <rte_vhost.h>
 #include <rte_ip.h>
 #include <rte_tcp.h>
@@ -1032,33 +1033,34 @@ find_local_dest(struct vhost_dev *vdev, struct rte_mbuf *m,
 	return 0;
 }
 
-static uint16_t
-get_psd_sum(void *l3_hdr, uint64_t ol_flags)
-{
-	if (ol_flags & PKT_TX_IPV4)
-		return rte_ipv4_phdr_cksum(l3_hdr, ol_flags);
-	else /* assume ethertype == RTE_ETHER_TYPE_IPV6 */
-		return rte_ipv6_phdr_cksum(l3_hdr, ol_flags);
-}
-
 static void virtio_tx_offload(struct rte_mbuf *m)
 {
+	struct rte_net_hdr_lens hdr_lens;
+	struct rte_ipv4_hdr *ipv4_hdr;
+	struct rte_tcp_hdr *tcp_hdr;
+	uint32_t ptype;
 	void *l3_hdr;
-	struct rte_ipv4_hdr *ipv4_hdr = NULL;
-	struct rte_tcp_hdr *tcp_hdr = NULL;
-	struct rte_ether_hdr *eth_hdr =
-		rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
 
-	l3_hdr = (char *)eth_hdr + m->l2_len;
+	ptype = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK);
+	m->l2_len = hdr_lens.l2_len;
+	m->l3_len = hdr_lens.l3_len;
+	m->l4_len = hdr_lens.l4_len;
 
-	if (m->ol_flags & PKT_TX_IPV4) {
+	l3_hdr = rte_pktmbuf_mtod_offset(m, void *, m->l2_len);
+	tcp_hdr = rte_pktmbuf_mtod_offset(m, struct rte_tcp_hdr *,
+		m->l2_len + m->l3_len);
+
+	m->ol_flags |= PKT_TX_TCP_SEG;
+	if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4) {
+		m->ol_flags |= PKT_TX_IPV4;
+		m->ol_flags |= PKT_TX_IP_CKSUM;
 		ipv4_hdr = l3_hdr;
 		ipv4_hdr->hdr_checksum = 0;
-		m->ol_flags |= PKT_TX_IP_CKSUM;
+		tcp_hdr->cksum = rte_ipv4_phdr_cksum(l3_hdr, m->ol_flags);
+	} else { /* assume ethertype == RTE_ETHER_TYPE_IPV6 */
+		m->ol_flags |= PKT_TX_IPV6;
+		tcp_hdr->cksum = rte_ipv6_phdr_cksum(l3_hdr, m->ol_flags);
 	}
-
-	tcp_hdr = (struct rte_tcp_hdr *)((char *)l3_hdr + m->l3_len);
-	tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
 }
 
 static __rte_always_inline void
@@ -1151,7 +1153,7 @@ virtio_tx_route(struct vhost_dev *vdev, struct rte_mbuf *m, uint16_t vlan_tag)
 		m->vlan_tci = vlan_tag;
 	}
 
-	if (m->ol_flags & PKT_TX_TCP_SEG)
+	if (m->ol_flags & PKT_RX_LRO)
 		virtio_tx_offload(m);
 
 	tx_q->m_table[tx_q->len++] = m;
@@ -1636,7 +1638,7 @@ main(int argc, char *argv[])
 	int ret, i;
 	uint16_t portid;
 	static pthread_t tid;
-	uint64_t flags = 0;
+	uint64_t flags = RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS;
 
 	signal(SIGINT, sigint_handler);
diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h
index d0a8ae31f2..8d875e9322 100644
--- a/lib/vhost/rte_vhost.h
+++ b/lib/vhost/rte_vhost.h
@@ -36,6 +36,7 @@ extern "C" {
 /* support only linear buffers (no chained mbufs) */
 #define RTE_VHOST_USER_LINEARBUF_SUPPORT	(1ULL << 6)
 #define RTE_VHOST_USER_ASYNC_COPY	(1ULL << 7)
+#define RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS	(1ULL << 8)
 
 /* Features.
  */
 #ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
index 0169d36481..5d0d728d52 100644
--- a/lib/vhost/socket.c
+++ b/lib/vhost/socket.c
@@ -42,6 +42,7 @@ struct vhost_user_socket {
 	bool extbuf;
 	bool linearbuf;
 	bool async_copy;
+	bool net_compliant_ol_flags;
 
 	/*
 	 * The "supported_features" indicates the feature bits the
@@ -224,7 +225,8 @@ vhost_user_add_connection(int fd, struct vhost_user_socket *vsocket)
 	size = strnlen(vsocket->path, PATH_MAX);
 	vhost_set_ifname(vid, vsocket->path, size);
 
-	vhost_set_builtin_virtio_net(vid, vsocket->use_builtin_virtio_net);
+	vhost_setup_virtio_net(vid, vsocket->use_builtin_virtio_net,
+		vsocket->net_compliant_ol_flags);
 
 	vhost_attach_vdpa_device(vid, vsocket->vdpa_dev);
 
@@ -877,6 +879,7 @@ rte_vhost_driver_register(const char *path, uint64_t flags)
 	vsocket->extbuf = flags & RTE_VHOST_USER_EXTBUF_SUPPORT;
 	vsocket->linearbuf = flags & RTE_VHOST_USER_LINEARBUF_SUPPORT;
 	vsocket->async_copy = flags & RTE_VHOST_USER_ASYNC_COPY;
+	vsocket->net_compliant_ol_flags = flags & RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS;
 
 	if (vsocket->async_copy &&
 		(flags & (RTE_VHOST_USER_IOMMU_SUPPORT |
diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
index a70fe01d8f..846113d46f 100644
--- a/lib/vhost/vhost.c
+++ b/lib/vhost/vhost.c
@@ -752,7 +752,7 @@ vhost_set_ifname(int vid, const char *if_name, unsigned int if_len)
 }
 
 void
-vhost_set_builtin_virtio_net(int vid, bool enable)
+vhost_setup_virtio_net(int vid, bool enable, bool compliant_ol_flags)
 {
 	struct virtio_net *dev = get_device(vid);
 
@@ -763,6 +763,10 @@ vhost_set_builtin_virtio_net(int vid, bool enable)
 		dev->flags |= VIRTIO_DEV_BUILTIN_VIRTIO_NET;
 	else
 		dev->flags &= ~VIRTIO_DEV_BUILTIN_VIRTIO_NET;
+	if (!compliant_ol_flags)
+		dev->flags |= VIRTIO_DEV_LEGACY_OL_FLAGS;
+	else
+		dev->flags &= ~VIRTIO_DEV_LEGACY_OL_FLAGS;
 }
 
 void
diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
index f628714c24..65bcdc5301 100644
--- a/lib/vhost/vhost.h
+++ b/lib/vhost/vhost.h
@@ -27,15 +27,17 @@
 #include "rte_vhost_async.h"
 
 /* Used to indicate that the device is running on a data core */
-#define VIRTIO_DEV_RUNNING 1
+#define VIRTIO_DEV_RUNNING ((uint32_t)1 << 0)
 /* Used to indicate that the device is ready to operate */
-#define VIRTIO_DEV_READY 2
+#define VIRTIO_DEV_READY ((uint32_t)1 << 1)
 /* Used to indicate that the built-in vhost net device backend is enabled */
-#define VIRTIO_DEV_BUILTIN_VIRTIO_NET 4
+#define VIRTIO_DEV_BUILTIN_VIRTIO_NET ((uint32_t)1 << 2)
 /* Used to indicate that the device has its own data path and configured */
-#define VIRTIO_DEV_VDPA_CONFIGURED 8
+#define VIRTIO_DEV_VDPA_CONFIGURED ((uint32_t)1 << 3)
 /* Used to indicate that the feature negotiation failed */
-#define VIRTIO_DEV_FEATURES_FAILED 16
+#define VIRTIO_DEV_FEATURES_FAILED ((uint32_t)1 << 4)
+/* Used to indicate that the virtio_net tx code should fill TX ol_flags */
+#define VIRTIO_DEV_LEGACY_OL_FLAGS ((uint32_t)1 << 5)
 
 /* Backend value set by guest.
  */
 #define VIRTIO_DEV_STOPPED -1
 
@@ -674,7 +676,7 @@ int alloc_vring_queue(struct virtio_net *dev, uint32_t vring_idx);
 void vhost_attach_vdpa_device(int vid, struct rte_vdpa_device *dev);
 
 void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
-void vhost_set_builtin_virtio_net(int vid, bool enable);
+void vhost_setup_virtio_net(int vid, bool enable, bool legacy_ol_flags);
 void vhost_enable_extbuf(int vid);
 void vhost_enable_linearbuf(int vid);
 int vhost_enable_guest_notification(struct virtio_net *dev,
diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
index ff39878609..aef30ad4fe 100644
--- a/lib/vhost/virtio_net.c
+++ b/lib/vhost/virtio_net.c
@@ -8,6 +8,7 @@
 
 #include <rte_mbuf.h>
 #include <rte_memcpy.h>
+#include <rte_net.h>
 #include <rte_ether.h>
 #include <rte_ip.h>
 #include <rte_vhost.h>
@@ -1875,15 +1876,12 @@ parse_ethernet(struct rte_mbuf *m, uint16_t *l4_proto, void **l4_hdr)
 }
 
 static __rte_always_inline void
-vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
+vhost_dequeue_offload_legacy(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
 {
 	uint16_t l4_proto = 0;
 	void *l4_hdr = NULL;
 	struct rte_tcp_hdr *tcp_hdr = NULL;
 
-	if (hdr->flags == 0 && hdr->gso_type == VIRTIO_NET_HDR_GSO_NONE)
-		return;
-
 	parse_ethernet(m, &l4_proto, &l4_hdr);
 	if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
 		if (hdr->csum_start == (m->l2_len + m->l3_len)) {
@@ -1928,6 +1926,94 @@ vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
 	}
 }
 
+static __rte_always_inline void
+vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m,
+	bool legacy_ol_flags)
+{
+	struct rte_net_hdr_lens hdr_lens;
+	int l4_supported = 0;
+	uint32_t ptype;
+
+	if (hdr->flags == 0 && hdr->gso_type == VIRTIO_NET_HDR_GSO_NONE)
+		return;
+
+	if (legacy_ol_flags) {
+		vhost_dequeue_offload_legacy(hdr, m);
+		return;
+	}
+
+	m->ol_flags |= PKT_RX_IP_CKSUM_UNKNOWN;
+
+	ptype = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK);
+	m->packet_type = ptype;
+	if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP ||
+	    (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP ||
+	    (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_SCTP)
+		l4_supported = 1;
+
+	/* According to Virtio 1.1 spec, the device only needs to look at
+	 * VIRTIO_NET_HDR_F_NEEDS_CSUM in the packet transmission path.
+	 * This differs from the processing incoming packets path where the
+	 * driver could rely on VIRTIO_NET_HDR_F_DATA_VALID flag set by the
+	 * device.
+	 *
+	 * 5.1.6.2.1 Driver Requirements: Packet Transmission
+	 * The driver MUST NOT set the VIRTIO_NET_HDR_F_DATA_VALID and
+	 * VIRTIO_NET_HDR_F_RSC_INFO bits in flags.
+	 *
+	 * 5.1.6.2.2 Device Requirements: Packet Transmission
+	 * The device MUST ignore flag bits that it does not recognize.
+	 */
+	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
+		uint32_t hdrlen;
+
+		hdrlen = hdr_lens.l2_len + hdr_lens.l3_len + hdr_lens.l4_len;
+		if (hdr->csum_start <= hdrlen && l4_supported != 0) {
+			m->ol_flags |= PKT_RX_L4_CKSUM_NONE;
+		} else {
+			/* Unknown proto or tunnel, do sw cksum. We can assume
+			 * the cksum field is in the first segment since the
+			 * buffers we provided to the host are large enough.
+			 * In case of SCTP, this will be wrong since it's a CRC
+			 * but there's nothing we can do.
+			 */
+			uint16_t csum = 0, off;
+
+			if (rte_raw_cksum_mbuf(m, hdr->csum_start,
+					rte_pktmbuf_pkt_len(m) - hdr->csum_start, &csum) < 0)
+				return;
+			if (likely(csum != 0xffff))
+				csum = ~csum;
+			off = hdr->csum_offset + hdr->csum_start;
+			if (rte_pktmbuf_data_len(m) >= off + 1)
+				*rte_pktmbuf_mtod_offset(m, uint16_t *, off) = csum;
+		}
+	}
+
+	if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
+		if (hdr->gso_size == 0)
+			return;
+
+		switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
+		case VIRTIO_NET_HDR_GSO_TCPV4:
+		case VIRTIO_NET_HDR_GSO_TCPV6:
+			if ((ptype & RTE_PTYPE_L4_MASK) != RTE_PTYPE_L4_TCP)
+				break;
+			m->ol_flags |= PKT_RX_LRO | PKT_RX_L4_CKSUM_NONE;
+			m->tso_segsz = hdr->gso_size;
+			break;
+		case VIRTIO_NET_HDR_GSO_UDP:
+			if ((ptype & RTE_PTYPE_L4_MASK) != RTE_PTYPE_L4_UDP)
+				break;
+			m->ol_flags |= PKT_RX_LRO | PKT_RX_L4_CKSUM_NONE;
+			m->tso_segsz = hdr->gso_size;
+			break;
+		default:
+			break;
+		}
+	}
+}
+
 static __rte_noinline void
 copy_vnet_hdr_from_desc(struct virtio_net_hdr *hdr,
 		struct buf_vector *buf_vec)
@@ -1952,7 +2038,8 @@ copy_vnet_hdr_from_desc(struct virtio_net_hdr *hdr,
 static __rte_always_inline int
 copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		  struct buf_vector *buf_vec, uint16_t nr_vec,
-		  struct rte_mbuf *m, struct rte_mempool *mbuf_pool)
+		  struct rte_mbuf *m, struct rte_mempool *mbuf_pool,
+		  bool legacy_ol_flags)
 {
 	uint32_t buf_avail, buf_offset;
 	uint64_t buf_addr, buf_len;
@@ -2085,7 +2172,7 @@ copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		m->pkt_len    += mbuf_offset;
 
 	if (hdr)
-		vhost_dequeue_offload(hdr, m);
+		vhost_dequeue_offload(hdr, m, legacy_ol_flags);
 
 out:
 
@@ -2168,9 +2255,11 @@ virtio_dev_pktmbuf_alloc(struct virtio_net *dev, struct rte_mempool *mp,
 	return NULL;
 }
 
-static __rte_noinline uint16_t
+__rte_always_inline
+static uint16_t
 virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
-	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count)
+	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count,
+	bool legacy_ol_flags)
 {
 	uint16_t i;
 	uint16_t free_entries;
@@ -2230,7 +2319,7 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
 		}
 
 		err = copy_desc_to_mbuf(dev, vq, buf_vec, nr_vec, pkts[i],
-				mbuf_pool);
+				mbuf_pool, legacy_ol_flags);
 		if (unlikely(err)) {
 			rte_pktmbuf_free(pkts[i]);
 			if (!allocerr_warned) {
@@ -2258,6 +2347,24 @@ virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
 	return (i - dropped);
 }
 
+__rte_noinline
+static uint16_t
+virtio_dev_tx_split_legacy(struct virtio_net *dev,
+	struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool,
+	struct rte_mbuf **pkts, uint16_t count)
+{
+	return virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count, true);
+}
+
+__rte_noinline
+static uint16_t
+virtio_dev_tx_split_compliant(struct virtio_net *dev,
+	struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool,
+	struct rte_mbuf **pkts, uint16_t count)
+{
+	return virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count, false);
+}
+
 static __rte_always_inline int
 vhost_reserve_avail_batch_packed(struct virtio_net *dev,
 				 struct vhost_virtqueue *vq,
@@ -2338,7 +2445,8 @@ static __rte_always_inline int
 virtio_dev_tx_batch_packed(struct virtio_net *dev,
 			   struct vhost_virtqueue *vq,
 			   struct rte_mempool *mbuf_pool,
-			   struct rte_mbuf **pkts)
+			   struct rte_mbuf **pkts,
+			   bool legacy_ol_flags)
 {
 	uint16_t avail_idx = vq->last_avail_idx;
 	uint32_t buf_offset = sizeof(struct virtio_net_hdr_mrg_rxbuf);
@@ -2362,7 +2470,7 @@ virtio_dev_tx_batch_packed(struct virtio_net *dev,
 	if (virtio_net_with_host_offload(dev)) {
 		vhost_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) {
 			hdr = (struct virtio_net_hdr *)(desc_addrs[i]);
-			vhost_dequeue_offload(hdr, pkts[i]);
+			vhost_dequeue_offload(hdr, pkts[i], legacy_ol_flags);
 		}
 	}
 
@@ -2383,7 +2491,8 @@ vhost_dequeue_single_packed(struct virtio_net *dev,
 			    struct rte_mempool *mbuf_pool,
 			    struct rte_mbuf **pkts,
 			    uint16_t *buf_id,
-			    uint16_t *desc_count)
+			    uint16_t *desc_count,
+			    bool legacy_ol_flags)
 {
 	struct buf_vector buf_vec[BUF_VECTOR_MAX];
 	uint32_t buf_len;
@@ -2410,7 +2519,7 @@ vhost_dequeue_single_packed(struct virtio_net *dev,
 	}
 
 	err = copy_desc_to_mbuf(dev, vq, buf_vec, nr_vec, *pkts,
-				mbuf_pool);
+				mbuf_pool, legacy_ol_flags);
 	if (unlikely(err)) {
 		if (!allocerr_warned) {
 			VHOST_LOG_DATA(ERR,
@@ -2429,14 +2538,15 @@ static __rte_always_inline int
 virtio_dev_tx_single_packed(struct virtio_net *dev,
 			    struct vhost_virtqueue *vq,
 			    struct rte_mempool *mbuf_pool,
-			    struct rte_mbuf **pkts)
+			    struct rte_mbuf **pkts,
+			    bool legacy_ol_flags)
 {
 	uint16_t buf_id, desc_count = 0;
 	int ret;
 
 	ret = vhost_dequeue_single_packed(dev, vq, mbuf_pool, pkts, &buf_id,
-					&desc_count);
+					&desc_count, legacy_ol_flags);
 
 	if (likely(desc_count > 0)) {
 		if (virtio_net_is_inorder(dev))
@@ -2452,12 +2562,14 @@ virtio_dev_tx_single_packed(struct virtio_net *dev,
 	return ret;
 }
 
-static __rte_noinline uint16_t
+__rte_always_inline
+static uint16_t
 virtio_dev_tx_packed(struct virtio_net *dev,
 		     struct vhost_virtqueue *__rte_restrict vq,
 		     struct rte_mempool *mbuf_pool,
 		     struct rte_mbuf **__rte_restrict pkts,
-		     uint32_t count)
+		     uint32_t count,
+		     bool legacy_ol_flags)
 {
 	uint32_t pkt_idx = 0;
 	uint32_t remained = count;
@@ -2467,7 +2579,8 @@ virtio_dev_tx_packed(struct virtio_net *dev,
 
 		if (remained >= PACKED_BATCH_SIZE) {
 			if (!virtio_dev_tx_batch_packed(dev, vq, mbuf_pool,
-							&pkts[pkt_idx])) {
+							&pkts[pkt_idx],
+							legacy_ol_flags)) {
 				pkt_idx += PACKED_BATCH_SIZE;
 				remained -= PACKED_BATCH_SIZE;
 				continue;
@@ -2475,7 +2588,8 @@ virtio_dev_tx_packed(struct virtio_net *dev,
 		}
 
 		if (virtio_dev_tx_single_packed(dev, vq, mbuf_pool,
-						&pkts[pkt_idx]))
+						&pkts[pkt_idx],
+						legacy_ol_flags))
 			break;
 		pkt_idx++;
 		remained--;
@@ -2492,6 +2606,24 @@ virtio_dev_tx_packed(struct virtio_net *dev,
 	return pkt_idx;
 }
 
+__rte_noinline
+static uint16_t
+virtio_dev_tx_packed_legacy(struct virtio_net *dev,
+	struct vhost_virtqueue *__rte_restrict vq, struct rte_mempool *mbuf_pool,
+	struct rte_mbuf **__rte_restrict pkts, uint32_t count)
+{
+	return virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts, count, true);
+}
+
+__rte_noinline
+static uint16_t
+virtio_dev_tx_packed_compliant(struct virtio_net *dev,
+	struct vhost_virtqueue *__rte_restrict vq, struct rte_mempool *mbuf_pool,
+	struct rte_mbuf **__rte_restrict pkts, uint32_t count)
+{
+	return virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts, count, false);
+}
+
 uint16_t
 rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t count)
@@ -2567,10 +2699,17 @@ rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
 		count -= 1;
 	}
 
-	if (vq_is_packed(dev))
-		count = virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts, count);
-	else
-		count = virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count);
+	if (vq_is_packed(dev)) {
+		if (dev->flags & VIRTIO_DEV_LEGACY_OL_FLAGS)
+			count = virtio_dev_tx_packed_legacy(dev, vq, mbuf_pool, pkts, count);
+		else
+			count = virtio_dev_tx_packed_compliant(dev, vq, mbuf_pool, pkts, count);
+	} else {
+		if (dev->flags & VIRTIO_DEV_LEGACY_OL_FLAGS)
+			count = virtio_dev_tx_split_legacy(dev, vq, mbuf_pool, pkts, count);
+		else
+			count = virtio_dev_tx_split_compliant(dev, vq, mbuf_pool, pkts, count);
+	}
 
 out:
 	if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
-- 
2.23.0
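As a usage illustration (again, not part of the patch): with the compliant
mode enabled, an application consuming rte_vhost_dequeue_burst() can rely
on Rx offload flags, mirroring the examples/vhost change above. The queue
id, burst size and mbuf handling below are placeholders.

	#include <rte_vhost.h>
	#include <rte_mbuf.h>

	/* Sketch: inspect compliant Rx offload flags on dequeued packets. */
	static void
	drain_vhost_queue(int vid, uint16_t queue_id, struct rte_mempool *pool)
	{
		struct rte_mbuf *pkts[32];
		uint16_t i, nb;

		nb = rte_vhost_dequeue_burst(vid, queue_id, pool, pkts, 32);
		for (i = 0; i < nb; i++) {
			if (pkts[i]->ol_flags & PKT_RX_LRO) {
				/* The guest asked for GSO: the application
				 * decides how to resegment, e.g. via TSO as
				 * examples/vhost now does. l2/l3/l4 lengths
				 * and the pseudo-header checksum must also be
				 * set before transmit, cf. virtio_tx_offload().
				 */
				pkts[i]->ol_flags |= PKT_TX_TCP_SEG;
			}
			rte_pktmbuf_free(pkts[i]); /* placeholder: process/transmit instead */
		}
	}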