DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Wang, Yinan" <yinan.wang@intel.com>
To: "Wang, Yinan" <yinan.wang@intel.com>,
	David Marchand <david.marchand@redhat.com>,
	"dev@dpdk.org" <dev@dpdk.org>
Cc: "maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"fbl@sysclose.org" <fbl@sysclose.org>,
	"i.maximets@ovn.org" <i.maximets@ovn.org>,
	"Xia, Chenbo" <chenbo.xia@intel.com>,
	"Stokes, Ian" <ian.stokes@intel.com>,
	"stable@dpdk.org" <stable@dpdk.org>,
	Jijiang Liu <jijiang.liu@intel.com>,
	Yuanhan Liu <yuanhan.liu@linux.intel.com>
Subject: Re: [dpdk-dev] [PATCH v4 3/3] vhost: fix offload flags in Rx path
Date: Wed, 12 May 2021 03:29:57 +0000
Message-ID: <BYAPR11MB2648F122110B5226E2994FAE8F529@BYAPR11MB2648.namprd11.prod.outlook.com> (raw)
In-Reply-To: <BYAPR11MB264891FD6862DC3936D4EACB8F569@BYAPR11MB2648.namprd11.prod.outlook.com>

Hi David,

Since vhost tx offload can’t work now, we report a Bugzilla as below, could you help to take a look?
https://bugs.dpdk.org/show_bug.cgi?id=702
We also tried vhost example with VM2VM iperf test, large pkts also can't forwarding.

BR,
Yinan


> -----Original Message-----
> From: dev <dev-bounces@dpdk.org> On Behalf Of Wang, Yinan
> Sent: 2021?5?8? 14:24
> To: David Marchand <david.marchand@redhat.com>; dev@dpdk.org
> Cc: maxime.coquelin@redhat.com; olivier.matz@6wind.com;
> fbl@sysclose.org; i.maximets@ovn.org; Xia, Chenbo
> <chenbo.xia@intel.com>; Stokes, Ian <ian.stokes@intel.com>;
> stable@dpdk.org; Jijiang Liu <jijiang.liu@intel.com>; Yuanhan Liu
> <yuanhan.liu@linux.intel.com>
> Subject: Re: [dpdk-dev] [PATCH v4 3/3] vhost: fix offload flags in Rx path
> 
> Hi David,
> 
> May I know how to configures Tx offloading by testpmd, could you help to
> provide an example case?
> I add a case which need vhost tx offload (TSO/cksum) function, this case
> can't work with the patch, could you use this case as the example if possible?
> 
> For example: VM2VM split ring vhost-user/virtio-net test with tcp traffic
> ==========================================================
> ===============
> 
> 1. Launch the Vhost sample on socket 0 by below commands::
> 
>     rm -rf vhost-net*
>     ./dpdk-testpmd -l 2-4 -n 4 --no-pci --file-prefix=vhost --vdev
> 'net_vhost0,iface=vhost-net0,queues=1' \
>     --vdev 'net_vhost1,iface=vhost-net1,queues=1'  -- -i --nb-cores=2 --
> txd=1024 --rxd=1024
>     testpmd>start
> 
> 2. Launch VM1 and VM2 on socket 1::
> 
>     taskset -c 32 qemu-system-x86_64 -name vm1 -enable-kvm -cpu host -
> smp 1 -m 4096 \
>     -object memory-backend-file,id=mem,size=4096M,mem-
> path=/mnt/huge,share=on \
>     -numa node,memdev=mem -mem-prealloc -drive
> file=/home/osimg/ubuntu20-04.img  \
>     -chardev socket,path=/tmp/vm2_qga0.sock,server,nowait,id=vm2_qga0
> -device virtio-serial \
>     -device virtserialport,chardev=vm2_qga0,name=org.qemu.guest_agent.2
> -daemonize \
>     -monitor unix:/tmp/vm2_monitor.sock,server,nowait -device
> e1000,netdev=nttsip1 \
>     -netdev user,id=nttsip1,hostfwd=tcp:127.0.0.1:6002-:22 \
>     -chardev socket,id=char0,path=./vhost-net0 \
>     -netdev type=vhost-user,id=netdev0,chardev=char0,vhostforce \
>     -device virtio-net-pci,netdev=netdev0,mac=52:54:00:00:00:01,disable-
> modern=false,mrg_rxbuf=on,csum=on,guest_csum=on,host_tso4=on,guest
> _tso4=on,guest_ecn=on -vnc :10
> 
>    taskset -c 33 qemu-system-x86_64 -name vm2 -enable-kvm -cpu host -
> smp 1 -m 4096 \
>     -object memory-backend-file,id=mem,size=4096M,mem-
> path=/mnt/huge,share=on \
>     -numa node,memdev=mem -mem-prealloc -drive
> file=/home/osimg/ubuntu20-04-2.img  \
>     -chardev socket,path=/tmp/vm2_qga0.sock,server,nowait,id=vm2_qga0
> -device virtio-serial \
>     -device virtserialport,chardev=vm2_qga0,name=org.qemu.guest_agent.2
> -daemonize \
>     -monitor unix:/tmp/vm2_monitor.sock,server,nowait -device
> e1000,netdev=nttsip1 \
>     -netdev user,id=nttsip1,hostfwd=tcp:127.0.0.1:6003-:22 \
>     -chardev socket,id=char0,path=./vhost-net1 \
>     -netdev type=vhost-user,id=netdev0,chardev=char0,vhostforce \
>     -device virtio-net-pci,netdev=netdev0,mac=52:54:00:00:00:02,disable-
> modern=false,mrg_rxbuf=on,csum=on,guest_csum=on,host_tso4=on,guest
> _tso4=on,guest_ecn=on -vnc :12
> 
> 3. On VM1, set virtio device IP and run arp protocal::
> 
>     ifconfig ens5 1.1.1.2
>     arp -s 1.1.1.8 52:54:00:00:00:02
> 
> 4. On VM2, set virtio device IP and run arp protocal::
> 
>     ifconfig ens5 1.1.1.8
>     arp -s 1.1.1.2 52:54:00:00:00:01
> 
> 5. Check the iperf performance with different packet size between two VMs
> by below commands::
> 
>     Under VM1, run: `iperf -s -i 1`
>     Under VM2, run: `iperf -c 1.1.1.2 -i 1 -t 60`
> 
> BR,
> Yinan
> 
> > -----Original Message-----
> > From: dev <dev-bounces@dpdk.org> On Behalf Of David Marchand
> > Sent: 2021?5?4? 0:44
> > To: dev@dpdk.org
> > Cc: maxime.coquelin@redhat.com; olivier.matz@6wind.com;
> > fbl@sysclose.org; i.maximets@ovn.org; Xia, Chenbo
> > <chenbo.xia@intel.com>; Stokes, Ian <ian.stokes@intel.com>;
> > stable@dpdk.org; Jijiang Liu <jijiang.liu@intel.com>; Yuanhan Liu
> > <yuanhan.liu@linux.intel.com>
> > Subject: [dpdk-dev] [PATCH v4 3/3] vhost: fix offload flags in Rx path
> >
> > The vhost library currently configures Tx offloading (PKT_TX_*) on any
> > packet received from a guest virtio device which asks for some offloading.
> >
> > This is problematic, as Tx offloading is something that the application
> > must ask for: the application needs to configure devices
> > to support every used offloads (ip, tcp checksumming, tso..), and the
> > various l2/l3/l4 lengths must be set following any processing that
> > happened in the application itself.
> >
> > On the other hand, the received packets are not marked wrt current
> > packet l3/l4 checksumming info.
> >
> > Copy virtio rx processing to fix those offload flags with some
> > differences:
> > - accept VIRTIO_NET_HDR_GSO_ECN and VIRTIO_NET_HDR_GSO_UDP,
> > - ignore anything but the VIRTIO_NET_HDR_F_NEEDS_CSUM flag (to
> comply
> > with
> >   the virtio spec),
> >
> > Some applications might rely on the current behavior, so it is left
> > untouched by default.
> > A new RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS flag is added to
> > enable the
> > new behavior.
> >
> > The vhost example has been updated for the new behavior: TSO is applied
> > to
> > any packet marked LRO.
> >
> > Fixes: 859b480d5afd ("vhost: add guest offload setting")
> > Cc: stable@dpdk.org
> >
> > Signed-off-by: David Marchand <david.marchand@redhat.com>
> > Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> > ---
> > Changes since v3:
> > - rebased on next-virtio,
> >
> > Changes since v2:
> > - introduced a new flag to keep existing behavior as the default,
> > - packets with unrecognised offload are passed to the application with no
> >   offload metadata rather than dropped,
> > - ignored VIRTIO_NET_HDR_F_DATA_VALID since the virtio spec states
> that
> >   the virtio driver is not allowed to use this flag when transmitting
> >   packets,
> >
> > Changes since v1:
> > - updated vhost example,
> > - restored VIRTIO_NET_HDR_GSO_ECN and VIRTIO_NET_HDR_GSO_UDP
> > support,
> > - restored log on buggy offload request,
> >
> > ---
> >  doc/guides/prog_guide/vhost_lib.rst    |  12 ++
> >  doc/guides/rel_notes/release_21_05.rst |   6 +
> >  drivers/net/vhost/rte_eth_vhost.c      |   2 +-
> >  examples/vhost/main.c                  |  44 +++---
> >  lib/vhost/rte_vhost.h                  |   1 +
> >  lib/vhost/socket.c                     |   5 +-
> >  lib/vhost/vhost.c                      |   6 +-
> >  lib/vhost/vhost.h                      |  14 +-
> >  lib/vhost/virtio_net.c                 | 185 ++++++++++++++++++++++---
> >  9 files changed, 222 insertions(+), 53 deletions(-)
> >
> > diff --git a/doc/guides/prog_guide/vhost_lib.rst
> > b/doc/guides/prog_guide/vhost_lib.rst
> > index 7afa351675..d18fb98910 100644
> > --- a/doc/guides/prog_guide/vhost_lib.rst
> > +++ b/doc/guides/prog_guide/vhost_lib.rst
> > @@ -118,6 +118,18 @@ The following is an overview of some key Vhost
> > API functions:
> >
> >      It is disabled by default.
> >
> > +  - ``RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS``
> > +
> > +    Since v16.04, the vhost library forwards checksum and gso requests
> for
> > +    packets received from a virtio driver by filling Tx offload metadata in
> > +    the mbuf. This behavior is inconsistent with other drivers but it is left
> > +    untouched for existing applications that might rely on it.
> > +
> > +    This flag disables the legacy behavior and instead ask vhost to simply
> > +    populate Rx offload metadata in the mbuf.
> > +
> > +    It is disabled by default.
> > +
> >  * ``rte_vhost_driver_set_features(path, features)``
> >
> >    This function sets the feature bits the vhost-user driver supports. The
> > diff --git a/doc/guides/rel_notes/release_21_05.rst
> > b/doc/guides/rel_notes/release_21_05.rst
> > index a5f21f8425..6b7b0810a5 100644
> > --- a/doc/guides/rel_notes/release_21_05.rst
> > +++ b/doc/guides/rel_notes/release_21_05.rst
> > @@ -337,6 +337,12 @@ API Changes
> >    ``policer_action_recolor_supported`` and
> > ``policer_action_drop_supported``
> >    have been removed.
> >
> > +* vhost: The vhost library currently populates received mbufs from a
> virtio
> > +  driver with Tx offload flags while not filling Rx offload flags.
> > +  While this behavior is arguable, it is kept untouched.
> > +  A new flag ``RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS`` has been
> > added to ask
> > +  for a behavior compliant with to the mbuf offload API.
> > +
> >
> >  ABI Changes
> >  -----------
> > diff --git a/drivers/net/vhost/rte_eth_vhost.c
> > b/drivers/net/vhost/rte_eth_vhost.c
> > index d198fc8a8e..281379d6a3 100644
> > --- a/drivers/net/vhost/rte_eth_vhost.c
> > +++ b/drivers/net/vhost/rte_eth_vhost.c
> > @@ -1505,7 +1505,7 @@ rte_pmd_vhost_probe(struct rte_vdev_device
> > *dev)
> >  	int ret = 0;
> >  	char *iface_name;
> >  	uint16_t queues;
> > -	uint64_t flags = 0;
> > +	uint64_t flags = RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS;
> >  	uint64_t disable_flags = 0;
> >  	int client_mode = 0;
> >  	int iommu_support = 0;
> > diff --git a/examples/vhost/main.c b/examples/vhost/main.c
> > index 0bee1f3321..d2179eadb9 100644
> > --- a/examples/vhost/main.c
> > +++ b/examples/vhost/main.c
> > @@ -19,6 +19,7 @@
> >  #include <rte_log.h>
> >  #include <rte_string_fns.h>
> >  #include <rte_malloc.h>
> > +#include <rte_net.h>
> >  #include <rte_vhost.h>
> >  #include <rte_ip.h>
> >  #include <rte_tcp.h>
> > @@ -1029,33 +1030,34 @@ find_local_dest(struct vhost_dev *vdev,
> > struct rte_mbuf *m,
> >  	return 0;
> >  }
> >
> > -static uint16_t
> > -get_psd_sum(void *l3_hdr, uint64_t ol_flags)
> > -{
> > -	if (ol_flags & PKT_TX_IPV4)
> > -		return rte_ipv4_phdr_cksum(l3_hdr, ol_flags);
> > -	else /* assume ethertype == RTE_ETHER_TYPE_IPV6 */
> > -		return rte_ipv6_phdr_cksum(l3_hdr, ol_flags);
> > -}
> > -
> >  static void virtio_tx_offload(struct rte_mbuf *m)
> >  {
> > +	struct rte_net_hdr_lens hdr_lens;
> > +	struct rte_ipv4_hdr *ipv4_hdr;
> > +	struct rte_tcp_hdr *tcp_hdr;
> > +	uint32_t ptype;
> >  	void *l3_hdr;
> > -	struct rte_ipv4_hdr *ipv4_hdr = NULL;
> > -	struct rte_tcp_hdr *tcp_hdr = NULL;
> > -	struct rte_ether_hdr *eth_hdr =
> > -		rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
> >
> > -	l3_hdr = (char *)eth_hdr + m->l2_len;
> > +	ptype = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK);
> > +	m->l2_len = hdr_lens.l2_len;
> > +	m->l3_len = hdr_lens.l3_len;
> > +	m->l4_len = hdr_lens.l4_len;
> >
> > -	if (m->ol_flags & PKT_TX_IPV4) {
> > +	l3_hdr = rte_pktmbuf_mtod_offset(m, void *, m->l2_len);
> > +	tcp_hdr = rte_pktmbuf_mtod_offset(m, struct rte_tcp_hdr *,
> > +		m->l2_len + m->l3_len);
> > +
> > +	m->ol_flags |= PKT_TX_TCP_SEG;
> > +	if ((ptype & RTE_PTYPE_L3_MASK) == RTE_PTYPE_L3_IPV4) {
> > +		m->ol_flags |= PKT_TX_IPV4;
> > +		m->ol_flags |= PKT_TX_IP_CKSUM;
> >  		ipv4_hdr = l3_hdr;
> >  		ipv4_hdr->hdr_checksum = 0;
> > -		m->ol_flags |= PKT_TX_IP_CKSUM;
> > +		tcp_hdr->cksum = rte_ipv4_phdr_cksum(l3_hdr, m-
> > >ol_flags);
> > +	} else { /* assume ethertype == RTE_ETHER_TYPE_IPV6 */
> > +		m->ol_flags |= PKT_TX_IPV6;
> > +		tcp_hdr->cksum = rte_ipv6_phdr_cksum(l3_hdr, m-
> > >ol_flags);
> >  	}
> > -
> > -	tcp_hdr = (struct rte_tcp_hdr *)((char *)l3_hdr + m->l3_len);
> > -	tcp_hdr->cksum = get_psd_sum(l3_hdr, m->ol_flags);
> >  }
> >
> >  static __rte_always_inline void
> > @@ -1148,7 +1150,7 @@ virtio_tx_route(struct vhost_dev *vdev, struct
> > rte_mbuf *m, uint16_t vlan_tag)
> >  		m->vlan_tci = vlan_tag;
> >  	}
> >
> > -	if (m->ol_flags & PKT_TX_TCP_SEG)
> > +	if (m->ol_flags & PKT_RX_LRO)
> >  		virtio_tx_offload(m);
> >
> >  	tx_q->m_table[tx_q->len++] = m;
> > @@ -1633,7 +1635,7 @@ main(int argc, char *argv[])
> >  	int ret, i;
> >  	uint16_t portid;
> >  	static pthread_t tid;
> > -	uint64_t flags = 0;
> > +	uint64_t flags = RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS;
> >
> >  	signal(SIGINT, sigint_handler);
> >
> > diff --git a/lib/vhost/rte_vhost.h b/lib/vhost/rte_vhost.h
> > index d0a8ae31f2..8d875e9322 100644
> > --- a/lib/vhost/rte_vhost.h
> > +++ b/lib/vhost/rte_vhost.h
> > @@ -36,6 +36,7 @@ extern "C" {
> >  /* support only linear buffers (no chained mbufs) */
> >  #define RTE_VHOST_USER_LINEARBUF_SUPPORT	(1ULL << 6)
> >  #define RTE_VHOST_USER_ASYNC_COPY	(1ULL << 7)
> > +#define RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS	(1ULL << 8)
> >
> >  /* Features. */
> >  #ifndef VIRTIO_NET_F_GUEST_ANNOUNCE
> > diff --git a/lib/vhost/socket.c b/lib/vhost/socket.c
> > index 0169d36481..5d0d728d52 100644
> > --- a/lib/vhost/socket.c
> > +++ b/lib/vhost/socket.c
> > @@ -42,6 +42,7 @@ struct vhost_user_socket {
> >  	bool extbuf;
> >  	bool linearbuf;
> >  	bool async_copy;
> > +	bool net_compliant_ol_flags;
> >
> >  	/*
> >  	 * The "supported_features" indicates the feature bits the
> > @@ -224,7 +225,8 @@ vhost_user_add_connection(int fd, struct
> > vhost_user_socket *vsocket)
> >  	size = strnlen(vsocket->path, PATH_MAX);
> >  	vhost_set_ifname(vid, vsocket->path, size);
> >
> > -	vhost_set_builtin_virtio_net(vid, vsocket->use_builtin_virtio_net);
> > +	vhost_setup_virtio_net(vid, vsocket->use_builtin_virtio_net,
> > +		vsocket->net_compliant_ol_flags);
> >
> >  	vhost_attach_vdpa_device(vid, vsocket->vdpa_dev);
> >
> > @@ -877,6 +879,7 @@ rte_vhost_driver_register(const char *path,
> > uint64_t flags)
> >  	vsocket->extbuf = flags & RTE_VHOST_USER_EXTBUF_SUPPORT;
> >  	vsocket->linearbuf = flags &
> > RTE_VHOST_USER_LINEARBUF_SUPPORT;
> >  	vsocket->async_copy = flags & RTE_VHOST_USER_ASYNC_COPY;
> > +	vsocket->net_compliant_ol_flags = flags &
> > RTE_VHOST_USER_NET_COMPLIANT_OL_FLAGS;
> >
> >  	if (vsocket->async_copy &&
> >  		(flags & (RTE_VHOST_USER_IOMMU_SUPPORT |
> > diff --git a/lib/vhost/vhost.c b/lib/vhost/vhost.c
> > index c9b6379f73..9abfc0bfe7 100644
> > --- a/lib/vhost/vhost.c
> > +++ b/lib/vhost/vhost.c
> > @@ -752,7 +752,7 @@ vhost_set_ifname(int vid, const char *if_name,
> > unsigned int if_len)
> >  }
> >
> >  void
> > -vhost_set_builtin_virtio_net(int vid, bool enable)
> > +vhost_setup_virtio_net(int vid, bool enable, bool compliant_ol_flags)
> >  {
> >  	struct virtio_net *dev = get_device(vid);
> >
> > @@ -763,6 +763,10 @@ vhost_set_builtin_virtio_net(int vid, bool
> enable)
> >  		dev->flags |= VIRTIO_DEV_BUILTIN_VIRTIO_NET;
> >  	else
> >  		dev->flags &= ~VIRTIO_DEV_BUILTIN_VIRTIO_NET;
> > +	if (!compliant_ol_flags)
> > +		dev->flags |= VIRTIO_DEV_LEGACY_OL_FLAGS;
> > +	else
> > +		dev->flags &= ~VIRTIO_DEV_LEGACY_OL_FLAGS;
> >  }
> >
> >  void
> > diff --git a/lib/vhost/vhost.h b/lib/vhost/vhost.h
> > index b303635645..8078ddff79 100644
> > --- a/lib/vhost/vhost.h
> > +++ b/lib/vhost/vhost.h
> > @@ -27,15 +27,17 @@
> >  #include "rte_vhost_async.h"
> >
> >  /* Used to indicate that the device is running on a data core */
> > -#define VIRTIO_DEV_RUNNING 1
> > +#define VIRTIO_DEV_RUNNING ((uint32_t)1 << 0)
> >  /* Used to indicate that the device is ready to operate */
> > -#define VIRTIO_DEV_READY 2
> > +#define VIRTIO_DEV_READY ((uint32_t)1 << 1)
> >  /* Used to indicate that the built-in vhost net device backend is enabled
> */
> > -#define VIRTIO_DEV_BUILTIN_VIRTIO_NET 4
> > +#define VIRTIO_DEV_BUILTIN_VIRTIO_NET ((uint32_t)1 << 2)
> >  /* Used to indicate that the device has its own data path and configured
> */
> > -#define VIRTIO_DEV_VDPA_CONFIGURED 8
> > +#define VIRTIO_DEV_VDPA_CONFIGURED ((uint32_t)1 << 3)
> >  /* Used to indicate that the feature negotiation failed */
> > -#define VIRTIO_DEV_FEATURES_FAILED 16
> > +#define VIRTIO_DEV_FEATURES_FAILED ((uint32_t)1 << 4)
> > +/* Used to indicate that the virtio_net tx code should fill TX ol_flags */
> > +#define VIRTIO_DEV_LEGACY_OL_FLAGS ((uint32_t)1 << 5)
> >
> >  /* Backend value set by guest. */
> >  #define VIRTIO_DEV_STOPPED -1
> > @@ -683,7 +685,7 @@ int alloc_vring_queue(struct virtio_net *dev,
> > uint32_t vring_idx);
> >  void vhost_attach_vdpa_device(int vid, struct rte_vdpa_device *dev);
> >
> >  void vhost_set_ifname(int, const char *if_name, unsigned int if_len);
> > -void vhost_set_builtin_virtio_net(int vid, bool enable);
> > +void vhost_setup_virtio_net(int vid, bool enable, bool legacy_ol_flags);
> >  void vhost_enable_extbuf(int vid);
> >  void vhost_enable_linearbuf(int vid);
> >  int vhost_enable_guest_notification(struct virtio_net *dev,
> > diff --git a/lib/vhost/virtio_net.c b/lib/vhost/virtio_net.c
> > index 1a34867f3c..8e36f4c340 100644
> > --- a/lib/vhost/virtio_net.c
> > +++ b/lib/vhost/virtio_net.c
> > @@ -8,6 +8,7 @@
> >
> >  #include <rte_mbuf.h>
> >  #include <rte_memcpy.h>
> > +#include <rte_net.h>
> >  #include <rte_ether.h>
> >  #include <rte_ip.h>
> >  #include <rte_vhost.h>
> > @@ -2303,15 +2304,12 @@ parse_ethernet(struct rte_mbuf *m,
> uint16_t
> > *l4_proto, void **l4_hdr)
> >  }
> >
> >  static __rte_always_inline void
> > -vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m)
> > +vhost_dequeue_offload_legacy(struct virtio_net_hdr *hdr, struct
> > rte_mbuf *m)
> >  {
> >  	uint16_t l4_proto = 0;
> >  	void *l4_hdr = NULL;
> >  	struct rte_tcp_hdr *tcp_hdr = NULL;
> >
> > -	if (hdr->flags == 0 && hdr->gso_type ==
> > VIRTIO_NET_HDR_GSO_NONE)
> > -		return;
> > -
> >  	parse_ethernet(m, &l4_proto, &l4_hdr);
> >  	if (hdr->flags == VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> >  		if (hdr->csum_start == (m->l2_len + m->l3_len)) {
> > @@ -2356,6 +2354,94 @@ vhost_dequeue_offload(struct virtio_net_hdr
> > *hdr, struct rte_mbuf *m)
> >  	}
> >  }
> >
> > +static __rte_always_inline void
> > +vhost_dequeue_offload(struct virtio_net_hdr *hdr, struct rte_mbuf *m,
> > +	bool legacy_ol_flags)
> > +{
> > +	struct rte_net_hdr_lens hdr_lens;
> > +	int l4_supported = 0;
> > +	uint32_t ptype;
> > +
> > +	if (hdr->flags == 0 && hdr->gso_type ==
> > VIRTIO_NET_HDR_GSO_NONE)
> > +		return;
> > +
> > +	if (legacy_ol_flags) {
> > +		vhost_dequeue_offload_legacy(hdr, m);
> > +		return;
> > +	}
> > +
> > +	m->ol_flags |= PKT_RX_IP_CKSUM_UNKNOWN;
> > +
> > +	ptype = rte_net_get_ptype(m, &hdr_lens, RTE_PTYPE_ALL_MASK);
> > +	m->packet_type = ptype;
> > +	if ((ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_TCP ||
> > +	    (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_UDP ||
> > +	    (ptype & RTE_PTYPE_L4_MASK) == RTE_PTYPE_L4_SCTP)
> > +		l4_supported = 1;
> > +
> > +	/* According to Virtio 1.1 spec, the device only needs to look at
> > +	 * VIRTIO_NET_HDR_F_NEEDS_CSUM in the packet transmission
> > path.
> > +	 * This differs from the processing incoming packets path where the
> > +	 * driver could rely on VIRTIO_NET_HDR_F_DATA_VALID flag set by
> > the
> > +	 * device.
> > +	 *
> > +	 * 5.1.6.2.1 Driver Requirements: Packet Transmission
> > +	 * The driver MUST NOT set the VIRTIO_NET_HDR_F_DATA_VALID
> > and
> > +	 * VIRTIO_NET_HDR_F_RSC_INFO bits in flags.
> > +	 *
> > +	 * 5.1.6.2.2 Device Requirements: Packet Transmission
> > +	 * The device MUST ignore flag bits that it does not recognize.
> > +	 */
> > +	if (hdr->flags & VIRTIO_NET_HDR_F_NEEDS_CSUM) {
> > +		uint32_t hdrlen;
> > +
> > +		hdrlen = hdr_lens.l2_len + hdr_lens.l3_len + hdr_lens.l4_len;
> > +		if (hdr->csum_start <= hdrlen && l4_supported != 0) {
> > +			m->ol_flags |= PKT_RX_L4_CKSUM_NONE;
> > +		} else {
> > +			/* Unknown proto or tunnel, do sw cksum. We can
> > assume
> > +			 * the cksum field is in the first segment since the
> > +			 * buffers we provided to the host are large enough.
> > +			 * In case of SCTP, this will be wrong since it's a CRC
> > +			 * but there's nothing we can do.
> > +			 */
> > +			uint16_t csum = 0, off;
> > +
> > +			if (rte_raw_cksum_mbuf(m, hdr->csum_start,
> > +					rte_pktmbuf_pkt_len(m) - hdr-
> > >csum_start, &csum) < 0)
> > +				return;
> > +			if (likely(csum != 0xffff))
> > +				csum = ~csum;
> > +			off = hdr->csum_offset + hdr->csum_start;
> > +			if (rte_pktmbuf_data_len(m) >= off + 1)
> > +				*rte_pktmbuf_mtod_offset(m, uint16_t *,
> > off) = csum;
> > +		}
> > +	}
> > +
> > +	if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
> > +		if (hdr->gso_size == 0)
> > +			return;
> > +
> > +		switch (hdr->gso_type & ~VIRTIO_NET_HDR_GSO_ECN) {
> > +		case VIRTIO_NET_HDR_GSO_TCPV4:
> > +		case VIRTIO_NET_HDR_GSO_TCPV6:
> > +			if ((ptype & RTE_PTYPE_L4_MASK) !=
> > RTE_PTYPE_L4_TCP)
> > +				break;
> > +			m->ol_flags |= PKT_RX_LRO |
> > PKT_RX_L4_CKSUM_NONE;
> > +			m->tso_segsz = hdr->gso_size;
> > +			break;
> > +		case VIRTIO_NET_HDR_GSO_UDP:
> > +			if ((ptype & RTE_PTYPE_L4_MASK) !=
> > RTE_PTYPE_L4_UDP)
> > +				break;
> > +			m->ol_flags |= PKT_RX_LRO |
> > PKT_RX_L4_CKSUM_NONE;
> > +			m->tso_segsz = hdr->gso_size;
> > +			break;
> > +		default:
> > +			break;
> > +		}
> > +	}
> > +}
> > +
> >  static __rte_noinline void
> >  copy_vnet_hdr_from_desc(struct virtio_net_hdr *hdr,
> >  		struct buf_vector *buf_vec)
> > @@ -2380,7 +2466,8 @@ copy_vnet_hdr_from_desc(struct
> virtio_net_hdr
> > *hdr,
> >  static __rte_always_inline int
> >  copy_desc_to_mbuf(struct virtio_net *dev, struct vhost_virtqueue *vq,
> >  		  struct buf_vector *buf_vec, uint16_t nr_vec,
> > -		  struct rte_mbuf *m, struct rte_mempool *mbuf_pool)
> > +		  struct rte_mbuf *m, struct rte_mempool *mbuf_pool,
> > +		  bool legacy_ol_flags)
> >  {
> >  	uint32_t buf_avail, buf_offset;
> >  	uint64_t buf_addr, buf_len;
> > @@ -2513,7 +2600,7 @@ copy_desc_to_mbuf(struct virtio_net *dev,
> > struct vhost_virtqueue *vq,
> >  	m->pkt_len    += mbuf_offset;
> >
> >  	if (hdr)
> > -		vhost_dequeue_offload(hdr, m);
> > +		vhost_dequeue_offload(hdr, m, legacy_ol_flags);
> >
> >  out:
> >
> > @@ -2606,9 +2693,11 @@ virtio_dev_pktmbuf_alloc(struct virtio_net
> > *dev, struct rte_mempool *mp,
> >  	return pkt;
> >  }
> >
> > -static __rte_noinline uint16_t
> > +__rte_always_inline
> > +static uint16_t
> >  virtio_dev_tx_split(struct virtio_net *dev, struct vhost_virtqueue *vq,
> > -	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t
> > count)
> > +	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t
> > count,
> > +	bool legacy_ol_flags)
> >  {
> >  	uint16_t i;
> >  	uint16_t free_entries;
> > @@ -2668,7 +2757,7 @@ virtio_dev_tx_split(struct virtio_net *dev,
> struct
> > vhost_virtqueue *vq,
> >  		}
> >
> >  		err = copy_desc_to_mbuf(dev, vq, buf_vec, nr_vec, pkts[i],
> > -				mbuf_pool);
> > +				mbuf_pool, legacy_ol_flags);
> >  		if (unlikely(err)) {
> >  			rte_pktmbuf_free(pkts[i]);
> >  			if (!allocerr_warned) {
> > @@ -2696,6 +2785,24 @@ virtio_dev_tx_split(struct virtio_net *dev,
> > struct vhost_virtqueue *vq,
> >  	return (i - dropped);
> >  }
> >
> > +__rte_noinline
> > +static uint16_t
> > +virtio_dev_tx_split_legacy(struct virtio_net *dev,
> > +	struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool,
> > +	struct rte_mbuf **pkts, uint16_t count)
> > +{
> > +	return virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count, true);
> > +}
> > +
> > +__rte_noinline
> > +static uint16_t
> > +virtio_dev_tx_split_compliant(struct virtio_net *dev,
> > +	struct vhost_virtqueue *vq, struct rte_mempool *mbuf_pool,
> > +	struct rte_mbuf **pkts, uint16_t count)
> > +{
> > +	return virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count, false);
> > +}
> > +
> >  static __rte_always_inline int
> >  vhost_reserve_avail_batch_packed(struct virtio_net *dev,
> >  				 struct vhost_virtqueue *vq,
> > @@ -2770,7 +2877,8 @@ vhost_reserve_avail_batch_packed(struct
> > virtio_net *dev,
> >  static __rte_always_inline int
> >  virtio_dev_tx_batch_packed(struct virtio_net *dev,
> >  			   struct vhost_virtqueue *vq,
> > -			   struct rte_mbuf **pkts)
> > +			   struct rte_mbuf **pkts,
> > +			   bool legacy_ol_flags)
> >  {
> >  	uint16_t avail_idx = vq->last_avail_idx;
> >  	uint32_t buf_offset = sizeof(struct virtio_net_hdr_mrg_rxbuf);
> > @@ -2794,7 +2902,7 @@ virtio_dev_tx_batch_packed(struct virtio_net
> > *dev,
> >  	if (virtio_net_with_host_offload(dev)) {
> >  		vhost_for_each_try_unroll(i, 0, PACKED_BATCH_SIZE) {
> >  			hdr = (struct virtio_net_hdr *)(desc_addrs[i]);
> > -			vhost_dequeue_offload(hdr, pkts[i]);
> > +			vhost_dequeue_offload(hdr, pkts[i], legacy_ol_flags);
> >  		}
> >  	}
> >
> > @@ -2815,7 +2923,8 @@ vhost_dequeue_single_packed(struct
> virtio_net
> > *dev,
> >  			    struct rte_mempool *mbuf_pool,
> >  			    struct rte_mbuf *pkts,
> >  			    uint16_t *buf_id,
> > -			    uint16_t *desc_count)
> > +			    uint16_t *desc_count,
> > +			    bool legacy_ol_flags)
> >  {
> >  	struct buf_vector buf_vec[BUF_VECTOR_MAX];
> >  	uint32_t buf_len;
> > @@ -2841,7 +2950,7 @@ vhost_dequeue_single_packed(struct
> virtio_net
> > *dev,
> >  	}
> >
> >  	err = copy_desc_to_mbuf(dev, vq, buf_vec, nr_vec, pkts,
> > -				mbuf_pool);
> > +				mbuf_pool, legacy_ol_flags);
> >  	if (unlikely(err)) {
> >  		if (!allocerr_warned) {
> >  			VHOST_LOG_DATA(ERR,
> > @@ -2859,14 +2968,15 @@ static __rte_always_inline int
> >  virtio_dev_tx_single_packed(struct virtio_net *dev,
> >  			    struct vhost_virtqueue *vq,
> >  			    struct rte_mempool *mbuf_pool,
> > -			    struct rte_mbuf *pkts)
> > +			    struct rte_mbuf *pkts,
> > +			    bool legacy_ol_flags)
> >  {
> >
> >  	uint16_t buf_id, desc_count = 0;
> >  	int ret;
> >
> >  	ret = vhost_dequeue_single_packed(dev, vq, mbuf_pool, pkts,
> > &buf_id,
> > -					&desc_count);
> > +					&desc_count, legacy_ol_flags);
> >
> >  	if (likely(desc_count > 0)) {
> >  		if (virtio_net_is_inorder(dev))
> > @@ -2882,12 +2992,14 @@ virtio_dev_tx_single_packed(struct
> virtio_net
> > *dev,
> >  	return ret;
> >  }
> >
> > -static __rte_noinline uint16_t
> > +__rte_always_inline
> > +static uint16_t
> >  virtio_dev_tx_packed(struct virtio_net *dev,
> >  		     struct vhost_virtqueue *__rte_restrict vq,
> >  		     struct rte_mempool *mbuf_pool,
> >  		     struct rte_mbuf **__rte_restrict pkts,
> > -		     uint32_t count)
> > +		     uint32_t count,
> > +		     bool legacy_ol_flags)
> >  {
> >  	uint32_t pkt_idx = 0;
> >
> > @@ -2899,14 +3011,16 @@ virtio_dev_tx_packed(struct virtio_net *dev,
> >
> >  		if (count - pkt_idx >= PACKED_BATCH_SIZE) {
> >  			if (!virtio_dev_tx_batch_packed(dev, vq,
> > -							&pkts[pkt_idx])) {
> > +							&pkts[pkt_idx],
> > +							legacy_ol_flags)) {
> >  				pkt_idx += PACKED_BATCH_SIZE;
> >  				continue;
> >  			}
> >  		}
> >
> >  		if (virtio_dev_tx_single_packed(dev, vq, mbuf_pool,
> > -						pkts[pkt_idx]))
> > +						pkts[pkt_idx],
> > +						legacy_ol_flags))
> >  			break;
> >  		pkt_idx++;
> >  	} while (pkt_idx < count);
> > @@ -2924,6 +3038,24 @@ virtio_dev_tx_packed(struct virtio_net *dev,
> >  	return pkt_idx;
> >  }
> >
> > +__rte_noinline
> > +static uint16_t
> > +virtio_dev_tx_packed_legacy(struct virtio_net *dev,
> > +	struct vhost_virtqueue *__rte_restrict vq, struct rte_mempool
> > *mbuf_pool,
> > +	struct rte_mbuf **__rte_restrict pkts, uint32_t count)
> > +{
> > +	return virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts, count, true);
> > +}
> > +
> > +__rte_noinline
> > +static uint16_t
> > +virtio_dev_tx_packed_compliant(struct virtio_net *dev,
> > +	struct vhost_virtqueue *__rte_restrict vq, struct rte_mempool
> > *mbuf_pool,
> > +	struct rte_mbuf **__rte_restrict pkts, uint32_t count)
> > +{
> > +	return virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts, count, false);
> > +}
> > +
> >  uint16_t
> >  rte_vhost_dequeue_burst(int vid, uint16_t queue_id,
> >  	struct rte_mempool *mbuf_pool, struct rte_mbuf **pkts, uint16_t
> > count)
> > @@ -2999,10 +3131,17 @@ rte_vhost_dequeue_burst(int vid, uint16_t
> > queue_id,
> >  		count -= 1;
> >  	}
> >
> > -	if (vq_is_packed(dev))
> > -		count = virtio_dev_tx_packed(dev, vq, mbuf_pool, pkts,
> > count);
> > -	else
> > -		count = virtio_dev_tx_split(dev, vq, mbuf_pool, pkts, count);
> > +	if (vq_is_packed(dev)) {
> > +		if (dev->flags & VIRTIO_DEV_LEGACY_OL_FLAGS)
> > +			count = virtio_dev_tx_packed_legacy(dev, vq,
> > mbuf_pool, pkts, count);
> > +		else
> > +			count = virtio_dev_tx_packed_compliant(dev, vq,
> > mbuf_pool, pkts, count);
> > +	} else {
> > +		if (dev->flags & VIRTIO_DEV_LEGACY_OL_FLAGS)
> > +			count = virtio_dev_tx_split_legacy(dev, vq,
> > mbuf_pool, pkts, count);
> > +		else
> > +			count = virtio_dev_tx_split_compliant(dev, vq,
> > mbuf_pool, pkts, count);
> > +	}
> >
> >  out:
> >  	if (dev->features & (1ULL << VIRTIO_F_IOMMU_PLATFORM))
> > --
> > 2.23.0


  reply	other threads:[~2021-05-12  3:30 UTC|newest]

Thread overview: 63+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-04-01  9:52 [dpdk-dev] [PATCH 0/5] Offload flags fixes David Marchand
2021-04-01  9:52 ` [dpdk-dev] [PATCH 1/5] mbuf: mark old offload flag as deprecated David Marchand
2021-04-07 20:14   ` Flavio Leitner
2021-04-08  7:23   ` Olivier Matz
2021-04-08  8:41     ` David Marchand
2021-04-01  9:52 ` [dpdk-dev] [PATCH 2/5] net/tap: do not touch Tx offload flags David Marchand
2021-04-07 20:15   ` Flavio Leitner
2021-04-08  7:41     ` Olivier Matz
2021-04-08 11:21       ` Flavio Leitner
2021-04-08 12:05         ` Olivier Matz
2021-04-08 12:58           ` Flavio Leitner
2021-04-09 13:30             ` Olivier Matz
2021-04-09 16:55               ` Flavio Leitner
2021-04-28 12:17               ` David Marchand
2021-04-08 12:16         ` Ananyev, Konstantin
2021-04-08  7:53   ` Olivier Matz
2021-04-28 12:12     ` David Marchand
2021-04-01  9:52 ` [dpdk-dev] [PATCH 3/5] net/virtio: " David Marchand
2021-04-13 14:17   ` Maxime Coquelin
2021-04-01  9:52 ` [dpdk-dev] [PATCH 4/5] net/virtio: refactor Tx offload helper David Marchand
2021-04-08 13:05   ` Flavio Leitner
2021-04-09  2:31   ` Ruifeng Wang
2021-04-01  9:52 ` [dpdk-dev] [PATCH 5/5] vhost: fix offload flags in Rx path David Marchand
2021-04-08  8:28   ` Olivier Matz
2021-04-08 18:38   ` Flavio Leitner
2021-04-13 15:27     ` Maxime Coquelin
2021-04-27 17:09       ` David Marchand
2021-04-27 17:19         ` David Marchand
2021-04-29  8:04 ` [dpdk-dev] [PATCH v2 0/4] Offload flags fixes David Marchand
2021-04-29  8:04   ` [dpdk-dev] [PATCH v2 1/4] mbuf: mark old offload flag as deprecated David Marchand
2021-04-29 12:14     ` Lance Richardson
2021-04-29 16:45     ` Ajit Khaparde
2021-04-29  8:04   ` [dpdk-dev] [PATCH v2 2/4] net/virtio: do not touch Tx offload flags David Marchand
2021-04-29 13:51     ` Flavio Leitner
2021-04-29  8:04   ` [dpdk-dev] [PATCH v2 3/4] net/virtio: refactor Tx offload helper David Marchand
2021-04-29 12:59     ` Maxime Coquelin
2021-04-29  8:04   ` [dpdk-dev] [PATCH v2 4/4] vhost: fix offload flags in Rx path David Marchand
2021-04-29 13:30     ` Maxime Coquelin
2021-04-29 13:31       ` Maxime Coquelin
2021-04-29 20:21         ` David Marchand
2021-04-30  8:38           ` Maxime Coquelin
2021-04-29 20:09       ` David Marchand
2021-04-29 18:39     ` Flavio Leitner
2021-04-29 19:18       ` David Marchand
2021-05-03 13:26 ` [dpdk-dev] [PATCH v3 0/4] Offload flags fixes David Marchand
2021-05-03 13:26   ` [dpdk-dev] [PATCH v3 1/4] mbuf: mark old offload flag as deprecated David Marchand
2021-05-03 14:02     ` Maxime Coquelin
2021-05-03 14:12     ` David Marchand
2021-05-03 13:26   ` [dpdk-dev] [PATCH v3 2/4] net/virtio: do not touch Tx offload flags David Marchand
2021-05-03 13:26   ` [dpdk-dev] [PATCH v3 3/4] net/virtio: refactor Tx offload helper David Marchand
2021-05-03 13:26   ` [dpdk-dev] [PATCH v3 4/4] vhost: fix offload flags in Rx path David Marchand
2021-05-03 15:24   ` [dpdk-dev] [PATCH v3 0/4] Offload flags fixes Maxime Coquelin
2021-05-03 16:21     ` David Marchand
2021-05-03 16:43 ` [dpdk-dev] [PATCH v4 0/3] " David Marchand
2021-05-03 16:43   ` [dpdk-dev] [PATCH v4 1/3] net/virtio: do not touch Tx offload flags David Marchand
2021-05-03 16:43   ` [dpdk-dev] [PATCH v4 2/3] net/virtio: refactor Tx offload helper David Marchand
2021-05-03 16:43   ` [dpdk-dev] [PATCH v4 3/3] vhost: fix offload flags in Rx path David Marchand
2021-05-04 11:07     ` Flavio Leitner
2021-05-08  6:24     ` Wang, Yinan
2021-05-12  3:29       ` Wang, Yinan [this message]
2021-05-12 15:20         ` David Marchand
2021-05-13  6:34           ` Wang, Yinan
2021-05-04  8:29   ` [dpdk-dev] [PATCH v4 0/3] Offload flags fixes Maxime Coquelin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BYAPR11MB2648F122110B5226E2994FAE8F529@BYAPR11MB2648.namprd11.prod.outlook.com \
    --to=yinan.wang@intel.com \
    --cc=chenbo.xia@intel.com \
    --cc=david.marchand@redhat.com \
    --cc=dev@dpdk.org \
    --cc=fbl@sysclose.org \
    --cc=i.maximets@ovn.org \
    --cc=ian.stokes@intel.com \
    --cc=jijiang.liu@intel.com \
    --cc=maxime.coquelin@redhat.com \
    --cc=olivier.matz@6wind.com \
    --cc=stable@dpdk.org \
    --cc=yuanhan.liu@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

This inbox may be cloned and mirrored by anyone:

	git clone --mirror https://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ https://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev

Example config snippet for mirrors.
Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git