From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1C049A0548; Thu, 2 Jun 2022 15:20:36 +0200 (CEST) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id BE58E427EA; Thu, 2 Jun 2022 15:20:35 +0200 (CEST) Received: from shelob.oktetlabs.ru (shelob.oktetlabs.ru [91.220.146.113]) by mails.dpdk.org (Postfix) with ESMTP id 577B6427EA for ; Thu, 2 Jun 2022 15:20:34 +0200 (CEST) Received: from [192.168.1.126] (unknown [188.242.181.57]) (using TLSv1.3 with cipher TLS_AES_128_GCM_SHA256 (128/128 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits)) (No client certificate requested) by shelob.oktetlabs.ru (Postfix) with ESMTPSA id E8F6BC9; Thu, 2 Jun 2022 16:20:33 +0300 (MSK) DKIM-Filter: OpenDKIM Filter v2.11.0 shelob.oktetlabs.ru E8F6BC9 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=oktetlabs.ru; s=default; t=1654176034; bh=tuLY/fFYJLQjGojEYKx4aCz3Fk3sJhrK+RdGtoku0Jw=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=avjoOR7vqqnOUZSzPIyDgyjew/tJHiim+/zBpzukwR1hDf+hQjRuRzCH6KBj2vMB0 wpeDe4HSPJ3kX5DPOx5XAgIUoHxYc0Dj0RwADa+VsU+hpvx8alvrLvHYtkG6pGKTj6 oX4s0y4khG8roebFeg35lL0JysO16H3jAcXg9nzg= Message-ID: Date: Thu, 2 Jun 2022 16:20:33 +0300 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.0 Subject: Re: [PATCH v8 1/3] ethdev: introduce protocol hdr based buffer split Content-Language: en-US To: wenxuanx.wu@intel.com, thomas@monjalon.net, xiaoyun.li@intel.com, ferruh.yigit@xilinx.com, aman.deep.singh@intel.com, dev@dpdk.org, yuying.zhang@intel.com, qi.z.zhang@intel.com, jerinjacobk@gmail.com Cc: stephen@networkplumber.org, Xuan Ding , Yuan Wang , Ray Kinsella References: <20220303060136.36427-1-xuan.ding@intel.com> <20220601135059.958882-1-wenxuanx.wu@intel.com> <20220601135059.958882-2-wenxuanx.wu@intel.com> From: Andrew Rybchenko In-Reply-To: <20220601135059.958882-2-wenxuanx.wu@intel.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Is it the right one since it is listed in patchwork? On 6/1/22 16:50, wenxuanx.wu@intel.com wrote: > From: Wenxuan Wu > > Currently, Rx buffer split supports length based split. With Rx queue > offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT enabled and Rx packet segment > configured, PMD will be able to split the received packets into > multiple segments. > > However, length based buffer split is not suitable for NICs that do split > based on protocol headers. Given a arbitrarily variable length in Rx packet a -> an > segment, it is almost impossible to pass a fixed protocol header to PMD. > Besides, the existence of tunneling results in the composition of a packet > is various, which makes the situation even worse. > > This patch extends current buffer split to support protocol header based > buffer split. A new proto_hdr field is introduced in the reserved field > of rte_eth_rxseg_split structure to specify protocol header. The proto_hdr > field defines the split position of packet, splitting will always happens > after the protocol header defined in the Rx packet segment. When Rx queue > offload RTE_ETH_RX_OFFLOAD_BUFFER_SPLIT is enabled and corresponding > protocol header is configured, PMD will split the ingress packets into > multiple segments. > > struct rte_eth_rxseg_split { > > struct rte_mempool *mp; /* memory pools to allocate segment from */ > uint16_t length; /* segment maximal data length, > configures "split point" */ > uint16_t offset; /* data offset from beginning > of mbuf data buffer */ > uint32_t proto_hdr; /* inner/outer L2/L3/L4 protocol header, > configures "split point" */ > }; > > Both inner and outer L2/L3/L4 level protocol header split can be supported. > Corresponding protocol header capability is RTE_PTYPE_L2_ETHER, > RTE_PTYPE_L3_IPV4, RTE_PTYPE_L3_IPV6, RTE_PTYPE_L4_TCP, RTE_PTYPE_L4_UDP, > RTE_PTYPE_L4_SCTP, RTE_PTYPE_INNER_L2_ETHER, RTE_PTYPE_INNER_L3_IPV4, > RTE_PTYPE_INNER_L3_IPV6, RTE_PTYPE_INNER_L4_TCP, RTE_PTYPE_INNER_L4_UDP, > RTE_PTYPE_INNER_L4_SCTP. > > For example, let's suppose we configured the Rx queue with the > following segments: > seg0 - pool0, proto_hdr0=RTE_PTYPE_L3_IPV4, off0=2B > seg1 - pool1, proto_hdr1=RTE_PTYPE_L4_UDP, off1=128B > seg2 - pool2, off1=0B > > The packet consists of MAC_IPV4_UDP_PAYLOAD will be split like > following: > seg0 - ipv4 header @ RTE_PKTMBUF_HEADROOM + 2 in mbuf from pool0 > seg1 - udp header @ 128 in mbuf from pool1 > seg2 - payload @ 0 in mbuf from pool2 It must be defined how ICMPv4 packets will be split in such case. And how UDP over IPv6 will be split. > > Now buffet split can be configured in two modes. For length based > buffer split, the mp, length, offset field in Rx packet segment should > be configured, while the proto_hdr field should not be configured. > For protocol header based buffer split, the mp, offset, proto_hdr field > in Rx packet segment should be configured, while the length field should > not be configured. > > The split limitations imposed by underlying PMD is reported in the > rte_eth_dev_info->rx_seg_capa field. The memory attributes for the split > parts may differ either, dpdk memory and external memory, respectively. > > Signed-off-by: Xuan Ding > Signed-off-by: Yuan Wang > Signed-off-by: Wenxuan Wu > Reviewed-by: Qi Zhang > Acked-by: Ray Kinsella > --- > lib/ethdev/rte_ethdev.c | 40 +++++++++++++++++++++++++++++++++------- > lib/ethdev/rte_ethdev.h | 28 +++++++++++++++++++++++++++- > 2 files changed, 60 insertions(+), 8 deletions(-) > > diff --git a/lib/ethdev/rte_ethdev.c b/lib/ethdev/rte_ethdev.c > index 29a3d80466..fbd55cdd9d 100644 > --- a/lib/ethdev/rte_ethdev.c > +++ b/lib/ethdev/rte_ethdev.c > @@ -1661,6 +1661,7 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg, > struct rte_mempool *mpl = rx_seg[seg_idx].mp; > uint32_t length = rx_seg[seg_idx].length; > uint32_t offset = rx_seg[seg_idx].offset; > + uint32_t proto_hdr = rx_seg[seg_idx].proto_hdr; > > if (mpl == NULL) { > RTE_ETHDEV_LOG(ERR, "null mempool pointer\n"); > @@ -1694,13 +1695,38 @@ rte_eth_rx_queue_check_split(const struct rte_eth_rxseg_split *rx_seg, > } > offset += seg_idx != 0 ? 0 : RTE_PKTMBUF_HEADROOM; > *mbp_buf_size = rte_pktmbuf_data_room_size(mpl); > - length = length != 0 ? length : *mbp_buf_size; > - if (*mbp_buf_size < length + offset) { > - RTE_ETHDEV_LOG(ERR, > - "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n", > - mpl->name, *mbp_buf_size, > - length + offset, length, offset); > - return -EINVAL; > + if (proto_hdr == RTE_PTYPE_UNKNOWN) { > + /* Split at fixed length. */ > + length = length != 0 ? length : *mbp_buf_size; > + if (*mbp_buf_size < length + offset) { > + RTE_ETHDEV_LOG(ERR, > + "%s mbuf_data_room_size %u < %u (segment length=%u + segment offset=%u)\n", > + mpl->name, *mbp_buf_size, > + length + offset, length, offset); > + return -EINVAL; > + } > + } else { > + /* Split after specified protocol header. */ > + if (!(proto_hdr & RTE_BUFFER_SPLIT_PROTO_HDR_MASK)) { The condition looks suspicious. It will be true if proto_hdr has no single bit from the mask. I guess it is not the intent. I guess the condition should be proto_hdr & ~RTE_BUFFER_SPLIT_PROTO_HDR_MASK i.e. there is unsupported bits in proto_hdr IMHO we need extra field in dev_info to report supported protocols to split on. Or a new API to get an array similar to ptype get. May be a new API is a better choice to not overload dev_info and to be more flexible in reporting. > + RTE_ETHDEV_LOG(ERR, > + "Protocol header %u not supported)\n", > + proto_hdr); I think it would be useful to log unsupported bits only, if we say so. > + return -EINVAL; > + } > + > + if (length != 0) { > + RTE_ETHDEV_LOG(ERR, "segment length should be set to zero in protocol header " > + "based buffer split\n"); > + return -EINVAL; > + } > + > + if (*mbp_buf_size < offset) { > + RTE_ETHDEV_LOG(ERR, > + "%s mbuf_data_room_size %u < %u segment offset)\n", > + mpl->name, *mbp_buf_size, > + offset); > + return -EINVAL; > + } > } > } > return 0; > diff --git a/lib/ethdev/rte_ethdev.h b/lib/ethdev/rte_ethdev.h > index 04cff8ee10..0cd9dd6cc0 100644 > --- a/lib/ethdev/rte_ethdev.h > +++ b/lib/ethdev/rte_ethdev.h > @@ -1187,6 +1187,9 @@ struct rte_eth_txmode { > * mbuf) the following data will be pushed to the next segment > * up to its own length, and so on. > * > + * - The proto_hdrs in the elements define the split position of > + * received packets. > + * > * - If the length in the segment description element is zero > * the actual buffer size will be deduced from the appropriate > * memory pool properties. > @@ -1197,14 +1200,37 @@ struct rte_eth_txmode { > * - pool from the last valid element > * - the buffer size from this pool > * - zero offset > + * > + * - Length based buffer split: > + * - mp, length, offset should be configured. > + * - The proto_hdr field should not be configured. > + * > + * - Protocol header based buffer split: > + * - mp, offset, proto_hdr should be configured. > + * - The length field should not be configured. > */ > struct rte_eth_rxseg_split { > struct rte_mempool *mp; /**< Memory pool to allocate segment from. */ > uint16_t length; /**< Segment data length, configures split point. */ > uint16_t offset; /**< Data offset from beginning of mbuf data buffer. */ > - uint32_t reserved; /**< Reserved field. */ > + uint32_t proto_hdr; /**< Inner/outer L2/L3/L4 protocol header, configures split point. */ > }; > > +/* Buffer split protocol header capability. */ > +#define RTE_BUFFER_SPLIT_PROTO_HDR_MASK ( \ > + RTE_PTYPE_L2_ETHER | \ > + RTE_PTYPE_L3_IPV4 | \ > + RTE_PTYPE_L3_IPV6 | \ > + RTE_PTYPE_L4_TCP | \ > + RTE_PTYPE_L4_UDP | \ > + RTE_PTYPE_L4_SCTP | \ > + RTE_PTYPE_INNER_L2_ETHER | \ > + RTE_PTYPE_INNER_L3_IPV4 | \ > + RTE_PTYPE_INNER_L3_IPV6 | \ > + RTE_PTYPE_INNER_L4_TCP | \ > + RTE_PTYPE_INNER_L4_UDP | \ > + RTE_PTYPE_INNER_L4_SCTP) > + > /** > * @warning > * @b EXPERIMENTAL: this structure may change without prior notice.