DPDK patches and discussions
 help / color / mirror / Atom feed
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "yang_y_yi@163.com" <yang_y_yi@163.com>, "dev@dpdk.org" <dev@dpdk.org>
Cc: "Hu, Jiayu" <jiayu.hu@intel.com>,
	"techboard@dpdk.org" <techboard@dpdk.org>,
	"thomas@monjalon.net" <thomas@monjalon.net>,
	"yangyi01@inspur.com" <yangyi01@inspur.com>
Subject: Re: [dpdk-dev] [PATCH v3] gso: fix free issue of mbuf gso segments attach to
Date: Tue, 27 Oct 2020 19:55:49 +0000	[thread overview]
Message-ID: <BYAPR11MB3301B32FA2889BDDBEF806899A160@BYAPR11MB3301.namprd11.prod.outlook.com> (raw)
In-Reply-To: <20201026064713.33316-1-yang_y_yi@163.com>



> -----Original Message-----
> From: yang_y_yi@163.com <yang_y_yi@163.com>
> Sent: Monday, October 26, 2020 6:47 AM
> To: dev@dpdk.org
> Cc: Hu, Jiayu <jiayu.hu@intel.com>; Ananyev, Konstantin <konstantin.ananyev@intel.com>; techboard@dpdk.org; thomas@monjalon.net;
> yangyi01@inspur.com; yang_y_yi@163.com
> Subject: [PATCH v3] gso: fix free issue of mbuf gso segments attach to> 
> From: Yi Yang <yangyi01@inspur.com>
> 
> rte_gso_segment decreased refcnt of pkt by one, but
> it is wrong if pkt is external mbuf, pkt won't be
> freed because of incorrect refcnt, the result is
> application can't allocate mbuf from mempool because
> mbufs in mempool are run out of.
> 
> One correct way is application should call
> rte_pktmbuf_free after calling rte_gso_segment to free
> pkt explicitly. rte_gso_segment mustn't handle it, this
> should be responsibility of application.
> 
> This commit changed rte_gso_segment in functional behavior
> and return value, so the application must take appropriate
> actions according to return values, "ret < 0" means it
> should free and drop 'pkt', "ret == 0" means 'pkt' isn't
> GSOed but 'pkt' can be transimmitted as a normal packet,
> "ret > 0" means 'pkt' has been GSOed into two or multiple
> segments, it should use "pkts_out" to transmit these
> segments. The application must free 'pkt' after call
> rte_gso_segment when return value isn't equal to 0.

Tech-board members: this is not a formal API breakage,
but it is a functional change (i.e. all code that uses that API will need to be changed).
There was no deprecation note in advance.
So please provide your input: are you ok with such change or not.

I am ok with the proposed changes.
Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>
  

> 
> Fixes: 119583797b6a ("gso: support TCP/IPv4 GSO")
> Signed-off-by: Yi Yang <yangyi01@inspur.com>
> ---
> Changelog:
> 
> v2->v3:
>   - add release notes to emphasize behavior and return
>     value changes of rte_gso_segment().
>   - update return value description of rte_gso_segment().
>   - modify related code to adapt to the changes.
> 
> v1->v2:
>   - update description of rte_gso_segment().
>   - change code which calls rte_gso_segment() to
>     fix free issue.
> 
> ---
>  app/test-pmd/csumonly.c                                   | 12 ++++++++++--
>  .../prog_guide/generic_segmentation_offload_lib.rst       |  7 +++++--
>  doc/guides/rel_notes/release_20_11.rst                    |  7 +++++++
>  drivers/net/tap/rte_eth_tap.c                             | 12 ++++++++++--
>  lib/librte_gso/gso_tcp4.c                                 |  6 ++----
>  lib/librte_gso/gso_tunnel_tcp4.c                          | 14 +++++---------
>  lib/librte_gso/gso_udp4.c                                 |  6 ++----
>  lib/librte_gso/rte_gso.c                                  | 15 +++------------
>  lib/librte_gso/rte_gso.h                                  |  8 ++++++--
>  9 files changed, 50 insertions(+), 37 deletions(-)
> 
> diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
> index 3d7d244..d813d4f 100644
> --- a/app/test-pmd/csumonly.c
> +++ b/app/test-pmd/csumonly.c
> @@ -1080,9 +1080,17 @@ struct simple_gre_hdr {
>  			ret = rte_gso_segment(pkts_burst[i], gso_ctx,
>  					&gso_segments[nb_segments],
>  					GSO_MAX_PKT_BURST - nb_segments);
> -			if (ret >= 0)
> +			if (ret >= 1) {
> +				/* pkts_burst[i] can be freed safely here. */
> +				rte_pktmbuf_free(pkts_burst[i]);
>  				nb_segments += ret;
> -			else {
> +			} else if (ret == 0) {
> +				/* 0 means it can be transmitted directly
> +				 * without gso.
> +				 */
> +				gso_segments[nb_segments] = pkts_burst[i];
> +				nb_segments += 1;
> +			} else {
>  				TESTPMD_LOG(DEBUG, "Unable to segment packet");
>  				rte_pktmbuf_free(pkts_burst[i]);
>  			}
> diff --git a/doc/guides/prog_guide/generic_segmentation_offload_lib.rst b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
> index 205cb8a..8577572 100644
> --- a/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
> +++ b/doc/guides/prog_guide/generic_segmentation_offload_lib.rst
> @@ -25,8 +25,9 @@ Bearing that in mind, the GSO library enables DPDK applications to segment
>  packets in software. Note however, that GSO is implemented as a standalone
>  library, and not via a 'fallback' mechanism (i.e. for when TSO is unsupported
>  in the underlying hardware); that is, applications must explicitly invoke the
> -GSO library to segment packets. The size of GSO segments ``(segsz)`` is
> -configurable by the application.
> +GSO library to segment packets, they also must call ``rte_pktmbuf_free()`` to
> +free mbuf GSO segments attach to after calling ``rte_gso_segment()``. The size
> +of GSO segments ``(segsz)`` is configurable by the application.
> 
>  Limitations
>  -----------
> @@ -233,6 +234,8 @@ To segment an outgoing packet, an application must:
> 
>  #. Invoke the GSO segmentation API, ``rte_gso_segment()``.
> 
> +#. Call ``rte_pktmbuf_free()`` to free mbuf ``rte_gso_segment()`` segments.
> +
>  #. If required, update the L3 and L4 checksums of the newly-created segments.
>     For tunneled packets, the outer IPv4 headers' checksums should also be
>     updated. Alternatively, the application may offload checksum calculation
> diff --git a/doc/guides/rel_notes/release_20_11.rst b/doc/guides/rel_notes/release_20_11.rst
> index d8ac359..da77396 100644
> --- a/doc/guides/rel_notes/release_20_11.rst
> +++ b/doc/guides/rel_notes/release_20_11.rst
> @@ -543,6 +543,13 @@ API Changes
>  * sched: Removed ``tb_rate``, ``tc_rate``, ``tc_period`` and ``tb_size``
>    from ``struct rte_sched_subport_params``.
> 
> +* **Changed ``rte_gso_segment`` in functional behavior and return value.**
> +
> +  * Don't save pkt to pkts_out[0] if it isn't GSOed in case of ret == 1.
> +  * Return 0 instead of 1 for the above case.
> +  * ``rte_gso_segment`` won't free pkt no matter whether it is GSOed, the
> +    application has responsibility to free it after call ``rte_gso_segment``.
> +
> 
>  ABI Changes
>  -----------
> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
> index 81c6884..2f8abb1 100644
> --- a/drivers/net/tap/rte_eth_tap.c
> +++ b/drivers/net/tap/rte_eth_tap.c
> @@ -751,8 +751,16 @@ struct ipc_queues {
>  			if (num_tso_mbufs < 0)
>  				break;
> 
> -			mbuf = gso_mbufs;
> -			num_mbufs = num_tso_mbufs;
> +			if (num_tso_mbufs >= 1) {
> +				mbuf = gso_mbufs;
> +				num_mbufs = num_tso_mbufs;
> +			} else {
> +				/* 0 means it can be transmitted directly
> +				 * without gso.
> +				 */
> +				mbuf = &mbuf_in;
> +				num_mbufs = 1;
> +			}
>  		} else {
>  			/* stats.errs will be incremented */
>  			if (rte_pktmbuf_pkt_len(mbuf_in) > max_size)
> diff --git a/lib/librte_gso/gso_tcp4.c b/lib/librte_gso/gso_tcp4.c
> index ade172a..d31feaf 100644
> --- a/lib/librte_gso/gso_tcp4.c
> +++ b/lib/librte_gso/gso_tcp4.c
> @@ -50,15 +50,13 @@
>  			pkt->l2_len);
>  	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
>  	if (unlikely(IS_FRAGMENTED(frag_off))) {
> -		pkts_out[0] = pkt;
> -		return 1;
> +		return 0;
>  	}
> 
>  	/* Don't process the packet without data */
>  	hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;
>  	if (unlikely(hdr_offset >= pkt->pkt_len)) {
> -		pkts_out[0] = pkt;
> -		return 1;
> +		return 0;
>  	}
> 
>  	pyld_unit_size = gso_size - hdr_offset;
> diff --git a/lib/librte_gso/gso_tunnel_tcp4.c b/lib/librte_gso/gso_tunnel_tcp4.c
> index e0384c2..166aace 100644
> --- a/lib/librte_gso/gso_tunnel_tcp4.c
> +++ b/lib/librte_gso/gso_tunnel_tcp4.c
> @@ -62,7 +62,7 @@
>  {
>  	struct rte_ipv4_hdr *inner_ipv4_hdr;
>  	uint16_t pyld_unit_size, hdr_offset, frag_off;
> -	int ret = 1;
> +	int ret;
> 
>  	hdr_offset = pkt->outer_l2_len + pkt->outer_l3_len + pkt->l2_len;
>  	inner_ipv4_hdr = (struct rte_ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +
> @@ -73,25 +73,21 @@
>  	 */
>  	frag_off = rte_be_to_cpu_16(inner_ipv4_hdr->fragment_offset);
>  	if (unlikely(IS_FRAGMENTED(frag_off))) {
> -		pkts_out[0] = pkt;
> -		return 1;
> +		return 0;
>  	}
> 
>  	hdr_offset += pkt->l3_len + pkt->l4_len;
>  	/* Don't process the packet without data */
>  	if (hdr_offset >= pkt->pkt_len) {
> -		pkts_out[0] = pkt;
> -		return 1;
> +		return 0;
>  	}
>  	pyld_unit_size = gso_size - hdr_offset;
> 
>  	/* Segment the payload */
>  	ret = gso_do_segment(pkt, hdr_offset, pyld_unit_size, direct_pool,
>  			indirect_pool, pkts_out, nb_pkts_out);
> -	if (ret <= 1)
> -		return ret;
> -
> -	update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
> +	if (ret > 1)
> +		update_tunnel_ipv4_tcp_headers(pkt, ipid_delta, pkts_out, ret);
> 
>  	return ret;
>  }
> diff --git a/lib/librte_gso/gso_udp4.c b/lib/librte_gso/gso_udp4.c
> index 6fa68f2..5d0186a 100644
> --- a/lib/librte_gso/gso_udp4.c
> +++ b/lib/librte_gso/gso_udp4.c
> @@ -52,8 +52,7 @@
>  			pkt->l2_len);
>  	frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);
>  	if (unlikely(IS_FRAGMENTED(frag_off))) {
> -		pkts_out[0] = pkt;
> -		return 1;
> +		return 0;
>  	}
> 
>  	/*
> @@ -65,8 +64,7 @@
> 
>  	/* Don't process the packet without data. */
>  	if (unlikely(hdr_offset + pkt->l4_len >= pkt->pkt_len)) {
> -		pkts_out[0] = pkt;
> -		return 1;
> +		return 0;
>  	}
> 
>  	/* pyld_unit_size must be a multiple of 8 because frag_off
> diff --git a/lib/librte_gso/rte_gso.c b/lib/librte_gso/rte_gso.c
> index 751b5b6..896350e 100644
> --- a/lib/librte_gso/rte_gso.c
> +++ b/lib/librte_gso/rte_gso.c
> @@ -30,7 +30,6 @@
>  		uint16_t nb_pkts_out)
>  {
>  	struct rte_mempool *direct_pool, *indirect_pool;
> -	struct rte_mbuf *pkt_seg;
>  	uint64_t ol_flags;
>  	uint16_t gso_size;
>  	uint8_t ipid_delta;
> @@ -44,8 +43,7 @@
> 
>  	if (gso_ctx->gso_size >= pkt->pkt_len) {
>  		pkt->ol_flags &= (~(PKT_TX_TCP_SEG | PKT_TX_UDP_SEG));
> -		pkts_out[0] = pkt;
> -		return 1;
> +		return 0;
>  	}
> 
>  	direct_pool = gso_ctx->direct_pool;
> @@ -75,18 +73,11 @@
>  				indirect_pool, pkts_out, nb_pkts_out);
>  	} else {
>  		/* unsupported packet, skip */
> -		pkts_out[0] = pkt;
>  		RTE_LOG(DEBUG, GSO, "Unsupported packet type\n");
> -		return 1;
> +		ret = 0;
>  	}
> 
> -	if (ret > 1) {
> -		pkt_seg = pkt;
> -		while (pkt_seg) {
> -			rte_mbuf_refcnt_update(pkt_seg, -1);
> -			pkt_seg = pkt_seg->next;
> -		}
> -	} else if (ret < 0) {
> +	if (ret < 0) {
>  		/* Revert the ol_flags in the event of failure. */
>  		pkt->ol_flags = ol_flags;
>  	}
> diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h
> index 3aab297..af480ee 100644
> --- a/lib/librte_gso/rte_gso.h
> +++ b/lib/librte_gso/rte_gso.h
> @@ -89,8 +89,11 @@ struct rte_gso_ctx {
>   * the GSO segments are sent to should support transmission of multi-segment
>   * packets.
>   *
> - * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,
> - * when all GSO segments are freed, the input packet is freed automatically.
> + * If the input packet is GSO'd, all the indirect segments are attached to the
> + * input packet.
> + *
> + * rte_gso_segment() will not free the input packet no matter whether it is
> + * GSO'd or not, the application should free it after call rte_gso_segment().
>   *
>   * If the memory space in pkts_out or MBUF pools is insufficient, this
>   * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,
> @@ -109,6 +112,7 @@ struct rte_gso_ctx {
>   *
>   * @return
>   *  - The number of GSO segments filled in pkts_out on success.
> + *  - Return 0 if it needn't GSOed.
>   *  - Return -ENOMEM if run out of memory in MBUF pools.
>   *  - Return -EINVAL for invalid parameters.
>   */
> --
> 1.8.3.1


  reply	other threads:[~2020-10-27 19:56 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-26  6:47 yang_y_yi
2020-10-27 19:55 ` Ananyev, Konstantin [this message]
2020-10-29 21:39   ` Thomas Monjalon
2020-11-03 21:34     ` Thomas Monjalon
2020-11-04  0:35       ` yang_y_yi
2020-10-28  0:51 ` Hu, Jiayu

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=BYAPR11MB3301B32FA2889BDDBEF806899A160@BYAPR11MB3301.namprd11.prod.outlook.com \
    --to=konstantin.ananyev@intel.com \
    --cc=dev@dpdk.org \
    --cc=jiayu.hu@intel.com \
    --cc=techboard@dpdk.org \
    --cc=thomas@monjalon.net \
    --cc=yang_y_yi@163.com \
    --cc=yangyi01@inspur.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).