DPDK patches and discussions
 help / color / mirror / Atom feed
From: yang_y_yi  <yang_y_yi@163.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "Hu, Jiayu" <jiayu.hu@intel.com>, "dev@dpdk.org" <dev@dpdk.org>,
	 "olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	 "thomas@monjalon.net" <thomas@monjalon.net>,
	 "yangyi01@inspur.com" <yangyi01@inspur.com>
Subject: Re: [dpdk-dev] [PATCH] gso: fix free issue of mbuf gso segments attach to
Date: Tue, 20 Oct 2020 09:16:47 +0800 (CST)	[thread overview]
Message-ID: <53a1ce59.be8.17543948083.Coremail.yang_y_yi@163.com> (raw)
In-Reply-To: <BYAPR11MB33014583D6C2365A83436B7A9A1E0@BYAPR11MB3301.namprd11.prod.outlook.com>

At 2020-10-19 16:47:55, "Ananyev, Konstantin" <konstantin.ananyev@intel.com> wrote:

> -----Original Message-----

> From: Ananyev, Konstantin <konstantin.ananyev@intel.com>

> Sent: Monday, October 19, 2020 9:44 AM

> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>

> Subject: FW: Re:RE: Re:RE: [PATCH] gso: fix free issue of mbuf gso segments attach to

>

>

>

> From: yang_y_yi <yang_y_yi@163.com>

> Sent: Monday, October 19, 2020 7:45 AM

> To: Hu, Jiayu <jiayu.hu@intel.com>

> Cc: Ananyev, Konstantin <konstantin.ananyev@intel.com>; dev@dpdk.org; olivier.matz@6wind.com; thomas@monjalon.net;

> yangyi01@inspur.com

> Subject: Re:RE: Re:RE: [PATCH] gso: fix free issue of mbuf gso segments attach to

>

> At 2020-10-19 11:17:48, "Hu, Jiayu" <mailto:jiayu.hu@intel.com> wrote:

> >

> >

> >> -----Original Message-----

> >> From: Ananyev, Konstantin <mailto:konstantin.ananyev@intel.com>

> >> Sent: Friday, October 16, 2020 4:31 PM

> >> To: Hu, Jiayu <mailto:jiayu.hu@intel.com>; yang_y_yi <mailto:yang_y_yi@163.com>

> >> Cc: mailto:dev@dpdk.org; mailto:olivier.matz@6wind.com; mailto:thomas@monjalon.net;

> >> mailto:yangyi01@inspur.com

> >> Subject: RE: Re:RE: [PATCH] gso: fix free issue of mbuf gso segments attach to

> >>

> >>

> >>

> >> >

> >> > > > > >

> >> > > > > > I think it isn't a good idea to free it in rte_gso_segment, maybe

> >> > > application

> >> > > > > will continue to use this pkt for other purpose, rte_gso_segment

> >> > > > > > can't make decision for application without any notice, it is better to

> >> > > return

> >> > > > > this decision right backt to application.

> >> > > > > >

> >> > > > >

> >> > > > > I think, if user wants to keep original packet, he can always call

> >> > > > > rte_pktmbuf_refcnt_update(pkt, 1)

> >> > > > > just before calling gso function.

> >> > > > >

> >> > > > > Also, as I remember in some cases it is not safe to do free() for input

> >> > > packet

> >> > > > > (as pkt_out[] can contain input pkt itself). Would it also be user

> >> > > responsibility

> >> > > > > to determine

> >> > > > > such situations?

> >> > > >

> >> > > > In what case will pkt_out[] contain the input pkt? Can you give an

> >> example?

> >> > >

> >> > > As I can read the code, whenever gso code decides that

> >> > > no segmentation is not really needed, or it is not capable

> >> > > of doing it properly.

> >> > > Let say:

> >> > >

> >> > > gso_tcp4_segment(struct rte_mbuf *pkt,

> >> > >                 uint16_t gso_size,

> >> > >                 uint8_t ipid_delta,

> >> > >                 struct rte_mempool *direct_pool,

> >> > >                 struct rte_mempool *indirect_pool,

> >> > >                 struct rte_mbuf **pkts_out,

> >> > >                 uint16_t nb_pkts_out)

> >> > > {

> >> > > ...

> >> > > /* Don't process the fragmented packet */

> >> > >         ipv4_hdr = (struct rte_ipv4_hdr *)(rte_pktmbuf_mtod(pkt, char *) +

> >> > >                         pkt->l2_len);

> >> > >         frag_off = rte_be_to_cpu_16(ipv4_hdr->fragment_offset);

> >> > >         if (unlikely(IS_FRAGMENTED(frag_off))) {

> >> > >                 pkts_out[0] = pkt;

> >> > >                 return 1;

> >> > >         }

> >> > >

> >> > >         /* Don't process the packet without data */

> >> > >         hdr_offset = pkt->l2_len + pkt->l3_len + pkt->l4_len;

> >> > >         if (unlikely(hdr_offset >= pkt->pkt_len)) {

> >> > >                 pkts_out[0] = pkt;

> >> > >                 return 1;

> >> > >         }

> >> > >

> >> > > That's why in rte_gso_segment() we update refcnt only when ret > 1.

> >> >

> >> > But in these cases, the value of ret is 1. So we can free input pkt only when

> >> > ret > 1. Like:

> >> >

> >> > -       if (ret > 1) {

> >> > -               pkt_seg = pkt;

> >> > -               while (pkt_seg) {

> >> > -                       rte_mbuf_refcnt_update(pkt_seg, -1);

> >> > -                       pkt_seg = pkt_seg->next;

> >> > -               }

> >> > -       } else if (ret < 0) {

> >> > +       if (ret > 1)

> >> > +               rte_pktmbuf_free(pkt);

> >> > +       else if (ret < 0) {

> >> >                 /* Revert the ol_flags in the event of failure. */

> >> >                 pkt->ol_flags = ol_flags;

> >> >         }

> >>

> >> Yes, definitely. I am not arguing about that.

> >> My question was to the author of the original patch:

> >> He suggests not to free input packet inside gso function and leave it to the

> >> user.

> >> So, in his proposition, would it also become user responsibility to determine

> >> when input packet can be freed (it is not present in pkt_out[]) or not?

> >

> >@Yi, I am OK with the both designs. If you think it's better to free the input pkt by

> >users, you can keep the original design. But one thing to notice is that you need

> >to update definition of rte_gso_segment() in rte_gso.h too.

> >

> >Thanks,

> >Jiayu

>

> Ok, I prefer to handle it by users, this is incremental patch for rte_gso_segment description. Is it ok to you?

>

> diff --git a/lib/librte_gso/rte_gso.h b/lib/librte_gso/rte_gso.h

> index 3aab297..3762eba 100644

> --- a/lib/librte_gso/rte_gso.h

> +++ b/lib/librte_gso/rte_gso.h

> @@ -89,8 +89,11 @@ struct rte_gso_ctx {

>   * the GSO segments are sent to should support transmission of multi-segment

>   * packets.

>   *

> - * If the input packet is GSO'd, its mbuf refcnt reduces by 1. Therefore,

> - * when all GSO segments are freed, the input packet is freed automatically.

> + * If the input packet is GSO'd, all the indirect segments are attached to the

> + * input packet.

> + *

> + * rte_gso_segment() will not free the input packet no matter whether it is

> + * GSO'd or not, the application should free it after call rte_gso_segment().

>   *

>   * If the memory space in pkts_out or MBUF pools is insufficient, this

>   * function fails, and it returns (-1) * errno. Otherwise, GSO succeeds,

 

I think such change will require much more than that.

While formally API is not changed, due to behaviour change,

API users will need to change their code, right?

If so, I think such change needs to be treated as API breakage.

So I think if you'll choose to go ahead with your original approach,

you'll need to:

1. Change all places in dpdk.org that uses rte_gso_segment()

to make them work correctly with new behaviour.

2. Usually such change has to be declared at least one release in advance via deprecation notice.

In 20.11 we do allow some API changes without deprecation notice,

but all such changes have to be reviewed by dpdk tech-board.

So don’t forget to CC your patch to TB for review,

and explain clearly changes required in API caller code.

 

Konstantin

 

 

Thanks Konstantin for pointing out this, fortunately there are hardly users of rte_gso_segment() in dpdk source tree. drivers/net/tap/rte_eth_tap.c has already had correct code there.

                        num_tso_mbufs = rte_gso_segment(mbuf_in,
                                gso_ctx, /* gso control block */
                                (struct rte_mbuf **)&gso_mbufs, /* out mbufs */
                                RTE_DIM(gso_mbufs)); /* max tso mbufs */

                        /* ret contains the number of new created mbufs */
                        if (num_tso_mbufs < 0)
                                break;
......
                /* free original mbuf */
                rte_pktmbuf_free(mbuf_in);

We only need to change app/test-pmd/csumonly.c.

diff --git a/app/test-pmd/csumonly.c b/app/test-pmd/csumonly.c
index 3d7d244..829e07f 100644
--- a/app/test-pmd/csumonly.c
+++ b/app/test-pmd/csumonly.c
@@ -1080,11 +1080,12 @@ struct simple_gre_hdr {
                        ret = rte_gso_segment(pkts_burst[i], gso_ctx,
                                        &gso_segments[nb_segments],
                                        GSO_MAX_PKT_BURST - nb_segments);
+                       /* pkts_burst[i] can be freed safely here. */
+                       rte_pktmbuf_free(pkts_burst[i]);
                        if (ret >= 0)
                                nb_segments += ret;
                        else {
                                TESTPMD_LOG(DEBUG, "Unable to segment packet");
-                               rte_pktmbuf_free(pkts_burst[i]);
                        }
                }


I'll send a new version and cc to TB if you have no other comments.




  reply	other threads:[~2020-10-20  1:17 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-10-10  3:10 yang_y_yi
2020-10-13  7:28 ` Hu, Jiayu
2020-10-13 15:39   ` Ananyev, Konstantin
2020-10-14  1:00     ` Hu, Jiayu
2020-10-14  2:56       ` yang_y_yi
2020-10-14 12:05         ` Ananyev, Konstantin
2020-10-15  5:14           ` Hu, Jiayu
2020-10-15 16:16             ` Ananyev, Konstantin
2020-10-16  0:53               ` Hu, Jiayu
2020-10-16  8:31                 ` Ananyev, Konstantin
2020-10-19  3:17                   ` Hu, Jiayu
2020-10-19  6:44                     ` yang_y_yi
2020-10-19  8:47                       ` Ananyev, Konstantin
2020-10-20  1:16                         ` yang_y_yi [this message]
2020-10-19  2:29                 ` yang_y_yi
2020-10-19  2:20               ` yang_y_yi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=53a1ce59.be8.17543948083.Coremail.yang_y_yi@163.com \
    --to=yang_y_yi@163.com \
    --cc=dev@dpdk.org \
    --cc=jiayu.hu@intel.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=olivier.matz@6wind.com \
    --cc=thomas@monjalon.net \
    --cc=yangyi01@inspur.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).