DPDK patches and discussions
 help / color / Atom feed
From: yang_y_yi  <yang_y_yi@163.com>
To: "Olivier Matz" <olivier.matz@6wind.com>
Cc: dev@dpdk.org, jiayu.hu@intel.com, thomas@monjalon.net,
	yangyi01@inspur.com
Subject: Re: [dpdk-dev] [PATCH V1 2/3] mbuf: change free_cb interface to adapt to GSO case
Date: Sun, 2 Aug 2020 07:12:36 +0800 (CST)
Message-ID: <2a50e80c.44f.173ac4c6d20.Coremail.yang_y_yi@163.com> (raw)
In-Reply-To: <20200731151543.GH5869@platinum>



At 2020-07-31 23:15:43, "Olivier Matz" <olivier.matz@6wind.com> wrote:
>Hi,
>
>On Thu, Jul 30, 2020 at 08:08:59PM +0800, yang_y_yi@163.com wrote:
>> From: Yi Yang <yangyi01@inspur.com>
>> 
>> In GSO case, segmented mbufs are attached to original
>> mbuf which can't be freed when it is external. The issue
>> is free_cb doesn't know original mbuf and doesn't free
>> it when refcnt of shinfo is 0.
>> 
>> Original mbuf can be freed by rte_pktmbuf_free segmented
>> mbufs or by rte_pktmbuf_free original mbuf. Two kind of
>> cases should have different behaviors. free_cb won't
>> explicitly call rte_pktmbuf_free to free original mbuf
>> if it is freed by rte_pktmbuf_free original mbuf, but it
>> has to call rte_pktmbuf_free to free original mbuf if it
>> is freed by rte_pktmbuf_free segmented mbufs.
>> 
>> In order to fix this issue, free_cb interface has to been
>> changed, __rte_pktmbuf_free_extbuf must deliver called
>> mbuf pointer to free_cb, argument opaque can be defined
>> as a custom struct by user, it can includes original mbuf
>> pointer, user-defined free_cb can compare caller mbuf with
>> mbuf in opaque struct, free_cb should free original mbuf
>> if they are not same, this corresponds to rte_pktmbuf_free
>> segmented mbufs case, otherwise, free_cb won't free original
>> mbuf because the caller explicitly called rte_pktmbuf_free
>> to free it.
>> 
>> Here is pseduo code to show two kind of cases.
>> 
>> case 1. rte_pktmbuf_free segmented mbufs
>> 
>> nb_tx = rte_gso_segment(original_mbuf, /* original mbuf */
>>                       &gso_ctx,
>>                       /* segmented mbuf */
>>                       (struct rte_mbuf **)&gso_mbufs,
>>                       MAX_GSO_MBUFS);
>
>I'm sorry but it is not very clear to me what operations are done by
>rte_gso_segment().
>
>In the current code, I only see calls to rte_pktmbuf_attach(),
>which do not deal with external buffers. Am I missing something?
>
>Are you able to show the issue only with mbuf functions? It would
>be helpful to understand what does not work.
>
>
>Thanks,
>Olivier
>
Oliver, thank you for comment, let me show you why it doesn't work for my use case.  In OVS DPDK, VM uses vhostuserclient to send large packets whose size is about 64K because we enabled TSO & UFO, these large packets use rte_mbufs allocated by DPDK virtio_net function 
virtio_dev_pktmbuf_alloc() (in file lib/librte_vhost/virtio_net.c. Please refer to [PATCH V1 3/3], I changed free_cb as below, these packets use the same allocate function and the same free_cb no matter they are TCP packet or UDP packets, in case of VXLAN TSO, most NICs can't support inner UDP fragment offload, so OVS DPDK has to do it by software, for UDP case, the original rte_mbuf only can be freed by segmented rte_mbufs which are output packets of rte_gso_segment, i.e. the original rte_mbuf only can freed by free_cb, you can see, it explicitly called rte_pktmbuf_free(arg->mbuf), the condition statement "if (caller_m != arg->mbuf)" is true for this case, this has no problem, but for TCP case, the original mbuf is delivered to rte_eth_tx_burst() but not segmented rte_mbufs output by rte_gso_segment, PMD driver will call rte_pktmbuf_free(original_rte_mbuf) but not rte_pktmbuf_free(segmented_rte_mbufs), the same free_cb will be called, that means original_rte_mbuf will be freed twice, you know what will happen, this is just the issue I'm fixing. I bring in caller_m argument, it can help work around this because caller_m is arg->mbuf and the condition statement "if (caller_m != arg->mbuf)" is false, you can't fix it without the change this patch series did.

diff --git a/lib/librte_vhost/virtio_net.c b/lib/librte_vhost/virtio_net.c index e663fd4..3b69cbb 100644

--- a/lib/librte_vhost/virtio_net.c

+++ b/lib/librte_vhost/virtio_net.c

@@ -2136,10 +2136,20 @@ uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,

                return NULL;

 }

 

+struct shinfo_arg {

+             void *buf;

+             struct rte_mbuf *mbuf;

+};

+

 static void

-virtio_dev_extbuf_free(struct rte_mbuf * caller_m __rte_unused, void *opaque)

+virtio_dev_extbuf_free(struct rte_mbuf *caller_m, void *opaque)

 {

-              rte_free(opaque);

+             struct shinfo_arg *arg = (struct shinfo_arg *)opaque;

+

+             rte_free(arg->buf);

+             if (caller_m != arg->mbuf)

+                             rte_pktmbuf_free(arg->mbuf);

+             rte_free(arg);

 }

 

 static int

@@ -2172,8 +2182,14 @@ uint16_t rte_vhost_poll_enqueue_completed(int vid, uint16_t queue_id,

 

                /* Initialize shinfo */

                if (shinfo) {

+                             struct shinfo_arg *arg = (struct shinfo_arg *)

+                                             rte_malloc(NULL, sizeof(struct shinfo_arg),

+                                                                RTE_CACHE_LINE_SIZE);

+

+                             arg->buf = buf;

+                             arg->mbuf = pkt;

                                shinfo->free_cb = virtio_dev_extbuf_free;

-                              shinfo->fcb_opaque = buf;

+                             shinfo->fcb_opaque = arg;

                                rte_mbuf_ext_refcnt_set(shinfo, 1);

                } else {

                                shinfo = rte_pktmbuf_ext_shinfo_init_helper(buf, &buf_len,

--

1.8.3.1



/*
 * Allocate a host supported pktmbuf.
 */
static __rte_always_inline struct rte_mbuf *
virtio_dev_pktmbuf_alloc(struct virtio_net *dev, struct rte_mempool *mp,
                         uint32_t data_len)
{
        struct rte_mbuf *pkt = rte_pktmbuf_alloc(mp);

        if (unlikely(pkt == NULL)) {
                VHOST_LOG_DATA(ERR,
                        "Failed to allocate memory for mbuf.\n");
                return NULL;
        }

        if (rte_pktmbuf_tailroom(pkt) >= data_len)
                return pkt;

        /* attach an external buffer if supported */
        if (dev->extbuf && !virtio_dev_extbuf_alloc(pkt, data_len))
                return pkt;

        /* check if chained buffers are allowed */
        if (!dev->linearbuf)
                return pkt;

        /* Data doesn't fit into the buffer and the host supports
         * only linear buffers
         */
        rte_pktmbuf_free(pkt);

        return NULL;
}
 

  reply index

Thread overview: 21+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-30 12:08 [dpdk-dev] [PATCH V1 0/3] Fix external mbuf free issue in " yang_y_yi
2020-07-30 12:08 ` [dpdk-dev] [PATCH V1 1/3] gso: fix refcnt update issue for external mbuf yang_y_yi
2020-07-30 12:08 ` [dpdk-dev] [PATCH V1 2/3] mbuf: change free_cb interface to adapt to GSO case yang_y_yi
2020-07-31 15:15   ` Olivier Matz
2020-08-01 23:12     ` yang_y_yi [this message]
2020-08-02 20:29       ` Olivier Matz
2020-08-02 20:45         ` Olivier Matz
2020-08-03  1:32           ` yang_y_yi
2020-08-03  1:26         ` yang_y_yi
2020-08-03  8:11           ` Olivier Matz
2020-08-03  9:42             ` yang_y_yi
2020-08-03 12:34               ` Olivier Matz
2020-08-04  1:31                 ` yang_y_yi
2020-09-27  5:55                   ` yang_y_yi
2020-10-07  9:48                     ` Olivier Matz
2020-10-09  9:51                       ` yang_y_yi
2020-10-09 11:55                         ` Olivier Matz
2020-10-10  1:49                           ` yang_y_yi
2020-10-14 13:55                             ` Olivier Matz
2020-10-15  2:52                               ` yang_y_yi
2020-07-30 12:09 ` [dpdk-dev] [PATCH V1 3/3] vhost: use new free_cb interface to fix mbuf free issue yang_y_yi

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=2a50e80c.44f.173ac4c6d20.Coremail.yang_y_yi@163.com \
    --to=yang_y_yi@163.com \
    --cc=dev@dpdk.org \
    --cc=jiayu.hu@intel.com \
    --cc=olivier.matz@6wind.com \
    --cc=thomas@monjalon.net \
    --cc=yangyi01@inspur.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

DPDK patches and discussions

Archives are clonable:
	git clone --mirror http://inbox.dpdk.org/dev/0 dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 dev dev/ http://inbox.dpdk.org/dev \
		dev@dpdk.org
	public-inbox-index dev


Newsgroup available over NNTP:
	nntp://inbox.dpdk.org/inbox.dpdk.dev


AGPL code for this site: git clone https://public-inbox.org/ public-inbox