From: Shahaf Shuler <shahafs@mellanox.com>
To: David Marchand <david.marchand@redhat.com>
Cc: Flavio Leitner <fbl@sysclose.org>, "dev@dpdk.org" <dev@dpdk.org>,
Maxime Coquelin <maxime.coquelin@redhat.com>,
Tiwei Bie <tiwei.bie@intel.com>,
Zhihong Wang <zhihong.wang@intel.com>,
Obrembski MichalX <michalx.obrembski@intel.com>,
Stokes Ian <ian.stokes@intel.com>
Subject: Re: [dpdk-dev] [PATCH] vhost: add support to large linear mbufs
Date: Wed, 2 Oct 2019 09:00:11 +0000 [thread overview]
Message-ID: <AM0PR0502MB3795E94C2D0570A715525C3AC39C0@AM0PR0502MB3795.eurprd05.prod.outlook.com> (raw)
In-Reply-To: <CAJFAV8xebrgarbfM04sFcMANTOUxRMvEaud9bvLuVKneBBZQfg@mail.gmail.com>
Wednesday, October 2, 2019 11:05 AM, David Marchand:
> Subject: Re: [dpdk-dev] [PATCH] vhost: add support to large linear mbufs
>
> Hello Shahaf,
>
> On Wed, Oct 2, 2019 at 6:46 AM Shahaf Shuler <shahafs@mellanox.com>
> wrote:
> >
> > Wednesday, October 2, 2019 1:20 AM, Flavio Leitner:
> > > Subject: [dpdk-dev] [PATCH] vhost: add support to large linear mbufs
> > >
> > > The rte_vhost_dequeue_burst supports two ways of dequeuing data. If
> > > the data fits into a buffer, then all data is copied and a single
> > > linear buffer is returned. Otherwise it allocates additional mbufs
> > > and chains them together to return a multiple segments mbuf.
> > >
> > > While that covers most use cases, it forces applications that need
> > > to work with larger data sizes to support multiple segments mbufs.
> > > The non-linear characteristic brings complexity and performance
> > > implications to the application.
> > >
> > > To resolve the issue, change the API so that the application can
> > > optionally provide a second mempool containing larger mbufs. If that
> > > is not provided (NULL), the behavior remains as before the change.
> > > Otherwise, the data size is checked and the corresponding mempool is
> > > used to return linear mbufs.
> >
> > I understand the motivation.
> > However, providing a static pool w/ large buffers is not so efficient in terms
> of memory footprint. You will need to prepare to worst case (all packet are
> large) w/ max size of 64KB.
> > Also, the two mempools are quite restrictive as the memory fill of the
> mbufs might be very sparse. E.g. mempool1 mbuf.size = 1.5K , mempool2
> mbuf.size = 64K, packet size 4KB.
> >
> > Instead, how about using the mbuf external buffer feature?
> > The flow will be:
> > 1. vhost PMD always receive a single mempool (like today) 2. on
> > dequeue, PMD looks on the virtio packet size. If smaller than the mbuf
> > size use the mbuf as is (like today) 3. otherwise, allocate a new
> > buffer (inside the PMD) and link it to the mbuf as external buffer
> > (rte_pktmbuf_attach_extbuf)
>
> I am missing some piece here.
> Which pool would the PMD take those external buffers from?
The mbuf is always taken from the single mempool associated w/ the rxq.
The buffer for the mbuf may be allocated (in case virtio payload is bigger than current mbuf size) from DPDK hugepages or any other system memory and be attached to the mbuf.
You can see example implementation of it in mlx5 PMD (checkout rte_pktmbuf_attach_extbuf call)
>
> If it is from an additional mempool passed to the vhost pmd, I can't see the
> difference with Flavio proposal.
>
>
> > The pros of this approach is that you have full flexibility on the memory
> allocation, and therefore a lower footprint.
> > The cons is the OVS will need to know how to handle mbuf w/ external
> buffers (not too complex IMO).
>
>
> --
> David Marchand
next prev parent reply other threads:[~2019-10-02 9:00 UTC|newest]
Thread overview: 32+ messages / expand[flat|nested] mbox.gz Atom feed top
2019-10-01 22:19 Flavio Leitner
2019-10-01 23:10 ` Flavio Leitner
2019-10-02 4:45 ` Shahaf Shuler
2019-10-02 8:04 ` David Marchand
2019-10-02 9:00 ` Shahaf Shuler [this message]
2019-10-02 12:58 ` Flavio Leitner
2019-10-02 17:50 ` Shahaf Shuler
2019-10-02 18:15 ` Flavio Leitner
2019-10-03 16:57 ` Ilya Maximets
2019-10-03 21:25 ` Flavio Leitner
2019-10-02 7:51 ` Maxime Coquelin
2019-10-04 20:10 ` [dpdk-dev] [PATCH v2] vhost: add support for large buffers Flavio Leitner
2019-10-06 4:47 ` Shahaf Shuler
2019-10-10 5:12 ` Tiwei Bie
2019-10-10 12:12 ` Flavio Leitner
2019-10-11 17:09 ` [dpdk-dev] [PATCH v3] " Flavio Leitner
2019-10-14 2:44 ` Tiwei Bie
2019-10-15 16:17 ` [dpdk-dev] [PATCH v4] " Flavio Leitner
2019-10-15 17:41 ` Ilya Maximets
2019-10-15 18:44 ` Flavio Leitner
2019-10-15 18:59 ` [dpdk-dev] [PATCH v5] " Flavio Leitner
2019-10-16 10:02 ` Maxime Coquelin
2019-10-16 11:13 ` Maxime Coquelin
2019-10-16 13:32 ` Ilya Maximets
2019-10-16 13:46 ` Maxime Coquelin
2019-10-16 14:02 ` Flavio Leitner
2019-10-16 14:08 ` Ilya Maximets
2019-10-16 14:14 ` Flavio Leitner
2019-10-16 14:05 ` Ilya Maximets
2019-10-29 9:02 ` David Marchand
2019-10-29 12:21 ` Flavio Leitner
2019-10-29 16:19 ` David Marchand
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=AM0PR0502MB3795E94C2D0570A715525C3AC39C0@AM0PR0502MB3795.eurprd05.prod.outlook.com \
--to=shahafs@mellanox.com \
--cc=david.marchand@redhat.com \
--cc=dev@dpdk.org \
--cc=fbl@sysclose.org \
--cc=ian.stokes@intel.com \
--cc=maxime.coquelin@redhat.com \
--cc=michalx.obrembski@intel.com \
--cc=tiwei.bie@intel.com \
--cc=zhihong.wang@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).