MIME-Version: 1.0
References: <20191001221935.12140-1-fbl@sysclose.org>
 <AM0PR0502MB3795683AFDA5EE7759018EC4C39C0@AM0PR0502MB3795.eurprd05.prod.outlook.com>
In-Reply-To: <AM0PR0502MB3795683AFDA5EE7759018EC4C39C0@AM0PR0502MB3795.eurprd05.prod.outlook.com>
From: David Marchand <david.marchand@redhat.com>
Date: Wed, 2 Oct 2019 10:04:40 +0200
Message-ID: <CAJFAV8xebrgarbfM04sFcMANTOUxRMvEaud9bvLuVKneBBZQfg@mail.gmail.com>
To: Shahaf Shuler <shahafs@mellanox.com>
Cc: Flavio Leitner <fbl@sysclose.org>, "dev@dpdk.org" <dev@dpdk.org>, 
 Maxime Coquelin <maxime.coquelin@redhat.com>, Tiwei Bie <tiwei.bie@intel.com>, 
 Zhihong Wang <zhihong.wang@intel.com>,
 Obrembski MichalX <michalx.obrembski@intel.com>, 
 Stokes Ian <ian.stokes@intel.com>
Content-Type: text/plain; charset="UTF-8"
Subject: Re: [dpdk-dev] [PATCH] vhost: add support to large linear mbufs

Hello Shahaf,

On Wed, Oct 2, 2019 at 6:46 AM Shahaf Shuler <shahafs@mellanox.com> wrote:
>
> Wednesday, October 2, 2019 1:20 AM, Flavio Leitner:
> > Subject: [dpdk-dev] [PATCH] vhost: add support to large linear mbufs
> >
> > The rte_vhost_dequeue_burst supports two ways of dequeuing data. If the
> > data fits into a buffer, then all data is copied and a single linear
> > buffer is returned. Otherwise it allocates additional mbufs and chains
> > them together to return a multi-segment mbuf.
> >
> > While that covers most use cases, it forces applications that need to
> > work with larger data sizes to support multi-segment mbufs. The
> > non-linear characteristic brings complexity and performance implications
> > to the application.
> >
> > To resolve the issue, change the API so that the application can
> > optionally provide a second mempool containing larger mbufs. If that is
> > not provided (NULL), the behavior remains as before the change.
> > Otherwise, the data size is checked and the corresponding mempool is
> > used to return linear mbufs.
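
If I read that description right, the per-packet selection on dequeue would
look roughly like the sketch below; the helper name and the exact size check
are only illustrative, not taken from the patch:

/*
 * Hypothetical sketch of the pool selection described above; the helper
 * name and the exact size check are illustrative, not from the patch.
 */
#include <rte_mbuf.h>
#include <rte_mempool.h>

static struct rte_mbuf *
pick_dequeue_mbuf(struct rte_mempool *mp, struct rte_mempool *large_mp,
		  uint32_t pkt_len)
{
	/* No second pool given: keep the current single-pool behavior. */
	if (large_mp == NULL)
		return rte_pktmbuf_alloc(mp);

	/* Packet fits in a regular mbuf: use the regular pool. */
	if (pkt_len + RTE_PKTMBUF_HEADROOM <= rte_pktmbuf_data_room_size(mp))
		return rte_pktmbuf_alloc(mp);

	/* Otherwise take a large mbuf so the returned data stays linear. */
	return rte_pktmbuf_alloc(large_mp);
}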
>
> I understand the motivation.
> However, providing a static pool w/ large buffers is not so efficient in terms of memory footprint. You will need to prepare for the worst case (all packets are large) w/ a max size of 64KB.
> Also, the two mempools are quite restrictive as the memory fill of the mbufs might be very sparse. E.g. mempool1 mbuf.size = 1.5K, mempool2 mbuf.size = 64K, packet size 4KB.
>
> Instead, how about using the mbuf external buffer feature?
> The flow will be:
> 1. vhost PMD always receives a single mempool (like today)
> 2. on dequeue, the PMD looks at the virtio packet size. If it is smaller than the mbuf size, use the mbuf as is (like today)
> 3. otherwise, allocate a new buffer (inside the PMD) and link it to the mbuf as an external buffer (rte_pktmbuf_attach_extbuf, sketched below)
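
If I follow step 3, the attach would look something like the sketch below.
The rte_malloc() source, the sizes and the free callback are my assumptions;
only rte_pktmbuf_attach_extbuf(), the shinfo init helper and
rte_mem_virt2iova() are existing DPDK APIs:

/*
 * Rough sketch of step 3; buffer source, sizes and the free callback are
 * assumptions for illustration only.
 */
#include <stdint.h>
#include <rte_common.h>
#include <rte_malloc.h>
#include <rte_mbuf.h>
#include <rte_memory.h>

static void
ext_buf_free_cb(void *addr, void *opaque __rte_unused)
{
	rte_free(addr);
}

static int
attach_large_buf(struct rte_mbuf *m, uint32_t pkt_len)
{
	struct rte_mbuf_ext_shared_info *shinfo;
	uint32_t total = RTE_PKTMBUF_HEADROOM + pkt_len + sizeof(*shinfo);
	uint16_t buf_len;
	void *buf;

	/* rte_pktmbuf_attach_extbuf() takes a 16-bit buffer length. */
	if (total > UINT16_MAX)
		return -1;

	buf_len = total;
	buf = rte_malloc(NULL, buf_len, RTE_CACHE_LINE_SIZE);
	if (buf == NULL)
		return -1;

	/* Place the shared info (refcount + free callback) at the tail. */
	shinfo = rte_pktmbuf_ext_shinfo_init_helper(buf, &buf_len,
						    ext_buf_free_cb, NULL);
	if (shinfo == NULL) {
		rte_free(buf);
		return -1;
	}

	/* Link the external buffer to the mbuf instead of chaining mbufs. */
	rte_pktmbuf_attach_extbuf(m, buf, rte_mem_virt2iova(buf),
				  buf_len, shinfo);
	rte_pktmbuf_reset_headroom(m);
	return 0;
}

That still leaves open where such a buffer would actually come from.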

I am missing a piece here.
Which pool would the PMD take those external buffers from?

If it is from an additional mempool passed to the vhost PMD, I can't
see the difference from Flavio's proposal.


> The pro of this approach is that you have full flexibility over the memory allocation, and therefore a lower footprint.
> The con is that OVS will need to know how to handle mbufs w/ external buffers (not too complex IMO).
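
On the application side, the check is indeed small; a hypothetical helper
could look like the one below, where only RTE_MBUF_HAS_EXTBUF() and
rte_pktmbuf_is_contiguous() are existing DPDK helpers:

/*
 * Hypothetical classification an application could do on received mbufs.
 */
#include <rte_mbuf.h>

static void
classify_rx_mbuf(const struct rte_mbuf *m)
{
	if (RTE_MBUF_HAS_EXTBUF(m)) {
		/* Linear data in an attached external buffer; freeing the
		 * mbuf goes through the extbuf free callback. */
	} else if (!rte_pktmbuf_is_contiguous(m)) {
		/* Multi-segment chained mbuf, the case discussed above. */
	} else {
		/* Plain single-segment mbuf straight from the pool. */
	}
}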


-- 
David Marchand