From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <venky.venkatesan@intel.com>
Received: from mga01.intel.com (mga01.intel.com [192.55.52.88])
 by dpdk.org (Postfix) with ESMTP id 492D7591A
 for <dev@dpdk.org>; Wed, 27 May 2015 17:30:24 +0200 (CEST)
Received: from fmsmga002.fm.intel.com ([10.253.24.26])
 by fmsmga101.fm.intel.com with ESMTP; 27 May 2015 08:30:23 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.13,506,1427785200"; d="scan'208";a="732538884"
Received: from orsmsx105.amr.corp.intel.com ([10.22.225.132])
 by fmsmga002.fm.intel.com with ESMTP; 27 May 2015 08:30:24 -0700
Received: from orsmsx151.amr.corp.intel.com (10.22.226.38) by
 ORSMSX105.amr.corp.intel.com (10.22.225.132) with Microsoft SMTP Server (TLS)
 id 14.3.224.2; Wed, 27 May 2015 08:30:22 -0700
Received: from orsmsx102.amr.corp.intel.com ([169.254.1.45]) by
 ORSMSX151.amr.corp.intel.com ([169.254.7.160]) with mapi id 14.03.0224.002;
 Wed, 27 May 2015 08:30:22 -0700
From: "Venkatesan, Venky" <venky.venkatesan@intel.com>
To: "Wiles, Keith" <keith.wiles@intel.com>, Lin XU <lxu@astri.org>,
 "dev@dpdk.org" <dev@dpdk.org>
Thread-Topic: [dpdk-dev] proposal: raw packet send and receive API for PMD
 driver
Thread-Index: AQHQmIyEdvTN/E8PiUuVC401KnNyBZ2P7WKQ
Date: Wed, 27 May 2015 15:30:22 +0000
Message-ID: <1FD9B82B8BF2CF418D9A1000154491D9744CD75D@ORSMSX102.amr.corp.intel.com>
References: <D18B41D3.20FDE%keith.wiles@intel.com>
In-Reply-To: <D18B41D3.20FDE%keith.wiles@intel.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [10.22.254.139]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] proposal: raw packet send and receive API for PMD
 driver
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 27 May 2015 15:30:25 -0000

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Wiles, Keith
> Sent: Wednesday, May 27, 2015 7:51 AM
> To: Lin XU; dev@dpdk.org
> Subject: Re: [dpdk-dev] proposal: raw packet send and receive API for PMD
> driver
>
>
>
> On 5/26/15, 11:18 PM, "Lin XU" <lxu@astri.org> wrote:
>
> >I think it is very important to decouple PMD driver with DPDK framework.
> >   (1) Currently, the rte_mbuf struct is too simple and hard to support
> >complex application such as IPSEC, flow control etc. This key struct
> >should be extendable to support customer defined management header and
> >hardware offloading feature.
>
> I was wondering if adding something like M_EXT support for external
> storage to DPDK MBUF would be more reasonable.
>
> IMO decoupling PMDs from DPDK will possibly impact performance and I
> would prefer not to let this happen. The drivers are written for performance,
> but they did start out as normal FreeBSD/Linux drivers. Most of the core
> code in the Intel drivers is shared with other systems.
>
This was an explicit design choice to keep the mbuf simple, yet sufficient
to service volume NIC controllers and the limited offloads that they have.
I would prefer not to have rte_mbuf burdened with all that a protocol stack
needs - that will simply increase the size of the structure and penalize
applications that need a lean structure (like security applications).
Extending the mbuf to 128 bytes itself caused a regression in some
performance apps.

That said, extensibility, or for that matter a custom-defined header, is
possible in at least two ways:
a) the userdata pointer field can be set to point to a data structure that
contains more information
b) you could simply embed the custom structure (like the pipeline code does)
behind the rte_mbuf

These can be used to pass through any information from NICs that support
hardware offloads, as well as to carry protocol-stack-specific information
(e.g. a complete IPSEC offload).
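
Roughly, the two options look like this (the struct and helper names here
are just illustrative, they are not part of DPDK):

/* Illustrative only - app_meta and the helpers below are hypothetical. */
#include <rte_mbuf.h>

struct app_meta {
	uint32_t sa_index;   /* e.g. result of an IPSEC SA lookup */
	uint32_t flow_id;
};

/* (a) hang the metadata off the userdata pointer field */
static inline void
app_meta_attach(struct rte_mbuf *m, struct app_meta *meta)
{
	m->userdata = meta;
}

/* (b) embed the metadata directly behind the mbuf header; this assumes
 * the mbuf pool was created with at least sizeof(struct app_meta) of
 * private area following struct rte_mbuf. */
static inline struct app_meta *
app_meta_get(struct rte_mbuf *m)
{
	return (struct app_meta *)(m + 1);
}

Option (b) is essentially what the pipeline code does today, just with its
own metadata layout; the only requirement is that the pool reserves the
private area when it is created.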

> >   (2) To support more NICs.
> >So, I think it's time to add a new API for PMD (in a non-radical way), and
> >developers can add initial callback functions in PMD for various upper
> >layer protocol procedures.
>
> We have one callback now I think, but what callbacks do you need?
>
> The only callback I can think of is for a stack to know when it can release
> its hold on the data as it has been transmitted for retry needs.
> >
> >

The one place that I do think we need to change is the memory allocation
framework - allowing external memory allocators (buf alloc/free) so that the
driver could be run within a completely different memory allocator system.
It can be done with the system we have in place today with specific
overrides, but it isn't simple. I think there was another request along
similar lines on the list. This would pretty much allow a TCP stack, for
example, to allocate and manage memory (as long as the driver interface via
an MBUF can be maintained). If this is something valuable, we could look at
pursuing this for the next release.
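
To make that concrete, the kind of hook I have in mind is roughly the
following (purely a sketch - none of these names exist in DPDK today):

/* Hypothetical per-queue buffer allocator hooks - not an existing API. */
#include <rte_mbuf.h>

struct ext_buf_ops {
	void *pool_ctx;                                /* external allocator handle */
	struct rte_mbuf *(*buf_alloc)(void *pool_ctx); /* instead of rte_pktmbuf_alloc() */
	void (*buf_free)(void *pool_ctx, struct rte_mbuf *m);
};

An RX refill path would then call ops->buf_alloc(ops->pool_ctx) instead of
pulling from an rte_mempool directly, and TX completion would return buffers
through ops->buf_free(). That would let a TCP stack own and manage the
memory, as long as whatever it hands the driver is still laid out as an
rte_mbuf.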

-Venky