From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <ff@ozog.com>
Received: from mo3.mail-out.ovh.net (11.mo3.mail-out.ovh.net [87.98.184.158])
 by dpdk.org (Postfix) with ESMTP id 23BEA5320
 for <dev@dpdk.org>; Wed, 29 Jan 2014 21:49:28 +0100 (CET)
Received: from mail408.ha.ovh.net (b6.ovh.net [213.186.33.56])
 by mo3.mail-out.ovh.net (Postfix) with SMTP id 7BA57FFB1B3
 for <dev@dpdk.org>; Wed, 29 Jan 2014 21:50:46 +0100 (CET)
Received: from b0.ovh.net (HELO queueout) (213.186.33.50)
 by b0.ovh.net with SMTP; 29 Jan 2014 22:51:33 +0200
Received: from lneuilly-152-23-9-75.w193-252.abo.wanadoo.fr (HELO pcdeff)
 (ff@ozog.com@193.252.40.75)
 by ns0.ovh.net with SMTP; 29 Jan 2014 22:51:32 +0200
From: =?iso-8859-1?Q?Fran=E7ois-Fr=E9d=E9ric_Ozog?= <ff@ozog.com>
To: "'Thomas Graf'" <tgraf@redhat.com>,
 "'Vincent JARDIN'" <vincent.jardin@6wind.com>
References: <1390873715-26714-1-git-send-email-pshelar@nicira.com>	<52E7D13B.9020404@redhat.com>
 <CALnjE+rP29s8mkiKPtppt-a8jMn-B2qS7+re2ZBd8bK46ozUPA@mail.gmail.com>
 <52E8B88A.1070104@redhat.com> <52E8D772.9070302@6wind.com>
 <52E8E2AB.1080600@redhat.com> <52E92DA6.9070704@6wind.com>
 <52E936D9.4010207@redhat.com>
In-Reply-To: <52E936D9.4010207@redhat.com>
Date: Wed, 29 Jan 2014 21:47:47 +0100
Message-ID: <00ef01cf1d33$5e509270$1af1b750$@com>
MIME-Version: 1.0
Content-Type: text/plain;
	charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable
X-Mailer: Microsoft Office Outlook 12.0
Thread-Index: Ac8dFYYyHRZDQgiqRhyUPV79pq/LXwAF/OeA
Content-Language: fr
X-Ovh-Tracer-Id: 7625720069508880601
X-Ovh-Remote: 193.252.40.75 (lneuilly-152-23-9-75.w193-252.abo.wanadoo.fr)
X-Ovh-Local: 213.186.33.20 (ns0.ovh.net)
X-OVH-SPAMSTATE: OK
X-OVH-SPAMSCORE: -100
X-OVH-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrfeejtddrieehucetufdoteggodetrfcurfhrohhfihhlvgemucfqggfjnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd
X-Spam-Check: DONE|U 0.5/N
X-VR-SPAMSTATE: OK
X-VR-SPAMSCORE: -100
X-VR-SPAMCAUSE: gggruggvucftvghtrhhoucdtuddrfeejtddrieehucetufdoteggodetrfcurfhrohhfihhlvgemucfqggfjnecuuegrihhlohhuthemuceftddtnecusecvtfgvtghiphhivghnthhsucdlqddutddtmd
Cc: dev@openvswitch.org, dev@dpdk.org,
 'Gerald Rogers' <gerald.rogers@intel.com>, dpdk-ovs@ml01.01.org
Subject: Re: [dpdk-dev] [ovs-dev] [PATCH RFC] dpif-netdev: Add support Intel
	DPDK based ports.
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 29 Jan 2014 20:49:28 -0000

> > First and easy answer: it is open source, so anyone can recompile. So,
> > what's the issue?
>
> I'm talking from a pure distribution perspective here: Requiring to
> recompile all DPDK based applications to distribute a bugfix or to add
> support for a new PMD is not ideal.

>
> So ideally OVS would have the possibility to link against the shared
> library long term.

I agree that distribution of DPDK apps is not covered properly at present.
Identifying the proper scheme requires a specific analysis based on the
constraints of the Telecom/Cloud/Networking markets.

In the telecom world, if you fix the underlying framework of an app, you
will still have to validate the solution, i.e. the app/framework
combination. In addition, the idea of shared libraries introduces the
implied requirement to validate apps against diverse versions of DPDK
shared libraries. This translates into development and support costs.

I also expect many DPDK applications to tackle core networking features
with sub-microsecond packet handling delays, some even lower than 200ns
(NAT64...). Lazy binding through the ELF PLT represents quite a cost, not
to mention that optimization stops at shared library boundaries (gcc
whole-program optimization can be very effective...). Microsoft DLL
linkage is an order of magnitude faster. If Linux were to provide that, I
would probably revise my judgment. (I haven't checked the Linux dynamic
linking implementation for some time, so my understanding of it may be
outdated.)


>
> > I get lost: do you mean ABI + API toward the PMDs or towards the
> > applications using the librte ?
>=20
> Towards the PMDs is more straight forward at first so it seems logical to
> focus on that first.

I don't think it is so straightforward. Many recent cards, such as those
from Chelsio and Myricom, have a very different "packet memory layout"
that does not fit so easily into the current DPDK architecture.

1) "traditional" architecture: the driver reserves X buffers and provides
the card with descriptors of those buffers. Each packet is DMA'ed into
exactly one buffer. Typically you have 2K buffers, and a 64-byte packet
consumes exactly one buffer.

2) "alternative" new architecture: the driver reserves a memory zone, say
4MB, without any structure, and provides a single zone description and a
ring buffer to the card (there are no individual buffer descriptors any
more). The card fills the memory zone with packets, one next to the
other, and specifies where the packets are by updating the supplied ring.
Among the many issues in fitting this scheme into DPDK, you cannot free a
single mbuf: you have to maintain a ref count on the memory zone so that,
when all mbufs have been "released", the memory zone can be freed.
That's quite a stretch from the current paradigm.
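
To make the second model concrete, here is a minimal sketch in plain C of
what the ref counting could look like. All names (pkt_zone, pkt_ref,
zone_rx, zone_retire, pkt_release, zone_demo) are hypothetical and are not
DPDK or vendor APIs; the "card" is simulated with memcpy, and the
"retired" flag is my assumption for how the driver would mark a zone the
card no longer writes to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical types: an unstructured zone shared by many packets. */
struct pkt_zone {
    uint8_t *base;    /* start of the zone, NULL once freed */
    size_t   size;    /* zone size in bytes */
    size_t   fill;    /* next free offset, advanced by the "card" */
    unsigned refcnt;  /* packets in this zone still held by the app */
    int      retired; /* assumption: card has stopped filling the zone */
};

/* One ring entry: where a received packet sits inside the zone. */
struct pkt_ref {
    struct pkt_zone *zone;
    size_t off;
    size_t len;
};

static int zone_init(struct pkt_zone *z, size_t size)
{
    z->base = malloc(size);
    if (z->base == NULL)
        return -1;
    z->size = size;
    z->fill = 0;
    z->refcnt = 0;
    z->retired = 0;
    return 0;
}

/* The zone memory goes away only when retired AND no packet uses it. */
static void zone_try_free(struct pkt_zone *z)
{
    if (z->retired && z->refcnt == 0 && z->base != NULL) {
        free(z->base);
        z->base = NULL;
    }
}

/* Simulated card: packs the packet right after the previous one. */
static int zone_rx(struct pkt_zone *z, const void *data, size_t len,
                   struct pkt_ref *out)
{
    if (z->base == NULL || z->retired || len > z->size - z->fill)
        return -1;
    memcpy(z->base + z->fill, data, len);
    out->zone = z;
    out->off = z->fill;
    out->len = len;
    z->fill += len;
    z->refcnt++;
    return 0;
}

static void zone_retire(struct pkt_zone *z)
{
    z->retired = 1;
    zone_try_free(z);
}

/* Releasing one packet frees no memory by itself; it drops one ref. */
static void pkt_release(struct pkt_ref *p)
{
    struct pkt_zone *z = p->zone;

    p->zone = NULL;
    z->refcnt--;
    zone_try_free(z);
}

/* Usage: two 64-byte packets share one 4MB zone. */
int zone_demo(void)
{
    struct pkt_zone z;
    struct pkt_ref a, b;
    uint8_t frame[64] = {0};

    assert(zone_init(&z, 4u * 1024 * 1024) == 0);
    assert(zone_rx(&z, frame, sizeof frame, &a) == 0);
    assert(zone_rx(&z, frame, sizeof frame, &b) == 0);
    assert(b.off == 64);     /* packets sit back to back */
    zone_retire(&z);
    pkt_release(&a);
    assert(z.base != NULL);  /* packet b still pins the whole zone */
    pkt_release(&b);
    assert(z.base == NULL);  /* last release frees the zone */
    return 0;
}
```

Note how a single long-lived packet keeps the whole 4MB zone pinned,
which is exactly what does not map onto a per-mbuf free.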

Apart from this aspect, managing RSS is too tied to Intel's flow director
concepts and cannot directly accommodate smarter or dumber RSS
mechanisms.

That said, I fully agree PMD API should be revisited.


Cordially,

François-Frédéric