From: "François-Frédéric Ozog" <ff@ozog.com>
To: "'Thomas Graf'" <tgraf@redhat.com>,
	"'Vincent JARDIN'" <vincent.jardin@6wind.com>
Cc: dev@openvswitch.org, dev@dpdk.org,
	'Gerald Rogers' <gerald.rogers@intel.com>,
	dpdk-ovs@ml01.01.org
Subject: Re: [dpdk-dev] [ovs-dev] [PATCH RFC] dpif-netdev: Add support Intel DPDK based ports.
Date: Wed, 29 Jan 2014 21:47:47 +0100	[thread overview]
Message-ID: <00ef01cf1d33$5e509270$1af1b750$@com> (raw)
In-Reply-To: <52E936D9.4010207@redhat.com>
> > First and easy answer: it is open source, so anyone can recompile. So,
> > what's the issue?
> 
> I'm talking from a pure distribution perspective here: Requiring to
> recompile all DPDK based applications to distribute a bugfix or to add
> support for a new PMD is not ideal.
> 
> So ideally OVS would have the possibility to link against the shared
> library long term.
I agree that distribution of DPDK apps is not covered properly at present.
Identifying the proper scheme requires a specific analysis based on the
constraints of the Telecom/Cloud/Networking markets.
In the telecom world, if you fix the underlying framework of an app, you
will still have to validate the solution, ie app/framework. In addition, the
idea of shared libraries introduces the implied requirement to validate apps
against diverse versions of DPDK shared libraries. This translates into
development and support costs.
I also expect many DPDK applications to tackle core networking features,
with sub micro second packet handling delays  and even lower than 200ns
(NAT64...). The lazy binding based on ELF PLT represent quite a cost, not
mentioning that optimization stops are shared libraries boundaries (gcc
whole program optimization can be very effective...). Microsoft DLL linkage
are an order of magnitude faster. If Linux was to provide that, I would
probably revise my judgment. (I haven't checked Linux dynamic linking
implementation for some time so my understanding of Linux dynamic linking
may be outdated).
> 
> > I get lost: do you mean ABI + API toward the PMDs or towards the
> > applications using the librte ?
> 
> Towards the PMDs is more straight forward at first so it seems logical to
> focus on that first.
I don't think it is so straight forward. Many recent cards such as Chelsio
and Myricom have a very different "packet memory layout" that does not fit
so easily into actual DPDK architecture.
1) "traditional" architecture: the driver reserves X buffers and provide the
card with descriptors of those buffers. Each packet is DMA'ed into exactly
one buffer. Typically you have 2K buffers, a 64 byte packet consumes exactly
one buffer
2) "alternative" new architecture: the driver reserves a memory zone, say
4MB, without any structure, and provide a a single zone description and a
ring buffer to the card. (there no individual buffer descriptors any more).
The card fills the memory zone with packets, one next to the other and
specifies where the packets are by updating the supplied ring. Out of the
many issues fitting this scheme into DPDK, you cannot free a single mbuf:
you have to maintain a ref count to the memory zone so that, when all mbufs
have been "released", the memory zone can be freed.
That's quite a stretch from actual paradigm.
Apart from this aspect, managing RSS is two tied to Intel's flow director
concepts and cannot accommodate directly smarter or dumber RSS mechanisms.
That said, I fully agree PMD API should be revisited.
Cordially,
François-Frédéric
next prev parent reply	other threads:[~2014-01-29 20:49 UTC|newest]
Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-28  1:48 [dpdk-dev] " pshelar
     [not found] ` <20140128044950.GA4545@nicira.com>
2014-01-28  5:28   ` [dpdk-dev] [ovs-dev] " Pravin Shelar
2014-01-28 14:47     ` [dpdk-dev] " Vincent JARDIN
2014-01-28 17:56       ` Pravin Shelar
2014-01-29  0:15         ` Vincent JARDIN
2014-01-29 19:32           ` Pravin Shelar
     [not found]       ` <52E7D2A8.400@redhat.com>
2014-01-28 18:20         ` [dpdk-dev] [ovs-dev] " Pravin Shelar
     [not found] ` <52E7D13B.9020404@redhat.com>
2014-01-28 18:17   ` Pravin Shelar
2014-01-29  8:15     ` Thomas Graf
2014-01-29 10:26       ` Vincent JARDIN
2014-01-29 11:14         ` Thomas Graf
2014-01-29 16:34           ` Vincent JARDIN
2014-01-29 17:14             ` Thomas Graf
2014-01-29 18:42               ` Stephen Hemminger
2014-01-29 20:47               ` François-Frédéric Ozog [this message]
2014-01-29 23:15                 ` Thomas Graf
2014-03-13  7:37                 ` David Nyström
2014-01-29  8:56 ` [dpdk-dev] " Prashant Upadhyaya
2014-01-29 21:29   ` Pravin Shelar
2014-01-30 10:15     ` Prashant Upadhyaya
2014-01-30 16:27       ` Rogers, Gerald
2014-01-29 10:01 ` [dpdk-dev] [ovs-dev] " Thomas Graf
2014-01-29 21:49   ` Pravin Shelar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox
  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):
  git send-email \
    --in-reply-to='00ef01cf1d33$5e509270$1af1b750$@com' \
    --to=ff@ozog.com \
    --cc=dev@dpdk.org \
    --cc=dev@openvswitch.org \
    --cc=dpdk-ovs@ml01.01.org \
    --cc=gerald.rogers@intel.com \
    --cc=tgraf@redhat.com \
    --cc=vincent.jardin@6wind.com \
    /path/to/YOUR_REPLY
  https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
  Be sure your reply has a Subject: header at the top and a blank line
  before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).