DPDK patches and discussions
From: "Zhou, Danny" <danny.zhou@intel.com>
To: Thomas Monjalon <thomas.monjalon@6wind.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>,
	"Fastabend, John R" <john.r.fastabend@intel.com>,
	Or Gerlitz <ogerlitz@mellanox.com>
Subject: Re: [dpdk-dev] bifurcated driver
Date: Thu, 6 Nov 2014 04:45:09 +0000
Message-ID: <DFDF335405C17848924A094BC35766CF0A990522@SHSMSX104.ccr.corp.intel.com>
In-Reply-To: <545ACF39.5000507@6wind.com>

I read through the libibverbs code and the relevant InfiniBand/RDMA documents. Although many 
concepts in libibverbs look similar to the bifurcated driver, there are still a number of differences, 
listed below based on my understanding: 

1) Queue pairs as defined in the RDMA specification are an abstract concept, whereas the queue pairs in the 
  bifurcated driver are the rx/tx queue pairs in the NIC.
2) The bifurcated PMD in DPDK directly accesses NIC resources as a slave driver (with no NIC control), while
  libibverbs, a user space library rather than a driver, offloads certain operations to the kernel driver and
  the NIC by invoking "verbs" APIs.
3) Libibverbs invokes InfiniBand-specific system calls for user/kernel space communication based on the 
  "verbs" defined in the InfiniBand/RDMA spec, while the bifurcated driver builds on top of the af_packet module 
  and new socket options for operations such as hw queue split-off and mapping certain pages of I/O space
  into user space.
4) InfiniBand/RDMA hardware has a dedicated embedded MMU that provides memory protection, while the
  bifurcated driver uses the IOMMU rather than the NIC to provide memory protection.
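
As a rough illustration of the queue split-off in point 3, here is a toy model in Python. It is only a
sketch and not the actual kernel patches or the af_packet API: the class ToyNic, its methods, and the
rule that queue 0 always stays with the kernel are all hypothetical, chosen to show the idea that some
NIC queue pairs are handed to user space, that flow-director-style rules steer selected flows to them,
and that everything else stays with the kernel netdev.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNic:
    """Hypothetical model of a NIC whose queue pairs are shared by kernel and user space."""
    num_qpairs: int
    user_owned: set = field(default_factory=set)    # queue pairs split off to user space
    flow_rules: dict = field(default_factory=dict)  # (proto, dst_port) -> queue index

    def split_off(self, first, count):
        """Hand queue pairs [first, first + count) to user space; the kernel keeps the rest."""
        qpairs = range(first, first + count)
        # Assumption of this model: queue 0 is always reserved for the kernel.
        if first < 1 or count < 1 or qpairs.stop > self.num_qpairs:
            raise ValueError("range must fit the device and exclude queue 0")
        if self.user_owned & set(qpairs):
            raise ValueError("queue pair already split off")
        self.user_owned |= set(qpairs)
        return qpairs

    def add_flow_rule(self, proto, dst_port, queue):
        """Steer one flow to a split-off queue, in the spirit of a flow director filter."""
        if queue not in self.user_owned:
            raise ValueError("rules in this model may only target split-off queues")
        self.flow_rules[(proto, dst_port)] = queue

    def classify(self, proto, dst_port):
        """Return (owner, queue); traffic that matches no rule goes to kernel queue 0."""
        queue = self.flow_rules.get((proto, dst_port))
        if queue is None:
            return ("kernel", 0)
        return ("user", queue)


nic = ToyNic(num_qpairs=8)
user_queues = nic.split_off(first=4, count=2)   # queue pairs 4 and 5 go to user space
nic.add_flow_rule("udp", 4789, user_queues[0])  # steer one flow to user-space queue 4
print(nic.classify("udp", 4789))                # matched flow lands on the user-space queue
print(nic.classify("tcp", 22))                  # unmatched traffic stays with the kernel
```

In this model the kernel remains the owner of the device and of the filter table, which is the sense in
which the user space PMD acts as a slave driver with no NIC control.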

IMHO, libibverbs and the corresponding kernel modules/drivers are specifically designed and implemented for 
direct access to RDMA hardware from user space. They depend heavily on the "verbs"-related system calls 
provided by the InfiniBand/RDMA mechanism in the kernel, rather than on the netdev mechanism that the 
bifurcated driver solution depends on. 

> -----Original Message-----
> From: Vincent JARDIN [mailto:vincent.jardin@6wind.com]
> Sent: Thursday, November 06, 2014 9:31 AM
> To: Zhou, Danny
> Cc: Thomas Monjalon; dev@dpdk.org; Fastabend, John R; Or Gerlitz
> Subject: Re: [dpdk-dev] bifurcated driver
> 
> +Or
> 
> On 05/11/2014 23:48, Zhou, Danny wrote:
> > Hi Thomas,
> >
> > Thanks for sharing the links to ibverbs. I will take a close look at it and compare it to the bifurcated driver. My take
> > after a rough review is that the idea is very similar, but the bifurcated driver implementation is generic for any
> > Ethernet device, based on the existing af_packet mechanism extended with message exchange between
> > user space and the kernel space driver.
> >
> > I have an internal document summarizing the pros and cons of the solutions below, except for ibverbs, which I
> > will be adding shortly.
> >
> > - igb_uio
> > - uio_pci_generic
> > - VFIO
> > - bifurcated driver
> >
> > Short answers to your questions:
> >> 	- upstream status
> > IOMMU-based memory protection and generic descriptor description support are being added now, in version 2
> > of the kernel patches.
> >
> >> 	- usable with kernel netdev
> > af_packet based, and relevant patchset will be submitted to netdev for sure.
> >
> >> 	- usable in a vm
> > No, it does not coexist with SRIOV, for a number of reasons. But if you pass through a PF to a VM, it works perfectly.
> >
> >> 	- usable for Ethernet
> > It could work with all Ethernet NICs, as long as flow director is available and the NIC driver supports the new
> > net_ops to split off queue pairs for user space.
> >
> >> 	- hardware requirements
> > No specific hardware requirements. All mainstream NICs have multiple qpairs and flow director support.
> >
> >> 	- security protection
> > Leverages the IOMMU to provide memory protection on Intel platforms. Other archs provide similar memory protection
> > mechanisms, so we only use arch-agnostic DMA memory allocation APIs in the kernel to support memory protection.
> >
> >> 	- performance
> > DPDK native performance on user space queues, as long as drop_en is enabled to avoid head-of-line blocking.
> >
> > -Danny
> >
> >> -----Original Message-----
> >> From: Thomas Monjalon [mailto:thomas.monjalon@6wind.com]
> >> Sent: Wednesday, November 05, 2014 9:01 PM
> >> To: Zhou, Danny
> >> Cc: dev@dpdk.org; Fastabend, John R
> >> Subject: Re: [dpdk-dev] bifurcated driver
> >>
> >> Hi Danny,
> >>
> >> 2014-10-31 17:36, O'driscoll, Tim:
> >>> Bifurcated Driver (Danny.Zhou@intel.com)
> >>
> >> Thanks for the presentation of bifurcated driver during the community call.
> >> I asked if you looked at ibverbs and you wanted a link to check.
> >> The kernel module is here:
> >> 	http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/core
> >> The userspace library:
> >> 	http://git.kernel.org/cgit/libs/infiniband/libibverbs.git
> >>
> >> Extract from Kconfig:
> >> "
> >> config INFINIBAND_USER_ACCESS
> >> 	tristate "InfiniBand userspace access (verbs and CM)"
> >> 	select ANON_INODES
> >> 	---help---
> >> 	  Userspace InfiniBand access support.  This enables the
> >> 	  kernel side of userspace verbs and the userspace
> >> 	  communication manager (CM).  This allows userspace processes
> >> 	  to set up connections and directly access InfiniBand
> >> 	  hardware for fast-path operations.  You will also need
> >> 	  libibverbs, libibcm and a hardware driver library from
> >> 	  <http://www.openfabrics.org/git/>.
> >> "
> >>
> >> It seems to be close to the bifurcated driver needs.
> >> Not sure if it can solve the security issues if there is no dedicated MMU
> >> in the NIC.
> >>
> >> I feel we should sum up pros and cons of
> >> 	- igb_uio
> >> 	- uio_pci_generic
> >> 	- VFIO
> >> 	- ibverbs
> >> 	- bifurcated driver
> >> I suggest considering these criteria:
> >> 	- upstream status
> >> 	- usable with kernel netdev
> >> 	- usable in a vm
> >> 	- usable for ethernet
> >> 	- hardware requirements
> >> 	- security protection
> >> 	- performance
> >>
> >> --
> >> Thomas
