From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <jiaxt@sinogrid.com>
Received: from APAC01-SG1-obe.outbound.protection.outlook.com
 (mail-sg1on0116.outbound.protection.outlook.com [134.170.132.116])
 by dpdk.org (Postfix) with ESMTP id 2EC796A87
 for <dev@dpdk.org>; Mon, 20 Apr 2015 12:01:11 +0200 (CEST)
Authentication-Results: dpdk.org; dkim=none (message not signed) header.d=none;
Received: from mail-ig0-f180.google.com (209.85.213.180) by
 HKNPR04MB146.apcprd04.prod.outlook.com (10.242.103.147) with Microsoft SMTP
 Server (TLS) id 15.1.136.25; Mon, 20 Apr 2015 10:01:07 +0000
Received: by igbhj9 with SMTP id hj9so56044815igb.1
 for <dev@dpdk.org>; Mon, 20 Apr 2015 03:01:00 -0700 (PDT)
MIME-Version: 1.0
X-Received: by 10.50.61.226 with SMTP id t2mr6866955igr.19.1429523599477; Mon,
 20 Apr 2015 02:53:19 -0700 (PDT)
Received: by 10.50.37.232 with HTTP; Mon, 20 Apr 2015 02:53:19 -0700 (PDT)
In-Reply-To: <CAMiqCqXWzjyc_y_jYeiBx1ecoMBe9dKwx9fsD7bLQ1n8rjYrDA@mail.gmail.com>
References: <1416924682-24170-1-git-send-email-cunming.liang@intel.com>
 <CAMiqCqXWzjyc_y_jYeiBx1ecoMBe9dKwx9fsD7bLQ1n8rjYrDA@mail.gmail.com>
Date: Mon, 20 Apr 2015 17:53:19 +0800
Message-ID: <CAMiqCqWQ5qFYCmcB_sTo=nFv+t+atMxhk8qP6z_STVaUvsyNBQ@mail.gmail.com>
From: Shelton Chia <jiaxt@sinogrid.com>
To: Cunming Liang <cunming.liang@intel.com>, <dev@dpdk.org>
X-Originating-IP: [209.85.213.180]
X-ClientProxiedBy: HKNPR04CA003.apcprd04.prod.outlook.com (10.242.116.33) To
 HKNPR04MB146.apcprd04.prod.outlook.com (10.242.103.147)
X-Microsoft-Antispam: UriScan:;BCL:0;PCL:0;RULEID:;SRVR:HKNPR04MB146;
X-Forefront-Antispam-Report: BMV:1; SFV:NSPM;
 SFS:(10019020)(377424004)(51704005)(450100001)(40100003)(62966003)(5001770100001)(55446002)(54356999)(63696999)(53806999)(15975445007)(2950100001)(77156002)(98316002)(84326002)(66066001)(59536001)(42186005)(16601075003)(107886001)(76176999)(86362001)(122856001)(61726006)(122386002)(46102003)(87976001)(43066003)(512874002)(19580405001)(50986999)(19617315012)(19580395003)(61266001)(42262002)(217873001);
 DIR:OUT; SFP:1102; SCL:1; SRVR:HKNPR04MB146; H:mail-ig0-f180.google.com; FPR:;
 SPF:None; MLV:sfv; LANG:en; 
X-Microsoft-Antispam-PRVS: <HKNPR04MB1463D615A01161B171EA9D6D8E00@HKNPR04MB146.apcprd04.prod.outlook.com>
X-Exchange-Antispam-Report-Test: UriScan:;
X-Exchange-Antispam-Report-CFA-Test: BCL:0; PCL:0;
 RULEID:(601004)(5002010)(5005006); SRVR:HKNPR04MB146; BCL:0; PCL:0; RULEID:;
 SRVR:HKNPR04MB146; 
X-Forefront-PRVS: 05529C6FDB
X-OriginatorOrg: sinogrid.com
X-MS-Exchange-CrossTenant-OriginalArrivalTime: 20 Apr 2015 10:01:07.1558 (UTC)
X-MS-Exchange-CrossTenant-FromEntityHeader: Hosted
X-MS-Exchange-Transport-CrossTenantHeadersStamped: HKNPR04MB146
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Content-Filtered-By: Mailman/MimeDel 2.1.15
Subject: Re: [dpdk-dev] [RFC PATCH 0/6] DPDK support to bifurcated driver
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
Reply-To: jiaxt@sinogrid.com
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Mon, 20 Apr 2015 10:01:13 -0000

Hi,
    I can receive packets once I mmap all of the PCI memory, not only the rx
and tx descriptors.

2015-04-09 11:43 GMT+08:00 贾学涛 <jiaxt@sinogrid.com>:

> Hi Cunming,
>      I applied the bifurcated driver patches and tested them following your
> example, but I can't receive packets with testpmd or l2fwd.
>     The kernel stack can receive packets from 10.0.0.2 before I run
> "ethtool -N XGE4.1 flow-type ip4 src-ip 10.0.0.2 action 12". After running
> that command, the kernel stack no longer receives packets from 10.0.0.2,
> but testpmd and l2fwd don't receive any packets either.
>    Queues 0-11 are used by the kernel and queue 12 is used by the bifurcated
> driver.
>    How can I make it work?
>
> 2014-11-25 22:11 GMT+08:00 Cunming Liang <cunming.liang@intel.com>:
>
>>
>> This is a RFC patch set to support "bifurcated driver" in DPDK.
>>
>>
>> What is "bifurcated driver"?
>> ===========================
>>
>> The "bifurcated driver" refers to a kernel NIC driver that supports:
>>
>> 1. on-demand split-off of rx/tx queue pairs and their assignment to user
>>    space
>>
>> 2. direct access to NIC resources (e.g. rx/tx queue registers) from user
>>    space
>>
>> 3. distribution of packets to kernel or user space rx queues by the NIC's
>>    flow director according to the filter rules
>>
>> Here's the kernel patch set that adds this support.
>> http://comments.gmane.org/gmane.linux.network/333615
>>
>>
>> Usage scenario
>> =================
>>
>> It's well accepted in the industry to use DPDK to process fast path packets
>> in user space in a high performance fashion. Meanwhile, slow path control
>> packets still need to be processed in kernel space, as they usually rely on
>> the in-kernel TCP/IP stack and/or the socket programming interface.
>>
>> The KNI (Kernel NIC Interface) mechanism in DPDK is designed to meet this
>> requirement, with the following limitations:
>>
>>   1) Software classifies packets and distributes them to the kernel via DPDK
>>      software rings, at the cost of significant CPU cycles and memory
>>      bandwidth.
>>
>>   2) Copying packets between the kernel's socket buffers and mbufs has a
>>      significant negative impact on KNI performance.
>>
>> The bifurcated driver provides an alternative approach that not only
>> offloads flow classification and distribution to the NIC but also supports
>> zero-copy packet handling.
>>
>> Users can use standard ethtool to add filter rules to the NIC in order to
>> distribute specific flows to the queues accessed only by the kernel driver
>> and stack, and add other rules to distribute packets to the queues assigned
>> to user space.
>>
>> For those rx/tx queue pairs that are accessed directly from user space,
>> DPDK takes over packet rx/tx as well as the corresponding DMA operations
>> for high performance packet I/O.
>>
>>
>> What's the impact and change to DPDK
>> ======================================
>>
>> DPDK usually binds PCIe NIC devices by leveraging the kernel's user space
>> driver mechanisms, UIO or VFIO, to map the NIC's entire PCIe I/O space to
>> user space. The bifurcated driver PMD talks to a NIC interface using raw
>> socket APIs and only mmap()s limited I/O space (e.g. certain 4K pages) to
>> access the rx/tx queue pairs involved. So the impact and changes mainly
>> come from the following:
>>
>> - netdev
>>     DPDK needs to create an af_packet socket and bind it to a bifurcated
>>     netdev. The socket fd is used to request 'queue pairs info',
>>     'split/return queue pairs', etc. The PCIe device ID, netdev MAC
>>     address and NUMA info also come from the netdev response.
>>
>> - PCIe device scan and driver probe
>>     The netdev provides the PCIe device ID information, and the correct
>>     driver is selected according to that device ID. For such a netdev
>>     device, the PCIe device is no longer created by a scan but by
>>     on-demand assignment.
>>
>> - PCIe BAR mapping
>>     The "bifurcated driver" maps several pages for the queue pairs.
>>     The rest of the BAR register space is mapped to a fake page. The BAR
>>     mapping goes through mmap on the sockfd, which is a little different
>>     from what UIO/VFIO does.
>>
>> - PMD
>>     The PMD no longer really initializes and configures the NIC.
>>     Instead, it only takes care of queue pair setup, rx_burst and
>>     tx_burst.
>>
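
To make the "certain 4K pages" point concrete, here is a rough sketch of
which pages of the register BAR hold the ring registers of a given queue.
The offsets are my reading of the ixgbe register layout (RDBAL(n) at
0x01000 + n * 0x40 for n < 64 and 0x0D000 + (n - 64) * 0x40 above that,
TDBAL(n) at 0x06000 + n * 0x40) and should be checked against the
datasheet. The helper names are made up for illustration and are not part
of the patch set.

```python
PAGE_SIZE = 4096  # the "4K pages" mentioned in the cover letter


def rx_ring_base(queue):
    """Offset of the first rx ring register of a queue (assumed ixgbe layout)."""
    if queue < 64:
        return 0x01000 + queue * 0x40
    return 0x0D000 + (queue - 64) * 0x40


def tx_ring_base(queue):
    """Offset of the first tx ring register of a queue (assumed ixgbe layout)."""
    return 0x06000 + queue * 0x40


def page_of(offset):
    """Round a register offset down to the start of its 4K page."""
    return offset & ~(PAGE_SIZE - 1)


def pages_for_queues(first_queue, count):
    """Sorted page offsets covering the rx/tx ring registers of the queues."""
    pages = set()
    for queue in range(first_queue, first_queue + count):
        pages.add(page_of(rx_ring_base(queue)))
        pages.add(page_of(tx_ring_base(queue)))
    return sorted(pages)


if __name__ == "__main__":
    # One queue pair starting at hardware queue 12, as in my test setup.
    for page in pages_for_queues(12, 1):
        print(hex(page))
```

Under these assumptions each queue's ring registers occupy only 0x40 bytes,
so a single queue pair touches two pages of the BAR.
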
>> The patch uses the eal '--vdev' parameter to assign the netdev iface name
>> and the number of queue pairs. Here's an example of how to configure the
>> bifurcated driver and run DPDK testpmd with the bifurcated PMD.
>>
>>   1. Set promisc mode
>>   > ifconfig eth0 promisc
>>
>>   2. Turn on fdir
>>   > ethtool -K eth0 ntuple on
>>
>>   3. Setup a flow director rule to distribute packets with source ip
>>      0.0.0.0 to rxq No.0
>>   > ethtool -N eth0  flow-type udp4 src-ip 0.0.0.0 action 0
>>
>>   4. Run testpmd on netdev 'eth0' with 1 queue pair.
>>   > ./x86_64-native-linuxapp-gcc/app/testpmd -c 0x3 -n 4 \
>>   >  --vdev=rte_bifurc,iface=eth0,qpairs=1 -- \
>>   >  -i --rxfreet=32 --txfreet=32 --txrst=32
>>   Note:
>>     The iface and qpairs arguments above specify the netdev interface name
>>     and the number of qpairs that user space requests from the "bifurcated
>>     driver", respectively.
>>
>>   5. Setup a flow director rule to distribute packets with source ip
>>      1.1.1.1 to rxq No.32. This needs to be done after testpmd starts.
>>   > ethtool -N eth0 flow-type udp4 src-ip 1.1.1.1 action 32
>>
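
The effect of the ethtool rules in steps 3 and 5 can be pictured with a toy
model like the one below. It is only a sketch: it matches on the exact
source IP alone, while the real flow director also handles masks and other
fields. `FlowRule` and `select_queue` are names made up here, not anything
from ethtool or the patches. A return value of None stands for "no rule
matched, so the NIC's normal queue selection applies".

```python
from dataclasses import dataclass
from ipaddress import IPv4Address
from typing import List, Optional


@dataclass(frozen=True)
class FlowRule:
    """Simplified filter rule: exact source IP -> rx queue index."""
    src_ip: IPv4Address
    queue: int


def select_queue(rules: List[FlowRule], src_ip: str) -> Optional[int]:
    """Return the queue of the first rule matching src_ip, or None."""
    addr = IPv4Address(src_ip)
    for rule in rules:
        if rule.src_ip == addr:
            return rule.queue
    return None


# The two rules from the example: queue 0 stays with the kernel,
# queue 32 is the one handed to user space.
RULES = [
    FlowRule(IPv4Address("0.0.0.0"), 0),
    FlowRule(IPv4Address("1.1.1.1"), 32),
]

if __name__ == "__main__":
    for ip in ("1.1.1.1", "0.0.0.0", "10.0.0.2"):
        print(ip, "->", select_queue(RULES, ip))
```
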
>> The detailed changes in this patch set are described below.
>>
>> eal
>> --------
>> The first two patches are all about the eal API declaration and its Linux
>> implementation, which support the af_packet socket and the verbs of the
>> bifurcated netdev. Those APIs include verbs like open, bind, (un)map,
>> split/return and map_umem, and other APIs like set_pci, get_ifinfo and
>> get/put_devargs, which help to generate a pci device from the bifurcated
>> netdev and get basic netdev info.
>>
>> The third patch allows a driver to be probed on the PCIe VDEV created from
>> a NIC interface driven by the "bifurcated driver". It defines a new flag,
>> 'RTE_PCI_DRV_BIFURC', used by the direct ring access PMD.
>>
>> librte_bifurc
>> ---------------
>> The library is used as a VDEV bus driver to scan '--vdev=rte_bifurc' VDEVs
>> from the eal command line. It generates the PCIe VDEV device, ready for
>> further driver probing. It maintains the bifurcated device information,
>> including sockfd, hwaddr, mtu, qpairs and iface_name, which other direct
>> ring access PMDs use to obtain bifurcated device info.
>>
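
For readers unfamiliar with the '--vdev' string format, the sketch below
shows roughly what the scan has to pull out of
'rte_bifurc,iface=eth0,qpairs=1'. It is not the library's parser. The
`BifurcArgs` type, the `parse_vdev` name and the default of one queue pair
when qpairs is omitted are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class BifurcArgs:
    """Values carried by a --vdev=rte_bifurc,... argument (illustrative)."""
    driver: str
    iface: str
    qpairs: int


def parse_vdev(arg: str) -> BifurcArgs:
    """Split 'driver,key=value,...' into its driver name and arguments."""
    driver, _, rest = arg.partition(",")
    pairs = dict(item.split("=", 1) for item in rest.split(",") if item)
    if "iface" not in pairs:
        raise ValueError("iface is required")
    # Assumption: fall back to a single queue pair if qpairs is not given.
    return BifurcArgs(driver=driver,
                      iface=pairs["iface"],
                      qpairs=int(pairs.get("qpairs", "1")))


if __name__ == "__main__":
    print(parse_vdev("rte_bifurc,iface=eth0,qpairs=1"))
```
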
>> direct ring access PMD
>> -------------------------
>> The patch provides a direct ring access PMD for ixgbe. Compared to the
>> normal ixgbe PMD, it uses the 'RTE_PCI_DRV_BIFURC' flag during self
>> registration. It mostly reuses the existing PMD ops to avoid
>> re-implementing everything from scratch. It also modifies
>> rx/tx_queue_setup to allow queue setup from any queue offset.
>>
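
As I understand "queue setup from any queue offset", the PMD's queue 0 no
longer has to be hardware queue 0. A minimal sketch of that index
translation follows, assuming the split-off queues are contiguous and
start at some base queue (32 in the testpmd example above). The tail
register offsets are my reading of the ixgbe layout (RDT(n) at
0x01018 + n * 0x40 for n < 64 and 0x0D018 + (n - 64) * 0x40 above that,
TDT(n) at 0x06018 + n * 0x40), and the function names are hypothetical.

```python
def hw_queue(base_queue, local_queue):
    """Hardware queue index behind the PMD's local queue index.

    Assumes the queues handed to user space are contiguous from base_queue.
    """
    return base_queue + local_queue


def rx_tail_reg(queue):
    """Offset of the rx tail register of a hardware queue (assumed layout)."""
    if queue < 64:
        return 0x01018 + queue * 0x40
    return 0x0D018 + (queue - 64) * 0x40


def tx_tail_reg(queue):
    """Offset of the tx tail register of a hardware queue (assumed layout)."""
    return 0x06018 + queue * 0x40


if __name__ == "__main__":
    queue = hw_queue(32, 0)  # local queue 0 of a PMD whose queues start at 32
    print(queue, hex(rx_tail_reg(queue)), hex(tx_tail_reg(queue)))
```
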
>> Supported NIC driver
>> ========================
>>
>> The "bifurcated driver" kernel patch only supports the "ixgbe" driver at
>> the moment, so this RFC patch also provides an "ixgbe" PMD via
>> direct-mapped rings as a sample.
>> Support for 40GE (i40e) will be added in the future.
>>
>> In addition, for other multi-queue NICs with flow director capability to
>> perform packet classification and distribution, there's no particular
>> technical gap preventing support for the bifurcated driver approach.
>>
>> Limitation
>> ============
>>
>> By using the "bifurcated driver", user space only takes over the DMA
>> operations. NIC configuration is out of the control of the user space PMD.
>> All NIC settings, including adding/deleting filter rules, need to be made
>> with standard Linux network tools (e.g. ethtool).
>> So feature support really depends on how much is supported by ethtool.
>>
>>
>> Any questions, comments and feedback are welcome.
>>
>>
>> -END-
>>
>> Signed-off-by: Cunming Liang <cunming.liang@intel.com>
>> Signed-off-by: Danny Zhou <danny.zhou@intel.com>
>>
>> *** BLURB HERE ***
>>
>> Cunming Liang (6):
>>   eal: common direct ring access API
>>   eal: direct ring access support by linux af_packet
>>   pci: allow VDEV as pci device during device driver probe
>>   bifurc: add driver to scan bifurcated netdev
>>   ixgbe: rx/tx queue stop bug fix
>>   ixgbe: PMD for bifurc ixgbe net device
>>
>>  config/common_linuxapp                         |   5 +
>>  lib/Makefile                                   |   1 +
>>  lib/librte_bifurc/Makefile                     |  58 +++++
>>  lib/librte_bifurc/rte_bifurc.c                 | 284 +++++++++++++++++++++
>>  lib/librte_bifurc/rte_bifurc.h                 |  90 +++++++
>>  lib/librte_eal/common/Makefile                 |   5 +
>>  lib/librte_eal/common/include/rte_pci.h        |   4 +
>>  lib/librte_eal/common/include/rte_pci_bifurc.h | 186 ++++++++++++++
>>  lib/librte_eal/linuxapp/eal/Makefile           |   1 +
>>  lib/librte_eal/linuxapp/eal/eal_pci.c          |  42 ++--
>>  lib/librte_eal/linuxapp/eal/eal_pci_bifurc.c   | 336 +++++++++++++++++++++++++
>>  lib/librte_ether/rte_ethdev.c                  |   3 +-
>>  lib/librte_pmd_ixgbe/Makefile                  |  13 +-
>>  lib/librte_pmd_ixgbe/ixgbe_bifurcate.c         | 303 ++++++++++++++++++++++
>>  lib/librte_pmd_ixgbe/ixgbe_bifurcate.h         |  57 +++++
>>  lib/librte_pmd_ixgbe/ixgbe_rxtx.c              |  44 +++-
>>  lib/librte_pmd_ixgbe/ixgbe_rxtx.h              |  10 +
>>  mk/rte.app.mk                                  |   6 +
>>  18 files changed, 1421 insertions(+), 27 deletions(-)
>>  create mode 100644 lib/librte_bifurc/Makefile
>>  create mode 100644 lib/librte_bifurc/rte_bifurc.c
>>  create mode 100644 lib/librte_bifurc/rte_bifurc.h
>>  create mode 100644 lib/librte_eal/common/include/rte_pci_bifurc.h
>>  create mode 100644 lib/librte_eal/linuxapp/eal/eal_pci_bifurc.c
>>  create mode 100644 lib/librte_pmd_ixgbe/ixgbe_bifurcate.c
>>  create mode 100644 lib/librte_pmd_ixgbe/ixgbe_bifurcate.h
>>
>> --
>> 1.8.1.4
>>
>>
>