From: "Butler, Siobhan A"
To: Adrien Mazarguil, "dev@dpdk.org"
Date: Mon, 2 Mar 2015 17:45:45 +0000
Message-ID: <0C5AFCA4B3408848ADF2A3073F7D8CC86D51A972@IRSMSX109.ger.corp.intel.com>
In-Reply-To: <1424872326-17930-4-git-send-email-adrien.mazarguil@6wind.com>
Subject: Re: [dpdk-dev] [PATCH v3 3/3] doc: add librte_pmd_mlx4 documentation

Thank you Adrien this is great.
Siobhan

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Adrien Mazarguil
> Sent: Wednesday, February 25, 2015 1:52 PM
> To: dev@dpdk.org
> Subject: [dpdk-dev] [PATCH v3 3/3] doc: add librte_pmd_mlx4 documentation
>
> This documentation covers implementation details, features and limitations,
> configuration, prerequisites and provides a usage example.
>
> Signed-off-by: Adrien Mazarguil
> ---
>  MAINTAINERS                                  |   1 +
>  doc/guides/prog_guide/index.rst              |   1 +
>  doc/guides/prog_guide/mlx4_poll_mode_drv.rst | 326 +++++++++++++++++++++++++++
>  doc/guides/prog_guide/source_org.rst         |   1 +
>  4 files changed, 329 insertions(+)
>  create mode 100644 doc/guides/prog_guide/mlx4_poll_mode_drv.rst
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index d8b0fbc..ac61825 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -223,6 +223,7 @@ F: lib/librte_pmd_fm10k/
>  Mellanox mlx4
>  M: Adrien Mazarguil
>  F: lib/librte_pmd_mlx4/
> +F: doc/guides/prog_guide/mlx4_poll_mode_drv.rst
>
>  RedHat virtio
>  M: Changchun Ouyang
> diff --git a/doc/guides/prog_guide/index.rst b/doc/guides/prog_guide/index.rst
> index de69682..87f6b35 100644
> --- a/doc/guides/prog_guide/index.rst
> +++ b/doc/guides/prog_guide/index.rst
> @@ -56,6 +56,7 @@ Programmer's Guide
>      intel_dpdk_xen_based_packet_switch_sol
>      libpcap_ring_based_poll_mode_drv
>      link_bonding_poll_mode_drv_lib
> +    mlx4_poll_mode_drv
>      timer_lib
>      hash_lib
>      lpm_lib
> diff --git a/doc/guides/prog_guide/mlx4_poll_mode_drv.rst b/doc/guides/prog_guide/mlx4_poll_mode_drv.rst
> new file mode 100644
> index 0000000..35570c3
> --- /dev/null
> +++ b/doc/guides/prog_guide/mlx4_poll_mode_drv.rst
> @@ -0,0 +1,326 @@
> +..  BSD LICENSE
> +    Copyright 2012-2015 6WIND S.A.
> +
> +    Redistribution and use in source and binary forms, with or without
> +    modification, are permitted provided that the following conditions
> +    are met:
> +
> +      * Redistributions of source code must retain the above copyright
> +        notice, this list of conditions and the following disclaimer.
> +      * Redistributions in binary form must reproduce the above copyright
> +        notice, this list of conditions and the following disclaimer in
> +        the documentation and/or other materials provided with the
> +        distribution.
> +      * Neither the name of 6WIND S.A. nor the names of its
> +        contributors may be used to endorse or promote products derived
> +        from this software without specific prior written permission.
> +
> +    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
> +    "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
> +    LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
> +    A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
> +    OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
> +    SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
> +    LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
> +    DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
> +    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
> +    (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
> +    OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
> +
> +MLX4 poll mode driver library
> +=============================
> +
> +The MLX4 poll mode driver library (**librte_pmd_mlx4**) implements
> +support for **Mellanox ConnectX-3** 10/40 Gbps adapters (EN 40, EN 10,
> +Pro EN 40) as well as their virtual functions (VF) in SR-IOV context.
> +
> +.. note::
> +
> +   Due to external dependencies, this driver is disabled by default. It must
> +   be enabled manually by setting ``CONFIG_RTE_LIBRTE_MLX4_PMD=y`` and
> +   recompiling DPDK.
> +
> +Implementation details
> +----------------------
> +
> +Most Mellanox ConnectX-3 devices provide two ports but expose a single
> +PCI bus address, thus unlike most drivers, librte_pmd_mlx4 registers
> +itself as a PCI driver that allocates one Ethernet device per detected port.
> +
> +For this reason, one cannot white/blacklist a single port without also
> +white/blacklisting the others on the same device.
> +
> +Besides its dependency on libibverbs (that implies libmlx4 and
> +associated kernel support), librte_pmd_mlx4 relies heavily on system
> +calls for control operations such as querying/updating the MTU and flow control parameters.
> +
> +For security reasons and robustness, this driver only deals with
> +virtual memory addresses. The way resources allocations are handled by
> +the kernel combined with hardware specifications that allow it to
> +handle virtual memory addresses directly ensure that DPDK applications
> +cannot access random physical memory (or memory that does not belong to the current process).
> +
> +This capability allows the PMD to coexist with kernel network
> +interfaces which remain functional, although they stop receiving
> +unicast packets as long as they share the same MAC address.
> +
> +Compiling librte_pmd_mlx4 causes DPDK to be linked against libibverbs.
> +
> +Features and limitations
> +------------------------
> +
> +- RSS, also known as RCA, is supported. In this mode the number of
> +  configured RX queues must be a power of two.
> +- VLAN filtering is supported.
> +- Link state information is provided.
> +- Promiscuous mode is supported.
> +- All multicast mode is supported.
> +- Multiple MAC addresses (unicast, multicast) can be configured.
> +- Scattered packets are supported for TX and RX.
> +
> +..
> +
> +- RSS hash key cannot be modified.
> +- Hardware counters are not implemented (they are software counters).
> +- Checksum offloads are not supported yet.
> +
> +Configuration
> +-------------
> +
> +Compilation options
> +~~~~~~~~~~~~~~~~~~~
> +
> +- ``CONFIG_RTE_LIBRTE_MLX4_PMD`` (default **n**)
> +
> +  Toggle compilation of librte_pmd_mlx4 itself.
> +
> +- ``CONFIG_RTE_LIBRTE_MLX4_DEBUG`` (default **n**)
> +
> +  Toggle debugging code and stricter compilation flags. Enabling this
> +  option adds additional run-time checks and debugging messages at the
> +  cost of lower performance.
> +
> +- ``CONFIG_RTE_LIBRTE_MLX4_SGE_WR_N`` (default **4**)
> +
> +  Number of scatter/gather elements (SGEs) per work request (WR).
> +  Lowering this number improves performance but also limits the ability
> +  to receive scattered packets (packets that do not fit a single mbuf).
> +  The default value is a safe tradeoff.
> +
> +- ``CONFIG_RTE_LIBRTE_MLX4_MAX_INLINE`` (default **0**)
> +
> +  Amount of data to be inlined during TX operations. Improves latency
> +  but lowers throughput.
> +
> +- ``CONFIG_RTE_LIBRTE_MLX4_TX_MP_CACHE`` (default **8**)
> +
> +  Maximum number of cached memory pools (MPs) per TX queue. Each MP
> +  from which buffers are to be transmitted must be associated to memory
> +  regions (MRs). This is a slow operation that must be cached.
> +
> +  This value is always 1 for RX queues since they use a single MP.
> +
> +- ``CONFIG_RTE_LIBRTE_MLX4_SOFT_COUNTERS`` (default **1**)
> +
> +  Toggle software counters. No counters are available if this option is
> +  disabled since hardware counters are not supported.
> +
> +- ``CONFIG_RTE_LIBRTE_MLX4_COMPAT_VMWARE`` (default **1**)
> +
> +  Toggle VMware compatibility code. It also requires the environment
> +  variable ``MLX4_COMPAT_VMWARE`` set to a nonzero value at runtime.
> +
> +Environment variables
> +~~~~~~~~~~~~~~~~~~~~~
> +
> +- ``MLX4_INLINE_RECV_SIZE``
> +
> +  A nonzero value enables inline receive for packets up to that size.
> +  May significantly improve performance in some cases but lower it in
> +  others. Requires careful testing.
> +
> +- ``MLX4_COMPAT_VMWARE``
> +
> +  Only supported when compiled with
> +  ``CONFIG_RTE_LIBRTE_MLX4_COMPAT_VMWARE=1``. Adds workarounds to run
> +  in VMware systems that do not support the flows API properly.
> +
> +Run-time configuration
> +~~~~~~~~~~~~~~~~~~~~~~
> +
> +- The only constraint when RSS mode is requested is to make sure the number
> +  of RX queues is a power of two. This is a hardware requirement.
> +
> +- librte_pmd_mlx4 brings kernel network interfaces up during initialization
> +  because it is affected by their state. Forcing them down prevents packets
> +  reception.
> +
> +- **ethtool** operations on related kernel interfaces also affect the PMD.
> +
> +Prerequisites
> +-------------
> +
> +This driver relies on external libraries and kernel drivers for
> +resources allocations and initialization. The following dependencies
> +are not part of DPDK and must be installed separately:
> +
> +- **libibverbs**
> +
> +  User space verbs framework used by librte_pmd_mlx4. This library
> +  provides a generic interface between the kernel and low-level user
> +  space drivers such as libmlx4.
> +
> +  It allows slow and privileged operations (context initialization,
> +  hardware resources allocations) to be managed by the kernel and fast
> +  operations to never leave user space.
> +
> +- **libmlx4**
> +
> +  Low-level user space driver library for Mellanox ConnectX-3 devices,
> +  it is automatically loaded by libibverbs.
> +
> +  This library basically implements send/receive calls to the hardware
> +  queues.
> +
> +- **Kernel modules** (mlnx-ofed-kernel)
> +
> +  They provide the kernel-side verbs API and low level device drivers
> +  that manage actual hardware initialization and resources sharing with
> +  user space processes.
> +
> +  Unlike most other PMDs, these modules must remain loaded and bound to
> +  their devices:
> +
> +  - mlx4_core: hardware driver managing Mellanox ConnectX-3 devices.
> +  - mlx4_en: Ethernet device driver that provides kernel network interfaces.
> +  - mlx4_ib: InfiniBand device driver.
> +  - ib_uverbs: user space driver for verbs (entry point for libibverbs).
> +
> +While these libraries and kernel modules are available on OpenFabrics
> +Alliance's `website `_ and provided by
> +package managers on most distributions, this PMD requires Ethernet
> +extensions that may not be supported at the moment (this is a work in progress).
> +
> +`Mellanox OFED g=linux_sw_drivers>`_ includes the necessary support and should be used in the
> +meantime. For DPDK, only libibverbs, libmlx4 and mlnx-ofed-kernel
> +packages are required from that distribution.
> +
> +.. note::
> +
> +   Both libraries are BSD and GPL licensed. Linux kernel modules are GPL
> +   licensed.
> +
> +Usage example
> +-------------
> +
> +This section demonstrates how to launch **testpmd** with Mellanox
> +ConnectX-3 devices managed by librte_pmd_mlx4.
> +
> +#. Load the kernel modules:
> +
> +   .. code-block:: console
> +
> +      modprobe -a ib_uverbs mlx4_en mlx4_core mlx4_ib
> +
> +   .. note::
> +
> +      User space I/O kernel modules (uio and igb_uio) are not used and do
> +      not have to be loaded.
> +
> +#. Make sure Ethernet interfaces are in working order and linked to kernel
> +   verbs. Related sysfs entries should be present:
> +
> +   .. code-block:: console
> +
> +      ls -d /sys/class/net/*/device/infiniband_verbs/uverbs* | cut -d / -f 5
> +
> +   Example output:
> +
> +   .. code-block:: console
> +
> +      eth2
> +      eth3
> +      eth4
> +      eth5
> +
> +#. Optionally, retrieve their PCI bus addresses for whitelisting:
> +
> +   .. code-block:: console
> +
> +      {
> +          for intf in eth2 eth3 eth4 eth5;
> +          do
> +              (cd "/sys/class/net/${intf}/device/" && pwd -P);
> +          done;
> +      } |
> +      sed -n 's,.*/\(.*\),-w \1,p'
> +
> +   Example output:
> +
> +   .. code-block:: console
> +
> +      -w 0000:83:00.0
> +      -w 0000:83:00.0
> +      -w 0000:84:00.0
> +      -w 0000:84:00.0
> +
> +   .. note::
> +
> +      There are only two distinct PCI bus addresses because the Mellanox
> +      ConnectX-3 adapters installed on this system are dual port.
> +
> +#. Request huge pages:
> +
> +   .. code-block:: console
> +
> +      echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +#. Start testpmd with basic parameters:
> +
> +   .. code-block:: console
> +
> +      testpmd -c 0xff00 -n 4 -w 0000:83:00.0 -w 0000:84:00.0 -- --rxq=2 --txq=2 -i
> +
> +   Example output:
> +
> +   .. code-block:: console
> +
> +      [...]
> +      EAL: PCI device 0000:83:00.0 on NUMA socket 1
> +      EAL:   probe driver: 15b3:1007 librte_pmd_mlx4
> +      PMD: librte_pmd_mlx4: PCI information matches, using device "mlx4_0" (VF: false)
> +      PMD: librte_pmd_mlx4: 2 port(s) detected
> +      PMD: librte_pmd_mlx4: port 1 MAC address is 00:02:c9:b5:b7:50
> +      PMD: librte_pmd_mlx4: port 2 MAC address is 00:02:c9:b5:b7:51
> +      EAL: PCI device 0000:84:00.0 on NUMA socket 1
> +      EAL:   probe driver: 15b3:1007 librte_pmd_mlx4
> +      PMD: librte_pmd_mlx4: PCI information matches, using device "mlx4_1" (VF: false)
> +      PMD: librte_pmd_mlx4: 2 port(s) detected
> +      PMD: librte_pmd_mlx4: port 1 MAC address is 00:02:c9:b5:ba:b0
> +      PMD: librte_pmd_mlx4: port 2 MAC address is 00:02:c9:b5:ba:b1
> +      Interactive-mode selected
> +      Configuring Port 0 (socket 0)
> +      PMD: librte_pmd_mlx4: 0x867d60: TX queues number update: 0 -> 2
> +      PMD: librte_pmd_mlx4: 0x867d60: RX queues number update: 0 -> 2
> +      Port 0: 00:02:C9:B5:B7:50
> +      Configuring Port 1 (socket 0)
> +      PMD: librte_pmd_mlx4: 0x867da0: TX queues number update: 0 -> 2
> +      PMD: librte_pmd_mlx4: 0x867da0: RX queues number update: 0 -> 2
> +      Port 1: 00:02:C9:B5:B7:51
> +      Configuring Port 2 (socket 0)
> +      PMD: librte_pmd_mlx4: 0x867de0: TX queues number update: 0 -> 2
> +      PMD: librte_pmd_mlx4: 0x867de0: RX queues number update: 0 -> 2
> +      Port 2: 00:02:C9:B5:BA:B0
> +      Configuring Port 3 (socket 0)
> +      PMD: librte_pmd_mlx4: 0x867e20: TX queues number update: 0 -> 2
> +      PMD: librte_pmd_mlx4: 0x867e20: RX queues number update: 0 -> 2
> +      Port 3: 00:02:C9:B5:BA:B1
> +      Checking link statuses...
> +      Port 0 Link Up - speed 10000 Mbps - full-duplex
> +      Port 1 Link Up - speed 40000 Mbps - full-duplex
> +      Port 2 Link Up - speed 10000 Mbps - full-duplex
> +      Port 3 Link Up - speed 40000 Mbps - full-duplex
> +      Done
> +      testpmd>
> diff --git a/doc/guides/prog_guide/source_org.rst b/doc/guides/prog_guide/source_org.rst
> index c8ca54f..c66ad16 100644
> --- a/doc/guides/prog_guide/source_org.rst
> +++ b/doc/guides/prog_guide/source_org.rst
> @@ -83,6 +83,7 @@ The lib directory contains::
>      +-- librte_pmd_e1000    # 1GbE poll mode drivers (igb and em)
>      +-- librte_pmd_ixgbe    # 10GbE poll mode driver
>      +-- librte_pmd_i40e     # 40GbE poll mode driver
> +    +-- librte_pmd_mlx4     # Mellanox ConnectX-3 poll mode driver
>      +-- librte_pmd_pcap     # PCAP poll mode driver
>      +-- librte_pmd_ring     # ring poll mode driver
>      +-- librte_pmd_virtio   # virtio poll mode driver
> --
> 2.1.0
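
Two constraints in the quoted guide lend themselves to a quick pre-flight check before launching testpmd: the number of RX queues must be a power of two when RSS is requested, and dual-port adapters report the same PCI bus address once per port. The following is only a minimal sketch under stated assumptions: is_power_of_two and whitelist_args are hypothetical helper names, not part of DPDK or of the patch, and whitelist_args assumes PCI bus addresses arrive on stdin, one per line.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of DPDK or librte_pmd_mlx4.

# Succeeds (exit status 0) when $1 is a power of two, which the guide
# states is required for the number of RX queues in RSS mode.
is_power_of_two() {
    n=$1
    [ "$n" -gt 0 ] && [ $(( n & (n - 1) )) -eq 0 ]
}

# Reads PCI bus addresses on stdin (one per line, assumed format) and
# prints one "-w <address>" argument per distinct address, so a
# dual-port adapter is whitelisted only once.
whitelist_args() {
    sort -u | sed 's/^/-w /'
}
```

Feeding the four addresses from the example output (0000:83:00.0 twice, 0000:84:00.0 twice) into whitelist_args prints one -w argument per adapter, which matches the two -w options passed to testpmd in the usage example.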