From: Jasvinder Singh <jasvinder.singh@intel.com>
To: dev@dpdk.org
Cc: cristian.dumitrescu@intel.com, ferruh.yigit@intel.com, thomas@monjalon.net
Date: Mon, 18 Sep 2017 10:10:11 +0100
Message-Id: <20170918091015.82824-1-jasvinder.singh@intel.com>
In-Reply-To: <20170811124929.118564-2-jasvinder.singh@intel.com>
References: <20170811124929.118564-2-jasvinder.singh@intel.com>
Subject: [dpdk-dev] [PATCH v4 0/4] net/softnic: sw fall-back pmd for traffic mgmt and others

The SoftNIC PMD is intended to provide a SW fall-back option for specific
ethdev APIs, in a generic way, to the NICs not supporting those features.
Currently, the only implemented ethdev API is Traffic Management (TM),
but other ethdev APIs such as rte_flow, traffic metering & policing, etc.
can easily be added.

Overview:
* Generic: The SoftNIC PMD works with any "hard" PMD that implements the
  ethdev API. It does not change the "hard" PMD in any way.
* Creation: For any given "hard" ethdev port, the user can decide to
  create an associated "soft" ethdev port to drive the "hard" port. The
  "soft" port is a virtual device that can be created at app start-up
  through an EAL vdev arg or later through the virtual device API.
* Configuration: The app explicitly decides which features are to be
  enabled on the "soft" port and which features are still to be used from
  the "hard" port. The app continues to explicitly configure both the
  "hard" and the "soft" ports after the creation of the "soft" port.
* RX/TX: The app reads packets from/writes packets to the "soft" port
  instead of the "hard" port. The RX and TX queues of the "soft" port are
  thread-safe, as for any ethdev.
* Execution: The "soft" port is a feature-rich NIC implemented by the
  CPU, so the run function of the "soft" port has to be executed by the
  CPU in order to get packets moving between the "hard" port and the app.
* Meets the NFV vision: The app should be (almost) agnostic about the NIC
  implementation (different vendors/models, HW-SW mix); the app should
  not require changes to use different NICs; the app should use the same
  API for all NICs. If a NIC does not implement a specific feature, the
  HW should be augmented with SW to provide the functionality while still
  preserving the same API.

Traffic Management SW fall-back overview:
* Implements the ethdev traffic management API (rte_tm.h).
* Based on the existing librte_sched DPDK library.

Example: Create a "soft" port for "hard" port "0000:04:00.1" and enable
the TM feature with default settings:
  --vdev 'net_softnic0,hard_name=0000:04:00.1,soft_tm=on'
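For illustration only (not part of this patch set), a minimal sketch of
the run-time creation path and the polling loop described above.
rte_vdev_init(), rte_eth_dev_get_port_by_name(), rte_eth_rx_burst() and
rte_pktmbuf_free() are existing DPDK calls; rte_pmd_softnic_run() is the
run function introduced by this series. Port id width and the vdev
header name vary across DPDK versions; queue/mempool setup, the
force_quit flag and error handling are assumed to be provided by the
app.

#include <stdint.h>
#include <rte_ethdev.h>
#include <rte_vdev.h>       /* rte_bus_vdev.h on later DPDK versions */
#include <rte_mbuf.h>
#include "rte_eth_softnic.h"

#define BURST_SIZE 32

static volatile int force_quit; /* set by the app's signal handler (assumed) */

static int
soft_port_poll_loop(void)
{
	uint16_t soft_port_id; /* uint8_t on DPDK releases before 17.11 */
	struct rte_mbuf *pkts[BURST_SIZE];
	uint16_t nb_rx, i;

	/* Same parameters as the command-line example above */
	if (rte_vdev_init("net_softnic0",
			"hard_name=0000:04:00.1,soft_tm=on") != 0)
		return -1;
	if (rte_eth_dev_get_port_by_name("net_softnic0", &soft_port_id) != 0)
		return -1;

	/* rte_eth_dev_configure()/queue setup/start on soft_port_id here,
	 * exactly as for any other ethdev port (omitted). */

	while (!force_quit) {
		/* The app does RX/TX on the "soft" port only ... */
		nb_rx = rte_eth_rx_burst(soft_port_id, 0, pkts, BURST_SIZE);
		for (i = 0; i < nb_rx; i++)
			rte_pktmbuf_free(pkts[i]); /* app processing here */

		/* ... and periodically invokes the "soft" port run
		 * function, which moves packets to/from the "hard" port
		 * and executes the enabled SW features (e.g. TM). */
		rte_pmd_softnic_run(soft_port_id);
	}
	return 0;
}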
Q1: Why a generic name, if only TM is supported (for now)?
A1: The intention is to have the SoftNIC PMD implement many other (all?)
ethdev APIs under a single "ideal" ethdev, hence the generic name. The
initial motivation is the TM API, but the mechanism is generic and can be
used for many other ethdev APIs. Somebody looking to provide a SW
fall-back for another ethdev API is likely to end up inventing the same
mechanism, hence it would be good to consolidate all of them under a
single PMD and have the user explicitly enable/disable the features
needed for each "soft" device.

Q2: Are there any performance requirements for SoftNIC?
A2: Yes, performance should be great/decent for every feature, otherwise
the SW fall-back is unusable, thus useless.

Q3: Why not change the "hard" device (and keep a single device) instead
of creating a new "soft" device (and thus having two devices)?
A3: This is not possible with the current librte_ether ethdev
implementation. The ethdev->dev_ops are defined as a constant structure,
so they cannot be changed per device (nor per PMD). The new ops also need
memory space to store their context data structures, which requires
updating the ethdev->data->dev_private of the existing device; at best,
maybe a resize of ethdev->data->dev_private could be done, assuming that
librte_ether will introduce a way to find out its size, but this cannot
be done while the device is running. Other side effects might exist, as
the changes are very intrusive, plus it likely needs more changes in
librte_ether.

Q4: Why not call the SW fall-back dev_ops directly in librte_ether for
devices which do not support the specific feature? If the device supports
the capability, call its dev_ops, otherwise call the SW fall-back
dev_ops.
A4: First, for reasons similar to Q&A3. This fixes the need to change the
ethdev->dev_ops of the device, but it does not do anything to fix the
other significant issue of where to store the context data structures
needed by the SW fall-back functions (which, in this approach, are called
implicitly by librte_ether).
Second, the SW fall-back options should not be restricted arbitrarily by
the librte_ether library; the decision should belong to the app. For
example, the TM SW fall-back should not be limited to librte_sched only,
which (like any SW fall-back) is limited to a specific hierarchy and
feature set and cannot implement every possible hierarchy. If
alternatives exist, the one to use should be picked by the app, not by
the ethdev layer.

Q5: Why is the app required to continue to configure both the "hard" and
the "soft" devices even after the "soft" device has been created? Why not
hide the "hard" device under the "soft" device and have the "soft" device
configure the "hard" device under the hood?
A5: This was the approach tried in V2 of this patch set (an overlay
"soft" device taking over the configuration of the underlay "hard"
device) and was eventually dropped due to the increased complexity of
having to keep the configuration of two distinct devices in sync, with a
librte_ether implementation that is not friendly towards such an
approach. Basically, each ethdev API call for the overlay device needs to
configure the overlay device, invoke the same configuration with possibly
modified parameters for the underlay device, then resume the
configuration of the overlay device, turning this into a device emulation
project.
V2 minuses: increased complexity (dealing with two devices at the same
time); need to implement every ethdev API, even those not needed for the
scope of the SW fall-back; intrusive; sometimes decisions that should be
left to the app have to be taken silently.
V3 pluses: lower complexity (only one device); only need to implement
those APIs that are in scope of the SW fall-back; non-intrusive (deals
with the "hard" device through the ethdev API); app decisions are taken
by the app in an explicit way.

Q6: Why expose the SW fall-back in a PMD and not in a SW library?
A6: The SW fall-back for an ethdev API has to implement that specific
ethdev API (hence expose an ethdev object through a PMD), as opposed to
providing a different API. This approach allows the app to use the same
API (NFV vision). For example, we already have a library for the TM SW
fall-back (librte_sched) that can be called directly by apps that need it
outside of the ethdev context (such use-cases exist), but an app that
works with TM-aware NICs through the ethdev TM API would have to be
changed significantly in order to work with TM-agnostic NICs through the
librte_sched API.
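To illustrate the "same API" point of A6, a sketch of TM configuration
through the generic rte_tm.h API. The same code runs unchanged whether
port_id refers to a TM-capable "hard" port or to a SoftNIC "soft" port.
Node ids and parameters are illustrative only; a real hierarchy (and in
particular the librte_sched-based fall-back) needs all levels populated
down to the leaf nodes before the commit can succeed.

#include <stdint.h>
#include <string.h>
#include <rte_tm.h>

static int
tm_setup_example(uint16_t port_id)
{
	struct rte_tm_capabilities cap;
	struct rte_tm_node_params np;
	struct rte_tm_error err;

	/* Discover what the port (HW or SW fall-back) can do */
	memset(&cap, 0, sizeof(cap));
	if (rte_tm_capabilities_get(port_id, &cap, &err) != 0)
		return -1;

	/* Root node of the scheduling hierarchy (node id 0, no parent);
	 * the lower levels of the hierarchy are omitted here. */
	memset(&np, 0, sizeof(np));
	np.shaper_profile_id = RTE_TM_SHAPER_PROFILE_ID_NONE;
	np.nonleaf.n_sp_priorities = 1;
	if (rte_tm_node_add(port_id, 0, RTE_TM_NODE_ID_NULL, 0, 1,
			RTE_TM_NODE_LEVEL_ID_ANY, &np, &err) != 0)
		return -1;

	/* Freeze the hierarchy; clear it if the port cannot accept it */
	return rte_tm_hierarchy_commit(port_id, 1, &err);
}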
Q7: Why have all the SW fall-backs in a single PMD? Why not develop the
SW fall-back for each different ethdev API in a separate PMD, then create
a chain of "soft" devices for each "hard" device? Potentially, this
results in smaller PMDs that are easier to maintain.
A7: Arguments for a single ethdev/PMD and against a chain of
ethdevs/PMDs:
1. All the existing PMDs for HW NICs implement a lot of features under
   the same PMD, so there is no reason for the single PMD approach to
   break code modularity. See the V3 code: a lot of care has been taken
   for code modularity.
2. We should avoid the proliferation of SW PMDs.
3. A single device should be handled by a single PMD.
4. People are used to feature-rich PMDs, not single-feature PMDs, so why
   force a change of mindset?
5. [Configuration nightmare] A chain of "soft" devices attached to a
   single "hard" device requires the app to be aware that the N "soft"
   devices in the chain plus the "hard" device refer to the same HW
   device, and to know which device should be invoked to configure which
   feature. Also, the length of the chain and the functionality of each
   link differ for each HW device. This breaks the requirement of
   preserving the same API while working with different NICs (NFV). This
   most likely results in a configuration nightmare; nobody is going to
   seriously use this.
6. [Feature inter-dependency] Sometimes different features need to be
   configured and executed together (e.g. they share the same set of
   resources, are inter-dependent, etc.), so it is better and more
   performant to do them in the same ethdev/PMD.
7. [Code duplication] There is a lot of duplication in the configuration
   code for the chain of ethdevs approach. The ethdev dev_configure,
   rx_queue_setup, tx_queue_setup API functions have to be implemented
   per device, and they become meaningless/inconsistent with the chain
   approach.
8. [Data structure duplication] The per device data structures have to
   be duplicated and read repeatedly for each "soft" ethdev. The ethdev
   device, dev_private, data and per RX/TX queue data structures have to
   be replicated per "soft" device. They have to be re-read for each
   stage, so the same cache misses are now multiplied by the number of
   stages in the chain.
9. [rte_ring proliferation] The thread safety requirements for ethdev
   RX/TX queues require an rte_ring to be used for every RX/TX queue of
   each "soft" ethdev (see the sketch after this list). This rte_ring
   proliferation unnecessarily increases the memory footprint and lowers
   performance, especially when each "soft" ethdev ends up on a different
   CPU core (ping-pong of cache lines).
10. [Meta-data proliferation] A chain of ethdevs is likely to result in
    a proliferation of the meta-data that has to be passed between the
    ethdevs (e.g. policing needs the output of flow classification),
    which results in more cache line ping-pong between cores, hence
    performance drops.
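A sketch of the per-queue pattern behind point 9, for illustration only
(names are hypothetical, not from this patch set): a thread-safe "soft"
RX/TX queue is backed by a multi-producer/multi-consumer rte_ring. With
a chain of N "soft" ethdevs, one such ring (plus its cache footprint) is
needed for every RX/TX queue of every stage; with a single SoftNIC
device, only once per queue.

#include <stdint.h>
#include <rte_ring.h>
#include <rte_mbuf.h>

#define QUEUE_SIZE 1024 /* must be a power of two */
#define BURST_SIZE 32

static struct rte_ring *
soft_queue_create(const char *name, int socket_id)
{
	/* Default flags = multi-producer/multi-consumer (thread-safe) */
	return rte_ring_create(name, QUEUE_SIZE, socket_id, 0);
}

static uint16_t
soft_queue_tx(struct rte_ring *q, struct rte_mbuf **pkts, uint16_t n)
{
	/* Producer side: may be called from any thread */
	return rte_ring_enqueue_burst(q, (void **)pkts, n, NULL);
}

static uint16_t
soft_queue_rx(struct rte_ring *q, struct rte_mbuf **pkts, uint16_t n)
{
	/* Consumer side: drained by the "soft" device run function */
	return rte_ring_dequeue_burst(q, (void **)pkts, n, NULL);
}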
Jasvinder Singh (4):
  net/softnic: add softnic PMD
  net/softnic: add traffic management support
  net/softnic: add TM capabilities ops
  net/softnic: add TM hierarchy related ops

 MAINTAINERS                                        |    5 +
 config/common_base                                 |    5 +
 doc/api/doxy-api-index.md                          |    3 +-
 doc/api/doxy-api.conf                              |    1 +
 doc/guides/rel_notes/release_17_11.rst             |    6 +
 drivers/net/Makefile                               |    5 +
 drivers/net/softnic/Makefile                       |   57 +
 drivers/net/softnic/rte_eth_softnic.c              |  853 +++++
 drivers/net/softnic/rte_eth_softnic.h              |   83 +
 drivers/net/softnic/rte_eth_softnic_internals.h    |  291 ++
 drivers/net/softnic/rte_eth_softnic_tm.c           | 3449 ++++++++++++++++++++
 .../net/softnic/rte_pmd_eth_softnic_version.map    |    7 +
 mk/rte.app.mk                                      |    5 +-
 13 files changed, 4768 insertions(+), 2 deletions(-)
 create mode 100644 drivers/net/softnic/Makefile
 create mode 100644 drivers/net/softnic/rte_eth_softnic.c
 create mode 100644 drivers/net/softnic/rte_eth_softnic.h
 create mode 100644 drivers/net/softnic/rte_eth_softnic_internals.h
 create mode 100644 drivers/net/softnic/rte_eth_softnic_tm.c
 create mode 100644 drivers/net/softnic/rte_pmd_eth_softnic_version.map

--
2.9.3