From mboxrd@z Thu Jan 1 00:00:00 1970
To: Jerin Jacob , "Coyle, David"
Cc: dpdk-dev , "Trahe, Fiona"
References: <1580827512-178449-1-git-send-email-david.coyle@intel.com>
From: "Doherty, Declan"
Date: Thu, 13 Feb 2020 11:50:37 +0000
Subject: Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions

On 07/02/2020 2:18 PM, Jerin Jacob wrote:
> On Fri, Feb 7, 2020 at 6:08 PM Coyle, David wrote:
>>
>> Hi Jerin, see below
>
> Hi David,
>
>>>
>>> On Thu, Feb 6, 2020 at 10:01 PM Coyle, David wrote:
>>> >
>>>
>>> There is a risk in drafting API that meant for HW without any HW
>>> exists.
>>> Because there could be inefficiency on the metadata and fast path API for
>>> both models.
>>> For example, in the case of a CPU-based scheme, it will be pure overhead
>>> to emulate the "queue" (the enqueue and dequeue) for the sake of abstraction
>>> where CPU works better in the synchronous model and I have doubt that the
>>> session-based scheme will work for HW or not as both difference HW needs
>>> to work hand in hand(IOMMU aspects for two PCI device)
>>
>> [DC] I understand what you are saying about the overhead of emulating the "sw queue" but this same model is already used in many of the existing device PMDs.
>> In the case of SW devices, such as AESNI-MB or NULL for crypto or zlib for compression, the enqueue/dequeue in the PMD is emulated through an rte_ring which is very efficient.
>> The accelerator API will use the existing device PMDs so keeping the same model seems like a sensible approach.
>
> In this release, we added CPU crypto support in cryptodev to support
> the synchronous model to fix the overhead.
>
>>
>> From an application's point of view, this abstraction of the underlying device type is important for usability and maintainability - the application doesn't need to know
>> the device type as such and therefore doesn't need to make different API calls.
>>
>> The enqueue/dequeue type API was also used with QAT in mind. While QAT HW doesn't support these xform chains at the moment, it could potentially do so in the future.
>> As a side note, as part of the work of adding the accelerator API, the QAT PMD will be updated to support the DOCSIS Crypto-CRC accelerator xform chain, where the Crypto
>> is done on QAT HW and the CRC will be done in SW, most likely through a call to the optimized rte_net_crc library. This will give a consistent API for the DOCSIS-MAC data-plane
>> pipeline prototype we have developed, which uses both AESNI-MB and QAT for benchmarks.
>>
>> We will take your feedback on the enqueue/dequeue approach for SW devices into consideration though during development.
>>
>> Finally, I'm unsure what you mean by this line:
>>
>> "I have doubt that the session-based scheme will work for HW or not as both difference HW needs to work hand in hand(IOMMU aspects for two PCI device)"
>>
>> What do you mean by different HW working "hand in hand" and "two PCI device"?
>> The intention is that 1 HW device (or its PMD) would have to support the accel xform chain
>
> I was thinking, it will be N PCIe devices that create the chain. Each
> distinct PCI device does the fixed-function and chains them together.
>

The case we were looking at is more focused on a single discrete
(multi-function) device (from the perspective of the host) providing a
number of transforms (operations) in a single pass, rather than the case
of N discrete hardware devices (from the perspective of the host) chained
together to achieve the same set of transforms.

> I do understand the usage of QAT HW and CRC in SW.
> So if I understand it correctly, in rte_security, we are combining
> rte_ethdev and rte_cryptodev. With this spec, we are trying to
> combine rte_cryptodev and rte_compressdev. So it looks good to me. My only
> remaining concern is the name of this API: accelerator is too generic a
> name. IMO, like rte_security, we may need to give a more meaningful name
> for the use case where cryptodev and compressdev can work together.
>