From: Jerin Jacob
Date: Tue, 18 Feb 2020 10:42:00 +0530
To: "Doherty, Declan"
Cc: "Coyle, David", dpdk-dev, "Trahe, Fiona"
Subject: Re: [dpdk-dev] [RFC] Accelerator API to chain packet processing functions
References: <1580827512-178449-1-git-send-email-david.coyle@intel.com> <3912d015-8bf8-f3c7-15cf-6e68c6c7515e@intel.com>
In-Reply-To: <3912d015-8bf8-f3c7-15cf-6e68c6c7515e@intel.com>

On Thu, Feb 13, 2020 at 5:01 PM Doherty, Declan wrote:
>
> On 06/02/2020 10:54 AM, Jerin Jacob wrote:
> > On Thu, Feb 6, 2020 at 3:35 PM Coyle, David wrote:
> >>
> >> Hi Jerin,
> >
> > Hi David,
> >
> >> Thanks for the comments. Please see replies below.
> >>
> >> Kind Regards,
> >> David
> >>
> >>> On Tue, Feb 4, 2020 at 8:15 PM David Coyle wrote:
> >>>>
> >>>> Introduction
> >>>> ============
> >>>>
> >>>> This RFC introduces a new DPDK library, rte_accelerator.
> >>>>
> >>>> The main aim of this library is to provide a flexible and extensible
> >>>> way of combining one or more packet-processing functions into a single
> >>>> operation, thereby allowing these to be performed in parallel in
> >>>> optimized software libraries or in a hardware accelerator. These
> >>>> functions can include cryptography, compression and CRC/checksum
> >>>> calculation, while others can potentially be added in the future.
> >>>> Performing these functions in parallel as a single operation can
> >>>> enable a significant performance improvement.
> >>>>
> >>>>
> >>>> Background
> >>>> ==========
> >>>>
> >>>> There are a number of byte-wise operations which are present and
> >>>> common across many access network data-plane pipelines, such as Cipher,
> >>>> Authentication, CRC, Bit-Interleaved-Parity (BIP), other checksums etc.
> >>>> Some prototyping has been done at Intel in relation to the 01.org
> >>>> access-network-dataplanes project to prove that a significant
> >>>> performance improvement is possible when such byte-wise operations are
> >>>> combined into a single pass of packet data processing. This performance
> >>>> boost has been prototyped for both XGS-PON MAC data-plane and DOCSIS MAC
> >>>> data-plane pipelines.
> >>>
> >>>
> >>> Could you share the relative performance numbers to show the gain?
> >>
> >> [DC] As mentioned above, the main performance gains are when the packet
> >> processing operations can be combined into a single pass of the packet.
> >> Both Crypto-CRC-BIP (for XGS-PON MAC) and Crypto-CRC (for DOCSIS MAC)
> >> have been implemented in the AESNI MB library as single pass operation
> >> chains.
> >>
> >> We have modified the dpdk-crypto-perf-tester as part of our prototyping
> >> to test the cases where:
> >> 1) each packet processing function is done as an independent stage (e.g.
> >>    calling rte_net_crc for CRC, AESNI MB through rte_cryptodev for
> >>    cipher, and a C function to calculate the BIP)
> >> 2) all packet processing functions are done as a single-pass operation
> >>    in AESNI MB through rte_cryptodev
> >>
> >> We see the following results for 1024 byte input frames from
> >> dpdk-crypto-perf-tester:
> >> - XGS-PON MAC (Crypto-CRC-BIP):
> >>     - 3 independent stages: 1429 cycles/buf (13.75Gbps)
> >>     - 1 single-pass stage: 896 cycles/buf (21.9Gbps)
> >>     37% cycle reduction
> >>
> >> - DOCSIS MAC (Crypto-CRC):
> >>     - 2 independent stages: 1421 cycles/buf (13.84Gbps)
> >>     - 1 single-pass stage: 1133 cycles/buf (17.34Gbps)
> >>     20% cycle reduction
> >>
> >> Adding the accelerator API will allow vendors to gain the benefits of
> >> these cycle savings.
> >
> > Numbers make sense. I have seen a similar performance improvement
> > when doing it in one pass with CPU instructions.
> >
> >
> >>>> - XGS-PON MAC: Crypto-CRC-BIP
> >>>>     - Order:
> >>>>         - Downstream: CRC, Encrypt, BIP
> >>>
> >>> I understand that if the chain has two operations then it may be
> >>> possible to have handcrafted SW code to do both operations in one pass.
> >>> I understand the spec is agnostic on the number of passes required to
> >>> enable the xform, but to understand the SW/HW capability: in the above
> >>> case, "CRC, Encrypt, BIP", is it done in one pass in SW, three passes
> >>> in SW, or one pass using HW?
> >>
> >> [DC] The CRC, Encrypt, BIP is also currently done as 1 pass in AESNI MB
> >> library SW.
> >> However, this could also be performed as a single pass in a HW
> >> accelerator.
> >
> > As a specification, cascading the xform chains makes sense.
> > Do we have any HW that supports chaining more than "two" xforms in one
> > pass?
> > i.e. a real chaining function where two HW blocks work hand in hand
> > for chaining.
> > If none, it may be better to abstract it as a synchronous API (no
> > dequeue, no enqueue) for the CPU use case.
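
The cycle reductions quoted above follow directly from the cycles/buf
figures, and the Gbps figures can be cross-checked the same way. A small
sketch of that arithmetic follows. The helper names are made up here, and
the throughput relation (frame bits x clock / cycles per buffer) is an
assumption about how the tester derives Gbps, since the thread does not say
which clock was used:

```c
#include <assert.h>

/* Percentage of cycles saved when going from `staged` cycles/buf
 * (independent stages) to `single_pass` cycles/buf. */
double cycle_reduction_pct(double staged, double single_pass)
{
    assert(staged > 0.0);
    return 100.0 * (staged - single_pass) / staged;
}

/* Clock in GHz implied by a throughput in Gbps and a cost in cycles/buf,
 * ASSUMING throughput = frame_bits * clock / cycles_per_buf.
 * This relation is an assumption, not something stated in the thread. */
double implied_clock_ghz(double gbps, double cycles_per_buf,
                         double frame_bytes)
{
    assert(frame_bytes > 0.0);
    return gbps * cycles_per_buf / (frame_bytes * 8.0);
}
```

With the figures above this gives roughly 37.3% (1429 -> 896) and 20.3%
(1421 -> 1133), and under the stated assumption all four Gbps values are
consistent with a clock of about 2.4 GHz.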
> >
>
> Were you thinking along the lines of a synchronous API option like that
> just introduced to cryptodev? i.e. something like
>
> uint16_t rte_accelerator_process(struct rte_accelerator_ctx *ctx,
>                                  struct rte_accelerator_op ops[],
>                                  uint16_t nb_ops);

Yes. Maybe with a capability or preference so that the application can
denote its preferred usage model.
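
To make the capability/preference idea concrete, here is a minimal sketch
of how it could sit next to a synchronous burst call. Everything named
sketch_accel_* or SKETCH_ACCEL_CAP_* is hypothetical and invented for
illustration; none of it is part of DPDK or of the RFC. Only the
(ctx, ops[], nb_ops) shape of the process call is taken from the prototype
quoted above, and the XOR loop is a toy stand-in for a real xform chain:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical capability bits (not DPDK, not from the RFC). */
#define SKETCH_ACCEL_CAP_ASYNC (1u << 0) /* enqueue/dequeue model */
#define SKETCH_ACCEL_CAP_SYNC  (1u << 1) /* process-and-return model */

/* Illustrative stand-in for the RFC's rte_accelerator_op. */
struct sketch_accel_op {
    uint8_t *data;   /* packet bytes, transformed in place */
    size_t   len;    /* number of bytes in data */
    int      status; /* 0 = not processed, 1 = processed */
};

/* Illustrative stand-in for the RFC's rte_accelerator_ctx. */
struct sketch_accel_ctx {
    uint32_t caps;   /* usage models the device advertises */
    uint8_t  key;    /* parameter of the toy transform */
};

/* Return the usage model to use: the application's preference if the
 * device advertises it, otherwise any advertised model, otherwise 0. */
uint32_t sketch_accel_select_mode(const struct sketch_accel_ctx *ctx,
                                  uint32_t preferred)
{
    assert(ctx != NULL);
    if (preferred != 0 && (ctx->caps & preferred) == preferred)
        return preferred;
    if (ctx->caps & SKETCH_ACCEL_CAP_SYNC)
        return SKETCH_ACCEL_CAP_SYNC;
    if (ctx->caps & SKETCH_ACCEL_CAP_ASYNC)
        return SKETCH_ACCEL_CAP_ASYNC;
    return 0;
}

/* Synchronous burst call: no enqueue, no dequeue. Processes ops in order
 * and returns how many completed; stops at the first invalid op. */
uint16_t sketch_accel_process(struct sketch_accel_ctx *ctx,
                              struct sketch_accel_op ops[],
                              uint16_t nb_ops)
{
    uint16_t done = 0;

    if (ctx == NULL || !(ctx->caps & SKETCH_ACCEL_CAP_SYNC))
        return 0; /* device does not offer the synchronous model */

    for (uint16_t i = 0; i < nb_ops; i++) {
        if (ops[i].data == NULL)
            break;
        for (size_t j = 0; j < ops[i].len; j++)
            ops[i].data[j] ^= ctx->key; /* toy single-pass transform */
        ops[i].status = 1;
        done++;
    }
    return done;
}
```

In this sketch the application would call sketch_accel_select_mode() once
at setup with its preference, then use the synchronous call if
SKETCH_ACCEL_CAP_SYNC comes back and fall back to enqueue/dequeue
otherwise.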