From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <bruce.richardson@intel.com>
Received: from mga02.intel.com (mga02.intel.com [134.134.136.20])
 by dpdk.org (Postfix) with ESMTP id 8772DB64F
 for <dev@dpdk.org>; Tue, 17 Feb 2015 14:51:24 +0100 (CET)
Received: from fmsmga003.fm.intel.com ([10.253.24.29])
 by orsmga101.jf.intel.com with ESMTP; 17 Feb 2015 05:51:02 -0800
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.09,594,1418112000"; d="scan'208";a="455705540"
Received: from bricha3-mobl3.ger.corp.intel.com ([10.243.20.28])
 by FMSMGA003.fm.intel.com with SMTP; 17 Feb 2015 05:35:59 -0800
Received: by  (sSMTP sendmail emulation); Tue, 17 Feb 2015 13:50:58 +0025
Date: Tue, 17 Feb 2015 13:50:58 +0000
From: Bruce Richardson <bruce.richardson@intel.com>
To: Olivier MATZ <olivier.matz@6wind.com>
Message-ID: <20150217135057.GA1228@bricha3-MOBL3>
References: <1419266844-4848-1-git-send-email-bruce.richardson@intel.com>
 <54E1FFC4.1060605@6wind.com> <20150216151622.GA1888@bricha3-MOBL3>
 <4549532.PTBUOYF3p0@xps13> <20150217122535.GA18168@bricha3-MOBL3>
 <54E341E2.6090006@6wind.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <54E341E2.6090006@6wind.com>
Organization: Intel Shannon Ltd.
User-Agent: Mutt/1.5.23 (2014-03-12)
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH v2 3/4] examples: example showing use of
 callbacks.
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 17 Feb 2015 13:51:25 -0000

On Tue, Feb 17, 2015 at 02:28:02PM +0100, Olivier MATZ wrote:
> Hi Bruce,
> 
> On 02/17/2015 01:25 PM, Bruce Richardson wrote:
> >On Mon, Feb 16, 2015 at 06:34:37PM +0100, Thomas Monjalon wrote:
> >>2015-02-16 15:16, Bruce Richardson:
> >>>In this specific instance, given that the application does little else, there
> >>>is no real advantage to using the callbacks - it's just to have a simple example
> >>>of how they can be used.
> >>>
> >>>Where callbacks are really designed to be useful, is for extending or augmenting
> >>>hardware capabilities. Taking the example of sequence numbers - to use the most
> >>>trivial example - an application could be written to take advantage of sequence
> >>>numbers written to packets by the hardware which received them. However, if such
> >>>an application was to be used with a NIC which does not provide sequence numbering
> >>>capability, for example, anything using ixgbe driver, the application writer has
> >>>two choices - either modify his application code to check each packet for
> >>>a sequence number in the data path, and add it there post-rx, or alternatively,
> >>>to check the NIC capabilities at initialization time, and add a callback there
> >>>at initialization, if the hardware does not support it. In the latter case,
> >>>the main packet processing body of the application can be written as though
> >>>hardware always has sequence numbering capability, safe in the knowledge that
> >>>any hardware not supporting it will be back-filled by a software fallback at
> >>>initialization-time.
> >>>
> >>>By the same token, we could also look to extend hardware capabilities. For
> >>>different filtering or hashing capabilities, there can be limits in hardware
> >>>which are far less than what we need to use in software. Again, callbacks will
> >>>allow the data path to be written in a way that is oblivious to the underlying
> >>>hardware limits, because software will transparently fill in the gaps.
> >>>
> >>>Hope this makes the use case clear.
> >>
> >>After thinking more about these callbacks, I realize these callbacks won't
> >>help, as Olivier said.
> >>
> >>With callback,
> >>1/ application checks device capability
> >>2/ application provides hardware emulation as DPDK callback
> >>3/ application forgets previous steps
> >>4/ application calls DPDK Rx
> >>5/ DPDK calls callback (without calling optimization)
> >>
> >>Without callback,
> >>1/ application checks device capability
> >>2/ application provides hardware emulation as internal function
> >>3/ application set an internal device-flag to enable this function
> >>4/ application calls DPDK Rx
> >>5/ application calls the hardware emulation if flag is set
> >>
> >>So the only difference is to keep persistent the device information in
> >>the application instead of storing it as a function pointer in the
> >>DPDK struct.
> >>You can also be faster with this approach: at initialization time,
> >>you can check that your NIC supports the feature and use a specific
> >>mainloop that adds or not the sequence number without any runtime
> >>test.
> >
> >That is assuming that all NICs are equal on your system. It's also assuming
> >that you only have a single point in your application where you call RX or
> >TX burst. In the case where you have a couple of different NICs on the system,
> >or where you want to write an application to take advantage of capabilities of
> >different NICs, the ability to resolve all these difference at initialization
> >time is useful. The main packet handling code can be written with just the
> >processing of packets in mind, rather than having to have a set of branches
> >after each RX burst call, or before each TX burst call, to "smooth out" the
> >different NIC capabilities.
> >
> >As for the option of maintaining different main loops for different NICs with
> >different capabilities - that sounds like a maintenance nightmare to
> >me, due to duplicated code! Callbacks is a far cleaner solution than that IMHO.
> 
> Why not just provide a function like this:
> 
>   rte_do_unsupported_stuff_by_software(m[], m_count, wanted_features,
>   	dev_feature_flags)
> 
> This function can be called (or not) from the application mainloop.
> You don't need to maintain several mainloops (for each device) as
> the specific work will be done depending on the given flags. And the
> applications that do not require these features (most applications?)
> are not penalized at all.

Have you measured the performance hit due to this proposed change? In my tests
it's very, very small, even for the fastest vectorized path. If performance is
a real concern, I'm happy enough to have this as a compile-time option so that
those who can't take the small performance hit can avoid it.

/Bruce

> 
> If you have several places where you call rx in your application
> and you want to factorize it, you can have your own function that
> calls rx plus the function that does the additional sw work.
> 
> Regards,
> Olivier
>