DPDK patches and discussions
 help / color / mirror / Atom feed
From: Neil Horman <nhorman@tuxdriver.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>
Subject: Re: [dpdk-dev] [PATCH v4 3/5] hash: add fallback to software CRC32 implementation
Date: Wed, 19 Nov 2014 10:05:33 -0500	[thread overview]
Message-ID: <20141119150533.GA28013@localhost.localdomain> (raw)
In-Reply-To: <2601191342CEEE43887BDE71AB977258213B6C4D@IRSMSX105.ger.corp.intel.com>

On Wed, Nov 19, 2014 at 11:50:40AM +0000, Ananyev, Konstantin wrote:
> 
> 
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Bruce Richardson
> > Sent: Wednesday, November 19, 2014 11:38 AM
> > To: Neil Horman
> > Cc: dev@dpdk.org
> > Subject: Re: [dpdk-dev] [PATCH v4 3/5] hash: add fallback to software CRC32 implementation
> > 
> > On Wed, Nov 19, 2014 at 06:34:08AM -0500, Neil Horman wrote:
> > > On Wed, Nov 19, 2014 at 10:16:14AM +0000, Bruce Richardson wrote:
> > > > On Tue, Nov 18, 2014 at 04:36:24PM -0500, Neil Horman wrote:
> > > > > On Tue, Nov 18, 2014 at 05:52:27PM +0000, Bruce Richardson wrote:
> > > > > > On Tue, Nov 18, 2014 at 12:46:19PM -0500, Neil Horman wrote:
> > > > > > > On Tue, Nov 18, 2014 at 11:13:17PM +0600, Yerden Zhumabekov wrote:
> > > > > > > >
> > > > > > > > 18.11.2014 22:00, Neil Horman пишет:
> > > > > > > > > On Tue, Nov 18, 2014 at 09:06:35PM +0600, Yerden Zhumabekov wrote:
> > > > > > > > >> 18.11.2014 20:41, Neil Horman пишет:
> > > > > > > > >>> On Tue, Nov 18, 2014 at 08:03:40PM +0600, Yerden Zhumabekov wrote:
> > > > > > > > >>>>  /**
> > > > > > > > >>>>   * Use single crc32 instruction to perform a hash on a 4 byte value.
> > > > > > > > >>>> + * Fall back to software crc32 implementation in case SSE4.2 is
> > > > > > > > >>>> + * not supported
> > > > > > > > >>>>   *
> > > > > > > > >>>>   * @param data
> > > > > > > > >>>>   *   Data to perform hash on.
> > > > > > > > >>>> @@ -376,11 +413,18 @@ crc32c_2words(uint64_t data, uint32_t init_val)
> > > > > > > > >>>>  static inline uint32_t
> > > > > > > > >>>>  rte_hash_crc_4byte(uint32_t data, uint32_t init_val)
> > > > > > > > >>>>  {
> > > > > > > > >>>> -	return _mm_crc32_u32(init_val, data);
> > > > > > > > >>>> +#ifdef RTE_MACHINE_CPUFLAG_SSE4_2
> > > > > > > > >>>> +	if (likely(crc32_alg == CRC32_SSE42))
> > > > > > > > >>>> +		return _mm_crc32_u32(init_val, data);
> > > > > > > > >>>> +#endif
> > > > > > > > >>> you don't really need these ifdefs here anymore given that you have a
> > > > > > > > >>> constructor to do the algorithm selection.  In fact you need to remove them, in
> > > > > > > > >>> the event you build on a system that doesn't support SSE42, but run on a system
> > > > > > > > >>> that does.
> > > > > > > > >> Originally, I thought so as well. I wrote the code without these ifdefs,
> > > > > > > > >> but it didn't compile on my machine which doesn't support SSE4.2. Error
> > > > > > > > >> was triggered by nmmintrin.h which has a check for respective GCC
> > > > > > > > >> extension. So I think these ifdefs are indeed required.
> > > > > > > > >>
> > > > > > > > > You need to edit the makefile so that the compiler gets passed the option
> > > > > > > > > -msse42.  That way it will know to emit sse42 instructions. It will also allow
> > > > > > > > > you to remove the ifdef from the include file
> > > > > > > >
> > > > > > > > In this case, I guess there are two options:
> > > > > > > > 1) modify all makefiles which use librte_hash
> > > > > > > > 2) move all function bodies from rte_hash_crc.h to separate module,
> > > > > > > > leaving prototype definitions there only.
> > > > > > > >
> > > > > > > > Everybody's up for the second option? :)
> > > > > > > >
> > > > > > > Crud, you're right, I didn't think about the header inclusion issue.  Is it
> > > > > > > worth adding the jump to enable the dynamic hash selection?
> > > > > > > Neil
> > > > > >
> > > > > > Maybe for cases where SSE4.2 is not currently available, i.e. for generic builds.
> > > > > > For builds where we have hardware support confirmed at compile time, just use
> > > > > > the function from the header file.
> > > > > > Does that make sense?
> > > > > >
> > > > > I'm not certain of that, as I don't think anything can be 'confirmed' at compile
> > > > > time.  I.e. just because you have sse42 at compile time doesn't guarantee you
> > > > > have it at run time with a DSO.  If you have these as macros, you need to enable
> > > > > sse42 whereever you include the file so that the intrinsic works properly.
> > > >
> > > > Well, if you compile with sse42 at compile time, the compiler is free to insert
> > > > sse4 instructions at any place it feels like, irrespective of whether or not you
> > > > use SSE4 intrinsics, so I would never expect such a DSO to work on a system
> > > > without SSE42 support.
> > > >
> > > > >
> > > > > an alternate option would be to not use the intrinsic, and craft some explicit
> > > > > __asm__ statement that executes the right sse42 instructions.  That way the asm
> > > > > is directly emitted, without requiring the -msse42 flag at all, and it will just
> > > > > work in all the files that call it.
> > > > >
> > > >
> > > > I really don't like that approach. I think using intrinsics is much more
> > > > maintainable.
> > > >
> > > I grant you that using an intrinsic is easier to read, but if the code doesn't
> > > compile when using the intrinsic unless you have sse42 turned on, I'm not sure
> > > what choice we have.  and inline asm isn't that hard to maintain.  We're talking
> > > about three lines of code:
> > > asm(
> > >  "mov    %[1],%eax
> > >  mov    %[2],%edx
> > >  crc32l %edx,%eax":
> > >  [edx] "r" (crc) /*output*/
> > >  :
> > >  [1] "r" (crc), /* input */
> > >  [2] "r" (val)
> > >  :
> > >  [eax] "r" /* clobber */
> > > )
> > >
> > > I don't have the syntax quite right, but its pretty easy to read the intent.
> > > Its not like we dont have precidence for this, the atomic interface and several
> > > pmds do this frequently.
> > >
> > > Neil
> > 
> > Fair point. If everyone else is happy enough with it, I'm ok too.
> 
> As I remember with gcc & icc it is possible to specify tht you'd like to compile that particular function
> for different target.
> From https://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html:
> "target
> The target attribute is used to specify that a function is to be compiled with different target options than specified on the command line. This can be used for instance to have functions compiled with a different ISA (instruction set architecture) than the default. You can also use the ‘#pragma GCC target’ pragma to set more than one function to be compiled with specific target options. See Function Specific Option Pragmas, for details about the ‘#pragma GCC target’ pragma.
> For instance on a 386, you could compile one function with target("sse4.1,arch=core2") and another with target("sse4a,arch=amdfam10"). This is equivalent to compiling the first function with -msse4.1 and -march=core2 options, and the second function with -msse4a and -march=amdfam10 options. It is up to the user to make sure that a function is only invoked on a machine that supports the particular ISA it is compiled for (for example by using cpuid on 386 to determine what feature bits and architecture family are used).
> 
>           int core2_func (void) __attribute__ ((__target__ ("arch=core2")));
>           int sse3_func (void) __attribute__ ((__target__ ("sse3")));
> You can either use multiple strings to specify multiple options, or separate the options with a comma (‘,’).
> 
> The target attribute is presently implemented for i386/x86_64, PowerPC, and Nios II targets only. The options supported are specific to each target.
> 
> On the 386, the following options are allowed:
> ...
>  ‘sse4.2’
> ‘no-sse4.2’"
> 
> Wouldn't that suit your purposes?
> Probably you can even keep your function inline with that approach.
> 
That would definately work, and be a great solution in this case.  However, its
limited to only the most recent version of gcc.  If thats an acceptible
constraint on the DPDK, then its ok, but distributions are only starting to
include that version now.  Not sure of the icc status of that attribute.

Neil

> Konstantin
> 
> 
> > 
> > /Bruce

  parent reply	other threads:[~2014-11-19 14:55 UTC|newest]

Thread overview: 98+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-09-03  6:05 [dpdk-dev] [PATCH 0/2] rewritten rte_hash_crc() call Yerden Zhumabekov
2014-09-03  6:05 ` [dpdk-dev] [PATCH 1/2] hash: add new rte_hash_crc_8byte call Yerden Zhumabekov
2014-09-03  6:05 ` [dpdk-dev] [PATCH 2/2] hash: rte_hash_crc uses 8- and 4-byte CRC32 intrinsics Yerden Zhumabekov
2014-11-13 17:33 ` [dpdk-dev] [PATCH 0/2] rewritten rte_hash_crc() call Thomas Monjalon
2014-11-14  0:52   ` Neil Horman
2014-11-14  7:15     ` Yerden Zhumabekov
2014-11-14 11:33       ` Neil Horman
2014-11-14 11:57         ` Yerden Zhumabekov
2014-11-14 13:53           ` Neil Horman
2014-11-14 14:33             ` Thomas Monjalon
2014-11-14 16:43             ` Yerden Zhumabekov
2014-11-14 18:41               ` Neil Horman
2014-11-15 21:45                 ` Yerden Zhumabekov
2014-11-16 17:59 ` [dpdk-dev] [PATCH v2 0/4] rte_hash_crc reworked to be platform-independent Yerden Zhumabekov
2014-11-17 11:31   ` Neil Horman
2014-11-17 11:54     ` Yerden Zhumabekov
2014-11-25 17:05       ` Stephen Hemminger
2014-11-18  3:21   ` [dpdk-dev] [PATCH v3 0/5] " Yerden Zhumabekov
2014-11-18  3:21     ` [dpdk-dev] [PATCH v3 1/5] hash: add software CRC32 implementation Yerden Zhumabekov
2014-11-25 17:34       ` Stephen Hemminger
2014-11-18  3:21     ` [dpdk-dev] [PATCH v3 2/5] hash: add new rte_hash_crc_8byte call Yerden Zhumabekov
2014-11-18  3:21     ` [dpdk-dev] [PATCH v3 3/5] hash: add fallback to software CRC32 implementation Yerden Zhumabekov
2014-11-18  4:56       ` Yerden Zhumabekov
2014-11-18 13:33         ` Neil Horman
2014-11-18 13:37           ` Yerden Zhumabekov
2014-11-18 13:43           ` Thomas Monjalon
2014-11-18  3:21     ` [dpdk-dev] [PATCH v3 4/5] hash: rte_hash_crc() slices data into 8-byte pieces Yerden Zhumabekov
2014-11-18  3:25     ` [dpdk-dev] [PATCH v3 5/5] test: remove redundant compile checks Yerden Zhumabekov
2014-11-16 17:59 ` [dpdk-dev] [PATCH v2 1/4] hash: add software CRC32 implementation Yerden Zhumabekov
2014-11-16 17:59 ` [dpdk-dev] [PATCH v2 2/4] hash: add new rte_hash_crc_8byte call Yerden Zhumabekov
2014-11-16 17:59 ` [dpdk-dev] [PATCH v2 3/4] hash: add fallback to software CRC32 implementation Yerden Zhumabekov
2014-11-17 12:34   ` Ananyev, Konstantin
2014-11-17 12:41     ` Yerden Zhumabekov
2014-11-17 14:06     ` Neil Horman
2014-11-16 17:59 ` [dpdk-dev] [PATCH v2 4/4] hash: rte_hash_crc() slices data into 8-byte pieces Yerden Zhumabekov
2014-11-18 14:03 ` [dpdk-dev] [PATCH v4 0/5] rte_hash_crc reworked to be platform-independent Yerden Zhumabekov
2014-11-18 14:03   ` [dpdk-dev] [PATCH v4 1/5] hash: add software CRC32 implementation Yerden Zhumabekov
2014-11-18 14:03   ` [dpdk-dev] [PATCH v4 2/5] hash: add new rte_hash_crc_8byte call Yerden Zhumabekov
2014-11-18 14:03   ` [dpdk-dev] [PATCH v4 3/5] hash: add fallback to software CRC32 implementation Yerden Zhumabekov
2014-11-18 14:41     ` Neil Horman
2014-11-18 15:06       ` Yerden Zhumabekov
2014-11-18 16:00         ` Neil Horman
2014-11-18 16:04           ` Bruce Richardson
2014-11-18 16:08             ` Bruce Richardson
2014-11-18 16:38             ` Neil Horman
2014-11-18 17:13           ` Yerden Zhumabekov
2014-11-18 17:29             ` Wang, Shawn
2014-11-19  4:07               ` Yerden Zhumabekov
2014-11-18 17:46             ` Neil Horman
2014-11-18 17:52               ` Bruce Richardson
2014-11-18 21:36                 ` Neil Horman
2014-11-19  3:51                   ` Yerden Zhumabekov
2014-11-19 10:16                   ` Bruce Richardson
2014-11-19 11:34                     ` Neil Horman
2014-11-19 11:38                       ` Bruce Richardson
2014-11-19 11:50                         ` Ananyev, Konstantin
2014-11-19 11:59                           ` Yerden Zhumabekov
2014-11-19 15:05                           ` Neil Horman [this message]
2014-11-19 16:51                             ` Ananyev, Konstantin
2014-11-19 11:35                     ` Yerden Zhumabekov
2014-11-19 15:07                       ` Neil Horman
2014-11-20  3:04                         ` Yerden Zhumabekov
2014-11-18 17:58               ` Yerden Zhumabekov
2014-11-18 14:03   ` [dpdk-dev] [PATCH v4 4/5] hash: rte_hash_crc() slices data into 8-byte pieces Yerden Zhumabekov
2014-11-18 14:05   ` [dpdk-dev] [PATCH v4 5/5] test: remove redundant compile checks Yerden Zhumabekov
2014-11-20  5:15 ` [dpdk-dev] [PATCH v5 0/7] rte_hash_crc reworked to be platform-independent Yerden Zhumabekov
2014-11-20  5:16   ` [dpdk-dev] [PATCH v5 1/7] hash: add software CRC32 implementation Yerden Zhumabekov
2014-11-20  5:16   ` [dpdk-dev] [PATCH v5 2/7] hash: add assembly implementation of CRC32 intrinsics Yerden Zhumabekov
2014-11-20  5:16   ` [dpdk-dev] [PATCH v5 3/7] hash: replace built-in functions implementing SSE4.2 Yerden Zhumabekov
2014-11-20  5:16   ` [dpdk-dev] [PATCH v5 4/7] hash: add rte_hash_crc_8byte function Yerden Zhumabekov
2014-11-21 11:22     ` Neil Horman
2014-11-21 11:26       ` Yerden Zhumabekov
2014-11-20  5:17   ` [dpdk-dev] [PATCH v5 6/7] hash: rte_hash_crc() slices data into 8-byte pieces Yerden Zhumabekov
2014-11-20  5:17   ` [dpdk-dev] [PATCH v5 7/7] test: remove redundant compile checks Yerden Zhumabekov
2014-11-20  5:17   ` [dpdk-dev] [PATCH v5 5/7] hash: add fallback to software CRC32 implementation Yerden Zhumabekov
2014-11-27 21:04   ` [dpdk-dev] [PATCH v5 0/7] rte_hash_crc reworked to be platform-independent Thomas Monjalon
2014-11-28  3:28     ` Yerden Zhumabekov
2015-01-29  8:48 ` [dpdk-dev] [PATCH v6 " Yerden Zhumabekov
2015-01-29  8:48   ` [dpdk-dev] [PATCH v6 1/7] hash: add software CRC32 implementation Yerden Zhumabekov
2015-01-29  8:48   ` [dpdk-dev] [PATCH v6 2/7] hash: add assembly implementation of CRC32 intrinsics Yerden Zhumabekov
2015-02-02  5:15     ` Liang, Cunming
2015-02-02  5:34       ` Yerden Zhumabekov
2015-02-02  5:59         ` Liang, Cunming
2015-01-29  8:49   ` [dpdk-dev] [PATCH v6 3/7] hash: replace built-in functions implementing SSE4.2 Yerden Zhumabekov
2015-01-29  8:49   ` [dpdk-dev] [PATCH v6 4/7] hash: add rte_hash_crc_8byte function Yerden Zhumabekov
2015-01-29  8:50   ` [dpdk-dev] [PATCH v6 5/7] hash: add fallback to software CRC32 implementation Yerden Zhumabekov
2015-01-29  8:50   ` [dpdk-dev] [PATCH v6 6/7] hash: rte_hash_crc() slices data into 8-byte pieces Yerden Zhumabekov
2015-01-29  8:50   ` [dpdk-dev] [PATCH v6 7/7] test: remove redundant compile checks Yerden Zhumabekov
2015-02-01 14:13   ` [dpdk-dev] [PATCH v6 0/7] rte_hash_crc reworked to be platform-independent Neil Horman
2015-02-02  3:07     ` Yerden Zhumabekov
2015-02-02  3:31       ` Neil Horman
2015-02-02  5:18         ` [dpdk-dev] HA: " Жумабеков Ерден Мирзагулович
2015-02-02  5:39         ` [dpdk-dev] " Yerden Zhumabekov
2015-02-19 15:21           ` Bruce Richardson
2015-02-23 17:36             ` Thomas Monjalon
2015-02-24  3:00               ` Yerden Zhumabekov
2015-02-24  3:10                 ` Thomas Monjalon
2015-02-24  9:12                   ` Bruce Richardson

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141119150533.GA28013@localhost.localdomain \
    --to=nhorman@tuxdriver.com \
    --cc=dev@dpdk.org \
    --cc=konstantin.ananyev@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).