Date: Fri, 24 Feb 2017 14:11:53 +0000
From: Bruce Richardson
To: David Hunt
Cc: dev@dpdk.org
Message-ID: <20170224141153.GH106392@bricha3-MOBL3.ger.corp.intel.com>
References: <1485163480-156507-2-git-send-email-david.hunt@intel.com>
 <1487647073-129064-1-git-send-email-david.hunt@intel.com>
 <1487647073-129064-7-git-send-email-david.hunt@intel.com>
In-Reply-To: <1487647073-129064-7-git-send-email-david.hunt@intel.com>
Organization: Intel Research and Development Ireland Ltd.
User-Agent: Mutt/1.7.2 (2016-11-26)
Subject: Re: [dpdk-dev] [PATCH v7 06/17] lib: add SIMD flow matching to distributor

On Tue, Feb 21, 2017 at 03:17:42AM +0000, David Hunt wrote:
> Add an optimised version of the in-flight flow matching algorithm
> using SIMD instructions. This should give up to 1.5x over the scalar
> versions performance.
>
> Falls back to scalar version if SSE4.2 not available
>
> Signed-off-by: David Hunt
> ---
>  lib/librte_distributor/Makefile                    |   7 ++
>  lib/librte_distributor/rte_distributor.c           |  16 ++-
>  .../rte_distributor_match_generic.c                |  43 ++++++++
>  lib/librte_distributor/rte_distributor_match_sse.c | 113 +++++++++++++++++++++
>  lib/librte_distributor/rte_distributor_private.h   |   5 +
>  5 files changed, 182 insertions(+), 2 deletions(-)
>  create mode 100644 lib/librte_distributor/rte_distributor_match_generic.c
>  create mode 100644 lib/librte_distributor/rte_distributor_match_sse.c
>
> diff --git a/lib/librte_distributor/Makefile b/lib/librte_distributor/Makefile
> index 276695a..5b599c6 100644
> --- a/lib/librte_distributor/Makefile
> +++ b/lib/librte_distributor/Makefile
> @@ -44,6 +44,13 @@ LIBABIVER := 1
>  # all source are stored in SRCS-y
>  SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) := rte_distributor_v20.c
>  SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += rte_distributor.c
> +ifeq ($(CONFIG_RTE_ARCH_X86),y)
> +SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += rte_distributor_match_sse.c
> +CFLAGS_rte_distributor_match_sse.o += -msse4.2
> +else
> +SRCS-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR) += rte_distributor_match_generic.c
> +endif
> +
>
>  # install this header file
>  SYMLINK-$(CONFIG_RTE_LIBRTE_DISTRIBUTOR)-include := rte_distributor_v20.h
> diff --git a/lib/librte_distributor/rte_distributor.c b/lib/librte_distributor/rte_distributor.c
> index ae8d508..b8e171c 100644
> --- a/lib/librte_distributor/rte_distributor.c
> +++ b/lib/librte_distributor/rte_distributor.c
> @@ -392,7 +392,13 @@ rte_distributor_process(struct rte_distributor *d,
>  		for (; i < RTE_DIST_BURST_SIZE; i++)
>  			flows[i] = 0;
>
> -		find_match_scalar(d, &flows[0], &matches[0]);
> +		switch (d->dist_match_fn) {
> +		case RTE_DIST_MATCH_VECTOR:
> +			find_match_vec(d, &flows[0], &matches[0]);
> +			break;
> +		default:
> +			find_match_scalar(d, &flows[0], &matches[0]);
> +		}
>
>  		/*
>  		 * Matches array now contain the intended worker ID (+1) of
> @@ -608,7 +614,13 @@ rte_distributor_create(const char *name,
>  	snprintf(d->name, sizeof(d->name), "%s", name);
>  	d->num_workers = num_workers;
>  	d->alg_type = alg_type;
> -	d->dist_match_fn = RTE_DIST_MATCH_SCALAR;
> +
> +#if defined(RTE_ARCH_X86)
> +	if (rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_2)) {
> +		d->dist_match_fn = RTE_DIST_MATCH_VECTOR;
> +	} else

Minor nit: you can remove the braces here.

/Bruce
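
For illustration only, here is a minimal standalone sketch of the brace-free
form suggested above. It is not the patch itself: the enum, the function
name select_match_fn() and the has_sse42 parameter are hypothetical
stand-ins so that the snippet compiles without DPDK headers, and the scalar
assignment in the else branch is an assumption, since that part of the hunk
is trimmed from the quote.

```c
/* Hypothetical stand-ins for RTE_DIST_MATCH_SCALAR / RTE_DIST_MATCH_VECTOR. */
enum match_fn { MATCH_SCALAR = 0, MATCH_VECTOR };

/*
 * has_sse42 stands in for the result of
 * rte_cpu_get_flag_enabled(RTE_CPUFLAG_SSE4_2) in the quoted patch.
 */
static enum match_fn
select_match_fn(int has_sse42)
{
	enum match_fn fn;

	/* Each branch is a single statement, so the braces can go. */
	if (has_sse42)
		fn = MATCH_VECTOR;
	else
		fn = MATCH_SCALAR; /* assumed: else body is not shown in the quote */

	return fn;
}
```

In the real function the result would presumably be stored in
d->dist_match_fn rather than returned.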