From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-pf0-f193.google.com (mail-pf0-f193.google.com [209.85.192.193])
 by dpdk.org (Postfix) with ESMTP id 5FAD32C6A
 for ; Tue, 8 Mar 2016 00:00:06 +0100 (CET)
Received: by mail-pf0-f193.google.com with SMTP id 184so8897906pff.1
 for ; Mon, 07 Mar 2016 15:00:06 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20120113;
 h=from:to:cc:subject:date:message-id;
 bh=moRz7YFnpa5hYkwGd79ez5iPkXeEu9iTyC0yTegbNgE=;
 b=hS6CAmBHebtUsUZ2xHqRM4jV/cNPHLidDFr72h0Qby4KZRvT9TBGREi5iH957PFMPp
 ZJ7+mByb+MOy2dnLmj/+7kVW1a2+01AunyfEVkEi3ZRGTSSVPSFqa9nd42ARd9mU12td
 qyc8Jw80QT5lTdRr6VRPKOGxy4AcD4MZ/xvk1B1/44LaST7h/4KQ5LT9/PEX4Pud55hk
 KnmzyavJZoZS9CLBMEOcWkQR4TyVMk84UHVThr1YhqJZ+TkCseVRgVDnai3fk80GixeW
 YbN75I5Csx+SmpB8eRK4v7ZaHKzND6XEc5BQ3q66WrF0TYjHq3dPmP13eKIMEFmN9744
 dFOA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20130820;
 h=x-gm-message-state:from:to:cc:subject:date:message-id;
 bh=moRz7YFnpa5hYkwGd79ez5iPkXeEu9iTyC0yTegbNgE=;
 b=fOsN+L0haWoxuW9MPxtOr3LLr4K398PhCfecb4/YiRoxDOC+bnxv5J7Gv11PGP5QN+
 yQpbKvg33nAOxvhXifbT7M7P9m4MFxxFi7M4OSAONJJEz3lQL6U6jPjHA/rKJhBo/8iZ
 n5DkWOLKpzJOR6Qx4/eIUJK2CJvAmv9+bVIJUwe748NErHbMbmhBQNAvDvs3IsIsRvl4
 ZFpzRMYLIj79L1XWrN+U7f1cvTf0llYhmAFAZJCWF4ZswVwOQvb+Q3ma6XPM7tQVXrmG
 tUDbtMaUbHLbM6QagNVQrBC8bdSJ1kt/ZCBg4lcbTLKPqKPAQkCe1+fCou0DPZDL7L+O
 bqUw==
X-Gm-Message-State: AD7BkJJZHC+CbckoBZAcaE7Eqz2YmrvWxT/oi/6CQfck0PjIhj0lEej1aZv7BDdx7gbayg==
X-Received: by 10.98.74.17 with SMTP id x17mr37228728pfa.14.1457391605708;
 Mon, 07 Mar 2016 15:00:05 -0800 (PST)
Received: from user-PC.hsd1.ca.comcast.net (c-24-130-109-45.hsd1.ca.comcast.net. [24.130.109.45])
 by smtp.gmail.com with ESMTPSA id y68sm26389555pfi.6.2016.03.07.15.00.04
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
 Mon, 07 Mar 2016 15:00:04 -0800 (PST)
From: Ravi Kerur
To: dev@dpdk.org
Date: Mon, 7 Mar 2016 14:59:43 -0800
Message-Id: <1457391583-29604-1-git-send-email-rkerur@gmail.com>
X-Mailer: git-send-email 1.9.1
Subject: [dpdk-dev] [PATCH v1 0/2] rte_memcmp functions
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
X-List-Received-Date: Mon, 07 Mar 2016 23:00:06 -0000

This patch set provides an AVX/SSE-based memcmp implementation for x86.
On the other architectures supported by DPDK, rte_memcmp simply calls
memcmp.

The following are preliminary performance numbers, in ticks per
operation (lower is better), measured on an Intel(R) Xeon(R) CPU
E5-2620 v3 @ 2.40GHz. Each "RTE memcmp" table is followed by the
matching table for plain memcmp.

RTE>>memcmp_perf_autotest

*** RTE memcmp equal performance test results ***
*** Length (bytes), Ticks/Op. ***
*** 2, 4.8526 ***
*** 5, 5.4023 ***
*** 8, 4.5067 ***
*** 9, 5.4024 ***
*** 15, 7.2069 ***
*** 16, 4.5027 ***
*** 17, 4.5020 ***
*** 31, 4.5020 ***
*** 32, 4.5033 ***
*** 33, 5.1377 ***
*** 63, 6.9069 ***
*** 64, 6.9472 ***
*** 65, 9.6301 ***
*** 127, 13.5122 ***
*** 128, 10.8028 ***
*** 129, 11.7058 ***
*** 191, 14.4105 ***
*** 192, 14.4251 ***
*** 193, 16.2139 ***
*** 255, 18.0125 ***
*** 256, 17.1150 ***
*** 257, 18.9129 ***
*** 319, 20.7148 ***
*** 320, 20.7161 ***
*** 321, 22.5198 ***
*** 383, 24.3169 ***
*** 384, 22.5195 ***
*** 385, 24.3197 ***
*** 447, 26.1171 ***
*** 448, 26.1289 ***
*** 449, 27.9168 ***
*** 511, 29.7252 ***
*** 512, 29.7202 ***
*** 513, 27.9253 ***
*** 767, 38.7506 ***
*** 768, 36.9327 ***
*** 769, 38.7259 ***
*** 1023, 49.5368 ***
*** 1024, 49.5347 ***
*** 1025, 46.8414 ***
*** 1522, 68.4517 ***
*** 1536, 68.4522 ***
*** 1600, 67.5478 ***
*** 2048, 87.3674 ***
*** 2560, 106.2776 ***
*** 3072, 125.1937 ***
*** 3584, 144.1503 ***
*** 4096, 163.0243 ***
*** 4608, 181.9367 ***
*** 5632, 219.7613 ***
*** 6144, 238.6745 ***
*** 6656, 257.6009 ***
*** 7168, 276.5084 ***
*** 7680, 295.4162 ***
*** 8192, 314.3726 ***
*** 16834, 746.1065 ***

*** memcmp equal performance test results ***
*** Length (bytes), Ticks/Op. ***
*** 2, 9.0100 ***
*** 5, 8.1065 ***
*** 8, 9.1944 ***
*** 9, 9.0044 ***
*** 15, 9.0084 ***
*** 16, 10.0695 ***
*** 17, 9.0109 ***
*** 31, 9.9111 ***
*** 32, 9.9085 ***
*** 33, 9.9112 ***
*** 63, 12.6098 ***
*** 64, 12.6106 ***
*** 65, 12.6060 ***
*** 127, 19.8160 ***
*** 128, 19.8145 ***
*** 129, 20.7260 ***
*** 191, 26.1214 ***
*** 192, 26.1195 ***
*** 193, 26.1158 ***
*** 255, 30.6222 ***
*** 256, 30.6267 ***
*** 257, 31.5270 ***
*** 319, 36.0264 ***
*** 320, 36.0497 ***
*** 321, 36.9247 ***
*** 383, 40.5290 ***
*** 384, 40.5265 ***
*** 385, 41.4331 ***
*** 447, 45.9317 ***
*** 448, 45.9324 ***
*** 449, 45.9302 ***
*** 511, 50.4652 ***
*** 512, 50.4379 ***
*** 513, 51.3361 ***
*** 767, 67.5552 ***
*** 768, 67.5464 ***
*** 769, 67.5462 ***
*** 1023, 85.5579 ***
*** 1024, 85.5610 ***
*** 1025, 85.5582 ***
*** 1522, 120.6860 ***
*** 1536, 121.6064 ***
*** 1600, 126.1075 ***
*** 2048, 157.6208 ***
*** 2560, 208.8309 ***
*** 3072, 241.7587 ***
*** 3584, 276.1556 ***
*** 4096, 310.5865 ***
*** 4608, 343.8918 ***
*** 5632, 411.2264 ***
*** 6144, 445.3057 ***
*** 6656, 480.4620 ***
*** 7168, 512.5769 ***
*** 7680, 547.9394 ***
*** 8192, 582.7687 ***
*** 16834, 1456.4280 ***

*** RTE memcmp greater than performance test results ***
*** Length (bytes), Ticks/Op. ***
*** 1, 22.5862 ***
*** 8, 24.9140 ***
*** 15, 25.3942 ***
*** 16, 22.1721 ***
*** 32, 24.1650 ***
*** 64, 25.0849 ***
*** 128, 26.5515 ***
*** 256, 28.7055 ***
*** 512, 35.2811 ***
*** 1024, 44.4520 ***
*** 2048, 64.1331 ***
*** 4096, 103.9949 ***
*** 8192, 184.8077 ***
*** 16384, 345.6785 ***

*** memcmp greater than performance test results ***
*** Length (bytes), Ticks/Op. ***
*** 1, 22.6340 ***
*** 8, 25.5552 ***
*** 15, 25.4223 ***
*** 16, 25.1371 ***
*** 32, 26.7381 ***
*** 64, 27.4521 ***
*** 128, 29.7323 ***
*** 256, 35.8891 ***
*** 512, 46.0419 ***
*** 1024, 101.1564 ***
*** 2048, 159.8415 ***
*** 4096, 230.2136 ***
*** 8192, 366.2912 ***
*** 16384, 647.0217 ***

*** RTE memcmp less than performance test results ***
*** Length (bytes), Ticks/Op. ***
*** 1, 22.6627 ***
*** 8, 26.2665 ***
*** 15, 26.8192 ***
*** 16, 21.7960 ***
*** 32, 23.9878 ***
*** 64, 24.2074 ***
*** 128, 26.8111 ***
*** 256, 28.3444 ***
*** 512, 34.7882 ***
*** 1024, 44.4824 ***
*** 2048, 63.4154 ***
*** 4096, 101.4360 ***
*** 8192, 179.1029 ***
*** 16384, 333.9357 ***

*** memcmp less than performance test results ***
*** Length (bytes), Ticks/Op. ***
*** 1, 22.2894 ***
*** 8, 24.9805 ***
*** 15, 24.8632 ***
*** 16, 24.3448 ***
*** 32, 24.8554 ***
*** 64, 25.7541 ***
*** 128, 29.1831 ***
*** 256, 36.2345 ***
*** 512, 45.8233 ***
*** 1024, 103.4597 ***
*** 2048, 163.5588 ***
*** 4096, 232.7368 ***
*** 8192, 368.1143 ***
*** 16384, 649.0326 ***
Test OK
RTE>>quit

Ravi Kerur (2):
  rte_memcmp functions using Intel AVX and SSE intrinsics
  Test cases for rte_memcmp functions

 app/test/Makefile                                  |  31 +-
 app/test/autotest_data.py                          |  19 +
 app/test/test_memcmp.c                             | 250 +++++++
 app/test/test_memcmp_perf.c                        | 396 +++++++++++
 .../common/include/arch/arm/rte_memcmp.h           |  60 ++
 .../common/include/arch/ppc_64/rte_memcmp.h        |  62 ++
 .../common/include/arch/tile/rte_memcmp.h          |  60 ++
 .../common/include/arch/x86/rte_memcmp.h           | 786 +++++++++++++++++++++
 lib/librte_eal/common/include/generic/rte_memcmp.h | 175 +++++
 9 files changed, 1838 insertions(+), 1 deletion(-)
 create mode 100644 app/test/test_memcmp.c
 create mode 100644 app/test/test_memcmp_perf.c
 create mode 100644 lib/librte_eal/common/include/arch/arm/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/arch/ppc_64/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/arch/tile/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/arch/x86/rte_memcmp.h
 create mode 100644 lib/librte_eal/common/include/generic/rte_memcmp.h

-- 
1.9.1
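
For readers who want a feel for the general technique before opening the x86 header, here is a minimal sketch. It is not code from this patch set. It assumes an equality-only comparison built on SSE2 intrinsics, with plain memcmp used for the tail bytes and on builds without SSE2 (the same fallback idea the cover letter describes for non-x86 architectures). The names sketch_eq16 and sketch_mem_equal are hypothetical and exist only for illustration; the actual rte_memcmp API and its AVX paths may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h> /* SSE2 intrinsics */
#endif

/*
 * Hypothetical helper (not from the patch): returns 1 if the 16 bytes at
 * a and b are identical, 0 otherwise. Unaligned loads are used so callers
 * need no alignment guarantees.
 */
static inline int
sketch_eq16(const void *a, const void *b)
{
#if defined(__SSE2__)
	const __m128i va = _mm_loadu_si128((const __m128i *)a);
	const __m128i vb = _mm_loadu_si128((const __m128i *)b);

	/* Each equal byte lane sets one mask bit; 0xFFFF means all 16 match. */
	return _mm_movemask_epi8(_mm_cmpeq_epi8(va, vb)) == 0xFFFF;
#else
	/* Assumed fallback for builds without SSE2: defer to libc. */
	return memcmp(a, b, 16) == 0;
#endif
}

/*
 * Hypothetical equality-only compare: walks the buffers in 16-byte blocks
 * and hands the remaining tail (fewer than 16 bytes) to memcmp.
 * Returns 1 if the first n bytes are identical, 0 otherwise.
 */
static inline int
sketch_mem_equal(const void *a, const void *b, size_t n)
{
	const unsigned char *pa = a;
	const unsigned char *pb = b;

	while (n >= 16) {
		if (!sketch_eq16(pa, pb))
			return 0;
		pa += 16;
		pb += 16;
		n -= 16;
	}
	return n == 0 || memcmp(pa, pb, n) == 0;
}
```

Calling sketch_mem_equal(x, y, len) returns 1 when the first len bytes match. Note that this sketch only answers "equal or not"; producing the less-than/greater-than ordering that memcmp returns requires locating the first differing byte, which is left out here.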