From: Linhaifeng
Date: Thu, 22 Jan 2015 20:53:13 +0800
To: Bruce Richardson, Tetsuya Mukawa
Cc: "dev@dpdk.org"
Subject: Re: [dpdk-dev] some questions about rte_memcpy
Message-ID: <54C0F2B9.7050006@huawei.com>
In-Reply-To: <20150122113426.GC4580@bricha3-MOBL3>
References: <54C070DF.1050006@huawei.com>
 <20150122044531.GA13230@mhcomputing.net>
 <54C08B54.50700@huawei.com>
 <20150122073526.GA14800@mhcomputing.net>
 <54C0CFB5.909@igel.co.jp>
 <20150122113426.GC4580@bricha3-MOBL3>
List-Id: patches and discussions about DPDK

On 2015/1/22 19:34, Bruce Richardson wrote:
> On Thu, Jan 22, 2015 at 07:23:49PM +0900, Tetsuya Mukawa wrote:
>> On 2015/01/22 16:35, Matthew Hall wrote:
>>> On Thu, Jan 22, 2015 at 01:32:04PM +0800, Linhaifeng wrote:
>>>> Do you mean if call rte_memcpy before rte_eal_init() would crash?why?
>>> No guarantee. But a theory. It might use some things from the EAL init to
>>> figure out which version of the accelerated algorithm to use.
>>
>> This selection is done at compile-time.
>> And if the size is constant, I guess DPDK assumes memcpy is replaced by
>> inline __builtin_memcpy.
>> I haven't checked the performance of builtin memcpy, but probably much
>> faster.
>>
>
> Yes, that assumption is correct. A couple of years ago we discovered that for
> constant size values, the compiler would generate much faster code for us
> using a regular memcpy than rte_memcpy, hence the macro.
>
> /Bruce
>
>> Tetsuya
>>
>>> Matthew.
>>
>>
>
>

Hi Bruce,

I tested it. The results are mostly as you said: using a constant size may be
faster, but sometimes it is not.

linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 16 9999999
rte_memcpy(constant) used:279893712 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277818600
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 16 9999999
rte_memcpy(constant) used:279264328 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277667116
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 16 9999999
rte_memcpy(constant) used:279491832 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277622772
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 32 9999999
rte_memcpy(constant) used:279402156 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277738464
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 32 9999999
rte_memcpy(constant) used:279305172 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277483004
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 32 9999999
rte_memcpy(constant) used:279784124 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:277605332
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 48 9999999
rte_memcpy(constant) used:322817260
rte_memcpy(variable) used:350333864
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 48 9999999
rte_memcpy(constant) used:322840748
rte_memcpy(variable) used:350297868
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 48 9999999
rte_memcpy(constant) used:322488240
rte_memcpy(variable) used:350348652
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 64 9999999
rte_memcpy(constant) used:322021428
rte_memcpy(variable) used:350416440
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 64 9999999
rte_memcpy(constant) used:321370900
rte_memcpy(variable) used:350355796
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 64 9999999
rte_memcpy(constant) used:322704552
rte_memcpy(variable) used:349900832
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 128 9999999
rte_memcpy(constant) used:422705828
rte_memcpy(variable) used:425493328
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 128 9999999
rte_memcpy(constant) used:422421840 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:413691412
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 128 9999999
rte_memcpy(constant) used:425233088 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:421136724
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 256 9999999
rte_memcpy(constant) used:901014608 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:900997388
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 256 9999999
rte_memcpy(constant) used:900803308 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:900794076
linux-mnSyvH:/mnt/sdb/linhf/test # ./rte_memcpy_test 256 9999999
rte_memcpy(constant) used:901842436 @@@@@@@@@@@@@@ not faster
rte_memcpy(variable) used:901218984
linux-mnSyvH:/mnt/sdb/linhf/test #

Here is my test code:

/* The header names are missing from the archived copy of this message;
 * the ones below are what the code needs to compile. */
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <rte_cycles.h>
#include <rte_memcpy.h>

int main(int narg, char** args)
{
    int i;
    char buf[1024];
    uint64_t start, end;

    if (narg < 3) {
        printf("usage:./rte_memcpy_test size times\n");
        return 0;
    }

    /* Plain variable holding the copy size. */
    size_t size_v = atoi(args[1]);
    /* const-qualified copy of the same run-time value. */
    const size_t size_c = atoi(args[1]);
    int times = atoi(args[2]);

    /* Time the loop that uses the const-qualified size. */
    start = rte_rdtsc();
    for (i = 0; i < times; i++) {
        rte_memcpy(buf, buf, size_c);
    }
    end = rte_rdtsc();
    printf("rte_memcpy(constant) used:%llu\n",
           (unsigned long long)(end - start));

    /* Time the loop that uses the plain variable size. */
    start = rte_rdtsc();
    for (i = 0; i < times; i++) {
        rte_memcpy(buf, buf, size_v);
    }
    end = rte_rdtsc();
    printf("rte_memcpy(variable) used:%llu\n",
           (unsigned long long)(end - start));

    return 0;
}

--
Regards,
Haifeng
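
One detail that may explain why the two loops time so closely: a macro of the kind Bruce describes would presumably choose its path by asking whether the compiler can prove the size at compile time, and a `const size_t` initialized from atoi() is still a run-time value. The sketch below assumes such a macro is built on the GCC/Clang builtin `__builtin_constant_p`. The names `copy_variable`, `copy_dispatch`, `takes_constant_path`, `literal_is_constant` and `parsed_is_constant` are hypothetical and only for illustration; they are not DPDK APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a hand-tuned variable-size copy; not DPDK code. */
static void *copy_variable(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(dst != NULL && src != NULL);
    while (n-- > 0)
        *d++ = *s++;
    return dst;
}

/* 1 if the compiler can prove n at compile time, else 0 (GCC/Clang builtin). */
#define takes_constant_path(n) (__builtin_constant_p(n) ? 1 : 0)

/* Assumed shape of the dispatch: plain memcpy only for provable constants. */
#define copy_dispatch(dst, src, n)                  \
    (__builtin_constant_p(n)                        \
         ? memcpy((dst), (src), (n))                \
         : copy_variable((dst), (src), (n)))

/* A literal size is a compile-time constant. */
static int literal_is_constant(void)
{
    return takes_constant_path(64);
}

/* Mirrors size_c in the test program: const-qualified, but set at run time. */
static int parsed_is_constant(const char *text)
{
    const size_t size_c = (size_t)atoi(text);

    return takes_constant_path(size_c);
}
```

When the string is only known at run time (for example args[1]), `parsed_is_constant` should report 0 while `literal_is_constant` reports 1. Under this assumption both loops of the test program would go through the same variable-size path, and a literal such as `rte_memcpy(dst, src, 64)` would be needed to exercise the constant path.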