From: Olivier Matz <olivier.matz@6wind.com>
To: "Mattias Rönnblom" <mattias.ronnblom@ericsson.com>
Cc: "Emil Berg" <emil.berg@ericsson.com>,
bruce.richardson@intel.com, stephen@networkplumber.org,
stable@dpdk.org, bugzilla@dpdk.org, dev@dpdk.org,
onar.olsen@ericsson.com,
"Morten Brørup" <mb@smartsharesystems.com>
Subject: Re: [PATCH v2 1/2] app/test: add cksum performance test
Date: Mon, 11 Jul 2022 11:47:17 +0200
Message-ID: <YsvxpfTqZZJBI2FD@platinum>
In-Reply-To: <20220708125608.24532-1-mattias.ronnblom@ericsson.com>
Hi Mattias,
Please see a few comments below.
On Fri, Jul 08, 2022 at 02:56:07PM +0200, Mattias Rönnblom wrote:
> Add performance test for the rte_raw_cksum() function, which delegates
> the actual work to __rte_raw_cksum(), which in turn is used by other
> functions in need of Internet checksum calculation.
>
> Signed-off-by: Mattias Rönnblom <mattias.ronnblom@ericsson.com>
>
> ---
>
> v2:
> * Added __rte_unused to unused volatile variable, to keep the Intel
> compiler happy.
> ---
> MAINTAINERS | 1 +
> app/test/meson.build | 1 +
> app/test/test_cksum_perf.c | 118 +++++++++++++++++++++++++++++++++++++
> 3 files changed, 120 insertions(+)
> create mode 100644 app/test/test_cksum_perf.c
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c923712946..2a4c99e05a 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -1414,6 +1414,7 @@ Network headers
> M: Olivier Matz <olivier.matz@6wind.com>
> F: lib/net/
> F: app/test/test_cksum.c
> +F: app/test/test_cksum_perf.c
>
> Packet CRC
> M: Jasvinder Singh <jasvinder.singh@intel.com>
> diff --git a/app/test/meson.build b/app/test/meson.build
> index 431c5bd318..191db03d1d 100644
> --- a/app/test/meson.build
> +++ b/app/test/meson.build
> @@ -18,6 +18,7 @@ test_sources = files(
> 'test_bpf.c',
> 'test_byteorder.c',
> 'test_cksum.c',
> + 'test_cksum_perf.c',
> 'test_cmdline.c',
> 'test_cmdline_cirbuf.c',
> 'test_cmdline_etheraddr.c',
> diff --git a/app/test/test_cksum_perf.c b/app/test/test_cksum_perf.c
> new file mode 100644
> index 0000000000..bff73cb3bb
> --- /dev/null
> +++ b/app/test/test_cksum_perf.c
> @@ -0,0 +1,118 @@
> +/* SPDX-License-Identifier: BSD-3-Clause
> + * Copyright(c) 2022 Ericsson AB
> + */
> +
> +#include <stdio.h>
> +
> +#include <rte_common.h>
> +#include <rte_cycles.h>
> +#include <rte_ip.h>
> +#include <rte_malloc.h>
> +#include <rte_random.h>
> +
> +#include "test.h"
> +
> +#define NUM_BLOCKS (10)
> +#define ITERATIONS (1000000)
The parentheses can be safely removed.
> +
> +static const size_t data_sizes[] = { 20, 21, 100, 101, 1500, 1501 };
> +
> +static __rte_noinline uint16_t
> +do_rte_raw_cksum(const void *buf, size_t len)
> +{
> +	return rte_raw_cksum(buf, len);
> +}
I don't understand the need to have this wrapper, especially marked
__rte_noinline. What is the objective?
Note that when I remove the __rte_noinline, the performance is better
for size 20 and 21.
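A minimal standalone sketch of the two call styles being compared, for reference. This is not DPDK code: raw_cksum() and cksum_noinline() are hypothetical stand-ins, and __attribute__((noinline)) assumes GCC or Clang (which I believe is what __rte_noinline expands to with those compilers). The wrapper forces one real function call per invocation, which would plausibly matter most for the smallest buffers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical stand-in for rte_raw_cksum(): 16-bit one's complement
 * sum, not complemented. Words are loaded with memcpy(), so the buffer
 * may be unaligned. The 32-bit accumulator is only safe for short
 * buffers, hence the length check.
 */
static inline uint16_t
raw_cksum(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t sum = 0;

	assert(len <= 65536);

	while (len >= 2) {
		uint16_t w;

		memcpy(&w, p, sizeof(w));
		sum += w;
		p += 2;
		len -= 2;
	}
	if (len == 1) {
		/* Trailing odd byte, padded with a zero byte. */
		uint16_t w = 0;

		memcpy(&w, p, 1);
		sum += w;
	}

	/* Fold the carries back into the low 16 bits (twice is enough). */
	sum = (sum & 0xffff) + (sum >> 16);
	sum = (sum & 0xffff) + (sum >> 16);

	return (uint16_t)sum;
}

/* Same result, but the compiler may not inline this call. */
static __attribute__((noinline)) uint16_t
cksum_noinline(const void *buf, size_t len)
{
	return raw_cksum(buf, len);
}
```

Both functions return the same value. The only difference is whether the compiler is allowed to merge the checksum code into the measurement loop.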
> +
> +static void
> +init_block(void *buf, size_t len)
Can buf be a (char *) instead?
It would avoid a cast below.
> +{
> +	size_t i;
> +
> +	for (i = 0; i < len; i++)
> +		((char *)buf)[i] = (uint8_t)rte_rand();
> +}
> +
> +static int
> +test_cksum_perf_size_alignment(size_t block_size, bool aligned)
> +{
> +	char *data[NUM_BLOCKS];
> +	char *blocks[NUM_BLOCKS];
> +	unsigned int i;
> +	uint64_t start;
> +	uint64_t end;
> +	/* Floating point to handle low (pseudo-)TSC frequencies */
> +	double block_latency;
> +	double byte_latency;
> +	volatile __rte_unused uint64_t sum = 0;
> +
> +	for (i = 0; i < NUM_BLOCKS; i++) {
> +		data[i] = rte_malloc(NULL, block_size + 1, 0);
> +
> +		if (data[i] == NULL) {
> +			printf("Failed to allocate memory for block\n");
> +			return TEST_FAILED;
> +		}
> +
> +		init_block(data[i], block_size + 1);
> +
> +		blocks[i] = aligned ? data[i] : data[i] + 1;
> +	}
> +
> +	start = rte_rdtsc();
> +
> +	for (i = 0; i < ITERATIONS; i++) {
> +		unsigned int j;
> +		for (j = 0; j < NUM_BLOCKS; j++)
> +			sum += do_rte_raw_cksum(blocks[j], block_size);
> +	}
> +
> +	end = rte_rdtsc();
> +
> +	block_latency = (end - start) / (double)(ITERATIONS * NUM_BLOCKS);
> +	byte_latency = block_latency / block_size;
> +
> +	printf("%-9s %10zd %19.1f %16.2f\n", aligned ? "Aligned" : "Unaligned",
> +	       block_size, block_latency, byte_latency);
When I run the test on my dev machine, I get the following results,
which are quite reproducible:
Aligned     20    10.4    0.52  (range is 0.48 - 0.52)
Unaligned   20     7.9    0.39  (range is 0.39 - 0.40)
...
If I increase the number of iterations, the first results
change significantly:
Aligned     20     8.2    0.42  (range is 0.41 - 0.42)
Unaligned   20     8.0    0.40  (always this value)
To have more precise tests with small size, would it make sense to
target a test time instead of an iteration count? Something like
this:
	#define ITERATIONS 1000000
	uint64_t iterations = 0;
	...
	do {
		for (i = 0; i < ITERATIONS; i++) {
			unsigned int j;
			for (j = 0; j < NUM_BLOCKS; j++)
				sum += do_rte_raw_cksum(blocks[j], block_size);
		}
		iterations += ITERATIONS;
		end = rte_rdtsc();
	} while ((end - start) < rte_get_tsc_hz());
	block_latency = (end - start) / (double)(iterations * NUM_BLOCKS);
After this change, the aligned and unaligned cases have the same
performance on my machine.
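The same idea as a standalone sketch that builds without DPDK, in case it helps when experimenting. Everything in it is an assumption made for illustration: clock() stands in for rte_rdtsc()/rte_get_tsc_hz(), byte_sum() is a hypothetical workload rather than the real checksum, and BATCH is an arbitrary value. The loop runs fixed-size batches until a minimum amount of CPU time has elapsed, then divides by the number of calls actually made.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/* Calls made between two clock reads (arbitrary, hypothetical value). */
#define BATCH 100000

/* Hypothetical workload standing in for the checksum call. */
static uint32_t
byte_sum(const unsigned char *buf, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	for (i = 0; i < len; i++)
		sum += buf[i];

	return sum;
}

/*
 * Run batches until at least min_seconds of CPU time has elapsed,
 * then return the average time per call in seconds.
 */
static double
seconds_per_call(const unsigned char *buf, size_t len, double min_seconds)
{
	/* Volatile sink so the accumulated result is not discarded. */
	volatile uint32_t sink = 0;
	uint64_t calls = 0;
	clock_t start = clock();
	double elapsed;

	assert(min_seconds > 0.0);

	do {
		unsigned int i;

		for (i = 0; i < BATCH; i++)
			sink += byte_sum(buf, len);
		calls += BATCH;
		elapsed = (double)(clock() - start) / CLOCKS_PER_SEC;
	} while (elapsed < min_seconds);

	(void)sink;

	return elapsed / (double)calls;
}
```

Something like seconds_per_call(buf, 20, 1.0) would mirror the one-second target above. Note that the clock is read once per batch, not once per call, so its cost stays negligible in the result.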
> +
> +	for (i = 0; i < NUM_BLOCKS; i++)
> +		rte_free(data[i]);
> +
> +	return TEST_SUCCESS;
> +}
> +
> +static int
> +test_cksum_perf_size(size_t block_size)
> +{
> +	int rc;
> +
> +	rc = test_cksum_perf_size_alignment(block_size, true);
> +	if (rc != TEST_SUCCESS)
> +		return rc;
> +
> +	rc = test_cksum_perf_size_alignment(block_size, false);
> +
> +	return rc;
> +}
> +
> +static int
> +test_cksum_perf(void)
> +{
> +	uint16_t i;
> +
> +	printf("### rte_raw_cksum() performance ###\n");
> +	printf("Alignment Block size TSC cycles/block TSC cycles/byte\n");
> +
> +	for (i = 0; i < RTE_DIM(data_sizes); i++) {
> +		int rc;
> +
> +		rc = test_cksum_perf_size(data_sizes[i]);
> +		if (rc != TEST_SUCCESS)
> +			return rc;
> +	}
> +
> +	return TEST_SUCCESS;
> +}
> +
> +
> +REGISTER_TEST_COMMAND(cksum_perf_autotest, test_cksum_perf);
> +
The last empty line can be removed.
> --
> 2.25.1
>