DPDK patches and discussions
 help / color / mirror / Atom feed
From: Henning Schild <henning.schild@siemens.com>
To: Thomas Monjalon <thomas@monjalon.net>
Cc: "Felix Moessbauer" <felix.moessbauer@siemens.com>,
	dev@dpdk.org, ktraynor@redhat.com, ali.rizvi@emumba.com,
	"Muhammad Ahmad" <muhammad.ahamd@emumba.com>,
	david.hunt@intel.com, reshma.pattan@intel.com,
	konstantin.ananyev@intel.com, bruce.richardson@intel.com,
	ferruh.yigit@intel.com, jerinj@marvell.com,
	honnappa.nagarahalli@arm.com, olivier.matz@6wind.com,
	lijuan.tu@intel.com, "Owen Hilyard" <ohilyard@iol.unh.edu>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	viacheslavo@nvidia.com,
	"Stephen Hemminger" <stephen@networkplumber.org>,
	anatoly.burakov@intel.com
Subject: Re: [dpdk-dev] [PATCH v4 2/2] Add l2reflect measurement application
Date: Wed, 2 Sep 2020 09:56:05 +0200	[thread overview]
Message-ID: <20200902095605.4c8e9373@md1za8fc.ad001.siemens.net> (raw)
In-Reply-To: <7023104.8KzyT1xApx@thomas>

Am Tue, 01 Sep 2020 17:07:38 +0200
schrieb Thomas Monjalon <thomas@monjalon.net>:

> 28/08/2020 16:22, Henning Schild:
> > Thomas,
> > 
> > what is the state on this one? Any more steps to take or people to
> > involve?  
> 
> I can try adding some key people in Cc list,
> while doing a quick review.
>
>
> > I first showed that in action back in 2016 on FOSDEM.
> > https://archive.fosdem.org/2016/schedule/event/virt_iaas_real_time_cloud/
> > After my talk two devs from dpdk approached me because they liked
> > the idea of a "network cyclictest". It took us some time to finally
> > go upstream with it, unfortunately i do not recall names.  
> 
> I think I was one of them.
> 
> There will be some talks about latency in the next virtual DPDK event:
> https://events.linuxfoundation.org/dpdk-userspace-summit/program/schedule/
> (speakers are Cc'ed)

Thanks, that sure looks like we are not the only ones who are
interested in latency and probably a way to measure it. From other
doing it as well it would be nice to hear how they currently do that.

> > This application can potentially be integrated into the test-suite
> > for QA, making sure no changes heavily increase the package
> > transmission worst case timing.  
> 
> Good
> 
> 
> > Felix Moessbauer <felix.moessbauer@siemens.com> wrote:
> >   
> > > The l2reflect application implements a ping-pong benchmark to
> > > measure the latency between two instances. For communication,
> > > we use raw ethernet and send one packet at a time. The timing data
> > > is collected locally and min/max/avg values are displayed in a
> > > TUI. Finally, a histogram of the latencies is printed which can be
> > > further processed with the jitterdebugger visualization scripts.  
> 
> Please can you show an example of script usage?
> 
> > > To debug latency spikes, a max threshold can be defined.
> > > If it is hit, a trace point is created on both instances.
> > > 
> > > Signed-off-by: Felix Moessbauer <felix.moessbauer@siemens.com>
> > > Signed-off-by: Henning Schild <henning.schild@siemens.com>  
> [...]
> > > --- a/app/Makefile
> > > +++ b/app/Makefile  
> 
> No need to update Makefile, it will be removed.
> 
> > > --- /dev/null
> > > +++ b/app/l2reflect/l2reflect.h
> > > @@ -0,0 +1,62 @@
> > > +/*
> > > + * SPDX-License-Identifier: BSD-3-Clause  
> 
> SPDX should be the first line.
> 
> > > + * Copyright(c) 2020 Siemens AG
> > > + *
> > > + * authors:
> > > + *   Felix Moessbauer <felix.moessbauer@siemens.com>
> > > + *   Henning Schild <henning.schild@siemens.com>  
> 
> It is uncommon to mention authors in the file.
> In general, git metadata covers this info.
> Any special requirement?

That application has some history. Written by me and eventually
improved for upstream by Felix. The idea here was to give credit to two
people with just one author field in git. If that is really in the way,
we can drop it and make Felix the sole author.

> [...]
> > > +
> > > +/*
> > > + * Compares a packet destination MAC address to a device MAC
> > > address.
> > > + */
> > > +static __rte_always_inline int
> > > +ether_addr_cmp(struct rte_ether_addr *ea, struct rte_ether_addr
> > > *eb) +{
> > > +	return ((*(uint64_t *)ea ^ *(uint64_t *)eb) &
> > > MAC_ADDR_CMP) == 0; +}  
> 
> Please use rte_is_same_ether_addr()
> 
> > > --- /dev/null
> > > +++ b/app/l2reflect/main.c
> > > @@ -0,0 +1,872 @@
> > > +/*
> > > + * SPDX-License-Identifier: BSD-3-Clause
> > > + * Copyright(c) 2020 Siemens AG
> > > + *
> > > + * authors:
> > > + *   Felix Moessbauer <felix.moessbauer@siemens.com>
> > > + *   Henning Schild <henning.schild@siemens.com>
> > > + *
> > > + * launch (non-rt kernel): l2reflect --lcores 0@0,1@6 -n 1
> > > + * launch (rt kernel): l2reflect --lcores 0@0,1@6 -n 1 -- -P 50
> > > -r -l  
> 
> Would be better in a --help section or in the doc.
> 
> > > + *
> > > + * The l2reflect application implements a ping-pong benchmark to
> > > + * measure the latency between two instances. For communication,
> > > + * we use raw ethernet and send one packet at a time. The timing
> > > data
> > > + * is collected locally and min/max/avg values are displayed in a
> > > TUI.
> > > + * Finally, a histogram of the latencies is printed which can be
> > > + * further processed with the jitterdebugger visualization
> > > scripts.
> > > + * To debug latency spikes, a max threshold can be defined.
> > > + * If it is hit, a trace point is created on both instances.
> > > + */  
> [...]
> > > +__rte_format_printf(1, 0)
> > > +static void
> > > +trace_write(const char *fmt, ...)
> > > +{
> > > +	va_list ap;
> > > +	char buf[256];
> > > +	int n, err;
> > > +
> > > +	if (trace_fd == 0)
> > > +		trace_fd =
> > > open("/sys/kernel/debug/tracing/trace_marker",
> > > +				O_WRONLY);  
> 
> Why not using rte_trace framework?

I am not sure where rte_trace ends up eventually. But the idea here is
to have kernel traces when looking at latency peaks. You might just be
looking at an incorrect host tuning when you see a spike i.e. higher
prio threads on your possibly isolated core.
It is not tracing dpdk, but more how it fits in the overall picture.
 
> > > +	if (trace_fd < 0)
> > > +		return;
> > > +
> > > +	va_start(ap, fmt);
> > > +	n = vsnprintf(buf, 256, fmt, ap);
> > > +	va_end(ap);
> > > +
> > > +	err = write(trace_fd, buf, n);
> > > +	assert(err >= 1);
> > > +}  
> 
> [...]
> > > +static void
> > > +l2reflect_simple_forward(struct rte_mbuf *m)
> > > +{
> > > +	struct rte_ether_hdr *eth;
> > > +	struct my_magic_packet *pkt;
> > > +
> > > +	eth = rte_pktmbuf_mtod(m, struct rte_ether_hdr *);
> > > +	pkt = (struct my_magic_packet *)eth;
> > > +
> > > +	/* dst addr */
> > > +	rte_ether_addr_copy(&eth->s_addr, &eth->d_addr);
> > > +	/* src addr */
> > > +	rte_ether_addr_copy(&l2reflect_port_eth_addr,
> > > &eth->s_addr); +
> > > +	if ((eth->ether_type ==
> > > rte_cpu_to_be_16(ETHER_TYPE_L2REFLECT)) &&
> > > +	    (pkt->magic == MAGIC_TRACE_PAYLOAD)) {
> > > +		/* and the native one */
> > > +		trace_write("sending traced packet\n");
> > > +	}
> > > +
> > > +	l2reflect_send_packet(&m, l2reflect_port_number);
> > > +}  
> 
> If I understand well, this requires a special cabling?
> Would it be possible to measure latency of hardware NIC internal
> loopback or software PMD loopback?

Not sure. We always want to see the full picture ... i.e would also
want to see if a NIC is slow to put a packet on the "wire".
For testing we do use multi-port NICs with cable-loops. The other setup
is Qemus with the application inside and a dpdk-based vhost switch. But
that virtual setup is also not trivial. In fact we usually do make that
switch send the packets over the cable-loop to again cover the whole
way the packets go "in reality".

> [...]
> > > +	printf("from MAC address: %02X:%02X:%02X:%02X:%02X:%02X
> > > to"
> > > +			" %02X:%02X:%02X:%02X:%02X:%02X\n\n",
> > > +			eth->s_addr.addr_bytes[0],
> > > eth->s_addr.addr_bytes[1],
> > > +			eth->s_addr.addr_bytes[2],
> > > eth->s_addr.addr_bytes[3],
> > > +			eth->s_addr.addr_bytes[4],
> > > eth->s_addr.addr_bytes[5],
> > > +			eth->d_addr.addr_bytes[0],
> > > eth->d_addr.addr_bytes[1],
> > > +			eth->d_addr.addr_bytes[2],
> > > eth->d_addr.addr_bytes[3],
> > > +			eth->d_addr.addr_bytes[4],
> > > eth->d_addr.addr_bytes[5]);  
> 
> You could also use rte_ether_format_addr()
> 
> [...]
> > > +static inline unsigned int
> > > +l2reflect_rx_filter(
> > > +	struct rte_mbuf **bufs,
> > > +	unsigned int nb_rx,
> > > +	unsigned int data_only)
> > > +{
> > > +	struct rte_ether_hdr *eth;
> > > +	struct my_magic_packet *pkt;
> > > +	unsigned int i, ret;
> > > +
> > > +	ret = 0;
> > > +	for (i = 0; i < nb_rx; i++) {
> > > +		eth = rte_pktmbuf_mtod(bufs[i], struct
> > > rte_ether_hdr *);
> > > +		if (l2reflect_state != S_ELECT_LEADER &&
> > > +		    !ether_addr_cmp(&eth->s_addr,
> > > &l2reflect_remote_eth_addr))
> > > +			goto drop;
> > > +
> > > +		if (eth->ether_type !=
> > > rte_cpu_to_be_16(ETHER_TYPE_L2REFLECT))
> > > +			goto drop;
> > > +
> > > +		pkt = (struct my_magic_packet *)eth;
> > > +		if (data_only && (pkt->type != TRACE_TYPE_DATA &&
> > > +				  pkt->type != TRACE_TYPE_RSET &&
> > > +				  pkt->type != TRACE_TYPE_QUIT))
> > > +			goto drop;
> > > +
> > > +		bufs[ret++] = bufs[i];
> > > +		continue;
> > > +drop:
> > > +		rte_pktmbuf_free(bufs[i]);
> > > +	}
> > > +
> > > +	return ret;
> > > +}  
> 
> This function deserves some comments to explain the logic.
> 
> [...]
> > > --- /dev/null
> > > +++ b/app/l2reflect/meson.build
> > > @@ -0,0 +1,25 @@
> > > +# SPDX-License-Identifier: BSD-3-Clause
> > > +# Copyright(c) 2020 Siemens AG
> > > +
> > > +# meson file, for building this example as part of a main DPDK
> > > build. +cc = meson.get_compiler('c')  
> 
> Not sure you need that line.
> 
> > > +
> > > +cjson = dependency('libcjson', required: false)  
> 
> Some other parts require jansson:
> 	jansson = dependency('jansson', required: false)
> How libcjson is different? Would it be possible to unify dependency?
> 
> > > +if not is_linux
> > > +        build = false  
> 
> Why limiting to Linux?

Probably because of the RT stuff (sched param setting, mlock) and the
tracing. I would propose it like that and hope that others might ifdef
when enabling other OSs.

Maintaining low latency requires quite a bit of host tuning. This can
get especially tricky when consuming the whole CPU with polling, you
can starve your OS. The application alone does not do that for you, not
even on Linux. Like cyclictest or jitterdebugger i would see it as a
tool that requires knowledge to measure correctly. And it should
probably be added as such, with an integration into automated tests and
support for other OSs possibly coming in the future.

I hope that others doing dpdk for low latency on preempt-rt will find
it useful as is. They might have been looking for something like it,
possibly have something like it in their test-labs and did not publish
it yet.

Henning

> [...]
> > > +#define ANSI_BOLD_RED "\x1b[01;31m"
> > > +#define ANSI_BOLD_GREEN "\x1b[01;32m"
> > > +#define ANSI_BOLD_YELLOW "\x1b[01;33m"
> > > +#define ANSI_BOLD_BLUE "\x1b[01;34m"
> > > +#define ANSI_BOLD_MAGENTA "\x1b[01;35m"
> > > +#define ANSI_BOLD_CYAN "\x1b[01;36m"
> > > +#define ANSI_RESET "\x1b[0m"  
> 
> Not sure about using colors.
> If it really adds a value, I think it should be an option,
> automatically disabled if stdout is redirected.
> 
> [...]
> > > --- a/app/meson.build
> > > +++ b/app/meson.build
> > > @@ -8,6 +8,7 @@ endif
> > >  apps = [
> > >  	'pdump',
> > >  	'proc-info',
> > > +	'l2reflect',
> > >  	'test-acl',
> > >  	'test-bbdev',
> > >  	'test-cmdline',  
> 
> I think you should keep alphabetical ordering.
> 
> I'm not sure about adding this application in DPDK.
> I think we need some latency tools,
> but if it requires a specific setup, it could better fit in DTS.
> We need more opinions here.
> 
> 


  reply	other threads:[~2020-09-02 14:23 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-07-03 15:18 [dpdk-dev] [RFC PATCH 0/1] " Felix Moessbauer
2020-07-03 15:18 ` [dpdk-dev] [RFC PATCH 1/1] " Felix Moessbauer
2020-07-07 14:11   ` [dpdk-dev] [PATCH v2 0/2] " Felix Moessbauer
2020-07-07 14:11   ` [dpdk-dev] [PATCH v2 1/2] Fix build of apps with external dependencies Felix Moessbauer
2020-07-07 14:11   ` [dpdk-dev] [PATCH v2 2/2] Add l2reflect measurement application Felix Moessbauer
2020-07-09  9:27   ` [dpdk-dev] [PATCH v3 0/2] " Felix Moessbauer
2020-07-09  9:27   ` [dpdk-dev] [PATCH v3 1/2] Fix build of apps with external dependencies Felix Moessbauer
2020-07-09  9:27   ` [dpdk-dev] [PATCH v3 2/2] Add l2reflect measurement application Felix Moessbauer
2020-07-15 15:54     ` [dpdk-dev] [PATCH v4 0/2] " Felix Moessbauer
2020-07-15 15:54     ` [dpdk-dev] [PATCH v4 1/2] Fix build of apps with external dependencies Felix Moessbauer
2020-07-15 15:54     ` [dpdk-dev] [PATCH v4 2/2] Add l2reflect measurement application Felix Moessbauer
2020-08-28 14:22       ` Henning Schild
2020-09-01 15:07         ` Thomas Monjalon
2020-09-02  7:56           ` Henning Schild [this message]
2020-09-02  8:30             ` Thomas Monjalon
2020-09-02  9:10               ` Henning Schild
2020-09-15  8:38       ` [dpdk-dev] [PATCH v5 0/2] " Felix Moessbauer
2020-09-15  8:38       ` [dpdk-dev] [PATCH 1/2] Fix build of apps with external dependencies Felix Moessbauer
2020-10-20 14:12         ` Thomas Monjalon
2020-10-20 14:32           ` Bruce Richardson
2020-09-15  8:38       ` [dpdk-dev] [PATCH 2/2] Add l2reflect measurement application Felix Moessbauer
2020-07-03 15:51 ` [dpdk-dev] [RFC PATCH 0/1] " Thomas Monjalon

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200902095605.4c8e9373@md1za8fc.ad001.siemens.net \
    --to=henning.schild@siemens.com \
    --cc=ali.rizvi@emumba.com \
    --cc=anatoly.burakov@intel.com \
    --cc=bruce.richardson@intel.com \
    --cc=david.hunt@intel.com \
    --cc=dev@dpdk.org \
    --cc=felix.moessbauer@siemens.com \
    --cc=ferruh.yigit@intel.com \
    --cc=honnappa.nagarahalli@arm.com \
    --cc=jerinj@marvell.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=ktraynor@redhat.com \
    --cc=lijuan.tu@intel.com \
    --cc=mb@smartsharesystems.com \
    --cc=muhammad.ahamd@emumba.com \
    --cc=ohilyard@iol.unh.edu \
    --cc=olivier.matz@6wind.com \
    --cc=reshma.pattan@intel.com \
    --cc=stephen@networkplumber.org \
    --cc=thomas@monjalon.net \
    --cc=viacheslavo@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).