DPDK patches and discussions
 help / color / mirror / Atom feed
* [dpdk-dev] [PATCH v3] timer: use rte_mp_msg to get freq from primary process
@ 2019-08-22  8:42 Jim Harris
  2019-08-26 13:39 ` [dpdk-dev] [PATCH v4] eal: use memzone to share tsc hz with secondary processes Jim Harris
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Jim Harris @ 2019-08-22  8:42 UTC (permalink / raw)
  To: dev, anatoly.burakov

Ideally, get_tsc_freq_arch() is able to provide the
TSC rate using architecture-specific means.  When that
is not possible, DPDK reverts to calculating the
TSC rate with a 100ms nanosleep or 1s sleep.  The latter
occurs more frequently in VMs which often do not have
access to the data they need from arch-specific means
(CPUID leaf 0x15 or MSR 0xCE on x86).

In secondary processes, the extra 100ms is especially
noticeable and consumes the bulk of rte_eal_init()
execution time.  So in secondary processes, if
we cannot get the TSC rate using get_tsc_freq_arch(),
try to get the TSC rate from the primary process
instead using rte_mp_msg.  This is much faster than
100ms.

Reduces rte_eal_init() execution time in a secondary
process from 165ms to 66ms on my test system.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
---
 lib/librte_eal/common/eal_common_timer.c |   62 ++++++++++++++++++++++++++++++
 1 file changed, 62 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_timer.c b/lib/librte_eal/common/eal_common_timer.c
index 145543de7..ad965455d 100644
--- a/lib/librte_eal/common/eal_common_timer.c
+++ b/lib/librte_eal/common/eal_common_timer.c
@@ -15,9 +15,17 @@
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_pause.h>
+#include <rte_eal.h>
+#include <rte_errno.h>
 
 #include "eal_private.h"
 
+#define EAL_TIMER_MP "eal_timer_mp_sync"
+
+struct timer_mp_param {
+	uint64_t tsc_hz;
+};
+
 /* The frequency of the RDTSC timer resolution */
 static uint64_t eal_tsc_resolution_hz;
 
@@ -74,12 +82,58 @@ estimate_tsc_freq(void)
 	return RTE_ALIGN_MUL_NEAR(rte_rdtsc() - start, CYC_PER_10MHZ);
 }
 
+static uint64_t
+get_tsc_freq_from_primary(void)
+{
+	struct rte_mp_msg mp_req = {0};
+	struct rte_mp_reply mp_reply = {0};
+	struct timer_mp_param *r;
+	struct timespec ts = {.tv_sec = 1, .tv_nsec = 0};
+	uint64_t tsc_hz;
+
+	strcpy(mp_req.name, EAL_TIMER_MP);
+	if (rte_mp_request_sync(&mp_req, &mp_reply, &ts) ||
+	    mp_reply.nb_received != 1) {
+		tsc_hz = 0;
+	} else {
+		r = (struct timer_mp_param *)mp_reply.msgs[0].param;
+		tsc_hz = r->tsc_hz;
+	}
+
+	free(mp_reply.msgs);
+	return tsc_hz;
+}
+
+static int
+timer_mp_primary(__attribute__((unused)) const struct rte_mp_msg *msg,
+		 const void *peer)
+{
+	struct rte_mp_msg reply = {0};
+	struct timer_mp_param *r = (struct timer_mp_param *)reply.param;
+
+	r->tsc_hz = eal_tsc_resolution_hz;
+	strcpy(reply.name, EAL_TIMER_MP);
+	reply.len_param = sizeof(*r);
+
+	return rte_mp_reply(&reply, peer);
+}
+
 void
 set_tsc_freq(void)
 {
 	uint64_t freq;
+	int rc;
 
 	freq = get_tsc_freq_arch();
+	if (!freq && rte_eal_process_type() != RTE_PROC_PRIMARY) {
+		/* We couldn't get the TSC frequency through arch-specific
+		 *  means.  If this is a secondary process, try to get the
+		 *  TSC frequency from the primary process - this will
+		 *  be much faster than get_tsc_freq() or estimate_tsc_freq()
+		 *  below.
+		 */
+		freq = get_tsc_freq_from_primary();
+	}
 	if (!freq)
 		freq = get_tsc_freq();
 	if (!freq)
@@ -87,6 +141,14 @@ set_tsc_freq(void)
 
 	RTE_LOG(DEBUG, EAL, "TSC frequency is ~%" PRIu64 " KHz\n", freq / 1000);
 	eal_tsc_resolution_hz = freq;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		rc = rte_mp_action_register(EAL_TIMER_MP, timer_mp_primary);
+		if (rc && rte_errno != ENOTSUP) {
+			RTE_LOG(WARNING, EAL, "Could not register mp_action - "
+				"secondary processes will calculate TSC rate "
+				"independently.\n");
+		}
+	}
 }
 
 void rte_delay_us_callback_register(void (*userfunc)(unsigned int))


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-dev] [PATCH v4] eal: use memzone to share tsc hz with secondary processes
  2019-08-22  8:42 [dpdk-dev] [PATCH v3] timer: use rte_mp_msg to get freq from primary process Jim Harris
@ 2019-08-26 13:39 ` Jim Harris
  2019-08-26 13:44 ` [dpdk-dev] [PATCH v5] " Jim Harris
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Jim Harris @ 2019-08-26 13:39 UTC (permalink / raw)
  To: dev, bruce.richardson, anatoly.burakov

Ideally, get_tsc_freq_arch() is able to provide the
TSC rate using arch-specific means.  When that is not
possible, DPDK reverts to calculating the TSC rate with
a 100ms nanosleep or 1s sleep.  The latter occurs more
frequently in VMs which often do not have access to the
data they need from arch-specific means (CPUID leaf 0x15
or MSR 0xCE on x86).

In secondary processes, the extra 100ms is especially
noticeable and consumes the bulk of rte_eal_init()
execution time.  To resolve this extra delay, have
the primary process put the TSC rate into a shared
memory region that the secondary process can lookup.

Reduces rte_eal_init() execution time in a secondary
process from 165ms to 66ms on my test system.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
---
 lib/librte_eal/common/eal_common_timer.c |   24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_timer.c b/lib/librte_eal/common/eal_common_timer.c
index 145543de7..b2c813444 100644
--- a/lib/librte_eal/common/eal_common_timer.c
+++ b/lib/librte_eal/common/eal_common_timer.c
@@ -15,9 +15,12 @@
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_pause.h>
+#include <rte_memzone.h>
 
 #include "eal_private.h"
 
+static const char *MZ_RTE_TSC_FREQ = "rte_tsc_freq";
+
 /* The frequency of the RDTSC timer resolution */
 static uint64_t eal_tsc_resolution_hz;
 
@@ -77,9 +80,17 @@ estimate_tsc_freq(void)
 void
 set_tsc_freq(void)
 {
-	uint64_t freq;
-
-	freq = get_tsc_freq_arch();
+	const struct rte_memzone *mz;
+	uint64_t freq = 0;
+
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		mz = rte_memzone_lookup(MZ_RTE_TSC_FREQ);
+		if (mz != NULL) {
+			freq = *(uint64_t *)mz->addr;
+		}
+	}
+	if (!freq)
+		freq = get_tsc_freq_arch();
 	if (!freq)
 		freq = get_tsc_freq();
 	if (!freq)
@@ -87,6 +98,13 @@ set_tsc_freq(void)
 
 	RTE_LOG(DEBUG, EAL, "TSC frequency is ~%" PRIu64 " KHz\n", freq / 1000);
 	eal_tsc_resolution_hz = freq;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		mz = rte_memzone_reserve(MZ_RTE_TSC_FREQ, sizeof(uint64_t),
+					 SOCKET_ID_ANY, 0);
+		if (mz != NULL) {
+			*(uint64_t *)mz->addr = freq;
+		}
+	}
 }
 
 void rte_delay_us_callback_register(void (*userfunc)(unsigned int))


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-dev] [PATCH v5] eal: use memzone to share tsc hz with secondary processes
  2019-08-22  8:42 [dpdk-dev] [PATCH v3] timer: use rte_mp_msg to get freq from primary process Jim Harris
  2019-08-26 13:39 ` [dpdk-dev] [PATCH v4] eal: use memzone to share tsc hz with secondary processes Jim Harris
@ 2019-08-26 13:44 ` Jim Harris
  2019-08-27  8:14   ` Bruce Richardson
  2019-08-27 12:04   ` Burakov, Anatoly
  2019-08-27 16:16 ` [dpdk-dev] [PATCH v6] eal: add tsc_hz to rte_mem_config Jim Harris
  2019-10-07 15:28 ` [dpdk-dev] [PATCH v6 RESEND] " Jim Harris
  3 siblings, 2 replies; 12+ messages in thread
From: Jim Harris @ 2019-08-26 13:44 UTC (permalink / raw)
  To: dev, bruce.richardson, anatoly.burakov

Ideally, get_tsc_freq_arch() is able to provide the
TSC rate using arch-specific means.  When that is not
possible, DPDK reverts to calculating the TSC rate with
a 100ms nanosleep or 1s sleep.  The latter occurs more
frequently in VMs which often do not have access to the
data they need from arch-specific means (CPUID leaf 0x15
or MSR 0xCE on x86).

In secondary processes, the extra 100ms is especially
noticeable and consumes the bulk of rte_eal_init()
execution time.  To resolve this extra delay, have
the primary process put the TSC rate into a shared
memory region that the secondary process can lookup.

Reduces rte_eal_init() execution time in a secondary
process from 165ms to 66ms on my test system.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
---
 lib/librte_eal/common/eal_common_timer.c |   20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/lib/librte_eal/common/eal_common_timer.c b/lib/librte_eal/common/eal_common_timer.c
index 145543de7..2aeab6462 100644
--- a/lib/librte_eal/common/eal_common_timer.c
+++ b/lib/librte_eal/common/eal_common_timer.c
@@ -15,9 +15,12 @@
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_pause.h>
+#include <rte_memzone.h>
 
 #include "eal_private.h"
 
+static const char *MZ_RTE_TSC_FREQ = "rte_tsc_freq";
+
 /* The frequency of the RDTSC timer resolution */
 static uint64_t eal_tsc_resolution_hz;
 
@@ -77,9 +80,16 @@ estimate_tsc_freq(void)
 void
 set_tsc_freq(void)
 {
-	uint64_t freq;
+	const struct rte_memzone *mz;
+	uint64_t freq = 0;
 
-	freq = get_tsc_freq_arch();
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		mz = rte_memzone_lookup(MZ_RTE_TSC_FREQ);
+		if (mz != NULL)
+			freq = *(uint64_t *)mz->addr;
+	}
+	if (!freq)
+		freq = get_tsc_freq_arch();
 	if (!freq)
 		freq = get_tsc_freq();
 	if (!freq)
@@ -87,6 +97,12 @@ set_tsc_freq(void)
 
 	RTE_LOG(DEBUG, EAL, "TSC frequency is ~%" PRIu64 " KHz\n", freq / 1000);
 	eal_tsc_resolution_hz = freq;
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		mz = rte_memzone_reserve(MZ_RTE_TSC_FREQ, sizeof(uint64_t),
+					 SOCKET_ID_ANY, 0);
+		if (mz != NULL)
+			*(uint64_t *)mz->addr = freq;
+	}
 }
 
 void rte_delay_us_callback_register(void (*userfunc)(unsigned int))


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v5] eal: use memzone to share tsc hz with secondary processes
  2019-08-26 13:44 ` [dpdk-dev] [PATCH v5] " Jim Harris
@ 2019-08-27  8:14   ` Bruce Richardson
  2019-08-27 12:04   ` Burakov, Anatoly
  1 sibling, 0 replies; 12+ messages in thread
From: Bruce Richardson @ 2019-08-27  8:14 UTC (permalink / raw)
  To: Jim Harris; +Cc: dev, anatoly.burakov

On Mon, Aug 26, 2019 at 06:44:46AM -0700, Jim Harris wrote:
> Ideally, get_tsc_freq_arch() is able to provide the
> TSC rate using arch-specific means.  When that is not
> possible, DPDK reverts to calculating the TSC rate with
> a 100ms nanosleep or 1s sleep.  The latter occurs more
> frequently in VMs which often do not have access to the
> data they need from arch-specific means (CPUID leaf 0x15
> or MSR 0xCE on x86).
> 
> In secondary processes, the extra 100ms is especially
> noticeable and consumes the bulk of rte_eal_init()
> execution time.  To resolve this extra delay, have
> the primary process put the TSC rate into a shared
> memory region that the secondary process can lookup.
> 
> Reduces rte_eal_init() execution time in a secondary
> process from 165ms to 66ms on my test system.
> 
> Signed-off-by: Jim Harris <james.r.harris@intel.com>
> ---

I think using shared memory is a lot simpler to manage, so LGTM.

Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v5] eal: use memzone to share tsc hz with secondary processes
  2019-08-26 13:44 ` [dpdk-dev] [PATCH v5] " Jim Harris
  2019-08-27  8:14   ` Bruce Richardson
@ 2019-08-27 12:04   ` Burakov, Anatoly
  2019-08-27 12:48     ` Bruce Richardson
  2019-08-27 14:00     ` Bruce Richardson
  1 sibling, 2 replies; 12+ messages in thread
From: Burakov, Anatoly @ 2019-08-27 12:04 UTC (permalink / raw)
  To: Jim Harris, dev, bruce.richardson

On 26-Aug-19 2:44 PM, Jim Harris wrote:
> Ideally, get_tsc_freq_arch() is able to provide the
> TSC rate using arch-specific means.  When that is not
> possible, DPDK reverts to calculating the TSC rate with
> a 100ms nanosleep or 1s sleep.  The latter occurs more
> frequently in VMs which often do not have access to the
> data they need from arch-specific means (CPUID leaf 0x15
> or MSR 0xCE on x86).
> 
> In secondary processes, the extra 100ms is especially
> noticeable and consumes the bulk of rte_eal_init()
> execution time.  To resolve this extra delay, have
> the primary process put the TSC rate into a shared
> memory region that the secondary process can lookup.
> 
> Reduces rte_eal_init() execution time in a secondary
> process from 165ms to 66ms on my test system.
> 
> Signed-off-by: Jim Harris <james.r.harris@intel.com>
> ---

I think this is a bad idea. If you're allocating something, you're 
supposed to free it in rte_eal_cleanup(). If you don't, that memory 
leaks (i.e. there are leftover hugepages after process exit). Since both 
primary and secondary are referencing it (even if only at init), there 
is no safe way to free this memzone.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v5] eal: use memzone to share tsc hz with secondary processes
  2019-08-27 12:04   ` Burakov, Anatoly
@ 2019-08-27 12:48     ` Bruce Richardson
  2019-08-28  9:01       ` Burakov, Anatoly
  2019-08-27 14:00     ` Bruce Richardson
  1 sibling, 1 reply; 12+ messages in thread
From: Bruce Richardson @ 2019-08-27 12:48 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: Jim Harris, dev

On Tue, Aug 27, 2019 at 01:04:18PM +0100, Burakov, Anatoly wrote:
> On 26-Aug-19 2:44 PM, Jim Harris wrote:
> > Ideally, get_tsc_freq_arch() is able to provide the
> > TSC rate using arch-specific means.  When that is not
> > possible, DPDK reverts to calculating the TSC rate with
> > a 100ms nanosleep or 1s sleep.  The latter occurs more
> > frequently in VMs which often do not have access to the
> > data they need from arch-specific means (CPUID leaf 0x15
> > or MSR 0xCE on x86).
> > 
> > In secondary processes, the extra 100ms is especially
> > noticeable and consumes the bulk of rte_eal_init()
> > execution time.  To resolve this extra delay, have
> > the primary process put the TSC rate into a shared
> > memory region that the secondary process can lookup.
> > 
> > Reduces rte_eal_init() execution time in a secondary
> > process from 165ms to 66ms on my test system.
> > 
> > Signed-off-by: Jim Harris <james.r.harris@intel.com>
> > ---
> 
> I think this is a bad idea. If you're allocating something, you're supposed
> to free it in rte_eal_cleanup(). If you don't, that memory leaks (i.e. there
> are leftover hugepages after process exit). Since both primary and secondary
> are referencing it (even if only at init), there is no safe way to free this
> memzone.
> 
What is the issue with not freeing it? How do we handle this for other
global data to be shared?

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v5] eal: use memzone to share tsc hz with secondary processes
  2019-08-27 12:04   ` Burakov, Anatoly
  2019-08-27 12:48     ` Bruce Richardson
@ 2019-08-27 14:00     ` Bruce Richardson
  1 sibling, 0 replies; 12+ messages in thread
From: Bruce Richardson @ 2019-08-27 14:00 UTC (permalink / raw)
  To: Burakov, Anatoly; +Cc: Jim Harris, dev

On Tue, Aug 27, 2019 at 01:04:18PM +0100, Burakov, Anatoly wrote:
> On 26-Aug-19 2:44 PM, Jim Harris wrote:
> > Ideally, get_tsc_freq_arch() is able to provide the
> > TSC rate using arch-specific means.  When that is not
> > possible, DPDK reverts to calculating the TSC rate with
> > a 100ms nanosleep or 1s sleep.  The latter occurs more
> > frequently in VMs which often do not have access to the
> > data they need from arch-specific means (CPUID leaf 0x15
> > or MSR 0xCE on x86).
> > 
> > In secondary processes, the extra 100ms is especially
> > noticeable and consumes the bulk of rte_eal_init()
> > execution time.  To resolve this extra delay, have
> > the primary process put the TSC rate into a shared
> > memory region that the secondary process can lookup.
> > 
> > Reduces rte_eal_init() execution time in a secondary
> > process from 165ms to 66ms on my test system.
> > 
> > Signed-off-by: Jim Harris <james.r.harris@intel.com>
> > ---
> 
> I think this is a bad idea. If you're allocating something, you're supposed
> to free it in rte_eal_cleanup(). If you don't, that memory leaks (i.e. there
> are leftover hugepages after process exit). Since both primary and secondary
> are referencing it (even if only at init), there is no safe way to free this
> memzone.
> 
Actually, after looking at the code again, the secondary processes only
reference the memzone once on init, and fallback to the current path if the
memzone does not exist. Therefore, it should be safe to remove the memzone
on termination of the primary process.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-dev] [PATCH v6] eal: add tsc_hz to rte_mem_config
  2019-08-22  8:42 [dpdk-dev] [PATCH v3] timer: use rte_mp_msg to get freq from primary process Jim Harris
  2019-08-26 13:39 ` [dpdk-dev] [PATCH v4] eal: use memzone to share tsc hz with secondary processes Jim Harris
  2019-08-26 13:44 ` [dpdk-dev] [PATCH v5] " Jim Harris
@ 2019-08-27 16:16 ` Jim Harris
  2019-10-07 15:28 ` [dpdk-dev] [PATCH v6 RESEND] " Jim Harris
  3 siblings, 0 replies; 12+ messages in thread
From: Jim Harris @ 2019-08-27 16:16 UTC (permalink / raw)
  To: dev, bruce.richardson, anatoly.burakov

This ensures secondary processes never have to
calculate the TSC rate themselves, which can be
noticeable in VMs that don't have access to
arch-specific detection mechanism (such as
CPUID leaf 0x15 or MSR 0xCE on x86).

Since rte_mem_config is now internal to the rte_eal
library, we can add tsc_hz without ABI breakage
concerns.

Reduces rte_eal_init() execution time in a secondary
process from 165ms to 66ms on my test system.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
---
 lib/librte_eal/common/eal_common_timer.c |   15 +++++++++++++++
 lib/librte_eal/common/eal_memcfg.h       |    3 +++
 2 files changed, 18 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_timer.c b/lib/librte_eal/common/eal_common_timer.c
index 145543de7..fa9ee1b22 100644
--- a/lib/librte_eal/common/eal_common_timer.c
+++ b/lib/librte_eal/common/eal_common_timer.c
@@ -15,8 +15,10 @@
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_pause.h>
+#include <rte_eal.h>
 
 #include "eal_private.h"
+#include "eal_memcfg.h"
 
 /* The frequency of the RDTSC timer resolution */
 static uint64_t eal_tsc_resolution_hz;
@@ -77,8 +79,20 @@ estimate_tsc_freq(void)
 void
 set_tsc_freq(void)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	uint64_t freq;
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		/*
+		 * Just use the primary process calculated TSC rate in any
+		 * secondary process.  It avoids any unnecessary overhead on
+		 * systems where arch-specific frequency detection is not
+		 * available.
+		 */
+		eal_tsc_resolution_hz = mcfg->tsc_hz;
+		return;
+	}
+
 	freq = get_tsc_freq_arch();
 	if (!freq)
 		freq = get_tsc_freq();
@@ -87,6 +101,7 @@ set_tsc_freq(void)
 
 	RTE_LOG(DEBUG, EAL, "TSC frequency is ~%" PRIu64 " KHz\n", freq / 1000);
 	eal_tsc_resolution_hz = freq;
+	mcfg->tsc_hz = freq;
 }
 
 void rte_delay_us_callback_register(void (*userfunc)(unsigned int))
diff --git a/lib/librte_eal/common/eal_memcfg.h b/lib/librte_eal/common/eal_memcfg.h
index 359beb216..73be6fbae 100644
--- a/lib/librte_eal/common/eal_memcfg.h
+++ b/lib/librte_eal/common/eal_memcfg.h
@@ -70,6 +70,9 @@ struct rte_mem_config {
 	uint32_t single_file_segments;
 	/**< stored single file segments parameter. */
 
+	uint64_t tsc_hz;
+	/**< TSC rate */
+
 	uint8_t dma_maskbits; /**< Keeps the more restricted dma mask. */
 };
 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v5] eal: use memzone to share tsc hz with secondary processes
  2019-08-27 12:48     ` Bruce Richardson
@ 2019-08-28  9:01       ` Burakov, Anatoly
  0 siblings, 0 replies; 12+ messages in thread
From: Burakov, Anatoly @ 2019-08-28  9:01 UTC (permalink / raw)
  To: Bruce Richardson; +Cc: Jim Harris, dev

On 27-Aug-19 1:48 PM, Bruce Richardson wrote:
> On Tue, Aug 27, 2019 at 01:04:18PM +0100, Burakov, Anatoly wrote:
>> On 26-Aug-19 2:44 PM, Jim Harris wrote:
>>> Ideally, get_tsc_freq_arch() is able to provide the
>>> TSC rate using arch-specific means.  When that is not
>>> possible, DPDK reverts to calculating the TSC rate with
>>> a 100ms nanosleep or 1s sleep.  The latter occurs more
>>> frequently in VMs which often do not have access to the
>>> data they need from arch-specific means (CPUID leaf 0x15
>>> or MSR 0xCE on x86).
>>>
>>> In secondary processes, the extra 100ms is especially
>>> noticeable and consumes the bulk of rte_eal_init()
>>> execution time.  To resolve this extra delay, have
>>> the primary process put the TSC rate into a shared
>>> memory region that the secondary process can lookup.
>>>
>>> Reduces rte_eal_init() execution time in a secondary
>>> process from 165ms to 66ms on my test system.
>>>
>>> Signed-off-by: Jim Harris <james.r.harris@intel.com>
>>> ---
>>
>> I think this is a bad idea. If you're allocating something, you're supposed
>> to free it in rte_eal_cleanup(). If you don't, that memory leaks (i.e. there
>> are leftover hugepages after process exit). Since both primary and secondary
>> are referencing it (even if only at init), there is no safe way to free this
>> memzone.
>>
> What is the issue with not freeing it? How do we handle this for other
> global data to be shared?
> 

I don't think we have any shared data that persists after process exit. 
Or at least not in EAL we don't - i'm aware that some drivers leak 
memory in that way, but that's on them, not on me :) The immediate 
consequence of such an approach is failed EAL flags unit tests (because 
they expect that nothing is left after DPDK process termination).

Most shared data in libraries is explicitly initialized/deinitialized by 
the library user, so it's not an issue. An exception to that is the 
timer library, which uses refcounting to initialize/deinitialize shared 
state.

-- 
Thanks,
Anatoly

^ permalink raw reply	[flat|nested] 12+ messages in thread

* [dpdk-dev] [PATCH v6 RESEND] eal: add tsc_hz to rte_mem_config
  2019-08-22  8:42 [dpdk-dev] [PATCH v3] timer: use rte_mp_msg to get freq from primary process Jim Harris
                   ` (2 preceding siblings ...)
  2019-08-27 16:16 ` [dpdk-dev] [PATCH v6] eal: add tsc_hz to rte_mem_config Jim Harris
@ 2019-10-07 15:28 ` Jim Harris
  2019-10-08  8:38   ` Bruce Richardson
  3 siblings, 1 reply; 12+ messages in thread
From: Jim Harris @ 2019-10-07 15:28 UTC (permalink / raw)
  To: dev, bruce.richardson, anatoly.burakov

This ensures secondary processes never have to
calculate the TSC rate themselves, which can be
noticeable in VMs that don't have access to
arch-specific detection mechanism (such as
CPUID leaf 0x15 or MSR 0xCE on x86).

Since rte_mem_config is now internal to the rte_eal
library, we can add tsc_hz without ABI breakage
concerns.

Reduces rte_eal_init() execution time in a secondary
process from 165ms to 66ms on my test system.

Signed-off-by: Jim Harris <james.r.harris@intel.com>
---
 lib/librte_eal/common/eal_common_timer.c |   15 +++++++++++++++
 lib/librte_eal/common/eal_memcfg.h       |    3 +++
 2 files changed, 18 insertions(+)

diff --git a/lib/librte_eal/common/eal_common_timer.c b/lib/librte_eal/common/eal_common_timer.c
index 145543de7..fa9ee1b22 100644
--- a/lib/librte_eal/common/eal_common_timer.c
+++ b/lib/librte_eal/common/eal_common_timer.c
@@ -15,8 +15,10 @@
 #include <rte_log.h>
 #include <rte_cycles.h>
 #include <rte_pause.h>
+#include <rte_eal.h>
 
 #include "eal_private.h"
+#include "eal_memcfg.h"
 
 /* The frequency of the RDTSC timer resolution */
 static uint64_t eal_tsc_resolution_hz;
@@ -77,8 +79,20 @@ estimate_tsc_freq(void)
 void
 set_tsc_freq(void)
 {
+	struct rte_mem_config *mcfg = rte_eal_get_configuration()->mem_config;
 	uint64_t freq;
 
+	if (rte_eal_process_type() == RTE_PROC_SECONDARY) {
+		/*
+		 * Just use the primary process calculated TSC rate in any
+		 * secondary process.  It avoids any unnecessary overhead on
+		 * systems where arch-specific frequency detection is not
+		 * available.
+		 */
+		eal_tsc_resolution_hz = mcfg->tsc_hz;
+		return;
+	}
+
 	freq = get_tsc_freq_arch();
 	if (!freq)
 		freq = get_tsc_freq();
@@ -87,6 +101,7 @@ set_tsc_freq(void)
 
 	RTE_LOG(DEBUG, EAL, "TSC frequency is ~%" PRIu64 " KHz\n", freq / 1000);
 	eal_tsc_resolution_hz = freq;
+	mcfg->tsc_hz = freq;
 }
 
 void rte_delay_us_callback_register(void (*userfunc)(unsigned int))
diff --git a/lib/librte_eal/common/eal_memcfg.h b/lib/librte_eal/common/eal_memcfg.h
index 359beb216..73be6fbae 100644
--- a/lib/librte_eal/common/eal_memcfg.h
+++ b/lib/librte_eal/common/eal_memcfg.h
@@ -70,6 +70,9 @@ struct rte_mem_config {
 	uint32_t single_file_segments;
 	/**< stored single file segments parameter. */
 
+	uint64_t tsc_hz;
+	/**< TSC rate */
+
 	uint8_t dma_maskbits; /**< Keeps the more restricted dma mask. */
 };
 


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v6 RESEND] eal: add tsc_hz to rte_mem_config
  2019-10-07 15:28 ` [dpdk-dev] [PATCH v6 RESEND] " Jim Harris
@ 2019-10-08  8:38   ` Bruce Richardson
  2019-10-21  8:23     ` David Marchand
  0 siblings, 1 reply; 12+ messages in thread
From: Bruce Richardson @ 2019-10-08  8:38 UTC (permalink / raw)
  To: Jim Harris; +Cc: dev, anatoly.burakov

On Mon, Oct 07, 2019 at 08:28:21AM -0700, Jim Harris wrote:
> This ensures secondary processes never have to
> calculate the TSC rate themselves, which can be
> noticeable in VMs that don't have access to
> arch-specific detection mechanism (such as
> CPUID leaf 0x15 or MSR 0xCE on x86).
> 
> Since rte_mem_config is now internal to the rte_eal
> library, we can add tsc_hz without ABI breakage
> concerns.
> 
> Reduces rte_eal_init() execution time in a secondary
> process from 165ms to 66ms on my test system.
> 
> Signed-off-by: Jim Harris <james.r.harris@intel.com>
> ---
This seems a good idea to me.

Acked-by: Bruce Richardson <bruce.richardson@intel.com>

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [dpdk-dev] [PATCH v6 RESEND] eal: add tsc_hz to rte_mem_config
  2019-10-08  8:38   ` Bruce Richardson
@ 2019-10-21  8:23     ` David Marchand
  0 siblings, 0 replies; 12+ messages in thread
From: David Marchand @ 2019-10-21  8:23 UTC (permalink / raw)
  To: Jim Harris; +Cc: dev, Bruce Richardson, Burakov, Anatoly

On Tue, Oct 8, 2019 at 10:39 AM Bruce Richardson
<bruce.richardson@intel.com> wrote:
>
> On Mon, Oct 07, 2019 at 08:28:21AM -0700, Jim Harris wrote:
> > This ensures secondary processes never have to
> > calculate the TSC rate themselves, which can be
> > noticeable in VMs that don't have access to
> > arch-specific detection mechanism (such as
> > CPUID leaf 0x15 or MSR 0xCE on x86).
> >
> > Since rte_mem_config is now internal to the rte_eal
> > library, we can add tsc_hz without ABI breakage
> > concerns.
> >
> > Reduces rte_eal_init() execution time in a secondary
> > process from 165ms to 66ms on my test system.
> >
> > Signed-off-by: Jim Harris <james.r.harris@intel.com>
> > ---
> This seems a good idea to me.
>
> Acked-by: Bruce Richardson <bruce.richardson@intel.com>

I wonder if we can get rid of eal_tsc_resolution_hz and just rely on
the shared memory to get/store this info.
Feel free to look at this later and send a followup patch :-).


Applied, thanks.


--
David Marchand


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2019-10-21  8:23 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-08-22  8:42 [dpdk-dev] [PATCH v3] timer: use rte_mp_msg to get freq from primary process Jim Harris
2019-08-26 13:39 ` [dpdk-dev] [PATCH v4] eal: use memzone to share tsc hz with secondary processes Jim Harris
2019-08-26 13:44 ` [dpdk-dev] [PATCH v5] " Jim Harris
2019-08-27  8:14   ` Bruce Richardson
2019-08-27 12:04   ` Burakov, Anatoly
2019-08-27 12:48     ` Bruce Richardson
2019-08-28  9:01       ` Burakov, Anatoly
2019-08-27 14:00     ` Bruce Richardson
2019-08-27 16:16 ` [dpdk-dev] [PATCH v6] eal: add tsc_hz to rte_mem_config Jim Harris
2019-10-07 15:28 ` [dpdk-dev] [PATCH v6 RESEND] " Jim Harris
2019-10-08  8:38   ` Bruce Richardson
2019-10-21  8:23     ` David Marchand

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).