[dpdk-dev] [PATCH v4] eal: refactor rte_eal_init into sub-functions

DPDK patches and discussions
 help / color / mirror / Atom feed

From: Rahul Gupta <rahulgupt@linux.microsoft.com>
To: dev@dpdk.org, thomas@monjalon.net, bruce.richardson@intel.com,
	dmitry.kozliuk@gmail.com, stephen@networkplumber.org
Cc: sovaradh@linux.microsoft.com, okaya@kernel.org,
	sujithsankar@microsoft.com, sowmini.varadhan@microsoft.com,
	krathinavel@microsoft.com, rahulrgupta27@gmail.com,
	Rahul Gupta <rahulgupt@microsoft.com>
Subject: [dpdk-dev] [PATCH v4] eal: refactor rte_eal_init into sub-functions
Date: Wed, 24 Jan 2024 05:45:11 -0800	[thread overview]
Message-ID: <1706103911-6907-1-git-send-email-rahulgupt@linux.microsoft.com> (raw)

From: Rahul Gupta <rahulgupt@microsoft.com>

In continuation to the following email, I am sending this patch.
(https://inbox.dpdk.org/dev/20231110172523.GA17466@microsoft.com/)

Initialization requires rte_eal_init + rte_pktmbuf_pool_create which
can consume a total time of 500-600 ms:
a) For many devices FLR may take a significant chunk of time
   (200-250 ms in our use-case), this FLR is triggered during device
   probe in rte_eal_init().
b) rte_pktmbuf_pool_create() can consume up to 300-350 ms for
applications that require huge memory.

This cost is incurred on each restart (which happens in our use-case
during binary updates for servicing).
This patch provides an optimization using pthreads that applications
can use and which can save 200-230ms.

In this patch, rte_eal_init() is refactored into two parts-
a) 1st part is dependent code ie- it’s a perquisite of the FLR and
   mempool creation. So this code needs to be executed before any
   pthreads. Its named as rte_eal_init_setup()
b) 2nd part of code is independent code ie- it can execute in parallel
   to mempool creation in a pthread. Its named as rte_eal_init_async_setup().

In existing applications no changes are required unless they wish to leverage
the optimization.

If the application wants to leverage this optimization, then it needs to call
rte_eal_init_async() (instead of call rte_eal_init()), then it can create a
thread using rte_eal_remote_launch() to schedule a task it would like todo in
parallel rte_eal_init_async_setup(), this task can be a mbuf pool creation
using- rte_pktmbuf_pool_create()

After this, if next operations require completion of above task, then
user can use rte_eal_init_wait_async_setup_complete(),
or if user wants to just check status of that thread, then use-
rte_eal_init_async_setup_done()

---
v2: Address Stephen Hemminger's comment
---
v3: address support for single lcore
---
v4: address Brue Richardson and Stephen Hemminger comment
Existing application need not do any changes if bootup optimization is not needed.

 app/test-pmd/testpmd.c    |  24 ++++++++-
 lib/eal/include/rte_eal.h | 107 ++++++++++++++++++++++++++++++++++++++
 lib/eal/linux/eal.c       |  62 ++++++++++++++++++++--
 lib/eal/version.map       |   7 +++
 4 files changed, 196 insertions(+), 4 deletions(-)

diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c
index 9e4e99e53b..c8eb194f64 100644
--- a/app/test-pmd/testpmd.c
+++ b/app/test-pmd/testpmd.c
@@ -4531,6 +4531,8 @@ main(int argc, char** argv)
 	portid_t port_id;
 	uint16_t count;
 	int ret;
+	int lcore_id;
+	int main_lcore_id;
 
 #ifdef RTE_EXEC_ENV_WINDOWS
 	signal(SIGINT, signal_handler);
@@ -4550,11 +4552,31 @@ main(int argc, char** argv)
 		rte_exit(EXIT_FAILURE, "Cannot register log type");
 	rte_log_set_level(testpmd_logtype, RTE_LOG_DEBUG);
 
-	diag = rte_eal_init(argc, argv);
+	diag = rte_eal_init_async(argc, argv);
 	if (diag < 0)
 		rte_exit(EXIT_FAILURE, "Cannot init EAL: %s\n",
 			 rte_strerror(rte_errno));
 
+	main_lcore_id = rte_get_main_lcore();
+	lcore_id = rte_get_next_lcore(main_lcore_id, 0, 1);
+	/* Gives status of rte_eal_init_async() */
+	if (main_lcore_id != lcore_id)
+		while (rte_eal_init_async_setup_done(lcore_id) == 0)
+			;
+
+	/*
+	 * Use rte_eal_init_wait_async_setup_complete() to get return value of
+	 * rte_eal_init_async().
+	 * Or
+	 * if testpmd application don't want to know progress/status of
+	 * rte_eal_init_async() and just want to wait till it finishes
+	 * then use following function.
+	 */
+	ret = rte_eal_init_wait_async_setup_complete();
+	if (ret < 0)
+		rte_exit(EXIT_FAILURE, "Cannot init EAL: "
+			 "rte_eal_init_async() failed: %s\n",
+			 strerror(ret));
 	/* allocate port structures, and init them */
 	init_port();
 
diff --git a/lib/eal/include/rte_eal.h b/lib/eal/include/rte_eal.h
index c2256f832e..6d7044b632 100644
--- a/lib/eal/include/rte_eal.h
+++ b/lib/eal/include/rte_eal.h
@@ -111,6 +111,113 @@ int rte_eal_iopl_init(void);
  */
 int rte_eal_init(int argc, char **argv);
 
+/**
+ * Initialize the Environment Abstraction Layer (EAL).
+ *
+ * This function is to be executed on the MAIN lcore only, as soon
+ * as possible in the application's main() function.
+ * It puts the WORKER lcores in the WAIT state.
+ *
+ * @param argc
+ *   A non-negative value.  If it is greater than 0, the array members
+ *   for argv[0] through argv[argc] (non-inclusive) shall contain pointers
+ *   to strings.
+ * @param argv
+ *   An array of strings.  The contents of the array, as well as the strings
+ *   which are pointed to by the array, may be modified by this function.
+ *   The program name pointer argv[0] is copied into the last parsed argv
+ *   so that argv[0] is still the same after deducing the parsed arguments.
+ * @return
+ *   - On success, the number of parsed arguments, which is greater or
+ *     equal to zero. After the call to rte_eal_init_async(),
+ *     all arguments argv[x] with x < ret may have been modified by this
+ *     function call and should not be further interpreted by the
+ *     application.  The EAL does not take any ownership of the memory used
+ *     for either the argv array, or its members.
+ *   - On failure, -1 and rte_errno is set to a value indicating the cause
+ *     for failure.  In some instances, the application will need to be
+ *     restarted as part of clearing the issue.
+ *
+ *   Error codes returned via rte_errno:
+ *     EACCES indicates a permissions issue.
+ *
+ *     EAGAIN indicates either a bus or system resource was not available,
+ *            setup may be attempted again.
+ *
+ *     EALREADY indicates that the rte_eal_init_async function has already been
+ *              called, and cannot be called again.
+ *
+ *     EFAULT indicates the tailq configuration name was not found in
+ *            memory configuration.
+ *
+ *     EINVAL indicates invalid parameters were passed as argv/argc.
+ *
+ *     ENOMEM indicates failure likely caused by an out-of-memory condition.
+ *
+ *     ENODEV indicates memory setup issues.
+ *
+ *     ENOTSUP indicates that the EAL cannot initialize on this system.
+ *
+ *     EPROTO indicates that the PCI bus is either not present, or is not
+ *            readable by the eal.
+ *
+ *     ENOEXEC indicates that a service core failed to launch successfully.
+ */
+int rte_eal_init_async(int argc, char **argv);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): Initial setup
+ *
+ * Its called from rte_eal_init() on MAIN lcore only and must NOT be directly
+ * called by user application.
+ * The driver dependent code is present in this function, ie before calling any other
+ * function eal library function this function must be complete successfully.
+ *
+ * return value is same as rte_eal_init().
+ */
+
+__rte_experimental
+int rte_eal_init_setup(int argc, char **argv);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): FLR and probe device
+ *
+ * Its thread is forked by rte_eal_init() and must NOT be directly called by user application.
+ * Launched on next available worker lcore.
+ * In this function initialisation needed for memory pool creation is completed,
+ * so this code can be executed in parallel to non device related operations
+ * like mbuf pool creation.
+ *
+ * return value is same as rte_eal_init().
+ */
+__rte_experimental
+int rte_eal_init_async_setup(__attribute__((unused)) void *arg);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): Indication of rte_eal_init() completion
+ *
+ * It waits for rte_eal_init_async() to finish. It MUST be called from application,
+ * when a thread join is needed. Typically application will call this function after
+ * it performs all device independent operation (like mbuf pool creation) on initial lcore.
+ *
+ * return value is same as rte_eal_init().
+ */
+__rte_experimental
+int rte_eal_init_wait_async_setup_complete(void);
+
+/**
+ * Initialize the Environment Abstraction Layer (EAL): Indication of rte_eal_init() completion
+ *
+ * It shows status of rte_eal_init_async() ie the function is executing or completed.
+ * It MUST be called from application,
+ * Typically an application will call this function when it wants to know status of
+ * rte_eal_init_async() (ie FLR and probe thread).
+ *
+ * return value is same as rte_eal_init().
+ */
+__rte_experimental
+int rte_eal_init_async_setup_done(int lcore_id);
+
 /**
  * Clean up the Environment Abstraction Layer (EAL)
  *
diff --git a/lib/eal/linux/eal.c b/lib/eal/linux/eal.c
index fd422f1f62..e86aac5980 100644
--- a/lib/eal/linux/eal.c
+++ b/lib/eal/linux/eal.c
@@ -962,9 +962,8 @@ eal_worker_thread_create(unsigned int lcore_id)
 	return ret;
 }
 
-/* Launch threads, called at application init(). */
-int
-rte_eal_init(int argc, char **argv)
+__rte_experimental
+int rte_eal_init_setup(int argc, char **argv)
 {
 	int i, fctret, ret;
 	static RTE_ATOMIC(uint32_t) run_once;
@@ -1268,7 +1267,15 @@ rte_eal_init(int argc, char **argv)
 	 */
 	rte_eal_mp_remote_launch(sync_func, NULL, SKIP_MAIN);
 	rte_eal_mp_wait_lcore();
+	return fctret;
+}
 
+__rte_experimental int
+rte_eal_init_async_setup(__attribute__((unused)) void *arg)
+{
+	int ret = 0;
+	struct internal_config *internal_conf =
+		eal_get_internal_configuration();
 	/* initialize services so vdevs register service during bus_probe. */
 	ret = rte_service_init();
 	if (ret) {
@@ -1322,6 +1329,55 @@ rte_eal_init(int argc, char **argv)
 
 	eal_mcfg_complete();
 
+	return 0;
+}
+
+/*
+ * waits until function executing on given lcore finishes.
+ * returns value returned by the function executing on that lcore.
+ */
+__rte_experimental int
+rte_eal_init_wait_async_setup_complete(void)
+{
+	int lcore_id = -1;
+	lcore_id = rte_lcore_id();
+	lcore_id = rte_get_next_lcore(lcore_id, 0, 1);
+	int ret = rte_eal_wait_lcore(lcore_id);
+	return ret;
+}
+
+/*
+ * returns current status of execution on a given lcore
+ */
+__rte_experimental int
+rte_eal_init_async_setup_done(int lcore_id)
+{
+	int ret = (lcore_config[lcore_id].state);
+	return (ret == WAIT);
+}
+
+/* Launch threads, called at application init(). */
+int
+rte_eal_init(int argc, char **argv)
+{
+	int fctret = rte_eal_init_setup(argc, argv);
+	if (fctret < 0)
+		return fctret;
+	return rte_eal_init_async_setup(NULL);
+}
+
+/* Launch threads, called at application init(). */
+__rte_experimental int
+rte_eal_init_async(int argc, char **argv)
+{
+	int lcore_id;
+	int fctret = rte_eal_init_setup(argc, argv);	/* initial lcore*/
+	if (fctret < 0)
+		return fctret;
+	lcore_id = rte_lcore_id();
+	lcore_id = rte_get_next_lcore(lcore_id, 0, 1);
+	/* running on a worker lcore */
+	rte_eal_remote_launch(rte_eal_init_async_setup, NULL, lcore_id);
 	return fctret;
 }
 
diff --git a/lib/eal/version.map b/lib/eal/version.map
index 5e0cd47c82..5e7ccb67c4 100644
--- a/lib/eal/version.map
+++ b/lib/eal/version.map
@@ -393,6 +393,13 @@ EXPERIMENTAL {
 	# added in 23.07
 	rte_memzone_max_get;
 	rte_memzone_max_set;
+
+	# added in 24.01
+	rte_eal_init_async;
+	rte_eal_init_setup;
+	rte_eal_init_async_setup;
+	rte_eal_init_async_setup_done;
+	rte_eal_init_wait_async_setup_complete;
 };
 
 INTERNAL {
-- 
2.25.1

next             reply	other threads:[~2024-01-24 13:45 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-24 13:45 Rahul Gupta [this message]
2024-01-24 15:47 ` Stephen Hemminger
2024-01-24 15:47 ` Stephen Hemminger
2024-01-24 15:53 ` David Marchand
2024-01-29  5:35   ` Rahul Gupta
2024-01-29  7:55     ` David Marchand
2024-02-02 10:21       ` Thomas Monjalon
2024-02-03 12:57         ` Rahul Gupta
2024-02-08 17:05           ` Rahul Gupta
2024-01-24 17:49 ` Tyler Retzlaff

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1706103911-6907-1-git-send-email-rahulgupt@linux.microsoft.com \
    --to=rahulgupt@linux.microsoft.com \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=dmitry.kozliuk@gmail.com \
    --cc=krathinavel@microsoft.com \
    --cc=okaya@kernel.org \
    --cc=rahulgupt@microsoft.com \
    --cc=rahulrgupta27@gmail.com \
    --cc=sovaradh@linux.microsoft.com \
    --cc=sowmini.varadhan@microsoft.com \
    --cc=stephen@networkplumber.org \
    --cc=sujithsankar@microsoft.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).