From: "Morten Brørup" <mb@smartsharesystems.com>
To: "Don Wallwork" <donw@xsightlabs.com>,
"Stephen Hemminger" <stephen@networkplumber.org>,
"Anatoly Burakov" <anatoly.burakov@intel.com>,
"Dmitry Kozlyuk" <dmitry.kozliuk@gmail.com>,
"Bruce Richardson" <bruce.richardson@intel.com>
Cc: <dev@dpdk.org>
Subject: RE: [RFC] eal: allow worker lcore stacks to be allocated from hugepage memory
Date: Wed, 27 Apr 2022 10:17:40 +0200 [thread overview]
Message-ID: <98CBD80474FA8B44BF855DF32C47DC35D87006@smartserver.smartshare.dk> (raw)
In-Reply-To: <aa79c2bf-adb1-a489-6162-f2021fd4c144@xsightlabs.com>
+CC: EAL and Memory maintainers.
> From: Don Wallwork [mailto:donw@xsightlabs.com]
> Sent: Tuesday, 26 April 2022 23.26
>
> On 4/26/2022 5:21 PM, Stephen Hemminger wrote:
> > On Tue, 26 Apr 2022 17:01:18 -0400
> > Don Wallwork <donw@xsightlabs.com> wrote:
> >
> >> On 4/26/2022 10:58 AM, Stephen Hemminger wrote:
> >>> On Tue, 26 Apr 2022 08:19:59 -0400
> >>> Don Wallwork <donw@xsightlabs.com> wrote:
> >>>
> >>>> Add support for using hugepages for worker lcore stack memory.
> The
> >>>> intent is to improve performance by reducing stack memory related
> TLB
> >>>> misses and also by using memory local to the NUMA node of each
> lcore.
This certainly seems like a good idea!
However, I wonder: Does the O/S assign memory local to the NUMA node to an lcore-pinned thread's stack when instantiating the tread? And does the DPDK EAL ensure that the preconditions for the O/S to do that are present?
(Not relevant for this patch, but the same locality questions come to mind regarding Thread Local Storage.)
> >>>>
> >>>> Platforms desiring to make use of this capability must enable the
> >>>> associated option flag and stack size settings in platform config
> >>>> files.
> >>>> ---
> >>>> lib/eal/linux/eal.c | 39
> +++++++++++++++++++++++++++++++++++++++
> >>>> 1 file changed, 39 insertions(+)
> >>>>
> >>> Good idea but having a fixed size stack makes writing complex
> application
> >>> more difficult. Plus you lose the safety of guard pages.
Would it be possible to add a guard page or guard region by using the O/S memory allocator instead of rte_zmalloc_socket()? Since the stack is considered private to the process, i.e. not accessible from other processes, this patch does not need to provide remote access to stack memory from secondary processes - and thus it is not a requirement for this features to use DPDK managed memory.
> >> Thanks for the quick reply.
> >>
> >> The expectation is that use of this optional feature would be
> limited to
> >> cases where
> >> the performance gains justify the implications of these tradeoffs.
> For
> >> example, a specific
> >> data plane application may be okay with limited stack size and could
> be
> >> tested to ensure
> >> stack usage remains within limits.
How to identify the required stack size and verify it... If aiming for small stacks, some instrumentation would be nice, like rte_mempool_audit() and rte_mempool_list_dump().
Alternatively, just assume that the stack is "always big enough", and don't worry about it - like the default O/S stack size. And as Stephen already mentioned: Regardless of stack size, overflowing the stack will cause memory corruption instead of a segmentation fault.
Keep in mind that the required stack size not only depends on the application, but also on DPDK and other libraries being used by the application.
> >>
> >> Also, since this applies only to worker threads, the main thread
> would
> >> not be impacted
> >> by this change.
> >>
> >>
> > I would prefer it as a runtime, not compile time option.
> > That way distributions could ship DPDK and application could opt in
> if it wanted.
> Good point.. I'll work on a v2 and will post that when it's ready.
May I suggest using the stack size configured in the O/S, from pthread_attr_getstacksize() or similar, instead of choosing the stack size manually? If you want it to be configurable, use the default size unless explicitly specified otherwise.
Do the worker threads need a different stack size than the main thread? In my opinion: "Nice to have", not "must have".
Do the worker threads need different stack sizes individually? In my opinion: Perhaps "nice to have", certainly not "must have".
next prev parent reply other threads:[~2022-04-27 8:17 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-04-26 12:19 Don Wallwork
2022-04-26 14:58 ` Stephen Hemminger
2022-04-26 21:01 ` Don Wallwork
2022-04-26 21:21 ` Stephen Hemminger
2022-04-26 21:25 ` Don Wallwork
2022-04-27 8:17 ` Morten Brørup [this message]
2022-04-29 18:52 ` Don Wallwork
2022-04-29 19:03 ` Stephen Hemminger
2022-05-02 13:15 ` Don Wallwork
2022-04-30 7:55 ` Morten Brørup
2022-04-27 0:42 ` Honnappa Nagarahalli
2022-04-27 17:50 ` Don Wallwork
2022-04-27 19:09 ` Honnappa Nagarahalli
2022-04-29 20:00 ` [RFC v2] " Don Wallwork
2022-04-30 7:20 ` Morten Brørup
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=98CBD80474FA8B44BF855DF32C47DC35D87006@smartserver.smartshare.dk \
--to=mb@smartsharesystems.com \
--cc=anatoly.burakov@intel.com \
--cc=bruce.richardson@intel.com \
--cc=dev@dpdk.org \
--cc=dmitry.kozliuk@gmail.com \
--cc=donw@xsightlabs.com \
--cc=stephen@networkplumber.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).