DPDK patches and discussions
From: Bruce Richardson <bruce.richardson@intel.com>
To: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: dev <dev@dpdk.org>, Luca Boccassi <bluca@debian.org>
Subject: Re: Should we try to be more graceful in library init on old Hardware?
Date: Thu, 30 Mar 2023 14:28:00 +0100
Message-ID: <ZCWOYG7LSNYKN2Kq@bricha3-MOBL.ger.corp.intel.com>
In-Reply-To: <ZCWLfjnqCPqbmLtX@bricha3-MOBL.ger.corp.intel.com>

On Thu, Mar 30, 2023 at 02:15:42PM +0100, Bruce Richardson wrote:
> On Thu, Mar 30, 2023 at 02:53:41PM +0200, Christian Ehrhardt wrote:
> > Hi,
> > I've recently gotten the kind of bug I had been expecting for many years.
> > In fact I wondered if it would still come up, as each year made it less likely.
> > But it happened and I got a crash report from someone using dpdk on
> > rather old pre-SSE4.2 hardware.
> > => https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/2009635/comments/9
> > 
> > The reporter was nice and tried the newer 22.11, but that is just as affected.
> > 
> > I understand that DPDK, as a project, has set this as the minimal
> > accepted hardware capability.
> > But some programs - in this case UHD - are able to do many
> > other things, so it might happen that UHD or anything else just links to DPDK
> > (as it could be used with it) and due to that runs into a crash when
> > loading. In theory other tools like collectd, which has dpdk support,
> > would be affected in the same way.
> > 
> > Example:
> > root@1bee22d20ca0:/# uhd_usrp_probe
> > Illegal instruction (core dumped)
> > 
> > (gdb) bt
> > #0 0x00007f4b2d3a3374 in rte_srand () from
> > /lib/x86_64-linux-gnu/librte_eal.so.23
> > #1 0x00007f4b2d3967ec in ?? () from /lib/x86_64-linux-gnu/librte_eal.so.23
> > #2 0x00007f4b2e5d1fbe in call_init (l=<optimized out>,
> > argc=argc@entry=1, argv=argv@entry=0x7ffeabf5b488,
> > env=env@entry=0x7ffeabf5b498)
> >     at ./elf/dl-init.c:70
> > #3 0x00007f4b2e5d20a8 in call_init (env=0x7ffeabf5b498,
> > argv=0x7ffeabf5b488, argc=1, l=<optimized out>) at ./elf/dl-init.c:33
> > #4 _dl_init (main_map=0x7f4b2e6042e0, argc=1, argv=0x7ffeabf5b488,
> > env=0x7ffeabf5b498) at ./elf/dl-init.c:117
> > #5 0x00007f4b2e5ea8b0 in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
> > #6 0x0000000000000001 in ?? ()
> > #7 0x00007ffeabf5c844 in ?? ()
> > #8 0x0000000000000000 in ?? ()
> > 
> > Right now all we could do is:
> > a) say bad luck old hardware (not nice)
> > b) make super complex alternative builds with and without dpdk support
> > c) ask the DPDK project to work on non sse4.2 (unlikely and too late
> > in 2023 I guess)
> > d) Somehow make the initialization graceful (that is what this RFC is about)
> > 
> > If we could get DPDK to ensure the lib loading paths are SSE4.2 free,
> > then we could check the capabilities on the actual initialization and
> > return a proper error instead of a crash.
> > That way only real users of DPDK would be required to have
> > sufficiently new hardware.
> > And OTOH users of software that links against DPDK, but in the current
> > config would not use it, would suffer less.
> > 
> > WDYT?
> > Maybe it has already been discussed and I neither remembered nor found it?
> > 
> It certainly hasn't been discussed previously, but there is meant to be
> support for this in EAL init itself. Almost the first function called
> from eal_init() is "rte_cpu_is_supported()" [1] which checks the build-time
> CPU flags against those of the current system.
> Unfortunately, from the error message you are getting, that doesn't seem to
> be working ok in the case of SSE4.2. It seems the compiler is inserting
> SSE4 instructions before we even get to that point. :-(
> 
> Perhaps we need to move eal init to a new file, and compile it (and the
> cpuflag checks) with very minimal CPU flags.
> 

Following up to my own mail...

I believe we may be able to solve this more easily by using the "target"
attribute for those functions. For x86 builds I don't see why eal init
cannot be compiled for an earlier SSE version (march=core2, perhaps). It's
not a performance-sensitive function.
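
To make the idea concrete, here is a rough sketch of what the "target"
attribute approach could look like. It is not DPDK code: demo_init() and
cpu_has_sse42() are hypothetical names, INIT_BASELINE is a made-up macro, and
arch=core2 is only the value floated above. It assumes GCC or Clang, and the
attribute only constrains code generated inside the annotated functions, so
anything they call that is built with the default flags would need the same
treatment.

```c
#include <stdio.h>

/*
 * Hypothetical macro: ask the compiler to generate code for core2 inside the
 * annotated function, whatever -march the rest of the file is built with.
 * On non-x86 targets it expands to nothing.
 */
#if defined(__x86_64__) || defined(__i386__)
#define INIT_BASELINE __attribute__((target("arch=core2")))
#else
#define INIT_BASELINE
#endif

/* Runtime check for SSE4.2; returns 1 if present, 0 otherwise. */
INIT_BASELINE
int cpu_has_sse42(void)
{
#if defined(__x86_64__) || defined(__i386__)
	__builtin_cpu_init();
	return __builtin_cpu_supports("sse4.2") ? 1 : 0;
#else
	return 1; /* assumption: nothing to check on other architectures */
#endif
}

/* Stand-in for an init entry point that fails cleanly instead of crashing. */
INIT_BASELINE
int demo_init(void)
{
	if (!cpu_has_sse42()) {
		fprintf(stderr, "demo_init: CPU lacks SSE4.2, refusing to start\n");
		return -1;
	}
	return 0;
}
```

A caller would then see an error return from demo_init() on old hardware
rather than an illegal instruction. Note that the crash reported above happens
in a constructor at library load time, so any constructors on that path would
need to be covered as well.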

Thoughts?
/Bruce
