DPDK CI discussions
* [dpdk-ci] ABI test failing for openSUSE and Arch Linux
From: David Marchand @ 2021-06-04  7:58 UTC (permalink / raw)
  To: dpdklab; +Cc: ci, Aaron Conole, Thomas Monjalon, Ray Kinsella

Hello,

Looking at the dashboard, we have ABI checks reporting failures.
Example: https://lab.dpdk.org/results/dashboard/patchsets/17309/

I can't debug this, but I suspect that the reference dumps are not
generated the same way the patches are tested.


-- 
David Marchand



* Re: [dpdk-ci] [dpdklab] ABI test failing for openSUSE and Arch Linux
From: Lincoln Lavoie @ 2021-06-04 13:52 UTC (permalink / raw)
  To: David Marchand; +Cc: dpdklab, ci, Aaron Conole, Thomas Monjalon, Ray Kinsella


All,

The ABI references for all systems were updated this week to the 21.05
release code.  The two failures look like places where the interfaces
didn't actually change. We do see the failures across multiple patches,
which might imply something got merged that caused these changes / failures.

---------------------------------------------------------------------------------------------
OpenSUSE
2 Removed function symbols not referenced by debug info:

  [D] _fini
  [D] _init

---------------------------------------------------------------------------------------------
Arch Linux (one example from the output)

Functions changes summary: 0 Removed, 0 Changed, 0 Added function
Variables changes summary: 0 Removed, 1 Changed (13 filtered out), 0 Added variables

1 Changed variable:

  [C] 'rte_table_ops rte_table_acl_ops' was changed at rte_table_acl.h:60:1:
    type of variable changed:
      type size hasn't changed
      1 data member change:
        type of 'rte_table_op_lookup f_lookup' changed:
          underlying type 'int (void*, rte_mbuf**, typedef uint64_t, uint64_t*, void**)*' changed:
            in pointed to type 'function type int (void*, rte_mbuf**, typedef uint64_t, uint64_t*, void**)':
              parameter 2 of type 'rte_mbuf**' has sub-type changes:
                in pointed to type 'rte_mbuf*':
                  in pointed to type 'struct rte_mbuf' at rte_mbuf_core.h:484:1:
                    type size hasn't changed
                    1 data member changes (1 filtered):
                      type of 'anonymous data member union {uint32_t packet_type; struct {uint8_t l2_type; uint8_t l3_type; uint8_t l4_type; uint8_t tun_type; union {uint8_t inner_esp_next_proto; struct {uint8_t inner_l2_type; uint8_t inner_l3_type;};}; uint8_t inner_l4_type;};}' changed:
                        type size hasn't changed
                        1 data member change:
                          type of 'anonymous data member struct {uint8_t l2_type; uint8_t l3_type; uint8_t l4_type; uint8_t tun_type; union {uint8_t inner_esp_next_proto; struct {uint8_t inner_l2_type; uint8_t inner_l3_type;};}; uint8_t inner_l4_type;}' changed:
                            type size hasn't changed
                            3 data member changes:
                              'uint8_t inner_l4_type' offset changed from 0 to 24 (in bits) (by +24 bits)
                              'uint8_t l4_type' offset changed from 0 to 8 (in bits) (by +8 bits)
                              'uint8_t tun_type' offset changed from 4 to 12 (in bits) (by +8 bits)
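
A minimal sketch of how the two summary lines in a report like the one
above could be turned into counts, for example to let a CI job decide
whether anything was reported at all. The names parse_summary and
has_reported_change are illustrative assumptions; they are not part of
libabigail or of the lab's scripts, and only the summary-line format
shown above is handled.

```python
import re

# Matches the "Functions/Variables changes summary" lines as they appear in
# the report above. The "(N filtered out)" part is optional.
SUMMARY_RE = re.compile(
    r"^(Functions|Variables) changes summary: "
    r"(\d+) Removed, (\d+) Changed(?: \((\d+) filtered out\))?, (\d+) Added"
)


def parse_summary(report: str) -> dict:
    """Return {'functions': {...}, 'variables': {...}} for the lines found."""
    counts = {}
    for line in report.splitlines():
        match = SUMMARY_RE.match(line.strip())
        if match:
            kind, removed, changed, filtered, added = match.groups()
            counts[kind.lower()] = {
                "removed": int(removed),
                "changed": int(changed),
                "filtered": int(filtered or 0),
                "added": int(added),
            }
    return counts


def has_reported_change(counts: dict) -> bool:
    """True if any summary line reports a removed, changed or added item."""
    return any(
        c["removed"] or c["changed"] or c["added"] for c in counts.values()
    )


# The two summary lines from the Arch Linux excerpt above.
SAMPLE = """\
Functions changes summary: 0 Removed, 0 Changed, 0 Added function
Variables changes summary: 0 Removed, 1 Changed (13 filtered out), 0 Added variables
"""

if __name__ == "__main__":
    print(parse_summary(SAMPLE))
```

For the Arch Linux excerpt this reports one changed variable with 13
filtered out and no function changes. The OpenSUSE output uses a
different line format and would need its own pattern.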





-- 
*Lincoln Lavoie*
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)
<https://www.iol.unh.edu>


* Re: [dpdk-ci] [dpdklab] ABI test failing for openSUSE and Arch Linux
From: David Marchand @ 2021-06-04 14:02 UTC (permalink / raw)
  To: Lincoln Lavoie; +Cc: dpdklab, ci, Aaron Conole, Thomas Monjalon, Ray Kinsella

On Fri, Jun 4, 2021 at 3:53 PM Lincoln Lavoie <lylavoie@iol.unh.edu> wrote:
>
> All,
>
> The ABI references for all systems were updated this week to the 21.05 release code.  The two failures look like places where the interfaces didn't actually change. We do see the failures across multiple patches, which might imply something got merged that caused these changes / failures.

I checked ABI for 64307fad7d2b ("telemetry: remove static limit on
callbacks count") against v21.05, yesterday.
There is no issue.

The report I mentioned
https://lab.dpdk.org/results/dashboard/patchsets/17309/ is for a patch
added on top of 64307fad7d2b.
And the patch for this patchset won't break ABI.


Can you confirm the ABI check runs fine against the main repo?
I can't find a report for it in lab.dpdk.org.
https://lab.dpdk.org/results/dashboard/tarballs/


-- 
David Marchand



* Re: [dpdk-ci] [dpdklab] ABI test failing for openSUSE and Arch Linux
From: Owen Hilyard @ 2021-06-04 18:28 UTC (permalink / raw)
  To: David Marchand
  Cc: Lincoln Lavoie, dpdklab, ci, Aaron Conole, Thomas Monjalon, Ray Kinsella


All,

The ABI reference seems to have been generated incorrectly. After
re-generating the ABI reference and re-running all of the patches, ABI
passed for all affected patches.

Owen Hilyard


* Re: [dpdk-ci] [dpdklab] ABI test failing for openSUSE and Arch Linux
From: David Marchand @ 2021-06-07  8:33 UTC (permalink / raw)
  To: Lincoln Lavoie
  Cc: dpdklab, ci, Aaron Conole, Thomas Monjalon, Ray Kinsella, Dodji Seketeli

Trying to do a post mortem.


On Fri, Jun 4, 2021 at 3:53 PM Lincoln Lavoie <lylavoie@iol.unh.edu> wrote:
> The ABI references for all systems were updated this week to the 21.05 release code.  The two failures look like places where the interfaces didn't actually change. We do see the failures across multiple patches, which might imply something got merged that caused these changes / failures.
>
> ---------------------------------------------------------------------------------------------
> OpenSUSE
> 2 Removed function symbols not referenced by debug info:
>
>   [D] _fini
>   [D] _init

For this one, I am not sure what went wrong.
These are not symbols from DPDK itself.
They point to a libc / toolchain and/or linker change.


>
> ---------------------------------------------------------------------------------------------
> Arch Linux (one example from the output)
>
> Functions changes summary: 0 Removed, 0 Changed, 0 Added function
> Variables changes summary: 0 Removed, 1 Changed (13 filtered out), 0 Added variables
>
> 1 Changed variable:
>
>   [C] 'rte_table_ops rte_table_acl_ops' was changed at rte_table_acl.h:60:1:
>     type of variable changed:
>       type size hasn't changed
>       1 data member change:
>         type of 'rte_table_op_lookup f_lookup' changed:
>           underlying type 'int (void*, rte_mbuf**, typedef uint64_t, uint64_t*, void**)*' changed:
>             in pointed to type 'function type int (void*, rte_mbuf**, typedef uint64_t, uint64_t*, void**)':
>               parameter 2 of type 'rte_mbuf**' has sub-type changes:
>                 in pointed to type 'rte_mbuf*':
>                   in pointed to type 'struct rte_mbuf' at rte_mbuf_core.h:484:1:
>                     type size hasn't changed
>                     1 data member changes (1 filtered):
>                       type of 'anonymous data member union {uint32_t packet_type; struct {uint8_t l2_type; uint8_t l3_type; uint8_t l4_type; uint8_t tun_type; union {uint8_t inner_esp_next_proto; struct {uint8_t inner_l2_type; uint8_t inner_l3_type;};}; uint8_t inner_l4_type;};}' changed:
>                         type size hasn't changed
>                         1 data member change:
>                           type of 'anonymous data member struct {uint8_t l2_type; uint8_t l3_type; uint8_t l4_type; uint8_t tun_type; union {uint8_t inner_esp_next_proto; struct {uint8_t inner_l2_type; uint8_t inner_l3_type;};}; uint8_t inner_l4_type;}' changed:
>                             type size hasn't changed
>                             3 data member changes:
>                               'uint8_t inner_l4_type' offset changed from 0 to 24 (in bits) (by +24 bits)
>                               'uint8_t l4_type' offset changed from 0 to 8 (in bits) (by +8 bits)
>                               'uint8_t tun_type' offset changed from 4 to 12 (in bits) (by +8 bits)

For this one, this is probably due to
https://sourceware.org/bugzilla/show_bug.cgi?id=26684

I had a discussion with Dodji (libabigail maintainer).

Going from dwarf 4 to dwarf 5 is something that is decided at gcc /
binutils level.

Arch Linux recently adopted gcc 11.
https://archlinux.org/packages/core/x86_64/gcc/

If we compare gcc 10 and gcc 11:
- https://gcc.gnu.org/onlinedocs/gcc-10.1.0/gcc/Debugging-Options.html
"Produce debugging information in DWARF format (if that is supported).
The value of version may be either 2, 3, 4 or 5; the default version
for most targets is 4. DWARF Version 5 is only experimental. "
- https://gcc.gnu.org/onlinedocs/gcc-11.1.0/gcc/Debugging-Options.html
"Produce debugging information in DWARF format (if that is supported).
The value of version may be either 2, 3, 4 or 5; the default version
for most targets is 5 (with the exception of VxWorks, TPF and
Darwin/Mac OS X, which default to version 2, and AIX, which defaults
to version 4)."

So the issue reported above could be explained by the reference
having been generated before gcc was upgraded to 11.
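
A small sketch of the reasoning above: if the reference and the tested
build come from gcc releases with different default DWARF versions,
their debug info differs even when the source does not. The table holds
only the two defaults quoted from the GCC manuals above ("for most
targets"); the function names are hypothetical, and assuming the
reference was built with gcc 10 is a guess, not something confirmed in
this thread.

```python
# Default DWARF version per gcc major release, taken only from the two GCC
# manual excerpts quoted above. Other releases are left out rather than guessed.
DEFAULT_DWARF = {10: 4, 11: 5}


def default_dwarf_version(gcc_major: int) -> int:
    """Return the documented default DWARF version for a gcc major release."""
    try:
        return DEFAULT_DWARF[gcc_major]
    except KeyError:
        raise ValueError(
            f"no documented default recorded for gcc {gcc_major}"
        ) from None


def debug_format_differs(reference_gcc: int, test_gcc: int) -> bool:
    """True if the two compilers default to different DWARF versions."""
    return default_dwarf_version(reference_gcc) != default_dwarf_version(test_gcc)


if __name__ == "__main__":
    # Hypothetical case: reference built with gcc 10, patches tested with gcc 11.
    print(debug_format_differs(10, 11))
```

A check like this only covers the compiler default; a build that passes
an explicit DWARF version flag would need to be handled separately.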


I have some trouble finding the actual date for the Arch Linux switch
to gcc 11 (maybe 2021/05/17).
Are you able to correlate this with the date the ABI reference was
previously generated?


-- 
David Marchand



* Re: [dpdk-ci] [dpdklab] ABI test failing for openSUSE and Arch Linux
From: Brandon Lo @ 2021-06-09 15:07 UTC (permalink / raw)
  To: David Marchand
  Cc: Lincoln Lavoie, dpdklab, ci, Aaron Conole, Thomas Monjalon,
	Ray Kinsella, Dodji Seketeli

Hi David,

For the Arch container, the gcc version did update to 11.1 which
contributed to this error.
The OpenSUSE container also underwent a similar update process where
various packages related to the toolchain were updated. However, it
appears that OpenSUSE remained on gcc version 7.5.0.
This is, in part, also caused by the fact that we generate the ABI
references once and store them. This is simply to reduce the amount of
time per ABI test.

To streamline this entire process, we are working on a job or pipeline
to automate refreshing all of the images and recreate the ABI
references.

Thanks,
Brandon



-- 
Brandon Lo
UNH InterOperability Laboratory
21 Madbury Rd, Suite 100, Durham, NH 03824
blo@iol.unh.edu
www.iol.unh.edu


* Re: [dpdk-ci] [dpdklab] ABI test failing for openSUSE and Arch Linux
From: David Marchand @ 2021-06-10  8:02 UTC (permalink / raw)
  To: Brandon Lo
  Cc: Lincoln Lavoie, dpdklab, ci, Aaron Conole, Thomas Monjalon,
	Ray Kinsella, Dodji Seketeli

On Wed, Jun 9, 2021 at 5:07 PM Brandon Lo <blo@iol.unh.edu> wrote:
> To streamline this entire process, we are working on a job or pipeline
> to automate refreshing all of the images and recreate the ABI
> references.

I understand the motivation, but will we have a clear idea of which
ABI reference has been used and how to reproduce its generation (sha1,
versions of the toolchain, libc, libabigail and similar packages,
version of the script generating the reference)?
If something breaks later and we don't know clearly how/if a reference
changed, it will be a pain to analyse.
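
One way to picture the request above is a small manifest stored next to
each reference. This is only a sketch: the field names, the helper
functions and the sample values are assumptions for illustration, not a
description of what the lab records today.

```python
import hashlib
import json


def build_manifest(dpdk_sha1, image_tag, package_versions, script_version):
    """Collect the inputs that determine how a reference was generated."""
    return {
        "dpdk_sha1": dpdk_sha1,
        "image_tag": image_tag,
        "packages": dict(sorted(package_versions.items())),
        "script_version": script_version,
    }


def manifest_fingerprint(manifest):
    """Stable SHA-256 over the manifest, independent of dict ordering."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def changed_fields(old, new):
    """List which manifest entries differ between two references."""
    changed = []
    for key in sorted(set(old) | set(new)):
        if key != "packages" and old.get(key) != new.get(key):
            changed.append(key)
    old_pkgs = old.get("packages", {})
    new_pkgs = new.get("packages", {})
    for name in sorted(set(old_pkgs) | set(new_pkgs)):
        if old_pkgs.get(name) != new_pkgs.get(name):
            changed.append(f"packages.{name}")
    return changed


if __name__ == "__main__":
    # Placeholder values only; none of these come from the lab's systems.
    before = build_manifest("aaaa", "arch:before", {"gcc": "10"}, "1")
    after = build_manifest("aaaa", "arch:after", {"gcc": "11.1"}, "1")
    print(changed_fields(before, after))
```

With something like this saved alongside the reference, a later failure
can start from "which inputs changed" instead of from the abidiff
output alone.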


-- 
David Marchand



* Re: [dpdk-ci] [dpdklab] ABI test failing for openSUSE and Arch Linux
From: Lincoln Lavoie @ 2021-06-10  8:13 UTC (permalink / raw)
  To: David Marchand
  Cc: Brandon Lo, dpdklab, ci, Aaron Conole, Thomas Monjalon,
	Ray Kinsella, Dodji Seketeli


Hi David,

I think yes.  What Brandon was referring to is linking the process we use
to refresh the container images and the rebuild of the ABI references, so
one triggers the other.  What happened with the failure was the
container images got rebuilt, and that pulled in updates that change the
ABI output (in valid ways), which then "look like" a failure or change from
the reference that was previously saved off.
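
As a rough sketch of "one triggers the other": if references are looked
up by the container image tag as well as the DPDK tag, a refreshed image
can never be compared against a reference built on the old one. The
class and the generate callback are hypothetical, and the sketch assumes
generation is repeatable for a given image.

```python
from typing import Callable, Dict, Tuple


class ReferenceStore:
    """Keeps one ABI reference per (image tag, DPDK tag) pair."""

    def __init__(self, generate: Callable[[str, str], str]):
        # generate(image_tag, dpdk_tag) stands in for the real dump step.
        self._generate = generate
        self._refs: Dict[Tuple[str, str], str] = {}

    def get(self, image_tag: str, dpdk_tag: str) -> str:
        """Return the stored reference, generating it on first use."""
        key = (image_tag, dpdk_tag)
        if key not in self._refs:
            self._refs[key] = self._generate(image_tag, dpdk_tag)
        return self._refs[key]


if __name__ == "__main__":
    calls = []

    def fake_generate(image_tag, dpdk_tag):
        calls.append((image_tag, dpdk_tag))
        return f"reference for {dpdk_tag} on {image_tag}"

    store = ReferenceStore(fake_generate)
    store.get("arch:old", "v21.05")
    store.get("arch:old", "v21.05")   # reused, no regeneration
    store.get("arch:new", "v21.05")   # new image, so a new reference
    print(len(calls))
```

The reference is still generated once per image, so the per-test cost
stays low, but a rebuilt image always gets its own reference.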

We save off older versions of the container images (i.e. they are
tagged), so we can always roll back if we need to.  ABI reference generation
should be deterministic for a given container image, so we don't save
"versions" of those references.

Cheers,
Lincoln


-- 
*Lincoln Lavoie*
Principal Engineer, Broadband Technologies
21 Madbury Rd., Ste. 100, Durham, NH 03824
lylavoie@iol.unh.edu
https://www.iol.unh.edu
+1-603-674-2755 (m)
<https://www.iol.unh.edu>
