DPDK patches and discussions
* [dpdk-dev] Bug with commit 64051bb1 (devargs: unify scratch buffer storage)
@ 2021-04-16 22:04 Harris, James R
  2021-04-17 14:59 ` Xueming(Steven) Li
  0 siblings, 1 reply; 2+ messages in thread
From: Harris, James R @ 2021-04-16 22:04 UTC
  To: dev, Xueming(Steven) Li

Hi,

SPDK has identified a regression with commit 64051bb1 (devargs: unify scratch buffer storage).  The issue seems to be with this part of the patch:

@@ -276,15 +287,8 @@ rte_devargs_insert(struct rte_devargs **da)
                if (strcmp(listed_da->bus->name, (*da)->bus->name) == 0 &&
                                strcmp(listed_da->name, (*da)->name) == 0) {
                        /* device already in devargs list, must be updated */
-                       listed_da->type = (*da)->type;
-                       listed_da->policy = (*da)->policy;
-                       free(listed_da->args);
-                       listed_da->args = (*da)->args;
-                       listed_da->bus = (*da)->bus;
-                       listed_da->cls = (*da)->cls;
-                       listed_da->bus_str = (*da)->bus_str;
-                       listed_da->cls_str = (*da)->cls_str;
-                       listed_da->data = (*da)->data;
+                       rte_devargs_reset(listed_da);
+                       *listed_da = **da;
                        /* replace provided devargs with found one */
                        free(*da);
                        *da = listed_da;


Previously the data members were copied one-by-one, preserving the pointers in the listed_da’s TAILQ_ENTRY.  But after this patch, rte_devargs_reset() zeroes the entire rte_devargs structure, including the pointers in the TAILQ_ENTRY.  If we do a subsequent rte_devargs_remove() on this same entry, we segfault since the TAILQ_ENTRY’s pointers are invalid.  There could be similar segfaults with any subsequent rte_devargs_insert() calls that require iterating the global list of devargs entries.
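
To illustrate the failure mode outside of DPDK, here is a minimal sketch using the plain <sys/queue.h> macros. The struct and function names (demo_devargs, demo_linkage_survives_struct_copy) are hypothetical stand-ins, not DPDK code. It only shows that a whole-struct assignment such as `*listed_da = **da;` also copies the source's TAILQ_ENTRY over the listed entry's linkage, whichever statement clears the other fields.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct rte_devargs: list linkage plus payload. */
struct demo_devargs {
	TAILQ_ENTRY(demo_devargs) next; /* list pointers live inside the struct */
	int policy;
};

TAILQ_HEAD(demo_list, demo_devargs);

/*
 * Update a listed entry by whole-struct assignment.
 * Returns 1 if the entry's list linkage survived the update, 0 if it
 * was overwritten by the (never inserted) source's linkage.
 */
int demo_linkage_survives_struct_copy(void)
{
	struct demo_list list = TAILQ_HEAD_INITIALIZER(list);
	struct demo_devargs listed = { .policy = 1 };
	struct demo_devargs incoming = { .policy = 2 }; /* linkage is all NULL */
	struct demo_devargs **prev_before;

	TAILQ_INSERT_TAIL(&list, &listed, next);
	prev_before = listed.next.tqe_prev; /* points at the list head */

	listed = incoming; /* copies incoming.next over listed.next as well */

	return listed.next.tqe_prev == prev_before;
}
```

In this sketch the function returns 0: the entry is still reachable from the list head, but its own back pointer is now NULL, so a later TAILQ_REMOVE() on it would dereference an invalid pointer.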

rte_devargs_insert() could manually copy the TAILQ_ENTRY pointers to *da before calling rte_devargs_reset() – that at least fixes the SPDK regression.  But it’s not clear to me how many of the other rte_devargs_reset() callsites added by this patch also need to be changed in some way.
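
A rough sketch of that workaround, again with hypothetical names (sketch_devargs, sketch_update_in_place) rather than the real DPDK structures: the live linkage is carried into the incoming struct first, so the struct assignment writes the same pointers back. This is only an illustration of the idea, not the actual fix.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for struct rte_devargs. */
struct sketch_devargs {
	TAILQ_ENTRY(sketch_devargs) next;
	int policy;
};

TAILQ_HEAD(sketch_list, sketch_devargs);

/* Overwrite *listed with *incoming while keeping listed's place in the list. */
void sketch_update_in_place(struct sketch_devargs *listed,
			    struct sketch_devargs *incoming)
{
	incoming->next = listed->next; /* carry the live linkage into the source */
	*listed = *incoming;           /* the copy now writes back the same linkage */
}

/*
 * Insert three entries, update the middle one in place, remove it, and
 * count what is left. Returns -1 if the update or the relinking went wrong.
 */
int sketch_update_then_remove(void)
{
	struct sketch_list list = TAILQ_HEAD_INITIALIZER(list);
	struct sketch_devargs a = { .policy = 1 };
	struct sketch_devargs b = { .policy = 2 };
	struct sketch_devargs c = { .policy = 3 };
	struct sketch_devargs incoming = { .policy = 20 };
	struct sketch_devargs *it;
	int count = 0;

	TAILQ_INSERT_TAIL(&list, &a, next);
	TAILQ_INSERT_TAIL(&list, &b, next);
	TAILQ_INSERT_TAIL(&list, &c, next);

	sketch_update_in_place(&b, &incoming);
	if (b.policy != 20)
		return -1;

	TAILQ_REMOVE(&list, &b, next); /* safe: b still has valid linkage */
	if (TAILQ_FIRST(&list) != &a || TAILQ_NEXT(&a, next) != &c)
		return -1;

	TAILQ_FOREACH(it, &list, next)
		count++;
	return count;
}
```

Copying the linkage into the source before the assignment keeps the single struct copy from the patch, at the cost of modifying the caller's struct just before it is freed.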

Thanks,

-Jim





* Re: [dpdk-dev] Bug with commit 64051bb1 (devargs: unify scratch buffer storage)
  2021-04-16 22:04 [dpdk-dev] Bug with commit 64051bb1 (devargs: unify scratch buffer storage) Harris, James R
@ 2021-04-17 14:59 ` Xueming(Steven) Li
  0 siblings, 0 replies; 2+ messages in thread
From: Xueming(Steven) Li @ 2021-04-17 14:59 UTC
  To: Harris, James R; +Cc: dev

Hi Jim,

> From: Harris, James R <james.r.harris@intel.com> 
> Sent: Saturday, April 17, 2021 6:05 AM
> To: dev@dpdk.org; Xueming(Steven) Li <xuemingl@nvidia.com>
> Subject: Bug with commit 64051bb1 (devargs: unify scratch buffer storage)
> 
> Previously the data members were copied one-by-one, preserving the pointers in the listed_da’s TAILQ_ENTRY.  But after this patch, rte_devargs_reset() zeroes the entire rte_devargs structure, including the pointers in the TAILQ_ENTRY.  If we do a subsequent rte_devargs_remove() on this same entry, we segfault since the TAILQ_ENTRY’s pointers are invalid.  There could be similar segfaults with any subsequent rte_devargs_insert() calls that require iterating the global list of devargs entries.
> 
> rte_devargs_insert() could manually copy the TAILQ_ENTRY pointers to *da before calling rte_devargs_reset() – that at least fixes the SPDK regression.  But it’s not clear to me how many of the other rte_devargs_reset() callsites added by this patch also need to be changed in some way.

Thanks for reporting this issue; your fix should work. rte_devargs_reset() simply frees and clears the da->data field, not all of da.

I will send a patch to fix this, thanks again for pointing this out.




