DPDK patches and discussions
 help / color / mirror / Atom feed
From: Honnappa Nagarahalli <Honnappa.Nagarahalli@arm.com>
To: "thomas@monjalon.net" <thomas@monjalon.net>,
	Jerin Jacob <jerinjacobk@gmail.com>,
	"Ananyev, Konstantin" <konstantin.ananyev@intel.com>
Cc: "Richardson, Bruce" <bruce.richardson@intel.com>,
	"Morten Brørup" <mb@smartsharesystems.com>,
	"dev@dpdk.org" <dev@dpdk.org>,
	"olivier.matz@6wind.com" <olivier.matz@6wind.com>,
	"andrew.rybchenko@oktetlabs.ru" <andrew.rybchenko@oktetlabs.ru>,
	"Yigit, Ferruh" <ferruh.yigit@intel.com>,
	"jerinj@marvell.com" <jerinj@marvell.com>,
	"gakhil@marvell.com" <gakhil@marvell.com>, nd <nd@arm.com>,
	"Honnappa Nagarahalli" <Honnappa.Nagarahalli@arm.com>,
	nd <nd@arm.com>
Subject: Re: [dpdk-dev] [PATCH] parray: introduce internal API for dynamic arrays
Date: Tue, 15 Jun 2021 14:37:47 +0000	[thread overview]
Message-ID: <DBAPR08MB5814F5EB09E70348A22B769498309@DBAPR08MB5814.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <2133178.xeBKh9gUzh@thomas>

<snip>

> 
> 15/06/2021 12:08, Ananyev, Konstantin:
> > > 15/06/2021 11:33, Ananyev, Konstantin:
> > > > > 14/06/2021 17:48, Jerin Jacob:
> > > > > > On Mon, Jun 14, 2021 at 8:29 PM Ananyev, Konstantin
> > > > > > <konstantin.ananyev@intel.com> wrote:
> > > > > > > I had only a quick look at your approach so far.
> > > > > > > But from what I can read, in MT environment your suggestion
> > > > > > > will require extra synchronization for each read-write access to
> such parray element (lock, rcu, ...).
> > > > > > > I think what Bruce suggests will be much ligther, easier to
> implement and less error prone.
> > > > > > > At least for rte_ethdevs[] and friends.
> > > > > >
> > > > > > +1
> > > > >
> > > > > Please could you have a deeper look and tell me why we need more
> locks?
> > > > > The element pointers doesn't change.
> > > > > Only the array pointer change at resize,
> > > >
> > > > Yes, array pointer changes at resize, and reader has to read that
> > > > value to access elements in the parray. Which means that we need
> > > > some sync between readers and updaters to avoid reader using stale
> pointer (ref-counter, rcu, etc.).
> > >
> > > No
> > > The old array is still there, so we don't need sync.
> > >
> > > > I.E. updater can free old array pointer *only* when it can
> > > > guarantee that there are no readers that still use it.
> > >
> > > No
> > > Reading an element is OK because the pointer to the element is not
> changed.
> > > Getting the pointer to an element from the index is the only thing
> > > which is blocking the freeing of an array, and I see no reason why
> > > dereferencing an index would be longer than 2 consecutive resizes of
> > > the array.
> >
> > In general, your thread can be switched off the cpu at any moment.
> > And you don't know for sure when it will be scheduled back.
> >
> > >
> > > > > but the old one is still usable until the next resize.
> > > >
> > > > Ok, but what is the guarantee that reader would *always* finish till next
> resize?
> > > > As an example of such race condition:
> > > >
> > > > /* global one */
> > > > 	struct rte_parray pa;
> > > >
> > > > /* thread #1, tries to read elem from the array */
> > > >  	....
> > > > 	int **x = pa->array;
> > >
> > > We should not save the array pointer.
> > > Each index must be dereferenced with the macro getting the current
> > > array pointer.
> > > So the interrupt is during dereference of a single index.
> >
> > You still need to read your pa->array somewhere (let say into a register).
> > Straight after that your thread can be interrupted.
> > Then when it is scheduled back to the CPU that value (in a register) might be
> s stale one.
> >
> > >
> > > > /* thread # 1 get suspended for a while  at that point */
> > > >
> > > > /* meanwhile thread #2 does: */
> > > > 	....
> > > > 	/* causes first resize(), x still valid, points to pa->old_array */
> > > > 	rte_parray_alloc(&pa, ...);
> > > > 	.....
> > > > 	/* causes second resize(), x now points to freed memory */
> > > > 	rte_parray_alloc(&pa, ...);
> > > > 	...
> > >
> > > 2 resizes is a very long time, it is at minimum 33 allocations!
> > >
> > > > /* at that point thread #1 resumes: */
> > > >
> > > > 	/* contents of x[0] are undefined, 'p' could point anywhere,
> > > > 	     might cause segfault or silent memory corruption */
> > > > 	int *p = x[0];
> > > >
> > > >
> > > > Yes probability of such situation is quite small.
> > > > But it is still possible.
> > >
> > > In device probing, I don't see how it is realistically possible:
> > > 33 device allocations during 1 device index being dereferenced.
> >
> > Yeh, it would work fine 1M times, but sometimes will crash.
> 
> Sometimes a thread will be interrupted during 33 device allocations?
> 
> > Which will make it even harder to reproduce, debug and fix.
> > I think that when introducing a new generic library into DPDK, we
> > should avoid making such assumptions.
> 
> I intend to make it internal-only (I should have named it eal_parray).
> 
> > > I agree it is tricky, but that's the whole point of finding tricks
> > > to keep fast code.
> >
> > It is not tricky, it is buggy 😊
> > You introducing a race condition into the new core generic library by
> > design, and trying to convince people that it is *OK*.
> 
> Yes, because I am convinced myself.
> 
> > Sorry, but NACK from me till that issue will be addressed.
Agree here that a synchronization mechanism is required to indicate when it is safe to free the old array. An ACK from the readers is required to free the old array. We cannot use "enough time has passed" argument.

As others have mentioned, I think the key is the use case. Not all use cases require a dynamically resized array. Dynamically allocated array at init time would be enough.

If a dynamically resized array is required, using RCU (or any other mechanism) is necessary. I do not think these use cases should be characterized by the size of the memory/array in question (it might be a small chunk in a system with abundant memory, but might be a big chunk in a system with small amount of memory). The current RCU library provides good options to hide complexities from the application or allow the application to handle complexities if it wants.

> 
> It is not an issue, but a design.
> If you think that a thread can be interrupted during 33 device allocations then
> we should find another implementation, but I am quite sure it will be slower.
> 


  reply	other threads:[~2021-06-15 14:38 UTC|newest]

Thread overview: 61+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-06-14 10:58 Thomas Monjalon
2021-06-14 12:22 ` Morten Brørup
2021-06-14 13:15   ` Bruce Richardson
2021-06-14 13:32     ` Thomas Monjalon
2021-06-14 14:59       ` Ananyev, Konstantin
2021-06-14 15:48         ` Jerin Jacob
2021-06-15  6:52           ` Thomas Monjalon
2021-06-15  8:00             ` Jerin Jacob
2021-06-15  9:18               ` Thomas Monjalon
2021-06-15  9:33             ` Ananyev, Konstantin
2021-06-15  9:50               ` Thomas Monjalon
2021-06-15 10:08                 ` Ananyev, Konstantin
2021-06-15 14:02                   ` Thomas Monjalon
2021-06-15 14:37                     ` Honnappa Nagarahalli [this message]
2021-06-14 15:54         ` Ananyev, Konstantin
2021-06-17 13:08           ` Ferruh Yigit
2021-06-17 14:58             ` Ananyev, Konstantin
2021-06-17 15:17               ` Morten Brørup
2021-06-17 16:12                 ` Ferruh Yigit
2021-06-17 16:55                   ` Morten Brørup
2021-06-18 10:21                     ` Ferruh Yigit
2021-06-17 17:05                   ` Ananyev, Konstantin
2021-06-18  9:14                     ` Morten Brørup
2021-06-18 10:47                       ` Ferruh Yigit
2021-06-18 11:16                         ` Morten Brørup
2021-06-18 10:28                     ` Ferruh Yigit
2021-06-17 15:44               ` Ferruh Yigit
2021-06-18 10:41                 ` Ananyev, Konstantin
2021-06-18 10:49                   ` Ferruh Yigit
2021-06-21 11:06                   ` Ananyev, Konstantin
2021-06-21 12:10                     ` Morten Brørup
2021-06-21 12:30                       ` Ananyev, Konstantin
2021-06-21 13:28                         ` Morten Brørup
     [not found]                           ` <DM6PR11MB4491D4F6FAFDD6E8EEC2A78F9A099@DM6PR11MB4491.namprd11.prod.outlook .com>
2021-06-22  8:33                           ` Ananyev, Konstantin
2021-06-22 10:01                             ` Morten Brørup
2021-06-22 12:13                               ` Ananyev, Konstantin
2021-06-22 13:18                                 ` Morten Brørup
2021-06-21 14:10                         ` Ferruh Yigit
2021-06-21 14:38                           ` Ananyev, Konstantin
2021-06-21 15:56                             ` Ferruh Yigit
2021-06-21 18:17                               ` Ananyev, Konstantin
2021-06-21 14:05                     ` Ferruh Yigit
2021-06-21 14:42                       ` Ananyev, Konstantin
2021-06-21 15:32                         ` Ferruh Yigit
2021-06-21 15:37                           ` Ananyev, Konstantin
2021-06-14 15:48       ` Morten Brørup
2021-06-15  6:48         ` Thomas Monjalon
2021-06-15  7:53           ` Morten Brørup
2021-06-15  8:44             ` Bruce Richardson
2021-06-15  9:28               ` Thomas Monjalon
2021-06-16  9:42           ` Jerin Jacob
2021-06-16 11:27             ` Morten Brørup
2021-06-16 12:00               ` Jerin Jacob
2021-06-16 13:02               ` Bruce Richardson
2021-06-16 15:01                 ` Morten Brørup
2021-06-16 17:40                   ` Bruce Richardson
2021-06-16 12:22             ` Burakov, Anatoly
2021-06-16 12:59               ` Jerin Jacob
2021-06-16 22:58                 ` Dmitry Kozlyuk
2021-06-14 13:28   ` Thomas Monjalon
2021-06-16 11:11 ` Burakov, Anatoly

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DBAPR08MB5814F5EB09E70348A22B769498309@DBAPR08MB5814.eurprd08.prod.outlook.com \
    --to=honnappa.nagarahalli@arm.com \
    --cc=andrew.rybchenko@oktetlabs.ru \
    --cc=bruce.richardson@intel.com \
    --cc=dev@dpdk.org \
    --cc=ferruh.yigit@intel.com \
    --cc=gakhil@marvell.com \
    --cc=jerinj@marvell.com \
    --cc=jerinjacobk@gmail.com \
    --cc=konstantin.ananyev@intel.com \
    --cc=mb@smartsharesystems.com \
    --cc=nd@arm.com \
    --cc=olivier.matz@6wind.com \
    --cc=thomas@monjalon.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).