[dpdk-stable] 回复: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache

patches for DPDK stable branches
 help / color / mirror / Atom feed

From: Feifei Wang <Feifei.Wang2@arm.com>
To: Slava Ovsiienko <viacheslavo@nvidia.com>,
	Matan Azrad <matan@nvidia.com>,
	 Shahaf Shuler <shahafs@nvidia.com>
Cc: "dev@dpdk.org" <dev@dpdk.org>, nd <nd@arm.com>,
	"stable@dpdk.org" <stable@dpdk.org>,
	Ruifeng Wang <Ruifeng.Wang@arm.com>, nd <nd@arm.com>,
	nd <nd@arm.com>
Subject: [dpdk-stable] 回复: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache
Date: Sat, 8 May 2021 03:13:07 +0000	[thread overview]
Message-ID: <DB9PR08MB69234963F4135CECF542BA7CC8569@DB9PR08MB6923.eurprd08.prod.outlook.com> (raw)
In-Reply-To: <DM6PR12MB375397B5E116BC11D9D63CDCDF579@DM6PR12MB3753.namprd12.prod.outlook.com>

Hi, Slava

Thanks for your explanation.

Thus we can ignore the order between update global cache and
update dev_gen due to R/W lock. 

Furthermore, it is unnecessary to keep wmb, and the last wmb (1d)
I think can be removed. 
Two reasons for this:
1. wmb has only one function, this is for the local thread to keep
the write-write order. It cannot ensure write operation above it can be
seen by other threads.

2. rwunlock (1e) has a atomic_release operation in it, I think this release
operation  is  same as the last wmb : keep order.

Best Regards
Feifei

> -----邮件原件-----
> 发件人: Slava Ovsiienko <viacheslavo@nvidia.com>
> 发送时间: 2021年5月7日 18:15
> 收件人: Feifei Wang <Feifei.Wang2@arm.com>; Matan Azrad
> <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> 抄送: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org; Ruifeng Wang
> <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> 主题: RE: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region cache
> 
> Hi, Feifei
> 
> We should consider the locks in your scenario - it is crucial for the complete
> model description:
> 
> How agent_1 (in your terms) rebuilds global cache:
> 
> 1a) lock()
> 1b) rebuild(global cache)
> 1c) update(dev_gen)
> 1d) wmb()
> 1e) unlock()
> 
> How agent_2 checks:
> 
> 2a) check(dev_gen) (assume positive - changed)
> 2b) clear(local_cache)
> 2c) miss(on empty local_cache) - > eventually it goes to mr_lookup_caches()
> 2d) lock()
> 2e) get(new MR)
> 2f) unlock()
> 2g) update(local cache with obtained new MR)
> 
> Hence, even if 1c) becomes visible in 2a) before 1b) committed (say, due to
> out-of-order Arch) - the agent 2 would be blocked on 2d) and scenario
> depicted on your Fig2 would not happen (agent_2 will wait before step 3 till
> agent 1 unlocks after its step 5).
> 
> With best regards,
> Slava
> 
> > -----Original Message-----
> > From: Feifei Wang <Feifei.Wang2@arm.com>
> > Sent: Friday, May 7, 2021 9:36> To: Slava Ovsiienko
> > <viacheslavo@nvidia.com>; Matan Azrad <matan@nvidia.com>; Shahaf
> > Shuler <shahafs@nvidia.com>
> > Cc: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org; Ruifeng Wang
> > <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> > Subject: 回复: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory
> > Region cache
> >
> > Hi, Slava
> >
> > Thanks very much for your reply.
> >
> > > -----邮件原件-----
> > > 发件人: Slava Ovsiienko <viacheslavo@nvidia.com>
> > > 发送时间: 2021年5月6日 19:22
> > > 收件人: Feifei Wang <Feifei.Wang2@arm.com>; Matan Azrad
> > > <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> > > 抄送: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org; Ruifeng
> Wang
> > > <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> > > 主题: RE: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory Region
> > > cache
> > >
> > > Hi, Feifei
> > >
> > > Sorry, I do not follow why we should get rid of the last (after
> > > dev_gen update) wmb.
> > > We've rebuilt the global cache, we should notify other agents it's
> > > happened and they should flush local caches. So, dev_gen change
> > > should be made visible to other agents to trigger this activity and
> > > the second wmb is here to ensure this.
> >
> > 1. For the first problem why we should get rid of the last wmb and
> > move it before dev_gen updated, I think our attention is how the wmb
> > implements the synchronization between multiple agents.
> > 					Fig1
> > ----------------------------------------------------------------------
> > ------------------------
> > -------
> > Timeslot        		agent_1               		   agent_2
> > 1          		rebuild global cache
> > 2                       		wmb
> > 3            		     update dev_gen ----------------------- load changed
> > dev_gen
> > 4                                  			        	           rebuild local cache
> > ----------------------------------------------------------------------
> > ------------------------
> > -------
> >
> > First, wmb is only for local thread to keep the order between local
> > write- write :
> > Based on the picture above, for agent_1, wmb keeps the order that
> > rebuilding global cache is always before updating dev_gen.
> >
> > Second, agent_1 communicates with agent_2 by the global variable
> > "dev_gen" :
> > If agent_1 updates dev_gen, agent_2 will load it and then it knows it
> > should rebuild local cache
> >
> > Finally, agent_2 rebuilds local cache according to whether agent_1 has
> > rebuilt global cache, and agent_2 knows this information by the variable
> "dev_gen".
> > 					Fig2
> > ----------------------------------------------------------------------
> > ------------------------
> > -------
> > Timeslot        		agent_1               		   agent_2
> > 1		        update dev_gen
> > 2						      load changed dev_gen
> > 3						          rebuild local cache
> > 4        		    rebuild global cache
> > 5			 wmb
> > ----------------------------------------------------------------------
> > ------------------------
> > -------
> >
> > However, in arm platform, if wmb is after dev_gen updated, "dev_gen"
> > may be updated before agent_1 rebuilding global cache, then agent_2
> > maybe receive error message and rebuild its local cache in advance.
> >
> > To summarize, it is not important which time other agents can see the
> > changed global variable "dev_gen".
> > (Actually, wmb after "dev_gen" cannot ensure changed "dev_gen" is
> > committed to the global).
> > It is more important that if other agents see the changed "dev_gen",
> > they also can know global cache has been updated.
> >
> > > One more point, due to registering new/destroying existing MR
> > > involves FW (via kernel) calls, it takes so many CPU cycles that we
> > > could neglect wmb overhead at all.
> >
> > We just move the last wmb into the right place, and not delete it for
> > performance.
> >
> > >
> > > Also, regarding this:
> > >
> > >  > > Another question suddenly occurred to me, in order to keep the
> > > >
> > > > order that rebuilding global cache before updating ”dev_gen“, the
> > > > > wmb should be before updating "dev_gen" rather than after it.
> > >  > > Otherwise, in the out-of-order platforms, current order cannot
> > > be
> > kept.
> > >
> > > it is not clear why ordering is important - global cache update and
> > > dev_gen change happen under spinlock protection, so only the last
> > > wmb is meaningful.
> > >
> >
> > 2. The second function of wmb before "dev_gen" updated is for
> > performance according to our previous discussion.
> > According to Fig2, if there is no wmb between "global cache updated"
> > and "dev_gen updated", "dev_gen" may update before global cache
> updated.
> >
> > Then agent_2 may see the changed "dev_gen" and flush entire local
> > cache in advance.
> >
> > This entire flush can degrade the performance:
> > "the local cache is getting empty and can't provide translation for
> > other valid (not being removed) MRs, and the translation has to look
> > up in the global cache, that is locked now for rebuilding, this causes
> > the delays in data path on acquiring global cache lock."
> >
> > Furthermore, spinlock is just for global cache, not for dev_gen and
> > local cache.
> >
> > > To summarize, in my opinion:
> > > - if you see some issue with ordering of global cache update/dev_gen
> > > signalling,
> > >   could you, please, elaborate? I'm not sure we should maintain an
> > > order (due to spinlock protection)
> > > - the last rte_smp_wmb() after dev_gen incrementing should be kept
> > > intact
> > >
> >
> > At last, for my view, there are two functions that moving wmb before
> > "dev_gen"
> > for the write-write order:
> > --------------------------------
> > a) rebuild global cache;
> > b) rte_smp_wmb();
> > c) updating dev_gen
> > --------------------------------
> > 1. Achieve synchronization between multiple threads in the right way 2.
> > Prevent other agents from flushing local cache early to ensure
> > performance
> >
> > Best Regards
> > Feifei
> >
> > > With best regards,
> > > Slava
> > >
> > > > -----Original Message-----
> > > > From: Feifei Wang <Feifei.Wang2@arm.com>
> > > > Sent: Thursday, May 6, 2021 5:52
> > > > To: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad
> > > > <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> > > > Cc: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org; Ruifeng
> Wang
> > > > <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> > > > Subject: 回复: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory
> > > > Region cache
> > > >
> > > > Hi, Slava
> > > >
> > > > Would you have more comments about this patch?
> > > > For my sight, only one wmb before "dev_gen" updating is enough to
> > > > synchronize.
> > > >
> > > > Thanks very much for your attention.
> > > >
> > > >
> > > > Best Regards
> > > > Feifei
> > > >
> > > > > -----邮件原件-----
> > > > > 发件人: Feifei Wang
> > > > > 发送时间: 2021年4月20日 16:42
> > > > > 收件人: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad
> > > > > <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> > > > > 抄送: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org; Ruifeng
> > > Wang
> > > > > <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> > > > > 主题: 回复: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory
> > Region
> > > > > cache
> > > > >
> > > > > Hi, Slava
> > > > >
> > > > > I think the second wmb can be removed.
> > > > > As I know, wmb is just a barrier to keep the order between write
> > > > > and
> > > write.
> > > > > and it cannot tell the CPU when it should commit the changes.
> > > > >
> > > > > It is usually used before guard variable to keep the order that
> > > > > updating guard variable after some changes, which you want to
> > > > > release,
> > > > have been done.
> > > > >
> > > > > For example, for the wmb  after global cache update/before
> > > > > altering dev_gen, it can ensure the order that updating global
> > > > > cache before altering
> > > > > dev_gen:
> > > > > 1)If other agent load the changed "dev_gen", it can know the
> > > > > global cache has been updated.
> > > > > 2)If other agents load the unchanged, "dev_gen", it means the
> > > > > global cache has not been updated, and the local cache will not
> > > > > be
> > flushed.
> > > > >
> > > > > As a result, we use  wmb and guard variable "dev_gen" to ensure
> > > > > the global cache updating is "visible".
> > > > > The "visible" means when updating guard variable "dev_gen" is
> > > > > known by other agents, they also can confirm global cache has
> > > > > been updated in the meanwhile. Thus, just one wmb before
> > > > > altering dev_gen can ensure
> > > > this.
> > > > >
> > > > > Best Regards
> > > > > Feifei
> > > > >
> > > > > > -----邮件原件-----
> > > > > > 发件人: Slava Ovsiienko <viacheslavo@nvidia.com>
> > > > > > 发送时间: 2021年4月20日 15:54
> > > > > > 收件人: Feifei Wang <Feifei.Wang2@arm.com>; Matan Azrad
> > > > > > <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> > > > > > 抄送: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org;
> Ruifeng
> > > > Wang
> > > > > > <Ruifeng.Wang@arm.com>; nd <nd@arm.com>; nd
> <nd@arm.com>;
> > > nd
> > > > > > <nd@arm.com>
> > > > > > 主题: RE: [PATCH v1 3/4] net/mlx5: fix rebuild bug for Memory
> > > > > > Region cache
> > > > > >
> > > > > > Hi, Feifei
> > > > > >
> > > > > > In my opinion, there should be 2 barriers:
> > > > > >  - after global cache update/before altering dev_gen, to
> > > > > > ensure the correct order
> > > > > >  - after altering dev_gen to make this change visible for
> > > > > > other agents and to trigger local cache update
> > > > > >
> > > > > > With best regards,
> > > > > > Slava
> > > > > >
> > > > > > > -----Original Message-----
> > > > > > > From: Feifei Wang <Feifei.Wang2@arm.com>
> > > > > > > Sent: Tuesday, April 20, 2021 10:30
> > > > > > > To: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan Azrad
> > > > > > > <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> > > > > > > Cc: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org; Ruifeng
> > > > Wang
> > > > > > > <Ruifeng.Wang@arm.com>; nd <nd@arm.com>; nd
> > <nd@arm.com>;
> > > > nd
> > > > > > > <nd@arm.com>
> > > > > > > Subject: 回复: [PATCH v1 3/4] net/mlx5: fix rebuild bug for
> > > > > > > Memory Region cache
> > > > > > >
> > > > > > > Hi, Slava
> > > > > > >
> > > > > > > Another question suddenly occurred to me, in order to keep
> > > > > > > the order that rebuilding global cache before updating
> > > > > > > ”dev_gen“, the wmb should be before updating "dev_gen" rather
> than after it.
> > > > > > > Otherwise, in the out-of-order platforms, current order
> > > > > > > cannot be
> > > kept.
> > > > > > >
> > > > > > > Thus, we should change the code as:
> > > > > > > a) rebuild global cache;
> > > > > > > b) rte_smp_wmb();
> > > > > > > c) updating dev_gen
> > > > > > >
> > > > > > > Best Regards
> > > > > > > Feifei
> > > > > > > > -----邮件原件-----
> > > > > > > > 发件人: Feifei Wang
> > > > > > > > 发送时间: 2021年4月20日 13:54
> > > > > > > > 收件人: Slava Ovsiienko <viacheslavo@nvidia.com>; Matan
> Azrad
> > > > > > > > <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> > > > > > > > 抄送: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org;
> > > Ruifeng
> > > > > > Wang
> > > > > > > > <Ruifeng.Wang@arm.com>; nd <nd@arm.com>; nd
> > <nd@arm.com>
> > > > > > > > 主题: 回复: [PATCH v1 3/4] net/mlx5: fix rebuild bug for
> > > > > > > > Memory
> > > > > Region
> > > > > > > > cache
> > > > > > > >
> > > > > > > > Hi, Slava
> > > > > > > >
> > > > > > > > Thanks very much for your explanation.
> > > > > > > >
> > > > > > > > I can understand the app can wait all mbufs are returned
> > > > > > > > to the memory pool, and then it can free this mbufs, I
> > > > > > > > agree with
> > this.
> > > > > > > >
> > > > > > > > As a result, I will remove the bug fix patch from this
> > > > > > > > series and just replace the smp barrier with C11 thread
> > > > > > > > fence. Thanks very much for your patient explanation again.
> > > > > > > >
> > > > > > > > Best Regards
> > > > > > > > Feifei
> > > > > > > >
> > > > > > > > > -----邮件原件-----
> > > > > > > > > 发件人: Slava Ovsiienko <viacheslavo@nvidia.com>
> > > > > > > > > 发送时间: 2021年4月20日 2:51
> > > > > > > > > 收件人: Feifei Wang <Feifei.Wang2@arm.com>; Matan Azrad
> > > > > > > > > <matan@nvidia.com>; Shahaf Shuler <shahafs@nvidia.com>
> > > > > > > > > 抄送: dev@dpdk.org; nd <nd@arm.com>; stable@dpdk.org;
> > > > Ruifeng
> > > > > > > Wang
> > > > > > > > > <Ruifeng.Wang@arm.com>; nd <nd@arm.com>
> > > > > > > > > 主题: RE: [PATCH v1 3/4] net/mlx5: fix rebuild bug for
> > > > > > > > > Memory Region cache
> > > > > > > > >
> > > > > > > > > Hi, Feifei
> > > > > > > > >
> > > > > > > > > Please, see below
> > > > > > > > >
> > > > > > > > > ....
> > > > > > > > >
> > > > > > > > > > > Hi, Feifei
> > > > > > > > > > >
> > > > > > > > > > > Sorry, I do not follow what this patch fixes. Do we
> > > > > > > > > > > have some issue/bug with MR cache in practice?
> > > > > > > > > >
> > > > > > > > > > This patch fixes the bug which is based on logical
> > > > > > > > > > deduction, and it doesn't actually happen.
> > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Each Tx queue has its own dedicated "local" cache
> > > > > > > > > > > for MRs to convert buffer address in mbufs being
> > > > > > > > > > > transmitted to LKeys (HW-related entity
> > > > > > > > > > > handle) and the "global" cache for all MR registered
> > > > > > > > > > > on the
> > > > device.
> > > > > > > > > > >
> > > > > > > > > > > AFAIK, how conversion happens in datapath:
> > > > > > > > > > > - check the local queue cache flush request
> > > > > > > > > > > - lookup in local cache
> > > > > > > > > > > - if not found:
> > > > > > > > > > > - acquire lock for global cache read access
> > > > > > > > > > > - lookup in global cache
> > > > > > > > > > > - release lock for global cache
> > > > > > > > > > >
> > > > > > > > > > > How cache update on memory freeing/unregistering
> > happens:
> > > > > > > > > > > - acquire lock for global cache write access
> > > > > > > > > > > - [a] remove relevant MRs from the global cache
> > > > > > > > > > > - [b] set local caches flush request
> > > > > > > > > > > - free global cache lock
> > > > > > > > > > >
> > > > > > > > > > > If I understand correctly, your patch swaps [a] and
> > > > > > > > > > > [b], and local caches flush is requested earlier.
> > > > > > > > > > > What problem does it
> > > > solve?
> > > > > > > > > > > It is not supposed there are in datapath some mbufs
> > > > > > > > > > > referencing to the memory being freed. Application
> > > > > > > > > > > must ensure this and must not allocate new mbufs
> > > > > > > > > > > from this memory regions
> > > > > > being freed.
> > > > > > > > > > > Hence, the lookups for these MRs in caches should
> > > > > > > > > > > not
> > occur.
> > > > > > > > > >
> > > > > > > > > > For your first point that, application can take charge
> > > > > > > > > > of preventing MR freed memory being allocated to data path.
> > > > > > > > > >
> > > > > > > > > > Does it means that If there is an emergency of MR
> > > > > > > > > > fragment, such as hotplug, the application must inform
> > > > > > > > > > thedata path in advance, and this memory will not be
> > > > > > > > > > allocated, and then the control path will free this
> > > > > > > > > > memory? If application  can do like this, I agree that
> > > > > > > > > > this bug
> > > > > > > > cannot happen.
> > > > > > > > >
> > > > > > > > > Actually,  this is the only correct way for application to operate.
> > > > > > > > > Let's suppose we have some memory area that application
> > > > > > > > > wants to
> > > > > > free.
> > > > > > > > > ALL references to this area must be removed. If we have
> > > > > > > > > some mbufs allocated from this area, it means that we
> > > > > > > > > have memory pool created
> > > > > > > there.
> > > > > > > > >
> > > > > > > > > What application should do:
> > > > > > > > > - notify all its components/agents the memory area is
> > > > > > > > > going to be freed
> > > > > > > > > - all components/agents free the mbufs they might own
> > > > > > > > > - PMD might not support freeing for some mbufs (for
> > > > > > > > > example being sent and awaiting for completion), so app
> > > > > > > > > should just wait
> > > > > > > > > - wait till all mbufs are returned to the memory pool
> > > > > > > > > (by monitoring available obj == pool size)
> > > > > > > > >
> > > > > > > > > Otherwise - it is dangerous to free the memory. There
> > > > > > > > > are just some mbufs still allocated, it is regardless to
> > > > > > > > > buf address to MR translation. We just can't free the
> > > > > > > > > memory - the mapping will be destroyed and might cause
> > > > > > > > > the segmentation fault by SW or some HW issues on DMA
> > > > > > > > > access to unmapped memory.  It is very generic safety
> > > > > > > > > approach - do not free the memory that is still in
> > > > use.
> > > > > > > > > Hence, at the moment of freeing and unregistering the
> > > > > > > > > MR, there MUST BE NO any
> > > > > > > > mbufs in flight referencing to the addresses being freed.
> > > > > > > > > No translation to MR being invalidated can happen.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > > For other side, the cache flush has negative effect
> > > > > > > > > > > - the local cache is getting empty and can't provide
> > > > > > > > > > > translation for other valid (not being removed) MRs,
> > > > > > > > > > > and the translation has to look up in the global
> > > > > > > > > > > cache, that is locked now for rebuilding, this
> > > > > > > > > > > causes the delays in datapatch
> > > > > > > > > > on acquiring global cache lock.
> > > > > > > > > > > So, I see some potential performance impact.
> > > > > > > > > >
> > > > > > > > > > If above assumption is true, we can go to your second point.
> > > > > > > > > > I think this is a problem of the tradeoff between
> > > > > > > > > > cache coherence and
> > > > > > > > > performance.
> > > > > > > > > >
> > > > > > > > > > I can understand your meaning that though global cache
> > > > > > > > > > has been changed, we should keep the valid MR in local
> > > > > > > > > > cache as long as possible to ensure the fast searching speed.
> > > > > > > > > > In the meanwhile, the local cache can be rebuilt later
> > > > > > > > > > to reduce its waiting time for acquiring the global cache lock.
> > > > > > > > > >
> > > > > > > > > > However,  this mechanism just ensures the performance
> > > > > > > > > > unchanged for the first few mbufs.
> > > > > > > > > > During the next mbufs lkey searching after 'dev_gen'
> > > > > > > > > > updated, it is still necessary to update the local cache.
> > > > > > > > > > And the performance can firstly reduce and then returns.
> > > > > > > > > > Thus, no matter whether there is this patch or not,
> > > > > > > > > > the performance will jitter in a certain period of
> > > > > > > time.
> > > > > > > > >
> > > > > > > > > Local cache should be updated to remove MRs no longer valid.
> > > > > > > > > But we just flush the entire cache.
> > > > > > > > > Let's suppose we have valid MR0, MR1, and not valid MRX
> > > > > > > > > in local
> > > > > cache.
> > > > > > > > > And there are traffic in the datapath for MR0 and MR1,
> > > > > > > > > and no traffic for MRX anymore.
> > > > > > > > >
> > > > > > > > > 1) If we do as you propose:
> > > > > > > > > a) take a lock
> > > > > > > > > b) request flush local cache first - all MR0, MR1, MRX
> > > > > > > > > will be removed on translation in datapath
> > > > > > > > > c) update global cache,
> > > > > > > > > d) free lock
> > > > > > > > > All the traffic for valid MR0, MR1 ALWAYS will be
> > > > > > > > > blocked on lock taken for cache update since point b) till point
> d).
> > > > > > > > >
> > > > > > > > > 2) If we do as it is implemented now:
> > > > > > > > > a) take a lock
> > > > > > > > > b) update global cache
> > > > > > > > > c) request flush local cache
> > > > > > > > > d) free lock
> > > > > > > > > The traffic MIGHT be locked ONLY for MRs non-existing in
> > > > > > > > > local cache (not happens for MR0 and MR1, must not
> > > > > > > > > happen for MRX), and probability should be minor. And
> > > > > > > > > lock might happen since
> > > > > > > > > c) till
> > > > > > > > > d)
> > > > > > > > > - quite short period of time
> > > > > > > > >
> > > > > > > > > Summary, the difference between 1) and 2)
> > > > > > > > >
> > > > > > > > > Lock probability:
> > > > > > > > > - 1) lock ALWAYS happen for ANY MR translation after b),
> > > > > > > > >   2) lock MIGHT happen, for cache miss ONLY, after c)
> > > > > > > > >
> > > > > > > > > Lock duration:
> > > > > > > > > - 1) lock since b) till d),
> > > > > > > > >   2) lock since c) till d), that seems to be  much shorter.
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Finally, in conclusion, I tend to think that the
> > > > > > > > > > bottom layer can do more things to ensure the correct
> > > > > > > > > > execution of the program, which may have a negative
> > > > > > > > > > impact on the performance in a short time, but in the
> > > > > > > > > > long run, the performance will eventually
> > > > > > come back.
> > > > > > > > > > Furthermore, maybe we should pay attention to the
> > > > > > > > > > performance in the stable period, and try our best to
> > > > > > > > > > ensure the correctness of the program in case of
> > > > > > > > > emergencies.
> > > > > > > > >
> > > > > > > > > If we have some mbufs still allocated in memory being
> > > > > > > > > freed
> > > > > > > > > - there is nothing to say about correctness, it is
> > > > > > > > > totally incorrect. In my opinion, we should not think
> > > > > > > > > how to mitigate this incorrect behavior, we should not
> > > > > > > > > encourage application developers to follow the wrong
> > > > > > > > approaches.
> > > > > > > > >
> > > > > > > > > With best regards,
> > > > > > > > > Slava
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Best Regards
> > > > > > > > > > Feifei
> > > > > > > > > >
> > > > > > > > > > > With best regards,
> > > > > > > > > > > Slava

next prev parent reply	other threads:[~2021-05-08  3:13 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
     [not found] <20210318071840.359957-1-feifei.wang2@arm.com>
2021-03-18  7:18 ` [dpdk-stable] [PATCH v1 1/4] net/mlx4: " Feifei Wang
2021-03-18  7:18 ` [dpdk-stable] [PATCH v1 3/4] net/mlx5: " Feifei Wang
2021-04-12  8:27   ` Slava Ovsiienko
2021-04-13  5:20     ` [dpdk-stable] 回复: " Feifei Wang
2021-04-19 18:50       ` [dpdk-stable] " Slava Ovsiienko
2021-04-20  5:53         ` [dpdk-stable] 回复: " Feifei Wang
2021-04-20  7:29           ` Feifei Wang
2021-04-20  7:53             ` [dpdk-stable] " Slava Ovsiienko
2021-04-20  8:42               ` [dpdk-stable] 回复: " Feifei Wang
2021-05-06  2:52                 ` Feifei Wang
2021-05-06 11:21                   ` [dpdk-stable] " Slava Ovsiienko
2021-05-07  6:36                     ` [dpdk-stable] 回复: " Feifei Wang
2021-05-07 10:14                       ` [dpdk-stable] " Slava Ovsiienko
2021-05-08  3:13                         ` Feifei Wang [this message]
2021-05-11  8:18                           ` Slava Ovsiienko
2021-05-12  5:34                             ` [dpdk-stable] 回复: " Feifei Wang
2021-05-12 11:07                               ` [dpdk-stable] " Slava Ovsiienko
2021-05-13  5:49                                 ` [dpdk-stable] 回复: " Feifei Wang
2021-05-13 10:49                                   ` [dpdk-stable] " Slava Ovsiienko
2021-05-14  5:18                                     ` [dpdk-stable] 回复: " Feifei Wang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DB9PR08MB69234963F4135CECF542BA7CC8569@DB9PR08MB6923.eurprd08.prod.outlook.com \
    --to=feifei.wang2@arm.com \
    --cc=Ruifeng.Wang@arm.com \
    --cc=dev@dpdk.org \
    --cc=matan@nvidia.com \
    --cc=nd@arm.com \
    --cc=shahafs@nvidia.com \
    --cc=stable@dpdk.org \
    --cc=viacheslavo@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).