From: "Tan, Jianfeng" <jianfeng.tan@intel.com>
To: "Lazarenko, Vlad (WorldQuant)" <Vlad.Lazarenko@worldquant.com>,
"'users@dpdk.org'" <users@dpdk.org>
Subject: Re: [dpdk-users] Multi-process recovery (is it even possible?)
Date: Fri, 2 Mar 2018 00:39:58 +0000 [thread overview]
Message-ID: <ED26CBA2FAD1BF48A8719AEF02201E365145A788@SHSMSX103.ccr.corp.intel.com> (raw)
In-Reply-To: <790E2AC11206AC46B8F4BB82078E34F8081E29C2@PWSSMTEXMBX002.AD.MLP.com>
> -----Original Message-----
> From: Lazarenko, Vlad (WorldQuant)
> [mailto:Vlad.Lazarenko@worldquant.com]
> Sent: Thursday, March 1, 2018 10:53 PM
> To: Tan, Jianfeng; 'users@dpdk.org'
> Subject: RE: Multi-process recovery (is it even possible?)
>
> Hello Jianfeng,
>
> Thanks for getting back to me. I thought about using "udata64", too. But that
> didn't work for me if a single packet was fanned out to multiple slave
> processes. But most importantly, it looks like if a slave process crashes
> somewhere in the middle of getting or putting packets from/to a pool, we
> could end up with a deadlock. So I guess I'd have to think about a different
> design or be ready to bounce all of the processes if one of them fails.
OK, a better design that avoids such a hard issue is a good way to go. Good luck!
Thanks,
Jianfeng
>
> Thanks,
> Vlad
>
> > -----Original Message-----
> > From: Tan, Jianfeng [mailto:jianfeng.tan@intel.com]
> > Sent: Thursday, March 01, 2018 3:20 AM
> > To: Lazarenko, Vlad (WorldQuant); 'users@dpdk.org'
> > Subject: RE: Multi-process recovery (is it even possible?)
> >
> >
> >
> > > -----Original Message-----
> > > From: users [mailto:users-bounces@dpdk.org] On Behalf Of Lazarenko,
> > > Vlad
> > > (WorldQuant)
> > > Sent: Thursday, March 1, 2018 2:54 AM
> > > To: 'users@dpdk.org'
> > > Subject: [dpdk-users] Multi-process recovery (is it even possible?)
> > >
> > > Guys,
> > >
> > > I am looking for possible solutions to the following problems that
> > > come along with an asymmetric multi-process architecture...
> > >
> > > Given that multiple processes share the same RX/TX queue(s) and packet
> > > pool(s), and that one packet from an RX queue may be fanned out to
> > > multiple slave processes, is there a way to recover from a slave
> > > crashing (or exiting w/o cleaning up properly)? In theory it could have
> > > incremented the mbuf reference count more than once, and unless everything
> > > is restarted, I don't see a reliable way to release those mbufs back to the
> > > pool.
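A minimal sketch (not from the original mail) of the fan-out/refcount pattern being described; nb_secondaries and fanout_ring[] are hypothetical names for however the packets reach the slaves:

  /* Primary: one extra reference per additional consumer of the same mbuf. */
  rte_mbuf_refcnt_update(m, nb_secondaries - 1);   /* refcnt == nb_secondaries */
  for (i = 0; i < nb_secondaries; i++)
      rte_ring_enqueue(fanout_ring[i], m);

  /* Each secondary, when finished with the packet: */
  rte_pktmbuf_free(m);   /* drops one reference; the mbuf returns to the pool only
                          * when the count hits zero, so a secondary that crashes
                          * before this call leaks its reference for good. */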
> >
> > Recycling an element is too difficult; from what I know, it's next to impossible.
> > Recycling a whole memzone/mempool is easier. So in your case, you might want
> > to use different pools for different queues (processes).
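A minimal sketch of that pool-per-process layout, assuming the primary names each pool after a per-secondary id; proc_id and the pool sizes are made-up values:

  #include <stdio.h>
  #include <rte_lcore.h>
  #include <rte_mbuf.h>
  #include <rte_mempool.h>

  /* Primary: a dedicated pool per secondary, so a crashed secondary only
   * poisons its own pool, which can then be abandoned or re-created. */
  char name[RTE_MEMPOOL_NAMESIZE];
  snprintf(name, sizeof(name), "mbuf_pool_%u", proc_id);
  struct rte_mempool *pool = rte_pktmbuf_pool_create(name, 8192, 256, 0,
          RTE_MBUF_DEFAULT_BUF_SIZE, rte_socket_id());

  /* Secondary: attach to its own pool by name rather than to a shared one. */
  struct rte_mempool *my_pool = rte_mempool_lookup(name);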
> >
> > If you really want to recycle an element, an rte_mbuf in your case, it might be
> > doable by:
> > 1. setting up an rx callback for each process and, in the callback, storing a
> > special flag at rte_mbuf->udata64;
> > 2. when the primary detects that a secondary is down, iterating over all elements
> > with the special flag and putting them back into the ring.
> >
> > There is a small chance this fails: an mbuf is allocated by a secondary process,
> > and the process crashes before the mbuf is flagged.
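A minimal sketch of that flagging scheme, assuming a DPDK of this era where rte_mbuf still carries the udata64 field; the magic value, port/queue ids, and the crash-detection trigger are all made up, and the fragility noted above (mbufs not yet flagged, or freed with the flag still set) is not solved here:

  #include <rte_ethdev.h>
  #include <rte_mbuf.h>
  #include <rte_mempool.h>

  #define PROC_MAGIC 0xBADC0DEULL          /* hypothetical per-process flag value */

  /* 1. Rx callback in each secondary: tag every mbuf it receives. */
  static uint16_t
  tag_cb(uint16_t port, uint16_t queue, struct rte_mbuf *pkts[],
         uint16_t nb_pkts, uint16_t max_pkts, void *arg)
  {
      uint16_t i;
      for (i = 0; i < nb_pkts; i++)
          pkts[i]->udata64 = PROC_MAGIC;
      return nb_pkts;
  }
  /* registered with: rte_eth_add_rx_callback(port, queue, tag_cb, NULL); */

  /* 2. Primary, once it decides the secondary is dead: walk the pool and free
   * every object still carrying that secondary's flag. */
  static void
  reclaim_cb(struct rte_mempool *mp, void *arg, void *obj, unsigned int idx)
  {
      struct rte_mbuf *m = obj;
      if (m->udata64 == PROC_MAGIC) {
          m->udata64 = 0;          /* clear so it is not reclaimed twice */
          rte_pktmbuf_free(m);
      }
  }
  /* invoked with: rte_mempool_obj_iter(pool, reclaim_cb, NULL); */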
> >
> > Thanks,
> > Jianfeng
> >
> >
> > >
> > > Also, if a spinlock is involved and either the master or a slave crashes,
> > > everything simply gets stuck. Is there any way to detect this (i.e. outside
> > > of the data path)?
> > >
> > > Thanks,
> > > Vlad
> > >
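On the spinlock question above: rte_spinlock has no notion of a robust owner, so one possible workaround (purely a sketch, not a DPDK facility) is to keep the holder's pid next to the lock in shared memory and have a monitor outside the data path check liveness with kill(pid, 0):

  #include <errno.h>
  #include <signal.h>
  #include <rte_spinlock.h>

  /* Hypothetical lock wrapper kept in a shared memzone; the locker must set
   * owner to its own pid right after acquiring, and clear it before release. */
  struct owned_lock {
      rte_spinlock_t lock;
      pid_t owner;                 /* 0 when unlocked */
  };

  /* Monitor: a lock held by a process that no longer exists means the
   * application is stuck and must be recovered. */
  static int
  lock_holder_dead(const struct owned_lock *l)
  {
      pid_t owner = l->owner;
      return owner != 0 && kill(owner, 0) != 0 && errno == ESRCH;
  }

Even when detection works, safely releasing a lock held by a dead process is another matter, so in practice this still tends to end in bouncing all of the processes, as discussed above.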
Thread overview: 4+ messages
2018-02-28 18:53 Lazarenko, Vlad (WorldQuant)
2018-03-01 8:19 ` Tan, Jianfeng
2018-03-01 14:53 ` Lazarenko, Vlad (WorldQuant)
2018-03-02 0:39 ` Tan, Jianfeng [this message]