From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <konstantin.ananyev@intel.com>
Received: from mga14.intel.com (mga14.intel.com [192.55.52.115])
 by dpdk.org (Postfix) with ESMTP id 7FAEE9AB0
 for <dev@dpdk.org>; Wed,  8 Jun 2016 11:19:48 +0200 (CEST)
Received: from fmsmga004.fm.intel.com ([10.253.24.48])
 by fmsmga103.fm.intel.com with ESMTP; 08 Jun 2016 02:19:48 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.26,438,1459839600"; d="scan'208";a="118152672"
Received: from irsmsx151.ger.corp.intel.com ([163.33.192.59])
 by fmsmga004.fm.intel.com with ESMTP; 08 Jun 2016 02:19:46 -0700
Received: from irsmsx105.ger.corp.intel.com ([169.254.7.51]) by
 IRSMSX151.ger.corp.intel.com ([169.254.4.151]) with mapi id 14.03.0248.002;
 Wed, 8 Jun 2016 10:19:46 +0100
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Lu, Wenzhuo" <wenzhuo.lu@intel.com>, "Tao, Zhe" <zhe.tao@intel.com>,
 "dev@dpdk.org" <dev@dpdk.org>
CC: "Richardson, Bruce" <bruce.richardson@intel.com>, "Chen, Jing D"
 <jing.d.chen@intel.com>, "Liang, Cunming" <cunming.liang@intel.com>, "Wu,
 Jingjing" <jingjing.wu@intel.com>, "Zhang, Helin" <helin.zhang@intel.com>
Thread-Topic: [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock mode
Thread-Index: AQHRwIlPe9ZormIF7U2HZ+7hR7jCn5/du9hggAFfxoCAACa6QA==
Date: Wed, 8 Jun 2016 09:19:45 +0000
Message-ID: <2601191342CEEE43887BDE71AB97725836B6CCE1@irsmsx105.ger.corp.intel.com>
References: <1465278858-5131-1-git-send-email-zhe.tao@intel.com>
 <1465282390-6025-1-git-send-email-zhe.tao@intel.com>
 <1465282390-6025-3-git-send-email-zhe.tao@intel.com>
 <2601191342CEEE43887BDE71AB97725836B6C44E@irsmsx105.ger.corp.intel.com>
 <6A0DE07E22DDAD4C9103DF62FEBC090903483A2C@shsmsx102.ccr.corp.intel.com>
In-Reply-To: <6A0DE07E22DDAD4C9103DF62FEBC090903483A2C@shsmsx102.ccr.corp.intel.com>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [163.33.239.180]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock
	mode
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 08 Jun 2016 09:19:49 -0000



>=20
> Hi Konstantin,
>=20
>=20
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Tuesday, June 7, 2016 5:59 PM
> > To: Tao, Zhe; dev@dpdk.org
> > Cc: Lu, Wenzhuo; Richardson, Bruce; Chen, Jing D; Liang, Cunming; Wu, J=
ingjing;
> > Zhang, Helin
> > Subject: RE: [PATCH v4 2/8] lib/librte_ether: defind RX/TX lock mode
> >
> >
> > Hi Zhe & Wenzhuo,
> >
> > Please find my comments below.
> > BTW, for clarification - is that patch for 16.11?
> > I believe it's too late to introduce such significant change in 16.07.
> > Thanks
> > Konstantin
> Thanks for the comments.
> Honestly, our purpose is 16.07. Realizing the big impact, we use NEXT_ABI=
 to comment our change. So, I think although we want to
> merge it in 16.07 this change will become effective after we remove NEXT_=
ABI in 16.11.

I don't think it is achievable.
First I think your code is not in proper shape yet, right now.
Second, as you said, it is a significant change and I would like to hear op=
inions from the rest of the community.

>=20
> >
> > > Define lock mode for RX/TX queue. Because when resetting the device w=
e
> > > want the resetting thread to get the lock of the RX/TX queue to make
> > > sure the RX/TX is stopped.
> > >
> > > Using next ABI macro for this ABI change as it has too much impact. 7
> > > APIs and 1 global variable are impacted.
> > >
> > > Signed-off-by: Wenzhuo Lu <wenzhuo.lu@intel.com>
> > > Signed-off-by: Zhe Tao <zhe.tao@intel.com>
> > > ---
> > >  lib/librte_ether/rte_ethdev.h | 62
> > > +++++++++++++++++++++++++++++++++++++++++++
> > >  1 file changed, 62 insertions(+)
> > >
> > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > b/lib/librte_ether/rte_ethdev.h index 74e895f..4efb5e9 100644
> > > --- a/lib/librte_ether/rte_ethdev.h
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -354,7 +354,12 @@ struct rte_eth_rxmode {
> > >  		jumbo_frame      : 1, /**< Jumbo Frame Receipt enable. */
> > >  		hw_strip_crc     : 1, /**< Enable CRC stripping by hardware. */
> > >  		enable_scatter   : 1, /**< Enable scatter packets rx handler */
> > > +#ifndef RTE_NEXT_ABI
> > >  		enable_lro       : 1; /**< Enable LRO */
> > > +#else
> > > +		enable_lro       : 1, /**< Enable LRO */
> > > +		lock_mode        : 1; /**< Using lock path */
> > > +#endif
> > >  };
> > >
> > >  /**
> > > @@ -634,11 +639,68 @@ struct rte_eth_txmode {
> > >  		/**< If set, reject sending out tagged pkts */
> > >  		hw_vlan_reject_untagged : 1,
> > >  		/**< If set, reject sending out untagged pkts */
> > > +#ifndef RTE_NEXT_ABI
> > >  		hw_vlan_insert_pvid : 1;
> > >  		/**< If set, enable port based VLAN insertion */
> > > +#else
> > > +		hw_vlan_insert_pvid : 1,
> > > +		/**< If set, enable port based VLAN insertion */
> > > +		lock_mode : 1;
> > > +		/**< If set, using lock path */
> > > +#endif
> > >  };
> > >
> > >  /**
> > > + * The macros for the RX/TX lock mode functions  */ #ifdef
> > > +RTE_NEXT_ABI #define RX_LOCK_FUNCTION(dev, func) \
> > > +	(dev->data->dev_conf.rxmode.lock_mode ? \
> > > +	func ## _lock : func)
> > > +
> > > +#define TX_LOCK_FUNCTION(dev, func) \
> > > +	(dev->data->dev_conf.txmode.lock_mode ? \
> > > +	func ## _lock : func)
> > > +#else
> > > +#define RX_LOCK_FUNCTION(dev, func) func
> > > +
> > > +#define TX_LOCK_FUNCTION(dev, func) func #endif
> > > +
> > > +/* Add the lock RX/TX function for VF reset */ #define
> > > +GENERATE_RX_LOCK(func, nic) \ uint16_t func ## _lock(void *rx_queue,
> > > +\
> > > +		      struct rte_mbuf **rx_pkts, \
> > > +		      uint16_t nb_pkts) \
> > > +{					\
> > > +	struct nic ## _rx_queue *rxq =3D rx_queue; \
> > > +	uint16_t nb_rx =3D 0; \
> > > +						\
> > > +	if (rte_spinlock_trylock(&rxq->rx_lock)) { \
> > > +		nb_rx =3D func(rx_queue, rx_pkts, nb_pkts); \
> > > +		rte_spinlock_unlock(&rxq->rx_lock); \
> > > +	} \
> > > +	\
> > > +	return nb_rx; \
> > > +}
> > > +
> > > +#define GENERATE_TX_LOCK(func, nic) \ uint16_t func ## _lock(void
> > > +*tx_queue, \
> > > +		      struct rte_mbuf **tx_pkts, \
> > > +		      uint16_t nb_pkts) \
> > > +{					\
> > > +	struct nic ## _tx_queue *txq =3D tx_queue; \
> > > +	uint16_t nb_tx =3D 0; \
> > > +						\
> > > +	if (rte_spinlock_trylock(&txq->tx_lock)) { \
> > > +		nb_tx =3D func(tx_queue, tx_pkts, nb_pkts); \
> > > +		rte_spinlock_unlock(&txq->tx_lock); \
> > > +	} \
> > > +	\
> > > +	return nb_tx; \
> > > +}
> >
> > 1. As I said in off-line dicussiion, I think this locking could (and I =
think better be)
> > impelented completely on rte_ethdev layer.
> > So actual PMD code will be unaffected.
> > Again that avoids us to introduce _lock version of every RX/Tx function=
 in each
> > PMD.
> One purpose of implementing the lock in PMD layer is to avoid ABI change.=
 But we introduce the field lock_mode in struct
> rte_eth_rx/txmode. So seems it's not a good reason now :)
> The other purpose is we want to add a lock for every queue. But in rte la=
yer the queue is void *, so we add the lock in the specific
> structures of the NICs. But as you mentioned below, we can add the lock a=
s dev->data->rx_queue_state it the struct
> rte_eth_dev_data.
> So, I prefer to add the lock in rte layer now.

OK.

>=20
> >
> > 2. Again, as discussed offline, I think it is better to have an explici=
t
> > rte_eth_(rx|tx)_burst_lock(sync?) API, instead of add new fileds into R=
X/TX
> > config strcutures.
> > Would help to avoid any confusion, I think.
> We want the users to choose the rx/tx path without  lock if they're sensi=
tive to the performance and can handle the reset event in
> their APP. After introducing new fields of config struct, users can chang=
e the config to choose the different path.

I understand what you are doing.

> If we introduce new API, it may be harder for the use to use it. I mean w=
hen users want to use lock mode, they may need to replace
> all the rte_eth_rx/tx_burst by rte_eth_rx/tx_burst_lock.=20

Yes, my opinion if users would like to use locking API they need to call it=
 explicitly.


>So if we add the lock in rte layer, I still prefer adding lock_mode in the
> configuration, and the rte_eth_rx/tx_burst is changed like this,
> rte_eth_rx/tx_burst
> {
> + if lock_mode
> + try_lock
> ......
> + if lock_mode
> + release_lock
> }

My preference is to keep existing rx/tx_burst() functions unaffected by tha=
t patch.
At least for now.
I suppose that will minimise the risks and help users to avoid confusion wh=
at API=20
(locking/non-locking) is in use.=20

>=20
>=20
> >
> > 3.  I thought the plan was to introduce a locking in all appropriate co=
ntrol path
> > functions (dev_start/dev_stop etc.) Without that locking version of RX/=
TX seems
> > a bit useless.
> > Yes, I understand that you do use locking inside dev_reset, but I suppo=
se the
> > plan was to have a generic solution, no?
> > Again, interrupt fire when user invokes dev_start/stop or so, so we sti=
ll need
> > some synchronisation between them.
> >
> > To be more specific, I thought about something like that:
> >
> > static inline uint16_t
> > rte_eth_rx_burst_lock(uint8_t port_id, uint16_t queue_id,
> >                  struct rte_mbuf **rx_pkts, const uint16_t nb_pkts) {
> >         struct rte_eth_dev *dev =3D &rte_eth_devices[port_id];
> >
> > #ifdef RTE_LIBRTE_ETHDEV_DEBUG
> >         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, 0);
> >         RTE_FUNC_PTR_OR_ERR_RET(*dev->rx_pkt_burst, 0);
> >
> >         if (queue_id >=3D dev->data->nb_rx_queues) {
> >                 RTE_PMD_DEBUG_TRACE("Invalid RX queue_id=3D%d\n", queue=
_id);
> >                 return 0;
> >         }
> > #endif
> >
> > + if (rte_spinlock_trylock(&dev->data->rx_queue_state[rx_queue_id].lock=
) =3D=3D 0)
> > +	return 0;
> > +  else if (dev->data->rx_queue_state[rx_queue_id] =3D=3D
> > RTE_ETH_QUEUE_STATE_STOPPED)) {
> > +	rte_spinlock_unlock(&dev->data->rx_queue_state[rx_queue_id].unlock);
> > +	return 0;
> > +
> >
> >  nb_rx =3D (*dev->rx_pkt_burst)(dev->data->rx_queues[queue_id],
> >                         rx_pkts, nb_pkts);
> >
> > + rte_spinlock_unlock(&dev->data->rx_queue_state[rx_queue_id].unlock);
> >
> > ....
> >
> > return nb_rx;
> > }
> >
> > And inside queue_start:
> >
> > int
> > rte_eth_dev_rx_queue_start(uint8_t port_id, uint16_t rx_queue_id) {
> >         struct rte_eth_dev *dev;
> >
> >         RTE_ETH_VALID_PORTID_OR_ERR_RET(port_id, -EINVAL);
> >
> >         dev =3D &rte_eth_devices[port_id];
> >         if (rx_queue_id >=3D dev->data->nb_rx_queues) {
> >                 RTE_PMD_DEBUG_TRACE("Invalid RX queue_id=3D%d\n", rx_qu=
eue_id);
> >                 return -EINVAL;
> >         }
> >
> >         RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_start, -ENOTSUP=
);
> >
> >      rte_spinlock_lock(&dev->data->rx_queue_state[rx_queue_id].lock)
> I think you add the lock here to stop the rx/tx.
> But to my opinion, we should lock the rx/tx much earlier before starting =
the queue. For example, when stop the port, the resource of
> the queues may be released.=20

I didn't get you here...
Before releasing the queue resources, queue_stop() has to be executed, righ=
t?=20

>The rx/tx cannot be executed. So I prefer to get the lock before stopping =
the ports.=20

Might be I wasn't clear enough here.
What I think we need to have:
 -To stop/start/rx/tx the queue (or do any other action that might change t=
he queue internal structure)
   you have to grab the lock.
   After queue is stopped it's state has to be changed to  QUEUE_STATE_STOP=
PED (whti queue lock grabbed),
   so rx/tx_locked wouldn't    proceed with that queue.
  - dev_stop() - has to stop all its queues first, i.e. it needs to call qu=
eue_stop() for all of them.
 So after dev_stop() had finished  - all device queues have to be in   QUEU=
E_STATE_STOPPED
Same about dev_start() - after it does all other things - it will call queu=
e_start() for all it's queues.
that will bring them into QUEUE_STARTED.
After that rx/tx_locked can use them again.

>Maybe better to keep the spinlock in the dev_reset.

Might be not :)

>=20
> >
> >         if (dev->data->rx_queue_state[rx_queue_id] !=3D
> > RTE_ETH_QUEUE_STATE_STOPPED) {
> >                 RTE_PMD_DEBUG_TRACE("Queue %" PRIu16" of device with
> > port_id=3D%" PRIu8
> >                         " already started\n",
> >                         rx_queue_id, port_id);
> >                 ret =3D -EINVAL 0;
> >         } else
> >         	ret =3D dev->dev_ops->rx_queue_start(dev, rx_queue_id);
> >
> >     rte_spinlock_unlock(&dev->data->rx_queue_state[rx_queue_id].unlock)=
;
> >
> >    return ret;
> > }
> >
> > Then again, we don't need to do explicit locking inside dev_reset().
> > Does it make sense to you guys?
> Please see the answer above.
>=20
> >
> >
> > > +
> > > +/**
> > >   * A structure used to configure an RX ring of an Ethernet port.
> > >   */
> > >  struct rte_eth_rxconf {
> > > --
> > > 2.1.4