From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <konstantin.ananyev@intel.com>
Received: from mga06.intel.com (mga06.intel.com [134.134.136.31])
 by dpdk.org (Postfix) with ESMTP id 8BFB85F1F
 for <dev@dpdk.org>; Thu, 15 Mar 2018 16:38:37 +0100 (CET)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga006.fm.intel.com ([10.253.24.20])
 by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 15 Mar 2018 08:38:35 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.48,311,1517904000"; d="scan'208";a="211621542"
Received: from irsmsx154.ger.corp.intel.com ([163.33.192.96])
 by fmsmga006.fm.intel.com with ESMTP; 15 Mar 2018 08:38:33 -0700
Received: from irsmsx105.ger.corp.intel.com ([169.254.7.221]) by
 IRSMSX154.ger.corp.intel.com ([169.254.12.108]) with mapi id 14.03.0319.002;
 Thu, 15 Mar 2018 15:38:33 +0000
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Zhang, Qi Z" <qi.z.zhang@intel.com>, "thomas@monjalon.net"
 <thomas@monjalon.net>
CC: "dev@dpdk.org" <dev@dpdk.org>, "Xing, Beilei" <beilei.xing@intel.com>,
 "Wu, Jingjing" <jingjing.wu@intel.com>, "Lu, Wenzhuo" <wenzhuo.lu@intel.com>
Thread-Topic: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
Thread-Index: AQHTsdzYjJFpxTZzDkGudLbJsh6cp6PPudHggAD5wQCAAKFxQIAAJjkAgAAEaxA=
Date: Thu, 15 Mar 2018 15:38:32 +0000
Message-ID: <2601191342CEEE43887BDE71AB9772589E2902E9@irsmsx105.ger.corp.intel.com>
References: <20180212045314.171616-1-qi.z.zhang@intel.com>
 <20180302041306.90324-1-qi.z.zhang@intel.com>
 <20180302041306.90324-2-qi.z.zhang@intel.com>
 <2601191342CEEE43887BDE71AB9772589E28FA63@irsmsx105.ger.corp.intel.com>
 <039ED4275CED7440929022BC67E706115316DF0F@SHSMSX103.ccr.corp.intel.com>
 <2601191342CEEE43887BDE71AB9772589E2901FF@irsmsx105.ger.corp.intel.com>
 <039ED4275CED7440929022BC67E706115316E27D@SHSMSX103.ccr.corp.intel.com>
In-Reply-To: <039ED4275CED7440929022BC67E706115316E27D@SHSMSX103.ccr.corp.intel.com>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiZjI1NjQwYjktMzRkZS00OTY5LThiNzItMTVkNzBiNzQ3YzZlIiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX05UIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE2LjUuOS4zIiwiVHJ1c3RlZExhYmVsSGFzaCI6InNucVdLNU9MTHd6d1dLTDR5ZVdmXC9UZm9KSEpEcEg0N1liZENNdXRIaHRvPSJ9
x-ctpclassification: CTP_NT
dlp-product: dlpe-windows
dlp-version: 11.0.0.116
dlp-reaction: no-action
x-originating-ip: [163.33.239.181]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setup
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 15 Mar 2018 15:38:38 -0000



> -----Original Message-----
> From: Zhang, Qi Z
> Sent: Thursday, March 15, 2018 3:09 PM
> To: Ananyev, Konstantin <konstantin.ananyev@intel.com>; thomas@monjalon.n=
et
> Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing <jin=
gjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue setu=
p
>=20
>=20
>=20
> > -----Original Message-----
> > From: Ananyev, Konstantin
> > Sent: Thursday, March 15, 2018 9:17 PM
> > To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
> > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue se=
tup
> >
> > Hi Qi,
> >
> > > -----Original Message-----
> > > From: Zhang, Qi Z
> > > Sent: Thursday, March 15, 2018 3:14 AM
> > > To: Ananyev, Konstantin <konstantin.ananyev@intel.com>;
> > > thomas@monjalon.net
> > > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjing
> > > <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>
> > > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queue
> > > setup
> > >
> > > Hi Konstantin:
> > >
> > > > -----Original Message-----
> > > > From: Ananyev, Konstantin
> > > > Sent: Wednesday, March 14, 2018 8:32 PM
> > > > To: Zhang, Qi Z <qi.z.zhang@intel.com>; thomas@monjalon.net
> > > > Cc: dev@dpdk.org; Xing, Beilei <beilei.xing@intel.com>; Wu, Jingjin=
g
> > > > <jingjing.wu@intel.com>; Lu, Wenzhuo <wenzhuo.lu@intel.com>; Zhang,
> > > > Qi Z <qi.z.zhang@intel.com>
> > > > Subject: RE: [dpdk-dev] [PATCH v2 1/4] ether: support deferred queu=
e
> > > > setup
> > > >
> > > > Hi Qi,
> > > >
> > > > >
> > > > > The patch let etherdev driver expose the capability flag through
> > > > > rte_eth_dev_info_get when it support deferred queue configuraiton=
,
> > > > > then base on the flag rte_eth_[rx|tx]_queue_setup could decide
> > > > > continue to setup the queue or just return fail when device
> > > > > already started.
> > > > >
> > > > > Signed-off-by: Qi Zhang <qi.z.zhang@intel.com>
> > > > > ---
> > > > >  doc/guides/nics/features.rst  |  8 ++++++++
> > > > > lib/librte_ether/rte_ethdev.c | 30 ++++++++++++++++++------------
> > > > > lib/librte_ether/rte_ethdev.h | 11 +++++++++++
> > > > >  3 files changed, 37 insertions(+), 12 deletions(-)
> > > > >
> > > > > diff --git a/doc/guides/nics/features.rst
> > > > > b/doc/guides/nics/features.rst index 1b4fb979f..36ad21a1f 100644
> > > > > --- a/doc/guides/nics/features.rst
> > > > > +++ b/doc/guides/nics/features.rst
> > > > > @@ -892,7 +892,15 @@ Documentation describes performance
> > values.
> > > > >
> > > > >  See ``dpdk.org/doc/perf/*``.
> > > > >
> > > > > +.. _nic_features_queue_deferred_setup_capabilities:
> > > > >
> > > > > +Queue deferred setup capabilities
> > > > > +---------------------------------
> > > > > +
> > > > > +Supports queue setup / release after device started.
> > > > > +
> > > > > +* **[provides] rte_eth_dev_info**:
> > > > >
> > > >
> > ``deferred_queue_config_capa:DEV_DEFERRED_RX_QUEUE_SETUP,DEV_DEFE
> > > > RRED_
> > > > > TX_QUEUE_SETUP,DEV_DEFERRED_RX_QUEUE_RELE
> > > > > ASE,DEV_DEFERRED_TX_QUEUE_RELEASE``.
> > > > > +* **[related]  API**: ``rte_eth_dev_info_get()``.
> > > > >
> > > > >  .. _nic_features_other:
> > > > >
> > > > > diff --git a/lib/librte_ether/rte_ethdev.c
> > > > > b/lib/librte_ether/rte_ethdev.c index a6ce2a5ba..6c906c4df 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.c
> > > > > +++ b/lib/librte_ether/rte_ethdev.c
> > > > > @@ -1425,12 +1425,6 @@ rte_eth_rx_queue_setup(uint16_t port_id,
> > > > uint16_t rx_queue_id,
> > > > >  		return -EINVAL;
> > > > >  	}
> > > > >
> > > > > -	if (dev->data->dev_started) {
> > > > > -		RTE_PMD_DEBUG_TRACE(
> > > > > -		    "port %d must be stopped to allow configuration\n",
> > port_id);
> > > > > -		return -EBUSY;
> > > > > -	}
> > > > > -
> > > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > > > -ENOTSUP);
> > > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_setup,
> > > > -ENOTSUP);
> > > > >
> > > > > @@ -1474,10 +1468,19 @@ rte_eth_rx_queue_setup(uint16_t
> > port_id,
> > > > uint16_t rx_queue_id,
> > > > >  		return -EINVAL;
> > > > >  	}
> > > > >
> > > > > +	if (dev->data->dev_started &&
> > > > > +		!(dev_info.deferred_queue_config_capa &
> > > > > +			DEV_DEFERRED_RX_QUEUE_SETUP))
> > > > > +		return -EINVAL;
> > > > > +
> > > >
> > > > I think now you have to check here that the queue is stopped.
> > > > Otherwise you might attempt to reconfigure running queue.
> > >
> > > I'm not sure if it's necessary to let application use different API s=
equence
> > for a deferred configure and deferred re-configure.
> > > Can we just call dev_ops->rx_queue_stop before rx_queue_release here
> >
> > I don't follow you here.
> > Let say now inside queue_start() we do check:
> >
> > if (dev->data->rx_queue_state[rx_queue_id] !=3D
> > RTE_ETH_QUEUE_STATE_STOPPED)
> >
> > Right now it is not possible to call queue_setup() without dev_stop() b=
efore
> > it - that's why we have check if (dev->data->dev_started) in queue_setu=
p()
> > right now.
> > Though with your patch it not the case anymore - user is able to call
> > queue_setup() without stopping the whole device.
> > But he still has to stop the queue.
>=20
> >
> > >
> > > >
> > > >
> > > > >  	rxq =3D dev->data->rx_queues;
> > > > >  	if (rxq[rx_queue_id]) {
> > > > >
> > 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->rx_queue_release,
> > > > >  					-ENOTSUP);
> > > >
> > > > I don't think it is *that* straightforward.
> > > > rx_queue_setup() parameters can imply different rx function (and
> > > > related dev
> > > > icesettings) that are already setuped by previous
> > queue_setup()/dev_start.
> > > > So I think you need to do one of 2 things:
> > > > 1. rework ethdev layer to introduce a separate rx function (and
> > > > related
> > > > settings) for each queue.
> > > > 2. at rx_queue_setup() if it is invoked after dev_start - check tha=
t
> > > > given queue settings wouldn't contradict with current device
> > > > settings  (rx function, etc.).
> > > > If they do - return an error.
> > > Yes, I think what we have is option 2 here, the
> > > dev_ops->rx_queue_setup will return fail if conflict with previous
> > > setting
> >
> > Hmm and what makes you think that?
> > As I know it is not the case  right now.
> > Let say I do:
> >     ....
> >    rx_queue_setup(port=3D0,queue=3D0, mp=3Dmb_size_2048);
> >    dev_start(port=3D0);
> >    ...
> >    rx_queue_setup(port=3D0,queue=3D1,mp=3Dmb_size_1024);
> >
> >  If current rx function doesn't support multi-segs then second
> > rx_queue_setup() should fail.
> >  Though I don't think that would happen with the current implementation=
.
>=20
> Why you think that would not happen? dev_ops->rx_queue_setup can fail, ri=
ght?
> I mean it's the responsibility of low level driver (i40e) to check the co=
nflict with current implementation.

Yes it is responsibility if the PMD because only it knows its own logic of =
rx/tx function selection.
But I don't see such changes in i40e in your patch series.
Probably I missed them?

> >
> > Same story for TX offloads, though it probably not that critical, as fo=
r most
> > Intel PMDs HW TX offloads will become per port in 18.05.
> >
> > As I can see you do have either of these options implemented right  now=
 -
> > that's the problem.
> >
> > > I'm also thinking about option 1, the idea is to move per queue rx/tx
> > function into driver layer, so it will not break existing API.
> > >
> > > 1. driver can expose the capability like per_queue_rx or per_queue_tx
> > > 2. application can enable this capability by dev_config with
> > > rte_eth_conf 3, if per_queue_rx is not enable, nothing change, so we
> > > are at option 2 4. if per_queue_rx is enabled, driver will set
> > > rx_pkt_burst with a hook function which redirect to an function ptr i=
n
> > > a per queue rx function tables ( I guess performance is impacted
> > > somehow, but this is the cost if you want different offload for
> > > different queue)
> >
> > I don't think we need to overcomplicate things here.
> > It should be transparent to the user - user just calls queue_setup() - =
based on
> > its input parameters PMD selects a function that fits best.
> > Pretty much what we have right now, just possibly have an array of func=
tions
> > (one per queue).
>=20
> If we don't introduce a new capability or something like, but just take p=
er queue functions as default way,
> does that mean, we need to change all drivers to adapt this?
> Or do you mean below?
>=20
> If (dev->rx_pkt_burst)
> 	/* default way */
> else
> 	/* per queue function */

For me either way seems ok.
Second one probably a bit easier, as no changes from PMDs are required.
But again - might be even rte_ethdev layer can fill queue's rx_pkt_burst[] =
array
for the drivers that don't support it - just by copying dev->rx_pkt_burst i=
nto it.
Konstantin=20

>=20
> Regards
> Qi
>=20
> >
> > >
> > > >
> > > > From my perspective - 1) is a better choice though it required more
> > > > work, and possibly ABI breakage.
> > > > I did some work in that direction as RFC:
> > > > http://dpdk.org/dev/patchwork/patch/31866/
> > >
> > > I will learn this, thanks for the heads up.
> > > >
> > > > 2) might be also possible, but looks a bit clumsy as
> > > > rx_queue_setup() might now fail even with valid parameters - all
> > > > depends on previous queue configurations.
> > > >
> > > > Same story applies for TX.
> > > >
> > > >
> > > > > +		if (dev->data->dev_started &&
> > > > > +			!(dev_info.deferred_queue_config_capa &
> > > > > +				DEV_DEFERRED_RX_QUEUE_RELEASE))
> > > > > +			return -EINVAL;
> > > > >  		(*dev->dev_ops->rx_queue_release)(rxq[rx_queue_id]);
> > > > >  		rxq[rx_queue_id] =3D NULL;
> > > > >  	}
> > > > > @@ -1573,12 +1576,6 @@ rte_eth_tx_queue_setup(uint16_t port_id,
> > > > uint16_t tx_queue_id,
> > > > >  		return -EINVAL;
> > > > >  	}
> > > > >
> > > > > -	if (dev->data->dev_started) {
> > > > > -		RTE_PMD_DEBUG_TRACE(
> > > > > -		    "port %d must be stopped to allow configuration\n",
> > port_id);
> > > > > -		return -EBUSY;
> > > > > -	}
> > > > > -
> > > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->dev_infos_get,
> > > > -ENOTSUP);
> > > > >  	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_setup,
> > > > -ENOTSUP);
> > > > >
> > > > > @@ -1596,10 +1593,19 @@ rte_eth_tx_queue_setup(uint16_t
> > port_id,
> > > > uint16_t tx_queue_id,
> > > > >  		return -EINVAL;
> > > > >  	}
> > > > >
> > > > > +	if (dev->data->dev_started &&
> > > > > +		!(dev_info.deferred_queue_config_capa &
> > > > > +			DEV_DEFERRED_TX_QUEUE_SETUP))
> > > > > +		return -EINVAL;
> > > > > +
> > > > >  	txq =3D dev->data->tx_queues;
> > > > >  	if (txq[tx_queue_id]) {
> > > > >
> > 	RTE_FUNC_PTR_OR_ERR_RET(*dev->dev_ops->tx_queue_release,
> > > > >  					-ENOTSUP);
> > > > > +		if (dev->data->dev_started &&
> > > > > +			!(dev_info.deferred_queue_config_capa &
> > > > > +				DEV_DEFERRED_TX_QUEUE_RELEASE))
> > > > > +			return -EINVAL;
> > > > >  		(*dev->dev_ops->tx_queue_release)(txq[tx_queue_id]);
> > > > >  		txq[tx_queue_id] =3D NULL;
> > > > >  	}
> > > > > diff --git a/lib/librte_ether/rte_ethdev.h
> > > > > b/lib/librte_ether/rte_ethdev.h index 036153306..410e58c50 100644
> > > > > --- a/lib/librte_ether/rte_ethdev.h
> > > > > +++ b/lib/librte_ether/rte_ethdev.h
> > > > > @@ -981,6 +981,15 @@ struct rte_eth_conf {
> > > > >   */
> > > > >  #define DEV_TX_OFFLOAD_SECURITY         0x00020000
> > > > >
> > > > > +#define DEV_DEFERRED_RX_QUEUE_SETUP 0x00000001 /**<
> > Deferred
> > > > setup rx
> > > > > +queue */ #define DEV_DEFERRED_TX_QUEUE_SETUP 0x00000002
> > /**<
> > > > Deferred
> > > > > +setup tx queue */ #define DEV_DEFERRED_RX_QUEUE_RELEASE
> > > > 0x00000004
> > > > > +/**< Deferred release rx queue */ #define
> > > > > +DEV_DEFERRED_TX_QUEUE_RELEASE 0x00000008 /**< Deferred
> > release
> > > > tx
> > > > > +queue */
> > > > > +
> > > >
> > > > I don't think we do need flags for both setup a and release.
> > > > If runtime setup is supported - surely dynamic release should be
> > > > supported too.
> > > > Also probably RUNTIME_RX_QUEUE_SETUP sounds a bit better.
> > >
> > > Agree
> > >
> > > Thanks
> > > Qi
> > >
> > > >
> > > > Konstantin
> > > >
> > > > >  /*
> > > > >   * If new Tx offload capabilities are defined, they also must be
> > > > >   * mentioned in rte_tx_offload_names in rte_ethdev.c file.
> > > > > @@ -1029,6 +1038,8 @@ struct rte_eth_dev_info {
> > > > >  	/** Configured number of rx/tx queues */
> > > > >  	uint16_t nb_rx_queues; /**< Number of RX queues. */
> > > > >  	uint16_t nb_tx_queues; /**< Number of TX queues. */
> > > > > +	uint64_t deferred_queue_config_capa;
> > > > > +	/**< queues can be setup/release after dev_start
> > > > > +(DEV_DEFERRED_). */
> > > > >  };
> > > > >
> > > > >  /**
> > > > > --
> > > > > 2.13.6