From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR01-HE1-obe.outbound.protection.outlook.com (mail-he1eur01on0084.outbound.protection.outlook.com [104.47.0.84]) by dpdk.org (Postfix) with ESMTP id 797572BF3 for ; Tue, 5 Sep 2017 17:12:58 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=jtk3etYm/NMwbcpcdXrFPa0R2a25XSjKTFJpqJVLrM8=; b=pjPNcGaO63x7Bmb0dMBmReRI2hW3PZAX82as4J+uRDTHPLdfj057BZc4F/CWkNYI1kiUMchmpm3QI5Dj4qMa9G12OsdgpLmeUY8U9bXdTa22+vqM7xQw5QCmM+XFdJYo66wXhc4fA1q5TuWOM26Rp41+oRfgUHuA8Z9tC0HwOEc= Received: from DB6PR0502MB3048.eurprd05.prod.outlook.com (10.172.250.136) by DB6PR0502MB3061.eurprd05.prod.outlook.com (10.172.245.135) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.13.10; Tue, 5 Sep 2017 15:12:55 +0000 Received: from DB6PR0502MB3048.eurprd05.prod.outlook.com ([fe80::e938:efe9:effc:7ef8]) by DB6PR0502MB3048.eurprd05.prod.outlook.com ([fe80::e938:efe9:effc:7ef8%14]) with mapi id 15.20.0013.018; Tue, 5 Sep 2017 15:12:55 +0000 From: Matan Azrad To: =?iso-8859-1?Q?Ga=EBtan_Rivet?= CC: "dev@dpdk.org" Thread-Topic: [PATCH] net/failsafe: stat support enhancement Thread-Index: AQHTJkoXg9AtWO/g8EmNE6CKdkBdbqKmYMig Date: Tue, 5 Sep 2017 15:12:55 +0000 Message-ID: References: <1504605394-6450-1-git-send-email-matan@mellanox.com> <20170905132244.GF21444@bidouze.vm.6wind.com> In-Reply-To: <20170905132244.GF21444@bidouze.vm.6wind.com> Accept-Language: en-US, he-IL Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=matan@mellanox.com; x-originating-ip: [193.47.165.251] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; DB6PR0502MB3061; 6:/I0et1+JHeDvzE5E3HTuklbQVks32FBtZxVq3UHLnuaNprO84//W2dwO+lbNA7qycWCEfiXmbkQ6/3tct90XZkNTsrBpMj/WQ2iHB+OrZfbIlE6GFy/Rhb0xozxMhdN5YjlVyQVzpf8J3O/+0KUZpQsNfnwixxA1Wo+JeNlra+JAsbJI4ezV4o80Vq3bmekpFJN6RxH/jxbTxN7ZDyZwsCxIcp8SDIB8+Rw/g08kiH2es+u8717X7hGlP2zuCYgbpDQBIORYtRGlJW9FLf2Xqz2YE1IvPOWTJcmvZXBp0JOqNfqjLFoKVqD0THoJ8bJmTCsZUy0e64dkswmhGvELLQ==; 5:JIgYy5kisHPfOMcSQrl3e+WskWsoZCFW00nLXSMoTsNew7IcPm5Mx4MDhp1BgV2TixiE94ekbR3B3UVx9UqtZTI7+7j27n5Nj+11soHpKEzQ/M3eo6gKsLTS056FVHVQPq2wZME9sv5j6KoJpWLXig==; 24:kaRjUAlLcGEqBkpVQkSimzW5ScTF0xxGFkH7Xdi+ntFy2DXRdAwj3UEKeBCX7xQcf0a8mIobgXXWQ9/TXJP6YwqHrOzujfECUK8Jnw2qlh0=; 7:Wot5OiddYRJ9sITihr1aPFQpWg5UnZurqBcOk2BBL8eakq/uMUaxA2oZNwTihfT3MGGpPXq5oTdv64qXTwVl9cNlH6GUh+dnVtlAiABTqoPk9z9rSBWcJql2pMPZ7ugYIepKm2W49+AZetO4Tfkn4iQHX5pUOyKsvloEL59gW4hmrAJM6VdFZJXLoVfiJf876Zw3n78+P8kuMXPJKs/smBfZ0fUjzji9BWClrsQV3Wo= x-ms-exchange-antispam-srfa-diagnostics: SSOS; x-ms-office365-filtering-correlation-id: 5de9e7c3-02da-46d7-b5e8-08d4f470961f x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(300000500095)(300135000095)(300000501095)(300135300095)(22001)(300000502095)(300135100095)(2017030254152)(48565401081)(300000503095)(300135400095)(2017052603199)(201703131423075)(201703031133081)(201702281549075)(300000504095)(300135200095)(300000505095)(300135600095)(300000506095)(300135500095); SRVR:DB6PR0502MB3061; x-ms-traffictypediagnostic: DB6PR0502MB3061: x-exchange-antispam-report-test: UriScan:; x-microsoft-antispam-prvs: x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(2401047)(5005006)(8121501046)(93006095)(93001095)(100000703101)(100105400095)(3002001)(10201501046)(6055026)(6041248)(20161123562025)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123564025)(20161123555025)(20161123558100)(20161123560025)(6072148)(201708071742011)(100000704101)(100105200095)(100000705101)(100105500095); SRVR:DB6PR0502MB3061; BCL:0; PCL:0; RULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095); SRVR:DB6PR0502MB3061; x-forefront-prvs: 0421BF7135 x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(6009001)(39860400002)(189002)(199003)(24454002)(13464003)(377454003)(478600001)(3660700001)(68736007)(101416001)(2906002)(86362001)(54356999)(7696004)(3280700002)(189998001)(99286003)(55016002)(2900100001)(76176999)(97736004)(6246003)(74316002)(25786009)(50986999)(53936002)(110136004)(9686003)(5250100002)(5660300001)(8676002)(53546010)(81166006)(81156014)(6916009)(14454004)(551984002)(2950100002)(8936002)(4326008)(6436002)(3846002)(6506006)(66066001)(33656002)(229853002)(7736002)(106356001)(105586002)(305945005)(6116002)(102836003); DIR:OUT; SFP:1101; SCL:1; SRVR:DB6PR0502MB3061; H:DB6PR0502MB3048.eurprd05.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords; MX:1; A:1; LANG:en; received-spf: None (protection.outlook.com: mellanox.com does not designate permitted sender hosts) spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-originalarrivaltime: 05 Sep 2017 15:12:55.7320 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB6PR0502MB3061 Subject: Re: [dpdk-dev] [PATCH] net/failsafe: stat support enhancement X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Tue, 05 Sep 2017 15:12:58 -0000 Hi Gaetan, > -----Original Message----- > From: Ga=EBtan Rivet [mailto:gaetan.rivet@6wind.com] > Sent: Tuesday, September 5, 2017 4:23 PM > To: Matan Azrad > Cc: dev@dpdk.org > Subject: Re: [PATCH] net/failsafe: stat support enhancement >=20 > Hi Matan, >=20 > On Tue, Sep 05, 2017 at 12:56:34PM +0300, Matan Azrad wrote: > > The previous stats code returned only the current TX sub device stats. > > > > This enhancement extends it to return the sum of all sub devices stats > > with history of removed sub-devices. > > > > Dedicated stats accumulator saves the stat history of all sub device > > remove events. > > > > Each failsafe sub device contains the last stats asked by the user and > > updates the accumulator in removal time. > > > > I would like to implement ultimate snapshot on removal time. > > The stats_get API needs to be changed to return error in the case it > > is too late to retrieve statistics. >=20 > You need to have this API evolution first. It is not assured to be accept= ed by > the community. >=20 I already sent suggestion to mailing list, find "stats_get API: return valu= e suggestion": No objections for now. > As it is, this version is incorrect, because the only available stats for= a > removed slave is the last snapshot. >=20 It is for sure better from the previous version, the accuracy can be better= if=20 my suggestion will be accepted.=20 If it is, I will improve it. > Thus it complicates the rules while still being incorrect. Even if you we= re able > to push for this API evolution, PMDs with hard counters would be the ones > to return an error code on stat read while removed. > You may be lucky at the moment because MLX drivers do not support hard > counters, but this won't always be true (and it will be false soon enough= ). > You will thus be back at square one, with a new useless API and still inc= orrect > stats in the fail-safe. On the other hand, the fail-safe should strive to= be as > easy to use as possible with most drivers, and not cater specifically to = soft- > counters ones. >=20 > So, my take on this: I understand that aggregated stats would be interest= ing. > Keep it simple & stupid: simple aggregation on stats_get. > You will have a rollback at some point on device removal, but this is sti= ll easier > to detect / deal with than partially / sometimes incorrect stats. >=20 100% accuracy is impossible when removal event is in the game, but we can do it better as we can. As I wrote in commit log(if stat gets will return value): When some drivers can give the stats after RMV interrupt and before dev_clo= se, We just can to improve accuracy and be closer to 100%. The stats of the sub devices which return error will be taken from the last= snapshot.=20 It is just enhancement which is important for customers. > > By this way, failsafe can get stats snapshot in removal interrupt > > callback for each PMD which can give stats after removal event. > > > > Signed-off-by: Matan Azrad > > --- > > drivers/net/failsafe/failsafe_ether.c | 33 > +++++++++++++++++++++++++++++++++ > > drivers/net/failsafe/failsafe_ops.c | 16 ++++++++++++---- > > drivers/net/failsafe/failsafe_private.h | 5 +++++ > > 3 files changed, 50 insertions(+), 4 deletions(-) > > > > diff --git a/drivers/net/failsafe/failsafe_ether.c > > b/drivers/net/failsafe/failsafe_ether.c > > index a3a8cce..133080d 100644 > > --- a/drivers/net/failsafe/failsafe_ether.c > > +++ b/drivers/net/failsafe/failsafe_ether.c > > @@ -399,6 +399,37 @@ failsafe_eth_dev_state_sync(struct rte_eth_dev > *dev) > > return ret; > > } > > > > +void > > +fs_increment_stats(struct rte_eth_stats *from, struct rte_eth_stats > > +*to) { > > + uint8_t i; > > + > > + RTE_ASSERT(from !=3D NULL && to !=3D NULL); > > + to->ipackets +=3D from->ipackets; > > + to->opackets +=3D from->opackets; > > + to->ibytes +=3D from->ibytes; > > + to->obytes +=3D from->obytes; > > + to->imissed +=3D from->imissed; > > + to->ierrors +=3D from->ierrors; > > + to->oerrors +=3D from->oerrors; > > + to->rx_nombuf +=3D from->rx_nombuf; > > + for (i =3D 0; i < RTE_ETHDEV_QUEUE_STAT_CNTRS; i++) { > > + to->q_ipackets[i] +=3D from->q_ipackets[i]; > > + to->q_opackets[i] +=3D from->q_opackets[i]; > > + to->q_ibytes[i] +=3D from->q_ibytes[i]; > > + to->q_obytes[i] +=3D from->q_obytes[i]; > > + to->q_errors[i] +=3D from->q_errors[i]; > > + } > > +} > > + > > +void > > +fs_increment_stats_accumulator(struct sub_device *sdev) { > > + fs_increment_stats(&sdev->stats_snapshot, > > + &PRIV(sdev->fs_dev)->stats_accumulator); > > + memset(&sdev->stats_snapshot, 0, sizeof(struct rte_eth_stats)); } > > + > > int > > failsafe_eth_rmv_event_callback(uint8_t port_id __rte_unused, > > enum rte_eth_event_type event > __rte_unused, @@ -410,6 +441,8 @@ > > failsafe_eth_rmv_event_callback(uint8_t port_id __rte_unused, > > fs_switch_dev(sdev->fs_dev, sdev); > > /* Use safe bursts in any case. */ > > set_burst_fn(sdev->fs_dev, 1); > > + /* Increment the stats accumulator by the last stats snapshot. */ > > + fs_increment_stats_accumulator(sdev); > > /* > > * Async removal, the sub-PMD will try to unregister > > * the callback at the source of the current thread context. > > diff --git a/drivers/net/failsafe/failsafe_ops.c > > b/drivers/net/failsafe/failsafe_ops.c > > index ff9ad15..e47cc85 100644 > > --- a/drivers/net/failsafe/failsafe_ops.c > > +++ b/drivers/net/failsafe/failsafe_ops.c > > @@ -586,9 +586,14 @@ static void > > fs_stats_get(struct rte_eth_dev *dev, > > struct rte_eth_stats *stats) > > { > > - if (TX_SUBDEV(dev) =3D=3D NULL) > > - return; > > - rte_eth_stats_get(PORT_ID(TX_SUBDEV(dev)), stats); > > + struct sub_device *sdev; > > + uint8_t i; > > + > > + memcpy(stats, &PRIV(dev)->stats_accumulator, sizeof(*stats)); > > + FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > + rte_eth_stats_get(PORT_ID(sdev), &sdev->stats_snapshot); > > + fs_increment_stats(&sdev->stats_snapshot, stats); > > + } > > } > > > > static void > > @@ -597,8 +602,11 @@ fs_stats_reset(struct rte_eth_dev *dev) > > struct sub_device *sdev; > > uint8_t i; > > > > - FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) > > + FOREACH_SUBDEV_STATE(sdev, i, dev, DEV_ACTIVE) { > > rte_eth_stats_reset(PORT_ID(sdev)); > > + memset(&sdev->stats_snapshot, 0, sizeof(struct > rte_eth_stats)); > > + } > > + memset(&PRIV(dev)->stats_accumulator, 0, sizeof(struct > > +rte_eth_stats)); > > } > > > > /** > > diff --git a/drivers/net/failsafe/failsafe_private.h > > b/drivers/net/failsafe/failsafe_private.h > > index 0361cf4..267c749 100644 > > --- a/drivers/net/failsafe/failsafe_private.h > > +++ b/drivers/net/failsafe/failsafe_private.h > > @@ -102,6 +102,8 @@ struct sub_device { > > uint8_t sid; > > /* Device state machine */ > > enum dev_state state; > > + /* Last stats snapshot passed to user */ > > + struct rte_eth_stats stats_snapshot; > > /* Some device are defined as a command line */ > > char *cmdline; > > /* fail-safe device backreference */ @@ -140,6 +142,7 @@ struct > > fs_priv { > > * synchronized state. > > */ > > enum dev_state state; > > + struct rte_eth_stats stats_accumulator; > > unsigned int pending_alarm:1; /* An alarm is pending */ > > /* flow isolation state */ > > int flow_isolated:1; > > @@ -180,6 +183,8 @@ int failsafe_eal_uninit(struct rte_eth_dev *dev); > > > > int failsafe_eth_dev_state_sync(struct rte_eth_dev *dev); void > > failsafe_dev_remove(struct rte_eth_dev *dev); > > +void fs_increment_stats(struct rte_eth_stats *from, struct > > +rte_eth_stats *to); void fs_increment_stats_accumulator(struct > > +sub_device *sdev); > > int failsafe_eth_rmv_event_callback(uint8_t port_id, > > enum rte_eth_event_type type, > > void *arg, void *out); > > -- > > 2.7.4 > > >=20 > -- > Ga=EBtan Rivet > 6WIND