From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from EUR01-DB5-obe.outbound.protection.outlook.com (mail-db5eur01on0080.outbound.protection.outlook.com [104.47.2.80]) by dpdk.org (Postfix) with ESMTP id 867817CC0 for ; Mon, 4 Sep 2017 19:52:58 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com; s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version; bh=+3X571JyErP4jPCWGd+Yi83kKsL8LDKg7kNJySDlID4=; b=aHzr1/Tb4KyOBF2BJZ2yd6AHjO4RTKUwZO7gjG8j07KtLq5+pHBEwfGRs0NReWXgQr+aGsllbWSrFoc2bNjl5wFwCirlsd7GldKDWdCQ6uHUwSD7fkPIYlmZRFxCcC/fuSCvDrJnBvHUtDWc0oR1bK1QyOJvcpZ3/5lSrK3mBQY= Received: from DB6PR0502MB3048.eurprd05.prod.outlook.com (10.172.250.136) by DB6PR0502MB3110.eurprd05.prod.outlook.com (10.172.245.148) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id 15.20.13.10; Mon, 4 Sep 2017 17:52:56 +0000 Received: from DB6PR0502MB3048.eurprd05.prod.outlook.com ([fe80::e938:efe9:effc:7ef8]) by DB6PR0502MB3048.eurprd05.prod.outlook.com ([fe80::e938:efe9:effc:7ef8%14]) with mapi id 15.20.0013.018; Mon, 4 Sep 2017 17:52:56 +0000 From: Matan Azrad To: Adrien Mazarguil CC: =?iso-8859-1?Q?N=E9lio_Laranjeiro?= , "dev@dpdk.org" Thread-Topic: [dpdk-dev] [PATCH v3] net/mlx5: support device removal event Thread-Index: AQHTJYWZcEMdv0PQZ0+HMybm5OY32qKk2y2AgAAVxBA= Date: Mon, 4 Sep 2017 17:52:55 +0000 Message-ID: References: <20170904124943.2pep4kbglu4q5qg4@localhost> <1504533353-38337-1-git-send-email-matan@mellanox.com> <20170904153308.GY4301@6wind.com> In-Reply-To: <20170904153308.GY4301@6wind.com> Accept-Language: en-US, he-IL Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: spf=none (sender IP is ) smtp.mailfrom=matan@mellanox.com; x-originating-ip: [85.64.136.190] x-ms-publictraffictype: Email x-microsoft-exchange-diagnostics: 1; DB6PR0502MB3110; 6:VFWWYkSFmuXWzPun4TalHb+LYW0+o8AqtVfxcMFgz70IcyevzlPMP8eg/+ydQBJaVEkLpPcdFPnS8jEI8D6N/7m9ci9NmUicKarpjLYsZ1YelGizHPE0jAv/GYFfoZTjDjhP7Ylc34AscfMUIMqFvOGf9q5eedVmf162yFDLtvPi4vz98ijcMrJmysjH4LLef3R4U89nXPVsTd7EhXT+87c4aWk/jk4UHzXMN3wc/wNb/ZV/z53xn749+VtOiLdVSjM1INb5D/x7+GvLUCIhZtd2GiWs0Ok5GxbfO3yJCfD0pATHg4+01mtgwoTnNIwgSS+saR5p6lBNPhkxM8b/ow==; 5:MGW8SLJigW1chzzLTulPATKShE6bystoDo9MJj9JC9Fptu1eoU9XBM2r+KNc4RHI1P3J68boBCwfu6NA5FaPrtF+KQMRsemSqCDjn2Uub/yaDw8Z7673XXCzjQh8w/rNHTnBGkwWwm5XMqFp+2XUxA==; 24:TPVXxqENb3NSc07R4juLeHDcCz45g4DgCIQ5W+ZkoxiH65XKe4xDFs1zO4/esAwsdpD0glvX0WcgHdnKR1ujApikWsgzrNuCMmPwnVcjppk=; 7:nCsFVWdZIyPuZha1RheicIOPmIun2TMZgjmPGNO/YvF+fsosJWPIipfBPW81H9+dZtrHBqi1uSv/Jmx7srqwYHp46SAqoBRB+n32UkHF5Wi4Ut2U5QHTcGUipufZ6pppC0EiIErmnn4PE5pkwZlyhlyRYwKSWUK0NmlM/WHY++Hie2/7FoRxKrgMlT1d1eT+s8PHEIUOU/RGb+N/bUS64aJXdsBStapa45lyo8I5mAQ= x-ms-exchange-antispam-srfa-diagnostics: SSOS; x-ld-processed: a652971c-7d2e-4d9b-a6a4-d149256f461b,ExtAddr x-ms-office365-filtering-correlation-id: 5feddd72-ae63-407b-ad8e-08d4f3bdc5e6 x-ms-office365-filtering-ht: Tenant x-microsoft-antispam: UriScan:; BCL:0; PCL:0; RULEID:(300000500095)(300135000095)(300000501095)(300135300095)(300000502095)(300135100095)(22001)(2017030254152)(300000503095)(300135400095)(48565401081)(2017052603199)(201703131423075)(201703031133081)(201702281549075)(300000504095)(300135200095)(300000505095)(300135600095)(300000506095)(300135500095); SRVR:DB6PR0502MB3110; x-ms-traffictypediagnostic: DB6PR0502MB3110: x-exchange-antispam-report-test: UriScan:(788757137089); x-microsoft-antispam-prvs: x-exchange-antispam-report-cfa-test: BCL:0; PCL:0; RULEID:(100000700101)(100105000095)(100000701101)(100105300095)(100000702101)(100105100095)(6040450)(2401047)(8121501046)(5005006)(100000703101)(100105400095)(93006095)(93001095)(10201501046)(3002001)(6055026)(6041248)(20161123560025)(201703131423075)(201702281528075)(201703061421075)(201703061406153)(20161123564025)(20161123558100)(20161123562025)(20161123555025)(6072148)(201708071742011)(100000704101)(100105200095)(100000705101)(100105500095); SRVR:DB6PR0502MB3110; BCL:0; PCL:0; RULEID:(100000800101)(100110000095)(100000801101)(100110300095)(100000802101)(100110100095)(100000803101)(100110400095)(100000804101)(100110200095)(100000805101)(100110500095); SRVR:DB6PR0502MB3110; x-forefront-prvs: 0420213CCD x-forefront-antispam-report: SFV:NSPM; SFS:(10009020)(6009001)(39860400002)(377454003)(199003)(24454002)(13464003)(189002)(6506006)(25786009)(7736002)(2900100001)(4326008)(74316002)(6246003)(305945005)(6916009)(110136004)(2950100002)(5250100002)(14454004)(189998001)(53546010)(97736004)(66066001)(106356001)(478600001)(8676002)(7696004)(86362001)(33656002)(2906002)(105586002)(81166006)(81156014)(54906002)(3846002)(76176999)(53936002)(102836003)(68736007)(8936002)(55016002)(3280700002)(6436002)(99286003)(101416001)(5660300001)(6116002)(9686003)(229853002)(3660700001)(54356999)(50986999); DIR:OUT; SFP:1101; SCL:1; SRVR:DB6PR0502MB3110; H:DB6PR0502MB3048.eurprd05.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords; A:1; MX:1; LANG:en; received-spf: None (protection.outlook.com: mellanox.com does not designate permitted sender hosts) spamdiagnosticoutput: 1:99 spamdiagnosticmetadata: NSPM Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 X-OriginatorOrg: Mellanox.com X-MS-Exchange-CrossTenant-originalarrivaltime: 04 Sep 2017 17:52:55.8024 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b X-MS-Exchange-Transport-CrossTenantHeadersStamped: DB6PR0502MB3110 Subject: Re: [dpdk-dev] [PATCH v3] net/mlx5: support device removal event X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Mon, 04 Sep 2017 17:52:58 -0000 Hi Adrien, > -----Original Message----- > From: Adrien Mazarguil [mailto:adrien.mazarguil@6wind.com] > Sent: Monday, September 4, 2017 6:33 PM > To: Matan Azrad > Cc: N=E9lio Laranjeiro ; dev@dpdk.org > Subject: Re: [dpdk-dev] [PATCH v3] net/mlx5: support device removal event >=20 > Hi Matan, >=20 > One comment I have is, while this patch adds support for RMV, it also sil= ently > addresses a bug (see large comment you added to > priv_link_status_update()). >=20 > This should be split in two commits, with the fix part coming first and C= C > stable@dpdk.org, and a second commit adding RMV support proper. >=20 Actually, the mlx4 bug was not appeared in the mlx5 previous code, Probably because the RMV interrupt was not implemented in mlx5 before this = patch. The big comment just explains the link inconsistent issue and was added here since Nelio and I think the new function, priv_link_status_update(), justifies this comment for future review. =20 > More below. >=20 > On Mon, Sep 04, 2017 at 04:55:53PM +0300, Matan Azrad wrote: > > Extend the LSC event handling to support the device removal as well. > > The Verbs library may send several related events, which are different > > from LSC event. > > > > The mlx5 event handling has been made capable of receiving and > > signaling several event types at once. > > > > This support includes next: > > 1. Removal event detection according to the user configuration. > > 2. Calling to all registered mlx5 removal callbacks. > > 3. Capabilities extension to include removal interrupt handling. > > > > Signed-off-by: Matan Azrad > > --- > > drivers/net/mlx5/mlx5.c | 2 +- > > drivers/net/mlx5/mlx5_ethdev.c | 103 > > +++++++++++++++++++++++++++++------------ > > 2 files changed, 74 insertions(+), 31 deletions(-) > > > > Changes: > > V2: > > Replace link status update function name. > > add inconsistent link workaround comment. > > > > V3: > > Fix indentations. > > Accurate inconsistent link comment. > > > > > > diff --git a/drivers/net/mlx5/mlx5.c b/drivers/net/mlx5/mlx5.c index > > bd66a7c..1a3d7f1 100644 > > --- a/drivers/net/mlx5/mlx5.c > > +++ b/drivers/net/mlx5/mlx5.c > > @@ -865,7 +865,7 @@ static struct rte_pci_driver mlx5_driver =3D { > > }, > > .id_table =3D mlx5_pci_id_map, > > .probe =3D mlx5_pci_probe, > > - .drv_flags =3D RTE_PCI_DRV_INTR_LSC, > > + .drv_flags =3D RTE_PCI_DRV_INTR_LSC | RTE_PCI_DRV_INTR_RMV, > > }; > > > > /** > > diff --git a/drivers/net/mlx5/mlx5_ethdev.c > > b/drivers/net/mlx5/mlx5_ethdev.c index 57f6237..cdbd723 100644 > > --- a/drivers/net/mlx5/mlx5_ethdev.c > > +++ b/drivers/net/mlx5/mlx5_ethdev.c > > @@ -1112,47 +1112,84 @@ mlx5_ibv_device_to_pci_addr(const struct > > ibv_device *device, } > > > > /** > > - * Link status handler. > > + * Update the link status. > > * > > * @param priv > > * Pointer to private structure. > > - * @param dev > > - * Pointer to the rte_eth_dev structure. > > * > > * @return > > - * Nonzero if the callback process can be called immediately. > > + * Zero if the callback process can be called immediately. > > */ > > static int > > -priv_dev_link_status_handler(struct priv *priv, struct rte_eth_dev > > *dev) > > +priv_link_status_update(struct priv *priv) { > > + struct rte_eth_link *link =3D &priv->dev->data->dev_link; > > + > > + mlx5_link_update(priv->dev, 0); > > + if (((link->link_speed =3D=3D 0) && link->link_status) || > > + ((link->link_speed !=3D 0) && !link->link_status)) { > > + /* > > + * Inconsistent status. > > + * The link status is read from Ethtool through an IOCTL, > > + * but as the application may work in polling mode it > > + * may get the port event before the Kernel driver had > > + * time to process it. PMD then request the link from > > + * the kernel but the event is still not processed (due > > + * to more urgent interrupts) and finally the PMD may > > + * get an inconsistent link. > > + * Setting alarm for later checking. > > + */ >=20 > While adding a comment is nice, there's too much info in there. From the > PMD standpoint, what happens is the interrupt occurs much before the > kernel netdevice exposes the new status, so it needs to be checked later. > Can you sum it up in fewer words? >=20 Yes, sure :) > > + if (!priv->pending_alarm) { > > + priv->pending_alarm =3D 1; > > + rte_eal_alarm_set(MLX5_ALARM_TIMEOUT_US, > > + mlx5_dev_link_status_handler, > > + priv->dev); > > + } > > + return 1; > > + } else if (unlikely(priv->pending_alarm)) { > > + /* In case of link interrupt while link alarm was setting. */ > > + priv->pending_alarm =3D 0; > > + rte_eal_alarm_cancel(mlx5_dev_link_status_handler, priv- > >dev); > > + } > > + return 0; > > +} > > + > > +/** > > + * Device status handler. > > + * > > + * @param priv > > + * Pointer to private structure. > > + * @param events > > + * Pointer to event flags holder. > > + * > > + * @return > > + * Events bitmap of callback process which can be called immediately= . > > + */ > > +static uint32_t > > +priv_dev_status_handler(struct priv *priv) > > { > > struct ibv_async_event event; > > - struct rte_eth_link *link =3D &dev->data->dev_link; > > - int ret =3D 0; > > + uint32_t ret =3D 0; > > > > /* Read all message and acknowledge them. */ > > for (;;) { > > if (ibv_get_async_event(priv->ctx, &event)) > > break; > > - > > - if (event.event_type !=3D IBV_EVENT_PORT_ACTIVE && > > - event.event_type !=3D IBV_EVENT_PORT_ERR) > > + if ((event.event_type =3D=3D IBV_EVENT_PORT_ACTIVE || > > + event.event_type =3D=3D IBV_EVENT_PORT_ERR) && > > + (priv->dev->data->dev_conf.intr_conf.lsc =3D=3D 1)) > > + ret |=3D (1 << RTE_ETH_EVENT_INTR_LSC); > > + else if (event.event_type =3D=3D IBV_EVENT_DEVICE_FATAL && > > + priv->dev->data->dev_conf.intr_conf.rmv =3D=3D 1) > > + ret |=3D (1 << RTE_ETH_EVENT_INTR_RMV); > > + else > > DEBUG("event type %d on port %d not handled", > > event.event_type, event.element.port_num); >=20 > What you also need to mention in the commit log of the fix is that splitt= ing > priv_dev_status_handler() and priv_link_status_update() addresses another > bug here: this loop consumed *all* events, even during alarms. An alarm > occurring for a LSC event could eat a RMV event that the application woul= d > never receive. This also affects mlx4, for which I intend to submit a fix= soon. >=20 I think also this issue is only mlx4 bug, Since in the previous mlx5 code only LCS event was supported, all these problems was not there.=20 > > ibv_ack_async_event(&event); > > } > > - mlx5_link_update(dev, 0); > > - if (((link->link_speed =3D=3D 0) && link->link_status) || > > - ((link->link_speed !=3D 0) && !link->link_status)) { > > - if (!priv->pending_alarm) { > > - /* Inconsistent status, check again later. */ > > - priv->pending_alarm =3D 1; > > - rte_eal_alarm_set(MLX5_ALARM_TIMEOUT_US, > > - mlx5_dev_link_status_handler, > > - dev); > > - } > > - } else { > > - ret =3D 1; > > - } > > + if (ret & (1 << RTE_ETH_EVENT_INTR_LSC)) > > + if (priv_link_status_update(priv)) > > + ret &=3D ~(1 << RTE_ETH_EVENT_INTR_LSC); > > return ret; > > } > > > > @@ -1172,9 +1209,9 @@ mlx5_dev_link_status_handler(void *arg) > > priv_lock(priv); > > assert(priv->pending_alarm =3D=3D 1); > > priv->pending_alarm =3D 0; > > - ret =3D priv_dev_link_status_handler(priv, dev); > > + ret =3D priv_link_status_update(priv); > > priv_unlock(priv); > > - if (ret) > > + if (!ret) > > _rte_eth_dev_callback_process(dev, > RTE_ETH_EVENT_INTR_LSC, NULL, > > NULL); > > } > > @@ -1192,14 +1229,17 @@ mlx5_dev_interrupt_handler(void *cb_arg) { > > struct rte_eth_dev *dev =3D cb_arg; > > struct priv *priv =3D dev->data->dev_private; > > - int ret; > > + uint32_t events; > > > > priv_lock(priv); > > - ret =3D priv_dev_link_status_handler(priv, dev); > > + events =3D priv_dev_status_handler(priv); > > priv_unlock(priv); > > - if (ret) > > + if (events & (1 << RTE_ETH_EVENT_INTR_LSC)) > > _rte_eth_dev_callback_process(dev, > RTE_ETH_EVENT_INTR_LSC, NULL, > > NULL); > > + if (events & (1 << RTE_ETH_EVENT_INTR_RMV)) > > + _rte_eth_dev_callback_process(dev, > RTE_ETH_EVENT_INTR_RMV, NULL, > > + NULL); > > } > > > > /** > > @@ -1213,7 +1253,8 @@ mlx5_dev_interrupt_handler(void *cb_arg) void > > priv_dev_interrupt_handler_uninstall(struct priv *priv, struct > > rte_eth_dev *dev) { > > - if (!dev->data->dev_conf.intr_conf.lsc) > > + if (!dev->data->dev_conf.intr_conf.lsc && > > + !dev->data->dev_conf.intr_conf.rmv) > > return; > > rte_intr_callback_unregister(&priv->intr_handle, > > mlx5_dev_interrupt_handler, > > @@ -1238,7 +1279,8 @@ priv_dev_interrupt_handler_install(struct priv > > *priv, struct rte_eth_dev *dev) { > > int rc, flags; > > > > - if (!dev->data->dev_conf.intr_conf.lsc) > > + if (!dev->data->dev_conf.intr_conf.lsc && > > + !dev->data->dev_conf.intr_conf.rmv) > > return; > > assert(priv->ctx->async_fd > 0); > > flags =3D fcntl(priv->ctx->async_fd, F_GETFL); @@ -1246,6 +1288,7 @@ > > priv_dev_interrupt_handler_install(struct priv *priv, struct rte_eth_de= v > *dev) > > if (rc < 0) { > > INFO("failed to change file descriptor async event queue"); > > dev->data->dev_conf.intr_conf.lsc =3D 0; > > + dev->data->dev_conf.intr_conf.rmv =3D 0; > > } else { > > priv->intr_handle.fd =3D priv->ctx->async_fd; > > priv->intr_handle.type =3D RTE_INTR_HANDLE_EXT; > > -- > > 2.7.4 > > >=20 > -- > Adrien Mazarguil > 6WIND Thanks, Matan Azrad