* [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber ports
@ 2020-03-19 6:41 Phil Yang
2020-03-19 10:51 ` Lijian Zhang
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Phil Yang @ 2020-03-19 6:41 UTC (permalink / raw)
To: dev, konstantin.ananyev, wenzhuo.lu
Cc: qi.z.zhang, lijian.zhang, gavin.hu, honnappa.nagarahalli, nd, stable
With some models of fiber ports (e.g. X520-2 device ID 0x10fb), it
is possible when a port is started to experience a timing issue
which prevents the link from ever being fully set up.
In ixgbe_dev_link_update_share(), if the media type is fiber and the
link is down, a flag (IXGBE_FLAG_NEED_LINK_CONFIG) is set. A callback
to ixgbe_dev_setup_link_thread_handler() is scheduled which should
try to set up the link and clear the flag afterwards.
If the device is started before the flag is cleared, the scheduled
callback is cancelled. This causes the flag to remain set and
subsequent calls to ixgbe_dev_link_update_share() return
without trying to retrieve the link state because the flag is set.
In ixgbe_dev_cancel_link_thread(), after cancelling the callback,
unset the flag on the device to avoid this condition.
Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
Cc: stable@dpdk.org
Bugzilla ID: 388
Signed-off-by: Phil Yang <phil.yang@arm.com>
Signed-off-by: Lijian Zhang <lijian.zhang@arm.com>
Reviewed-by: Gavin Hu <gavin.hu@arm.com>
---
drivers/net/ixgbe/ixgbe_ethdev.c | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index 23b3f5b..2b65750 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -4147,11 +4147,19 @@ static void
ixgbe_dev_cancel_link_thread(struct rte_eth_dev *dev)
{
struct ixgbe_adapter *ad = dev->data->dev_private;
+ struct ixgbe_interrupt *intr =
+ IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
void *retval;
if (rte_atomic32_read(&ad->link_thread_running)) {
pthread_cancel(ad->link_thread_tid);
pthread_join(ad->link_thread_tid, &retval);
+ /* clear this flag once the thread has been
+ * cancelled, to avoid link status error in
+ * case unfinished threads cannot clean up
+ * this flag.
+ */
+ intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
rte_atomic32_clear(&ad->link_thread_running);
}
}
@@ -4262,8 +4270,12 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
if (link_up == 0) {
if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
- intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
if (rte_atomic32_test_and_set(&ad->link_thread_running)) {
+ /* To avoid race condition between threads, set
+ * the IXGBE_FLAG_NEED_LINK_CONFIG flag only
+ * when there is no link thread running.
+ */
+ intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
if (rte_ctrl_thread_create(&ad->link_thread_tid,
"ixgbe-link-handler",
NULL,
--
2.7.4
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber ports
2020-03-19 6:41 [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber ports Phil Yang
@ 2020-03-19 10:51 ` Lijian Zhang
2020-05-08 2:48 ` Phil Yang
2020-05-08 10:28 ` [dpdk-dev] [PATCH v2] " Phil Yang
2 siblings, 0 replies; 7+ messages in thread
From: Lijian Zhang @ 2020-03-19 10:51 UTC (permalink / raw)
To: Phil Yang, dev, konstantin.ananyev, wenzhuo.lu
Cc: qi.z.zhang, Gavin Hu, Honnappa Nagarahalli, nd, stable
This issue is firstly observed with an ixgbe NIC in VPP project, which is software switching application based on DPDK.
There's a daemon thread running in background keeping polling hardware link status, using ixgbe_dev_link_update_share().
Once flag IXGBE_FLAG_NEED_LINK_CONFIG is set, ixgbe_dev_link_update_share() will just return link down status without actually polling hardware status.
In the issue, flag IXGBE_FLAG_NEED_LINK_CONFIG is always set, and never be cleared, meaning ixgbe_dev_link_update_share() cannot get hardware status, but always get link down status.
The condition causing IXGBE_FLAG_NEED_LINK_CONFIG always set is as below.
The ixgbe_dev_link_update_share() is always running in the background.
1. In the beginning, IXGBE_FLAG_NEED_LINK_CONFIG is 0 and it is link down status.
2. ixgbe_dev_link_update_share() will set IXGBE_FLAG_NEED_LINK_CONFIG to 1
3. Then it triggers ixgbe_dev_setup_link_thread_handler() thread to configure the interface.
4. At the end of configuring thread, ixgbe_dev_setup_link_thread_handler() will clear the flag IXGBE_FLAG_NEED_LINK_CONFIG.
5. With IXGBE_FLAG_NEED_LINK_CONFIG being cleared, ixgbe_dev_link_update_share() can poll hardware link status in the next round.
But when the user is setting interface link up or down in the CLI, it will call ixgbe_dev_start() or ixgbe_dev_stop(). In both function, they will call ixgbe_dev_cancel_link_thread() to interrupt any running configuring thread (which is running in above step 3 and step 4), without clearing the flag IXGBE_FLAG_NEED_LINK_CONFIG. This will leave IXGBE_FLAG_NEED_LINK_CONFIG always set, and ixgbe_dev_link_update_share() cannot get hardware status.
Thanks.
> -----Original Message-----
> From: Phil Yang <phil.yang@arm.com>
> Sent: 2020年3月19日 14:42
> To: dev@dpdk.org; konstantin.ananyev@intel.com; wenzhuo.lu@intel.com
> Cc: qi.z.zhang@intel.com; Lijian Zhang <Lijian.Zhang@arm.com>; Gavin Hu
> <Gavin.Hu@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>; stable@dpdk.org
> Subject: [PATCH] net/ixgbe: fix link state timing issue on fiber ports
>
> With some models of fiber ports (e.g. X520-2 device ID 0x10fb), it is possible
> when a port is started to experience a timing issue which prevents the link
> from ever being fully set up.
>
> In ixgbe_dev_link_update_share(), if the media type is fiber and the link is
> down, a flag (IXGBE_FLAG_NEED_LINK_CONFIG) is set. A callback to
> ixgbe_dev_setup_link_thread_handler() is scheduled which should try to set up
> the link and clear the flag afterwards.
>
> If the device is started before the flag is cleared, the scheduled callback is
> cancelled. This causes the flag to remain set and subsequent calls to
> ixgbe_dev_link_update_share() return without trying to retrieve the link state
> because the flag is set.
>
> In ixgbe_dev_cancel_link_thread(), after cancelling the callback, unset the flag
> on the device to avoid this condition.
>
> Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
> Cc: stable@dpdk.org
>
> Bugzilla ID: 388
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Signed-off-by: Lijian Zhang <lijian.zhang@arm.com>
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> ---
> drivers/net/ixgbe/ixgbe_ethdev.c | 14 +++++++++++++-
> 1 file changed, 13 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c
> b/drivers/net/ixgbe/ixgbe_ethdev.c
> index 23b3f5b..2b65750 100644
> --- a/drivers/net/ixgbe/ixgbe_ethdev.c
> +++ b/drivers/net/ixgbe/ixgbe_ethdev.c
> @@ -4147,11 +4147,19 @@ static void
> ixgbe_dev_cancel_link_thread(struct rte_eth_dev *dev) {
> struct ixgbe_adapter *ad = dev->data->dev_private;
> + struct ixgbe_interrupt *intr =
> + IXGBE_DEV_PRIVATE_TO_INTR(dev->data->dev_private);
> void *retval;
>
> if (rte_atomic32_read(&ad->link_thread_running)) {
> pthread_cancel(ad->link_thread_tid);
> pthread_join(ad->link_thread_tid, &retval);
> + /* clear this flag once the thread has been
> + * cancelled, to avoid link status error in
> + * case unfinished threads cannot clean up
> + * this flag.
> + */
> + intr->flags &= ~IXGBE_FLAG_NEED_LINK_CONFIG;
> rte_atomic32_clear(&ad->link_thread_running);
> }
> }
> @@ -4262,8 +4270,12 @@ ixgbe_dev_link_update_share(struct rte_eth_dev
> *dev,
>
> if (link_up == 0) {
> if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
> - intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
> if (rte_atomic32_test_and_set(&ad-
> >link_thread_running)) {
> + /* To avoid race condition between threads,
> set
> + * the IXGBE_FLAG_NEED_LINK_CONFIG flag
> only
> + * when there is no link thread running.
> + */
> + intr->flags |=
> IXGBE_FLAG_NEED_LINK_CONFIG;
> if (rte_ctrl_thread_create(&ad-
> >link_thread_tid,
> "ixgbe-link-handler",
> NULL,
> --
> 2.7.4
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber ports
2020-03-19 6:41 [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber ports Phil Yang
2020-03-19 10:51 ` Lijian Zhang
@ 2020-05-08 2:48 ` Phil Yang
2020-05-08 8:36 ` Ye Xiaolong
2020-05-08 10:28 ` [dpdk-dev] [PATCH v2] " Phil Yang
2 siblings, 1 reply; 7+ messages in thread
From: Phil Yang @ 2020-05-08 2:48 UTC (permalink / raw)
To: Phil Yang, dev, konstantin.ananyev, wenzhuo.lu, Ye Xiaolong
Cc: qi.z.zhang, Lijian Zhang, Gavin Hu, Honnappa Nagarahalli, nd, stable, nd
> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber
> ports
>
> With some models of fiber ports (e.g. X520-2 device ID 0x10fb), it
> is possible when a port is started to experience a timing issue
> which prevents the link from ever being fully set up.
>
> In ixgbe_dev_link_update_share(), if the media type is fiber and the
> link is down, a flag (IXGBE_FLAG_NEED_LINK_CONFIG) is set. A callback
> to ixgbe_dev_setup_link_thread_handler() is scheduled which should
> try to set up the link and clear the flag afterwards.
>
> If the device is started before the flag is cleared, the scheduled
> callback is cancelled. This causes the flag to remain set and
> subsequent calls to ixgbe_dev_link_update_share() return
> without trying to retrieve the link state because the flag is set.
>
> In ixgbe_dev_cancel_link_thread(), after cancelling the callback,
> unset the flag on the device to avoid this condition.
>
> Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
> Cc: stable@dpdk.org
>
> Bugzilla ID: 388
>
> Signed-off-by: Phil Yang <phil.yang@arm.com>
> Signed-off-by: Lijian Zhang <lijian.zhang@arm.com>
> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> ---
Ping.
Thanks,
Phil
<Snip>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber ports
2020-05-08 2:48 ` Phil Yang
@ 2020-05-08 8:36 ` Ye Xiaolong
2020-05-08 10:31 ` Phil Yang
0 siblings, 1 reply; 7+ messages in thread
From: Ye Xiaolong @ 2020-05-08 8:36 UTC (permalink / raw)
To: Phil Yang
Cc: dev, konstantin.ananyev, wenzhuo.lu, qi.z.zhang, Lijian Zhang,
Gavin Hu, Honnappa Nagarahalli, nd, stable
On 05/08, Phil Yang wrote:
>> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber
>> ports
>>
>> With some models of fiber ports (e.g. X520-2 device ID 0x10fb), it
>> is possible when a port is started to experience a timing issue
>> which prevents the link from ever being fully set up.
>>
>> In ixgbe_dev_link_update_share(), if the media type is fiber and the
>> link is down, a flag (IXGBE_FLAG_NEED_LINK_CONFIG) is set. A callback
>> to ixgbe_dev_setup_link_thread_handler() is scheduled which should
>> try to set up the link and clear the flag afterwards.
>>
>> If the device is started before the flag is cleared, the scheduled
>> callback is cancelled. This causes the flag to remain set and
>> subsequent calls to ixgbe_dev_link_update_share() return
>> without trying to retrieve the link state because the flag is set.
>>
>> In ixgbe_dev_cancel_link_thread(), after cancelling the callback,
>> unset the flag on the device to avoid this condition.
>>
>> Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
>> Cc: stable@dpdk.org
>>
>> Bugzilla ID: 388
>>
>> Signed-off-by: Phil Yang <phil.yang@arm.com>
>> Signed-off-by: Lijian Zhang <lijian.zhang@arm.com>
>> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
>> ---
>
>Ping.
This fix makes sense to me, thanks for the work.
And it seems can't be applied to latest dpdk-next-net-intel cleanly, could you
do a rebase?
Thanks,
Xiaolong
>
>Thanks,
>Phil
>
><Snip>
^ permalink raw reply [flat|nested] 7+ messages in thread
* [dpdk-dev] [PATCH v2] net/ixgbe: fix link state timing issue on fiber ports
2020-03-19 6:41 [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber ports Phil Yang
2020-03-19 10:51 ` Lijian Zhang
2020-05-08 2:48 ` Phil Yang
@ 2020-05-08 10:28 ` Phil Yang
2020-05-11 2:49 ` Ye Xiaolong
2 siblings, 1 reply; 7+ messages in thread
From: Phil Yang @ 2020-05-08 10:28 UTC (permalink / raw)
To: dev, xiaolong.ye; +Cc: wei.zhao1, jia.guo, Lijian.Zhang, nd, stable
In ixgbe_dev_link_update_share(), if the media type is fiber and the
link is down, a flag (IXGBE_FLAG_NEED_LINK_CONFIG) is set. A callback
to ixgbe_dev_setup_link_thread_handler() is scheduled which should
try to set up the link and clear the flag afterwards. This flag works
as a guard variable between threads.
To avoid potential race condition between threads, set the
IXGBE_FLAG_NEED_LINK_CONFIG flag only when there is no link thread
running.
Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
Cc: stable@dpdk.org
Bugzilla ID: 388
Signed-off-by: Phil Yang <phil.yang@arm.com>
Reviewed-by: Lijian Zhang <lijian.zhang@arm.com>
---
v2:
1. rebase code.
drivers/net/ixgbe/ixgbe_ethdev.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
index aa1e8aa..dc19f0c 100644
--- a/drivers/net/ixgbe/ixgbe_ethdev.c
+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
@@ -4276,9 +4276,13 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
if (link_up == 0) {
if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
- intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
ixgbe_dev_wait_setup_link_complete(dev, 0);
if (rte_atomic32_test_and_set(&ad->link_thread_running)) {
+ /* To avoid race condition between threads, set
+ * the IXGBE_FLAG_NEED_LINK_CONFIG flag only
+ * when there is no link thread running.
+ */
+ intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
if (rte_ctrl_thread_create(&ad->link_thread_tid,
"ixgbe-link-handler",
NULL,
--
2.7.4
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber ports
2020-05-08 8:36 ` Ye Xiaolong
@ 2020-05-08 10:31 ` Phil Yang
0 siblings, 0 replies; 7+ messages in thread
From: Phil Yang @ 2020-05-08 10:31 UTC (permalink / raw)
To: Ye Xiaolong
Cc: dev, konstantin.ananyev, wenzhuo.lu, qi.z.zhang, Lijian Zhang,
Gavin Hu, Honnappa Nagarahalli, nd, stable, nd
> -----Original Message-----
> From: Ye Xiaolong <xiaolong.ye@intel.com>
> Sent: Friday, May 8, 2020 4:36 PM
> To: Phil Yang <Phil.Yang@arm.com>
> Cc: dev@dpdk.org; konstantin.ananyev@intel.com; wenzhuo.lu@intel.com;
> qi.z.zhang@intel.com; Lijian Zhang <Lijian.Zhang@arm.com>; Gavin Hu
> <Gavin.Hu@arm.com>; Honnappa Nagarahalli
> <Honnappa.Nagarahalli@arm.com>; nd <nd@arm.com>; stable@dpdk.org
> Subject: Re: [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber
> ports
>
> On 05/08, Phil Yang wrote:
> >> Subject: [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber
> >> ports
> >>
> >> With some models of fiber ports (e.g. X520-2 device ID 0x10fb), it
> >> is possible when a port is started to experience a timing issue
> >> which prevents the link from ever being fully set up.
> >>
> >> In ixgbe_dev_link_update_share(), if the media type is fiber and the
> >> link is down, a flag (IXGBE_FLAG_NEED_LINK_CONFIG) is set. A callback
> >> to ixgbe_dev_setup_link_thread_handler() is scheduled which should
> >> try to set up the link and clear the flag afterwards.
> >>
> >> If the device is started before the flag is cleared, the scheduled
> >> callback is cancelled. This causes the flag to remain set and
> >> subsequent calls to ixgbe_dev_link_update_share() return
> >> without trying to retrieve the link state because the flag is set.
> >>
> >> In ixgbe_dev_cancel_link_thread(), after cancelling the callback,
> >> unset the flag on the device to avoid this condition.
> >>
> >> Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
> >> Cc: stable@dpdk.org
> >>
> >> Bugzilla ID: 388
> >>
> >> Signed-off-by: Phil Yang <phil.yang@arm.com>
> >> Signed-off-by: Lijian Zhang <lijian.zhang@arm.com>
> >> Reviewed-by: Gavin Hu <gavin.hu@arm.com>
> >> ---
> >
> >Ping.
>
> This fix makes sense to me, thanks for the work.
> And it seems can't be applied to latest dpdk-next-net-intel cleanly, could you
> do a rebase?
>
Thank you Xiaolong.
I updated in V2, please review it.
Thanks,
Phil
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [dpdk-dev] [PATCH v2] net/ixgbe: fix link state timing issue on fiber ports
2020-05-08 10:28 ` [dpdk-dev] [PATCH v2] " Phil Yang
@ 2020-05-11 2:49 ` Ye Xiaolong
0 siblings, 0 replies; 7+ messages in thread
From: Ye Xiaolong @ 2020-05-11 2:49 UTC (permalink / raw)
To: Phil Yang; +Cc: dev, wei.zhao1, jia.guo, Lijian.Zhang, nd, stable
On 05/08, Phil Yang wrote:
>In ixgbe_dev_link_update_share(), if the media type is fiber and the
>link is down, a flag (IXGBE_FLAG_NEED_LINK_CONFIG) is set. A callback
>to ixgbe_dev_setup_link_thread_handler() is scheduled which should
>try to set up the link and clear the flag afterwards. This flag works
>as a guard variable between threads.
>
>To avoid potential race condition between threads, set the
>IXGBE_FLAG_NEED_LINK_CONFIG flag only when there is no link thread
>running.
>
>Fixes: 819d0d1d57f1 ("net/ixgbe: fix blocking system events")
>Cc: stable@dpdk.org
>
>Bugzilla ID: 388
>
>Signed-off-by: Phil Yang <phil.yang@arm.com>
>Reviewed-by: Lijian Zhang <lijian.zhang@arm.com>
>---
>v2:
>1. rebase code.
>
> drivers/net/ixgbe/ixgbe_ethdev.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
>diff --git a/drivers/net/ixgbe/ixgbe_ethdev.c b/drivers/net/ixgbe/ixgbe_ethdev.c
>index aa1e8aa..dc19f0c 100644
>--- a/drivers/net/ixgbe/ixgbe_ethdev.c
>+++ b/drivers/net/ixgbe/ixgbe_ethdev.c
>@@ -4276,9 +4276,13 @@ ixgbe_dev_link_update_share(struct rte_eth_dev *dev,
>
> if (link_up == 0) {
> if (ixgbe_get_media_type(hw) == ixgbe_media_type_fiber) {
>- intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
> ixgbe_dev_wait_setup_link_complete(dev, 0);
> if (rte_atomic32_test_and_set(&ad->link_thread_running)) {
>+ /* To avoid race condition between threads, set
>+ * the IXGBE_FLAG_NEED_LINK_CONFIG flag only
>+ * when there is no link thread running.
>+ */
>+ intr->flags |= IXGBE_FLAG_NEED_LINK_CONFIG;
> if (rte_ctrl_thread_create(&ad->link_thread_tid,
> "ixgbe-link-handler",
> NULL,
>--
>2.7.4
>
Acked-by: Xiaolong Ye <xiaolong.ye@intel.com>
Applied to dpdk-next-net-intel, Thanks.
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2020-05-11 2:57 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-03-19 6:41 [dpdk-dev] [PATCH] net/ixgbe: fix link state timing issue on fiber ports Phil Yang
2020-03-19 10:51 ` Lijian Zhang
2020-05-08 2:48 ` Phil Yang
2020-05-08 8:36 ` Ye Xiaolong
2020-05-08 10:31 ` Phil Yang
2020-05-08 10:28 ` [dpdk-dev] [PATCH v2] " Phil Yang
2020-05-11 2:49 ` Ye Xiaolong
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).