This fix is proposed as an alternative to https://patches.dpdk.org/patch/65088/ The benefit is to fix a race condition in addition. Thomas Monjalon (2): app/testpmd: rename function for detaching by devargs app/testpmd: fix hot-unplug detaching app/test-pmd/cmdline.c | 2 +- app/test-pmd/testpmd.c | 51 +++++++++++++++++++++++++----------------- app/test-pmd/testpmd.h | 2 +- 3 files changed, 32 insertions(+), 23 deletions(-) -- 2.25.0
There is a function detach_port_device() which takes a port_id, and a function detach_device() which takes a devargs string. In order to add a third function accepting a rte_device pointer, the function detach_device() is renamed into detach_devargs(). Signed-off-by: Thomas Monjalon <thomas@monjalon.net> --- app/test-pmd/cmdline.c | 2 +- app/test-pmd/testpmd.c | 3 ++- app/test-pmd/testpmd.h | 2 +- 3 files changed, 4 insertions(+), 3 deletions(-) diff --git a/app/test-pmd/cmdline.c b/app/test-pmd/cmdline.c index 99e4168103..38b6d804c6 100644 --- a/app/test-pmd/cmdline.c +++ b/app/test-pmd/cmdline.c @@ -1541,7 +1541,7 @@ static void cmd_operate_detach_device_parsed(void *parsed_result, struct cmd_operate_detach_device_result *res = parsed_result; if (!strcmp(res->keyword, "detach")) - detach_device(res->identifier); + detach_devargs(res->identifier); else printf("Unknown parameter\n"); } diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index a1dd4043e0..11203cb03d 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -2683,7 +2683,8 @@ detach_port_device(portid_t port_id) } void -detach_device(char *identifier) +void +detach_devargs(char *identifier) { struct rte_dev_iterator iterator; struct rte_devargs da; diff --git a/app/test-pmd/testpmd.h b/app/test-pmd/testpmd.h index 3dd5fc750b..b8eb28f62b 100644 --- a/app/test-pmd/testpmd.h +++ b/app/test-pmd/testpmd.h @@ -810,7 +810,7 @@ void stop_port(portid_t pid); void close_port(portid_t pid); void reset_port(portid_t pid); void attach_port(char *identifier); -void detach_device(char *identifier); +void detach_devargs(char *identifier); void detach_port_device(portid_t port_id); int all_ports_stopped(void); int port_is_stopped(portid_t port_id); -- 2.25.0
There is a possible race condition in the hotplug path in rmv_port_callback(). If a port is created between close_port(port_id) and detach_port_device(port_id), then the port_id will have been reallocated to a different device which will be wrongly detached. Since a check was added in detach_port_device() for manual detach case, the hotplug path was even more broken. It became impossible to run because the new check prevented to run detach_port_device() after the port is closed. The solution for both issues is to not rely on the port_id for detaching the rte_device. The function detach_port_device() is split to allow calling detach_device() directly with the rte_device pointer, saved before closing the port. Fixes: 43d0e304980a ("app/testpmd: fix invalid port detaching") Cc: stable@dpdk.org Signed-off-by: Thomas Monjalon <thomas@monjalon.net> --- app/test-pmd/testpmd.c | 48 ++++++++++++++++++++++++------------------ 1 file changed, 28 insertions(+), 20 deletions(-) diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index 11203cb03d..035836adfb 100644 --- a/app/test-pmd/testpmd.c +++ b/app/test-pmd/testpmd.c @@ -2633,32 +2633,17 @@ setup_attached_port(portid_t pi) printf("Done\n"); } -void -detach_port_device(portid_t port_id) +static void +detach_device(struct rte_device *dev) { - struct rte_device *dev; portid_t sibling; - printf("Removing a device...\n"); - - if (port_id_is_invalid(port_id, ENABLED_WARN)) - return; - - dev = rte_eth_devices[port_id].device; if (dev == NULL) { printf("Device already removed\n"); return; } - if (ports[port_id].port_status != RTE_PORT_CLOSED) { - if (ports[port_id].port_status != RTE_PORT_STOPPED) { - printf("Port not stopped\n"); - return; - } - printf("Port was not closed\n"); - if (ports[port_id].flow_list) - port_flow_flush(port_id); - } + printf("Removing a device...\n"); if (rte_dev_remove(dev) < 0) { TESTPMD_LOG(ERR, "Failed to detach device %s\n", dev->name); @@ -2676,13 +2661,31 @@ detach_port_device(portid_t port_id) remove_invalid_ports(); - printf("Device of port %u is detached\n", port_id); + printf("Device is detached\n"); printf("Now total ports is %d\n", nb_ports); printf("Done\n"); return; } void +detach_port_device(portid_t port_id) +{ + if (port_id_is_invalid(port_id, ENABLED_WARN)) + return; + + if (ports[port_id].port_status != RTE_PORT_CLOSED) { + if (ports[port_id].port_status != RTE_PORT_STOPPED) { + printf("Port not stopped\n"); + return; + } + printf("Port was not closed\n"); + if (ports[port_id].flow_list) + port_flow_flush(port_id); + } + + detach_device(rte_eth_devices[port_id].device); +} + void detach_devargs(char *identifier) { @@ -2873,6 +2876,7 @@ rmv_port_callback(void *arg) int need_to_start = 0; int org_no_link_check = no_link_check; portid_t port_id = (intptr_t)arg; + struct rte_device *dev; RTE_ETH_VALID_PORTID_OR_RET(port_id); @@ -2883,8 +2887,12 @@ rmv_port_callback(void *arg) no_link_check = 1; stop_port(port_id); no_link_check = org_no_link_check; + + /* Save rte_device pointer before closing ethdev port */ + dev = rte_eth_devices[port_id].device; close_port(port_id); - detach_port_device(port_id); + detach_device(dev); /* might be already removed or have more ports */ + if (need_to_start) start_packet_forwarding(0); } -- 2.25.0
On 2/13/2020 3:52 PM, Thomas Monjalon wrote:
> This fix is proposed as an alternative to
> https://patches.dpdk.org/patch/65088/
> The benefit is to fix a race condition in addition.
>
> Thomas Monjalon (2):
> app/testpmd: rename function for detaching by devargs
> app/testpmd: fix hot-unplug detaching
>
For series,
Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
Series applied to dpdk-next-net/master, thanks.
(Patch splitting fixed on 'detach_devargs()' return type)
13/02/2020 19:27, Ferruh Yigit:
> On 2/13/2020 3:52 PM, Thomas Monjalon wrote:
> > This fix is proposed as an alternative to
> > https://patches.dpdk.org/patch/65088/
> > The benefit is to fix a race condition in addition.
> >
> > Thomas Monjalon (2):
> > app/testpmd: rename function for detaching by devargs
> > app/testpmd: fix hot-unplug detaching
> >
>
> For series,
> Reviewed-by: Ferruh Yigit <ferruh.yigit@intel.com>
>
> Series applied to dpdk-next-net/master, thanks.
>
> (Patch splitting fixed on 'detach_devargs()' return type)
Good catch, thank you.
It seems you are well testing compilation after each patch :-)
Hi Thomas Thanks for the patches, I Saw it just now. please see small comment below: From: Thomas Monjalon > There is a possible race condition in the hotplug path in rmv_port_callback(). > If a port is created between > close_port(port_id) and detach_port_device(port_id), then the port_id will > have been reallocated to a different device which will be wrongly detached. > > Since a check was added in detach_port_device() for manual detach case, > the hotplug path was even more broken. > It became impossible to run because the new check prevented to run > detach_port_device() after the port is closed. > > The solution for both issues is to not rely on the port_id for detaching the > rte_device. > The function detach_port_device() is split to allow calling > detach_device() directly with the rte_device pointer, saved before closing > the port. > > Fixes: 43d0e304980a ("app/testpmd: fix invalid port detaching") If you fix the race, Don't you think you need to add fixes line for the patch which created the race? > Cc: stable@dpdk.org > > Signed-off-by: Thomas Monjalon <thomas@monjalon.net> > --- > app/test-pmd/testpmd.c | 48 ++++++++++++++++++++++++------------------ > 1 file changed, 28 insertions(+), 20 deletions(-) > > diff --git a/app/test-pmd/testpmd.c b/app/test-pmd/testpmd.c index > 11203cb03d..035836adfb 100644 > --- a/app/test-pmd/testpmd.c > +++ b/app/test-pmd/testpmd.c > @@ -2633,32 +2633,17 @@ setup_attached_port(portid_t pi) > printf("Done\n"); > } > > -void > -detach_port_device(portid_t port_id) > +static void > +detach_device(struct rte_device *dev) > { > - struct rte_device *dev; > portid_t sibling; > > - printf("Removing a device...\n"); > - > - if (port_id_is_invalid(port_id, ENABLED_WARN)) > - return; > - > - dev = rte_eth_devices[port_id].device; > if (dev == NULL) { > printf("Device already removed\n"); > return; > } > > - if (ports[port_id].port_status != RTE_PORT_CLOSED) { > - if (ports[port_id].port_status != RTE_PORT_STOPPED) { > - printf("Port not stopped\n"); > - return; > - } > - printf("Port was not closed\n"); > - if (ports[port_id].flow_list) > - port_flow_flush(port_id); > - } > + printf("Removing a device...\n"); > > if (rte_dev_remove(dev) < 0) { > TESTPMD_LOG(ERR, "Failed to detach device %s\n", dev- > >name); @@ -2676,13 +2661,31 @@ detach_port_device(portid_t port_id) > > remove_invalid_ports(); > > - printf("Device of port %u is detached\n", port_id); > + printf("Device is detached\n"); > printf("Now total ports is %d\n", nb_ports); > printf("Done\n"); > return; > } > > void > +detach_port_device(portid_t port_id) > +{ > + if (port_id_is_invalid(port_id, ENABLED_WARN)) > + return; > + > + if (ports[port_id].port_status != RTE_PORT_CLOSED) { > + if (ports[port_id].port_status != RTE_PORT_STOPPED) { > + printf("Port not stopped\n"); > + return; > + } > + printf("Port was not closed\n"); > + if (ports[port_id].flow_list) > + port_flow_flush(port_id); > + } > + > + detach_device(rte_eth_devices[port_id].device); > +} > + > void > detach_devargs(char *identifier) > { > @@ -2873,6 +2876,7 @@ rmv_port_callback(void *arg) > int need_to_start = 0; > int org_no_link_check = no_link_check; > portid_t port_id = (intptr_t)arg; > + struct rte_device *dev; > > RTE_ETH_VALID_PORTID_OR_RET(port_id); > > @@ -2883,8 +2887,12 @@ rmv_port_callback(void *arg) > no_link_check = 1; > stop_port(port_id); > no_link_check = org_no_link_check; > + > + /* Save rte_device pointer before closing ethdev port */ > + dev = rte_eth_devices[port_id].device; > close_port(port_id); > - detach_port_device(port_id); > + detach_device(dev); /* might be already removed or have more > ports */ > + > if (need_to_start) > start_packet_forwarding(0); > } > -- > 2.25.0
16/02/2020 09:09, Matan Azrad: > Hi Thomas > > Thanks for the patches, I Saw it just now. > please see small comment below: > > From: Thomas Monjalon > > There is a possible race condition in the hotplug path in rmv_port_callback(). > > If a port is created between > > close_port(port_id) and detach_port_device(port_id), then the port_id will > > have been reallocated to a different device which will be wrongly detached. > > > > Since a check was added in detach_port_device() for manual detach case, > > the hotplug path was even more broken. > > It became impossible to run because the new check prevented to run > > detach_port_device() after the port is closed. > > > > The solution for both issues is to not rely on the port_id for detaching the > > rte_device. > > The function detach_port_device() is split to allow calling > > detach_device() directly with the rte_device pointer, saved before closing > > the port. > > > > Fixes: 43d0e304980a ("app/testpmd: fix invalid port detaching") > > If you fix the race, Don't you think you need to add fixes line for the patch which created the race? Yes, you're right, I forgot it. It is too late now unfortunately, but we may request to backport it properly in 18.11 as well. Please Kevin, add this tag so it will be part of 18.11: Fixes: cbb4c648c5df ("ethdev: use device handle to detach") > > Cc: stable@dpdk.org
On 16/02/2020 09:47, Thomas Monjalon wrote: > 16/02/2020 09:09, Matan Azrad: >> Hi Thomas >> >> Thanks for the patches, I Saw it just now. >> please see small comment below: >> >> From: Thomas Monjalon >>> There is a possible race condition in the hotplug path in rmv_port_callback(). >>> If a port is created between >>> close_port(port_id) and detach_port_device(port_id), then the port_id will >>> have been reallocated to a different device which will be wrongly detached. >>> >>> Since a check was added in detach_port_device() for manual detach case, >>> the hotplug path was even more broken. >>> It became impossible to run because the new check prevented to run >>> detach_port_device() after the port is closed. >>> >>> The solution for both issues is to not rely on the port_id for detaching the >>> rte_device. >>> The function detach_port_device() is split to allow calling >>> detach_device() directly with the rte_device pointer, saved before closing >>> the port. >>> >>> Fixes: 43d0e304980a ("app/testpmd: fix invalid port detaching") >> >> If you fix the race, Don't you think you need to add fixes line for the patch which created the race? > > Yes, you're right, I forgot it. > It is too late now unfortunately, but we may request to backport it > properly in 18.11 as well. > > Please Kevin, add this tag so it will be part of 18.11: > > Fixes: cbb4c648c5df ("ethdev: use device handle to detach") > Ack. Will adjust the tag when applying, thanks. >>> Cc: stable@dpdk.org > > >