* RE: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-09 17:19 ` Dumitrescu, Cristian
@ 2023-03-09 19:09 ` Dumitrescu, Cristian
2023-03-09 20:22 ` Stephen Hemminger
2023-03-10 12:00 ` Ferruh Yigit
2 siblings, 0 replies; 20+ messages in thread
From: Dumitrescu, Cristian @ 2023-03-09 19:09 UTC (permalink / raw)
To: Dumitrescu, Cristian, Stephen Hemminger, Jangra, Yogesh, Singh,
Aman Deep, Zhang, Yuying
Cc: dev, R, Kamalakannan, Suresh Narayane, Harshad
> -----Original Message-----
> From: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Sent: Thursday, March 9, 2023 5:20 PM
> To: Stephen Hemminger <stephen@networkplumber.org>; Jangra, Yogesh
> <yogesh.jangra@intel.com>
> Cc: dev@dpdk.org; R, Kamalakannan <kamalakannan.r@intel.com>; Suresh
> Narayane, Harshad <harshad.suresh.narayane@intel.com>
> Subject: RE: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
>
>
>
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Thursday, March 9, 2023 4:31 PM
> > To: Jangra, Yogesh <yogesh.jangra@intel.com>
> > Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; R,
> > Kamalakannan <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
> > <harshad.suresh.narayane@intel.com>
> > Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
> >
> > On Thu, 9 Mar 2023 14:42:49 +0000
> > Yogesh Jangra <yogesh.jangra@intel.com> wrote:
> >
> > > + /*
> > > + * SoftNIC runs on the sevice core, it uses the resources from
> > > + * the testpmd application. When we run quit command, the
> > testpmd
> > > + * application stops ethdev ports first, SoftNIC will try to
> > > + * access the port and sometimes that result in segmentation
> > > + * error. So first closing the SoftNIC port.
> > > + */
> > > + RTE_ETH_FOREACH_DEV(pt_id) {
> > > + if (!strcmp(ports[pt_id].dev_info.driver_name,
> > "net_softnic")) {
> > > + stop_port(pt_id);
> > > + close_port(pt_id);
> > > + }
> > > + }
> > > +
> >
> > NAK
> > No driver specific hacks please.
> >
> > Instead fix the driver design or bug please.
>
> Hi Stephen,
>
> This is not a Soft NIC driver-specific hack, this is required for working around
> some of the ethdev drivers that don't implement the stop() API correctly and
> free up the device queues or some other internal resources on stop() instead of
> close().
>
> The Soft NIC is a meta-device that sits on top of other "physical" ethdev devices,
> so when the Soft NIC device continues to poll the queues of those physical
> devices after their queues have been freed, the Soft NIC will get a segfault. This
> fix is required to protect against this sort of incorrect driver behavior by simply
> stopping the Soft NIC devices first.
>
> We already have several driver specific branches in the test-pmd for e.g. LAG or
> virtual devices; IMO this small change falls in the same category and it should
> get accepted.
>
> Please let us know if this makes sense to you?
>
> Regards,
> Cristian
Adding Aman and Yuying, the test-pmd maintainers, to this conversation.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-09 17:19 ` Dumitrescu, Cristian
2023-03-09 19:09 ` Dumitrescu, Cristian
@ 2023-03-09 20:22 ` Stephen Hemminger
2023-03-10 9:09 ` Singh, Aman Deep
2023-03-10 12:00 ` Ferruh Yigit
2 siblings, 1 reply; 20+ messages in thread
From: Stephen Hemminger @ 2023-03-09 20:22 UTC (permalink / raw)
To: Dumitrescu, Cristian
Cc: Jangra, Yogesh, dev, R, Kamalakannan, Suresh Narayane, Harshad
On Thu, 9 Mar 2023 17:19:59 +0000
"Dumitrescu, Cristian" <cristian.dumitrescu@intel.com> wrote:
> > -----Original Message-----
> > From: Stephen Hemminger <stephen@networkplumber.org>
> > Sent: Thursday, March 9, 2023 4:31 PM
> > To: Jangra, Yogesh <yogesh.jangra@intel.com>
> > Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; R,
> > Kamalakannan <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
> > <harshad.suresh.narayane@intel.com>
> > Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
> >
> > On Thu, 9 Mar 2023 14:42:49 +0000
> > Yogesh Jangra <yogesh.jangra@intel.com> wrote:
> >
> > > + /*
> > > + * SoftNIC runs on the sevice core, it uses the resources from
> > > + * the testpmd application. When we run quit command, the
> > testpmd
> > > + * application stops ethdev ports first, SoftNIC will try to
> > > + * access the port and sometimes that result in segmentation
> > > + * error. So first closing the SoftNIC port.
> > > + */
> > > + RTE_ETH_FOREACH_DEV(pt_id) {
> > > + if (!strcmp(ports[pt_id].dev_info.driver_name,
> > "net_softnic")) {
> > > + stop_port(pt_id);
> > > + close_port(pt_id);
> > > + }
> > > + }
> > > +
> >
> > NAK
> > No driver specific hacks please.
> >
> > Instead fix the driver design or bug please.
>
> Hi Stephen,
>
> This is not a Soft NIC driver-specific hack, this is required for working around some of the ethdev drivers that don't implement the stop() API correctly and free up the device queues or some other internal resources on stop() instead of close().
>
> The Soft NIC is a meta-device that sits on top of other "physical" ethdev devices, so when the Soft NIC device continues to poll the queues of those physical devices after their queues have been freed, the Soft NIC will get a segfault. This fix is required to protect against this sort of incorrect driver behavior by simply stopping the Soft NIC devices first.
>
> We already have several driver specific branches in the test-pmd for e.g. LAG or virtual devices; IMO this small change falls in the same category and it should get accepted.
>
> Please let us know if this makes sense to you?
>
> Regards,
> Cristian
If the application has to do this then something is wrong with the architecture.
You aren't propagating state changes through in a safe manner.
Other applications will have same issue.
If LAG or virtual devices have similar problems then a generic solution is needed.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-09 20:22 ` Stephen Hemminger
@ 2023-03-10 9:09 ` Singh, Aman Deep
2023-03-10 13:45 ` Dumitrescu, Cristian
0 siblings, 1 reply; 20+ messages in thread
From: Singh, Aman Deep @ 2023-03-10 9:09 UTC (permalink / raw)
To: Stephen Hemminger, Dumitrescu, Cristian
Cc: Jangra, Yogesh, dev, R, Kamalakannan, Suresh Narayane, Harshad
On 3/10/2023 1:52 AM, Stephen Hemminger wrote:
> On Thu, 9 Mar 2023 17:19:59 +0000
> "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com> wrote:
>
>>> -----Original Message-----
>>> From: Stephen Hemminger <stephen@networkplumber.org>
>>> Sent: Thursday, March 9, 2023 4:31 PM
>>> To: Jangra, Yogesh <yogesh.jangra@intel.com>
>>> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; R,
>>> Kamalakannan <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
>>> <harshad.suresh.narayane@intel.com>
>>> Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
>>>
>>> On Thu, 9 Mar 2023 14:42:49 +0000
>>> Yogesh Jangra <yogesh.jangra@intel.com> wrote:
>>>
>>>> + /*
>>>> + * SoftNIC runs on the sevice core, it uses the resources from
>>>> + * the testpmd application. When we run quit command, the
>>> testpmd
>>>> + * application stops ethdev ports first, SoftNIC will try to
>>>> + * access the port and sometimes that result in segmentation
>>>> + * error. So first closing the SoftNIC port.
>>>> + */
>>>> + RTE_ETH_FOREACH_DEV(pt_id) {
>>>> + if (!strcmp(ports[pt_id].dev_info.driver_name,
>>> "net_softnic")) {
>>>> + stop_port(pt_id);
>>>> + close_port(pt_id);
>>>> + }
>>>> + }
>>>> +
>>> NAK
>>> No driver specific hacks please.
>>>
>>> Instead fix the driver design or bug please.
>> Hi Stephen,
>>
>> This is not a Soft NIC driver-specific hack, this is required for working around some of the ethdev drivers that don't implement the stop() API correctly and free up the device queues or some other internal resources on stop() instead of close().
>>
>> The Soft NIC is a meta-device that sits on top of other "physical" ethdev devices, so when the Soft NIC device continues to poll the queues of those physical devices after their queues have been freed, the Soft NIC will get a segfault. This fix is required to protect against this sort of incorrect driver behavior by simply stopping the Soft NIC devices first.
>>
>> We already have several driver specific branches in the test-pmd for e.g. LAG or virtual devices; IMO this small change falls in the same category and it should get accepted.
>>
>> Please let us know if this makes sense to you?
>>
>> Regards,
>> Cristian
> If the application has to do this then something is wrong with the architecture.
>
>
> You aren't propagating state changes through in a safe manner.
> Other applications will have same issue.
>
> If LAG or virtual devices have similar problems then a generic solution is needed.
At exit, there is call to stop_packet_forwarding(). Shouldn't that stop SoftNic from polling of the queues ?
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 9:09 ` Singh, Aman Deep
@ 2023-03-10 13:45 ` Dumitrescu, Cristian
0 siblings, 0 replies; 20+ messages in thread
From: Dumitrescu, Cristian @ 2023-03-10 13:45 UTC (permalink / raw)
To: Singh, Aman Deep, Stephen Hemminger
Cc: Jangra, Yogesh, dev, R, Kamalakannan, Suresh Narayane, Harshad
> -----Original Message-----
> From: Singh, Aman Deep <aman.deep.singh@intel.com>
> Sent: Friday, March 10, 2023 9:09 AM
> To: Stephen Hemminger <stephen@networkplumber.org>; Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com>
> Cc: Jangra, Yogesh <yogesh.jangra@intel.com>; dev@dpdk.org; R,
> Kamalakannan <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
> <harshad.suresh.narayane@intel.com>
> Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
>
>
> On 3/10/2023 1:52 AM, Stephen Hemminger wrote:
> > On Thu, 9 Mar 2023 17:19:59 +0000
> > "Dumitrescu, Cristian" <cristian.dumitrescu@intel.com> wrote:
> >
> >>> -----Original Message-----
> >>> From: Stephen Hemminger <stephen@networkplumber.org>
> >>> Sent: Thursday, March 9, 2023 4:31 PM
> >>> To: Jangra, Yogesh <yogesh.jangra@intel.com>
> >>> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>;
> R,
> >>> Kamalakannan <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
> >>> <harshad.suresh.narayane@intel.com>
> >>> Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev
> ports
> >>>
> >>> On Thu, 9 Mar 2023 14:42:49 +0000
> >>> Yogesh Jangra <yogesh.jangra@intel.com> wrote:
> >>>
> >>>> + /*
> >>>> + * SoftNIC runs on the sevice core, it uses the resources from
> >>>> + * the testpmd application. When we run quit command, the
> >>> testpmd
> >>>> + * application stops ethdev ports first, SoftNIC will try to
> >>>> + * access the port and sometimes that result in segmentation
> >>>> + * error. So first closing the SoftNIC port.
> >>>> + */
> >>>> + RTE_ETH_FOREACH_DEV(pt_id) {
> >>>> + if (!strcmp(ports[pt_id].dev_info.driver_name,
> >>> "net_softnic")) {
> >>>> + stop_port(pt_id);
> >>>> + close_port(pt_id);
> >>>> + }
> >>>> + }
> >>>> +
> >>> NAK
> >>> No driver specific hacks please.
> >>>
> >>> Instead fix the driver design or bug please.
> >> Hi Stephen,
> >>
> >> This is not a Soft NIC driver-specific hack, this is required for working around
> some of the ethdev drivers that don't implement the stop() API correctly and
> free up the device queues or some other internal resources on stop() instead of
> close().
> >>
> >> The Soft NIC is a meta-device that sits on top of other "physical" ethdev
> devices, so when the Soft NIC device continues to poll the queues of those
> physical devices after their queues have been freed, the Soft NIC will get a
> segfault. This fix is required to protect against this sort of incorrect driver
> behavior by simply stopping the Soft NIC devices first.
> >>
> >> We already have several driver specific branches in the test-pmd for e.g. LAG
> or virtual devices; IMO this small change falls in the same category and it should
> get accepted.
> >>
> >> Please let us know if this makes sense to you?
> >>
> >> Regards,
> >> Cristian
> > If the application has to do this then something is wrong with the architecture.
> >
> >
> > You aren't propagating state changes through in a safe manner.
> > Other applications will have same issue.
> >
> > If LAG or virtual devices have similar problems then a generic solution is
> needed.
>
> At exit, there is call to stop_packet_forwarding(). Shouldn't that stop SoftNic
> from polling of the queues ?
Hi Aman,
Unfortunately not. Calling the stop_packet_forwarding() will not stop the core running the Soft NIC, as Soft NIC is running on a service core, and the service cores are not handled by this function.
Moreover, a Soft NIC device is not taking over the entire service core; the same service core can also run other services e.g. Soft NIC for device A, telemetry, stats, Soft NIC for device B, etc). Therefore, stopping the service core associated with a Soft NIC device will also incorrectly result in stopping all the other services running on the same service core.
Stopping the service for a Soft NIC device is done by the stop() function, which is exactly what we are trying to do with this short patch.
Regards,
Cristian
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-09 17:19 ` Dumitrescu, Cristian
2023-03-09 19:09 ` Dumitrescu, Cristian
2023-03-09 20:22 ` Stephen Hemminger
@ 2023-03-10 12:00 ` Ferruh Yigit
2023-03-10 13:47 ` David Marchand
2023-03-10 13:47 ` Dumitrescu, Cristian
2 siblings, 2 replies; 20+ messages in thread
From: Ferruh Yigit @ 2023-03-10 12:00 UTC (permalink / raw)
To: Dumitrescu, Cristian, Stephen Hemminger, Jangra, Yogesh,
Aman Singh, Yuying Zhang
Cc: dev, R, Kamalakannan, Suresh Narayane, Harshad
On 3/9/2023 5:19 PM, Dumitrescu, Cristian wrote:
>
>
>> -----Original Message-----
>> From: Stephen Hemminger <stephen@networkplumber.org>
>> Sent: Thursday, March 9, 2023 4:31 PM
>> To: Jangra, Yogesh <yogesh.jangra@intel.com>
>> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; R,
>> Kamalakannan <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
>> <harshad.suresh.narayane@intel.com>
>> Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
>>
>> On Thu, 9 Mar 2023 14:42:49 +0000
>> Yogesh Jangra <yogesh.jangra@intel.com> wrote:
>>
>>> + /*
>>> + * SoftNIC runs on the sevice core, it uses the resources from
>>> + * the testpmd application. When we run quit command, the
>> testpmd
>>> + * application stops ethdev ports first, SoftNIC will try to
>>> + * access the port and sometimes that result in segmentation
>>> + * error. So first closing the SoftNIC port.
>>> + */
>>> + RTE_ETH_FOREACH_DEV(pt_id) {
>>> + if (!strcmp(ports[pt_id].dev_info.driver_name,
>> "net_softnic")) {
>>> + stop_port(pt_id);
>>> + close_port(pt_id);
>>> + }
>>> + }
>>> +
>>
>> NAK
>> No driver specific hacks please.
>>
>> Instead fix the driver design or bug please.
>
> Hi Stephen,
>
> This is not a Soft NIC driver-specific hack, this is required for working around some of the ethdev drivers that don't implement the stop() API correctly and free up the device queues or some other internal resources on stop() instead of close().
>
Why not fix the misbehaving drivers, instead of working around for
softnic, as Stephen suggested?
Is there a list of problematic drivers?
> The Soft NIC is a meta-device that sits on top of other "physical" ethdev devices, so when the Soft NIC device continues to poll the queues of those physical devices after their queues have been freed, the Soft NIC will get a segfault. This fix is required to protect against this sort of incorrect driver behavior by simply stopping the Soft NIC devices first.
>
> We already have several driver specific branches in the test-pmd for e.g. LAG or virtual devices; IMO this small change falls in the same category and it should get accepted.
>
> Please let us know if this makes sense to you?
>
> Regards,
> Cristian
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 12:00 ` Ferruh Yigit
@ 2023-03-10 13:47 ` David Marchand
2023-03-10 13:47 ` Dumitrescu, Cristian
1 sibling, 0 replies; 20+ messages in thread
From: David Marchand @ 2023-03-10 13:47 UTC (permalink / raw)
To: Ferruh Yigit
Cc: Dumitrescu, Cristian, Stephen Hemminger, Jangra, Yogesh,
Aman Singh, Yuying Zhang, dev, R, Kamalakannan, Suresh Narayane,
Harshad
On Fri, Mar 10, 2023 at 1:00 PM Ferruh Yigit <ferruh.yigit@amd.com> wrote:
> >> NAK
> >> No driver specific hacks please.
> >>
> >> Instead fix the driver design or bug please.
> >
> > Hi Stephen,
> >
> > This is not a Soft NIC driver-specific hack, this is required for working around some of the ethdev drivers that don't implement the stop() API correctly and free up the device queues or some other internal resources on stop() instead of close().
> >
>
> Why not fix the misbehaving drivers, instead of working around for
> softnic, as Stephen suggested?
+1
--
David Marchand
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 12:00 ` Ferruh Yigit
2023-03-10 13:47 ` David Marchand
@ 2023-03-10 13:47 ` Dumitrescu, Cristian
2023-03-10 13:49 ` David Marchand
2023-03-10 13:58 ` Ferruh Yigit
1 sibling, 2 replies; 20+ messages in thread
From: Dumitrescu, Cristian @ 2023-03-10 13:47 UTC (permalink / raw)
To: Ferruh Yigit, Stephen Hemminger, Jangra, Yogesh, Singh,
Aman Deep, Zhang, Yuying
Cc: dev, R, Kamalakannan, Suresh Narayane, Harshad
> -----Original Message-----
> From: Ferruh Yigit <ferruh.yigit@amd.com>
> Sent: Friday, March 10, 2023 12:00 PM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Stephen Hemminger
> <stephen@networkplumber.org>; Jangra, Yogesh <yogesh.jangra@intel.com>;
> Singh, Aman Deep <aman.deep.singh@intel.com>; Zhang, Yuying
> <yuying.zhang@intel.com>
> Cc: dev@dpdk.org; R, Kamalakannan <kamalakannan.r@intel.com>; Suresh
> Narayane, Harshad <harshad.suresh.narayane@intel.com>
> Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
>
> On 3/9/2023 5:19 PM, Dumitrescu, Cristian wrote:
> >
> >
> >> -----Original Message-----
> >> From: Stephen Hemminger <stephen@networkplumber.org>
> >> Sent: Thursday, March 9, 2023 4:31 PM
> >> To: Jangra, Yogesh <yogesh.jangra@intel.com>
> >> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; R,
> >> Kamalakannan <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
> >> <harshad.suresh.narayane@intel.com>
> >> Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev
> ports
> >>
> >> On Thu, 9 Mar 2023 14:42:49 +0000
> >> Yogesh Jangra <yogesh.jangra@intel.com> wrote:
> >>
> >>> + /*
> >>> + * SoftNIC runs on the sevice core, it uses the resources from
> >>> + * the testpmd application. When we run quit command, the
> >> testpmd
> >>> + * application stops ethdev ports first, SoftNIC will try to
> >>> + * access the port and sometimes that result in segmentation
> >>> + * error. So first closing the SoftNIC port.
> >>> + */
> >>> + RTE_ETH_FOREACH_DEV(pt_id) {
> >>> + if (!strcmp(ports[pt_id].dev_info.driver_name,
> >> "net_softnic")) {
> >>> + stop_port(pt_id);
> >>> + close_port(pt_id);
> >>> + }
> >>> + }
> >>> +
> >>
> >> NAK
> >> No driver specific hacks please.
> >>
> >> Instead fix the driver design or bug please.
> >
> > Hi Stephen,
> >
> > This is not a Soft NIC driver-specific hack, this is required for working around
> some of the ethdev drivers that don't implement the stop() API correctly and
> free up the device queues or some other internal resources on stop() instead of
> close().
> >
>
> Why not fix the misbehaving drivers, instead of working around for
> softnic, as Stephen suggested?
>
> Is there a list of problematic drivers?
>
Ferruh, I think this is not a reasonable request. We don't have the expertise to fix all drivers, not the hardware to test all drivers.
> > The Soft NIC is a meta-device that sits on top of other "physical" ethdev
> devices, so when the Soft NIC device continues to poll the queues of those
> physical devices after their queues have been freed, the Soft NIC will get a
> segfault. This fix is required to protect against this sort of incorrect driver
> behavior by simply stopping the Soft NIC devices first.
> >
> > We already have several driver specific branches in the test-pmd for e.g. LAG
> or virtual devices; IMO this small change falls in the same category and it should
> get accepted.
> >
> > Please let us know if this makes sense to you?
> >
> > Regards,
> > Cristian
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 13:47 ` Dumitrescu, Cristian
@ 2023-03-10 13:49 ` David Marchand
2023-03-10 14:36 ` Dumitrescu, Cristian
2023-03-10 13:58 ` Ferruh Yigit
1 sibling, 1 reply; 20+ messages in thread
From: David Marchand @ 2023-03-10 13:49 UTC (permalink / raw)
To: Dumitrescu, Cristian
Cc: Ferruh Yigit, Stephen Hemminger, Jangra, Yogesh, Singh,
Aman Deep, Zhang, Yuying, dev, R, Kamalakannan, Suresh Narayane,
Harshad
On Fri, Mar 10, 2023 at 2:48 PM Dumitrescu, Cristian
<cristian.dumitrescu@intel.com> wrote:
> > Why not fix the misbehaving drivers, instead of working around for
> > softnic, as Stephen suggested?
> >
> > Is there a list of problematic drivers?
> >
>
> Ferruh, I think this is not a reasonable request. We don't have the expertise to fix all drivers, not the hardware to test all drivers.
Then, please report the issues.
I am also NAKing this horror.
--
David Marchand
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 13:49 ` David Marchand
@ 2023-03-10 14:36 ` Dumitrescu, Cristian
2023-03-10 14:39 ` David Marchand
2023-03-10 14:58 ` Thomas Monjalon
0 siblings, 2 replies; 20+ messages in thread
From: Dumitrescu, Cristian @ 2023-03-10 14:36 UTC (permalink / raw)
To: David Marchand
Cc: Ferruh Yigit, Stephen Hemminger, Jangra, Yogesh, Singh,
Aman Deep, Zhang, Yuying, dev, R, Kamalakannan, Suresh Narayane,
Harshad
> -----Original Message-----
> From: David Marchand <david.marchand@redhat.com>
> Sent: Friday, March 10, 2023 1:49 PM
> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> Cc: Ferruh Yigit <ferruh.yigit@amd.com>; Stephen Hemminger
> <stephen@networkplumber.org>; Jangra, Yogesh <yogesh.jangra@intel.com>;
> Singh, Aman Deep <aman.deep.singh@intel.com>; Zhang, Yuying
> <yuying.zhang@intel.com>; dev@dpdk.org; R, Kamalakannan
> <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
> <harshad.suresh.narayane@intel.com>
> Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
>
> On Fri, Mar 10, 2023 at 2:48 PM Dumitrescu, Cristian
> <cristian.dumitrescu@intel.com> wrote:
> > > Why not fix the misbehaving drivers, instead of working around for
> > > softnic, as Stephen suggested?
> > >
> > > Is there a list of problematic drivers?
> > >
> >
> > Ferruh, I think this is not a reasonable request. We don't have the expertise to
> fix all drivers, not the hardware to test all drivers.
>
> Then, please report the issues.
> I am also NAKing this horror.
>
>
> --
> David Marchand
Hi David,
Your ideas on how to fix this issue would be way more helpful than your NAK😉
Can you please explain why this fix is a problem? We have device X that sits on top of devices A, B, C, ... who are created and started before device X, to me it makes sense to stop device X first before stopping A, B, C, ...
Thanks,
Cristian
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 14:36 ` Dumitrescu, Cristian
@ 2023-03-10 14:39 ` David Marchand
2023-03-10 14:58 ` Thomas Monjalon
1 sibling, 0 replies; 20+ messages in thread
From: David Marchand @ 2023-03-10 14:39 UTC (permalink / raw)
To: Dumitrescu, Cristian
Cc: Ferruh Yigit, Stephen Hemminger, Jangra, Yogesh, Singh,
Aman Deep, Zhang, Yuying, dev, R, Kamalakannan, Suresh Narayane,
Harshad
On Fri, Mar 10, 2023 at 3:37 PM Dumitrescu, Cristian
<cristian.dumitrescu@intel.com> wrote:
>
>
>
> > -----Original Message-----
> > From: David Marchand <david.marchand@redhat.com>
> > Sent: Friday, March 10, 2023 1:49 PM
> > To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>
> > Cc: Ferruh Yigit <ferruh.yigit@amd.com>; Stephen Hemminger
> > <stephen@networkplumber.org>; Jangra, Yogesh <yogesh.jangra@intel.com>;
> > Singh, Aman Deep <aman.deep.singh@intel.com>; Zhang, Yuying
> > <yuying.zhang@intel.com>; dev@dpdk.org; R, Kamalakannan
> > <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
> > <harshad.suresh.narayane@intel.com>
> > Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
> >
> > On Fri, Mar 10, 2023 at 2:48 PM Dumitrescu, Cristian
> > <cristian.dumitrescu@intel.com> wrote:
> > > > Why not fix the misbehaving drivers, instead of working around for
> > > > softnic, as Stephen suggested?
> > > >
> > > > Is there a list of problematic drivers?
> > > >
> > >
> > > Ferruh, I think this is not a reasonable request. We don't have the expertise to
> > fix all drivers, not the hardware to test all drivers.
> >
> > Then, please report the issues.
> > I am also NAKing this horror.
> >
> >
> > --
> > David Marchand
>
> Hi David,
>
> Your ideas on how to fix this issue would be way more helpful than your NAK😉
Well sorry, but Ferruh request was reasonable to me.
Trying to drop the ball made me a bit angry.
>
> Can you please explain why this fix is a problem? We have device X that sits on top of devices A, B, C, ... who are created and started before device X, to me it makes sense to stop device X first before stopping A, B, C, ...
There are other drivers like the failsafe driver that sit on top of
other devices.
I'd recommend looking at what it does.
--
David Marchand
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 14:36 ` Dumitrescu, Cristian
2023-03-10 14:39 ` David Marchand
@ 2023-03-10 14:58 ` Thomas Monjalon
1 sibling, 0 replies; 20+ messages in thread
From: Thomas Monjalon @ 2023-03-10 14:58 UTC (permalink / raw)
To: Dumitrescu, Cristian
Cc: David Marchand, dev, Ferruh Yigit, Stephen Hemminger, Jangra,
Yogesh, Singh, Aman Deep, Zhang, Yuying, dev, R, Kamalakannan,
Suresh Narayane, Harshad
10/03/2023 15:36, Dumitrescu, Cristian:
> From: David Marchand <david.marchand@redhat.com>
> > On Fri, Mar 10, 2023 at 2:48 PM Dumitrescu, Cristian
> > <cristian.dumitrescu@intel.com> wrote:
> > > > Why not fix the misbehaving drivers, instead of working around for
> > > > softnic, as Stephen suggested?
> > > >
> > > > Is there a list of problematic drivers?
> > > >
> > >
> > > Ferruh, I think this is not a reasonable request. We don't have the expertise to
> > fix all drivers, not the hardware to test all drivers.
> >
> > Then, please report the issues.
> > I am also NAKing this horror.
> >
> > --
> > David Marchand
>
> Hi David,
>
> Your ideas on how to fix this issue would be way more helpful than your NAK😉
>
> Can you please explain why this fix is a problem? We have device X that sits on top of devices A, B, C, ... who are created and started before device X, to me it makes sense to stop device X first before stopping A, B, C, ...
Ferruh has already explained it well: it is not a testpmd problem.
The problem you describe can happen in any application,
so we need to fix the root cause.
If there are problems in some drivers, please tell us which one
and we'll look at how to fix them.
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 13:47 ` Dumitrescu, Cristian
2023-03-10 13:49 ` David Marchand
@ 2023-03-10 13:58 ` Ferruh Yigit
2023-03-10 16:44 ` Stephen Hemminger
1 sibling, 1 reply; 20+ messages in thread
From: Ferruh Yigit @ 2023-03-10 13:58 UTC (permalink / raw)
To: Dumitrescu, Cristian, Stephen Hemminger, Jangra, Yogesh, Singh,
Aman Deep, Zhang, Yuying
Cc: dev, R, Kamalakannan, Suresh Narayane, Harshad
On 3/10/2023 1:47 PM, Dumitrescu, Cristian wrote:
>
>
>> -----Original Message-----
>> From: Ferruh Yigit <ferruh.yigit@amd.com>
>> Sent: Friday, March 10, 2023 12:00 PM
>> To: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Stephen Hemminger
>> <stephen@networkplumber.org>; Jangra, Yogesh <yogesh.jangra@intel.com>;
>> Singh, Aman Deep <aman.deep.singh@intel.com>; Zhang, Yuying
>> <yuying.zhang@intel.com>
>> Cc: dev@dpdk.org; R, Kamalakannan <kamalakannan.r@intel.com>; Suresh
>> Narayane, Harshad <harshad.suresh.narayane@intel.com>
>> Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
>>
>> On 3/9/2023 5:19 PM, Dumitrescu, Cristian wrote:
>>>
>>>
>>>> -----Original Message-----
>>>> From: Stephen Hemminger <stephen@networkplumber.org>
>>>> Sent: Thursday, March 9, 2023 4:31 PM
>>>> To: Jangra, Yogesh <yogesh.jangra@intel.com>
>>>> Cc: dev@dpdk.org; Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; R,
>>>> Kamalakannan <kamalakannan.r@intel.com>; Suresh Narayane, Harshad
>>>> <harshad.suresh.narayane@intel.com>
>>>> Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev
>> ports
>>>>
>>>> On Thu, 9 Mar 2023 14:42:49 +0000
>>>> Yogesh Jangra <yogesh.jangra@intel.com> wrote:
>>>>
>>>>> + /*
>>>>> + * SoftNIC runs on the sevice core, it uses the resources from
>>>>> + * the testpmd application. When we run quit command, the
>>>> testpmd
>>>>> + * application stops ethdev ports first, SoftNIC will try to
>>>>> + * access the port and sometimes that result in segmentation
>>>>> + * error. So first closing the SoftNIC port.
>>>>> + */
>>>>> + RTE_ETH_FOREACH_DEV(pt_id) {
>>>>> + if (!strcmp(ports[pt_id].dev_info.driver_name,
>>>> "net_softnic")) {
>>>>> + stop_port(pt_id);
>>>>> + close_port(pt_id);
>>>>> + }
>>>>> + }
>>>>> +
>>>>
>>>> NAK
>>>> No driver specific hacks please.
>>>>
>>>> Instead fix the driver design or bug please.
>>>
>>> Hi Stephen,
>>>
>>> This is not a Soft NIC driver-specific hack, this is required for working around
>> some of the ethdev drivers that don't implement the stop() API correctly and
>> free up the device queues or some other internal resources on stop() instead of
>> close().
>>>
>>
>> Why not fix the misbehaving drivers, instead of working around for
>> softnic, as Stephen suggested?
>>
>> Is there a list of problematic drivers?
>>
>
> Ferruh, I think this is not a reasonable request. We don't have the expertise to fix all drivers, not the hardware to test all drivers.
>
Please don't make it over dramatic ;), this is not about having
expertise in all drivers or having their hardware to test.
You claim some drivers does free up their resources on stop() and
continue to polling from them cause segfault. Action is move resource
free from stop() to close().
And my intention was not request a fix from you, if you have any
particular misbehaving drivers, I can facilitate a fix from those driver
maintainers.
Eventually drivers freeing resources in stop() is a bigger problem and
can hit other applications too, this is not just testpmd problem.
>>> The Soft NIC is a meta-device that sits on top of other "physical" ethdev
>> devices, so when the Soft NIC device continues to poll the queues of those
>> physical devices after their queues have been freed, the Soft NIC will get a
>> segfault. This fix is required to protect against this sort of incorrect driver
>> behavior by simply stopping the Soft NIC devices first.
>>>
>>> We already have several driver specific branches in the test-pmd for e.g. LAG
>> or virtual devices; IMO this small change falls in the same category and it should
>> get accepted.
>>>
>>> Please let us know if this makes sense to you?
>>>
>>> Regards,
>>> Cristian
>
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 13:58 ` Ferruh Yigit
@ 2023-03-10 16:44 ` Stephen Hemminger
2023-03-17 7:11 ` Jangra, Yogesh
0 siblings, 1 reply; 20+ messages in thread
From: Stephen Hemminger @ 2023-03-10 16:44 UTC (permalink / raw)
To: Ferruh Yigit
Cc: Dumitrescu, Cristian, Jangra, Yogesh, Singh, Aman Deep, Zhang,
Yuying, dev, R, Kamalakannan, Suresh Narayane, Harshad
On Fri, 10 Mar 2023 13:58:52 +0000
Ferruh Yigit <ferruh.yigit@amd.com> wrote:
> >>
> >> Why not fix the misbehaving drivers, instead of working around for
> >> softnic, as Stephen suggested?
> >>
> >> Is there a list of problematic drivers?
> >>
> >
> > Ferruh, I think this is not a reasonable request. We don't have the expertise to fix all drivers, not the hardware to test all drivers.
> >
>
> Please don't make it over dramatic ;), this is not about having
> expertise in all drivers or having their hardware to test.
>
> You claim some drivers does free up their resources on stop() and
> continue to polling from them cause segfault. Action is move resource
> free from stop() to close().
>
> And my intention was not request a fix from you, if you have any
> particular misbehaving drivers, I can facilitate a fix from those driver
> maintainers.
> Eventually drivers freeing resources in stop() is a bigger problem and
> can hit other applications too, this is not just testpmd problem.
Lets all work together to resolve this.
I and others are willing to review and fix drivers you identify as problematic.
If this is a common problem, ideally we can update CI infrastructure to
test shutdown and resource issues more thoroughly via ASAN builds etc.
^ permalink raw reply [flat|nested] 20+ messages in thread
* RE: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
2023-03-10 16:44 ` Stephen Hemminger
@ 2023-03-17 7:11 ` Jangra, Yogesh
0 siblings, 0 replies; 20+ messages in thread
From: Jangra, Yogesh @ 2023-03-17 7:11 UTC (permalink / raw)
To: Stephen Hemminger
Cc: David Marchand, Thomas Monjalon, Ferruh Yigit, Dumitrescu,
Cristian, Singh, Aman Deep, Zhang, Yuying, dev, R, Kamalakannan,
Suresh Narayane, Harshad
[-- Attachment #1.1: Type: text/plain, Size: 2552 bytes --]
-----Original Message-----
From: Stephen Hemminger <stephen@networkplumber.org>
Sent: Friday, March 10, 2023 10:15 PM
To: Ferruh Yigit <ferruh.yigit@amd.com>
Cc: Dumitrescu, Cristian <cristian.dumitrescu@intel.com>; Jangra, Yogesh <yogesh.jangra@intel.com>; Singh, Aman Deep <aman.deep.singh@intel.com>; Zhang, Yuying <yuying.zhang@intel.com>; dev@dpdk.org; R, Kamalakannan <kamalakannan.r@intel.com>; Suresh Narayane, Harshad <harshad.suresh.narayane@intel.com>
Subject: Re: [PATCH] app/testpmd: fix closing softnic port before ethdev ports
On Fri, 10 Mar 2023 13:58:52 +0000
Ferruh Yigit <ferruh.yigit@amd.com<mailto:ferruh.yigit@amd.com>> wrote:
> >>
> >> Why not fix the misbehaving drivers, instead of working around for
> >> softnic, as Stephen suggested?
> >>
> >> Is there a list of problematic drivers?
> >>
> >
> > Ferruh, I think this is not a reasonable request. We don't have the expertise to fix all drivers, not the hardware to test all drivers.
> >
>
> Please don't make it over dramatic ;), this is not about having
> expertise in all drivers or having their hardware to test.
>
> You claim some drivers does free up their resources on stop() and
> continue to polling from them cause segfault. Action is move resource
> free from stop() to close().
>
> And my intention was not request a fix from you, if you have any
> particular misbehaving drivers, I can facilitate a fix from those
> driver maintainers.
> Eventually drivers freeing resources in stop() is a bigger problem and
> can hit other applications too, this is not just testpmd problem.
Lets all work together to resolve this.
I and others are willing to review and fix drivers you identify as problematic.
If this is a common problem, ideally we can update CI infrastructure to test shutdown and resource issues more thoroughly via ASAN builds etc.
Hi Stephen,
I am getting segmentation error while stopping only the physical port .
I run testpmd application using Soft NIC driver. I used one physical port and Soft NIC is running on the top of that.
In my scenario, I started the traffic on the port and when I stop the port I am getting segmentation error in rte_eth_rx_burst api.
This shows the order of stopping the port is casing the issue. As stopping the port is not handling the service thread stop.
Please refer screenshot for reference.
[cid:image001.png@01D958CD.BE580600]
Thanks & Regards,
Yogesh
[-- Attachment #1.2: Type: text/html, Size: 7099 bytes --]
[-- Attachment #2: image001.png --]
[-- Type: image/png, Size: 81727 bytes --]
^ permalink raw reply [flat|nested] 20+ messages in thread