* [dpdk-dev] [PATCH v4] eal: Set numa node value for system which not support it.
@ 2017-05-11 1:56 Tonghao Zhang
2017-06-22 15:15 ` Sergio Gonzalez Monroy
0 siblings, 1 reply; 7+ messages in thread
From: Tonghao Zhang @ 2017-05-11 1:56 UTC (permalink / raw)
To: dev; +Cc: Tonghao Zhang
The NUMA node information for PCI devices provided through
sysfs is invalid on AMD Opteron(TM) 62xx and 63xx processors
running Red Hat Enterprise Linux 6, and in VMs on some hypervisors.
Validate the value read from sysfs and fall back to node 0 when it
is missing or out of range.
Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
---
lib/librte_eal/linuxapp/eal/eal_pci.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
index 595622b..c817b4c 100644
--- a/lib/librte_eal/linuxapp/eal/eal_pci.c
+++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
@@ -310,18 +310,18 @@
 		dev->max_vfs = (uint16_t)tmp;
 	}
 
-	/* get numa node */
+	/* get numa node, default to 0 if not present */
 	snprintf(filename, sizeof(filename), "%s/numa_node",
 		 dirname);
-	if (access(filename, R_OK) != 0) {
-		/* if no NUMA support, set default to 0 */
-		dev->device.numa_node = 0;
-	} else {
-		if (eal_parse_sysfs_value(filename, &tmp) < 0) {
-			free(dev);
-			return -1;
-		}
+
+	if (eal_parse_sysfs_value(filename, &tmp) == 0 &&
+		tmp < RTE_MAX_NUMA_NODES)
 		dev->device.numa_node = tmp;
+	else {
+		RTE_LOG(WARNING, EAL,
+			"numa_node is invalid or not present. "
+			"Set it 0 as default\n");
+		dev->device.numa_node = 0;
 	}
 
 	rte_pci_device_name(addr, dev->name, sizeof(dev->name));
--
1.8.3.1
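For readers following the logic outside the DPDK tree, the new branch can be approximated by the standalone C sketch below. It is not the DPDK implementation: sketch_parse_sysfs_value, sketch_numa_node_or_default and SKETCH_MAX_NUMA_NODES are hypothetical stand-ins for eal_parse_sysfs_value and RTE_MAX_NUMA_NODES, and the parser is assumed to be strtoul-based, so a sysfs value of -1 wraps to a very large unsigned number and fails the range check.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for RTE_MAX_NUMA_NODES (a DPDK build-time setting). */
#define SKETCH_MAX_NUMA_NODES 8

/* Parse one unsigned number from a sysfs-style file.
 * Returns 0 on success, -1 if the file is missing or not a number. */
static int sketch_parse_sysfs_value(const char *path, unsigned long *val)
{
	char buf[64];
	char *end = NULL;
	FILE *f;

	assert(path != NULL && val != NULL);
	f = fopen(path, "r");
	if (f == NULL)
		return -1;
	if (fgets(buf, sizeof(buf), f) == NULL) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	*val = strtoul(buf, &end, 0);
	if (errno != 0 || end == buf || (*end != '\n' && *end != '\0'))
		return -1;
	return 0;
}

/* Return the NUMA node stored in path, or 0 when the file is missing,
 * unparseable, or holds a value outside [0, SKETCH_MAX_NUMA_NODES). */
int sketch_numa_node_or_default(const char *path)
{
	unsigned long tmp;

	if (sketch_parse_sysfs_value(path, &tmp) == 0 &&
	    tmp < SKETCH_MAX_NUMA_NODES)
		return (int)tmp;
	return 0;
}
```

Unlike the code being replaced, a parse failure here no longer aborts the scan of the device; it only changes which node is recorded.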
* Re: [dpdk-dev] [PATCH v4] eal: Set numa node value for system which not support it.
2017-05-11 1:56 [dpdk-dev] [PATCH v4] eal: Set numa node value for system which not support it Tonghao Zhang
@ 2017-06-22 15:15 ` Sergio Gonzalez Monroy
2017-06-23 13:02 ` Thomas Monjalon
0 siblings, 1 reply; 7+ messages in thread
From: Sergio Gonzalez Monroy @ 2017-06-22 15:15 UTC (permalink / raw)
To: Tonghao Zhang, dev; +Cc: Thomas Monjalon
Just FYI, the summary line should be lowercase apart from acronyms (DPDK
guidelines).
On 11/05/2017 02:56, Tonghao Zhang wrote:
> The NUMA node information for PCI devices provided through
> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
> It is good to see more checking for valid values.
>
> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
> ---
IMHO the commit message could be slightly improved by including some of
the details from your replies to v3,
e.g. a typical wrong NUMA node value in VMs:
$ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
-1
> lib/librte_eal/linuxapp/eal/eal_pci.c | 18 +++++++++---------
> 1 file changed, 9 insertions(+), 9 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_pci.c b/lib/librte_eal/linuxapp/eal/eal_pci.c
> index 595622b..c817b4c 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_pci.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_pci.c
> @@ -310,18 +310,18 @@
> dev->max_vfs = (uint16_t)tmp;
> }
>
> - /* get numa node */
> + /* get numa node, default to 0 if not present */
> snprintf(filename, sizeof(filename), "%s/numa_node",
> dirname);
> - if (access(filename, R_OK) != 0) {
> - /* if no NUMA support, set default to 0 */
> - dev->device.numa_node = 0;
> - } else {
> - if (eal_parse_sysfs_value(filename, &tmp) < 0) {
> - free(dev);
> - return -1;
> - }
> +
> + if (eal_parse_sysfs_value(filename, &tmp) == 0 &&
> + tmp < RTE_MAX_NUMA_NODES)
> dev->device.numa_node = tmp;
> + else {
> + RTE_LOG(WARNING, EAL,
> + "numa_node is invalid or not present. "
> + "Set it 0 as default\n");
> + dev->device.numa_node = 0;
> }
>
> rte_pci_device_name(addr, dev->name, sizeof(dev->name));
The code changes look fine, so I'll leave the commit message to
Thomas :)
Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
* Re: [dpdk-dev] [PATCH v4] eal: Set numa node value for system which not support it.
2017-06-22 15:15 ` Sergio Gonzalez Monroy
@ 2017-06-23 13:02 ` Thomas Monjalon
2017-06-26 9:14 ` Sergio Gonzalez Monroy
0 siblings, 1 reply; 7+ messages in thread
From: Thomas Monjalon @ 2017-06-23 13:02 UTC (permalink / raw)
To: Tonghao Zhang; +Cc: dev, Sergio Gonzalez Monroy
22/06/2017 17:15, Sergio Gonzalez Monroy:
> Just fyi, the summary line should be lowercase apart from acronyms (DPDK
> guidelines).
>
> On 11/05/2017 02:56, Tonghao Zhang wrote:
> > The NUMA node information for PCI devices provided through
> > sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
> > on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
> > It is good to see more checking for valid values.
> >
> > Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
> > ---
>
> IMHO the message could be slightly improved by adding some of the
> replies that you made to your v3.
> ie. Typical wrong numa node in VMs
>
> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
> -1
[...]
> The code changes look fine, so I leave it to Thomas regarding the commit
> message :)
>
> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
Applied, thanks
* Re: [dpdk-dev] [PATCH v4] eal: Set numa node value for system which not support it.
2017-06-23 13:02 ` Thomas Monjalon
@ 2017-06-26 9:14 ` Sergio Gonzalez Monroy
2017-06-26 9:39 ` Thomas Monjalon
0 siblings, 1 reply; 7+ messages in thread
From: Sergio Gonzalez Monroy @ 2017-06-26 9:14 UTC (permalink / raw)
To: Thomas Monjalon, Tonghao Zhang; +Cc: dev
On 23/06/2017 14:02, Thomas Monjalon wrote:
> 22/06/2017 17:15, Sergio Gonzalez Monroy:
>> Just fyi, the summary line should be lowercase apart from acronyms (DPDK
>> guidelines).
>>
>> On 11/05/2017 02:56, Tonghao Zhang wrote:
>>> The NUMA node information for PCI devices provided through
>>> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
>>> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
>>> It is good to see more checking for valid values.
>>>
>>> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
>>> ---
>> IMHO the message could be slightly improved by adding some of the
>> replies that you made to your v3.
>> ie. Typical wrong numa node in VMs
>>
>> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
>> -1
> [...]
>> The code changes look fine, so I leave it to Thomas regarding the commit
>> message :)
>>
>> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> Applied, thanks
It looks like some systems have quite a few devices that report -1 as
the numa_node value, which causes a lot of warning messages to be printed.
Quick fixes that come to mind would be:
1) Change log level to DEBUG
2) Add a static var to print the message only once.
I also think that the message itself should include at least the BDF,
so that we know which devices are reporting bad numa_node values.
Thoughts?
Sergio
* Re: [dpdk-dev] [PATCH v4] eal: Set numa node value for system which not support it.
2017-06-26 9:14 ` Sergio Gonzalez Monroy
@ 2017-06-26 9:39 ` Thomas Monjalon
2017-06-26 12:50 ` Sergio Gonzalez Monroy
0 siblings, 1 reply; 7+ messages in thread
From: Thomas Monjalon @ 2017-06-26 9:39 UTC (permalink / raw)
To: Sergio Gonzalez Monroy; +Cc: Tonghao Zhang, dev
26/06/2017 11:14, Sergio Gonzalez Monroy:
> On 23/06/2017 14:02, Thomas Monjalon wrote:
> > 22/06/2017 17:15, Sergio Gonzalez Monroy:
> >> Just fyi, the summary line should be lowercase apart from acronyms (DPDK
> >> guidelines).
> >>
> >> On 11/05/2017 02:56, Tonghao Zhang wrote:
> >>> The NUMA node information for PCI devices provided through
> >>> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
> >>> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
> >>> It is good to see more checking for valid values.
> >>>
> >>> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
> >>> ---
> >> IMHO the message could be slightly improved by adding some of the
> >> replies that you made to your v3.
> >> ie. Typical wrong numa node in VMs
> >>
> >> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
> >> -1
> > [...]
> >> The code changes look fine, so I leave it to Thomas regarding the commit
> >> message :)
> >>
> >> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> > Applied, thanks
>
> It looks like some systems have quite a few devices that report -1 as
> numa_node value causing lots of warning messages being printed.
> Quick fixes that come to mind would be:
> 1) Change log level to DEBUG
Since the NUMA node matters for performance, the message should not be
limited to DEBUG level.
> 2) Add static var to only print the message once.
Yes, good idea.
> I also think that the message itself should show at least the BDF to at
> least know which devices are reporting bad numa_node values.
With a static variable, only the first device's BDF would be printed.
Is it still relevant?
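The trade-off discussed here can be seen in a minimal C sketch of the "print once" idea. The function name sketch_warn_bad_numa_once and the message text are hypothetical and are not taken from the DPDK tree; the point is only that a function-local static flag suppresses every warning after the first, so just one BDF is ever reported.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Warn about a bad numa_node value at most once per process.
 * Returns the number of characters written, or 0 when suppressed. */
int sketch_warn_bad_numa_once(FILE *out, const char *bdf)
{
	static bool warned; /* zero-initialised, so false until first use */

	assert(out != NULL && bdf != NULL);
	if (warned)
		return 0;
	warned = true;
	return fprintf(out,
		"EAL: %s: numa_node is invalid or not present, using 0\n",
		bdf);
}
```

Calling the helper for 0000:00:18.6 and then for any other device prints only the first address, which is the limitation raised above.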
* Re: [dpdk-dev] [PATCH v4] eal: Set numa node value for system which not support it.
2017-06-26 9:39 ` Thomas Monjalon
@ 2017-06-26 12:50 ` Sergio Gonzalez Monroy
2017-06-26 14:36 ` Thomas Monjalon
0 siblings, 1 reply; 7+ messages in thread
From: Sergio Gonzalez Monroy @ 2017-06-26 12:50 UTC (permalink / raw)
To: Thomas Monjalon; +Cc: Tonghao Zhang, dev
On 26/06/2017 10:39, Thomas Monjalon wrote:
> 26/06/2017 11:14, Sergio Gonzalez Monroy:
>> On 23/06/2017 14:02, Thomas Monjalon wrote:
>>> 22/06/2017 17:15, Sergio Gonzalez Monroy:
>>>> Just fyi, the summary line should be lowercase apart from acronyms (DPDK
>>>> guidelines).
>>>>
>>>> On 11/05/2017 02:56, Tonghao Zhang wrote:
>>>>> The NUMA node information for PCI devices provided through
>>>>> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
>>>>> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
>>>>> It is good to see more checking for valid values.
>>>>>
>>>>> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
>>>>> ---
>>>> IMHO the message could be slightly improved by adding some of the
>>>> replies that you made to your v3.
>>>> ie. Typical wrong numa node in VMs
>>>>
>>>> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
>>>> -1
>>> [...]
>>>> The code changes look fine, so I leave it to Thomas regarding the commit
>>>> message :)
>>>>
>>>> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
>>> Applied, thanks
>> It looks like some systems have quite a few devices that report -1 as
>> numa_node value causing lots of warning messages being printed.
>> Quick fixes that come to mind would be:
>> 1) Change log level to DEBUG
> As it is important for performance, it should not be just for DEBUG.
>
>> 2) Add static var to only print the message once.
> Yes good idea.
>
>> I also think that the message itself should show at least the BDF to at
>> least know which devices are reporting bad numa_node values.
> With the static variable, we will have only the first device BDF.
> Is it relevant?
>
I think it is relevant if it affects a device used by DPDK, but we don't
know that when doing full pci_scan.
At least on x86 platforms we usually see many PCI devices without numa_node:
ls /sys/bus/pci/devices | xargs -n 1 -I {} head -v "/sys/bus/pci/devices/{}/numa_node"
A single warning is not going to mean much if all platforms have PCI
devices without proper numa_node, right?
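To put a number on how widespread this is on a given machine, the same check as the shell pipeline above can be sketched in C. This is only an illustration: sketch_count_bad_numa is a hypothetical helper, and it treats a missing numa_node file the same as a negative value.

```c
#include <assert.h>
#include <dirent.h>
#include <stdio.h>

/* Count entries under base (e.g. "/sys/bus/pci/devices") whose numa_node
 * file is missing, unparseable, or negative.
 * Returns -1 if base cannot be opened. */
int sketch_count_bad_numa(const char *base)
{
	DIR *dir;
	struct dirent *e;
	int bad = 0;

	assert(base != NULL);
	dir = opendir(base);
	if (dir == NULL)
		return -1;

	while ((e = readdir(dir)) != NULL) {
		char path[512];
		FILE *f;
		int node, n;

		if (e->d_name[0] == '.')
			continue; /* skip ".", ".." and hidden entries */
		n = snprintf(path, sizeof(path), "%s/%s/numa_node",
			     base, e->d_name);
		if (n < 0 || (size_t)n >= sizeof(path))
			continue; /* path too long for this sketch */
		f = fopen(path, "r");
		if (f == NULL) {
			bad++;
			continue;
		}
		if (fscanf(f, "%d", &node) != 1 || node < 0)
			bad++;
		fclose(f);
	}
	closedir(dir);
	return bad;
}
```

A large count from sketch_count_bad_numa("/sys/bus/pci/devices") would support the concern that a warning emitted during the full scan carries little information.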
A cleaner solution might be to leave numa_node at -1 if we fail to parse
it, and then, in rte_pci_probe_one_driver, after the blacklist check,
test whether socket_id is -1 and print a warning while defaulting
to 0?
I would be inclined to:
a) leave it as it is with DEBUG log level, also showing PCI BDF (very
noisy in debug mode).
b) show the warning and default to 0 in rte_pci_probe_one_driver,
showing only relevant devices.
Sergio
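Option b) can be sketched as follows. This is a hedged illustration rather than the real rte_pci_probe_one_driver: struct sketch_pci_dev and sketch_fixup_numa_at_probe are hypothetical names, and the only assumption is that the scan leaves numa_node at -1 when it could not determine a valid value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical, simplified view of a scanned PCI device. */
struct sketch_pci_dev {
	const char *bdf;  /* e.g. "0000:00:18.6" */
	int numa_node;    /* -1: scan could not determine the node */
	bool blacklisted; /* device is excluded from probing */
};

/* Called at probe time, after the blacklist check.
 * Returns 1 if a warning was printed and the node defaulted to 0,
 * 0 if nothing had to be done. */
int sketch_fixup_numa_at_probe(struct sketch_pci_dev *dev)
{
	assert(dev != NULL && dev->bdf != NULL);
	if (dev->blacklisted)
		return 0; /* not used by the application: stay silent */
	if (dev->numa_node >= 0)
		return 0; /* scan found a valid node */

	fprintf(stderr, "EAL: %s: invalid NUMA socket, default to 0\n",
		dev->bdf);
	dev->numa_node = 0;
	return 1;
}
```

Because the check runs per probed device, the warning names each affected BDF and stays silent for devices the application never uses.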
* Re: [dpdk-dev] [PATCH v4] eal: Set numa node value for system which not support it.
2017-06-26 12:50 ` Sergio Gonzalez Monroy
@ 2017-06-26 14:36 ` Thomas Monjalon
0 siblings, 0 replies; 7+ messages in thread
From: Thomas Monjalon @ 2017-06-26 14:36 UTC (permalink / raw)
To: Sergio Gonzalez Monroy; +Cc: Tonghao Zhang, dev
26/06/2017 14:50, Sergio Gonzalez Monroy:
> On 26/06/2017 10:39, Thomas Monjalon wrote:
> > 26/06/2017 11:14, Sergio Gonzalez Monroy:
> >> On 23/06/2017 14:02, Thomas Monjalon wrote:
> >>> 22/06/2017 17:15, Sergio Gonzalez Monroy:
> >>>> Just fyi, the summary line should be lowercase apart from acronyms (DPDK
> >>>> guidelines).
> >>>>
> >>>> On 11/05/2017 02:56, Tonghao Zhang wrote:
> >>>>> The NUMA node information for PCI devices provided through
> >>>>> sysfs is invalid for AMD Opteron(TM) Processor 62xx and 63xx
> >>>>> on Red Hat Enterprise Linux 6, and VMs on some hypervisors.
> >>>>> It is good to see more checking for valid values.
> >>>>>
> >>>>> Signed-off-by: Tonghao Zhang <nic@opencloud.tech>
> >>>>> ---
> >>>> IMHO the message could be slightly improved by adding some of the
> >>>> replies that you made to your v3.
> >>>> ie. Typical wrong numa node in VMs
> >>>>
> >>>> $ cat /sys/devices/pci0000:00/0000:00:18.6/numa_node
> >>>> -1
> >>> [...]
> >>>> The code changes look fine, so I leave it to Thomas regarding the commit
> >>>> message :)
> >>>>
> >>>> Acked-by: Sergio Gonzalez Monroy <sergio.gonzalez.monroy@intel.com>
> >>> Applied, thanks
> >> It looks like some systems have quite a few devices that report -1 as
> >> numa_node value causing lots of warning messages being printed.
> >> Quick fixes that come to mind would be:
> >> 1) Change log level to DEBUG
> > As it is important for performance, it should not be just for DEBUG.
> >
> >> 2) Add static var to only print the message once.
> > Yes good idea.
> >
> >> I also think that the message itself should show at least the BDF to at
> >> least know which devices are reporting bad numa_node values.
> > With the static variable, we will have only the first device BDF.
> > Is it relevant?
> >
>
> I think it is relevant if it affects a device used by DPDK, but we don't
> know that when doing full pci_scan.
>
> At least on x86 platforms we usually see many PCI devices without numa_node:
> ls /sys/bus/pci/devices | xargs -n 1 -I {} head -v
> "/sys/bus/pci/devices/{}/numa_node"
>
> A single warning is not going to mean much if all platforms have PCI
> devices without proper numa_node, right?
>
> A more cleaner solution might be to leave -1 if we failed to parse
> numa_node, then on rte_pci_probe_one_driver after checking if it is
> blacklisted check if socket_id is -1 and show warning message defaulting
> to 0?
>
> I would be inclined to:
> a) leave it as it is with DEBUG log level, also showing PCI BDF (very
> noisy in debug mode).
> b) show the warning and default to 0 in rte_pci_probe_one_driver,
> showing only relevant devices.
Looks like a good proposal, Sergio!
Thanks