From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <mhall@mhcomputing.net>
Received: from mail.mhcomputing.net (master.mhcomputing.net [74.208.46.186])
 by dpdk.org (Postfix) with ESMTP id 41BB82E8A
 for <dev@dpdk.org>; Mon,  6 Oct 2014 11:07:20 +0200 (CEST)
Received: by mail.mhcomputing.net (Postfix, from userid 1000)
 id A66B580C50B; Mon,  6 Oct 2014 02:13:44 -0700 (PDT)
Date: Mon, 6 Oct 2014 02:13:44 -0700
From: Matthew Hall <mhall@mhcomputing.net>
To: dev@dpdk.org
Message-ID: <20141006091344.GA14759@mhcomputing.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.23 (2014-03-12)
Subject: [dpdk-dev] Possible bug in eal_pci pci_scan_one
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Mon, 06 Oct 2014 09:07:20 -0000

Hi Guys,

I'm doing my development on a cheap machine with no NUMA support, but 
several years ago I used DPDK to build a NUMA box that could do 40 Gbit/s of 
bidirectional L4-L7 stateful traffic replay.

So given that past experience, I wanted to clean the code up so it'd work 
well if some crazy guy tried my code on one of these huge boxes, too, but 
then I ran into some weird issues.

1) When I call rte_eth_dev_socket_id() I get back -1. But the call can return 
-1 if the port_id is bogus or if pci_scan_one didn't get a numa_node (because 
you're on a non-NUMA box for example).

int rte_eth_dev_socket_id(uint8_t port_id)
{
        if (port_id >= nb_ports)
                return -1;
        return rte_eth_devices[port_id].pci_dev->numa_node;
}

So you can't tell the difference between a non-NUMA box and a bad port value.
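
To make the ambiguity concrete, here is a minimal sketch in plain C (no DPDK 
headers) of one way to keep the two cases apart. Everything in it is 
hypothetical: struct fake_port, port_socket_lookup and the use of -EINVAL 
are illustration only, not part of the DPDK API. The return value reports 
only whether port_id was valid, and the NUMA node comes back separately.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-port data; not a DPDK structure. */
struct fake_port {
        int numa_node;  /* -1 means "no NUMA information available" */
};

/*
 * Hypothetical lookup. The return value only says whether port_id was
 * valid; the NUMA node itself is passed back through *socket_id.
 * Returns 0 on success and -EINVAL for an out-of-range port_id.
 */
int
port_socket_lookup(const struct fake_port *ports, uint8_t nb_ports,
                   uint8_t port_id, int *socket_id)
{
        assert(ports != NULL && socket_id != NULL);
        if (port_id >= nb_ports)
                return -EINVAL;
        *socket_id = ports[port_id].numa_node;
        return 0;
}
```

With this shape a caller that gets 0 back and sees *socket_id == -1 knows 
the port is fine and the box simply has no NUMA information, while -EINVAL 
can only mean a bad port.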

2) The code's behavior and comments disagree with one another. In the 
pci_scan_one function, there's this code:

/* get numa node */
snprintf(filename, sizeof(filename), "%s/numa_node",
         dirname);
if (access(filename, R_OK) != 0) {
        /* if no NUMA support just set node to 0 */
        dev->numa_node = -1;
} else {
        if (eal_parse_sysfs_value(filename, &tmp) < 0) {
                free(dev);
                return -1;
        }
        dev->numa_node = tmp;
}

The comment says to just use NUMA node 0 if there is no NUMA support. But 
the code then sets the value to -1, in disagreement with the comment, and 
also stomps on the other meaning of -1 in the higher-level function 
rte_eth_dev_socket_id.
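
For comparison, this is roughly what the code would look like if it did what 
the comment says. It is a standalone sketch using only the C library, not a 
patch against eal_pci: read_numa_node_or_zero is a made-up name, and 
treating a negative value in the file as "no NUMA" is my assumption.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical helper: read a sysfs-style numa_node file.
 * - File missing or unreadable: assume no NUMA support, use node 0.
 * - File holds a negative number: also treated as node 0 (assumption).
 * - File present but not a number: return -1 so the caller can fail.
 * Returns 0 on success with the node stored in *node.
 */
int
read_numa_node_or_zero(const char *filename, int *node)
{
        FILE *f;
        char buf[32];
        char *end;
        long val;

        assert(filename != NULL && node != NULL);
        f = fopen(filename, "r");
        if (f == NULL) {
                *node = 0;      /* no NUMA support: fall back to node 0 */
                return 0;
        }
        if (fgets(buf, sizeof(buf), f) == NULL) {
                fclose(f);
                return -1;
        }
        fclose(f);

        errno = 0;
        val = strtol(buf, &end, 10);
        if (errno != 0 || end == buf)
                return -1;      /* not a number: real error */
        *node = (val < 0) ? 0 : (int)val;
        return 0;
}
```

Here -1 is left to mean only "something went wrong", so it no longer 
collides with the bad-port case in rte_eth_dev_socket_id.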

3) In conclusion, it seems like some stuff is missing. First, there needs to 
be a function that tells you the number of NUMA nodes present on the box so 
you can create the right number of mbuf_pools, but I couldn't find such a 
function.

Then if you have the function, you can do some magic and shuffle the NICs 
around to get them hooked to a core on the same NUMA node, with the 
mbuf_pool on the same NUMA node too.
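
As a rough idea of what such a function could do, here is a sketch that 
counts the nodeN entries in a directory such as /sys/devices/system/node on 
Linux. count_numa_nodes is a hypothetical name, not an existing DPDK call, 
and falling back to 1 when the directory is missing is my assumption for 
the non-NUMA case.

```c
#include <assert.h>
#include <ctype.h>
#include <dirent.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count entries named "node<digit>..." in node_dir.
 * If the directory cannot be opened or holds no such entries, report a
 * single node so callers can always create at least one mbuf pool.
 */
int
count_numa_nodes(const char *node_dir)
{
        DIR *d;
        struct dirent *e;
        int count = 0;

        assert(node_dir != NULL);
        d = opendir(node_dir);
        if (d == NULL)
                return 1;       /* assume a non-NUMA box */
        while ((e = readdir(d)) != NULL) {
                if (strncmp(e->d_name, "node", 4) == 0 &&
                    isdigit((unsigned char)e->d_name[4]))
                        count++;
        }
        closedir(d);
        return count > 0 ? count : 1;
}
```

A caller would pass "/sys/devices/system/node" and create one mbuf_pool per 
node. Node IDs are not guaranteed to be contiguous, so the count by itself 
is not a safe upper bound for arrays indexed by socket ID.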

When NUMA is not present, can we return 0 instead of -1, or return a specific 
error code that the client can use to know he should just use Socket 0? Right 
now I can't tell apart any potential errors or weird values from correct 
values.

4) I'm willing to help make and test some patches... but first I want to 
understand what is happening with these funny functions before doing things 
blindly.

Thanks,
Matthew.