From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <nhorman@tuxdriver.com>
Received: from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58])
 by dpdk.org (Postfix) with ESMTP id 22BFB8D99
 for <dev@dpdk.org>; Wed, 30 Sep 2015 15:14:57 +0200 (CEST)
Received: from hmsreliant.think-freely.org
 ([2001:470:8:a08:7aac:c0ff:fec2:933b] helo=localhost)
 by smtp.tuxdriver.com with esmtpsa (TLSv1:AES128-SHA:128) (Exim 4.63)
 (envelope-from <nhorman@tuxdriver.com>)
 id 1ZhHDi-0004Cp-3p; Wed, 30 Sep 2015 09:14:55 -0400
Date: Wed, 30 Sep 2015 09:14:48 -0400
From: Neil Horman <nhorman@tuxdriver.com>
To: Bruce Richardson <bruce.richardson@intel.com>
Message-ID: <20150930131448.GA32524@hmsreliant.think-freely.org>
References: <PATCH>
 <1443445418-18498-1-git-send-email-bernard.iremonger@intel.com>
 <1443445418-18498-3-git-send-email-bernard.iremonger@intel.com>
 <20150929190812.GA3154@hmsreliant.think-freely.org>
 <20150930095603.GA10264@bricha3-MOBL3>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20150930095603.GA10264@bricha3-MOBL3>
User-Agent: Mutt/1.5.23 (2014-03-12)
X-Spam-Score: -1.0 (-)
X-Spam-Status: No
Cc: dev@dpdk.org
Subject: Re: [dpdk-dev] [PATCH 02/20] librte_ether: add fields from
 rte_pci_driver to rte_eth_dev_data
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 30 Sep 2015 13:14:57 -0000

On Wed, Sep 30, 2015 at 10:56:04AM +0100, Bruce Richardson wrote:
> On Tue, Sep 29, 2015 at 03:08:12PM -0400, Neil Horman wrote:
> > On Mon, Sep 28, 2015 at 02:03:20PM +0100, Bernard Iremonger wrote:
> > > add dev_flags to rte_eth_dev_data, add macros for dev_flags.
> > > add kdrv to rte_eth_dev_data.
> > > add numa_node to rte_eth_dev_data.
> > > add drv_name to rte_eth_dev_data.
> > > use dev_type to distinguish between vdev's and pdev's.
> > > remove pci_dev branches.
> > > 
> > > Signed-off-by: Bernard Iremonger <bernard.iremonger@intel.com>
> > > ---
> > >  lib/librte_ether/rte_ethdev.c | 53 ++++++++++++++++++++++++-------------------
> > >  lib/librte_ether/rte_ethdev.h | 15 ++++++++++++
> > >  2 files changed, 45 insertions(+), 23 deletions(-)
> > > 
> <snip>
> > > +++ b/lib/librte_ether/rte_ethdev.h
> > > @@ -1635,8 +1635,23 @@ struct rte_eth_dev_data {
> > >  		all_multicast : 1, /**< RX all multicast mode ON(1) / OFF(0). */
> > >  		dev_started : 1,   /**< Device state: STARTED(1) / STOPPED(0). */
> > >  		lro         : 1;   /**< RX LRO is ON(1) / OFF(0) */
> > > +	uint32_t dev_flags; /**< Flags controlling handling of device. */
> > > +	enum rte_kernel_driver kdrv;	/**< Kernel driver passthrough */
> > Why add this here? The ennumerated driver types are all variants on PCI bus
> > types.  Not sure why the ethernet interface needs to know this info
> > 
> > > +	int numa_node;
> > Ditto, this seems like information that is only relevant if the device is on a
> > physical bus (i.e. virual devices are likely to not have a numa node)
> >
> Actually, I disagree. For some virtual devices they will have a numa node. For
> ring or other virtual PMDs the numa node will be the node on which the ring /
> mempool etc. memory is allocated on, and can be of relevance.
> 
> /Bruce
> 

I think it's fairly clear that some devices (including virtual ones) have some
relevant relation to a numa_node (there are even some that have no numa_node,
for which a -1 value makes some sense).  That said, there are just as many that
don't have a relevant numa_node.

There are some drivers for which numa_node makes no sense (regardless of
value):
 * af_packet - The numa node is at best determined at run time by the interface
the socket is bound to

 * pcap - same as af_packet

 * bonding - multiple interfaces mean multiple numa_nodes, any value set here is
just as likely to be wrong as right

 * mpipe - no real large memory area to associate with a numa node

 * virtio - uses iopl for communication, and cannot know its numa_node

 * vmxnet3 - same concept as virtio

 * xenvirt - same as vmxnet3

I think it's better that you store numa locality information in a PMD's private
bus data, and export it to applications via a device method.  That provides the
flexibility to tell the application that there is no numa locality for a device
(by not implementing the method), without having to expose an unset data field
to the application.
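
To make that concrete, here is a rough sketch of the shape I have in mind.
Every name in it (the demo_ prefix, the ops table, the numa_node_get hook, the
ring private struct) is made up for illustration and is not the existing
rte_ethdev API; treat it as one possible layout under those assumptions, not a
proposed patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* All names below are hypothetical; this is not the real rte_ethdev API. */
struct demo_eth_dev;

/* Optional per-driver hook that reports NUMA locality. */
typedef int (*demo_numa_node_get_t)(const struct demo_eth_dev *dev);

struct demo_eth_dev_ops {
	demo_numa_node_get_t numa_node_get; /* NULL: no NUMA locality */
};

struct demo_eth_dev {
	const struct demo_eth_dev_ops *ops;
	void *priv; /* driver/bus private data */
};

/* A ring-like virtual PMD knows where its memory was allocated. */
struct demo_ring_priv {
	int socket_id;
};

static int demo_ring_numa_node_get(const struct demo_eth_dev *dev)
{
	const struct demo_ring_priv *priv = dev->priv;

	return priv->socket_id;
}

static const struct demo_eth_dev_ops demo_ring_ops = {
	.numa_node_get = demo_ring_numa_node_get,
};

/* A pcap-like PMD has no meaningful answer, so it leaves the hook unset. */
static const struct demo_eth_dev_ops demo_pcap_ops = {
	.numa_node_get = NULL,
};

/*
 * Application-facing accessor: returns 0 and fills *node when the driver
 * implements the hook, or -ENOTSUP (leaving *node untouched) when it does not.
 */
static int demo_eth_dev_numa_node(const struct demo_eth_dev *dev, int *node)
{
	assert(dev != NULL && node != NULL);

	if (dev->ops == NULL || dev->ops->numa_node_get == NULL)
		return -ENOTSUP;

	*node = dev->ops->numa_node_get(dev);
	return 0;
}
```

With this layout an application checks the return code instead of guessing
whether a stored value was ever set, and the numa data itself never has to
live in the generic ethdev data structure.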

Neil