From: Bruce Richardson
To: dev@dpdk.org
Cc: ferruh.yigit@amd.com, Bruce Richardson, stable@dpdk.org, Padraig Connolly
Subject: [PATCH v2] ethdev: fix device init without socket-local memory
Date: Mon, 22 Jul 2024 11:02:28 +0100
Message-ID: <20240722100228.618616-1-bruce.richardson@intel.com>
In-Reply-To: <20240711123500.483119-1-bruce.richardson@intel.com>
References: <20240711123500.483119-1-bruce.richardson@intel.com>

When allocating memory for an ethdev, the rte_zmalloc_socket call used
only allocates memory on the NUMA node/socket local to the device. This
means that even if the user wanted to, they could never use a remote NIC
without also having memory on that NIC's socket.

For example, if we change examples/skeleton/basicfwd.c to have
SOCKET_ID_ANY as the socket_id parameter for Rx and Tx rings, we should
be able to run the app cross-NUMA, e.g. as below, where the two PCI
devices are on socket 1, and core 1 is on socket 0:

  ./build/examples/dpdk-skeleton -l 1 --legacy-mem --socket-mem=1024,0 \
      -a a8:00.0 -a b8:00.0

This fails, however, with the error:

  ETHDEV: failed to allocate private data
  PCI_BUS: Requested device 0000:a8:00.0 cannot be used

We can remove this restriction by falling back to a general rte_zmalloc
call when the call to rte_zmalloc_socket fails. This should be safe to
do because the later ethdev calls to set up Rx/Tx queues all take a
socket_id parameter, which applications can use to enforce the
requirement for local-only memory for a device, if so desired. [If
device-local memory is present it will be used as before, while if not
present the rte_eth_dev_configure call will now pass, but the subsequent
queue setup calls requesting local memory will fail].

Fixes: e489007a411c ("ethdev: add generic create/destroy ethdev APIs")
Fixes: dcd5c8112bc3 ("ethdev: add PCI driver helpers")
Cc: stable@dpdk.org

Signed-off-by: Bruce Richardson
Signed-off-by: Padraig Connolly
---
V2:
* Add warning printout in the case where we don't get device-local
  memory, but we do get memory on another socket.
---
 lib/ethdev/ethdev_driver.c | 20 +++++++++++++++-----
 lib/ethdev/ethdev_pci.h    | 20 +++++++++++++++++---
 2 files changed, 32 insertions(+), 8 deletions(-)

diff --git a/lib/ethdev/ethdev_driver.c b/lib/ethdev/ethdev_driver.c
index f48c0eb8bc..c335a25a82 100644
--- a/lib/ethdev/ethdev_driver.c
+++ b/lib/ethdev/ethdev_driver.c
@@ -303,15 +303,25 @@ rte_eth_dev_create(struct rte_device *device, const char *name,
 			return -ENODEV;
 
 		if (priv_data_size) {
+			/* try alloc private data on device-local node. */
 			ethdev->data->dev_private = rte_zmalloc_socket(
 				name, priv_data_size, RTE_CACHE_LINE_SIZE,
 				device->numa_node);
 
-			if (!ethdev->data->dev_private) {
-				RTE_ETHDEV_LOG_LINE(ERR,
-					"failed to allocate private data");
-				retval = -ENOMEM;
-				goto probe_failed;
+			/* fall back to alloc on any socket on failure */
+			if (ethdev->data->dev_private == NULL) {
+				ethdev->data->dev_private = rte_zmalloc(name,
+						priv_data_size, RTE_CACHE_LINE_SIZE);
+
+				if (ethdev->data->dev_private == NULL) {
+					RTE_ETHDEV_LOG_LINE(ERR, "failed to allocate private data");
+					retval = -ENOMEM;
+					goto probe_failed;
+				}
+				/* got memory, but not local, so issue warning */
+				RTE_ETHDEV_LOG_LINE(WARNING,
+						"Private data for ethdev '%s' not allocated on local NUMA node %d",
+						device->name, device->numa_node);
 			}
 		}
 	} else {
diff --git a/lib/ethdev/ethdev_pci.h b/lib/ethdev/ethdev_pci.h
index 737fff1833..ec4f731270 100644
--- a/lib/ethdev/ethdev_pci.h
+++ b/lib/ethdev/ethdev_pci.h
@@ -93,12 +93,26 @@ rte_eth_dev_pci_allocate(struct rte_pci_device *dev, size_t private_data_size)
 			return NULL;
 
 		if (private_data_size) {
+			/* Try and alloc the private-data structure on socket local to the device */
 			eth_dev->data->dev_private = rte_zmalloc_socket(name,
 				private_data_size, RTE_CACHE_LINE_SIZE,
 				dev->device.numa_node);
-			if (!eth_dev->data->dev_private) {
-				rte_eth_dev_release_port(eth_dev);
-				return NULL;
+
+			/* if cannot allocate memory on the socket local to the device
+			 * use rte_malloc to allocate memory on some other socket, if available.
+			 */
+			if (eth_dev->data->dev_private == NULL) {
+				eth_dev->data->dev_private = rte_zmalloc(name,
+						private_data_size, RTE_CACHE_LINE_SIZE);
+
+				if (eth_dev->data->dev_private == NULL) {
+					rte_eth_dev_release_port(eth_dev);
+					return NULL;
+				}
+				/* got memory, but not local, so issue warning */
+				RTE_ETHDEV_LOG_LINE(WARNING,
+						"Private data for ethdev '%s' not allocated on local NUMA node %d",
+						dev->device.name, dev->device.numa_node);
 			}
 		}
 	} else {
-- 
2.43.0
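
For anyone wanting to reason about the local-first/fallback behaviour
without a DPDK build, the pattern used in both hunks can be sketched in
plain C as below. This is only an illustration under stated assumptions:
mock_zmalloc_node() and mock_zmalloc_any() are hypothetical stand-ins
for rte_zmalloc_socket() and rte_zmalloc(), the node_free[] budgets are
invented to mimic --socket-mem=1024,0, and alloc_private_data() is not a
DPDK function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_NODES 2

/* Hypothetical per-node budgets in bytes, mimicking --socket-mem=1024,0. */
static size_t node_free[NUM_NODES] = { 1024, 0 };

/* Stand-in for rte_zmalloc_socket(): succeeds only if 'node' has room. */
static void *mock_zmalloc_node(size_t size, int node)
{
	if (node < 0 || node >= NUM_NODES || node_free[node] < size)
		return NULL;
	node_free[node] -= size;
	return calloc(1, size);
}

/* Stand-in for rte_zmalloc(): takes memory from the first node with room. */
static void *mock_zmalloc_any(size_t size)
{
	for (int node = 0; node < NUM_NODES; node++) {
		void *p = mock_zmalloc_node(size, node);

		if (p != NULL)
			return p;
	}
	return NULL;
}

/*
 * Try the device-local node first, then fall back to any node.
 * *was_local is set to 1 only when the local attempt succeeded, so the
 * caller can decide whether to log a warning, as the patch does.
 */
static void *alloc_private_data(size_t size, int dev_node, int *was_local)
{
	void *p;

	assert(was_local != NULL);

	p = mock_zmalloc_node(size, dev_node);
	*was_local = (p != NULL);
	if (p == NULL)
		p = mock_zmalloc_any(size);

	return p;
}
```

With the budgets above, a request for a device on node 1 is satisfied
from node 0 and reports was_local == 0, which corresponds to the new
WARNING path; a request larger than every node's budget still returns
NULL, which corresponds to the unchanged ERR path.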