From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124])
	by inbox.dpdk.org (Postfix) with ESMTP id C0B52A04A6;
	Fri,  7 Jan 2022 17:10:32 +0100 (CET)
Received: from [217.70.189.124] (localhost [127.0.0.1])
	by mails.dpdk.org (Postfix) with ESMTP id 9FE3641141;
	Fri,  7 Jan 2022 17:10:21 +0100 (CET)
Received: from NAM11-CO1-obe.outbound.protection.outlook.com
 (mail-co1nam11on2052.outbound.protection.outlook.com [40.107.220.52])
 by mails.dpdk.org (Postfix) with ESMTP id 5196F40042
 for <dev@dpdk.org>; Fri,  7 Jan 2022 17:10:19 +0100 (CET)
Received: from CO2PR04CA0153.namprd04.prod.outlook.com (2603:10b6:104::31) by
 CH2PR12MB3894.namprd12.prod.outlook.com (2603:10b6:610:2b::28) with
 Microsoft
 SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id
 15.20.4867.7; Fri, 7 Jan 2022 16:10:16 +0000
Received: from CO1NAM11FT042.eop-nam11.prod.protection.outlook.com
 (2603:10b6:104:0:cafe::7d) by CO2PR04CA0153.outlook.office365.com
 (2603:10b6:104::31) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4867.9 via Frontend
 Transport; Fri, 7 Jan 2022 16:10:16 +0000
X-MS-Exchange-Authentication-Results: spf=pass (sender IP is 12.22.5.235)
 smtp.mailfrom=nvidia.com; dkim=none (message not signed)
 header.d=none;dmarc=pass action=none header.from=nvidia.com;
Received-SPF: Pass (protection.outlook.com: domain of nvidia.com designates
 12.22.5.235 as permitted sender) receiver=protection.outlook.com;
 client-ip=12.22.5.235; helo=mail.nvidia.com;
Received: from mail.nvidia.com (12.22.5.235) by
 CO1NAM11FT042.mail.protection.outlook.com (10.13.174.250) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id
 15.20.4867.7 via Frontend Transport; Fri, 7 Jan 2022 16:10:16 +0000
Received: from rnnvmail201.nvidia.com (10.129.68.8) by DRHQMAIL107.nvidia.com
 (10.27.9.16) with Microsoft SMTP Server (TLS) id 15.0.1497.18;
 Fri, 7 Jan 2022 16:10:15 +0000
Received: from nvidia.com (172.20.187.5) by rnnvmail201.nvidia.com
 (10.129.68.8) with Microsoft SMTP Server (version=TLS1_2,
 cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384) id 15.2.986.9; Fri, 7 Jan 2022
 08:10:15 -0800
From: <eagostini@nvidia.com>
To: <dev@dpdk.org>
CC: Elena Agostini <eagostini@nvidia.com>
Subject: [PATCH v2 3/3] gpu/cuda: mem alloc aligned memory
Date: Sat, 8 Jan 2022 00:20:03 +0000
Message-ID: <20220108002003.21153-3-eagostini@nvidia.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20220108002003.21153-1-eagostini@nvidia.com>
References: <20220104014721.1799-1-eagostini@nvidia.com>
 <20220108002003.21153-1-eagostini@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org

From: Elena Agostini <eagostini@nvidia.com>

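Implement aligned GPU memory allocation in the gpu/cuda driver.

cuda_mem_alloc() now takes an alignment argument. It allocates
size + align bytes, rounds the returned device address up to the
next multiple of align, and keeps the original address and size in
the memory list entry so that cuda_mem_free() can pass the original
address to cuMemFree().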

Changelog:
- Changed the order of the cuda_mem_alloc() parameters

Signed-off-by: Elena Agostini <eagostini@nvidia.com>
---
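
Note: the over-allocate-and-round-up scheme in this patch can be
sketched on the host as below. This is only an illustration under
stated assumptions, not driver code: malloc()/free() stand in for
cuMemAlloc()/cuMemFree(), and struct buf_entry, align_up(),
buf_alloc() and buf_free() are hypothetical names that mirror the
ptr_orig_d/ptr_d and size_orig/size fields added to struct mem_entry.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical host-side stand-in for struct mem_entry. */
struct buf_entry {
	void *ptr_orig;   /* address returned by the allocator, used to free */
	void *ptr;        /* aligned address handed to the caller */
	size_t size;      /* usable size requested by the caller */
	size_t size_orig; /* size actually allocated: size + align */
};

/* Round addr up to the next multiple of align; align == 0 disables it. */
static uintptr_t
align_up(uintptr_t addr, unsigned int align)
{
	if (align && addr % align)
		addr += align - (addr % align);
	return addr;
}

/* Over-allocate by align bytes so the aligned range always fits. */
static int
buf_alloc(struct buf_entry *e, size_t size, unsigned int align)
{
	e->size = size;
	e->size_orig = size + align;
	e->ptr_orig = malloc(e->size_orig);
	if (e->ptr_orig == NULL)
		return -1;
	e->ptr = (void *)align_up((uintptr_t)e->ptr_orig, align);
	return 0;
}

/* Free with the original address, never the aligned one. */
static void
buf_free(struct buf_entry *e)
{
	free(e->ptr_orig);
	e->ptr_orig = NULL;
	e->ptr = NULL;
}
```

The aligned address is at most align - 1 bytes past the original one,
so the size bytes starting at ptr always stay inside the size + align
bytes that were allocated. The modulo form also works when align is
not a power of two.
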
 drivers/gpu/cuda/cuda.c | 21 ++++++++++++++++-----
 1 file changed, 16 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/cuda/cuda.c b/drivers/gpu/cuda/cuda.c
index 882df08e56..dc8d3d3b5a 100644
--- a/drivers/gpu/cuda/cuda.c
+++ b/drivers/gpu/cuda/cuda.c
@@ -139,8 +139,10 @@ typedef uintptr_t cuda_ptr_key;
 /* Single entry of the memory list */
 struct mem_entry {
 	CUdeviceptr ptr_d;
+	CUdeviceptr ptr_orig_d;
 	void *ptr_h;
 	size_t size;
+	size_t size_orig;
 	struct rte_gpu *dev;
 	CUcontext ctx;
 	cuda_ptr_key pkey;
@@ -569,7 +571,7 @@ cuda_dev_info_get(struct rte_gpu *dev, struct rte_gpu_info *info)
  */
 
 static int
-cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
+cuda_mem_alloc(struct rte_gpu *dev, size_t size, unsigned int align, void **ptr)
 {
 	CUresult res;
 	const char *err_string;
@@ -610,8 +612,10 @@ cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
 
 	/* Allocate memory */
 	mem_alloc_list_tail->size = size;
-	res = pfn_cuMemAlloc(&(mem_alloc_list_tail->ptr_d),
-			mem_alloc_list_tail->size);
+	mem_alloc_list_tail->size_orig = size + align;
+
+	res = pfn_cuMemAlloc(&(mem_alloc_list_tail->ptr_orig_d),
+			mem_alloc_list_tail->size_orig);
 	if (res != 0) {
 		pfn_cuGetErrorString(res, &(err_string));
 		rte_cuda_log(ERR, "cuCtxSetCurrent current failed with %s",
@@ -620,6 +624,13 @@ cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
 		return -rte_errno;
 	}
 
+
+	/* Align memory address */
+	mem_alloc_list_tail->ptr_d = mem_alloc_list_tail->ptr_orig_d;
+	if (align && ((uintptr_t)mem_alloc_list_tail->ptr_d) % align)
+		mem_alloc_list_tail->ptr_d += (align -
+				(((uintptr_t)mem_alloc_list_tail->ptr_d) % align));
+
 	/* GPUDirect RDMA attribute required */
 	res = pfn_cuPointerSetAttribute(&flag,
 			CU_POINTER_ATTRIBUTE_SYNC_MEMOPS,
@@ -634,7 +645,6 @@ cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
 
 	mem_alloc_list_tail->pkey = get_hash_from_ptr((void *)mem_alloc_list_tail->ptr_d);
 	mem_alloc_list_tail->ptr_h = NULL;
-	mem_alloc_list_tail->size = size;
 	mem_alloc_list_tail->dev = dev;
 	mem_alloc_list_tail->ctx = (CUcontext)((uintptr_t)dev->mpshared->info.context);
 	mem_alloc_list_tail->mtype = GPU_MEM;
@@ -761,6 +771,7 @@ cuda_mem_register(struct rte_gpu *dev, size_t size, void *ptr)
 	mem_alloc_list_tail->dev = dev;
 	mem_alloc_list_tail->ctx = (CUcontext)((uintptr_t)dev->mpshared->info.context);
 	mem_alloc_list_tail->mtype = CPU_REGISTERED;
+	mem_alloc_list_tail->ptr_orig_d = mem_alloc_list_tail->ptr_d;
 
 	/* Restore original ctx as current ctx */
 	res = pfn_cuCtxSetCurrent(current_ctx);
@@ -796,7 +807,7 @@ cuda_mem_free(struct rte_gpu *dev, void *ptr)
 	}
 
 	if (mem_item->mtype == GPU_MEM) {
-		res = pfn_cuMemFree(mem_item->ptr_d);
+		res = pfn_cuMemFree(mem_item->ptr_orig_d);
 		if (res != 0) {
 			pfn_cuGetErrorString(res, &(err_string));
 			rte_cuda_log(ERR, "cuMemFree current failed with %s",
-- 
2.17.1