From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mails.dpdk.org (mails.dpdk.org [217.70.189.124]) by inbox.dpdk.org (Postfix) with ESMTP id 1026AA034D; Mon, 3 Jan 2022 19:15:16 +0100 (CET) Received: from [217.70.189.124] (localhost [127.0.0.1]) by mails.dpdk.org (Postfix) with ESMTP id C1D7B40042; Mon, 3 Jan 2022 19:15:15 +0100 (CET) Received: from NAM04-BN8-obe.outbound.protection.outlook.com (mail-bn8nam08on2053.outbound.protection.outlook.com [40.107.100.53]) by mails.dpdk.org (Postfix) with ESMTP id 58CED4003C for ; Mon, 3 Jan 2022 19:15:13 +0100 (CET) ARC-Seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=lgrjpltfxTQ0UT2zrZ4yKcocNK4p435Fyi5ML5a8m0AyoZ9wI9d4A7ODu4Gr6hEzuGuQClu+2Y/Prm7pYGcW+PoRJmA+16Mg8c0IAyODrxuwsjJZTxY53TdKilYnAwNeC4lH4fsj3anrdjsk7kIlkBVUWN28kRV/VJZZ3Od/njJE1oC8RzEGcgbjnZz9DzI1OuOqwSyW1tjQ9RMAiLiKgeLo0uDDvvt+mf5M0Brstaryaz/WBNE5OKdTdj08GWmyHEZ8NAeBWHJqucW6SDx2HbQn0QSBxs2pIJQk3cfd5FWHhrlr8Iqf+Mrwysep6j3zenU2x4smh4cC3lXQm8sHIw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=iylLbbs4GsoWWIC1vsy+80+bA4kAcDyCbBTBHSHCs5s=; b=d7yJvUPkwOL8MmxAYJMjqYd9kCMI92rIaSAXfBm5hEkmAdRm0YT1VgCxEyO+6vlzBiMLGabbJRAJL7SPtUsDfXL5N7BaLodt+BRfSQOaNsTKxY7Lba/7qJ4vzlsHz9XWJ0TnJ1Pc2Nn1y+OZTHIwD0MskRNdoXn45f2Bagrjnz3c9vvt842WsEjr8yR3xViq6s2vd5tGEvjs5JVj/eg+YkJ5HR+fNcIZ5T0E9yAbgBX5A/IkDHGf9Y+ZqGmUWHi3RQXFLcIiW2ejAlDMSoMYipU5Ie3tYOMAKliZYEjYfVvLp4mwlYMJwdKGVTbZJsmNf+pEckTshKVg2cerhNdwag== ARC-Authentication-Results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=nvidia.com; dmarc=pass action=none header.from=nvidia.com; dkim=pass header.d=nvidia.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Nvidia.com; s=selector2; 
h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-SenderADCheck; bh=iylLbbs4GsoWWIC1vsy+80+bA4kAcDyCbBTBHSHCs5s=; b=MZXNibBeAbeClxnYFA3ZWQrF0TfWLIP9IQ28eCQ0AmyDZmrNzvduWnu+VV12ogNMQ+tZRs2gRW+fN67sQfFKBTS60EuOcvIzKyX+UrT7Ppw0+cWQq+ij7knOslZLgvMnfzsVxJx2I22vK7njtcTVWKj0C6U63FBbygShDZlY0yKiYBLY+8VZa6XUzEM+42qmy1lEdTkWEaF8bcdencY4x6miKEHH9EescjCuyMqLNydn3rweKzFimTJdrYAa2VLsAd9g/SarW4heO2ISt/elBLeQL0onmD5f6jpuoO29foMnwdigDXO3wEc+Wpk3NieWuywSDLOJNmxVVNYYPps9gw== Received: from DM6PR12MB4107.namprd12.prod.outlook.com (2603:10b6:5:218::7) by DM6PR12MB2875.namprd12.prod.outlook.com (2603:10b6:5:15c::28) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.20.4844.13; Mon, 3 Jan 2022 18:15:12 +0000 Received: from DM6PR12MB4107.namprd12.prod.outlook.com ([fe80::7006:8ce7:3bc4:512]) by DM6PR12MB4107.namprd12.prod.outlook.com ([fe80::7006:8ce7:3bc4:512%4]) with mapi id 15.20.4844.016; Mon, 3 Jan 2022 18:15:11 +0000 From: Elena Agostini To: Stephen Hemminger CC: "dev@dpdk.org" Subject: Re: [PATCH v1 3/3] gpu/cuda: mem alloc aligned memory Thread-Topic: [PATCH v1 3/3] gpu/cuda: mem alloc aligned memory Thread-Index: AQHYAMiXl96E1P1nikGhAu5uOI9iL6xRl18AgAABZKI= Date: Mon, 3 Jan 2022 18:15:11 +0000 Message-ID: References: <20220104014721.1799-1-eagostini@nvidia.com> <20220104014721.1799-4-eagostini@nvidia.com> <20220103100520.66677c3f@hermes.local> In-Reply-To: <20220103100520.66677c3f@hermes.local> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=nvidia.com; x-ms-publictraffictype: Email x-ms-office365-filtering-correlation-id: 1a247ee8-88f8-4154-c416-08d9cee4fbad x-ms-traffictypediagnostic: DM6PR12MB2875:EE_ x-microsoft-antispam-prvs: x-ms-oob-tlc-oobclassifiers: OLM:8882; x-ms-exchange-senderadcheck: 1 x-ms-exchange-antispam-relay: 0 x-microsoft-antispam: BCL:0; 
Content-Type: multipart/alternative; boundary="_000_DM6PR12MB4107C9CA5C45F4614BB71980CD499DM6PR12MB4107namp_" MIME-Version: 1.0 X-OriginatorOrg: Nvidia.com X-MS-Exchange-CrossTenant-AuthAs: Internal X-MS-Exchange-CrossTenant-AuthSource: DM6PR12MB4107.namprd12.prod.outlook.com X-MS-Exchange-CrossTenant-Network-Message-Id: 1a247ee8-88f8-4154-c416-08d9cee4fbad X-MS-Exchange-CrossTenant-originalarrivaltime: 03 Jan 2022 18:15:11.9657 (UTC) X-MS-Exchange-CrossTenant-fromentityheader: Hosted 
X-MS-Exchange-CrossTenant-id: 43083d15-7273-40c1-b7db-39efd9ccc17a X-MS-Exchange-CrossTenant-mailboxtype: HOSTED X-MS-Exchange-CrossTenant-userprincipalname: q6PnIeIXv97adIRnzutCBZ9RIJJtNUSrFQN3uJk9hwemoZveetb51rCOoU3vas+qWB8AovsoM24HmPQcVkD+VQ== X-MS-Exchange-Transport-CrossTenantHeadersStamped: DM6PR12MB2875 X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org

--_000_DM6PR12MB4107C9CA5C45F4614BB71980CD499DM6PR12MB4107namp_
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit

> From: Stephen Hemminger <stephen@networkplumber.org>
> Date: Monday, 3 January 2022 at 19:05
> To: Elena Agostini <eagostini@nvidia.com>
> Cc: dev@dpdk.org <dev@dpdk.org>
> Subject: Re: [PATCH v1 3/3] gpu/cuda: mem alloc aligned memory
> External email: Use caution opening links or attachments
>
> On Tue, 4 Jan 2022 01:47:21 +0000
> <eagostini@nvidia.com> wrote:
>
> >  static int
> > -cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
> > +cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr, unsigned int align)
> >  {
> >         CUresult res;
> >         const char *err_string;
> > @@ -610,8 +612,10 @@ cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
> >
> >         /* Allocate memory */
> >         mem_alloc_list_tail->size = size;
> > -       res = pfn_cuMemAlloc(&(mem_alloc_list_tail->ptr_d),
> > -                       mem_alloc_list_tail->size);
> > +       mem_alloc_list_tail->size_orig = size + align;
> > +
> > +       res = pfn_cuMemAlloc(&(mem_alloc_list_tail->ptr_orig_d),
> > +                       mem_alloc_list_tail->size_orig);
> >         if (res != 0) {
> >                 pfn_cuGetErrorString(res, &(err_string));
> >                 rte_cuda_log(ERR, "cuCtxSetCurrent current failed with %s",
> > @@ -620,6 +624,13 @@ cuda_mem_alloc(struct rte_gpu *dev, size_t size, void **ptr)
> >                 return -rte_errno;
> >         }
> >
> > +
> > +       /* Align memory address */
> > +       mem_alloc_list_tail->ptr_d = mem_alloc_list_tail->ptr_orig_d;
> > +       if (align && ((uintptr_t)mem_alloc_list_tail->ptr_d) % align)
> > +               mem_alloc_list_tail->ptr_d += (align -
> > +                               (((uintptr_t)mem_alloc_list_tail->ptr_d) % align));
>
> Posix memalign takes size_t for both size and alignment.

I based this patch on the rte_malloc() prototype, for consistency:

void *rte_malloc(const char *type, size_t size, unsigned align)

> Better to put the input parameters first, and then the resulting output parameter last
> for consistency; follows the Rusty API design manifesto.

Got it, will do.

> Alignment only makes sense if power of two. The code should check that and optimize
> for that.
>

The alignment value is checked in the gpudev library before it is
passed to the driver.

Adding this kind of check to the driver has been rejected in the past because
it was considered dead code (the library was already checking input parameters).

Let me know which option you prefer.

--_000_DM6PR12MB4107C9CA5C45F4614BB71980CD499DM6PR12MB4107namp_
Content-Type: text/html; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
--_000_DM6PR12MB4107C9CA5C45F4614BB71980CD499DM6PR12MB4107namp_--
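
For reference, the two points raised in the review (checking that the alignment is a power of two, and rounding the address up) could be sketched in plain C roughly as below. This is only an illustration under stated assumptions, not the gpu/cuda driver code: is_pow2(), align_up() and align_up_mod() are hypothetical helper names, and addresses are treated as plain uintptr_t values rather than CUdeviceptr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: nonzero if x is a power of two (0 is rejected). */
static inline int is_pow2(size_t x)
{
	return x != 0 && (x & (x - 1)) == 0;
}

/*
 * Hypothetical helper: round addr up to the next multiple of align.
 * Valid only when align is a power of two, which is what allows the
 * modulo to be replaced by a mask.
 */
static inline uintptr_t align_up(uintptr_t addr, size_t align)
{
	assert(is_pow2(align));
	return (addr + (align - 1)) & ~((uintptr_t)align - 1);
}

/*
 * Modulo-based form, equivalent to the logic in the quoted patch.
 * It works for any non-zero align; align == 0 means "no alignment".
 */
static inline uintptr_t align_up_mod(uintptr_t addr, size_t align)
{
	if (align && addr % align)
		addr += align - (addr % align);
	return addr;
}
```

In both forms the address moves forward by less than align bytes, so a buffer of size + align bytes, as allocated in the quoted patch, always contains the aligned region of size bytes.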