From mboxrd@z Thu Jan 1 00:00:00 1970
From: Shahaf Shuler
To: Anatoly Burakov, "dev@dpdk.org"
CC: "arybchenko@solarflare.com"
Date: Mon, 30 Apr 2018 12:48:29 +0000
References: <00fe124e2057fe6c5596fb0a24bdcce9b36c3b90.1525082912.git.anatoly.burakov@intel.com>
In-Reply-To: <00fe124e2057fe6c5596fb0a24bdcce9b36c3b90.1525082912.git.anatoly.burakov@intel.com>
Subject: Re: [dpdk-dev] [PATCH] eal: check if hugedir write lock is already being held
List-Id: DPDK patches and discussions

Monday, April 30, 2018 1:38 PM, Anatoly Burakov:
> Cc: arybchenko@solarflare.com; anatoly.burakov@intel.com
> Subject: [dpdk-dev] [PATCH] eal: check if hugedir write lock is already being
> held
>
> At hugepage info initialization, EAL takes out a write lock on hugetlbfs
> directories, and drops it after the memory init is finished. However, in
> non-legacy mode, if "-m" or "--socket-mem" switches are passed, this leads
> to a deadlock because EAL tries to allocate pages (and thus take out a
> write lock on hugedir) while still holding a separate hugedir write lock
> in EAL.
>
> Fix it by checking if write lock in hugepage info is active, and not trying
> to lock the directory if the hugedir fd is valid.
>
> Fixes: 1a7dc2252f28 ("mem: revert to using flock and add per-segment
> lockfiles")
> Cc: anatoly.burakov@intel.com
>
> Signed-off-by: Anatoly Burakov
> ---
>  lib/librte_eal/linuxapp/eal/eal_memalloc.c | 71 ++++++++++++++++++------------
>  1 file changed, 42 insertions(+), 29 deletions(-)
>
> diff --git a/lib/librte_eal/linuxapp/eal/eal_memalloc.c b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> index 00d7886..360d8f7 100644
> --- a/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> +++ b/lib/librte_eal/linuxapp/eal/eal_memalloc.c
> @@ -666,7 +666,7 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
>      struct alloc_walk_param *wa = arg;
>      struct rte_memseg_list *cur_msl;
>      size_t page_sz;
> -    int cur_idx, start_idx, j, dir_fd;
> +    int cur_idx, start_idx, j, dir_fd = -1;
>      unsigned int msl_idx, need, i;
>
>      if (msl->page_sz != wa->page_sz)
> @@ -691,19 +691,24 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
>       * because file creation and locking operations are not atomic,
>       * and we might be the first or the last ones to use a particular page,
>       * so we need to ensure atomicity of every operation.
> +     *
> +     * during init, we already hold a write lock, so don't try to take out
> +     * another one.
>       */
> -    dir_fd = open(wa->hi->hugedir, O_RDONLY);
> -    if (dir_fd < 0) {
> -        RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n", __func__,
> -            wa->hi->hugedir, strerror(errno));
> -        return -1;
> -    }
> -    /* blocking writelock */
> -    if (flock(dir_fd, LOCK_EX)) {
> -        RTE_LOG(ERR, EAL, "%s(): Cannot lock '%s': %s\n", __func__,
> -            wa->hi->hugedir, strerror(errno));
> -        close(dir_fd);
> -        return -1;
> +    if (wa->hi->lock_descriptor == -1) {
> +        dir_fd = open(wa->hi->hugedir, O_RDONLY);
> +        if (dir_fd < 0) {
> +            RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n",
> +                __func__, wa->hi->hugedir, strerror(errno));
> +            return -1;
> +        }
> +        /* blocking writelock */
> +        if (flock(dir_fd, LOCK_EX)) {
> +            RTE_LOG(ERR, EAL, "%s(): Cannot lock '%s': %s\n",
> +                __func__, wa->hi->hugedir, strerror(errno));
> +            close(dir_fd);
> +            return -1;
> +        }
>      }
>
>      for (i = 0; i < need; i++, cur_idx++) {
> @@ -742,7 +747,8 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
>              if (wa->ms)
>                  memset(wa->ms, 0, sizeof(*wa->ms) * wa->n_segs);
>
> -            close(dir_fd);
> +            if (dir_fd >= 0)
> +                close(dir_fd);
>              return -1;
>          }
>          if (wa->ms)
> @@ -754,7 +760,8 @@ alloc_seg_walk(const struct rte_memseg_list *msl, void *arg)
>      wa->segs_allocated = i;
>      if (i > 0)
>          cur_msl->version++;
> -    close(dir_fd);
> +    if (dir_fd >= 0)
> +        close(dir_fd);
>      return 1;
>  }
>
> @@ -769,7 +776,7 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg)
>      struct rte_memseg_list *found_msl;
>      struct free_walk_param *wa = arg;
>      uintptr_t start_addr, end_addr;
> -    int msl_idx, seg_idx, ret, dir_fd;
> +    int msl_idx, seg_idx, ret, dir_fd = -1;
>
>      start_addr = (uintptr_t) msl->base_va;
>      end_addr = start_addr + msl->memseg_arr.len * (size_t)msl->page_sz;
> @@ -788,19 +795,24 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg)
>       * because file creation and locking operations are not atomic,
>       * and we might be the first or the last ones to use a particular page,
>       * so we need to ensure atomicity of every operation.
> +     *
> +     * during init, we already hold a write lock, so don't try to take out
> +     * another one.
>       */
> -    dir_fd = open(wa->hi->hugedir, O_RDONLY);
> -    if (dir_fd < 0) {
> -        RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n", __func__,
> -            wa->hi->hugedir, strerror(errno));
> -        return -1;
> -    }
> -    /* blocking writelock */
> -    if (flock(dir_fd, LOCK_EX)) {
> -        RTE_LOG(ERR, EAL, "%s(): Cannot lock '%s': %s\n", __func__,
> -            wa->hi->hugedir, strerror(errno));
> -        close(dir_fd);
> -        return -1;
> +    if (wa->hi->lock_descriptor == -1) {
> +        dir_fd = open(wa->hi->hugedir, O_RDONLY);
> +        if (dir_fd < 0) {
> +            RTE_LOG(ERR, EAL, "%s(): Cannot open '%s': %s\n",
> +                __func__, wa->hi->hugedir, strerror(errno));
> +            return -1;
> +        }
> +        /* blocking writelock */
> +        if (flock(dir_fd, LOCK_EX)) {
> +            RTE_LOG(ERR, EAL, "%s(): Cannot lock '%s': %s\n",
> +                __func__, wa->hi->hugedir, strerror(errno));
> +            close(dir_fd);
> +            return -1;
> +        }
>      }
>
>      found_msl->version++;
> @@ -809,7 +821,8 @@ free_seg_walk(const struct rte_memseg_list *msl, void *arg)
>
>      ret = free_seg(wa->ms, wa->hi, msl_idx, seg_idx);
>
> -    close(dir_fd);
> +    if (dir_fd >= 0)
> +        close(dir_fd);
>
>      if (ret < 0)
>          return -1;

Tested-By: Shahaf Shuler

> --
> 2.7.4
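
For anyone following the thread, here is a minimal standalone sketch of the locking behaviour behind the deadlock, assuming Linux flock() semantics on a local filesystem. The helpers try_lock_dir() and lock_unless_held() are hypothetical names used only for illustration and are not part of EAL; the sketch uses LOCK_NB so that the conflict shows up as an error rather than a hang. It is meant to show why a second descriptor on the same directory cannot take LOCK_EX while the first one still holds it, even within one process, and how skipping the lock when a descriptor is already held (as the patch does with lock_descriptor) avoids that.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper (not an EAL function): open 'path' and try to take
 * an exclusive flock() without blocking. Returns the locked fd on success,
 * or -1 with errno set on failure.
 */
static int try_lock_dir(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return -1;
    if (flock(fd, LOCK_EX | LOCK_NB) != 0) {
        int saved = errno; /* close() may overwrite errno */

        close(fd);
        errno = saved;
        return -1;
    }
    return fd;
}

/*
 * Hypothetical helper mirroring the shape of the fix: only take the lock
 * when no lock descriptor is already held (held_fd == -1). When the lock
 * is skipped, *dir_fd stays -1 so the caller knows not to close() it.
 * Returns 0 on success, -1 on failure.
 */
static int lock_unless_held(const char *path, int held_fd, int *dir_fd)
{
    assert(dir_fd != NULL);

    *dir_fd = -1;
    if (held_fd != -1)
        return 0; /* already locked elsewhere in this process */

    *dir_fd = try_lock_dir(path);
    return *dir_fd < 0 ? -1 : 0;
}
```

With a blocking flock(fd, LOCK_EX), as in the original code, the second attempt would wait forever instead of failing, which matches the deadlock described in the commit message. Callers of lock_unless_held() would close dir_fd only when it is >= 0, the same guard the patch adds before each close().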