From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <aburakov@ecsmtp.ir.intel.com>
Received: from mga06.intel.com (mga06.intel.com [134.134.136.31])
 by dpdk.org (Postfix) with ESMTP id 0286C4CC0
 for <dev@dpdk.org>; Tue, 24 Apr 2018 17:54:27 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from fmsmga007.fm.intel.com ([10.253.24.52])
 by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 24 Apr 2018 08:54:25 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.49,323,1520924400"; d="scan'208";a="34294132"
Received: from irvmail001.ir.intel.com ([163.33.26.43])
 by fmsmga007.fm.intel.com with ESMTP; 24 Apr 2018 08:54:24 -0700
Received: from sivswdev01.ir.intel.com (sivswdev01.ir.intel.com
 [10.237.217.45])
 by irvmail001.ir.intel.com (8.14.3/8.13.6/MailSET/Hub) with ESMTP id
 w3OFsNFV006465; Tue, 24 Apr 2018 16:54:23 +0100
Received: from sivswdev01.ir.intel.com (localhost [127.0.0.1])
 by sivswdev01.ir.intel.com with ESMTP id w3OFsNMh024882;
 Tue, 24 Apr 2018 16:54:23 +0100
Received: (from aburakov@localhost)
 by sivswdev01.ir.intel.com with LOCAL id w3OFsNZq024874;
 Tue, 24 Apr 2018 16:54:23 +0100
From: Anatoly Burakov <anatoly.burakov@intel.com>
To: dev@dpdk.org
Cc: bruce.richardson@intel.com, thomas@monjalon.net
Date: Tue, 24 Apr 2018 16:54:21 +0100
Message-Id: <cover.1524585160.git.anatoly.burakov@intel.com>
X-Mailer: git-send-email 1.7.0.7
In-Reply-To: <cover.1524140413.git.anatoly.burakov@intel.com>
References: <cover.1524140413.git.anatoly.burakov@intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Subject: [dpdk-dev] [PATCH v2 0/2] Fix file locking in EAL memory
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Tue, 24 Apr 2018 15:54:28 -0000

This patchset replaces the fcntl()-based locking used in
the original DPDK memory hotplug patchset with an flock()-
and lockfile-based implementation, due to numerous (well,
one, really) problems with how fcntl() locks work.

Long story short, a process's fcntl() locks on a file are
dropped as soon as any fd referring to that file is closed -
even if it's not the last fd, even if it isn't the fd that
was used to lock the file in the first place, and even if it
wasn't you who closed that fd, but some other library.
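
For anyone who wants to see this behaviour in isolation, here is a
minimal standalone sketch (not code from this patchset). The helper
names locked_by_other() and fcntl_lock_demo() are made up for
illustration. The check runs in a forked child because F_GETLK never
reports the caller's own locks.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if some other process holds a lock that
 * conflicts with a whole-file write lock on 'path', 0 if not, -1 on error.
 * The query is done in a child, since F_GETLK ignores our own locks. */
static int locked_by_other(const char *path)
{
	pid_t pid = fork();
	int status;

	if (pid < 0)
		return -1;
	if (pid == 0) {
		struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
		int fd = open(path, O_RDWR);

		if (fd < 0 || fcntl(fd, F_GETLK, &fl) < 0)
			_exit(2);
		_exit(fl.l_type == F_UNLCK ? 0 : 1);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return -1;
	return WEXITSTATUS(status) == 2 ? -1 : WEXITSTATUS(status);
}

/* Hypothetical demo: lock the file through lock_fd, then close a
 * *different* fd that refers to the same file, and record whether the
 * lock is visible to other processes before and after that close(). */
static int fcntl_lock_demo(const char *path, int *before, int *after)
{
	struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
	int lock_fd = open(path, O_RDWR | O_CREAT, 0600);
	int other_fd = open(path, O_RDWR);
	int ret = -1;

	if (lock_fd >= 0 && other_fd >= 0 &&
			fcntl(lock_fd, F_SETLK, &fl) == 0) {
		*before = locked_by_other(path);
		close(other_fd); /* not the fd we locked with */
		other_fd = -1;
		*after = locked_by_other(path);
		ret = 0;
	}
	if (other_fd >= 0)
		close(other_fd);
	if (lock_fd >= 0)
		close(lock_fd);
	return ret;
}
```

On Linux, *before comes back as 1 and *after as 0, even though lock_fd
is still open at the time of the second check.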

This patchset corrects this saddening design defect in the
original implementation.

One way to work around this would be OFD locks, but they
are only supported on kernels 3.15+, so we cannot rely on
them if we want to support old kernels. Hence, we use
per-segment lockfiles. The number of file descriptors we
open ends up no higher than in the non-single-file-segments
case - we still open the same number of files (a file per
page), plus a file per memseg list.
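
To illustrate why flock() on a dedicated lockfile avoids the problem,
here is a rough sketch. It is not the code in these patches. The names
segment_lock_shared() and try_exclusive(), and the idea of treating a
successful exclusive lock as "nobody else is using this segment", are
assumptions made for the example. flock() locks belong to the open file
description, so opening and closing unrelated fds to the same lockfile
does not release them.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: mark a segment as in use by taking a shared
 * flock() on its lockfile. Returns the fd, which must stay open for as
 * long as the segment is in use, or -1 on error. */
static int segment_lock_shared(const char *lockfile)
{
	int fd = open(lockfile, O_RDWR | O_CREAT, 0600);

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_SH | LOCK_NB) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}

/* Hypothetical helper: returns 1 if an exclusive lock can be taken
 * through a fresh open file description (i.e. nobody holds the
 * lockfile), 0 if it is held elsewhere, -1 on error. The probe fd is
 * always closed again, which only drops the probe's own lock. */
static int try_exclusive(const char *lockfile)
{
	int fd = open(lockfile, O_RDWR | O_CREAT, 0600);
	int ret;

	if (fd < 0)
		return -1;
	if (flock(fd, LOCK_EX | LOCK_NB) == 0)
		ret = 1;
	else
		ret = (errno == EWOULDBLOCK) ? 0 : -1;
	close(fd);
	return ret;
}
```

With a shared lock held via segment_lock_shared(), repeated calls to
try_exclusive() keep returning 0 despite each call opening and closing
another fd to the same file. Once the holder closes its fd,
try_exclusive() returns 1.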

Additionally, since flock() is not atomic, we also lock the
hugepage dir to prevent multiple processes from concurrently
performing operations on hugetlbfs mounts.
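
The directory lock can be pictured roughly as follows. This is only a
sketch under assumptions: with_dir_lock() and count_call() are
hypothetical names, and the actual patches may structure this
differently. The idea is to hold an exclusive flock() on the directory
fd for the duration of an operation, so that only one process at a time
runs that operation.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/* Hypothetical helper: run op(arg) while holding an exclusive flock()
 * on the directory 'dir'. Blocks until the lock is available. Returns
 * op's return value, or -1 if the directory could not be locked. */
static int with_dir_lock(const char *dir, int (*op)(void *), void *arg)
{
	int dir_fd = open(dir, O_RDONLY);
	int ret;

	if (dir_fd < 0)
		return -1;
	if (flock(dir_fd, LOCK_EX) < 0) {
		close(dir_fd);
		return -1;
	}
	ret = op(arg);
	flock(dir_fd, LOCK_UN);
	close(dir_fd);
	return ret;
}

/* Stand-in for the real work (e.g. creating or removing a page file). */
static int count_call(void *arg)
{
	++*(int *)arg;
	return 0;
}
```

A caller would pass the hugetlbfs mount point as 'dir' and do its file
creation or removal inside the callback.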

If you know of a more enlightened way of fixing this
limitation, you are certainly welcome to comment :)

v2:
- Fixes as per review comments
- Make lockfiles hidden by default

Anatoly Burakov (2):
  mem: add memalloc init stage
  mem: revert to using flock() and add per-segment lockfiles

 lib/librte_eal/bsdapp/eal/eal_memalloc.c        |   6 +
 lib/librte_eal/common/eal_common_memory.c       |   3 +
 lib/librte_eal/common/eal_filesystem.h          |  18 +
 lib/librte_eal/common/eal_memalloc.h            |   3 +
 lib/librte_eal/linuxapp/eal/eal_hugepage_info.c |  28 +-
 lib/librte_eal/linuxapp/eal/eal_memalloc.c      | 604 +++++++++++++++---------
 lib/librte_eal/linuxapp/eal/eal_memory.c        |  22 +-
 7 files changed, 420 insertions(+), 264 deletions(-)

-- 
2.7.4