From mboxrd@z Thu Jan 1 00:00:00 1970
CC: Thomas Monjalon
Date: Wed, 3 Nov 2021 19:15:49 +0000
Message-ID: <20211103191554.16449-5-eagostini@nvidia.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20211103191554.16449-1-eagostini@nvidia.com>
References: <20210602203531.2288645-1-thomas@monjalon.net>
 <20211103191554.16449-1-eagostini@nvidia.com>
MIME-Version: 1.0
Content-Type: text/plain
Subject: [dpdk-dev] [PATCH v4 4/9] gpudev: support multi-process
X-BeenThere: dev@dpdk.org
Precedence: list
List-Id: DPDK patches and discussions
Errors-To: dev-bounces@dpdk.org
Sender: "dev"

From: Thomas Monjalon

Device data shared between processes is moved into a struct
allocated in shared memory (a new memzone for all GPUs).
The main struct rte_gpu references the shared memory
via the pointer mpshared.

The API function rte_gpu_attach() is added to attach a device
from a secondary process.
The function rte_gpu_allocate() can be used only by the primary process.

Signed-off-by: Thomas Monjalon
---
 lib/gpudev/gpudev.c        | 127 +++++++++++++++++++++++++++++++------
 lib/gpudev/gpudev_driver.h |  25 ++++++--
 lib/gpudev/version.map     |   1 +
 3 files changed, 127 insertions(+), 26 deletions(-)

diff --git a/lib/gpudev/gpudev.c b/lib/gpudev/gpudev.c
index 74cdd7f20b..f0690cf730 100644
--- a/lib/gpudev/gpudev.c
+++ b/lib/gpudev/gpudev.c
@@ -5,6 +5,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
@@ -28,6 +29,12 @@ static int16_t gpu_max;
 /* Number of currently valid devices */
 static int16_t gpu_count;
 
+/* Shared memory between processes. */
+static const char *GPU_MEMZONE = "rte_gpu_shared";
+static struct {
+	__extension__ struct rte_gpu_mpshared gpus[0];
+} *gpu_shared_mem;
+
 /* Event callback object */
 struct rte_gpu_callback {
 	TAILQ_ENTRY(rte_gpu_callback) next;
@@ -75,7 +82,7 @@ bool
 rte_gpu_is_valid(int16_t dev_id)
 {
 	if (dev_id >= 0 && dev_id < gpu_max &&
-		gpus[dev_id].state == RTE_GPU_STATE_INITIALIZED)
+		gpus[dev_id].process_state == RTE_GPU_STATE_INITIALIZED)
 		return true;
 	return false;
 }
@@ -85,7 +92,7 @@ gpu_match_parent(int16_t dev_id, int16_t parent)
 {
 	if (parent == RTE_GPU_ID_ANY)
 		return true;
-	return gpus[dev_id].info.parent == parent;
+	return gpus[dev_id].mpshared->info.parent == parent;
 }
 
 int16_t
@@ -94,7 +101,7 @@ rte_gpu_find_next(int16_t dev_id, int16_t parent)
 	if (dev_id < 0)
 		dev_id = 0;
 	while (dev_id < gpu_max &&
-			(gpus[dev_id].state == RTE_GPU_STATE_UNUSED ||
+			(gpus[dev_id].process_state == RTE_GPU_STATE_UNUSED ||
 			!gpu_match_parent(dev_id, parent)))
 		dev_id++;
 
@@ -109,7 +116,7 @@ gpu_find_free_id(void)
 	int16_t dev_id;
 
 	for (dev_id = 0; dev_id < gpu_max; dev_id++) {
-		if (gpus[dev_id].state == RTE_GPU_STATE_UNUSED)
+		if (gpus[dev_id].process_state == RTE_GPU_STATE_UNUSED)
 			return dev_id;
 	}
 	return RTE_GPU_ID_NONE;
@@ -136,12 +143,35 @@ rte_gpu_get_by_name(const char *name)
 
 	RTE_GPU_FOREACH(dev_id) {
 		dev = &gpus[dev_id];
-		if (strncmp(name, dev->name, RTE_DEV_NAME_MAX_LEN) == 0)
+		if (strncmp(name, dev->mpshared->name, RTE_DEV_NAME_MAX_LEN) == 0)
 			return dev;
 	}
 	return NULL;
 }
 
+static int
+gpu_shared_mem_init(void)
+{
+	const struct rte_memzone *memzone;
+
+	if (rte_eal_process_type() == RTE_PROC_PRIMARY) {
+		memzone = rte_memzone_reserve(GPU_MEMZONE,
+				sizeof(*gpu_shared_mem) +
+				sizeof(*gpu_shared_mem->gpus) * gpu_max,
+				SOCKET_ID_ANY, 0);
+	} else {
+		memzone = rte_memzone_lookup(GPU_MEMZONE);
+	}
+	if (memzone == NULL) {
+		GPU_LOG(ERR, "cannot initialize shared memory");
+		rte_errno = ENOMEM;
+		return -rte_errno;
+	}
+
+	gpu_shared_mem = memzone->addr;
+	return 0;
+}
+
 struct rte_gpu *
 rte_gpu_allocate(const char *name)
 {
@@ -163,6 +193,10 @@ rte_gpu_allocate(const char *name)
 	if (gpus == NULL && rte_gpu_init(RTE_GPU_DEFAULT_MAX) < 0)
 		return NULL;
 
+	/* initialize shared memory before adding first device */
+	if (gpu_shared_mem == NULL && gpu_shared_mem_init() < 0)
+		return NULL;
+
 	if (rte_gpu_get_by_name(name) != NULL) {
 		GPU_LOG(ERR, "device with name %s already exists", name);
 		rte_errno = EEXIST;
@@ -178,16 +212,20 @@ rte_gpu_allocate(const char *name)
 	dev = &gpus[dev_id];
 	memset(dev, 0, sizeof(*dev));
 
-	if (rte_strscpy(dev->name, name, RTE_DEV_NAME_MAX_LEN) < 0) {
+	dev->mpshared = &gpu_shared_mem->gpus[dev_id];
+	memset(dev->mpshared, 0, sizeof(*dev->mpshared));
+
+	if (rte_strscpy(dev->mpshared->name, name, RTE_DEV_NAME_MAX_LEN) < 0) {
 		GPU_LOG(ERR, "device name too long: %s", name);
 		rte_errno = ENAMETOOLONG;
 		return NULL;
 	}
-	dev->info.name = dev->name;
-	dev->info.dev_id = dev_id;
-	dev->info.numa_node = -1;
-	dev->info.parent = RTE_GPU_ID_NONE;
+	dev->mpshared->info.name = dev->mpshared->name;
+	dev->mpshared->info.dev_id = dev_id;
+	dev->mpshared->info.numa_node = -1;
+	dev->mpshared->info.parent = RTE_GPU_ID_NONE;
 	TAILQ_INIT(&dev->callbacks);
+	__atomic_fetch_add(&dev->mpshared->process_refcnt, 1, __ATOMIC_RELAXED);
 
 	gpu_count++;
 	GPU_LOG(DEBUG, "new device %s (id %d) of total %d",
@@ -195,6 +233,55 @@ rte_gpu_allocate(const char *name)
 	return dev;
 }
 
+struct rte_gpu *
+rte_gpu_attach(const char *name)
+{
+	int16_t dev_id;
+	struct rte_gpu *dev;
+	struct rte_gpu_mpshared *shared_dev;
+
+	if (rte_eal_process_type() != RTE_PROC_SECONDARY) {
+		GPU_LOG(ERR, "only secondary process can attach device");
+		rte_errno = EPERM;
+		return NULL;
+	}
+	if (name == NULL) {
+		GPU_LOG(ERR, "attach device without a name");
+		rte_errno = EINVAL;
+		return NULL;
+	}
+
+	/* implicit initialization of library before adding first device */
+	if (gpus == NULL && rte_gpu_init(RTE_GPU_DEFAULT_MAX) < 0)
+		return NULL;
+
+	/* initialize shared memory before adding first device */
+	if (gpu_shared_mem == NULL && gpu_shared_mem_init() < 0)
+		return NULL;
+
+	for (dev_id = 0; dev_id < gpu_max; dev_id++) {
+		shared_dev = &gpu_shared_mem->gpus[dev_id];
+		if (strncmp(name, shared_dev->name, RTE_DEV_NAME_MAX_LEN) == 0)
+			break;
+	}
+	if (dev_id >= gpu_max) {
+		GPU_LOG(ERR, "device with name %s not found", name);
+		rte_errno = ENOENT;
+		return NULL;
+	}
+	dev = &gpus[dev_id];
+	memset(dev, 0, sizeof(*dev));
+
+	TAILQ_INIT(&dev->callbacks);
+	dev->mpshared = shared_dev;
+	__atomic_fetch_add(&dev->mpshared->process_refcnt, 1, __ATOMIC_RELAXED);
+
+	gpu_count++;
+	GPU_LOG(DEBUG, "attached device %s (id %d) of total %d",
+			name, dev_id, gpu_count);
+	return dev;
+}
+
 int16_t
 rte_gpu_add_child(const char *name, int16_t parent, uint64_t child_context)
 {
@@ -210,11 +297,11 @@ rte_gpu_add_child(const char *name, int16_t parent, uint64_t child_context)
 	if (dev == NULL)
 		return -rte_errno;
 
-	dev->info.parent = parent;
-	dev->info.context = child_context;
+	dev->mpshared->info.parent = parent;
+	dev->mpshared->info.context = child_context;
 
 	rte_gpu_complete_new(dev);
-	return dev->info.dev_id;
+	return dev->mpshared->info.dev_id;
 }
 
 void
@@ -223,8 +310,7 @@ rte_gpu_complete_new(struct rte_gpu *dev)
 	if (dev == NULL)
 		return;
 
-	dev->state = RTE_GPU_STATE_INITIALIZED;
-	dev->state = RTE_GPU_STATE_INITIALIZED;
+	dev->process_state = RTE_GPU_STATE_INITIALIZED;
 	rte_gpu_notify(dev, RTE_GPU_EVENT_NEW);
 }
 
@@ -237,7 +323,7 @@ rte_gpu_release(struct rte_gpu *dev)
 		rte_errno = ENODEV;
 		return -rte_errno;
 	}
-	dev_id = dev->info.dev_id;
+	dev_id = dev->mpshared->info.dev_id;
 	RTE_GPU_FOREACH_CHILD(child, dev_id) {
 		GPU_LOG(ERR, "cannot release device %d with child %d",
 				dev_id, child);
@@ -246,11 +332,12 @@ rte_gpu_release(struct rte_gpu *dev)
 	}
 
 	GPU_LOG(DEBUG, "free device %s (id %d)",
-			dev->info.name, dev->info.dev_id);
+			dev->mpshared->info.name, dev->mpshared->info.dev_id);
 	rte_gpu_notify(dev, RTE_GPU_EVENT_DEL);
 
 	gpu_free_callbacks(dev);
-	dev->state = RTE_GPU_STATE_UNUSED;
+	dev->process_state = RTE_GPU_STATE_UNUSED;
+	__atomic_fetch_sub(&dev->mpshared->process_refcnt, 1, __ATOMIC_RELAXED);
 	gpu_count--;
 
 	return 0;
@@ -403,7 +490,7 @@ rte_gpu_notify(struct rte_gpu *dev, enum rte_gpu_event event)
 	int16_t dev_id;
 	struct rte_gpu_callback *callback;
 
-	dev_id = dev->info.dev_id;
+	dev_id = dev->mpshared->info.dev_id;
 	rte_rwlock_read_lock(&gpu_callback_lock);
 	TAILQ_FOREACH(callback, &dev->callbacks, next) {
 		if (callback->event != event || callback->function == NULL)
@@ -431,7 +518,7 @@ rte_gpu_info_get(int16_t dev_id, struct rte_gpu_info *info)
 	}
 
 	if (dev->ops.dev_info_get == NULL) {
-		*info = dev->info;
+		*info = dev->mpshared->info;
 		return 0;
 	}
 	return GPU_DRV_RET(dev->ops.dev_info_get(dev, info));
diff --git a/lib/gpudev/gpudev_driver.h b/lib/gpudev/gpudev_driver.h
index 4d0077161c..9459c7e30f 100644
--- a/lib/gpudev/gpudev_driver.h
+++ b/lib/gpudev/gpudev_driver.h
@@ -35,19 +35,28 @@ struct rte_gpu_ops {
 	rte_gpu_close_t *dev_close;
 };
 
-struct rte_gpu {
-	/* Backing device. */
-	struct rte_device *device;
+struct rte_gpu_mpshared {
 	/* Unique identifier name. */
 	char name[RTE_DEV_NAME_MAX_LEN]; /* Updated by this library. */
+	/* Driver-specific private data shared in multi-process. */
+	void *dev_private;
 	/* Device info structure. */
 	struct rte_gpu_info info;
+	/* Counter of processes using the device. */
+	uint16_t process_refcnt; /* Updated by this library. */
+};
+
+struct rte_gpu {
+	/* Backing device. */
+	struct rte_device *device;
+	/* Data shared between processes. */
+	struct rte_gpu_mpshared *mpshared;
 	/* Driver functions. */
 	struct rte_gpu_ops ops;
 	/* Event callback list. */
 	TAILQ_HEAD(rte_gpu_callback_list, rte_gpu_callback) callbacks;
 	/* Current state (used or not) in the running process. */
-	enum rte_gpu_state state; /* Updated by this library. */
+	enum rte_gpu_state process_state; /* Updated by this library. */
 	/* Driver-specific private data for the running process. */
 	void *process_private;
 } __rte_cache_aligned;
@@ -55,15 +64,19 @@ struct rte_gpu {
 __rte_internal
 struct rte_gpu *rte_gpu_get_by_name(const char *name);
 
-/* First step of initialization */
+/* First step of initialization in primary process. */
 __rte_internal
 struct rte_gpu *rte_gpu_allocate(const char *name);
 
+/* First step of initialization in secondary process. */
+__rte_internal
+struct rte_gpu *rte_gpu_attach(const char *name);
+
 /* Last step of initialization. */
 __rte_internal
 void rte_gpu_complete_new(struct rte_gpu *dev);
 
-/* Last step of removal. */
+/* Last step of removal (primary or secondary process). */
 __rte_internal
 int rte_gpu_release(struct rte_gpu *dev);
 
diff --git a/lib/gpudev/version.map b/lib/gpudev/version.map
index 4a934ed933..58dc632393 100644
--- a/lib/gpudev/version.map
+++ b/lib/gpudev/version.map
@@ -17,6 +17,7 @@ INTERNAL {
 	global:
 
 	rte_gpu_allocate;
+	rte_gpu_attach;
 	rte_gpu_complete_new;
 	rte_gpu_get_by_name;
 	rte_gpu_notify;
-- 
2.17.1
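
For readers following the series, the split between per-process and shared
device data can be illustrated with a small standalone model. This is only a
sketch and not part of the patch: it does not use DPDK, every model_* name is
hypothetical, calloc() stands in for rte_memzone_reserve(), and two local
tables in one program stand in for the primary and secondary processes. The
shared header also carries an nb_slots field, which the patch does not have,
so that the flexible array member is valid standard C.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_NAME_MAX 64
#define MODEL_GPU_MAX 4

/* Hypothetical stand-in for struct rte_gpu_mpshared (visible to all processes). */
struct model_mpshared {
	char name[MODEL_NAME_MAX];
	uint16_t process_refcnt;
};

/* Hypothetical stand-in for struct rte_gpu (private to one process). */
struct model_gpu {
	struct model_mpshared *mpshared;
	int in_use; /* plays the role of process_state */
};

/* Hypothetical stand-in for the memzone content. */
struct model_shared_mem {
	uint16_t nb_slots; /* assumption: not present in the patch */
	struct model_mpshared gpus[];
};

/* Reserve the shared area once; sized as header plus one slot per device. */
static struct model_shared_mem *
model_shared_mem_reserve(uint16_t nb_slots)
{
	struct model_shared_mem *mem;

	mem = calloc(1, sizeof(*mem) + sizeof(*mem->gpus) * nb_slots);
	if (mem != NULL)
		mem->nb_slots = nb_slots;
	return mem;
}

/* Primary side: claim a free slot, name it, take the first reference. */
static int
model_allocate(struct model_shared_mem *mem, struct model_gpu *table,
		const char *name)
{
	for (uint16_t id = 0; id < mem->nb_slots; id++) {
		if (table[id].in_use)
			continue;
		table[id].mpshared = &mem->gpus[id];
		memset(table[id].mpshared, 0, sizeof(*table[id].mpshared));
		snprintf(table[id].mpshared->name, MODEL_NAME_MAX, "%s", name);
		__atomic_fetch_add(&table[id].mpshared->process_refcnt, 1,
				__ATOMIC_RELAXED);
		table[id].in_use = 1;
		return id;
	}
	return -1;
}

/* Secondary side: look the name up in shared memory, then reference it. */
static int
model_attach(struct model_shared_mem *mem, struct model_gpu *table,
		const char *name)
{
	for (uint16_t id = 0; id < mem->nb_slots; id++) {
		if (strncmp(name, mem->gpus[id].name, MODEL_NAME_MAX) != 0)
			continue;
		table[id].mpshared = &mem->gpus[id];
		__atomic_fetch_add(&table[id].mpshared->process_refcnt, 1,
				__ATOMIC_RELAXED);
		table[id].in_use = 1;
		return id;
	}
	return -1;
}

/* Either side: mark the local entry unused and drop its reference.
 * The caller must pass an entry that is currently in use. */
static void
model_release(struct model_gpu *dev)
{
	dev->in_use = 0;
	__atomic_fetch_sub(&dev->mpshared->process_refcnt, 1, __ATOMIC_RELAXED);
}
```

In this model, calling model_allocate() on one table and model_attach() on
another with the same name leaves both local entries pointing at the same
shared slot with a reference count of 2, and model_release() on either entry
only clears that table's local state and decrements the count.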