Date: Tue, 22 Nov 2022 16:24:59 +0100
From: Olivier Matz
To: Fengnan Chang
Cc: david.marchand@redhat.com, mb@smartsharesystems.com, dev@dpdk.org
Subject: Re: [PATCH v2] mempool: fix rte_mempool_avail_count may segment fault when used in multiprocess
References: <20221115123502.12560-1-changfengnan@bytedance.com>
In-Reply-To: <20221115123502.12560-1-changfengnan@bytedance.com>
List-Id: DPDK patches and discussions

Hi,

On Tue, Nov 15, 2022 at 08:35:02PM +0800, Fengnan Chang wrote:
> rte_mempool_create() puts the tailq entry into the rte_mempool_tailq
> list before populate, and pool_data is only set at populate time. So in
> multi-process mode, if process A creates a mempool, process B can get
> that mempool through rte_mempool_lookup() before pool_data is set; if B
> then calls rte_mempool_avail_count(), it causes a segmentation fault.
>
> Fix this by putting the tailq entry into rte_mempool_tailq after
> populate.
>
> Signed-off-by: Fengnan Chang
> ---
>  lib/mempool/rte_mempool.c | 43 ++++++++++++++++++++++-----------------
>  1 file changed, 24 insertions(+), 19 deletions(-)
>
> diff --git a/lib/mempool/rte_mempool.c b/lib/mempool/rte_mempool.c
> index 4c78071a34..b3a6572fc8 100644
> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c
> @@ -155,6 +155,27 @@ get_min_page_size(int socket_id)
>  	return wa.min == SIZE_MAX ? (size_t) rte_mem_page_size() : wa.min;
>  }
>
> +static int
> +add_mempool_to_list(struct rte_mempool *mp)
> +{
> +	struct rte_mempool_list *mempool_list;
> +	struct rte_tailq_entry *te = NULL;
> +
> +	/* try to allocate tailq entry */
> +	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
> +	if (te == NULL) {
> +		RTE_LOG(ERR, MEMPOOL, "Cannot allocate tailq entry!\n");
> +		return -ENOMEM;
> +	}
> +
> +	te->data = mp;
> +	mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
> +	rte_mcfg_tailq_write_lock();
> +	TAILQ_INSERT_TAIL(mempool_list, te, next);
> +	rte_mcfg_tailq_write_unlock();
> +
> +	return 0;
> +}
>
>  static void
>  mempool_add_elem(struct rte_mempool *mp, __rte_unused void *opaque,
> @@ -304,6 +325,9 @@ mempool_ops_alloc_once(struct rte_mempool *mp)
>  		if (ret != 0)
>  			return ret;
>  		mp->flags |= RTE_MEMPOOL_F_POOL_CREATED;
> +		ret = add_mempool_to_list(mp);
> +		if (ret != 0)
> +			return ret;

One issue here is that if the rte_zmalloc("MEMPOOL_TAILQ_ENTRY") fails,
the function will fail, but rte_mempool_ops_alloc() may already have
succeeded. I agree it's theoretical, because an allocation failure would
cause more issues anyway.
But, to be rigorous, I think we should do something like this instead
(not tested, just for the idea):

static int
mempool_ops_alloc_once(struct rte_mempool *mp)
{
	struct rte_mempool_list *mempool_list;
	struct rte_tailq_entry *te = NULL;
	int ret;

	/* only create the driver ops and add in tailq if not already done */
	if ((mp->flags & RTE_MEMPOOL_F_POOL_CREATED))
		return 0;

	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
	if (te == NULL) {
		RTE_LOG(ERR, MEMPOOL, "Cannot allocate tailq entry!\n");
		ret = -rte_errno;
		goto fail;
	}

	te->data = mp;
	mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);

	ret = rte_mempool_ops_alloc(mp);
	if (ret != 0)
		goto fail;

	mp->flags |= RTE_MEMPOOL_F_POOL_CREATED;
	rte_mcfg_tailq_write_lock();
	TAILQ_INSERT_TAIL(mempool_list, te, next);
	rte_mcfg_tailq_write_unlock();

	return 0;

fail:
	rte_free(te);
	return ret;
}

Thinking a bit more about the problem itself: the segfault that you
describe could also happen in a primary, without multi-process:
- create an empty mempool
- call rte_mempool_avail_count() before it is populated

This simply means that an empty mempool is not ready for use until
rte_mempool_set_ops_byname() or rte_mempool_populate*() is called.

This is something that we should document above the declaration of
rte_mempool_create_empty(). We could also say there that the mempool
becomes visible to secondary processes as soon as the driver ops are
set.

However, I still believe that a better synchronization point is required
in the application. After all, the presence in the TAILQ does not give
any hint about the status of the object.

Can we imagine a case where a mempool is created empty in a primary, and
populated in a secondary? If such a use-case exists, we may not want to
take this patch.
>  	}
>  	return 0;
>  }
> @@ -798,9 +822,7 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
>  	int socket_id, unsigned flags)
>  {
>  	char mz_name[RTE_MEMZONE_NAMESIZE];
> -	struct rte_mempool_list *mempool_list;
>  	struct rte_mempool *mp = NULL;
> -	struct rte_tailq_entry *te = NULL;
>  	const struct rte_memzone *mz = NULL;
>  	size_t mempool_size;
>  	unsigned int mz_flags = RTE_MEMZONE_1GB|RTE_MEMZONE_SIZE_HINT_ONLY;
> @@ -820,8 +842,6 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
>  		RTE_CACHE_LINE_MASK) != 0);
>  #endif
>
> -	mempool_list = RTE_TAILQ_CAST(rte_mempool_tailq.head, rte_mempool_list);
> -
>  	/* asked for zero items */
>  	if (n == 0) {
>  		rte_errno = EINVAL;
> @@ -866,14 +886,6 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
>  	private_data_size = (private_data_size +
>  		RTE_MEMPOOL_ALIGN_MASK) & (~RTE_MEMPOOL_ALIGN_MASK);
>
> -
> -	/* try to allocate tailq entry */
> -	te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
> -	if (te == NULL) {
> -		RTE_LOG(ERR, MEMPOOL, "Cannot allocate tailq entry!\n");
> -		goto exit_unlock;
> -	}
> -
>  	mempool_size = RTE_MEMPOOL_HEADER_SIZE(mp, cache_size);
>  	mempool_size += private_data_size;
>  	mempool_size = RTE_ALIGN_CEIL(mempool_size, RTE_MEMPOOL_ALIGN);
> @@ -923,20 +935,13 @@ rte_mempool_create_empty(const char *name, unsigned n, unsigned elt_size,
>  			cache_size);
>  	}
>
> -	te->data = mp;
> -
> -	rte_mcfg_tailq_write_lock();
> -	TAILQ_INSERT_TAIL(mempool_list, te, next);
> -	rte_mcfg_tailq_write_unlock();
>  	rte_mcfg_mempool_write_unlock();
> -
>  	rte_mempool_trace_create_empty(name, n, elt_size, cache_size,
>  		private_data_size, flags, mp);
>  	return mp;
>
>  exit_unlock:
>  	rte_mcfg_mempool_write_unlock();
> -	rte_free(te);
>  	rte_mempool_free(mp);
>  	return NULL;
>  }
> --
> 2.37.0 (Apple Git-136)
>