Date: Fri, 15 Oct 2021 14:12:06 +0200
From: Olivier Matz
To: Dmitry Kozlyuk
Cc: dev@dpdk.org, Andrew Rybchenko, Matan Azrad, Ray Kinsella, Anatoly Burakov
Subject: Re: [dpdk-dev] [PATCH v4 1/4] mempool: add event callbacks
In-Reply-To: <20211013110131.2909604-2-dkozlyuk@nvidia.com>
References: <20211012000409.2751908-1-dkozlyuk@nvidia.com>
 <20211013110131.2909604-1-dkozlyuk@nvidia.com>
 <20211013110131.2909604-2-dkozlyuk@nvidia.com>

Hi Dmitry,

On Wed, Oct 13, 2021 at 02:01:28PM +0300, Dmitry Kozlyuk wrote:
> Data path performance can benefit if the PMD knows which memory it will
> need to handle in advance, before the first mbuf is sent to the PMD.
> It is impractical, however, to consider all allocated memory for this
> purpose. Most often mbuf memory comes from mempools that can come and
> go.
> PMD can enumerate existing mempools on device start, but it also
> needs to track creation and destruction of mempools after the forwarding
> starts but before an mbuf from the new mempool is sent to the device.
>
> Add an API to register callback for mempool life cycle events:
> * rte_mempool_event_callback_register()
> * rte_mempool_event_callback_unregister()
> Currently tracked events are:
> * RTE_MEMPOOL_EVENT_READY (after populating a mempool)
> * RTE_MEMPOOL_EVENT_DESTROY (before freeing a mempool)
> Provide a unit test for the new API.
> The new API is internal, because it is primarily demanded by PMDs that
> may need to deal with any mempools and do not control their creation,
> while an application, on the other hand, knows which mempools it creates
> and doesn't care about internal mempools PMDs might create.
>
> Signed-off-by: Dmitry Kozlyuk
> Acked-by: Matan Azrad
> ---
>  app/test/test_mempool.c   | 209 ++++++++++++++++++++++++++++++++++++++
>  lib/mempool/rte_mempool.c | 137 +++++++++++++++++++++++++
>  lib/mempool/rte_mempool.h |  61 +++++++++++
>  lib/mempool/version.map   |   8 ++
>  4 files changed, 415 insertions(+)

(...)

> --- a/lib/mempool/rte_mempool.c
> +++ b/lib/mempool/rte_mempool.c
> @@ -42,6 +42,18 @@ static struct rte_tailq_elem rte_mempool_tailq = {
>  };
>  EAL_REGISTER_TAILQ(rte_mempool_tailq)
>
> +TAILQ_HEAD(mempool_callback_list, rte_tailq_entry);
> +
> +static struct rte_tailq_elem callback_tailq = {
> +        .name = "RTE_MEMPOOL_CALLBACK",
> +};
> +EAL_REGISTER_TAILQ(callback_tailq)
> +
> +/* Invoke all registered mempool event callbacks. */
> +static void
> +mempool_event_callback_invoke(enum rte_mempool_event event,
> +                              struct rte_mempool *mp);
> +
>  #define CACHE_FLUSHTHRESH_MULTIPLIER 1.5
>  #define CALC_CACHE_FLUSHTHRESH(c)        \
>          ((typeof(c))((c) * CACHE_FLUSHTHRESH_MULTIPLIER))
> @@ -360,6 +372,10 @@ rte_mempool_populate_iova(struct rte_mempool *mp, char *vaddr,
>          STAILQ_INSERT_TAIL(&mp->mem_list, memhdr, next);
>          mp->nb_mem_chunks++;
>
> +        /* Report the mempool as ready only when fully populated. */
> +        if (mp->populated_size >= mp->size)
> +                mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_READY, mp);
> +

One small comment here. I think it does not happen today, but in the
future, something that could happen is:

  - create empty mempool
  - populate mempool
  - use mempool
  - populate mempool with more objects
  - use mempool

I've seen one usage there: https://www.youtube.com/watch?v=SzQFn9tm4Sw

In that case, it would require a POPULATE event instead of a
MEMPOOL_CREATE event.

Enhancing the documentation to better explain when the callback is
invoked looks enough to me for the moment.
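For what it's worth, here is a minimal sketch of how I understand a PMD
would consume this API (the dev_priv, driver_dma_map and driver_dma_unmap
names are hypothetical placeholders, not part of this patch):

  #include <rte_mempool.h>

  struct dev_priv;                                                /* hypothetical driver state */
  void driver_dma_map(struct dev_priv *, struct rte_mempool *);   /* hypothetical helper */
  void driver_dma_unmap(struct dev_priv *, struct rte_mempool *); /* hypothetical helper */

  static void
  driver_mempool_event_cb(enum rte_mempool_event event,
                          struct rte_mempool *mp, void *user_data)
  {
          struct dev_priv *priv = user_data;

          switch (event) {
          case RTE_MEMPOOL_EVENT_READY:
                  /* fires once the mempool is fully populated */
                  driver_dma_map(priv, mp);
                  break;
          case RTE_MEMPOOL_EVENT_DESTROY:
                  /* fires from rte_mempool_free(), before the memory is released */
                  driver_dma_unmap(priv, mp);
                  break;
          default:
                  break;
          }
  }

  static int
  driver_start(struct dev_priv *priv)
  {
          /* mempools that already exist are walked separately (rte_mempool_walk);
           * the callback covers those created after this point */
          return rte_mempool_event_callback_register(driver_mempool_event_cb,
                                                     priv);
  }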
>          rte_mempool_trace_populate_iova(mp, vaddr, iova, len, free_cb, opaque);
>          return i;
>
> @@ -722,6 +738,7 @@ rte_mempool_free(struct rte_mempool *mp)
>          }
>          rte_mcfg_tailq_write_unlock();
>
> +        mempool_event_callback_invoke(RTE_MEMPOOL_EVENT_DESTROY, mp);
>          rte_mempool_trace_free(mp);
>          rte_mempool_free_memchunks(mp);
>          rte_mempool_ops_free(mp);
> @@ -1343,3 +1360,123 @@ void rte_mempool_walk(void (*func)(struct rte_mempool *, void *),
>
>          rte_mcfg_mempool_read_unlock();
>  }
> +
> +struct mempool_callback {
> +        rte_mempool_event_callback *func;
> +        void *user_data;
> +};
> +
> +static void
> +mempool_event_callback_invoke(enum rte_mempool_event event,
> +                              struct rte_mempool *mp)
> +{
> +        struct mempool_callback_list *list;
> +        struct rte_tailq_entry *te;
> +        void *tmp_te;
> +
> +        rte_mcfg_tailq_read_lock();
> +        list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> +        RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> +                struct mempool_callback *cb = te->data;
> +                rte_mcfg_tailq_read_unlock();
> +                cb->func(event, mp, cb->user_data);
> +                rte_mcfg_tailq_read_lock();

I think it is dangerous to unlock the list before invoking the callback.
During that time, another thread can remove the next mempool callback, and
the next iteration will access a freed element, causing undefined behavior.

Is it a problem to keep the lock held during the callback invocation?

I see that you have a test for this, and that you wrote a comment in the
documentation:

 * rte_mempool_event_callback_register() may be called from within the callback,
 * but the callbacks registered this way will not be invoked for the same event.
 * rte_mempool_event_callback_unregister() may only be safely called
 * to remove the running callback.

But is there a use-case for this? If not, I'd tend to say that we can
document that it is not allowed to create, free or list mempools or
register a callback from within a callback.
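To illustrate what I mean by keeping the lock held, a quick sketch
(untested) of the invocation loop; with this model a callback must not
register/unregister callbacks nor create/free mempools:

  static void
  mempool_event_callback_invoke(enum rte_mempool_event event,
                                struct rte_mempool *mp)
  {
          struct mempool_callback_list *list;
          struct rte_tailq_entry *te;

          rte_mcfg_tailq_read_lock();
          list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
          TAILQ_FOREACH(te, list, next) {
                  struct mempool_callback *cb = te->data;

                  /* the list cannot change while we walk it: unregister
                   * takes the write lock, so a plain FOREACH is enough */
                  cb->func(event, mp, cb->user_data);
          }
          rte_mcfg_tailq_read_unlock();
  }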
> +        }
> +        rte_mcfg_tailq_read_unlock();
> +}
> +
> +int
> +rte_mempool_event_callback_register(rte_mempool_event_callback *func,
> +                                    void *user_data)
> +{
> +        struct mempool_callback_list *list;
> +        struct rte_tailq_entry *te = NULL;
> +        struct mempool_callback *cb;
> +        void *tmp_te;
> +        int ret;
> +
> +        if (func == NULL) {
> +                rte_errno = EINVAL;
> +                return -rte_errno;
> +        }
> +
> +        rte_mcfg_mempool_read_lock();
> +        rte_mcfg_tailq_write_lock();
> +
> +        list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
> +        RTE_TAILQ_FOREACH_SAFE(te, list, next, tmp_te) {
> +                struct mempool_callback *cb =
> +                                (struct mempool_callback *)te->data;
> +                if (cb->func == func && cb->user_data == user_data) {
> +                        ret = -EEXIST;
> +                        goto exit;
> +                }
> +        }
> +
> +        te = rte_zmalloc("MEMPOOL_TAILQ_ENTRY", sizeof(*te), 0);
> +        if (te == NULL) {
> +                RTE_LOG(ERR, MEMPOOL,
> +                        "Cannot allocate event callback tailq entry!\n");
> +                ret = -ENOMEM;
> +                goto exit;
> +        }
> +
> +        cb = rte_malloc("MEMPOOL_EVENT_CALLBACK", sizeof(*cb), 0);
> +        if (cb == NULL) {
> +                RTE_LOG(ERR, MEMPOOL,
> +                        "Cannot allocate event callback!\n");
> +                rte_free(te);
> +                ret = -ENOMEM;
> +                goto exit;
> +        }
> +
> +        cb->func = func;
> +        cb->user_data = user_data;
> +        te->data = cb;
> +        TAILQ_INSERT_TAIL(list, te, next);
> +        ret = 0;
> +
> +exit:
> +        rte_mcfg_tailq_write_unlock();
> +        rte_mcfg_mempool_read_unlock();
> +        rte_errno = -ret;
> +        return ret;
> +}
> +
> +int
> +rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
> +                                      void *user_data)
> +{
> +        struct mempool_callback_list *list;
> +        struct rte_tailq_entry *te = NULL;
> +        struct mempool_callback *cb;
> +        int ret;
> +
> +        if (rte_eal_process_type() != RTE_PROC_PRIMARY) {
> +                rte_errno = EPERM;
> +                return -1;
> +        }

The description of the register function says:

 * Callbacks will be invoked in the process that creates the mempool.

So registration is allowed from a primary or a secondary process. Can't a
secondary process destroy the callback it has registered?

> +
> +        rte_mcfg_mempool_read_lock();
> +        rte_mcfg_tailq_write_lock();

I don't understand why there are 2 locks here.

After looking at the code, I think the locking model is already incorrect
in the current mempool code:

- rte_mcfg_tailq_write_lock() is used in create and free to protect the
  access to the mempool tailq

- rte_mcfg_mempool_write_lock() is used in create(), to protect from
  concurrent creation (with the same name for instance), but I doubt it is
  absolutely needed, because memzone_reserve is already protected.

- rte_mcfg_mempool_read_lock() is used in dump functions, but to me it
  should use rte_mcfg_tailq_read_lock() instead. Currently, doing a dump
  and a free concurrently can cause a crash because they are not using the
  same lock.

In your case, I suggest using only one lock to protect the callback list.
I think it can be rte_mcfg_tailq_*_lock().
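And to make the single-lock suggestion concrete, a sketch (untested) of
unregister with the callback list protected by the tailq lock only; the
EPERM check is deliberately left out, given the question above about
secondary processes:

  int
  rte_mempool_event_callback_unregister(rte_mempool_event_callback *func,
                                        void *user_data)
  {
          struct mempool_callback_list *list;
          struct rte_tailq_entry *te;
          struct mempool_callback *cb;
          int ret = -ENOENT;

          rte_mcfg_tailq_write_lock();
          list = RTE_TAILQ_CAST(callback_tailq.head, mempool_callback_list);
          TAILQ_FOREACH(te, list, next) {
                  cb = te->data;
                  if (cb->func == func && cb->user_data == user_data) {
                          /* nobody can be walking the list here, since the
                           * invocation loop would hold the read lock */
                          TAILQ_REMOVE(list, te, next);
                          break;
                  }
          }
          if (te != NULL) {
                  rte_free(te->data);
                  rte_free(te);
                  ret = 0;
          }
          rte_mcfg_tailq_write_unlock();

          rte_errno = -ret;
          return ret;
  }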