From: Stephen Hemminger
To: anatoly.burakov@intel.com
Cc: dev@dpdk.org, stable@dpdk.org, Stephen Hemminger
Subject: [PATCH] eal: fix data race in multi-process support
Date: Fri, 17 Dec 2021 10:29:22 -0800
Message-Id: <20211217182922.159503-1-stephen@networkplumber.org>
In-Reply-To: <20211217181649.154972-1-stephen@networkplumber.org>
References: <20211217181649.154972-1-stephen@networkplumber.org>

If DPDK is built with ThreadSanitizer, it reports a data race on the
multi-process file descriptor, mp_fd. The fix is to use atomic
operations when reading and updating mp_fd.

Simple example:

    $ dpdk-testpmd -l 1-3 --no-huge
    ...
    EAL: Error - exiting with code: 1
      Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
    ==================
    WARNING: ThreadSanitizer: data race (pid=83054)
      Write of size 4 at 0x55e3b7fce450 by main thread:
        #0 rte_mp_channel_cleanup (dpdk-testpmd+0x160d79c)
        #1 rte_eal_cleanup (dpdk-testpmd+0x1614fb5)
        #2 rte_exit (dpdk-testpmd+0x15ec97a)
        #3 mbuf_pool_create.cold (dpdk-testpmd+0x242e1a)
        #4 main (dpdk-testpmd+0x5ab05d)

      Previous read of size 4 at 0x55e3b7fce450 by thread T2:
        #0 mp_handle (dpdk-testpmd+0x160c979)
        #1 ctrl_thread_init (dpdk-testpmd+0x15ff76e)

      As if synchronized via sleep:
        #0 nanosleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:362 (libtsan.so.0+0x5cd8e)
        #1 get_tsc_freq (dpdk-testpmd+0x1622889)
        #2 set_tsc_freq (dpdk-testpmd+0x15ffb9c)
        #3 rte_eal_timer_init (dpdk-testpmd+0x1622a34)
        #4 rte_eal_init.cold (dpdk-testpmd+0x26b314)
        #5 main (dpdk-testpmd+0x5aab45)

      Location is global 'mp_fd' of size 4 at 0x55e3b7fce450 (dpdk-testpmd+0x0000027c7450)

      Thread T2 'rte_mp_handle' (tid=83057, running) created by main thread at:
        #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:962 (libtsan.so.0+0x58ba2)
        #1 rte_ctrl_thread_create (dpdk-testpmd+0x15ff870)
        #2 rte_mp_channel_init.cold (dpdk-testpmd+0x269986)
        #3 rte_eal_init (dpdk-testpmd+0x1615b28)
        #4 main (dpdk-testpmd+0x5aab45)

    SUMMARY: ThreadSanitizer: data race (/home/shemminger/DPDK/main/build/app/dpdk-testpmd+0x160d79c) in rte_mp_channel_cleanup
    ==================
    ThreadSanitizer: reported 1 warnings

Fixes: bacaa2754017 ("eal: add channel for multi-process communication")

Signed-off-by: Stephen Hemminger
---
v2 - fix the mp socket bind

 lib/eal/common/eal_common_proc.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index ebd0f6673b8b..72c7e8f536af 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -262,7 +262,7 @@ rte_mp_action_unregister(const char *name)
 }
 
 static int
-read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
+read_msg(int fd, struct mp_msg_internal *m, struct sockaddr_un *s)
 {
 	int msglen;
 	struct iovec iov;
@@ -282,7 +282,7 @@ read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
 	msgh.msg_control = control;
 	msgh.msg_controllen = sizeof(control);
 
-	msglen = recvmsg(mp_fd, &msgh, 0);
+	msglen = recvmsg(fd, &msgh, 0);
 	if (msglen < 0) {
 		RTE_LOG(ERR, EAL, "recvmsg failed, %s\n", strerror(errno));
 		return -1;
@@ -383,9 +383,10 @@ mp_handle(void *arg __rte_unused)
 {
 	struct mp_msg_internal msg;
 	struct sockaddr_un sa;
+	int fd;
 
-	while (mp_fd >= 0) {
-		if (read_msg(&msg, &sa) == 0)
+	while ((fd = __atomic_load_n(&mp_fd, __ATOMIC_RELAXED)) >= 0) {
+		if (read_msg(fd, &msg, &sa) == 0)
 			process_msg(&msg, &sa);
 	}
 
@@ -626,9 +627,8 @@ rte_mp_channel_init(void)
 			NULL, mp_handle, NULL) < 0) {
 		RTE_LOG(ERR, EAL, "failed to create mp thread: %s\n",
 			strerror(errno));
-		close(mp_fd);
 		close(dir_fd);
-		mp_fd = -1;
+		close(__atomic_exchange_n(&mp_fd, -1, __ATOMIC_RELAXED));
 		return -1;
 	}
 
@@ -644,11 +644,10 @@ rte_mp_channel_cleanup(void)
 {
 	int fd;
 
-	if (mp_fd < 0)
+	fd = __atomic_exchange_n(&mp_fd, -1, __ATOMIC_RELAXED);
+	if (fd < 0)
 		return;
 
-	fd = mp_fd;
-	mp_fd = -1;
 	pthread_cancel(mp_handle_tid);
 	pthread_join(mp_handle_tid, NULL);
 	close_socket_fd(fd);
-- 
2.30.2
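
For readers who want to try the pattern outside of DPDK, here is a minimal
standalone sketch of the same idea: the reader thread takes one atomic
snapshot of the descriptor per iteration and uses only that snapshot, and
the owner retires the descriptor with a single atomic exchange. This is not
the EAL code. The names chan_fd, wake_fd, reader_loop, channel_open and
channel_close are hypothetical, a pipe stands in for the mp socket, and the
reader is woken by EOF on the pipe rather than by pthread_cancel(). It
assumes GCC or Clang for the __atomic builtins.

```c
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for the global mp_fd; -1 means "closed". */
static int chan_fd = -1;
/* Write end of the pipe; touched only by the owning thread. */
static int wake_fd = -1;
static pthread_t reader_tid;

/* Reader: load the descriptor once per iteration, then use only the copy. */
static void *
reader_loop(void *arg)
{
	char buf[64];
	int fd;

	(void)arg;
	while ((fd = __atomic_load_n(&chan_fd, __ATOMIC_RELAXED)) >= 0) {
		/* EOF or error ends the loop; received data is discarded. */
		if (read(fd, buf, sizeof(buf)) <= 0)
			break;
	}
	return NULL;
}

/* Create the pipe, publish its read end, and start the reader thread. */
static int
channel_open(void)
{
	int p[2];

	if (pipe(p) < 0)
		return -1;
	wake_fd = p[1];
	__atomic_store_n(&chan_fd, p[0], __ATOMIC_RELAXED);

	if (pthread_create(&reader_tid, NULL, reader_loop, NULL) != 0) {
		/* Retire and close the descriptor in one step. */
		close(__atomic_exchange_n(&chan_fd, -1, __ATOMIC_RELAXED));
		close(wake_fd);
		return -1;
	}
	return 0;
}

/* Returns 0 on the first close, -1 if the channel was already closed. */
static int
channel_close(void)
{
	int fd = __atomic_exchange_n(&chan_fd, -1, __ATOMIC_RELAXED);

	if (fd < 0)
		return -1;

	close(wake_fd);			/* EOF unblocks a pending read() */
	pthread_join(reader_tid, NULL);
	close(fd);			/* safe: the reader has exited */
	return 0;
}
```

A caller would pair channel_open() with channel_close(). Because the
exchange returns the previous value, a second channel_close() sees -1 and
returns without touching the descriptor or the thread again, which mirrors
the early return in rte_mp_channel_cleanup() above.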