From: Stephen Hemminger
To: dev@dpdk.org
Cc: Stephen Hemminger, Anatoly Burakov
Subject: [PATCH v2] eal: fix data race in multi-process support
Date: Tue, 6 Sep 2022 09:45:22 -0700
Message-Id: <20220906164522.91776-1-stephen@networkplumber.org>
In-Reply-To: <20211217182922.159503-1-stephen@networkplumber.org>
References: <20211217182922.159503-1-stephen@networkplumber.org>

If DPDK is built with the thread sanitizer, it reports a data race on
the multi-process file descriptor: rte_mp_channel_cleanup() writes
mp_fd from the main thread while the mp_handle control thread reads it.
The fix is to use atomic operations when reading and updating mp_fd.

Build:
$ meson -Db_sanitize=thread build
$ ninja -C build

Simple example:
$ ./build/app/dpdk-testpmd -l 1-3 --no-huge
EAL: Detected CPU lcores: 16
EAL: Detected NUMA nodes: 1
EAL: Static memory layout is selected, amount of reserved memory can be adjusted with -m or --socket-mem
EAL: Detected static linkage of DPDK
EAL: Multi-process socket /run/user/1000/dpdk/rte/mp_socket
EAL: Selected IOVA mode 'VA'
testpmd: No probed ethernet devices
testpmd: create a new mbuf pool : n=163456, size=2176, socket=0
testpmd: preferred mempool ops selected: ring_mp_mc
EAL: Error - exiting with code: 1
  Cause: Creation of mbuf pool for socket 0 failed: Cannot allocate memory
==================
WARNING: ThreadSanitizer: data race (pid=87245)
  Write of size 4 at 0x558e04d8ff70 by main thread:
    #0 rte_mp_channel_cleanup (dpdk-testpmd+0x1e7d30c)
    #1 rte_eal_cleanup (dpdk-testpmd+0x1e85929)
    #2 rte_exit (dpdk-testpmd+0x1e5bc0a)
    #3 mbuf_pool_create.cold (dpdk-testpmd+0x274011)
    #4 main (dpdk-testpmd+0x5cc15d)

  Previous read of size 4 at 0x558e04d8ff70 by thread T2:
    #0 mp_handle (dpdk-testpmd+0x1e7c439)
    #1 ctrl_thread_init (dpdk-testpmd+0x1e6ee1e)

  As if synchronized via sleep:
    #0 nanosleep ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:366 (libtsan.so.0+0x6075e)
    #1 get_tsc_freq (dpdk-testpmd+0x1e92ff9)
    #2 set_tsc_freq (dpdk-testpmd+0x1e6f2fc)
    #3 rte_eal_timer_init (dpdk-testpmd+0x1e931a4)
    #4 rte_eal_init.cold (dpdk-testpmd+0x29e578)
    #5 main (dpdk-testpmd+0x5cbc45)

  Location is global 'mp_fd' of size 4 at 0x558e04d8ff70 (dpdk-testpmd+0x000003122f70)

  Thread T2 'rte_mp_handle' (tid=87248, running) created by main thread at:
    #0 pthread_create ../../../../src/libsanitizer/tsan/tsan_interceptors_posix.cpp:969 (libtsan.so.0+0x5ad75)
    #1 rte_ctrl_thread_create (dpdk-testpmd+0x1e6efd0)
    #2 rte_mp_channel_init.cold (dpdk-testpmd+0x29cb7c)
    #3 rte_eal_init (dpdk-testpmd+0x1e8662e)
    #4 main (dpdk-testpmd+0x5cbc45)

SUMMARY: ThreadSanitizer: data race (/home/shemminger/DPDK/main/build/app/dpdk-testpmd+0x1e7d30c) in rte_mp_channel_cleanup
==================
ThreadSanitizer: reported 1 warnings

Fixes: bacaa2754017 ("eal: add channel for multi-process communication")
Signed-off-by: Stephen Hemminger
Acked-by: Anatoly Burakov
---
 lib/eal/common/eal_common_proc.c | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)

diff --git a/lib/eal/common/eal_common_proc.c b/lib/eal/common/eal_common_proc.c
index 313060528fec..1fc1d6c53bd2 100644
--- a/lib/eal/common/eal_common_proc.c
+++ b/lib/eal/common/eal_common_proc.c
@@ -260,7 +260,7 @@ rte_mp_action_unregister(const char *name)
 }
 
 static int
-read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
+read_msg(int fd, struct mp_msg_internal *m, struct sockaddr_un *s)
 {
 	int msglen;
 	struct iovec iov;
@@ -281,7 +281,7 @@ read_msg(struct mp_msg_internal *m, struct sockaddr_un *s)
 	msgh.msg_controllen = sizeof(control);
 
 retry:
-	msglen = recvmsg(mp_fd, &msgh, 0);
+	msglen = recvmsg(fd, &msgh, 0);
 
 	/* zero length message means socket was closed */
 	if (msglen == 0)
@@ -390,11 +390,12 @@ mp_handle(void *arg __rte_unused)
 {
 	struct mp_msg_internal msg;
 	struct sockaddr_un sa;
+	int fd;
 
-	while (mp_fd >= 0) {
+	while ((fd = __atomic_load_n(&mp_fd, __ATOMIC_RELAXED)) >= 0) {
 		int ret;
 
-		ret = read_msg(&msg, &sa);
+		ret = read_msg(fd, &msg, &sa);
 		if (ret <= 0)
 			break;
 
@@ -638,9 +639,8 @@ rte_mp_channel_init(void)
 			NULL, mp_handle, NULL) < 0) {
 		RTE_LOG(ERR, EAL, "failed to create mp thread: %s\n",
 			strerror(errno));
-		close(mp_fd);
 		close(dir_fd);
-		mp_fd = -1;
+		close(__atomic_exchange_n(&mp_fd, -1, __ATOMIC_RELAXED));
 		return -1;
 	}
 
@@ -656,11 +656,10 @@ rte_mp_channel_cleanup(void)
 {
 	int fd;
 
-	if (mp_fd < 0)
+	fd = __atomic_exchange_n(&mp_fd, -1, __ATOMIC_RELAXED);
+	if (fd < 0)
 		return;
 
-	fd = mp_fd;
-	mp_fd = -1;
 	pthread_cancel(mp_handle_tid);
 	pthread_join(mp_handle_tid, NULL);
 	close_socket_fd(fd);
-- 
2.35.1
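
For readers who want to try the pattern outside of DPDK, here is a minimal
standalone sketch of the same idea. It is not the EAL code: demo_fd,
demo_handle(), demo_cleanup() and demo_run() are hypothetical names, a pipe
stands in for the multi-process socket, and the reader polls instead of
blocking in recvmsg(). Only the GCC/Clang __atomic builtins already used in
the patch are assumed. The reader takes one atomic snapshot of the
descriptor per iteration, and the teardown path uses an atomic exchange so
that only one caller ever obtains the descriptor to close.

```c
#include <assert.h>
#include <pthread.h>
#include <unistd.h>

/* Hypothetical stand-in for a file-scope descriptor; -1 means "closed". */
static int demo_fd = -1;

/* Reader side: load the descriptor atomically once per iteration and
 * work only with the local copy, never with the global directly. */
static void *
demo_handle(void *arg)
{
	int fd;

	(void)arg;
	while ((fd = __atomic_load_n(&demo_fd, __ATOMIC_RELAXED)) >= 0)
		usleep(1000);	/* placeholder for real work on fd */
	return NULL;
}

/* Teardown side: the exchange returns the old value and stores -1 in one
 * step, so a second caller sees -1 and returns without closing again. */
static int
demo_cleanup(pthread_t tid)
{
	int fd = __atomic_exchange_n(&demo_fd, -1, __ATOMIC_RELAXED);

	if (fd < 0)
		return -1;	/* nothing to clean up */

	pthread_join(tid, NULL);	/* reader exits once it loads -1 */
	return close(fd);
}

/* Returns 0 if setup, teardown, and a repeated teardown behave as expected. */
int
demo_run(void)
{
	int pipefd[2];
	pthread_t tid;
	int ret;

	if (pipe(pipefd) < 0)
		return -1;
	__atomic_store_n(&demo_fd, pipefd[0], __ATOMIC_RELAXED);

	if (pthread_create(&tid, NULL, demo_handle, NULL) != 0) {
		/* Same shape as the error path in the patch. */
		close(__atomic_exchange_n(&demo_fd, -1, __ATOMIC_RELAXED));
		close(pipefd[1]);
		return -1;
	}

	ret = demo_cleanup(tid);
	close(pipefd[1]);
	assert(__atomic_load_n(&demo_fd, __ATOMIC_RELAXED) == -1);

	/* A repeated cleanup must be a harmless no-op. */
	if (ret != 0 || demo_cleanup(tid) != -1)
		return -1;
	return 0;
}
```

Calling demo_run() from a small main() and building with -fsanitize=thread
-pthread is one way to check that the sanitizer stays quiet. Relaxed ordering
is used here, as in the patch, because only the descriptor value itself is
shared between the two threads.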