From: "Wiles, Keith"
To: andrei
CC: users@dpdk.org
Subject: Re: [dpdk-users] dpdk concurrency
Date: Mon, 26 Feb 2018 15:08:11 +0000

> On Feb 26, 2018, at 8:36 AM, andrei wrote:
>
> Hi,
>
> I have run into a deadlock using DPDK and I am out of ideas on how to
> debug the issue.
>
> Scenario (the application is more complicated; this is a simplified
> version):
>
> 4 mempools, 4 tx buffers, 3 threads (CPU: 4 cores; irrelevant).
>
> One thread extracts buffers (rte_mbuf) from a random memory pool
> (rte_mempool) and places them into a ring buffer.
>
> Another thread (Sender) extracts the buffers from the ring, populates
> them with data and places them into an rte_eth_dev_tx_buffer by
> calling rte_eth_tx_buffer().
>
> One thread (Flusher) is used as a flusher. It goes through the
> rte_eth_dev_tx_buffer's and calls rte_eth_tx_buffer_flush().
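
For reference, the setup described above would look roughly like this
(a sketch only -- the pool/ring names, sizes and error handling are made
up, not taken from the actual application):

#include <stdlib.h>
#include <rte_mbuf.h>
#include <rte_ring.h>
#include <rte_ethdev.h>

/* All names here are hypothetical. */
extern struct rte_mempool *pools[4];          /* the 4 mbuf pools        */
extern struct rte_ring *work_ring;            /* producer -> Sender ring */
extern struct rte_eth_dev_tx_buffer *tx_buf;  /* shared Sender/Flusher   */

/* Thread 1: take an mbuf from a random pool and hand it to the ring. */
static void producer_loop(void)
{
    for (;;) {
        struct rte_mbuf *m = rte_pktmbuf_alloc(pools[rand() % 4]);

        if (m != NULL && rte_ring_enqueue(work_ring, m) != 0)
            rte_pktmbuf_free(m);              /* ring full, put it back */
    }
}

/* Thread 2 (Sender): pull mbufs from the ring, fill them in and buffer
 * them for TX; rte_eth_tx_buffer() calls rte_eth_tx_burst() itself once
 * the buffer fills up. */
static void sender_loop(void)
{
    void *obj;

    for (;;) {
        if (rte_ring_dequeue(work_ring, &obj) == 0) {
            /* ... populate the packet here ... */
            rte_eth_tx_buffer(0, 0, tx_buf, (struct rte_mbuf *)obj);
        }
    }
}

/* Thread 3 (Flusher): periodically push out whatever is buffered, which
 * also ends up in rte_eth_tx_burst() on the same port/queue. */
static void flusher_loop(void)
{
    for (;;)
        rte_eth_tx_buffer_flush(0, 0, tx_buf);
}

Both rte_eth_tx_buffer() and rte_eth_tx_buffer_flush() end up calling
rte_eth_tx_burst(), which is where the two threads sit in the core dump
below.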
> The deadlock occurs, in my opinion, when the Sender and the Flusher
> threads try to place buffers back into the same memory pool.
>
> This is a fragment of the core dump:
>
> Thread 2 (Thread 0x7f5932e69700 (LWP 14014)):
> #0  0x00007f59388e933a in common_ring_mp_enqueue () from
>     /usr/local/lib/librte_mempool.so.2.1
> #1  0x00007f59386b27e0 in ixgbe_xmit_pkts () from
>     /usr/local/lib/librte_pmd_ixgbe.so.1.1
> #2  0x00007f593d00aab7 in rte_eth_tx_burst (nb_pkts=<optimized out>,
>     tx_pkts=<optimized out>, queue_id=0, port_id=<optimized out>)
>     at /usr/local/include/dpdk/rte_ethdev.h:2858
> #3  rte_eth_tx_buffer_flush (buffer=<optimized out>, queue_id=0,
>     port_id=<optimized out>) at /usr/local/include/dpdk/rte_ethdev.h:3040
> #4  rte_eth_tx_buffer (tx_pkt=<optimized out>, buffer=<optimized out>,
>     queue_id=0, port_id=<optimized out>)
>     at /usr/local/include/dpdk/rte_ethdev.h:3090
>
> Thread 30 (Thread 0x7f5933175700 (LWP 13958)):
> #0  0x00007f59388e91cc in common_ring_mp_enqueue () from
>     /usr/local/lib/librte_mempool.so.2.1
> #1  0x00007f59386b27e0 in ixgbe_xmit_pkts () from
>     /usr/local/lib/librte_pmd_ixgbe.so.1.1
> #2  0x00007f593d007dfd in rte_eth_tx_burst (nb_pkts=<optimized out>,
>     tx_pkts=0x7f587a410358, queue_id=<optimized out>, port_id=0 '\000')
>     at /usr/local/include/dpdk/rte_ethdev.h:2858
> #3  rte_eth_tx_buffer_flush (buffer=0x7f587a410340,
>     queue_id=<optimized out>, port_id=0 '\000')
>     at /usr/local/include/dpdk/rte_ethdev.h:3040
>
> Questions:
>
> 1. I am using DPDK 17.02.1. Was a bug solved in newer releases that
>    can be mapped to this behavior?
>
> 2. If two threads try to place buffers into the same pool, should the
>    operation be synchronized by DPDK or by the application?

Not sure this will help, but if you are running more than one thread per
core then you can have problems. I assume you have a mempool cache here,
and the per-lcore caches are not thread safe (if I remember correctly);
the main cache in the mempool is thread safe. If you are not using
multiple threads per lcore, then I expect something else is going on
here, as this is the normal mode of operation for mempools. If you turn
off the mempool caches you should not see the problem, but then your
performance will drop some (see the sketch at the end of this mail).

I had a similar problem and made sure only one thread puts or gets
buffers from the same mempool.

> Thank you,
>
> Andrei Comandatu

Regards,
Keith
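
For reference, a cache-less pool as mentioned above can be created like
this (a sketch only -- the pool name and sizes are made up, not from the
application):

#include <rte_mempool.h>
#include <rte_mbuf.h>
#include <rte_lcore.h>

/* With cache_size == 0 every put/get goes through the mempool's
 * thread-safe common ring instead of a per-lcore cache, so threads
 * that share an lcore id no longer race on that cache, at the cost
 * of some performance. Name and sizes are arbitrary. */
static struct rte_mempool *
create_nocache_pool(void)
{
    return rte_pktmbuf_pool_create("tx_pool_nocache", /* hypothetical name */
                                   8192,              /* number of mbufs   */
                                   0,                 /* cache_size = 0    */
                                   0,                 /* priv_size         */
                                   RTE_MBUF_DEFAULT_BUF_SIZE,
                                   rte_socket_id());
}

The alternative Keith mentions -- making sure only one thread ever puts
or gets buffers from a given mempool -- sidesteps the same race without
giving up the cache.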