From: "Lazarenko, Vlad (WorldQuant)"
To: "'users@dpdk.org'"
Date: Wed, 28 Feb 2018 18:53:39 +0000
Message-ID: <790E2AC11206AC46B8F4BB82078E34F8081E0EAB@PWSSMTEXMBX002.AD.MLP.com>
Subject: [dpdk-users] Multi-process recovery (is it even possible?)

Guys,

I am looking for possible solutions for the following problems that come along with asymmetric multi-process architecture...
Given that multiple processes share the same RX/TX queue(s) and packet pool(s), and that one packet from an RX queue may be fanned out to multiple slave processes, is there a way to recover from a slave crashing (or exiting without cleaning up properly)? In theory it could have incremented an mbuf reference count more than once, and unless everything is restarted, I don't see a reliable way to release those mbufs back to the pool.

Also, if a spinlock is involved and either the master or a slave crashes, everything simply gets stuck. Is there any way to detect this (i.e. outside of the data path)?

Thanks,
Vlad
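
To make the reference-count concern concrete, here is a toy model in Python. It is only a sketch and uses no DPDK calls: `Mbuf`, `Pool`, and `fan_out` are hypothetical names invented for illustration, and the "crash" is simulated by a slave that never releases its reference. It shows why the master cannot repair the pool afterwards: the buffer carries only a bare counter, with no record of which process holds each reference.

```python
"""Toy model of an mbuf leak after a slave crash. Not DPDK code."""


class Mbuf:
    def __init__(self, idx):
        self.idx = idx
        self.refcnt = 0  # 0 means the buffer is sitting in the pool


class Pool:
    def __init__(self, size):
        self.free = [Mbuf(i) for i in range(size)]

    def alloc(self):
        m = self.free.pop()
        m.refcnt = 1  # the allocator holds the first reference
        return m

    def retain(self, m, n=1):
        m.refcnt += n  # only a count is stored, not who took it

    def release(self, m):
        m.refcnt -= 1
        if m.refcnt == 0:
            self.free.append(m)  # last reference gone: back to the pool


def fan_out(pool, slaves, crashed):
    """Hand one packet to every slave; slaves in `crashed` never release."""
    m = pool.alloc()
    pool.retain(m, len(slaves))  # one extra reference per slave
    pool.release(m)              # master drops its own reference
    for name in slaves:
        if name in crashed:
            continue             # simulated crash: reference is never dropped
        pool.release(m)
    return m


if __name__ == "__main__":
    pool = Pool(4)
    leaked = fan_out(pool, ["a", "b", "c"], crashed={"b"})
    print("refcnt after crash:", leaked.refcnt)
    print("buffers left in pool:", len(pool.free))
```

After the simulated crash the buffer keeps a non-zero count forever and never returns to the pool. Any recovery scheme would need extra bookkeeping outside the counter itself, for example a per-process record of held references, which is an assumption here rather than something the model or the question provides.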