From: Luca Vizzarro
To: Juraj Linkeš
Cc: dev@dpdk.org, Jeremy Spewock, Paul Szczepanek
Subject: Re: [PATCH 4/5] dts: add `show port info` command to TestPmdShell
Date: Wed, 17 Apr 2024 16:29:01 +0100
Message-ID: <7165977f-6371-4398-b34d-eaeeaa1ef379@arm.com>
In-Reply-To: <68da0ef2-430b-42af-8c1d-026760cfa4f1@arm.com>
References: <20240412111136.3470304-1-luca.vizzarro@arm.com> <20240412111136.3470304-5-luca.vizzarro@arm.com> <4f17ef06-c508-495a-a0f8-a28e9e77a1f9@arm.com> <68da0ef2-430b-42af-8c1d-026760cfa4f1@arm.com>

On 17/04/2024 15:25, Luca Vizzarro wrote:
> On 17/04/2024 14:22, Juraj Linkeš wrote:
>>> I'll
>>> experiment with some look ahead constructs.
>>> The easiest solution is to
>>> match everything that is not * ([^*]+) but can we be certain that there
>>> won't be any asterisk in the actual information?
>>
>> We can't. But we can be reasonably certain there won't be five
>> consecutive asterisks, so maybe we can work with that.
>
> We can work with that by using look ahead constructs as mentioned, which
> can be quite intensive. For example:
>
>   /(?<=\n\*).*?(?=\n\*|$)/gs
>
> looks for the start delimiter and for the start of the next block or the
> end. This works perfectly! But it's performing 9576 steps (!) for just
> two ports. The current solution only takes 10 steps in total.

Thinking about it... we are not really aiming for performance, so I guess
if it simplifies the code and is justifiable, then it's not a problem,
especially since this command shouldn't be called continuously.

An equivalent but slightly more optimised pattern,

    /\n\*.+?(?=\n\*|$)/gs

takes approximately 3*input_length steps to run (according to regex101 at
least). If that's reasonable enough, I can do this:

    matches = re.finditer(r"\n\*.+?(?=\n\*|$)", input, re.S)
    return [TestPmdPortInfo.parse(match.group(0)) for match in matches]

Another optimisation is artificially adding a `\n*` delimiter at the end
before feeding it to the regex, thus removing the alternative case (|$)
and making it 2*len steps:

    input += "\n*"
    matches = re.finditer(r"\n\*.+?(?=\n\*)", input, re.S)
    return [TestPmdPortInfo.parse(match.group(0)) for match in matches]

Let me know what you think!

Best,
Luca
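
For reference, the two variants discussed above can be compared with a
small standalone sketch. The sample text is made up for illustration and
only approximates testpmd's `show port info` output; the function names
are hypothetical and not part of the patch.

```python
import re

# Hypothetical sample resembling testpmd port info output;
# the field names and values are illustrative only.
SAMPLE = (
    "\n********************* Infos for port 0  *********************"
    "\nMAC address: 00:11:22:33:44:55"
    "\nLink status: up"
    "\n"
    "\n********************* Infos for port 1  *********************"
    "\nMAC address: 66:77:88:99:AA:BB"
    "\nLink status: down"
)


def split_with_alternation(output: str) -> list[str]:
    """Split on "\\n*" delimiters, letting the lookahead also accept end of input."""
    # re.S makes "." match newlines, so each lazy match spans a whole block.
    return [m.group(0) for m in re.finditer(r"\n\*.+?(?=\n\*|$)", output, re.S)]


def split_with_sentinel(output: str) -> list[str]:
    """Same split, but with an artificial trailing delimiter instead of "|$"."""
    # The appended "\n*" guarantees the last block is followed by a delimiter.
    return [m.group(0) for m in re.finditer(r"\n\*.+?(?=\n\*)", output + "\n*", re.S)]


if __name__ == "__main__":
    for block in split_with_sentinel(SAMPLE):
        print(repr(block))
```

On this sample both functions return the same two blocks, each starting
with the `\n*` delimiter. Whether the step counts reported by regex101
carry over to Python's `re` engine is not something this sketch measures.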