From: Jeremy Spewock
Date: Fri, 2 Aug 2024 09:35:13 -0400
Subject: Re: [PATCH v2 1/1] dts: add text parser for testpmd verbose output
To: Luca Vizzarro
Cc: yoan.picchi@foss.arm.com, probb@iol.unh.edu, paul.szczepanek@arm.com, npratte@iol.unh.edu, thomas@monjalon.net, Honnappa.Nagarahalli@arm.com, juraj.linkes@pantheon.tech, wathsala.vithanage@arm.com, dev@dpdk.org
References: <20240729203955.267942-1-jspewock@iol.unh.edu> <20240730133459.21907-1-jspewock@iol.unh.edu> <20240730133459.21907-2-jspewock@iol.unh.edu>

On Thu, Aug 1, 2024 at 4:41 AM Luca Vizzarro wrote:
>
> Great work Jeremy! Just a couple of minor passable improvement points.
>
> On 30/07/2024 14:34, jspewock@iol.unh.edu wrote:
>
> > +@dataclass
> > +class TestPmdVerbosePacket(TextParser):
> > +    """Packet information provided by verbose output in Testpmd.
> > +
> > +    The "receive/sent queue" information is not included in this dataclass because this value is
> > +    captured on the outer layer of input found in :class:`TestPmdVerboseOutput`.
> > +    """
> > +
> > +    #:
> > +    src_mac: str = field(metadata=TextParser.find(r"src=({})".format(REGEX_FOR_MAC_ADDRESS)))
> Just a(n optional) nit: TextParser.find(f"src=({REGEX_FOR_MAC_ADDRESS})")
> The raw string is only needed to prevent escaping, which we don't do here.

Ack. I really just left it this way because it also adjusts
highlighting in some IDEs, but there isn't much to see here anyway.

> > +    #:
> > +    dst_mac: str = field(metadata=TextParser.find(r"dst=({})".format(REGEX_FOR_MAC_ADDRESS)))
> As above.

Ack.

> > +    #: Memory pool the packet was handled on.
> > +    pool: str = field(metadata=TextParser.find(r"pool=(\S+)"))
> > +    #: Packet type in hex.
> > +    p_type: int = field(metadata=TextParser.find_int(r"type=(0x[a-fA-F\d]+)"))
> > +    #:
> >
>
> > +    @staticmethod
> > +    def extract_verbose_output(output: str) -> list[TestPmdVerboseOutput]:
> > +        """Extract the verbose information present in given testpmd output.
> > +
> > +        This method extracts sections of verbose output that begin with the line
> > +        "port X/queue Y: sent/received Z packets" and end with the ol_flags of a packet.
> > +
> > +        Args:
> > +            output: Testpmd output that contains verbose information
> > +
> > +        Returns:
> > +            List of parsed packet information gathered from verbose information in `output`.
> > +        """
> > +        iter = re.finditer(r"(port \d+/queue \d+:.*?(?=port \d+/queue \d+|$))", output, re.S)
>
> How about using a regex that matches what you described? ;) Keeping re.S:
>
> (port \d+/queue \d+.+?ol_flags: [\w ]+)
>
> Would spare you from using complex lookahead constructs and 4.6x less
> steps. Maybe it doesn't work with every scenario? Looks like it works
> well with the sample output I have. Let me know if it works for you.
>

I tried using something like this actually which is probably why the
docstring reflects that type of language, but I didn't use it because
it doesn't match one specific case.
I'm not sure how common it is, and I haven't seen it happen in my
recent testing, but since the verbose output specifically states that
it sent/received X number of packets, I presume there is a case where
that number will be more than 1, and there will be more than one set
of packet information after that first line. I think I've seen it
happen in the past, but I couldn't recreate it in testing.

For context to this next section, if it wasn't clear, I consider the
`port \d+/queue \d+` line to be the header line and the start of a
"block" of verbose output.

Basically, the problem is that if there is more than one packet under
that header line, the lazy approach only consumes up to the ol_flags of
the first packet of a block, and the greedy approach consumes
everything up to the last packet of the entire output. You could use
the lazy approach with the next port/queue line as your delimiter, but
then the opening line of the next block of output is included in the
previous block's group. The only way I could find to get around this,
keeping the idea of "take everything from the start of this group until
another group starts" without capturing the opening of the next block,
was a lookahead. Again though, I definitely don't love the regex that I
wrote and would love to find a better alternative.

> Best,
> Luca
>
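
To make the multi-packet case concrete, here is a minimal sketch that runs the four variants discussed above against a made-up two-block sample. The sample text, the `HEADER` constant and the `blocks` helper are hypothetical and only loosely modelled on testpmd verbose output; they are not part of the patch.

```python
import re

# Hypothetical sample: the first block carries two packets under one header,
# which is the case described above. Field values are made up.
SAMPLE = (
    "port 0/queue 0: received 2 packets\n"
    "  src=AA:BB:CC:DD:EE:01 - dst=AA:BB:CC:DD:EE:02 - pool=mb_pool_0 - type=0x0800\n"
    "  ol_flags: RTE_MBUF_F_RX_IP_CKSUM_GOOD\n"
    "  src=AA:BB:CC:DD:EE:03 - dst=AA:BB:CC:DD:EE:02 - pool=mb_pool_0 - type=0x0800\n"
    "  ol_flags: RTE_MBUF_F_RX_IP_CKSUM_GOOD\n"
    "port 1/queue 0: sent 1 packets\n"
    "  src=AA:BB:CC:DD:EE:02 - dst=AA:BB:CC:DD:EE:01 - pool=mb_pool_0 - type=0x0800\n"
    "  ol_flags: RTE_MBUF_F_TX_L4_NO_CKSUM\n"
)

# Header line that opens a "block" of verbose output.
HEADER = r"port \d+/queue \d+"


def blocks(pattern: str) -> list[str]:
    """Return every substring of SAMPLE matched by `pattern` (with re.S)."""
    return [m.group(0) for m in re.finditer(pattern, SAMPLE, re.S)]


# Lazy up to ol_flags: stops after the first packet of each block.
lazy = blocks(HEADER + r".+?ol_flags: [\w ]+")
# Greedy up to ol_flags: swallows everything up to the last packet overall.
greedy = blocks(HEADER + r".+ol_flags: [\w ]+")
# Lazy with the next header as delimiter: the next header leaks into the match.
delimited = blocks(HEADER + r":.*?" + HEADER)
# Lazy with a lookahead: stops before the next header without consuming it.
lookahead = blocks(HEADER + r":.*?(?=" + HEADER + r"|$)")

for name, found in [
    ("lazy", lazy),
    ("greedy", greedy),
    ("delimited", delimited),
    ("lookahead", lookahead),
]:
    # Number of blocks found and how many packets (ol_flags lines) each holds.
    print(name, len(found), [b.count("ol_flags") for b in found])
```

On this sample only the lookahead variant yields two blocks with two and one packets respectively; the lazy variant drops the second packet of the first block, and the delimited variant ends its single match with the next block's header.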