From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 4BC44A0350; Tue, 23 Jun 2020 13:28:28 +0200 (CEST) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id 6A91B1D640; Tue, 23 Jun 2020 13:28:27 +0200 (CEST) Received: from smtp.tuxdriver.com (charlotte.tuxdriver.com [70.61.120.58]) by dpdk.org (Postfix) with ESMTP id 98BA11BEC5 for ; Tue, 23 Jun 2020 13:28:25 +0200 (CEST) Received: from [2605:a601:a627:ca00:88f6:82e4:5f1a:31bb] (helo=localhost) by smtp.tuxdriver.com with esmtpsa (TLSv1:AES256-SHA:256) (Exim 4.63) (envelope-from ) id 1jnh63-0000Qx-W7; Tue, 23 Jun 2020 07:28:21 -0400 Date: Tue, 23 Jun 2020 07:28:06 -0400 From: Neil Horman To: Dmitry Kozlyuk Cc: dev@dpdk.org, Bruce Richardson , Thomas Monjalon , robin.jarry@6wind.com Message-ID: <20200623112806.GA301645@hmswarspite.think-freely.org> References: <20200622004503.29036-1-dmitry.kozliuk@gmail.com> <20200622124117.GA216823@hmswarspite.think-freely.org> <20200622223946.25977c13@sovereign> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200622223946.25977c13@sovereign> X-Spam-Score: -2.9 (--) X-Spam-Status: No Subject: Re: [dpdk-dev] [RFC PATCH 0/2] pmdinfogen: rewrite in Python X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On Mon, Jun 22, 2020 at 10:39:46PM +0300, Dmitry Kozlyuk wrote: > On Mon, 22 Jun 2020 08:41:17 -0400, Neil Horman wrote: > > On Mon, Jun 22, 2020 at 03:45:01AM +0300, Dmitry Kozlyuk wrote: > [snip] > > > 1. No standard ELF or COFF module for Python > > > (amount of Python code without libelf on par with C code using it). > > Thats not really true, pyelftools is pretty widely used, and packaged for > > most(all) major distributions. Requiring this kicks the can for (2) above down > > the road a bit, but I would prefer to see that used rather than a custom > > implementation, just to reduce the maint. burden > > I must have underestimated pyelftools spread. I'll look into using it. The > less new code the better, I agree here. Windows users can get it via PyPI. > Ack. > > > > 2. struct rte_pci_id must be synchronized with header file > > > (it's a few lines that never change). > > > > > I think you can just use ctypes and the native python C interface to just import > > the rte_pci_id structure from the native dpdk header to avoid this, no? > > Not sure I follow. RFC script uses ctypes to describe struct rte_pci_id > in Python. If you're suggesting to create a Python module in C just to include > a header with a single small structure, I'd say it's not worth the trouble. > Yeah, apologies, i misread what you were saying here, and didn't bother to check the code. what you're doing makes sense. > > > > 1. Support for >65K section headers seems present in current pmdinfogen. > > > However, the data it reads is not used after. Is it really needed? > > > > > Its used. > > The support you're referring to is primarily the ability to extract the true > > number of section headers from the ELF file (in the event that its more than > > 64k). Without that, on very large binaries, you may not be able to find the > > symbol table in the ELF file. > > I was talking about these fields of struct elf_info: > > /* if Nth symbol table entry has .st_shndx = SHN_XINDEX, > * take shndx from symtab_shndx_start[N] instead > */ > Elf32_Word *symtab_shndx_start; > Elf32_Word *symtab_shndx_stop; > > They are not used after being filled in parse_elf() and their endianness > fixed in the end despite the comment. > Its been a while since I wrote this, so I need to go back and look closely, but as you say, these values aren't used after parse_elf completes, but they are(I think) used to correct the endianess of the section header index table, so that get_sym_value works properly. You would (again I think), only note a problem if you were parsing an ELF file for an architecture with an endianess opposite that of what you are building on (i.e. an x86 system cross compiling for a powerpc target). > > > > 2. How much error-handling is required? This is a build-time tool, > > > and Python gives nice stacktraces. However, segfaults are possible > > > in Python version due to pointer handling. IMO, error checking > > > must be just sufficient to prevent silent segfaults. > > > > > I'm not really sure what the question is here. You need to handle errors in the > > parsing process, we can't just have the tool crashing during the build. How > > thats handled is an implementation detail I would think. > > Consider a snippet from pmdinfogen.c: > > /* Check if file offset is correct */ > if (hdr->e_shoff > info->size) { > fprintf(stderr, "section header offset=%lu in file '%s' " > "is bigger than filesize=%lu\n", > (unsigned long)hdr->e_shoff, > filename, info->size); > return 0; > } > > It is required in C, because otherwise pmdinfogen would crash without > diagnostic. With Python, most errors like this result in a crash with a trace > to the error line and a message from ctypes (or ELF parsing library). I'm > arguing for leaving only the checks that prevent *silent* crashes (this can > happen with ctypes) which are hard to diagnose. Motivation is to keep the > code simple. > Hmm, I'd defer to others opinions on that. As a build tool it may be ok to just crash with a backtrace, but I'm personally not a fan of that. Id rather see a diagnostic error and have the tool exits with an appropriate exit code, but lets see what others say. > > > > 3. On Unix, pmdinfogen is called for each object file extracted with ar > > > from an .a library by a shell script. On Windows, other tools > > > have to be used, shell script will not work. On the other hand, COFF > > > library format is quite simple. Would it be appropriate for > > > pmdinfogen to handle it to avoid intermediate script? > > > > > I suppose you could, but extracting objects from archives seems outside the > > scope of what pmdinfogen normally does. What is the problem with using a shell > > script on windows? I thought WSL had support for executing bash syntax shell > > scripts there? Why not key off an environment variable to make the relevant > > scripts do the right thing on your environment? > > WSL2 is a Linux VM integrated in Windows, you can only cross-compile from > there. Native builds can use Python or PowerShell, which is as capable as Unix > shells, but incompatible with them. To call lib.exe (MSVC analog of ar) is > probably simpler then parsing COFF, yes. So meson could select one of the > following for a Windows target (objects are COFF): > > Host Toolchain Archive Script Extractor > ------- --------- ------- --------- --------- > Linux MinGW .a sh ar > Windows MinGW .a PowerShell ar > Windows Clang .lib PowerShell lib I think if I read you right, what you're suggesting here is that meson setup a build time marco $ARCHIVETOOL, and set it to ar or lib.exe depending on the build environment, and just have the script in question use $ARCHIVER? If so, I'd be good with that. Do we have to worry about the use of divergent command line options for each tool as well? Neil > > -- > Dmitry Kozlyuk >