From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from dpdk.org (dpdk.org [92.243.14.124]) by inbox.dpdk.org (Postfix) with ESMTP id 7B3D6A0527; Wed, 25 Nov 2020 12:05:50 +0100 (CET) Received: from [92.243.14.124] (localhost [127.0.0.1]) by dpdk.org (Postfix) with ESMTP id AA86DC966; Wed, 25 Nov 2020 12:05:47 +0100 (CET) Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by dpdk.org (Postfix) with ESMTP id B731FC956; Wed, 25 Nov 2020 12:05:45 +0100 (CET) IronPort-SDR: aO8XH0SErY6EdWcpfKlHkfVK/LcajYsQCKhsG9sLMGVyPtVdWc8XnmhPPZYohfpkLExQNb/kCT GznWW/59X+Rw== X-IronPort-AV: E=McAfee;i="6000,8403,9815"; a="236245448" X-IronPort-AV: E=Sophos;i="5.78,368,1599548400"; d="scan'208";a="236245448" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Nov 2020 03:05:43 -0800 IronPort-SDR: atNVc39Q3EgWyR5BM/J+B+vbtSqswN7eN/nbUfu02JsQEE81BgxQtkrGg22nafTyeP6vyAiB8A 1zHTSmQEzJcg== X-IronPort-AV: E=Sophos;i="5.78,368,1599548400"; d="scan'208";a="478877913" Received: from fyigit-mobl1.ger.corp.intel.com (HELO [10.213.195.226]) ([10.213.195.226]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Nov 2020 03:05:42 -0800 To: David Marchand , Timothy Redaelli Cc: Anatoly Burakov , Bruce Richardson , dev , dpdk stable References: <4a04b06b040ac16c45addf92fb43f59b070f185a.1606229937.git.tredaelli@redhat.com> From: Ferruh Yigit Message-ID: <9992d259-0fc6-2597-55ab-ebb11de951ad@intel.com> Date: Wed, 25 Nov 2020 11:05:38 +0000 MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 8bit Subject: Re: [dpdk-dev] [dpdk-stable] [PATCH 2/2] eal: fix loading of shared libs from driver plugin directories X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: dev-bounces@dpdk.org Sender: "dev" On 11/25/2020 10:01 AM, David Marchand wrote: > On Tue, Nov 24, 2020 at 4:15 PM Timothy Redaelli wrote: >> >> Commit 49b536fc3060 ("eal: load only shared libs from driver plugin directories") >> introduced a check that any shared library must ends with .so, but it can't >> work, at least, on Fedora/CentOS/RHEL since .so symlinks are not installed >> when you install dpdk package, but only when you install dpdk-devel package. >> >> This commit adds also a check for .so.ABI_VERSION to check for shared >> lib. >> >> See Fedora Packaging Guidelines for more informations: >> https://docs.fedoraproject.org/en-US/packaging-guidelines/#_devel_packages >> >> Fixes: 49b536fc3060 ("eal: load only shared libs from driver plugin directories") >> Cc: bruce.richardson@intel.com >> Signed-off-by: Timothy Redaelli >> --- >> lib/librte_eal/common/eal_common_options.c | 6 ++++-- >> 1 file changed, 4 insertions(+), 2 deletions(-) >> >> diff --git a/lib/librte_eal/common/eal_common_options.c b/lib/librte_eal/common/eal_common_options.c >> index e6f77ad217..e9e2362c53 100644 >> --- a/lib/librte_eal/common/eal_common_options.c >> +++ b/lib/librte_eal/common/eal_common_options.c >> @@ -402,8 +402,10 @@ eal_plugindir_init(const char *path) >> struct stat sb; >> int nlen = strnlen(dent->d_name, sizeof(dent->d_name)); >> >> - /* check if name ends in .so */ >> - if (strcmp(&dent->d_name[nlen - 3], ".so") != 0) >> + /* check if name ends in .so or .so.ABI_VERSION */ >> + if (strcmp(&dent->d_name[nlen - 3], ".so") != 0 && >> + strcmp(&dent->d_name[nlen - 4 - strlen(ABI_VERSION)], >> + ".so."ABI_VERSION) != 0) >> continue; >> >> snprintf(sopath, sizeof(sopath), "%s/%s", path, dent->d_name); > > A first doubt I had about this patch is about multiple loading of the > same driver, and its effects. > So I first had a look at the main branch current state (using > devtools/test-null.sh) and I can see multiple loading already happens > for statically linked drivers like the bond driver. > > $ LD_LIBRARY_PATH=build/lib gdb build/app/dpdk-testpmd -ex "b main" > -ex "run -c 3 --no-huge -m 20 -d build/drivers '--log-level=*:debug' > -a 0:0.0 --vdev net_null1 --vdev net_null2 -- --no-mlockall > --total-num-mbufs=2048 -ia" > ... > (gdb) info sharedlibrary > ... > 0x00007ffff796d5b0 0x00007ffff797d2d6 Yes > /home/dmarchan/dpdk/build/app/../drivers/librte_net_bond.so.21 > ... > (gdb) b rte_eal_init > (gdb) c > (gdb) finish > ... > EAL: open shared lib build/drivers/librte_net_avp.so > EAL: open shared lib build/drivers/librte_net_bond.so > EAL: open shared lib build/drivers/librte_net_ixgbe.so > ... > (gdb) set $elm=*(struct shared_driver *)solib_list.tqh_first > (gdb) while $elm > p $elm > set $elm=*(struct shared_driver *)($elm.next.tqe_next) > end > ... > $67 = {next = {tqe_next = 0x5c3350, tqe_prev = 0x5c1310}, name = > "build/drivers/librte_net_avp.so", '\000' , > lib_handle = 0x616870} > $68 = {next = {tqe_next = 0x5c4370, tqe_prev = 0x5c2330}, name = > "build/drivers/librte_net_bond.so", '\000' , > lib_handle = 0x7ffff7995a80} > $69 = {next = {tqe_next = 0x5c5390, tqe_prev = 0x5c3350}, name = > "build/drivers/librte_net_ixgbe.so", '\000' , > lib_handle = 0x7ffff7942fc0} > ... > > I could not confirm the 0x7ffff7995a80 handle points at the first load > of the bond driver. > gdb does not seem to know of the linker internal structures, a bit > hard to tell but the addresses are in the same range, so likely the > same object. > > I confirmed the bond constructor only being called once when the bond > driver is loaded because of static link: > > $ LD_LIBRARY_PATH=build/lib gdb build/app/dpdk-testpmd -ex "b > vdrvinitfn_pmd_bond_drv" -ex "run -c 3 --no-huge -m 20 -d > build/drivers '--log-level=*:debug' -a 0:0.0 --vdev net_null1 --vdev > net_null2 -- --no-mlockall --total-num-mbufs=2048 -ia" > GNU gdb (GDB) Fedora 8.3.50.20190824-30.fc31 > ... > Reading symbols from build/app/dpdk-testpmd... > Function "vdrvinitfn_pmd_bond_drv" not defined. > Make breakpoint pending on future shared library load? (y or [n]) y > Breakpoint 1 (vdrvinitfn_pmd_bond_drv) pending. > Starting program: /home/dmarchan/dpdk/build/app/dpdk-testpmd -c 3 > --no-huge -m 20 -d build/drivers '--log-level=*:debug' -a 0:0.0 --vdev > net_null1 --vdev net_null2 -- --no-mlockall --total-num-mbufs=2048 -ia > [Thread debugging using libthread_db enabled] > Using host libthread_db library "/usr/lib64/libthread_db.so.1". > > Breakpoint 1, vdrvinitfn_pmd_bond_drv () at > ../drivers/net/bonding/rte_eth_bond_pmd.c:3755 > 3755 RTE_PMD_REGISTER_VDEV(net_bonding, pmd_bond_drv); > Missing separate debuginfos, use: dnf debuginfo-install > elfutils-libelf-0.179-2.fc31.x86_64 glibc-2.30-13.fc31.x86_64 > jansson-2.12-4.fc31.x86_64 libbsd-0.9.1-4.fc31.x86_64 > libfdt-1.6.0-1.fc31.x86_64 numactl-libs-2.0.12-3.fc31.x86_64 > zlib-1.2.11-20.fc31.x86_64 > > (gdb) c > Continuing. > ... > EAL: open shared lib build/drivers/librte_net_avp.so > EAL: open shared lib build/drivers/librte_net_bond.so > EAL: open shared lib build/drivers/librte_net_ixgbe.so > ... > testpmd> > > > Now, about this patch. > In this test, it has the effect of loading all drivers twice (or > thrice if statically linked). > I see no problem with it since the linker is intelligent enough and > constructors are only called once. > > > My only remaining doubt is whether we should be looking for > .ABI_VERSION or .ABI_MAJOR extension. > This could be discussed, but in its current form, this patch lgtm. > > Acked-by: David Marchand > > I also checked the same, yes same file processed twice, with actual name and symlink, but since eal using 'realpath()' before 'dlopen()' the path passed to the 'dlopen()' is exact same for both symlink and actual file. And 'dlopen()' has following in the man page: " If the same shared object is opened again with dlopen(), the same object handle is returned. The dynamic linker maintains reference counts for object handles, so a dynamically loaded shared object is not deallocated until dlclose() has been called on it as many times as dlopen() has succeeded on it. Constructors (see below) are called only when the object is actually loaded into memory (i.e., when the reference count increases to 1). " As I run/debug with or without the symlinks, it looks OK with patch, Tested-by: Ferruh Yigit