From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <dev-bounces@dpdk.org>
Received: from dpdk.org (dpdk.org [92.243.14.124])
	by inbox.dpdk.org (Postfix) with ESMTP id 24582A04DC;
	Tue, 20 Oct 2020 14:21:16 +0200 (CEST)
Received: from [92.243.14.124] (localhost [127.0.0.1])
	by dpdk.org (Postfix) with ESMTP id B7A61BAD4;
	Tue, 20 Oct 2020 14:21:13 +0200 (CEST)
Received: from inbox.dpdk.org (xvm-172-178.dc0.ghst.net [95.142.172.178])
 by dpdk.org (Postfix) with ESMTP id C2FD6A9AD
 for <dev@dpdk.org>; Tue, 20 Oct 2020 14:21:12 +0200 (CEST)
Received: by inbox.dpdk.org (Postfix, from userid 33)
 id A318EA04DD; Tue, 20 Oct 2020 14:21:11 +0200 (CEST)
From: bugzilla@dpdk.org
To: dev@dpdk.org
Date: Tue, 20 Oct 2020 12:21:10 +0000
X-Bugzilla-Reason: AssignedTo
X-Bugzilla-Type: new
X-Bugzilla-Watch-Reason: None
X-Bugzilla-Product: DPDK
X-Bugzilla-Component: core
X-Bugzilla-Version: 18.11
X-Bugzilla-Keywords: 
X-Bugzilla-Severity: critical
X-Bugzilla-Who: amohakud@rbbn.com
X-Bugzilla-Status: UNCONFIRMED
X-Bugzilla-Resolution: 
X-Bugzilla-Priority: Normal
X-Bugzilla-Assigned-To: dev@dpdk.org
X-Bugzilla-Target-Milestone: ---
X-Bugzilla-Flags: 
X-Bugzilla-Changed-Fields: bug_id short_desc product version rep_platform
 op_sys bug_status bug_severity priority component assigned_to reporter
 target_milestone attachments.created
Message-ID: <bug-561-3@http.bugs.dpdk.org/>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
X-Bugzilla-URL: http://bugs.dpdk.org/
Auto-Submitted: auto-generated
X-Auto-Response-Suppress: All
MIME-Version: 1.0
Subject: [dpdk-dev] [Bug 561] EAL: secondary dpdk application fails to come
 up in 5.4.35 kernel due to memzone_init failure
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://mails.dpdk.org/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://mails.dpdk.org/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://mails.dpdk.org/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
Errors-To: dev-bounces@dpdk.org
Sender: "dev" <dev-bounces@dpdk.org>

https://bugs.dpdk.org/show_bug.cgi?id=561

            Bug ID: 561
           Summary: EAL: secondary dpdk application fails to come up in
                    5.4.35 kernel due to memzone_init failure
           Product: DPDK
           Version: 18.11
          Hardware: x86
                OS: Linux
            Status: UNCONFIRMED
          Severity: critical
          Priority: Normal
         Component: core
          Assignee: dev@dpdk.org
          Reporter: amohakud@rbbn.com
  Target Milestone: ---

Created attachment 126
  --> https://bugs.dpdk.org/attachment.cgi?id=126&action=edit
EAL: secondary dpdk application fails to come up in 5.4.35 kernel due to
memzone_init failure

DPDK version: 18.11.6
Kernel version: 5.4.35

The secondary DPDK process fails to come up on the 5.4.35 kernel and prints the
following error messages:

 EAL: Probing VFIO support...
 EAL: VFIO support initialized
 EAL: Cannot attach to memzone list
 EAL: Cannot init memzone
 EAL: Error - exiting with code: 1

From code analysis, rte_eal_init() in the primary process behaves differently
on the 5.4.35 kernel than on the 4.19 or 4.15 kernels.
On the 5.4.35 kernel, the primary process cleans up the DPDK runtime directory
in eal_clean_runtime_dir(). The flock() system call in that function succeeds
in the primary process, so the file /var/run/dpdk/rte/fbarray_memzone gets
deleted. The secondary process then fails to attach to the memzone list while
coming up, and exits.

On the 4.19 and 4.15 kernels, by contrast, the flock() system call always fails
in the primary process, so the file is kept and the secondary process is able
to attach to the already created memzone.
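
To make the mechanism above concrete, here is a minimal standalone sketch of
the same lock probe that eal_clean_runtime_dir() performs on each file. It is
not DPDK code. The helper names cleanup_would_remove() and hold_shared_lock()
are hypothetical, and the sketch assumes that the process still using the file
keeps a shared flock() on an open descriptor. Under that assumption the probe
is denied while the holder exists and granted once the holder is gone, which
is the case where the file would be unlinked.

```c
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Same probe as the cleanup loop: open the file read-only and try a
 * non-blocking exclusive lock. Returns 1 if the lock was granted (the
 * cleanup would unlink the file), 0 if another holder blocked it, and
 * -1 if the file could not be opened.
 */
static int cleanup_would_remove(const char *path)
{
	int fd = open(path, O_RDONLY);
	int granted;

	if (fd == -1)
		return -1;
	granted = (flock(fd, LOCK_EX | LOCK_NB) == 0);
	close(fd);	/* closing the descriptor also drops the lock */
	return granted;
}

/*
 * Hypothetical stand-in for a process that is still using the file:
 * creates/opens it and takes a shared lock. Returns the descriptor that
 * holds the lock, or -1 on failure.
 */
static int hold_shared_lock(const char *path)
{
	int fd = open(path, O_RDWR | O_CREAT, 0600);

	if (fd == -1)
		return -1;
	if (flock(fd, LOCK_SH | LOCK_NB) == -1) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Calling cleanup_would_remove() on a runtime file while the processes are up
may help to confirm on each kernel whether any lock is actually held on it.
Note that a granted probe holds the exclusive lock for a brief moment.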




int
eal_clean_runtime_dir(void)
{
       DIR *dir;
       struct dirent *dirent;
       int dir_fd, fd, lck_result;
       static const char * const filters[] = {
              "fbarray_*",
              "mp_socket_*"
       };
       /* open directory */
       dir = opendir(runtime_dir);
       if (!dir) {
              RTE_LOG(ERR, EAL, "Unable to open runtime directory %s\n",
                           runtime_dir);
              goto error;
       }
       dir_fd = dirfd(dir);

       /* lock the directory before doing anything, to avoid races */
       if (flock(dir_fd, LOCK_EX) < 0) {
              RTE_LOG(ERR, EAL, "Unable to lock runtime directory %s\n",
                     runtime_dir);
              goto error;
       }

       dirent = readdir(dir);
       if (!dirent) {
              RTE_LOG(ERR, EAL, "Unable to read runtime directory %s\n",
                           runtime_dir);
              goto error;
       }

       while (dirent != NULL) {
              unsigned int f_idx;
              bool skip = true;

              /* skip files that don't match the patterns */
              for (f_idx = 0; f_idx < RTE_DIM(filters); f_idx++) {
                     const char *filter = filters[f_idx];

                     if (fnmatch(filter, dirent->d_name, 0) == 0) {
                           skip = false;
                           break;
                     }
              }
              if (skip) {
                     dirent = readdir(dir);
                     continue;
              }

              /* try and lock the file */
              fd = openat(dir_fd, dirent->d_name, O_RDONLY);

              /* skip to next file */
              if (fd == -1) {
                     dirent = readdir(dir);
                     continue;
              }

              /* non-blocking lock */
              lck_result = flock(fd, LOCK_EX | LOCK_NB);

              /* if lock succeeds, remove the file */
              if (lck_result != -1)
                     unlinkat(dir_fd, dirent->d_name, 0);
              close(fd);
              dirent = readdir(dir);
       }
       }

       /* closedir closes dir_fd and drops the lock */
       closedir(dir);
       return 0;

error:
       if (dir)
              closedir(dir);
       RTE_LOG(ERR, EAL, "Error while clearing runtime dir: %s\n",
              strerror(errno));

       return -1;
}
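
For reference, the pattern filter in the function above can be exercised on
its own. The following is only a sketch: is_cleanup_candidate() is a
hypothetical helper name, and the two patterns are copied from the filters[]
array. It shows that a file named fbarray_memzone is one of the names the
cleanup loop will try to lock and remove.

```c
#include <fnmatch.h>
#include <stddef.h>

/* Same two patterns as the filters[] array in eal_clean_runtime_dir(). */
static const char * const filters[] = {
	"fbarray_*",
	"mp_socket_*"
};

/* Returns 1 if name matches any cleanup pattern, 0 otherwise. */
static int is_cleanup_candidate(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(filters) / sizeof(filters[0]); i++) {
		if (fnmatch(filters[i], name, 0) == 0)
			return 1;
	}
	return 0;
}
```

Names that match neither pattern are skipped by the loop before any lock is
attempted.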

-- 
You are receiving this mail because:
You are the assignee for the bug.