From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <konstantin.ananyev@intel.com>
Received: from mga09.intel.com (mga09.intel.com [134.134.136.24])
 by dpdk.org (Postfix) with ESMTP id 4BE18B0CC
 for <dev@dpdk.org>; Wed, 18 Jun 2014 16:24:06 +0200 (CEST)
Received: from orsmga001.jf.intel.com ([10.7.209.18])
 by orsmga102.jf.intel.com with ESMTP; 18 Jun 2014 07:17:59 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.01,501,1400050800"; d="scan'208";a="530465897"
Received: from irsmsx104.ger.corp.intel.com ([163.33.3.159])
 by orsmga001.jf.intel.com with ESMTP; 18 Jun 2014 07:23:19 -0700
Received: from irsmsx152.ger.corp.intel.com (163.33.192.66) by
 IRSMSX104.ger.corp.intel.com (163.33.3.159) with Microsoft SMTP Server (TLS)
 id 14.3.123.3; Wed, 18 Jun 2014 15:21:46 +0100
Received: from irsmsx105.ger.corp.intel.com ([169.254.7.239]) by
 IRSMSX152.ger.corp.intel.com ([169.254.6.197]) with mapi id 14.03.0123.003;
 Wed, 18 Jun 2014 15:21:46 +0100
From: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>
To: "Burakov, Anatoly" <anatoly.burakov@intel.com>, "dev@dpdk.org"
 <dev@dpdk.org>
Thread-Topic: [dpdk-dev] [PATCH v3 0/9] Make DPDK tailqs fully local
Thread-Index: AQHPiuhT6yrl5hk3Kk673j4TIuDlyJt26woQ
Date: Wed, 18 Jun 2014 14:21:45 +0000
Message-ID: <2601191342CEEE43887BDE71AB9772580EFB7DED@IRSMSX105.ger.corp.intel.com>
References: <cover.1403018971.git.anatoly.burakov@intel.com>
 <cover.1403084449.git.anatoly.burakov@intel.com>
In-Reply-To: <cover.1403084449.git.anatoly.burakov@intel.com>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-originating-ip: [163.33.239.182]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v3 0/9] Make DPDK tailqs fully local
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: patches and discussions about DPDK <dev.dpdk.org>
List-Unsubscribe: <http://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <http://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Wed, 18 Jun 2014 14:24:06 -0000



> This issue was reported by the OVS-DPDK project, and the fix should go to
> upstream DPDK. It is not memnic-related; it has to do with
> DPDK's rte_ivshmem library.
>
> Every DPDK data structure has a corresponding TAILQ reserved for it in
> the runtime config file. Those TAILQs are fully local to the process;
> however, most data structures contain a pointer to the next entry in
> the TAILQ.
>=20
> Since data structures such as rings are shared in their entirety,
> those TAILQ pointers are shared as well. This means that, after a
> successful rte_ring creation, the tailq_next pointer of the last
> ring in the TAILQ will be updated with a pointer to a ring which may
> not be present in the address space of another process (i.e. a ring
> that may be host-local or guest-local, and not shared over IVSHMEM).
> Any subsequent ring create/lookup on the other side of IVSHMEM will
> then try to dereference an invalid pointer.
>
> This patchset fixes this problem by creating a default TAILQ entry
> that may be used by any data structure that chooses to use TAILQs.
> This default TAILQ entry will consist of tailq_next/tailq_prev
> pointers and an opaque pointer to arbitrary data. All TAILQ
> pointers in the data structures themselves will be removed and
> replaced by those generic TAILQ entries, thus fixing the problem
> of potentially exposing local address space to shared structures.
>
> Technically, only the rte_ring structure requires modification, because
> IVSHMEM is only using memzones (which aren't in TAILQs) and rings,
> but for consistency's sake other TAILQ-based data structures were
> adapted as well.
>
> v2 changes:
> * fixed race conditions in *_free operations
> * fixed multiprocess support for malloc heaps
> * added similar changes for acl
> * rebased on top of e88b42f818bc1a6d4ce6cb70371b66e37fa34f7d
>
> v3 changes:
> * fixed race reported by Konstantin Ananyev (introduced in v2)
>
> Anatoly Burakov (9):
>   eal: map shared config into exact same address as primary process
>   rte_tailq: change rte_dummy to rte_tailq_entry, add data pointer
>   rte_ring: make ring tailq fully local
>   rte_hash: make rte_hash tailq fully local
>   rte_fbk_hash: make rte_fbk_hash tailq fully local
>   rte_mempool: make mempool tailq fully local
>   rte_lpm: make lpm tailq fully local
>   rte_lpm6: make lpm6 tailq fully local
>   rte_acl: make acl tailq fully local
>
>  app/test/test_tailq.c                             | 33 +++++-----
>  lib/librte_acl/acl.h                              |  1 -
>  lib/librte_acl/rte_acl.c                          | 74 ++++++++++++++++++-----
>  lib/librte_eal/common/eal_common_tailqs.c         |  2 +-
>  lib/librte_eal/common/include/rte_eal_memconfig.h |  5 ++
>  lib/librte_eal/common/include/rte_tailq.h         |  9 +--
>  lib/librte_eal/linuxapp/eal/eal.c                 | 44 ++++++++++++--
>  lib/librte_eal/linuxapp/eal/eal_ivshmem.c         | 17 +++++-
>  lib/librte_hash/rte_fbk_hash.c                    | 73 +++++++++++++++++-----
>  lib/librte_hash/rte_fbk_hash.h                    |  3 -
>  lib/librte_hash/rte_hash.c                        | 61 ++++++++++++++++---
>  lib/librte_hash/rte_hash.h                        |  2 -
>  lib/librte_lpm/rte_lpm.c                          | 65 ++++++++++++++++----
>  lib/librte_lpm/rte_lpm.h                          |  2 -
>  lib/librte_lpm/rte_lpm6.c                         | 62 +++++++++++++++----
>  lib/librte_mempool/Makefile                       |  3 +-
>  lib/librte_mempool/rte_mempool.c                  | 37 +++++++++---
>  lib/librte_mempool/rte_mempool.h                  |  2 -
>  lib/librte_ring/Makefile                          |  4 +-
>  lib/librte_ring/rte_ring.c                        | 33 +++++++---
>  lib/librte_ring/rte_ring.h                        |  2 -
>  21 files changed, 415 insertions(+), 119 deletions(-)
>
> --
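
For readers skimming the thread, the generic-entry idea described in the
cover letter can be sketched roughly as below. This is only an
illustration built on the standard <sys/queue.h> macros, not the code
from the patches: the names local_tailq_entry, local_tailq_head,
shared_obj, local_tailq_add and local_tailq_lookup are all hypothetical.
The point it tries to show is that the list links live in a
process-local node, while the shared object carries no list pointers at
all.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a shared object such as a ring.
 * Note that it contains no next/prev pointers. */
struct shared_obj {
	char name[32];
};

/* Hypothetical process-local list node: the TAILQ links plus an
 * opaque pointer to the object being tracked. */
struct local_tailq_entry {
	TAILQ_ENTRY(local_tailq_entry) next;
	void *data;
};

TAILQ_HEAD(local_tailq_head, local_tailq_entry);

/* Track obj in this process's list; only local memory is written. */
static int
local_tailq_add(struct local_tailq_head *head, void *obj)
{
	struct local_tailq_entry *te;

	assert(obj != NULL);
	te = malloc(sizeof(*te));
	if (te == NULL)
		return -1;
	te->data = obj;
	TAILQ_INSERT_TAIL(head, te, next);
	return 0;
}

/* Walk the local nodes and compare names through the data pointer. */
static struct shared_obj *
local_tailq_lookup(struct local_tailq_head *head, const char *name)
{
	struct local_tailq_entry *te;

	TAILQ_FOREACH(te, head, next) {
		struct shared_obj *obj = te->data;

		if (strcmp(obj->name, name) == 0)
			return obj;
	}
	return NULL;
}
```

In this sketch, adding an object never modifies any previously added
object, so a peer that maps only some of the objects never sees a link
into memory it has not mapped.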

Acked-by: Konstantin Ananyev <konstantin.ananyev@intel.com>