From: "Van Haaren, Harry"
To: Honnappa Nagarahalli, Phil Yang, thomas@monjalon.net, "Ananyev, Konstantin", stephen@networkplumber.org, maxime.coquelin@redhat.com, dev@dpdk.org
Cc: david.marchand@redhat.com, jerinj@marvell.com, hemant.agrawal@nxp.com, Gavin Hu, Ruifeng Wang, Joyce Kong, nd, stable@dpdk.org, nd
Date: Wed, 8 Apr 2020 18:05:50 +0000
Subject: Re: [PATCH v3 09/12] service: avoid race condition for MT unsafe service
References: <1583999071-22872-1-git-send-email-phil.yang@arm.com> <1584407863-774-1-git-send-email-phil.yang@arm.com> <1584407863-774-10-git-send-email-phil.yang@arm.com>
> -----Original Message-----
> From: Honnappa Nagarahalli
> Sent: Saturday, April 4, 2020 7:03 PM
> To: Van Haaren, Harry; Phil Yang; thomas@monjalon.net; Ananyev, Konstantin; stephen@networkplumber.org; maxime.coquelin@redhat.com; dev@dpdk.org
> Cc: david.marchand@redhat.com; jerinj@marvell.com; hemant.agrawal@nxp.com; Gavin Hu; Ruifeng Wang; Joyce Kong; nd; stable@dpdk.org; Honnappa Nagarahalli; nd
> Subject: RE: [PATCH v3 09/12] service: avoid race condition for MT unsafe service
>
> >
> > > Subject: [PATCH v3 09/12] service: avoid race condition for MT unsafe service
> > >
> > > From: Honnappa Nagarahalli
> > >
> > > It is possible that an MT unsafe service gets configured to run on
> > > another core while the service is currently running. This might
> > > result in the MT unsafe service running on multiple cores
> > > simultaneously. Use 'execute_lock' always when the service is MT
> > > unsafe.
> > >
> > > Fixes: e9139a32f6e8 ("service: add function to run on app lcore")
> > > Cc: stable@dpdk.org
> > >
> > > Signed-off-by: Honnappa Nagarahalli
> > > Reviewed-by: Phil Yang
> > > Reviewed-by: Gavin Hu
> >
> > We should put "fix" in the title, once converged on an implementation.
> Ok, will replace 'avoid' with 'fix' (once we agree on the solution)
>
> > Regarding Fixes and stable backport, we should consider whether fixing this
> > in stable with a performance degradation, fixing it with a more complex
> > solution, or documenting it as a known issue is the better approach.
> >
> > This fix (always taking the atomic lock) will have a negative performance
> > impact on existing code using services. We should investigate a way to fix
> > it without causing datapath performance degradation.
> Trying to gauge the impact on the existing applications...
> The documentation does not explicitly disallow run-time mapping of cores to
> services.
> 1) If applications are mapping cores to services at run time, they are
> running with a bug. IMO, a bug fix resulting in a performance drop should be
> acceptable.
> 2) If the service is configured to run on a single core
> (num_mapped_cores == 1), but the service is set to MT unsafe - this will
> have a (possible) performance impact.
>    a) This can be solved by setting the service to MT safe, and can be
> documented. This might be a reasonable solution for applications which are
> compiling with future DPDK releases.
>    b) We can also solve this using symbol versioning - the old version of
> this function will use the old code, the new version of this function will
> use the code in this patch. So, if the application is run with future DPDK
> releases without recompiling, it will continue to use the old version. If
> the application is compiled with future releases, it can use the solution
> in 2a. We should also consider whether this is an appropriate solution, as
> it would force 1) to recompile to get the fix.
> 3) If the service is configured to run on multiple cores
> (num_mapped_cores > 1), then for those applications the lock is being taken
> already. These applications might even see some improvement, as this patch
> removes a few instructions.
> >
> > I think there is a way to achieve this by moving more checks/time to the
> > control path (lcore updating the map), and not forcing the datapath lcore
> > to always take an atomic.
> I think 2a above is the solution.

2a above is e.g. the Eventdev SW routines like the Rx/Tx scheduler services.
We should strive to not reduce datapath performance at all here.

> > In this particular case, we have a counter for number of iterations that a
> Which counter are you thinking about?
> All the counters I checked are not atomic operations currently. If we are
> going to use counters they have to be atomic, which means additional cycles
> in the data path.

I'll try to explain the concept better; take this example:
 - One service core is mapped to a MT_UNSAFE service, like event/sw pmd
 - Application wants to map a 2nd lcore to the same service
 - You point out that today there is a race over the lock
 -- the lock is not taken if (num_mapped_lcores == 1)
 -- this avoids an atomic lock/unlock on the datapath

To achieve our desired goal:
 - the control thread doing the mapping performs the following operations
 -- write service->num_mapped_lcores++ (atomic not required, only single-writer allowed by the APIs)
 -- MFENCE (full st-ld barrier) to flush the write, and force later loads to issue after it
 -- read the "calls_per_service" counter for each lcore, add them up
 ---- Wait :)
 -- re-read the "calls_per_service", and ensure the count has changed
 ---- The fact that calls_per_service has changed ensures the service lcore
      has seen the new "num_mapped_cores" value, and has now taken the lock!
 -- *now* it is safe to map the 2nd lcore to the service

There is a caveat here: the increments of the "calls_per_service" variable
must become globally observable. To force this immediately would require a
write memory barrier, which could impact datapath performance. Given the
service is now taking a lock, the unlock() thereof would ensure the
"calls_per_service" is flushed to memory.

Note: we could use calls_per_service, or add a new variable to the service
struct. Writes to this do not need to be atomic, as it is either mapped to a
single core, or else there's a lock around it.

> > service has done. If this increments, we know that the lcore running the
> > service has re-entered the critical section, so it would see an updated
> > "needs atomic" flag.
> >
> > This approach may introduce a predictable branch on the datapath; however
> > the cost of a predictable branch vs always taking an atomic is order(s?)
> > of magnitude, so a branch is much preferred.
> >
> > It must be possible to avoid the datapath overhead using a scheme like
> > this. It will likely be more complex than your proposed change below;
> > however if it avoids datapath performance drops, I feel that a more
> > complex solution is worth investigating at least.
> I do not completely understand the approach you are proposing, maybe you
> can elaborate more.

Expanded above, showing a possible solution that does not require additional
atomics on the datapath.

> But, it seems to be based on a counter approach. Following is my assessment
> of what happens if we use a counter. Let us say we kept track of how many
> cores are running the service currently. We need an atomic counter other
> than 'num_mapped_cores'; let us call that counter 'num_current_cores'. The
> code to call the service would look like below.
>
> 1) rte_atomic32_inc(&num_current_cores); /* this results in a full memory
> barrier */
> 2) if (__atomic_load_n(&num_current_cores, __ATOMIC_ACQUIRE) == 1) { /*
> rte_atomic_read is not enough here as it does not provide the required
> memory barrier for any architecture */
> 3) run_service(); /* Call the service */
> 4) }
> 5) rte_atomic32_sub(&num_current_cores); /* Calling rte_atomic32_clear is
> not enough as it is not an atomic operation and does not provide the
> required memory barrier */
>
> But the above code has a race condition across lines 1 and 2. It is
> possible that none of the cores will ever get to run the service, as they
> all could simultaneously increment the counter. Hence lines 1 and 2
> together need to be atomic, which is nothing but a 'compare-exchange'
> operation.
>
> BTW, the current code has a bug where it calls
> 'rte_atomic_clear(&s->execute_lock)'; it is missing memory barriers, which
> results in clearing the execute_lock before the service has completed
> running. I suggest changing the 'execute_lock' to rte_spinlock_t and using
> the rte_spinlock_trylock and rte_spinlock_unlock APIs.

I don't think a spinlock is what we want here:

The idea is that a service lcore can be mapped to multiple services. If one
service is already being run (by another thread), we do not want to spin
here waiting for it to become "free" to run by this thread; it should
continue to the next service that it is mapped to.

> >
> > A unit test is required to validate a fix like this - although perhaps
> > found by inspection/review, a real-world test to validate would give
> > confidence.
> Agree, need to have a test case.
>
> > Thoughts on such an approach?