From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <matan@mellanox.com>
Received: from EUR02-AM5-obe.outbound.protection.outlook.com
 (mail-eopbgr00067.outbound.protection.outlook.com [40.107.0.67])
 by dpdk.org (Postfix) with ESMTP id 08B371B31A
 for <dev@dpdk.org>; Thu, 18 Jan 2018 15:45:02 +0100 (CET)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=Mellanox.com;
 s=selector1; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version;
 bh=ej2eiffuEJyqFp3oSQjEIeI2IPGLuT/9/gq33SwF/kg=;
 b=MEyOKEuygTftwQEFKqAKLYowLy8zpULO0UADYzjRT614uH+wtkASrtT7aTvP5qf2t9kuk/0M1iZr/u1vn4v441X6358S4Sgr8uSt6n9B+Cbr1LZVhpaKMDNvaTPPE63yA4eR46aeonAmDCv9Z/diTd/GgStJsl+bWbXbcx8afM4=
Received: from AM6PR0502MB3797.eurprd05.prod.outlook.com (52.133.21.26) by
 AM6PR0502MB3767.eurprd05.prod.outlook.com (52.133.21.20) with Microsoft SMTP
 Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384_P256) id
 15.20.428.17; Thu, 18 Jan 2018 14:45:00 +0000
Received: from AM6PR0502MB3797.eurprd05.prod.outlook.com
 ([fe80::6c28:c6b3:de94:a733]) by AM6PR0502MB3797.eurprd05.prod.outlook.com
 ([fe80::6c28:c6b3:de94:a733%13]) with mapi id 15.20.0428.014; Thu, 18 Jan
 2018 14:45:00 +0000
From: Matan Azrad <matan@mellanox.com>
To: "Ananyev, Konstantin" <konstantin.ananyev@intel.com>, Thomas Monjalon
 <thomas@monjalon.net>, Gaetan Rivet <gaetan.rivet@6wind.com>, "Wu, Jingjing"
 <jingjing.wu@intel.com>
CC: "dev@dpdk.org" <dev@dpdk.org>, Neil Horman <nhorman@tuxdriver.com>,
 "Richardson, Bruce" <bruce.richardson@intel.com>
Thread-Topic: [PATCH v2 2/6] ethdev: add port ownership
Thread-Index: AQHTihf/M9xg8LYorUSRFqZtTc27hqNtNdVQgAFomACAAAOCwIAAuwQAgABvY+CABQwJgIAAEk3ggABinoCAANUIAIAAxQGAgAAGCnCAAQnKAIAAAmNwgAAXCYCAAABqkIAAQhEAgAASagCAACqV0IABKccAgAABeuCAAAVxgIAAAFvA
Date: Thu, 18 Jan 2018 14:45:00 +0000
Message-ID: <AM6PR0502MB3797B4DEAF05DE7171370660D2E80@AM6PR0502MB3797.eurprd05.prod.outlook.com>
References: <1511870281-15282-1-git-send-email-matan@mellanox.com>
 <1515318351-4756-1-git-send-email-matan@mellanox.com>
 <1515318351-4756-3-git-send-email-matan@mellanox.com>
 <2601191342CEEE43887BDE71AB97725880E3B9D6@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB379755992EDDF002D06D9521D2110@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588627B12A@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB379766B74D46E3110A21D089D2160@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588627CCB0@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB37972AAC7DBEA5CB5F52A78DD2170@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588627DC25@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB3797650D307664AD9024D927D2EB0@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588627DE30@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB3797CBF03D656EE2B103E640D2EA0@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588627E954@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB3797F16A8B4FE5FF9AE47822D2EA0@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588627EE60@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB3797DAA020B77E44DD599688D2E90@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588627EEDA@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB3797D94530A2DCAA50CA10FAD2E90@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588627F076@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB3797AAB93D5FCF09C1ABF364D2E90@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588628029A@irsmsx105.ger.corp.intel.com>
 <AM6PR0502MB37974A580CB5C6A1BB806AD4D2E80@AM6PR0502MB3797.eurprd05.prod.outlook.com>
 <2601191342CEEE43887BDE71AB9772588628032A@irsmsx105.ger.corp.intel.com>
In-Reply-To: <2601191342CEEE43887BDE71AB9772588628032A@irsmsx105.ger.corp.intel.com>
Accept-Language: en-US, he-IL
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
authentication-results: spf=none (sender IP is )
 smtp.mailfrom=matan@mellanox.com; 
x-originating-ip: [193.47.165.251]
x-ms-publictraffictype: Email
x-microsoft-exchange-diagnostics: 1; AM6PR0502MB3767;
 7:6/WKmxhHmlqTzkiTGtb0q5HQ7F9eG0o+uK8ArJACWNOiuZzomr3nmqQqqPSQ9tQS5sTMyWwftG2v4MqDEgwBgiWBr1psZRfXSIbahuH2tKyhA+BYhdJrBcfSMpqkfiNUCgFVrA/Qigz7Y1Yum2TvcM/yAoxUAXhlCV0HJao97XeXsjxtA6V7+8bhtLqwxn9tFdaV9l9x1ua4RTfEUdqtDP6un8HiFrgqqB7K3jFQWdLRqkm5XsuhAeuFZQJOSeFB
x-ms-exchange-antispam-srfa-diagnostics: SSOS;
x-ms-office365-filtering-correlation-id: 6caa52ff-e2b0-4abe-ebe4-08d55e820d3e
x-ms-office365-filtering-ht: Tenant
x-microsoft-antispam: UriScan:; BCL:0; PCL:0;
 RULEID:(7020095)(4652020)(5600026)(4604075)(3008032)(48565401081)(2017052603307)(7153060)(7193020);
 SRVR:AM6PR0502MB3767; 
x-ms-traffictypediagnostic: AM6PR0502MB3767:
x-ld-processed: a652971c-7d2e-4d9b-a6a4-d149256f461b,ExtAddr
x-microsoft-antispam-prvs: <AM6PR0502MB37677B2117842E4159D53A86D2E80@AM6PR0502MB3767.eurprd05.prod.outlook.com>
x-exchange-antispam-report-test: UriScan:(60795455431006)(17755550239193);
x-exchange-antispam-report-cfa-test: BCL:0; PCL:0;
 RULEID:(6040470)(2401047)(5005006)(8121501046)(93006095)(93001095)(3002001)(10201501046)(3231023)(2400063)(944501161)(6055026)(6041268)(20161123558120)(20161123564045)(201703131423095)(201702281528075)(20161123555045)(201703061421075)(201703061406153)(20161123560045)(20161123562045)(6072148)(201708071742011);
 SRVR:AM6PR0502MB3767; BCL:0; PCL:0; RULEID:(100000803101)(100110400095);
 SRVR:AM6PR0502MB3767; 
x-forefront-prvs: 05568D1FF7
x-forefront-antispam-report: SFV:NSPM;
 SFS:(10009020)(366004)(396003)(346002)(376002)(39860400002)(39380400002)(189003)(199004)(52314003)(66066001)(99286004)(25786009)(4326008)(2906002)(14454004)(76176011)(59450400001)(33656002)(102836004)(6506007)(6246003)(7696005)(8936002)(8676002)(81166006)(81156014)(478600001)(106356001)(105586002)(6436002)(6116002)(3846002)(68736007)(7736002)(9686003)(74316002)(54906003)(305945005)(2950100002)(3660700001)(229853002)(86362001)(110136005)(316002)(55016002)(5660300001)(97736004)(93886005)(53936002)(26005)(5890100001)(2900100001)(3280700002)(5250100002)(21314002);
 DIR:OUT; SFP:1101; SCL:1; SRVR:AM6PR0502MB3767;
 H:AM6PR0502MB3797.eurprd05.prod.outlook.com; FPR:; SPF:None; PTR:InfoNoRecords;
 MX:1; A:1; LANG:en; 
received-spf: None (protection.outlook.com: mellanox.com does not designate
 permitted sender hosts)
x-microsoft-antispam-message-info: x0xj4hfMUhm+whINYjBY2GoXFn9HCwRddliaHNBjyL78fQOv2KGrUsUh9ZbEMn79HDJku481wQZmA7qb0cLpuw==
spamdiagnosticoutput: 1:99
spamdiagnosticmetadata: NSPM
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-OriginatorOrg: Mellanox.com
X-MS-Exchange-CrossTenant-Network-Message-Id: 6caa52ff-e2b0-4abe-ebe4-08d55e820d3e
X-MS-Exchange-CrossTenant-originalarrivaltime: 18 Jan 2018 14:45:00.2686 (UTC)
X-MS-Exchange-CrossTenant-fromentityheader: Hosted
X-MS-Exchange-CrossTenant-id: a652971c-7d2e-4d9b-a6a4-d149256f461b
X-MS-Exchange-Transport-CrossTenantHeadersStamped: AM6PR0502MB3767
Subject: Re: [dpdk-dev] [PATCH v2 2/6] ethdev: add port ownership
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Thu, 18 Jan 2018 14:45:02 -0000

HI

From: Ananyev, Konstantin, Thursday, January 18, 2018 4:42 PM
> > Hi Konstantine
> >
> > > Hi Matan,
> > >
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > Another thing - you'll
> > > > > > > > > > > > > > > > > > > > > > probably need to
> > > > > > > > grab/release
> > > > > > > > > > > > > > > > > > > > > > a lock inside
> > > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated() too.
> > > > > > > > > > > > > > > > > > > > > > It is a public function used
> > > > > > > > > > > > > > > > > > > > > > by drivers, so need to be
> > > > > > > > > > > > > > > > > > > > > > protected
> > > > > > > > > > > > > > too.
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Yes, I thought about it, but
> > > > > > > > > > > > > > > > > > > > > decided not to use lock in
> > > > > > > > > > next:
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_allocated
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_count
> > > > > > > > > > > > > > > > > > > > > rte_eth_dev_get_name_by_port
> > > > > > > > > > > > > > rte_eth_dev_get_port_by_name
> > > > > > > > > > > > > > > > > > > > > maybe more...
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > As I can see in patch #3 you
> > > > > > > > > > > > > > > > > > > > protect by lock access to
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name (which
> > > > > > > > > > > > > > > > > > > > seems like a good
> > > > > > > > > > thing).
> > > > > > > > > > > > > > > > > > > > So I think any other public
> > > > > > > > > > > > > > > > > > > > function that access
> > > > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name should be
> > > > > > > > > > > > > > > > > > > > protected by the
> > > > > > > > > > same
> > > > > > > > > > > > lock.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I don't think so, I can understand
> > > > > > > > > > > > > > > > > > > to use the ownership lock here(as in
> > > > > > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > > > > > creation) but I don't think it is neces=
sary too.
> > > > > > > > > > > > > > > > > > > What are we exactly protecting here?
> > > > > > > > > > > > > > > > > > > Don't you think it is just
> > > > > > > > > > > > > > > > > > > timing?(ask in the next moment and
> > > > > > > > > > > > > > > > > > > you may get another
> > > > > > > > > > > > > > > > > > > answer) I don't see optional
> > > > > > > > crash.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Not sure what you mean here by timing..=
.
> > > > > > > > > > > > > > > > > > As I understand
> > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name unique
> > > > > > > > identifies
> > > > > > > > > > > > > > > > > > device and is used by  port
> > > > > > > > > > > > > > > > > > allocation/release/find
> > > > > > > > functions.
> > > > > > > > > > > > > > > > > > As you stated above:
> > > > > > > > > > > > > > > > > > "1. The port allocation and port
> > > > > > > > > > > > > > > > > > release synchronization will be managed=
 by
> ethdev."
> > > > > > > > > > > > > > > > > > To me it means that ethdev layer has
> > > > > > > > > > > > > > > > > > to make sure that all accesses to
> > > > > > > > > > > > > > > > > > rte_eth_dev_data[].name are
> > > > > > atomic.
> > > > > > > > > > > > > > > > > > Otherwise what would prevent the
> > > > > > > > > > > > > > > > > > situation when one
> > > > > > > > > > process
> > > > > > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > > > rte_eth_dev_allocate()-
> > > > > > > > >snprintf(rte_eth_dev_data[x].name,
> > > > > > > > > > > > > > > > > > ...) while second one does
> > > > > > > > > > > > > > > > rte_eth_dev_allocated(rte_eth_dev_data[x].
> > > > > > > > > > > > > > > > name,
> > > ...) ?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The second will get True or False and tha=
t is it.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Under race condition - in the worst case
> > > > > > > > > > > > > > > > it might crash, though for that you'll
> > > > > > > > > > > > > > > > have to be really
> > > unlucky.
> > > > > > > > > > > > > > > > Though in most cases as you said it would
> > > > > > > > > > > > > > > > just not operate
> > > > > > > > > > correctly.
> > > > > > > > > > > > > > > > I think if we start to protect dev->name
> > > > > > > > > > > > > > > > by lock we need to do it for all instances
> > > > > > > > > > > > > > > > (both read and
> > > write).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Since under the ownership rules, the user
> > > > > > > > > > > > > > > must take ownership
> > > > > > > > of a
> > > > > > > > > > > > > > > port
> > > > > > > > > > > > > > before using it, I still don't see a problem he=
re.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I am not talking about owner id or name here.
> > > > > > > > > > > > > > I am talking about dev->name.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > So? The user still should take ownership of a
> > > > > > > > > > > > > device before using it
> > > > > > > > (by
> > > > > > > > > > > > name or by port id).
> > > > > > > > > > > > > It can just read it without owning it, but no man=
aging it.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > Please, Can you describe specific crash
> > > > > > > > > > > > > > > scenario and explain how could the
> > > > > > > > > > > > > > locking fix it?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Let say thread 0 doing rte_eth_dev_allocate()-
> > > > > > > > > > > > > > >snprintf(rte_eth_dev_data[x].name, ...),
> > > > > > > > > > > > > > >thread 1 doing
> > > > > > > > > > > > > > rte_pmd_ring_remove()->rte_eth_dev_allocated()
> > > > > > > > > > > > > > -
> > > > > > >strcmp().
> > > > > > > > > > > > > > And because of race condition -
> > > > > > > > > > > > > > rte_eth_dev_allocated() will
> > > > > > > > return
> > > > > > > > > > > > > > rte_eth_dev * for the wrong device.
> > > > > > > > > > > > > Which wrong device do you mean? I guess it is
> > > > > > > > > > > > > the device which
> > > > > > > > > > currently is
> > > > > > > > > > > > being created by thread 0.
> > > > > > > > > > > > > > Then rte_pmd_ring_remove() will call
> > > > > > > > > > > > > > rte_free() for related resources, while It can
> > > > > > > > > > > > > > still be in use by someone
> > > > > else.
> > > > > > > > > > > > > The rte_pmd_ring_remove caller(some DPDK entity)
> > > > > > > > > > > > > must take
> > > > > > > > > > ownership
> > > > > > > > > > > > > (or validate that he is the owner) of a port
> > > > > > > > > > > > > before doing it(free,
> > > > > > > > > > release), so
> > > > > > > > > > > > no issue here.
> > > > > > > > > > > >
> > > > > > > > > > > > Forget about ownership for a second.
> > > > > > > > > > > > Suppose we have a process it created ring port for
> > > > > > > > > > > > itself (without
> > > > > > > > setting
> > > > > > > > > > any
> > > > > > > > > > > > ownership)  and used it for some time.
> > > > > > > > > > > > Then it decided to remove it, so it calls
> > > > > > > > > > > > rte_pmd_ring_remove()
> > > > > > for it.
> > > > > > > > > > > > At the same time second process decides to call
> > > > > > > > rte_eth_dev_allocate()
> > > > > > > > > > (let
> > > > > > > > > > > > say for anither ring port).
> > > > > > > > > > > > They could collide trying to read (process 0) and
> > > > > > > > > > > > modify (process 1)
> > > > > > > > same
> > > > > > > > > > > > string rte_eth_dev_data[].name.
> > > > > > > > > > > >
> > > > > > > > > > > Do you mean that process 0 will compare successfully
> > > > > > > > > > > the process 1
> > > > > > > > new
> > > > > > > > > > port name?
> > > > > > > > > >
> > > > > > > > > > Yes.
> > > > > > > > > >
> > > > > > > > > > > The state are in local process memory - so process 0
> > > > > > > > > > > will not compare
> > > > > > > > the
> > > > > > > > > > process 1 port, from its point of view this port is in
> > > > > > > > > > UNUSED
> > > > > > > > > > > state.
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Ok, and why it can't be in attached state in process 0 =
too?
> > > > > > > > >
> > > > > > > > > Someone in process 0 should attach it using protected
> > > > > > > > > attach_secondary
> > > > > > > > somewhere in your scenario.
> > > > > > > >
> > > > > > > > Yes, process 0 can have this port attached too, why not?
> > > > > > > See the function with inline comments:
> > > > > > >
> > > > > > > struct rte_eth_dev *
> > > > > > > rte_eth_dev_allocated(const char *name) {
> > > > > > > 	unsigned i;
> > > > > > >
> > > > > > > 	for (i =3D 0; i < RTE_MAX_ETHPORTS; i++) {
> > > > > > >
> > > > > > > 	    	The below state are in local process memory,
> > > > > > > 		So, if here process 1 will allocate a new port (the
> > > > > > > current i),
> > > > > > update its local state to ATTACHED and write the name,
> > > > > > > 		the state is not visible by process 0 until someone in
> > > > > > > process
> > > > > > 0 will attach it by rte_eth_dev_attach_secondary.
> > > > > > > 		So, to use rte_eth_dev_attach_secondary process 0
> must
> > > > > > take the lock
> > > > > > > and it can't, because it is currently locked by process 1.
> > > > > >
> > > > > > Ok I see.
> > > > > > Thanks for your patience.
> > > > > > BTW, that means that if let say process 0 will call
> > > > > > rte_eth_dev_allocate("xxx") and process 1 will call
> > > > > > rte_eth_dev_allocate("yyy") we can endup with same port_id be
> > > > > > used for different devices and 2 processes will overwrite the
> > > > > > same
> > > > > rte_eth_dev_data[port_id]?
> > > > >
> > > > > No, contrary to the state, the lock itself is in shared memory,
> > > > > so 2 processes cannot allocate port in the same time.(you can
> > > > > see it in the next patch of this series).
> > >
> > > I am not talking about racing here.
> > > Let say process 0 calls rte_pmd_ring_probe()->....-
> > > >rte_eth_dev_allocate("xxx")
> > > rte_eth_dev_allocate() finds that port N is 'free', i.e.
> > > local rte_eth_devices[N].state =3D=3D RTE_ETH_DEV_UNUSED so it assign=
s
> > > new dev ("xxx") to port N.
> > > Then after some time process 1 calls rte_pmd_ring_probe()->....-
> > > >rte_eth_dev_allocate("yyy").
> > > From its perspective port N is still free:  rte_eth_devices[N].state
> > > =3D=3D RTE_ETH_DEV_UNUSED, so it will assign new dev ("yyy") to the s=
ame
> port.
> > >
> >
> > Yes you right, this is a problem(not related actually to port
> > ownership)
>=20
> Yep that's true - it was there before your patches.
>=20
> > but look:
> > As I understand the secondary processes are not allowed to create a
> > ports and they must to use attach_secondary API, but there is not
> hardcoded check which prevent them to do it.
>=20
> Secondary processes ae the ability to allocate their own vdevs and probab=
ly it
> should stay like that.
> I just thought it is a good opportunity to fix it while you are on these =
changes
> anyway, but ok we can leave it for now.
>=20
Looks like the fix should break ABI(moving the state to the shared memory),=
 let's try to fix it in the next version :)

> Konstantin