From: "Van Haaren, Harry"
To: Morten Brørup, "Richardson, Bruce"
Cc: Maxime Coquelin, "Pai G, Sunil", "Stokes, Ian", "Hu, Jiayu", "Ferriter, Cian", Ilya Maximets, ovs-dev@openvswitch.org, dev@dpdk.org, "Mcnamara, John", "O'Driscoll, Tim", "Finn, Emma"
Subject: RE: OVS DPDK DMA-Dev library/Design Discussion
Date: Thu, 7 Apr 2022 14:04:27 +0000
List-Id: DPDK patches and discussions

Hi OVS & DPDK, Maintainers & Community,

Top-posting an overview of the discussion, as replies to the thread have become slower: perhaps it is a good time to review and plan for next steps?

From my perspective, those most vocal in the thread seem to be in favour of the clean rx/tx split ("defer work"), with the tradeoff that the application must be aware of handling the async DMA completions. If there are any concerns opposing upstreaming of this method, please indicate this promptly, and we can continue technical discussions here now.

In absence of continued technical discussion here, I suggest Sunil and Ian collaborate on getting the OVS Defer-work approach and the DPDK VHost Async patchsets available on GitHub for easier consumption and future development (as suggested in slides presented on the last call).

Regards, -Harry

No inline replies below; message just for context.
> -----Original Message-----
> From: Van Haaren, Harry
> Sent: Wednesday, March 30, 2022 10:02 AM
> To: Morten Brørup; Richardson, Bruce
> Cc: Maxime Coquelin; Pai G, Sunil; Stokes, Ian; Hu, Jiayu; Ferriter, Cian; Ilya Maximets; ovs-dev@openvswitch.org; dev@dpdk.org; Mcnamara, John; O'Driscoll, Tim; Finn, Emma
> Subject: RE: OVS DPDK DMA-Dev library/Design Discussion
>
> > -----Original Message-----
> > From: Morten Brørup
> > Sent: Tuesday, March 29, 2022 8:59 PM
> > To: Van Haaren, Harry; Richardson, Bruce
> > Cc: Maxime Coquelin; Pai G, Sunil; Stokes, Ian; Hu, Jiayu; Ferriter, Cian; Ilya Maximets; ovs-dev@openvswitch.org; dev@dpdk.org; Mcnamara, John; O'Driscoll, Tim; Finn, Emma
> > Subject: RE: OVS DPDK DMA-Dev library/Design Discussion
> >
> > > From: Van Haaren, Harry [mailto:harry.van.haaren@intel.com]
> > > Sent: Tuesday, 29 March 2022 19.46
> > >
> > > > From: Morten Brørup
> > > > Sent: Tuesday, March 29, 2022 6:14 PM
> > > >
> > > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > > Sent: Tuesday, 29 March 2022 19.03
> > > > >
> > > > > On Tue, Mar 29, 2022 at 06:45:19PM +0200, Morten Brørup wrote:
> > > > > > > From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> > > > > > > Sent: Tuesday, 29 March 2022 18.24
> > > > > > >
> > > > > > > Hi Morten,
> > > > > > >
> > > > > > > On 3/29/22 16:44, Morten Brørup wrote:
> > > > > > > >> From: Van Haaren, Harry [mailto:harry.van.haaren@intel.com]
> > > > > > > >> Sent: Tuesday, 29 March 2022 15.02
> > > > > > > >>
> > > > > > > >>> From: Morten Brørup
> > > > > > > >>> Sent: Tuesday, March 29, 2022 1:51 PM
> > > > > > > >>>
> > > > > > > >>> Having thought more about it, I think that a completely different architectural approach is required:
> > > > > > > >>>
> > > > > > > >>> Many of the DPDK Ethernet PMDs implement a variety of RX and TX packet burst functions, each optimized for different CPU vector instruction sets. The availability of a DMA engine should be treated the same way. So I suggest that PMDs copying packet contents, e.g. memif, pcap, vmxnet3, should implement DMA-optimized RX and TX packet burst functions.
> > > > > > > >>>
> > > > > > > >>> Similarly for the DPDK vhost library.
> > > > > > > >>>
> > > > > > > >>> In such an architecture, it would be the application's job to allocate DMA channels and assign them to the specific PMDs that should use them. But the actual use of the DMA channels would move down below the application and into the DPDK PMDs and libraries.
> > > > > > > >>>
> > > > > > > >>> Med venlig hilsen / Kind regards,
> > > > > > > >>> -Morten Brørup
> > > > > > > >>
> > > > > > > >> Hi Morten,
> > > > > > > >>
> > > > > > > >> That's *exactly* how this architecture is designed & implemented.
> > > > > > > >> 1. The DMA configuration and initialization is up to the application (OVS).
> > > > > > > >> 2. The VHost library is passed the DMA-dev ID, and its new async rx/tx APIs, and uses the DMA device to accelerate the copy.
> > > > > > > >>
> > > > > > > >> Looking forward to talking on the call that just started. Regards, -Harry
> > > > > > > >
> > > > > > > > OK, thanks - as I said on the call, I haven't looked at the patches.
> > > > > > > >
> > > > > > > > Then, I suppose that the TX completions can be handled in the TX function, and the RX completions can be handled in the RX function, just like the Ethdev PMDs handle packet descriptors:
> > > > > > > >
> > > > > > > > TX_Burst(tx_packet_array):
> > > > > > > > 1. Clean up descriptors processed by the NIC chip. --> Process TX DMA channel completions. (Effectively, the 2nd pipeline stage.)
> > > > > > > > 2. Pass on the tx_packet_array to the NIC chip descriptors. --> Pass on the tx_packet_array to the TX DMA channel. (Effectively, the 1st pipeline stage.)
> > > > > > >
> > > > > > > The problem is that the Tx function might not be called again, so packets enqueued in 2. may never be completed from a Virtio point of view. IOW, the packets will be copied to the Virtio descriptors' buffers, but the descriptors will not be made available to the Virtio driver.
> > > > > >
> > > > > > In that case, the application needs to call TX_Burst() periodically with an empty array, for completion purposes.
> > >
> > > This is what the "defer work" does at the OVS thread level, but instead of "brute-forcing" and *always* making the call, the defer work concept tracks *when* there is outstanding work (DMA copies) to be completed ("deferred work") and calls the generic completion function at that point.
> > >
> > > So "defer work" is generic infrastructure at the OVS thread level to handle work that needs to be done "later", e.g. DMA completion handling.
> > >
> > > > > > Or some sort of TX_Keepalive() function can be added to the DPDK library, to handle DMA completion. It might even handle multiple DMA channels, if convenient - and if possible without locking or other weird complexity.
> > >
> > > That's exactly how it is done, the VHost library has a new API added, which allows for handling completions.
> > > And in the "Netdev layer" (~OVS ethdev abstraction) we add a function to allow the OVS thread to do those completions in a new Netdev-abstraction API called "async_process" where the completions can be checked.
> > >
> > > The only method to abstract them is to "hide" them somewhere that will always be polled, e.g. an ethdev port's RX function. Both V3 and V4 approaches use this method. This allows "completions" to be transparent to the app, at the tradeoff of having bad separation of concerns as Rx and Tx are now tied together.
> > >
> > > The point is, the Application layer must *somehow* handle completions. So fundamentally there are 2 options for the Application level:
> > >
> > > A) Make the application periodically call a "handle completions" function.
> > >    A1) Defer work: call when needed, track "needed" at the app layer, and call into vhost txq complete as required. Elegant in that "no work" means "no cycles spent" on checking DMA completions.
> > >    A2) Brute-force-always-call, and pay some overhead when not required. Cycle-cost in "no work" scenarios. Depending on # of vhost queues, this adds up as polling is required *per vhost txq*. Also note that "checking DMA completions" means taking a virtq-lock, so this "brute-force" can needlessly increase cross-thread contention!
> >
> > A side note: I don't see why locking is required to test for DMA completions. rte_dma_vchan_status() is lockless, e.g.:
> > https://elixir.bootlin.com/dpdk/latest/source/drivers/dma/ioat/ioat_dmadev.c#L560
>
> Correct, DMA-dev is "ethdev like"; each DMA-id can be used in a lockfree manner from a single thread.
>
> The locks I refer to are at the OVS-netdev level, as virtqs are shared across OVS's dataplane threads. So the "M to N" comes from M dataplane threads to N virtqs, hence requiring some locking.
>
> > > B) Hide completions and live with the complexity/architectural sacrifice of mixed-RxTx. Various downsides here in my opinion; see the slide deck presented earlier today for a summary.
> > >
> > > In my opinion, A1 is the most elegant solution, as it has a clean separation of concerns, does not cause avoidable contention on virtq locks, and spends no cycles when there is no completion work to do.
> >
> > Thank you for elaborating, Harry.
>
> Thanks for taking part in the discussion & providing your insight!
>
> > I strongly oppose hiding any part of TX processing in an RX function. It is just wrong in so many ways!
> >
> > I agree that A1 is the most elegant solution. And being the most elegant solution, it is probably also the most future-proof solution. :-)
>
> I think so too, yes.
>
> > I would also like to stress that DMA completion handling belongs in the DPDK library, not in the application. And yes, the application will be required to call some "handle DMA completions" function in the DPDK library. But since the application already knows that it uses DMA, the application should also know that it needs to call this extra function - so I consider this requirement perfectly acceptable.
>
> Agree here.
>
> > I prefer if the DPDK vhost library can hide its inner workings from the application, and just expose the additional "handle completions" function. This also means that the inner workings can be implemented as "defer work", or by some other algorithm. And it can be tweaked and optimized later.
>
> Yes, the choice in how to call the handle_completions function is up to the Application layer. For OVS we designed Defer Work, V3 and V4.
> But it is an App-level choice, and every application is free to choose its own method.
>
> > Thinking about the long-term perspective, this design pattern is common for both the vhost library and other DPDK libraries that could benefit from DMA (e.g. vmxnet3 and pcap PMDs), so it could be abstracted into the DMA library or a separate library. But for now, we should focus on the vhost use case, and just keep the long-term roadmap for using DMA in mind.
>
> Totally agree to keep the long-term roadmap in mind; but I'm not sure we can refactor logic out of vhost. When DMA completions arrive, the virtq needs to be updated; this causes a tight coupling between the DMA completion count and the vhost library.
>
> As Ilya raised on the call yesterday, there is an "in_order" requirement in the vhost library: per virtq, the packets are presented to the guest "in order" of enqueue. (To be clear, *not* order of DMA completion! As Jiayu mentioned, the Vhost library handles this today by re-ordering the DMA completions.)
>
> > Rephrasing what I said on the conference call: This vhost design will become the common design pattern for using DMA in DPDK libraries. If we get it wrong, we are stuck with it.
>
> Agree, and if we get it right, then we're stuck with it too! :)
>
> > > > > > Here is another idea, inspired by a presentation at one of the DPDK Userspace conferences. It may be wishful thinking, though:
> > > > > >
> > > > > > Add an additional transaction to each DMA burst; a special transaction containing the memory write operation that makes the descriptors available to the Virtio driver.
> > > > >
> > > > > That is something that can work, so long as the receiver is operating in polling mode. For cases where virtio interrupts are enabled, you still need to do a write to the eventfd in the kernel in vhost to signal the virtio side. That's not something that can be offloaded to a DMA engine, sadly, so we still need some form of completion call.
> > > >
> > > > I guess that virtio interrupts is the most widely deployed scenario, so let's ignore the DMA TX completion transaction for now - and call it a possible future optimization for specific use cases. So it seems that some form of completion call is unavoidable.
> > >
> > > Agree to leave this aside; there is in theory a potential optimization, but it is unlikely to be of large value.
>
> > One more thing: When using DMA to pass on packets into a guest, there could be a delay from when the DMA completes until the guest is signaled. Is there any CPU cache hotness regarding the guest's access to the packet data to consider here? I.e. if we wait with signaling the guest, the packet data may get cold.
>
> Interesting question; we can likely spawn a new thread around this topic! In short, it depends on how/where the DMA hardware writes the copy.
>
> With technologies like DDIO, the "dest" part of the copy will be in LLC. The core reading the dest data will benefit from the LLC locality (instead of snooping it from a remote core's L1/L2).
>
> Delays in notifying the guest could result in LLC capacity eviction, yes. The application layer decides how often/promptly to check for completions, and notify the guest of them. Calling the function more often will result in less delay in that portion of the pipeline.
>
> Overall, there are caching benefits with DMA acceleration, and the application can control the latency introduced between DMA completion done in HW and the guest vring update.