From mboxrd@z Thu Jan  1 00:00:00 1970
From: "Van Haaren, Harry" <harry.van.haaren@intel.com>
To: Morten Brørup, "Richardson, Bruce" <bruce.richardson@intel.com>
CC: Maxime Coquelin, "Pai G, Sunil", "Stokes, Ian", "Hu, Jiayu", "Ferriter, Cian", Ilya Maximets, ovs-dev@openvswitch.org, dev@dpdk.org, "Mcnamara, John", "O'Driscoll, Tim", "Finn, Emma"
Subject: RE: OVS DPDK DMA-Dev library/Design Discussion
Date: Wed, 30 Mar 2022 09:01:32 +0000
In-Reply-To: <98CBD80474FA8B44BF855DF32C47DC35D86F82@smartserver.smartshare.dk>
List-Id: DPDK patches and discussions
> -----Original Message-----
> From: Morten Brørup
> Sent: Tuesday, March 29, 2022 8:59 PM
> To: Van Haaren, Harry; Richardson, Bruce
> Cc: Maxime Coquelin; Pai G, Sunil; Stokes, Ian; Hu, Jiayu; Ferriter, Cian;
> Ilya Maximets; ovs-dev@openvswitch.org; dev@dpdk.org; Mcnamara, John;
> O'Driscoll, Tim; Finn, Emma
> Subject: RE: OVS DPDK DMA-Dev library/Design Discussion
>
> > From: Van Haaren, Harry [mailto:harry.van.haaren@intel.com]
> > Sent: Tuesday, 29 March 2022 19.46
> >
> > > From: Morten Brørup
> > > Sent: Tuesday, March 29, 2022 6:14 PM
> > >
> > > > From: Bruce Richardson [mailto:bruce.richardson@intel.com]
> > > > Sent: Tuesday, 29 March 2022 19.03
> > > >
> > > > On Tue, Mar 29, 2022 at 06:45:19PM +0200, Morten Brørup wrote:
> > > > > > From: Maxime Coquelin [mailto:maxime.coquelin@redhat.com]
> > > > > > Sent: Tuesday, 29 March 2022 18.24
> > > > > >
> > > > > > Hi Morten,
> > > > > >
> > > > > > On 3/29/22 16:44, Morten Brørup wrote:
> > > > > > >> From: Van Haaren, Harry [mailto:harry.van.haaren@intel.com]
> > > > > > >> Sent: Tuesday, 29 March 2022 15.02
> > > > > > >>
> > > > > > >>> From: Morten Brørup
> > > > > > >>> Sent: Tuesday, March 29, 2022 1:51 PM
> > > > > > >>>
> > > > > > >>> Having thought more about it, I think that a completely different
> > > > > > >>> architectural approach is required:
> > > > > > >>>
> > > > > > >>> Many of the DPDK Ethernet PMDs implement a variety of RX and TX
> > > > > > >>> packet burst functions, each optimized for different CPU vector
> > > > > > >>> instruction sets. The availability of a DMA engine should be
> > > > > > >>> treated the same way. So I suggest that PMDs copying packet
> > > > > > >>> contents, e.g. memif, pcap, vmxnet3, should implement DMA
> > > > > > >>> optimized RX and TX packet burst functions.
> > > > > > >>>
> > > > > > >>> Similarly for the DPDK vhost library.
> > > > > > >>>
> > > > > > >>> In such an architecture, it would be the application's job to
> > > > > > >>> allocate DMA channels and assign them to the specific PMDs that
> > > > > > >>> should use them. But the actual use of the DMA channels would
> > > > > > >>> move down below the application and into the DPDK PMDs and
> > > > > > >>> libraries.
> > > > > > >>>
> > > > > > >>> Med venlig hilsen / Kind regards,
> > > > > > >>> -Morten Brørup
> > > > > > >>
> > > > > > >> Hi Morten,
> > > > > > >>
> > > > > > >> That's *exactly* how this architecture is designed & implemented.
> > > > > > >> 1. The DMA configuration and initialization is up to the
> > > > > > >>    application (OVS).
> > > > > > >> 2. The VHost library is passed the DMA-dev ID, and its new async
> > > > > > >>    rx/tx APIs, and uses the DMA device to accelerate the copy.
> > > > > > >>
> > > > > > >> Looking forward to talking on the call that just started.
> > > > > > >> Regards, -Harry
> > > > > > >>
> > > > > > >
> > > > > > > OK, thanks - as I said on the call, I haven't looked at the
> > > > > > > patches.
> > > > > > >
> > > > > > > Then, I suppose that the TX completions can be handled in the TX
> > > > > > > function, and the RX completions can be handled in the RX function,
> > > > > > > just like the Ethdev PMDs handle packet descriptors:
> > > > > > >
> > > > > > > TX_Burst(tx_packet_array):
> > > > > > > 1. Clean up descriptors processed by the NIC chip. --> Process TX
> > > > > > >    DMA channel completions. (Effectively, the 2nd pipeline stage.)
> > > > > > > 2. Pass on the tx_packet_array to the NIC chip descriptors. -->
> > > > > > >    Pass on the tx_packet_array to the TX DMA channel. (Effectively,
> > > > > > >    the 1st pipeline stage.)
> > > > > >
> > > > > > The problem is the Tx function might not be called again, so enqueued
> > > > > > packets in 2. may never be completed from a Virtio point of view.
> > > > > > IOW, the packets will be copied to the Virtio descriptors buffers,
> > > > > > but the descriptors will not be made available to the Virtio driver.
> > > > >
> > > > > In that case, the application needs to call TX_Burst() periodically
> > > > > with an empty array, for completion purposes.
> >
> > This is what the "defer work" does at the OVS thread-level, but instead of
> > "brute-forcing" and *always* making the call, the defer work concept tracks
> > *when* there is outstanding work (DMA copies) to be completed ("deferred
> > work") and calls the generic completion function at that point.
> >
> > So "defer work" is generic infrastructure at the OVS thread level to handle
> > work that needs to be done "later", e.g. DMA completion handling.
> >
> > > > > Or some sort of TX_Keepalive() function can be added to the DPDK
> > > > > library, to handle DMA completion. It might even handle multiple DMA
> > > > > channels, if convenient - and if possible without locking or other
> > > > > weird complexity.
> >
> > That's exactly how it is done; the VHost library has a new API added, which
> > allows for handling completions. And in the "Netdev layer" (~OVS ethdev
> > abstraction) we add a function to allow the OVS thread to do those
> > completions in a new Netdev-abstraction API called "async_process" where
> > the completions can be checked.
> >
> > The only method to abstract them is to "hide" them somewhere that will
> > always be polled, e.g. an ethdev port's RX function. Both V3 and V4
> > approaches use this method. This allows "completions" to be transparent to
> > the app, at the tradeoff of having bad separation of concerns, as Rx and Tx
> > are now tied together.
> >
> > The point is, the Application layer must *somehow* handle completions.
> > So fundamentally there are 2 options for the Application level:
> >
> > A) Make the application periodically call a "handle completions" function.
> >    A1) Defer work: call when needed, track "needed" at the app layer, and
> >        call into vhost txq complete as required. Elegant in that "no work"
> >        means "no cycles spent" on checking DMA completions.
> >    A2) Brute-force-always-call, and pay some overhead when not required.
> >        Cycle-cost in "no work" scenarios. Depending on the # of vhost
> >        queues, this adds up, as polling is required *per vhost txq*. Also
> >        note that "checking DMA completions" means taking a virtq-lock, so
> >        this "brute-force" can needlessly increase cross-thread contention!
>
> A side note: I don't see why locking is required to test for DMA
> completions. rte_dma_vchan_status() is lockless, e.g.:
> https://elixir.bootlin.com/dpdk/latest/source/drivers/dma/ioat/ioat_dmadev.c#L560

Correct, DMA-dev is "ethdev like"; each DMA-id can be used in a lock-free
manner from a single thread.
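To make option A1 above concrete, here is a minimal, self-contained sketch of
the "defer work" bookkeeping: mark which txqs have outstanding DMA copies, and
poll completions only for those, so "no work" really does cost no cycles. All
names here (defer_work_mark, etc.) are invented for illustration - they are
not OVS or DPDK APIs - and a real implementation would obtain `ndone` from the
vhost async completion API rather than taking it as a parameter.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_TXQ 64

/* Per-thread "defer work" state: a bitmask of txqs with outstanding DMA
 * copies, plus an in-flight counter per txq. Hypothetical sketch only. */
struct defer_work {
    uint64_t pending_mask;          /* bit i set => txq i has deferred work */
    uint32_t outstanding[MAX_TXQ];  /* in-flight DMA copies per txq */
};

/* Enqueue path: copies were handed to the DMA engine for this txq. */
static inline void
defer_work_mark(struct defer_work *dw, int txq, uint32_t ncopies)
{
    dw->outstanding[txq] += ncopies;
    dw->pending_mask |= UINT64_C(1) << txq;
}

/* Main-loop path: account 'ndone' completed copies for this txq; once
 * nothing is outstanding, clear the bit so the txq is no longer polled.
 * Returns the number of copies still in flight on the txq. */
static inline uint32_t
defer_work_complete(struct defer_work *dw, int txq, uint32_t ndone)
{
    dw->outstanding[txq] -= ndone;
    if (dw->outstanding[txq] == 0)
        dw->pending_mask &= ~(UINT64_C(1) << txq);
    return dw->outstanding[txq];
}

/* Cheap check the thread's main loop performs per iteration. */
static inline int
defer_work_needed(const struct defer_work *dw, int txq)
{
    return (int)((dw->pending_mask >> txq) & 1);
}
```

The contrast with option A2 is visible here: A2 would call the completion
function for every txq on every iteration, while this state lets the thread
skip txqs whose bit is clear, avoiding needless virtq-lock acquisition.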
The locks I refer to are at the OVS-netdev level, as virtqs are shared across
OVS's dataplane threads. So the "M to N" comes from M dataplane threads to N
virtqs, hence requiring some locking.

> > B) Hide completions and live with the complexity/architectural sacrifice
> >    of mixed Rx-Tx. Various downsides here in my opinion; see the slide
> >    deck presented earlier today for a summary.
> >
> > In my opinion, A1 is the most elegant solution, as it has a clean
> > separation of concerns, does not cause avoidable contention on virtq
> > locks, and spends no cycles when there is no completion work to do.
>
> Thank you for elaborating, Harry.

Thanks for taking part in the discussion & providing your insight!

> I strongly oppose hiding any part of TX processing in an RX function. It is
> just wrong in so many ways!
>
> I agree that A1 is the most elegant solution. And being the most elegant
> solution, it is probably also the most future-proof solution. :-)

I think so too, yes.

> I would also like to stress that DMA completion handling belongs in the
> DPDK library, not in the application. And yes, the application will be
> required to call some "handle DMA completions" function in the DPDK
> library. But since the application already knows that it uses DMA, the
> application should also know that it needs to call this extra function - so
> I consider this requirement perfectly acceptable.

Agree here.

> I prefer if the DPDK vhost library can hide its inner workings from the
> application, and just expose the additional "handle completions" function.
> This also means that the inner workings can be implemented as "defer work",
> or by some other algorithm. And it can be tweaked and optimized later.

Yes, the choice in how to call the handle_completions function belongs to the
Application layer. For OVS we designed Defer Work, V3 and V4. But it is an
App-level choice, and every application is free to choose its own method.

> Thinking about the long term perspective, this design pattern is common for
> both the vhost library and other DPDK libraries that could benefit from DMA
> (e.g. vmxnet3 and pcap PMDs), so it could be abstracted into the DMA
> library or a separate library. But for now, we should focus on the vhost
> use case, and just keep the long term roadmap for using DMA in mind.

Totally agree to keep the long term roadmap in mind; but I'm not sure we can
refactor logic out of vhost. When DMA completions arrive, the virtq needs to
be updated; this causes a tight coupling between the DMA completion count and
the vhost library.

As Ilya raised on the call yesterday, there is an "in_order" requirement in
the vhost library: per virtq, the packets are presented to the guest "in
order" of enqueue. (To be clear, *not* in order of DMA completion! As Jiayu
mentioned, the vhost library handles this today by re-ordering the DMA
completions.)

> Rephrasing what I said on the conference call: This vhost design will
> become the common design pattern for using DMA in DPDK libraries. If we get
> it wrong, we are stuck with it.

Agree, and if we get it right, then we're stuck with it too! :)

> > > > > Here is another idea, inspired by a presentation at one of the DPDK
> > > > > Userspace conferences. It may be wishful thinking, though:
> > > > >
> > > > > Add an additional transaction to each DMA burst; a special
> > > > > transaction containing the memory write operation that makes the
> > > > > descriptors available to the Virtio driver.
> > > >
> > > > That is something that can work, so long as the receiver is operating
> > > > in polling mode. For cases where virtio interrupts are enabled, you
> > > > still need to do a write to the eventfd in the kernel in vhost to
> > > > signal the virtio side. That's not something that can be offloaded to
> > > > a DMA engine, sadly, so we still need some form of completion call.
> > >
> > > I guess that virtio interrupts is the most widely deployed scenario, so
> > > let's ignore the DMA TX completion transaction for now - and call it a
> > > possible future optimization for specific use cases. So it seems that
> > > some form of completion call is unavoidable.
> >
> > Agree to leave this aside; there is in theory a potential optimization,
> > but it is unlikely to be of large value.
>
> One more thing: When using DMA to pass packets on into a guest, there could
> be a delay from when the DMA completes until the guest is signaled. Is
> there any CPU cache hotness regarding the guest's access to the packet data
> to consider here? I.e. if we wait with signaling the guest, the packet data
> may get cold.

Interesting question; we can likely spawn a new thread around this topic!
In short, it depends on how/where the DMA hardware writes the copy.

With technologies like DDIO, the "dest" part of the copy will be in LLC. The
core reading the dest data will benefit from the LLC locality (instead of
snooping it from a remote core's L1/L2).

Delays in notifying the guest could result in LLC capacity eviction, yes.
The application layer decides how often/promptly to check for completions and
notify the guest of them. Calling the function more often will result in less
delay in that portion of the pipeline.

Overall, there are caching benefits with DMA acceleration, and the
application can control the latency introduced between DMA completion done in
HW and the guest vring update.
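Returning to the "in_order" requirement mentioned above: DMA copies may
complete out of order, but each virtq must expose buffers to the guest in
enqueue order. One common way to do that is to record completions per enqueue
slot and advance only over a contiguous completed prefix. The sketch below is
a hypothetical illustration of that idea, not the actual vhost-async
implementation; all names are invented.

```c
#include <assert.h>
#include <stdint.h>

#define RING_SZ 16  /* power of two, so index wrap is a cheap mask */

/* Completion bitmap over the enqueue ring, plus the next slot that may
 * be exposed to the guest. Hypothetical sketch only. */
struct inorder_ring {
    uint8_t  done[RING_SZ];  /* 1 if the DMA copy for this slot finished */
    uint32_t head;           /* next enqueue slot to expose, in order */
};

/* Record a (possibly out-of-order) DMA completion for an enqueue slot. */
static inline void
inorder_complete(struct inorder_ring *r, uint32_t slot)
{
    r->done[slot & (RING_SZ - 1)] = 1;
}

/* Return how many slots can now be exposed to the guest: only the
 * contiguous run of completed slots starting at 'head' counts, so a
 * late slot 0 holds back an already-finished slot 1. */
static inline uint32_t
inorder_harvest(struct inorder_ring *r)
{
    uint32_t n = 0;
    while (r->done[r->head & (RING_SZ - 1)]) {
        r->done[r->head & (RING_SZ - 1)] = 0;
        r->head++;
        n++;
    }
    return n;
}
```

This also shows why the coupling Harry describes is tight: whichever component
owns this bookkeeping must also be the one updating the virtq descriptors, so
it is hard to factor the completion logic out of the vhost library.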