From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <pawelx.wodkowski@intel.com>
From: "Wodkowski, PawelX" <pawelx.wodkowski@intel.com>
To: "Wang, Zhihong" <zhihong.wang@intel.com>, "dev@dpdk.org" <dev@dpdk.org>
CC: "Tan, Jianfeng" <jianfeng.tan@intel.com>, "Bie, Tiwei"
 <tiwei.bie@intel.com>, "maxime.coquelin@redhat.com"
 <maxime.coquelin@redhat.com>, "yliu@fridaylinux.org" <yliu@fridaylinux.org>,
 "Liang, Cunming" <cunming.liang@intel.com>, "Wang, Xiao W"
 <xiao.w.wang@intel.com>, "Daly, Dan" <dan.daly@intel.com>, "Wang, Zhihong"
 <zhihong.wang@intel.com>
Thread-Topic: [dpdk-dev] [PATCH v3 0/5] vhost: support selective datapath
Thread-Index: AQHTv2r0F6WQs7nVMUSen0TOWW3geaPnLYVA
Date: Thu, 29 Mar 2018 12:15:11 +0000
Message-ID: <F6F2A6264E145F47A18AB6DF8E87425D70118FD4@IRSMSX102.ger.corp.intel.com>
References: <1517614137-62926-1-git-send-email-zhihong.wang@intel.com>
 <20180227101342.18521-1-zhihong.wang@intel.com>
In-Reply-To: <20180227101342.18521-1-zhihong.wang@intel.com>
Accept-Language: pl-PL, en-US
Content-Language: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v3 0/5] vhost: support selective datapath
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>

> -----Original Message-----
> From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Zhihong Wang
> Sent: Tuesday, February 27, 2018 11:14 AM
> To: dev@dpdk.org
> Cc: Tan, Jianfeng <jianfeng.tan@intel.com>; Bie, Tiwei <tiwei.bie@intel.com>;
> maxime.coquelin@redhat.com; yliu@fridaylinux.org; Liang, Cunming
> <cunming.liang@intel.com>; Wang, Xiao W <xiao.w.wang@intel.com>; Daly,
> Dan <dan.daly@intel.com>; Wang, Zhihong <zhihong.wang@intel.com>
> Subject: [dpdk-dev] [PATCH v3 0/5] vhost: support selective datapath
>
> This patch set introduces support for selective datapath in DPDK vhost-user
> lib. vDPA stands for vhost Data Path Acceleration. The idea is to enable
> various types of virtio-compatible devices to do data transfer with virtio
> driver directly to enable acceleration.
>
> The default datapath is the existing software implementation, more options
> will be available when new engines are added.
>
> Design details
> ====
>
> An engine is a group of virtio-compatible devices. The definition of engine
> is as follows:
>
> struct rte_vdpa_eng_addr {
> 	union {
> 		uint8_t __dummy[64];
> 		struct rte_pci_addr pci_addr;
> 	};
> };
>
> struct rte_vdpa_eng_info {
> 	char name[MAX_VDPA_NAME_LEN];
> 	struct rte_vdpa_eng_addr *addr;
> };
>
> struct rte_vdpa_dev_ops {
> 	vdpa_dev_conf_t        dev_conf;
> 	vdpa_dev_close_t       dev_close;
> 	vdpa_vring_state_set_t vring_state_set;
> 	vdpa_feature_set_t     feature_set;
> 	vdpa_migration_done_t  migration_done;
> 	vdpa_get_vfio_group_fd_t  get_vfio_group_fd;
> 	vdpa_get_vfio_device_fd_t get_vfio_device_fd;
> 	vdpa_get_notify_area_t    get_notify_area;
> };
>
> struct rte_vdpa_eng_ops {
> 	vdpa_eng_init_t   eng_init;
> 	vdpa_eng_uninit_t eng_uninit;
> 	vdpa_info_query_t info_query;
> };
>
> struct rte_vdpa_eng_driver {
> 	const char *name;
> 	struct rte_vdpa_eng_ops eng_ops;
> 	struct rte_vdpa_dev_ops dev_ops;
> } __rte_cache_aligned;
>
> struct rte_vdpa_engine {
> 	struct rte_vdpa_eng_info    eng_info;
> 	struct rte_vdpa_eng_driver *eng_drv;
> } __rte_cache_aligned;
>
> A set of engine ops is defined in rte_vdpa_eng_ops for engine init, uninit,
> and attributes reporting. The attributes are defined as follows:
>
> struct rte_vdpa_eng_attr {
> 	uint64_t features;
> 	uint64_t protocol_features;
> 	uint32_t queue_num;
> 	uint32_t dev_num;
> };
>
> A set of device ops is defined in rte_vdpa_dev_ops for each virtio device
> in the engine to do device specific operations.
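
To check my reading of the device ops table, here is a minimal sketch of how
vhost might dispatch through it. The cover letter names the callback typedefs
but does not show their prototypes, so the "int (*)(int vid)" signature and
every sketch_* name below are assumptions made for illustration, not the API
from the patch set.

```c
#include <stddef.h>

/* Assumed prototypes: the real vdpa_dev_conf_t / vdpa_dev_close_t
 * signatures are not shown in the cover letter. */
typedef int (*sketch_dev_conf_t)(int vid);
typedef int (*sketch_dev_close_t)(int vid);

/* Reduced stand-in for rte_vdpa_dev_ops with two of the callbacks. */
struct sketch_dev_ops {
	sketch_dev_conf_t  dev_conf;
	sketch_dev_close_t dev_close;
};

/* Toy driver state so the effect of each callback can be observed. */
static int sketch_configured_vid = -1;

static int sketch_dev_conf(int vid)
{
	sketch_configured_vid = vid;
	return 0;
}

static int sketch_dev_close(int vid)
{
	if (sketch_configured_vid != vid)
		return -1; /* closing a device that was never configured */
	sketch_configured_vid = -1;
	return 0;
}

static const struct sketch_dev_ops sketch_ops = {
	.dev_conf  = sketch_dev_conf,
	.dev_close = sketch_dev_close,
};

/* Called when the virtio device becomes ready. With no ops installed
 * (software datapath) there is nothing to configure. */
static int sketch_notify_ready(const struct sketch_dev_ops *ops, int vid)
{
	if (ops == NULL || ops->dev_conf == NULL)
		return 0;
	return ops->dev_conf(vid);
}

/* Called when the virtio device is stopped. */
static int sketch_notify_stopped(const struct sketch_dev_ops *ops, int vid)
{
	if (ops == NULL || ops->dev_close == NULL)
		return 0;
	return ops->dev_close(vid);
}
```

The NULL checks are a design guess: they let the same call sites serve both
the software datapath and a driver that implements only some callbacks.
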
>
> Changes to the current vhost-user lib are:
> ====
>
>  1. Make vhost device capabilities configurable to adapt to various engines.
>     Such capabilities include supported features, protocol features, and
>     queue number. APIs are introduced to let app configure these capabilities.
>
>  2. In addition to the existing vhost framework, a set of callbacks is
>     added for vhost to call the driver for device operations at the right
>     time:
>
>      a. dev_conf: Called to configure the actual device when the virtio
>         device becomes ready.
>
>      b. dev_close: Called to close the actual device when the virtio device
>         is stopped.
>
>      c. vring_state_set: Called to change the state of the vring in the
>         actual device when vring state changes.
>
>      d. feature_set: Called to set the negotiated features on the device.
>
>      e. migration_done: Called to allow the device to respond to RARP
>         sending.
>
>      f. get_vfio_group_fd: Called to get the VFIO group fd of the device.
>
>      g. get_vfio_device_fd: Called to get the VFIO device fd of the device.
>
>      h. get_notify_area: Called to get the notify area info of the queue.
>
>  3. To make vhost aware of its own type, an engine id (eid) and a device
>     id (did) are added into the vhost data structure to identify the actual
>     device. APIs are introduced to let app configure them. When the default
>     software datapath is used, eid and did are set to -1. When alternative
>     datapath is used, eid and did are set by app to specify which device to
>     use. Each vhost-user socket can have only 1 connection in this case.

Why is only one connection possible? We are already working on multiple
simultaneous connections in SPDK, so this would be a step backward.
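
To be sure we are talking about the same rule, this is how I read the eid/did
scheme quoted above, as a small sketch. The struct and function names are
hypothetical, and treating a socket as "software datapath" only when both ids
are -1 is my assumption; the cover letter does not say what happens when only
one of them is set.

```c
#include <stdbool.h>

/* Hypothetical per-socket state; not the structure from the patch set. */
struct sketch_vsocket {
	int eid;      /* engine id, -1 means default software datapath */
	int did;      /* device id, -1 means default software datapath */
	int nr_conns; /* connections accepted so far */
};

/* Assumption: software datapath only when both ids are left at -1. */
static bool sketch_is_sw_datapath(const struct sketch_vsocket *s)
{
	return s->eid == -1 && s->did == -1;
}

/* Returns 0 if the connection is accepted, -1 if it is refused.
 * With an alternative datapath the socket is bound to one actual
 * device, so only a single connection is allowed. */
static int sketch_accept_conn(struct sketch_vsocket *s)
{
	if (!sketch_is_sw_datapath(s) && s->nr_conns >= 1)
		return -1;
	s->nr_conns++;
	return 0;
}
```

If this reading is right, the limit comes from the one-to-one binding between
a socket and a device, which is the part I would like to see relaxed.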

>
> Working process:
> ====
>
>  1. Register driver during DPDK initialization.
>
>  2. Register engine with driver name and address.
>
>  3. Get engine attributes.
>
>  4. For vhost device creation:
>
>       a. Register vhost-user socket.
>
>       b. Set eid and did of the vhost-user socket.
>
>       c. Register vhost-user callbacks.
>
>       d. Start to wait for connection.
>
>  5. When connection comes and virtio device data structure is negotiated,
>     the device will be configured with all needed info.
>

Can you please provide a new example, or modify an existing one, to show how
to use this new API? It would be easier to find any gaps if we could see a
real use case.
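
As a starting point, below is a rough sketch of the kind of walk-through I
mean, covering the first three steps of the working process (register driver,
register engine, get attributes). Every function and type name in it is
hypothetical and the engine address is reduced to a string; only the
attribute fields are taken from the quoted rte_vdpa_eng_attr.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_MAX_DRIVERS 4
#define SKETCH_MAX_ENGINES 4

/* Same fields as the quoted rte_vdpa_eng_attr. */
struct sketch_eng_attr {
	uint64_t features;
	uint64_t protocol_features;
	uint32_t queue_num;
	uint32_t dev_num;
};

/* Hypothetical driver: a name plus one engine op. */
struct sketch_eng_driver {
	const char *name;
	int (*info_query)(struct sketch_eng_attr *attr);
};

struct sketch_engine {
	const struct sketch_eng_driver *drv;
	char addr[64]; /* stand-in for rte_vdpa_eng_addr */
};

static const struct sketch_eng_driver *sketch_drivers[SKETCH_MAX_DRIVERS];
static int sketch_nr_drivers;
static struct sketch_engine sketch_engines[SKETCH_MAX_ENGINES];
static int sketch_nr_engines;

/* Step 1: register a driver during initialization. */
static int sketch_register_driver(const struct sketch_eng_driver *drv)
{
	if (sketch_nr_drivers >= SKETCH_MAX_DRIVERS)
		return -1;
	sketch_drivers[sketch_nr_drivers++] = drv;
	return 0;
}

/* Step 2: register an engine by driver name and address.
 * Returns the new engine id (eid), or -1 on failure. */
static int sketch_register_engine(const char *drv_name, const char *addr)
{
	int i;

	if (sketch_nr_engines >= SKETCH_MAX_ENGINES)
		return -1;
	for (i = 0; i < sketch_nr_drivers; i++) {
		if (strcmp(sketch_drivers[i]->name, drv_name) == 0) {
			struct sketch_engine *e = &sketch_engines[sketch_nr_engines];

			e->drv = sketch_drivers[i];
			snprintf(e->addr, sizeof(e->addr), "%s", addr);
			return sketch_nr_engines++;
		}
	}
	return -1; /* no driver with that name */
}

/* Step 3: query engine attributes through the driver. */
static int sketch_get_attr(int eid, struct sketch_eng_attr *attr)
{
	if (eid < 0 || eid >= sketch_nr_engines)
		return -1;
	return sketch_engines[eid].drv->info_query(attr);
}

/* Toy driver reporting one device with two queues. */
static int toy_info_query(struct sketch_eng_attr *attr)
{
	memset(attr, 0, sizeof(*attr));
	attr->queue_num = 2;
	attr->dev_num = 1;
	return 0;
}

static const struct sketch_eng_driver toy_driver = {
	.name = "toy",
	.info_query = toy_info_query,
};
```

Step 4 would then register the vhost-user socket and set the returned eid
(and a did below dev_num) on it. That is the part where a real example would
help most, since the cover letter does not name the APIs involved.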

Pawel