From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <roy.fan.zhang@intel.com>
Received: from mga06.intel.com (mga06.intel.com [134.134.136.31])
 by dpdk.org (Postfix) with ESMTP id 7C42723C
 for <dev@dpdk.org>; Sun,  1 Apr 2018 21:53:24 +0200 (CEST)
X-Amp-Result: SKIPPED(no attachment in message)
X-Amp-File-Uploaded: False
Received: from orsmga008.jf.intel.com ([10.7.209.65])
 by orsmga104.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384;
 01 Apr 2018 12:53:22 -0700
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.48,392,1517904000"; d="scan'208";a="30206288"
Received: from irsmsx101.ger.corp.intel.com ([163.33.3.153])
 by orsmga008.jf.intel.com with ESMTP; 01 Apr 2018 12:53:21 -0700
Received: from irsmsx156.ger.corp.intel.com (10.108.20.68) by
 IRSMSX101.ger.corp.intel.com (163.33.3.153) with Microsoft SMTP Server (TLS)
 id 14.3.319.2; Sun, 1 Apr 2018 20:53:20 +0100
Received: from irsmsx101.ger.corp.intel.com ([169.254.1.176]) by
 IRSMSX156.ger.corp.intel.com ([169.254.3.229]) with mapi id 14.03.0319.002;
 Sun, 1 Apr 2018 20:53:20 +0100
From: "Zhang, Roy Fan" <roy.fan.zhang@intel.com>
To: "Wodkowski, PawelX" <pawelx.wodkowski@intel.com>, "dev@dpdk.org"
 <dev@dpdk.org>
CC: "maxime.coquelin@redhat.com" <maxime.coquelin@redhat.com>,
 "jianjay.zhou@huawei.com" <jianjay.zhou@huawei.com>, "Tan, Jianfeng"
 <jianfeng.tan@intel.com>
Thread-Topic: [dpdk-dev] [PATCH v4 1/8] lib/librte_vhost: add external
 backend	support
Thread-Index: AQHTx1zzACb9fMOvC0GE3IlgaGuhWaPnKYIAgAUc3AA=
Date: Sun, 1 Apr 2018 19:53:19 +0000
Message-ID: <9F7182E3F746AB4EA17801C148F3C60433107799@IRSMSX101.ger.corp.intel.com>
References: <20180326095114.11605-1-roy.fan.zhang@intel.com>
 <1522327975-28769-1-git-send-email-roy.fan.zhang@intel.com>
 <1522327975-28769-2-git-send-email-roy.fan.zhang@intel.com>
 <F6F2A6264E145F47A18AB6DF8E87425D701190BA@IRSMSX102.ger.corp.intel.com>
In-Reply-To: <F6F2A6264E145F47A18AB6DF8E87425D701190BA@IRSMSX102.ger.corp.intel.com>
Accept-Language: en-IE, en-US
Content-Language: en-US
X-MS-Has-Attach: 
X-MS-TNEF-Correlator: 
x-ctpclassification: CTP_IC
x-titus-metadata-40: eyJDYXRlZ29yeUxhYmVscyI6IiIsIk1ldGFkYXRhIjp7Im5zIjoiaHR0cDpcL1wvd3d3LnRpdHVzLmNvbVwvbnNcL0ludGVsMyIsImlkIjoiMjNlY2M0ZGYtOGE1ZC00YzJmLTg0ZTYtODVhNTU4ZmU2ZDg4IiwicHJvcHMiOlt7Im4iOiJDVFBDbGFzc2lmaWNhdGlvbiIsInZhbHMiOlt7InZhbHVlIjoiQ1RQX0lDIn1dfV19LCJTdWJqZWN0TGFiZWxzIjpbXSwiVE1DVmVyc2lvbiI6IjE2LjUuOS4zIiwiVHJ1c3RlZExhYmVsSGFzaCI6IjBBNldyaFV2ZjhLWVZNd24zRzBGY1wvM2VhckR5TTJ4SlwvSXNxOXRZYjNHaz0ifQ==
dlp-product: dlpe-windows
dlp-version: 11.0.0.116
dlp-reaction: no-action
x-originating-ip: [163.33.239.182]
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Subject: Re: [dpdk-dev] [PATCH v4 1/8] lib/librte_vhost: add external
 backend	support
X-BeenThere: dev@dpdk.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: DPDK patches and discussions <dev.dpdk.org>
List-Unsubscribe: <https://dpdk.org/ml/options/dev>,
 <mailto:dev-request@dpdk.org?subject=unsubscribe>
List-Archive: <http://dpdk.org/ml/archives/dev/>
List-Post: <mailto:dev@dpdk.org>
List-Help: <mailto:dev-request@dpdk.org?subject=help>
List-Subscribe: <https://dpdk.org/ml/listinfo/dev>,
 <mailto:dev-request@dpdk.org?subject=subscribe>
X-List-Received-Date: Sun, 01 Apr 2018 19:53:25 -0000

Hi Pawel,

> -----Original Message-----
> From: Wodkowski, PawelX
> Sent: Thursday, March 29, 2018 2:48 PM
> To: Zhang, Roy Fan <roy.fan.zhang@intel.com>; dev@dpdk.org
> Cc: maxime.coquelin@redhat.com; jianjay.zhou@huawei.com; Tan, Jianfeng
> <jianfeng.tan@intel.com>
> Subject: RE: [dpdk-dev] [PATCH v4 1/8] lib/librte_vhost: add external
> backend support
>=20
> > -----Original Message-----
> > From: dev [mailto:dev-bounces@dpdk.org] On Behalf Of Fan Zhang
> > Sent: Thursday, March 29, 2018 2:53 PM
> > To: dev@dpdk.org
> > Cc: maxime.coquelin@redhat.com; jianjay.zhou@huawei.com; Tan,
> Jianfeng
> > <jianfeng.tan@intel.com>
> > Subject: [dpdk-dev] [PATCH v4 1/8] lib/librte_vhost: add external
> > backend support
> >
> > This patch adds external backend support to vhost library. The patch
> > provides new APIs for the external backend to register pre and post
> > vhost-user message handlers.
> >
> > Signed-off-by: Fan Zhang <roy.fan.zhang@intel.com>
> > ---
> >  lib/librte_vhost/rte_vhost.h           | 64
> > +++++++++++++++++++++++++++++++++-
> >  lib/librte_vhost/rte_vhost_version.map |  6 ++++
> >  lib/librte_vhost/vhost.c               | 17 ++++++++-
> >  lib/librte_vhost/vhost.h               |  8 +++--
> >  lib/librte_vhost/vhost_user.c          | 33 +++++++++++++++++-
> >  5 files changed, 123 insertions(+), 5 deletions(-)
> >
> > diff --git a/lib/librte_vhost/rte_vhost.h
> > b/lib/librte_vhost/rte_vhost.h index d332069..b902c44 100644
> > --- a/lib/librte_vhost/rte_vhost.h
> > +++ b/lib/librte_vhost/rte_vhost.h
> > @@ -1,5 +1,5 @@
> >  /* SPDX-License-Identifier: BSD-3-Clause
> > - * Copyright(c) 2010-2017 Intel Corporation
> > + * Copyright(c) 2010-2018 Intel Corporation
> >   */
> >
> >  #ifndef _RTE_VHOST_H_
> > @@ -88,6 +88,55 @@ struct vhost_device_ops {  };
> >
> >  /**
> > + * function prototype for the vhost backend to handler specific vhost
> > + user
> > + * messages prior to the master message handling
> > + *
> > + * @param vid
> > + *  vhost device id
> > + * @param msg
> > + *  Message pointer.
> > + * @param payload
> > + *  Message payload.
>=20
> No payload parameter.
Sorry about that. I will fix the comment.

>=20
> > + * @param require_reply
> > + *  If the handler requires sending a reply, this varaible shall be
> > + written 1,
> > + *  otherwise 0.
> > + * @param skip_master
> > + *  If the handler requires skipping the master message handling,
> > + this
> > variable
> > + *  shall be written 1, otherwise 0.
> > + * @return
> > + *  0 on success, -1 on failure
> > + */
> > +typedef int (*rte_vhost_msg_pre_handle)(int vid, void *msg,
> > +		uint32_t *require_reply, uint32_t *skip_master);
> > +
> > +/**
> > + * function prototype for the vhost backend to handler specific vhost
> > +user
> > + * messages after the master message handling is done
> > + *
> > + * @param vid
> > + *  vhost device id
> > + * @param msg
> > + *  Message pointer.
> > + * @param payload
> > + *  Message payload.
>=20
> No payload parameter :)
>=20

Same here

> > + * @param require_reply
> > + *  If the handler requires sending a reply, this varaible shall be
> > +written 1,
> > + *  otherwise 0.
> > + * @return
> > + *  0 on success, -1 on failure
> > + */
> > +typedef int (*rte_vhost_msg_post_handle)(int vid, void *msg,
> > +		uint32_t *require_reply);
> > +
>=20
> What mean 'Message pointer' Is this const for us? Is this payload? Making
> msg 'void *' is not a way to go here. Those pre and post handlers need to=
 see
> exactly the same structures like vhost_user.c file. Otherwise we can get =
into
> troubles when ABI changes.

It is the pointer to the vhost_user message. It cannot be const as the back=
end
may change the payload.=20

>=20
> Also you can easily merge pre and post handlers into one handler with one
> Parameter describing what phase of message processing we are now.
>=20

No I don't think so. To do so it will be quite unclear in the future as we =
are
using one function to do two totally different things.=20

> > +/**
> > + * pre and post vhost user message handlers  */ struct
> > +vhost_user_extern_ops {
> > +	rte_vhost_msg_pre_handle pre_msg_handle;
> > +	rte_vhost_msg_post_handle post_msg_handle; };
> > +
> > +/**
> >   * Convert guest physical address to host virtual address
> >   *
> >   * @param mem
> > @@ -434,6 +483,19 @@ int rte_vhost_vring_call(int vid, uint16_t vring_i=
dx);
> >   */
> >  uint32_t rte_vhost_rx_queue_count(int vid, uint16_t qid);
> >
> > +/**
> > + * register external vhost backend
> > + *
> > + * @param vid
> > + *  vhost device ID
> > + * @param ops
> > + *  ops that process external vhost user messages
> > + * @return
> > + *  0 on success, -1 on failure
> > + */
> > +int
> > +rte_vhost_user_register_extern_ops(int vid, struct
> > vhost_user_extern_ops *ops);
> > +
> >  #ifdef __cplusplus
> >  }
> >  #endif
> > diff --git a/lib/librte_vhost/rte_vhost_version.map
> > b/lib/librte_vhost/rte_vhost_version.map
> > index df01031..91bf9f0 100644
> > --- a/lib/librte_vhost/rte_vhost_version.map
> > +++ b/lib/librte_vhost/rte_vhost_version.map
> > @@ -59,3 +59,9 @@ DPDK_18.02 {
> >  	rte_vhost_vring_call;
> >
> >  } DPDK_17.08;
> > +
> > +DPDK_18.05 {
> > +	global:
> > +
> > +	rte_vhost_user_register_extern_ops;
> > +} DPDK_18.02;
> > diff --git a/lib/librte_vhost/vhost.c b/lib/librte_vhost/vhost.c index
> > a407067..80af341 100644
> > --- a/lib/librte_vhost/vhost.c
> > +++ b/lib/librte_vhost/vhost.c
> > @@ -1,5 +1,5 @@
> >  /* SPDX-License-Identifier: BSD-3-Clause
> > - * Copyright(c) 2010-2016 Intel Corporation
> > + * Copyright(c) 2010-2018 Intel Corporation
> >   */
> >
> >  #include <linux/vhost.h>
> > @@ -627,3 +627,18 @@ rte_vhost_rx_queue_count(int vid, uint16_t qid)
> >
> >  	return *((volatile uint16_t *)&vq->avail->idx) - vq->last_avail_idx;
> > }
> > +
> > +int
> > +rte_vhost_user_register_extern_ops(int vid, struct
> > vhost_user_extern_ops *ops)
> > +{
> > +	struct virtio_net *dev;
> > +
> > +	dev =3D get_device(vid);
> > +	if (dev =3D=3D NULL)
> > +		return -1;
> > +
> > +	if (ops)
> > +		rte_memcpy(&dev->extern_ops, ops, sizeof(*ops));
> > +
> > +	return 0;
> > +}
>=20
> Why we need this new "register" API? Why can't you use one of the (struct
> vhost_device_ops).reserved[0] field to put this callback there?
> I think this is right time to utilize this field.
>=20

The patch here is a more generic and intuitive way for external backend to
register the handlers to process the vhost user message only recognized by =
it.
Please read Maxime's comments in v2 version of this patch.
http://dpdk.org/ml/archives/dev/2018-March/093408.html.
As we discussed we need 2 different handlers for external vhost user device
to handle device specifc vhost user messages. A public API is needed.=20

> Can you do something similar to
> http://dpdk.org/ml/archives/dev/2018-March/094213.html ?

The patch content here causes the least damage to the existing library. Plu=
s
the patch you mentioned won't help with the pre and post handlers problem -=
=20
or it would consume all of two remaining reserved fields in vhost_user_ops
structure for pre and post handlers, respectively.

>=20
> > diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h index
> > d947bc9..2072b88 100644
> > --- a/lib/librte_vhost/vhost.h
> > +++ b/lib/librte_vhost/vhost.h
> > @@ -1,5 +1,5 @@
> >  /* SPDX-License-Identifier: BSD-3-Clause
> > - * Copyright(c) 2010-2014 Intel Corporation
> > + * Copyright(c) 2010-2018 Intel Corporation
> >   */
> >
> >  #ifndef _VHOST_NET_CDEV_H_
> > @@ -241,8 +241,12 @@ struct virtio_net {
> >  	struct guest_page       *guest_pages;
> >
> >  	int			slave_req_fd;
> > -} __rte_cache_aligned;
> >
> > +	/* private data for external virtio device */
> > +	void			*extern_data;
> > +	/* pre and post vhost user message handlers for externel backend */
> > +	struct vhost_user_extern_ops extern_ops; } __rte_cache_aligned;
> >
> >  #define VHOST_LOG_PAGE	4096
> >
> > diff --git a/lib/librte_vhost/vhost_user.c
> > b/lib/librte_vhost/vhost_user.c index 90ed211..ede8a5e 100644
> > --- a/lib/librte_vhost/vhost_user.c
> > +++ b/lib/librte_vhost/vhost_user.c
> > @@ -1,5 +1,5 @@
> >  /* SPDX-License-Identifier: BSD-3-Clause
> > - * Copyright(c) 2010-2016 Intel Corporation
> > + * Copyright(c) 2010-2018 Intel Corporation
> >   */
> >
> >  #include <stdint.h>
> > @@ -50,6 +50,8 @@ static const char
> > *vhost_message_str[VHOST_USER_MAX] =3D {
> >  	[VHOST_USER_NET_SET_MTU]  =3D "VHOST_USER_NET_SET_MTU",
> >  	[VHOST_USER_SET_SLAVE_REQ_FD]  =3D
> > "VHOST_USER_SET_SLAVE_REQ_FD",
> >  	[VHOST_USER_IOTLB_MSG]  =3D "VHOST_USER_IOTLB_MSG",
> > +	[VHOST_USER_CRYPTO_CREATE_SESS] =3D
> > "VHOST_USER_CRYPTO_CREATE_SESS",
> > +	[VHOST_USER_CRYPTO_CLOSE_SESS] =3D
> > "VHOST_USER_CRYPTO_CLOSE_SESS",
> >  };
> >
> >  static uint64_t
> > @@ -1302,6 +1304,7 @@ vhost_user_msg_handler(int vid, int fd)
> >  	struct VhostUserMsg msg;
> >  	int ret;
> >  	int unlock_required =3D 0;
> > +	uint32_t skip_master =3D 0;
> >
> >  	dev =3D get_device(vid);
> >  	if (dev =3D=3D NULL)
> > @@ -1379,6 +1382,21 @@ vhost_user_msg_handler(int vid, int fd)
> >
> >  	}
> >
> > +	if (dev->extern_ops.pre_msg_handle) {
> > +		uint32_t need_reply;
> > +
> > +		ret =3D (*dev->extern_ops.pre_msg_handle)(dev->vid,
> > +				(void *)&msg, &need_reply, &skip_master);
> > +		if (ret < 0)
> > +			goto skip_to_reply;
> > +
> > +		if (need_reply)
> > +			send_vhost_reply(fd, &msg);
> > +	}
> > +
> > +	if (skip_master)
> > +		goto skip_to_post_handle;
>=20
> This can be moved inside above  if () { }

Yes, you are right.

>=20
> > +
> >  	switch (msg.request.master) {
> >  	case VHOST_USER_GET_FEATURES:
> >  		msg.payload.u64 =3D vhost_user_get_features(dev); @@ -
> 1479,9 +1497,22
> > @@ vhost_user_msg_handler(int vid, int fd)
> >  	default:
> >  		ret =3D -1;
> >  		break;
> > +	}
> > +
> > +skip_to_post_handle:
> > +	if (dev->extern_ops.post_msg_handle) {
> > +		uint32_t need_reply;
> > +
> > +		ret =3D (*dev->extern_ops.post_msg_handle)(
> > +				dev->vid, (void *)&msg, &need_reply);
> > +		if (ret < 0)
> > +			goto skip_to_reply;
> >
> > +		if (need_reply)
> > +			send_vhost_reply(fd, &msg);
> >  	}
> >
> > +skip_to_reply:
> >  	if (unlock_required)
> >  		vhost_user_unlock_all_queue_pairs(dev);
> >
> > --
> > 2.7.4
>=20
> Overall, I think, this direction where we need to go.
>=20
> Pawel

Regards,
Fan