From: Adrian Moreno
To: Jin Yu, dev@dpdk.org
Cc: changpeng.liu@intel.com, maxime.coquelin@redhat.com, tiwei.bie@intel.com, zhihong.wang@intel.com, Lin Li, Xun Ni, Yu Zhang
Subject: Re: [dpdk-dev] [PATCH v8 04/10] vhost: add two new messages to support a shared buffer
Date: Mon, 7 Oct 2019 15:08:12 +0200
Message-ID: <3fb8896c-b944-ca56-c080-d12323a246a4@redhat.com>
In-Reply-To: <20190927105624.16313-5-jin.yu@intel.com>
References: <20190920120102.29828> <20190927105624.16313-1-jin.yu@intel.com> <20190927105624.16313-5-jin.yu@intel.com>

On 9/27/19 12:56 PM, Jin Yu wrote:
> This patch introduces two new messages VHOST_USER_GET_INFLIGHT_FD
> and VHOST_USER_SET_INFLIGHT_FD to support transferring a shared
> buffer between qemu and backend.
>
> Signed-off-by: Lin Li
> Signed-off-by: Xun Ni
> Signed-off-by: Yu Zhang
> Signed-off-by: Jin Yu
> ---
>  lib/librte_vhost/vhost.h      |   7 +
>  lib/librte_vhost/vhost_user.c | 255 +++++++++++++++++++++++++++++++++-
>  lib/librte_vhost/vhost_user.h |   4 +-
>  3 files changed, 264 insertions(+), 2 deletions(-)
>
> diff --git a/lib/librte_vhost/vhost.h b/lib/librte_vhost/vhost.h
> index 884befa85..d67ba849a 100644
> --- a/lib/librte_vhost/vhost.h
> +++ b/lib/librte_vhost/vhost.h
> @@ -286,6 +286,12 @@ struct guest_page {
>  	uint64_t size;
>  };
>
> +struct inflight_mem_info {
> +	int fd;
> +	void *addr;
> +	uint64_t size;
> +};
> +
>  /**
>   * Device structure contains all configuration information relating
>   * to the device.
> @@ -303,6 +309,7 @@ struct virtio_net {
>  	uint32_t nr_vring;
>  	int dequeue_zero_copy;
>  	struct vhost_virtqueue *virtqueue[VHOST_MAX_QUEUE_PAIRS * 2];
> +	struct inflight_mem_info *inflight_info;
>  #define IF_NAME_SZ (PATH_MAX > IFNAMSIZ ? PATH_MAX : IFNAMSIZ)
>  	char ifname[IF_NAME_SZ];
>  	uint64_t log_size;
> diff --git a/lib/librte_vhost/vhost_user.c b/lib/librte_vhost/vhost_user.c
> index c9e29ece8..b57930add 100644
> --- a/lib/librte_vhost/vhost_user.c
> +++ b/lib/librte_vhost/vhost_user.c
> @@ -37,6 +37,10 @@
>  #ifdef RTE_LIBRTE_VHOST_POSTCOPY
>  #include
>  #endif
> +#ifdef F_ADD_SEALS /* if file sealing is supported, so is memfd */
> +#include
> +#define MEMFD_SUPPORTED
> +#endif
>
>  #include
>  #include
> @@ -49,6 +53,15 @@
>  #define VIRTIO_MIN_MTU 68
>  #define VIRTIO_MAX_MTU 65535
>
> +#define INFLIGHT_ALIGNMENT 64
> +#define INFLIGHT_VERSION 0x1
> +#define VIRTQUEUE_MAX_SIZE 1024
> +
> +#define CLOEXEC 0x0001U
> +
> +#define ALIGN_DOWN(n, m) ((n) / (m) * (m))
> +#define ALIGN_UP(n, m) ALIGN_DOWN((n) + (m) - 1, (m))

There are equivalent macros in lib/librte_eal/common/include/rte_common.h
You might want to consider using those instead of adding new ones.

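To illustrate the point about the alignment macros, here is a minimal sketch that compares the patch's division-based ALIGN_UP with a mask-based round-up of the kind I believe rte_common.h provides (RTE_ALIGN_CEIL / RTE_ALIGN_FLOOR, going from memory, so please check the header). The SKETCH_* macros and the pervq_size_patch, pervq_size_mask and check_equivalence helpers are hypothetical names defined locally so the example builds without DPDK; they are not the EAL definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Macros exactly as the patch adds them; valid for any non-zero m. */
#define ALIGN_DOWN(n, m) ((n) / (m) * (m))
#define ALIGN_UP(n, m) ALIGN_DOWN((n) + (m) - 1, (m))

/*
 * Hypothetical stand-ins for the EAL helpers, defined locally so the
 * sketch builds without DPDK. Mask-based, so align must be a power of two.
 */
#define SKETCH_ALIGN_FLOOR(val, align) \
	((val) & ~((uint64_t)(align) - 1))
#define SKETCH_ALIGN_CEIL(val, align) \
	SKETCH_ALIGN_FLOOR((val) + ((uint64_t)(align) - 1), (align))

#define INFLIGHT_ALIGNMENT 64

/* Round a per-virtqueue size up with the patch's macro. */
static uint64_t
pervq_size_patch(uint64_t raw)
{
	return ALIGN_UP(raw, INFLIGHT_ALIGNMENT);
}

/* Same rounding with the mask-based stand-in. */
static uint64_t
pervq_size_mask(uint64_t raw)
{
	return SKETCH_ALIGN_CEIL(raw, INFLIGHT_ALIGNMENT);
}

/* Both variants must agree for every size in [0, limit). */
static void
check_equivalence(uint64_t limit)
{
	uint64_t raw;

	for (raw = 0; raw < limit; raw++)
		assert(pervq_size_patch(raw) == pervq_size_mask(raw));
}
```

Since INFLIGHT_ALIGNMENT is 64, a power of two, both forms give the same result, so switching to the existing helpers should not change the computed sizes. Calling check_equivalence(1 << 16) checks every size below 64 KiB.
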
> +
>  static const char *vhost_message_str[VHOST_USER_MAX] = {
>  	[VHOST_USER_NONE] = "VHOST_USER_NONE",
>  	[VHOST_USER_GET_FEATURES] = "VHOST_USER_GET_FEATURES",
> @@ -78,6 +91,8 @@ static const char *vhost_message_str[VHOST_USER_MAX] = {
>  	[VHOST_USER_POSTCOPY_ADVISE] = "VHOST_USER_POSTCOPY_ADVISE",
>  	[VHOST_USER_POSTCOPY_LISTEN] = "VHOST_USER_POSTCOPY_LISTEN",
>  	[VHOST_USER_POSTCOPY_END] = "VHOST_USER_POSTCOPY_END",
> +	[VHOST_USER_GET_INFLIGHT_FD] = "VHOST_USER_GET_INFLIGHT_FD",
> +	[VHOST_USER_SET_INFLIGHT_FD] = "VHOST_USER_SET_INFLIGHT_FD",
>  };
>
>  static int send_vhost_reply(int sockfd, struct VhostUserMsg *msg);
> @@ -160,6 +175,22 @@ vhost_backend_cleanup(struct virtio_net *dev)
>  		dev->log_addr = 0;
>  	}
>
> +	if (dev->inflight_info) {
> +		if (dev->inflight_info->addr) {
> +			munmap(dev->inflight_info->addr,
> +				dev->inflight_info->size);
> +			dev->inflight_info->addr = NULL;
> +		}
> +
> +		if (dev->inflight_info->fd > 0) {
> +			close(dev->inflight_info->fd);
> +			dev->inflight_info->fd = -1;
> +		}
> +
> +		free(dev->inflight_info);
> +		dev->inflight_info = NULL;
> +	}
> +
>  	if (dev->slave_req_fd >= 0) {
>  		close(dev->slave_req_fd);
>  		dev->slave_req_fd = -1;
> @@ -1165,6 +1196,227 @@ virtio_is_ready(struct virtio_net *dev)
>  	return 1;
>  }
>
> +static void *
> +inflight_mem_alloc(const char *name, size_t size, int *fd)
> +{
> +	void *ptr;
> +	int mfd = -1;
> +	char fname[20] = "/tmp/memfd-XXXXXX";
> +
> +	*fd = -1;
> +#ifdef MEMFD_SUPPORTED
> +	mfd = memfd_create(name, MFD_CLOEXEC);
> +#else
> +	RTE_SET_USED(name);
> +#endif
> +	if (mfd != -1) {
> +		if (ftruncate(mfd, size) == -1) {
> +			RTE_LOG(ERR, VHOST_CONFIG,
> +				"ftruncate fail for alloc inflight buffer\n");
> +			close(mfd);
> +			return NULL;
> +		}
> +	} else {
> +		mfd = mkstemp(fname);
> +		unlink(fname);
> +
> +		if (mfd == -1) {
> +			RTE_LOG(ERR, VHOST_CONFIG,
> +				"mkstemp fail for alloc inflight buffer\n");
> +			return NULL;
> +		}
> +
> +		if (ftruncate(mfd, size) == -1) {
> +			RTE_LOG(ERR, VHOST_CONFIG,
> +				"ftruncate fail for alloc inflight buffer\n");
> +			close(mfd);
> +			return NULL;
> +		}
> +	}
> +
> +	ptr = mmap(0, size, PROT_READ | PROT_WRITE, MAP_SHARED, mfd, 0);
> +	if (ptr == MAP_FAILED) {
> +		RTE_LOG(ERR, VHOST_CONFIG,
> +			"mmap fail for alloc inflight buffer\n");
> +		close(mfd);
> +		return NULL;
> +	}
> +
> +	*fd = mfd;
> +	return ptr;
> +}
> +
> +static uint32_t
> +get_pervq_shm_size_split(uint16_t queue_size)
> +{
> +	return ALIGN_UP(sizeof(struct rte_vhost_inflight_desc_split) *
> +		queue_size + sizeof(uint64_t) + sizeof(uint16_t) * 4,
> +		INFLIGHT_ALIGNMENT);
> +}
> +
> +static uint32_t
> +get_pervq_shm_size_packed(uint16_t queue_size)
> +{
> +	return ALIGN_UP(sizeof(struct rte_vhost_inflight_desc_packed) *
> +		queue_size + sizeof(uint64_t) + sizeof(uint16_t) * 6 +
> +		sizeof(uint8_t) * 9, INFLIGHT_ALIGNMENT);
> +}
> +
> +static int
> +vhost_user_get_inflight_fd(struct virtio_net **pdev,
> +			VhostUserMsg *msg,
> +			int main_fd __rte_unused)
> +{
> +	int fd, i, j;
> +	uint64_t pervq_inflight_size, mmap_size;
> +	void *addr;
> +	uint16_t num_queues, queue_size;
> +	struct virtio_net *dev = *pdev;
> +	struct rte_vhost_inflight_info_packed *inflight_packed = NULL;
> +
> +	if (msg->size != sizeof(msg->payload.inflight)) {
> +		RTE_LOG(ERR, VHOST_CONFIG,
> +			"Invalid get_inflight_fd message size is %d",
> +			msg->size);

Did you forget to end the error message with a '\n'?

> +		return RTE_VHOST_MSG_RESULT_ERR;
> +	}
> +
> +	if (dev->inflight_info == NULL) {
> +		dev->inflight_info =
> +			calloc(1, sizeof(struct inflight_mem_info));
> +		if (!dev->inflight_info) {
> +			RTE_LOG(ERR, VHOST_CONFIG,
> +				"Failed to alloc dev inflight area");

Same here: maybe missing '\n'?

> +			return RTE_VHOST_MSG_RESULT_ERR;
> +		}
> +	}
> +
> +	num_queues = msg->payload.inflight.num_queues;
> +	queue_size = msg->payload.inflight.queue_size;
> +
> +	RTE_LOG(INFO, VHOST_CONFIG, "get_inflight_fd num_queues: %u\n",
> +		msg->payload.inflight.num_queues);
> +	RTE_LOG(INFO, VHOST_CONFIG, "get_inflight_fd queue_size: %u\n",
> +		msg->payload.inflight.queue_size);
> +
> +	if (vq_is_packed(dev))
> +		pervq_inflight_size = get_pervq_shm_size_packed(queue_size);
> +	else
> +		pervq_inflight_size = get_pervq_shm_size_split(queue_size);
> +
> +	mmap_size = num_queues * pervq_inflight_size;
> +	addr = inflight_mem_alloc("vhost-inflight", mmap_size, &fd);
> +	if (!addr) {
> +		RTE_LOG(ERR, VHOST_CONFIG,
> +			"Failed to alloc vhost inflight area");
> +		msg->payload.inflight.mmap_size = 0;
> +		return RTE_VHOST_MSG_RESULT_ERR;
> +	}
> +	memset(addr, 0, mmap_size);
> +
> +	dev->inflight_info->addr = addr;
> +	dev->inflight_info->size = msg->payload.inflight.mmap_size = mmap_size;
> +	dev->inflight_info->fd = msg->fds[0] = fd;
> +	msg->payload.inflight.mmap_offset = 0;
> +	msg->fd_num = 1;
> +
> +	if (vq_is_packed(dev)) {
> +		for (i = 0; i < num_queues; i++) {
> +			inflight_packed =
> +				(struct rte_vhost_inflight_info_packed *)addr;
> +			inflight_packed->used_wrap_counter = 1;
> +			inflight_packed->old_used_wrap_counter = 1;
> +			for (j = 0; j < queue_size; j++)
> +				inflight_packed->desc[j].next = j + 1;
> +			addr = (void *)((char *)addr + pervq_inflight_size);
> +		}
> +	}
> +
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"send inflight mmap_size: %"PRIu64"\n",
> +		msg->payload.inflight.mmap_size);
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"send inflight mmap_offset: %"PRIu64"\n",
> +		msg->payload.inflight.mmap_offset);
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"send inflight fd: %d\n", msg->fds[0]);
> +
> +	return RTE_VHOST_MSG_RESULT_REPLY;
> +}
> +
> +static int
> +vhost_user_set_inflight_fd(struct virtio_net **pdev, VhostUserMsg *msg,
> +			int main_fd __rte_unused)
> +{
> +	int fd;
> +	uint64_t mmap_size, mmap_offset;
> +	uint16_t num_queues, queue_size;
> +	uint32_t pervq_inflight_size;
> +	void *addr;
> +	struct virtio_net *dev = *pdev;
> +
> +	fd = msg->fds[0];
> +	if (msg->size != sizeof(msg->payload.inflight) || fd < 0) {
> +		RTE_LOG(ERR, VHOST_CONFIG,
> +			"Invalid set_inflight_fd message size is %d,fd is %d\n",
> +			msg->size, fd);
> +		return RTE_VHOST_MSG_RESULT_ERR;
> +	}
> +
> +	mmap_size = msg->payload.inflight.mmap_size;
> +	mmap_offset = msg->payload.inflight.mmap_offset;
> +	num_queues = msg->payload.inflight.num_queues;
> +	queue_size = msg->payload.inflight.queue_size;
> +
> +	if (vq_is_packed(dev))
> +		pervq_inflight_size = get_pervq_shm_size_packed(queue_size);
> +	else
> +		pervq_inflight_size = get_pervq_shm_size_split(queue_size);
> +
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"set_inflight_fd mmap_size: %"PRIu64"\n", mmap_size);
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"set_inflight_fd mmap_offset: %"PRIu64"\n", mmap_offset);
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"set_inflight_fd num_queues: %u\n", num_queues);
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"set_inflight_fd queue_size: %u\n", queue_size);
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"set_inflight_fd fd: %d\n", fd);
> +	RTE_LOG(INFO, VHOST_CONFIG,
> +		"set_inflight_fd pervq_inflight_size: %d\n",
> +		pervq_inflight_size);
> +
> +	if (!dev->inflight_info) {
> +		dev->inflight_info = calloc(1,
> +			sizeof(struct inflight_mem_info));
> +		if (dev->inflight_info == NULL) {
> +			RTE_LOG(ERR, VHOST_CONFIG,
> +				"Failed to alloc dev inflight area");
> +			return RTE_VHOST_MSG_RESULT_ERR;
> +		}
> +	}
> +
> +	if (dev->inflight_info->addr)
> +		munmap(dev->inflight_info->addr, dev->inflight_info->size);
> +
> +	addr = mmap(0, mmap_size, PROT_READ | PROT_WRITE, MAP_SHARED,
> +		fd, mmap_offset);
> +	if (addr == MAP_FAILED) {
> +		RTE_LOG(ERR, VHOST_CONFIG, "failed to mmap share memory.\n");
> +		return RTE_VHOST_MSG_RESULT_ERR;
> +	}
> +
> +	if (dev->inflight_info->fd)
> +		close(dev->inflight_info->fd);
> +
> +	dev->inflight_info->fd = fd;
> +	dev->inflight_info->addr = addr;
> +	dev->inflight_info->size = mmap_size;
> +
> +	return RTE_VHOST_MSG_RESULT_OK;
> +}
> +
>  static int
>  vhost_user_set_vring_call(struct virtio_net **pdev, struct VhostUserMsg *msg,
>  			int main_fd __rte_unused)
> @@ -1762,9 +2014,10 @@ static vhost_message_handler_t vhost_message_handlers[VHOST_USER_MAX] = {
>  	[VHOST_USER_POSTCOPY_ADVISE] = vhost_user_set_postcopy_advise,
>  	[VHOST_USER_POSTCOPY_LISTEN] = vhost_user_set_postcopy_listen,
>  	[VHOST_USER_POSTCOPY_END] = vhost_user_postcopy_end,
> +	[VHOST_USER_GET_INFLIGHT_FD] = vhost_user_get_inflight_fd,
> +	[VHOST_USER_SET_INFLIGHT_FD] = vhost_user_set_inflight_fd,
>  };
>
> -
>  /* return bytes# of read on success or negative val on failure. */
>  static int
>  read_vhost_message(int sockfd, struct VhostUserMsg *msg)
> diff --git a/lib/librte_vhost/vhost_user.h b/lib/librte_vhost/vhost_user.h
> index 17a1d7bca..6563f7315 100644
> --- a/lib/librte_vhost/vhost_user.h
> +++ b/lib/librte_vhost/vhost_user.h
> @@ -54,7 +54,9 @@ typedef enum VhostUserRequest {
>  	VHOST_USER_POSTCOPY_ADVISE = 28,
>  	VHOST_USER_POSTCOPY_LISTEN = 29,
>  	VHOST_USER_POSTCOPY_END = 30,
> -	VHOST_USER_MAX = 31
> +	VHOST_USER_GET_INFLIGHT_FD = 31,
> +	VHOST_USER_SET_INFLIGHT_FD = 32,
> +	VHOST_USER_MAX = 33
>  } VhostUserRequest;
>
>  typedef enum VhostUserSlaveRequest {
>

Thanks.
Adrián