From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: from mga14.intel.com (mga14.intel.com [192.55.52.115]) by dpdk.org (Postfix) with ESMTP id A31F42BA7 for ; Fri, 20 Jul 2018 17:35:18 +0200 (CEST) X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga001.fm.intel.com ([10.253.24.23]) by fmsmga103.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 20 Jul 2018 08:35:16 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.51,380,1526367600"; d="scan'208";a="74098061" Received: from fmsmsx107.amr.corp.intel.com ([10.18.124.205]) by fmsmga001.fm.intel.com with ESMTP; 20 Jul 2018 08:35:15 -0700 Received: from fmsmsx156.amr.corp.intel.com (10.18.116.74) by fmsmsx107.amr.corp.intel.com (10.18.124.205) with Microsoft SMTP Server (TLS) id 14.3.319.2; Fri, 20 Jul 2018 08:35:15 -0700 Received: from fmsmsx117.amr.corp.intel.com ([169.254.3.3]) by fmsmsx156.amr.corp.intel.com ([169.254.13.214]) with mapi id 14.03.0319.002; Fri, 20 Jul 2018 08:35:15 -0700 From: "Wiles, Keith" To: Thomas Monjalon CC: Raslan Darawsheh , "ophirmu@mellanox.com" , "dev@dpdk.org" Thread-Topic: [PATCH v3] net/tap: add queues when attaching from secondary process Thread-Index: AQHUIBsZNWHFL4UZz0qhG0RtLsBPE6SYs6MA Date: Fri, 20 Jul 2018 15:35:14 +0000 Message-ID: References: <1528374591-26126-1-git-send-email-rasland@mellanox.com> <20180720111552.14132-1-thomas@monjalon.net> In-Reply-To: <20180720111552.14132-1-thomas@monjalon.net> Accept-Language: en-US Content-Language: en-US X-MS-Has-Attach: X-MS-TNEF-Correlator: x-originating-ip: [10.255.76.126] Content-Type: text/plain; charset="us-ascii" Content-ID: <8990FC4B6FCEFD498B76728F240D5FD3@intel.com> Content-Transfer-Encoding: quoted-printable MIME-Version: 1.0 Subject: Re: [dpdk-dev] [PATCH v3] net/tap: add queues when attaching from secondary process X-BeenThere: dev@dpdk.org X-Mailman-Version: 2.1.15 Precedence: list List-Id: DPDK patches and discussions List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , X-List-Received-Date: Fri, 20 Jul 2018 15:35:19 -0000 > On Jul 20, 2018, at 4:15 AM, Thomas Monjalon wrote: >=20 > From: Raslan Darawsheh >=20 > In the case the device is created by the primary process, > the secondary must request some file descriptors to attach the queues. > The file descriptors are shared via IPC Unix socket. >=20 > Thanks to the IPC synchronization, the secondary process > is now able to do Rx/Tx on a TAP created by the primary process. >=20 > Signed-off-by: Raslan Darawsheh > Signed-off-by: Thomas Monjalon > --- > Note: there is a bug in EAL IPC regarding fd translation. > A fix will be sent later for EAL. >=20 > v3: > - split some long lines > v2: > - translate file descriptors via IPC API > - add documentation > --- > doc/guides/nics/tap.rst | 16 +++ > doc/guides/rel_notes/release_18_08.rst | 5 + > drivers/net/tap/Makefile | 1 + > drivers/net/tap/rte_eth_tap.c | 131 ++++++++++++++++++++++++- > 4 files changed, 152 insertions(+), 1 deletion(-) >=20 > diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst > index 27148681c..d1f3e1c24 100644 > --- a/doc/guides/nics/tap.rst > +++ b/doc/guides/nics/tap.rst > @@ -152,6 +152,22 @@ Distribute IPv4 TCP packets using RSS to a given MAC= address over queues 0-3:: > testpmd> flow create 0 priority 4 ingress pattern eth dst is 0a:0b:0c:= 0d:0e:0f \ > / ipv4 / tcp / end actions rss queues 0 1 2 3 end / end >=20 > +Multi-process sharing > +--------------------- > + > +It is possible to attach an existing TAP device in a secondary process, > +by declaring it as a vdev with the same name as in the primary process, > +and without any parameter. > + > +The port attached in a secondary process will give access to the > +statistics and the queues. > +Therefore it can be used for monitoring or Rx/Tx processing. > + > +The IPC synchronization of Rx/Tx queues is currently limited: > + > + - Only 8 queues > + - Synchronized on probing, but not on later port update > + > Example > ------- >=20 > diff --git a/doc/guides/rel_notes/release_18_08.rst b/doc/guides/rel_note= s/release_18_08.rst > index dd611b571..ec6a81236 100644 > --- a/doc/guides/rel_notes/release_18_08.rst > +++ b/doc/guides/rel_notes/release_18_08.rst > @@ -74,6 +74,11 @@ New Features > * Add handlers to add/delete VxLAN port number. > * Add devarg to specify ingress VLAN rewrite mode. >=20 > +* **Added TAP Rx/Tx queues sharing with a secondary process.** > + > + A secondary process can attach a TAP device created in the primary pro= cess, > + probe the queues, and process Rx/Tx in a secondary process. > + >=20 > API Changes > ----------- > diff --git a/drivers/net/tap/Makefile b/drivers/net/tap/Makefile > index 324336535..3dcf05a72 100644 > --- a/drivers/net/tap/Makefile > +++ b/drivers/net/tap/Makefile > @@ -27,6 +27,7 @@ LDLIBS +=3D -lrte_ethdev -lrte_net -lrte_kvargs -lrte_h= ash > LDLIBS +=3D -lrte_bus_vdev -lrte_gso >=20 > CFLAGS +=3D -DTAP_MAX_QUEUES=3D$(TAP_MAX_QUEUES) > +CFLAGS +=3D -DALLOW_EXPERIMENTAL_API >=20 > # > # all source are stored in SRCS-y > diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.= c > index 4493507ed..98cbdf614 100644 > --- a/drivers/net/tap/rte_eth_tap.c > +++ b/drivers/net/tap/rte_eth_tap.c > @@ -16,6 +16,8 @@ > #include > #include > #include > +#include > +#include >=20 > #include > #include > @@ -62,6 +64,9 @@ > #define TAP_GSO_MBUFS_NUM \ > (TAP_GSO_MBUFS_PER_CORE * TAP_GSO_MBUF_CACHE_SIZE) >=20 > +/* IPC key for queue fds sync */ > +#define TAP_MP_KEY "tap_mp_sync_queues" > + > static struct rte_vdev_driver pmd_tap_drv; > static struct rte_vdev_driver pmd_tun_drv; >=20 > @@ -100,6 +105,17 @@ enum ioctl_mode { > REMOTE_ONLY, > }; >=20 > +/* Message header to synchronize queues via IPC */ > +struct ipc_queues { > + char port_name[RTE_DEV_NAME_MAX_LEN]; > + int rxq_count; > + int txq_count; > + /* > + * The file descriptors are in the dedicated part > + * of the Unix message to be translated by the kernel. > + */ > +}; > + > static int tap_intr_handle_set(struct rte_eth_dev *dev, int set); >=20 > /** > @@ -1920,6 +1936,100 @@ rte_pmd_tun_probe(struct rte_vdev_device *dev) > return ret; > } >=20 > +/* Request queue file descriptors from secondary to primary. */ > +static int > +tap_mp_attach_queues(const char *port_name, struct rte_eth_dev *dev) > +{ > + int ret; > + struct pmd_internals *devpriv; > + struct timespec timeout =3D {.tv_sec =3D 1, .tv_nsec =3D 0}; > + struct rte_mp_msg request, *reply; > + struct rte_mp_reply replies; > + struct ipc_queues *request_param =3D (struct ipc_queues *)request.param= ; > + struct ipc_queues *reply_param; > + int queue, fd_iterator; > + > + /* Prepare the request */ > + strcpy(request.name, TAP_MP_KEY); > + strcpy(request_param->port_name, port_name); > + request.len_param =3D sizeof(*request_param); > + > + /* Send request and receive reply */ > + ret =3D rte_mp_request_sync(&request, &replies, &timeout); > + if (ret < 0) { > + TAP_LOG(ERR, "Failed to request queues from primary: %d", > + rte_errno); > + return -1; > + } > + /* FIXME: handle replies.nb_received > 1 */ I am not a big fan of having TODO or FIXME comments in the code. Can we rem= ove them and just describe the problem and what would happen or not happen = if the condition occurs? If we need to add this support in the future then = we need to put these in a enhancement tracker or someplace else. > + reply =3D &replies.msgs[0]; > + reply_param =3D (struct ipc_queues *)reply->param; > + TAP_LOG(DEBUG, "Received IPC reply for %s", reply_param->port_name); > + > + /* Attach the queues from received file descriptors */ > + devpriv =3D dev->data->dev_private; > + fd_iterator =3D 0; > + for (queue =3D 0; queue < reply_param->rxq_count; queue++) > + devpriv->rxq[queue].fd =3D reply->fds[fd_iterator++]; > + for (queue =3D 0; queue < reply_param->txq_count; queue++) > + devpriv->txq[queue].fd =3D reply->fds[fd_iterator++]; > + > + return 0; > +} > + > +/* Send the queue file descriptors from the primary process to secondary= . */ > +static int > +tap_mp_sync_queues(const struct rte_mp_msg *request, const void *peer) > +{ > + struct rte_eth_dev *dev; > + struct pmd_internals *devpriv; > + struct rte_mp_msg reply; > + const struct ipc_queues *request_param =3D > + (const struct ipc_queues *)request->param; > + struct ipc_queues *reply_param =3D > + (struct ipc_queues *)reply.param; > + uint16_t port_id; > + int queue; > + int ret; > + > + /* Get requested port */ > + TAP_LOG(DEBUG, "Received IPC request for %s", request_param->port_name)= ; > + ret =3D rte_eth_dev_get_port_by_name(request_param->port_name, &port_id= ); > + if (ret) { > + TAP_LOG(ERR, "Failed to get port id for %s", > + request_param->port_name); > + return -1; > + } > + dev =3D &rte_eth_devices[port_id]; > + devpriv =3D dev->data->dev_private; > + > + /* Fill file descriptors for all queues */ > + reply.num_fds =3D 0; > + reply_param->rxq_count =3D 0; > + for (queue =3D 0; queue < dev->data->nb_rx_queues; queue++) { > + reply.fds[reply.num_fds++] =3D devpriv->rxq[queue].fd; > + reply_param->rxq_count++; > + } > + reply_param->txq_count =3D 0; > + for (queue =3D 0; queue < dev->data->nb_tx_queues; queue++) { > + reply.fds[reply.num_fds++] =3D devpriv->txq[queue].fd; > + reply_param->txq_count++; > + } > + /* FIXME: split message if more queues than RTE_MP_MAX_FD_NUM */ Here too. > + RTE_ASSERT(reply.num_fds <=3D RTE_MP_MAX_FD_NUM); > + > + /* Send reply */ > + strcpy(reply.name, request->name); > + strcpy(reply_param->port_name, request_param->port_name); Normally we use the snprintf or strlcpy() functions for the above should we= do that here too? > + reply.len_param =3D sizeof(*reply_param); > + if (rte_mp_reply(&reply, peer) < 0) { > + TAP_LOG(ERR, "Failed to reply an IPC request to sync queues"); > + return -1; > + } > + > + return 0; > +} > + > /* Open a TAP interface device. > */ > static int > @@ -1946,8 +2056,18 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev) > TAP_LOG(ERR, "Failed to probe %s", name); > return -1; > } > - /* TODO: request info from primary to set up Rx and Tx */ > eth_dev->dev_ops =3D &ops; > + eth_dev->rx_pkt_burst =3D pmd_rx_burst; > + eth_dev->tx_pkt_burst =3D pmd_tx_burst; > + > + if (!rte_eal_primary_proc_alive(NULL)) { > + TAP_LOG(ERR, "Primary process is missing"); > + return -1; > + } > + ret =3D tap_mp_attach_queues(name, eth_dev); > + if (ret !=3D 0) > + return -1; Does the call above need to be wrapped using if secondary process or is thi= s for both primary and secondary? > + > rte_eth_dev_probing_finish(eth_dev); > return 0; > } > @@ -1998,10 +2118,19 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev) > ret =3D eth_dev_tap_create(dev, tap_name, remote_iface, &user_mac, > ETH_TUNTAP_TYPE_TAP); >=20 > + /* Register IPC feed callback */ > + ret =3D rte_mp_action_register(TAP_MP_KEY, tap_mp_sync_queues); > + if (ret < 0 && rte_errno !=3D EEXIST) { > + TAP_LOG(ERR, "%s: Failed to register IPC callback: %s", > + tuntap_name, strerror(rte_errno)); > + goto leave; > + } Same for this one as above? > + > leave: > if (ret =3D=3D -1) { > TAP_LOG(ERR, "Failed to create pmd for %s as %s", > name, tap_name); > + rte_mp_action_unregister(TAP_MP_KEY); > tap_unit--; /* Restore the unit number */ > } > rte_kvargs_free(kvlist); > --=20 > 2.17.1 >=20 This looks fine to me except for the couple tests and the FIXME comments. L= et me know about the other comments and with the FIXME changes I can Ack it= . Regards, Keith