From: "Wiles, Keith" <keith.wiles@intel.com>
To: Raslan Darawsheh <rasland@mellanox.com>
Cc: "thomas@monjalon.net" <thomas@monjalon.net>,
"dev@dpdk.org" <dev@dpdk.org>,
"shahafs@mellanox.com" <shahafs@mellanox.com>,
"orik@mellanox.com" <orik@mellanox.com>
Subject: Re: [dpdk-dev] [PATCH v3 2/2] net/tap: add queues when attaching from secondary process
Date: Thu, 27 Sep 2018 13:04:27 +0000 [thread overview]
Message-ID: <8E1020F6-AEE0-423F-8541-92829AC06634@intel.com> (raw)
In-Reply-To: <1538047196-13789-2-git-send-email-rasland@mellanox.com>
> On Sep 27, 2018, at 6:19 AM, Raslan Darawsheh <rasland@mellanox.com> wrote:
>
> In the case the device is created by the primary process,
> the secondary must request some file descriptors to attach the queues.
> The file descriptors are shared via IPC Unix socket.
>
> Thanks to the IPC synchronization, the secondary process
> is now able to do Rx/Tx on a TAP created by the primary process.
>
> Signed-off-by: Raslan Darawsheh <rasland@mellanox.com>
> Signed-off-by: Thomas Monjalon <thomas@monjalon.net>
>
> ---
> v2:
> - translate file descriptors via IPC API
> - add documentation
> v3:
> - rabse the commit
> - use private static array for fd's to be local for each process
>
> ---
> ---
> doc/guides/nics/tap.rst | 16 ++++
> doc/guides/rel_notes/release_18_11.rst | 4 +
> drivers/net/tap/Makefile | 1 +
> drivers/net/tap/rte_eth_tap.c | 133 +++++++++++++++++++++++++++++++++
> 4 files changed, 154 insertions(+)
>
> diff --git a/doc/guides/nics/tap.rst b/doc/guides/nics/tap.rst
> index 2714868..d1f3e1c 100644
> --- a/doc/guides/nics/tap.rst
> +++ b/doc/guides/nics/tap.rst
> @@ -152,6 +152,22 @@ Distribute IPv4 TCP packets using RSS to a given MAC address over queues 0-3::
> testpmd> flow create 0 priority 4 ingress pattern eth dst is 0a:0b:0c:0d:0e:0f \
> / ipv4 / tcp / end actions rss queues 0 1 2 3 end / end
>
> +Multi-process sharing
> +---------------------
> +
> +It is possible to attach an existing TAP device in a secondary process,
> +by declaring it as a vdev with the same name as in the primary process,
> +and without any parameter.
> +
> +The port attached in a secondary process will give access to the
> +statistics and the queues.
> +Therefore it can be used for monitoring or Rx/Tx processing.
> +
> +The IPC synchronization of Rx/Tx queues is currently limited:
> +
> + - Only 8 queues
> + - Synchronized on probing, but not on later port update
> +
> Example
> -------
>
> diff --git a/doc/guides/rel_notes/release_18_11.rst b/doc/guides/rel_notes/release_18_11.rst
> index 8c4bb54..a9dda5b 100644
> --- a/doc/guides/rel_notes/release_18_11.rst
> +++ b/doc/guides/rel_notes/release_18_11.rst
> @@ -67,6 +67,10 @@ New Features
> SR-IOV option in Hyper-V and Azure. This is an alternative to the previous
> vdev_netvsc, tap, and failsafe drivers combination.
>
> +* **Added TAP Rx/Tx queues sharing with a secondary process.**
> +
> + A secondary process can attach a TAP device created in the primary process,
> + probe the queues, and process Rx/Tx in a secondary process.
>
> API Changes
> -----------
> diff --git a/drivers/net/tap/Makefile b/drivers/net/tap/Makefile
> index 3243365..7748283 100644
> --- a/drivers/net/tap/Makefile
> +++ b/drivers/net/tap/Makefile
> @@ -22,6 +22,7 @@ CFLAGS += -O3
> CFLAGS += -I$(SRCDIR)
> CFLAGS += -I.
> CFLAGS += $(WERROR_FLAGS)
> +CFLAGS += -DALLOW_EXPERIMENTAL_API
> LDLIBS += -lrte_eal -lrte_mbuf -lrte_mempool -lrte_ring
> LDLIBS += -lrte_ethdev -lrte_net -lrte_kvargs -lrte_hash
> LDLIBS += -lrte_bus_vdev -lrte_gso
> diff --git a/drivers/net/tap/rte_eth_tap.c b/drivers/net/tap/rte_eth_tap.c
> index 8cc4552..8d276eb 100644
> --- a/drivers/net/tap/rte_eth_tap.c
> +++ b/drivers/net/tap/rte_eth_tap.c
> @@ -16,6 +16,8 @@
> #include <rte_debug.h>
> #include <rte_ip.h>
> #include <rte_string_fns.h>
> +#include <rte_ethdev.h>
> +#include <rte_errno.h>
>
> #include <assert.h>
> #include <sys/types.h>
> @@ -62,6 +64,9 @@
> #define TAP_GSO_MBUFS_NUM \
> (TAP_GSO_MBUFS_PER_CORE * TAP_GSO_MBUF_CACHE_SIZE)
>
> +/* IPC key for queue fds sync */
> +#define TAP_MP_KEY "tap_mp_sync_queues"
> +
> static struct rte_vdev_driver pmd_tap_drv;
> static struct rte_vdev_driver pmd_tun_drv;
> static struct pmd_process_private *process_private;
> @@ -101,6 +106,17 @@ enum ioctl_mode {
> REMOTE_ONLY,
> };
>
> +/* Message header to synchronize queues via IPC */
> +struct ipc_queues {
> + char port_name[RTE_DEV_NAME_MAX_LEN];
> + int rxq_count;
> + int txq_count;
> + /*
> + * The file descriptors are in the dedicated part
> + * of the Unix message to be translated by the kernel.
> + */
> +};
> +
> static int tap_intr_handle_set(struct rte_eth_dev *dev, int set);
>
> /**
> @@ -1980,6 +1996,100 @@ rte_pmd_tun_probe(struct rte_vdev_device *dev)
> return ret;
> }
>
> +/* Request queue file descriptors from secondary to primary. */
> +static int
> +tap_mp_attach_queues(const char *port_name, struct rte_eth_dev *dev)
> +{
> + int ret;
> + struct timespec timeout = {.tv_sec = 1, .tv_nsec = 0};
> + struct rte_mp_msg request, *reply;
> + struct rte_mp_reply replies;
> + struct ipc_queues *request_param = (struct ipc_queues *)request.param;
> + struct ipc_queues *reply_param;
> + int queue, fd_iterator;
> +
> + /* Prepare the request */
> + strcpy(request.name, TAP_MP_KEY);
> + strcpy(request_param->port_name, port_name);
Should we not be using the strlcpy() functions here.
> + request.len_param = sizeof(*request_param);
> + /* Send request and receive reply */
> + ret = rte_mp_request_sync(&request, &replies, &timeout);
> + if (ret < 0) {
> + TAP_LOG(ERR, "Failed to request queues from primary: %d",
> + rte_errno);
> + return -1;
> + }
> + /* FIXME: handle replies.nb_received > 1 */
> + reply = &replies.msgs[0];
> + reply_param = (struct ipc_queues *)reply->param;
> + TAP_LOG(DEBUG, "Received IPC reply for %s", reply_param->port_name);
> +
> + /* Attach the queues from received file descriptors */
> +
> + dev->data->nb_rx_queues = reply_param->rxq_count;
> + dev->data->nb_tx_queues = reply_param->txq_count;
Do we really need a rxq_count and txq_count as they are always the same it seems? Just a comment not a request to change it.
> + fd_iterator = 0;
> + for (queue = 0; queue < reply_param->rxq_count; queue++)
> + process_private->rxq_fds[queue] = reply->fds[fd_iterator++];
> + for (queue = 0; queue < reply_param->txq_count; queue++)
> + process_private->txq_fds[queue] = reply->fds[fd_iterator++];
> +
> +
Extra blank line here, needs to be removed.
> + return 0;
> +}
> +
> +/* Send the queue file descriptors from the primary process to secondary. */
> +static int
> +tap_mp_sync_queues(const struct rte_mp_msg *request, const void *peer)
> +{
> + struct rte_eth_dev *dev;
> + struct rte_mp_msg reply;
> + const struct ipc_queues *request_param =
> + (const struct ipc_queues *)request->param;
> + struct ipc_queues *reply_param =
> + (struct ipc_queues *)reply.param;
> + uint16_t port_id;
> + int queue;
> + int ret;
> +
> + /* Get requested port */
> + TAP_LOG(DEBUG, "Received IPC request for %s", request_param->port_name);
> + ret = rte_eth_dev_get_port_by_name(request_param->port_name, &port_id);
> + if (ret) {
> + TAP_LOG(ERR, "Failed to get port id for %s",
> + request_param->port_name);
> + return -1;
> + }
> + dev = &rte_eth_devices[port_id];
> +
> + /* Fill file descriptors for all queues */
> + reply.num_fds = 0;
> + reply_param->rxq_count = 0;
> + for (queue = 0; queue < dev->data->nb_rx_queues; queue++) {
> + reply.fds[reply.num_fds++] = process_private->rxq_fds[queue];
> + reply_param->rxq_count++;
> + }
> + assert(reply_param->rxq_count == dev->data->nb_rx_queues);
Pick an assert method assert() or RTE_ASSERT() why have both? I would suggest using RTE_ASSERT everywhere.
> + reply_param->txq_count = 0;
> + for (queue = 0; queue < dev->data->nb_tx_queues; queue++) {
> + reply.fds[reply.num_fds++] = process_private->txq_fds[queue];
> + reply_param->txq_count++;
> + }
> + assert(reply_param->txq_count == dev->data->nb_tx_queues);
> + /* FIXME: split message if more queues than RTE_MP_MAX_FD_NUM */
> + RTE_ASSERT(reply.num_fds <= RTE_MP_MAX_FD_NUM);
I do not like FIXME or TODO statements in the code. For FIXME we need to fix it or remove the comment. For TODO we need to put it in a enhancement list and do it later.
> +
> + /* Send reply */
> + strcpy(reply.name, request->name);
> + strcpy(reply_param->port_name, request_param->port_name);
strlcpy() here as well.
> + reply.len_param = sizeof(*reply_param);
> + if (rte_mp_reply(&reply, peer) < 0) {
> + TAP_LOG(ERR, "Failed to reply an IPC request to sync queues");
> + return -1;
> + }
> + return 0;
> +}
> +
> /* Open a TAP interface device.
> */
> static int
> @@ -2009,6 +2119,21 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
> /* TODO: request info from primary to set up Rx and Tx */
> eth_dev->dev_ops = &ops;
> eth_dev->device = &dev->device;
> + eth_dev->rx_pkt_burst = pmd_rx_burst;
> + eth_dev->tx_pkt_burst = pmd_tx_burst;
> + if (!rte_eal_primary_proc_alive(NULL)) {
> + TAP_LOG(ERR, "Primary process is missing");
> + return -1;
> + }
> + process_private = (struct pmd_process_private *)
> + rte_zmalloc_socket(name,
> + sizeof(struct pmd_process_private),
> + RTE_CACHE_LINE_SIZE,
> + eth_dev->device->numa_node);
> +
> + ret = tap_mp_attach_queues(name, eth_dev);
> + if (ret != 0)
> + return -1;
> rte_eth_dev_probing_finish(eth_dev);
> return 0;
> }
> @@ -2056,6 +2181,13 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
> TAP_LOG(NOTICE, "Initializing pmd_tap for %s as %s",
> name, tap_name);
>
> + /* Register IPC feed callback */
> + ret = rte_mp_action_register(TAP_MP_KEY, tap_mp_sync_queues);
> + if (ret < 0 && rte_errno != EEXIST) {
> + TAP_LOG(ERR, "%s: Failed to register IPC callback: %s",
> + tuntap_name, strerror(rte_errno));
> + goto leave;
> + }
> ret = eth_dev_tap_create(dev, tap_name, remote_iface, &user_mac,
> ETH_TUNTAP_TYPE_TAP);
>
> @@ -2063,6 +2195,7 @@ rte_pmd_tap_probe(struct rte_vdev_device *dev)
> if (ret == -1) {
> TAP_LOG(ERR, "Failed to create pmd for %s as %s",
> name, tap_name);
> + rte_mp_action_unregister(TAP_MP_KEY);
> tap_unit--; /* Restore the unit number */
> }
> rte_kvargs_free(kvlist);
> --
> 2.7.4
>
Regards,
Keith
next prev parent reply other threads:[~2018-09-27 13:04 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2018-06-07 12:29 [dpdk-dev] [RFC] " Raslan Darawsheh
2018-06-07 19:09 ` Wiles, Keith
2018-06-07 23:24 ` Raslan Darawsheh
2018-06-08 2:52 ` Wiles, Keith
2018-06-12 12:46 ` Wiles, Keith
2018-06-12 13:21 ` Raslan Darawsheh
2018-07-20 10:57 ` [dpdk-dev] [PATCH v2] " Thomas Monjalon
2018-09-27 11:19 ` [dpdk-dev] [PATCH v3 1/2] net/tap: change queue fd to be pointers to process private Raslan Darawsheh
2018-09-27 11:19 ` [dpdk-dev] [PATCH v3 2/2] net/tap: add queues when attaching from secondary process Raslan Darawsheh
2018-09-27 13:04 ` Wiles, Keith [this message]
2018-09-27 18:53 ` Thomas Monjalon
2018-10-02 10:34 ` [dpdk-dev] [PATCH v4 1/2] net/tap: change queue fd to be pointers to process private Raslan Darawsheh
2018-10-02 10:34 ` [dpdk-dev] [PATCH v4 2/2] net/tap: add queues when attaching from secondary process Raslan Darawsheh
2018-10-02 10:41 ` Thomas Monjalon
2018-10-02 10:50 ` Raslan Darawsheh
2018-10-02 11:38 ` Thomas Monjalon
2018-10-03 16:28 ` Ferruh Yigit
2018-10-02 10:43 ` Thomas Monjalon
2018-10-03 16:31 ` Ferruh Yigit
2018-10-03 18:00 ` Ferruh Yigit
2018-10-03 17:59 ` [dpdk-dev] [PATCH v4 1/2] net/tap: change queue fd to be pointers to process private Ferruh Yigit
2018-09-27 13:17 ` [dpdk-dev] [PATCH v3 " Wiles, Keith
2018-10-02 10:30 ` Raslan Darawsheh
2018-10-02 12:58 ` Wiles, Keith
2018-10-03 17:27 ` Ferruh Yigit
2018-07-20 11:15 ` [dpdk-dev] [PATCH v3] net/tap: add queues when attaching from secondary process Thomas Monjalon
2018-07-20 15:35 ` Wiles, Keith
2018-07-20 21:51 ` Thomas Monjalon
2018-07-21 13:44 ` Wiles, Keith
2018-08-23 11:51 ` Ferruh Yigit
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=8E1020F6-AEE0-423F-8541-92829AC06634@intel.com \
--to=keith.wiles@intel.com \
--cc=dev@dpdk.org \
--cc=orik@mellanox.com \
--cc=rasland@mellanox.com \
--cc=shahafs@mellanox.com \
--cc=thomas@monjalon.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).